Lecture 03 - Vanilla and Flux.jl deep neural network implementation

MachineLearningCourse.Lecture03 - Module
Lecture03

Vanilla and Flux.jl deep neural network implementation.

Available Functions

  • demo(): Vanilla deep learning demo for MNIST handwritten digit recognition
  • flux_demo(): Flux.jl deep learning demo for MNIST handwritten digit recognition

Usage

using MachineLearningCourse
Lecture03.demo()
Lecture03.flux_demo()
MachineLearningCourse.Lecture03.VanillaDNN - Type
VanillaDNN(layers)

Vanilla Deep Neural Network structure with fully connected layers (educational implementation).

Arguments

  • layers::Vector{Int}: Number of neurons per layer [input, hidden..., output]

Fields

  • layers::OffsetVector{Int}: Layer architecture specification
  • W::Vector{Matrix{Float32}}: Weight matrices W^[l] for each layer
  • b::Vector{Vector{Float32}}: Bias vectors b^[l] for each layer
  • L::Int: Total number of layers (excluding input layer)

Uses He initialization for weights and zero initialization for biases. ReLU activation for hidden layers, linear activation for output layer.

Example

# Create network: 784 inputs → 128 hidden → 64 hidden → 10 outputs
network = VanillaDNN([784, 128, 64, 10])
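For orientation, He initialization scales each weight matrix by sqrt(2 / fan_in). A minimal sketch of what the constructor's initialization amounts to (assumed shapes, plain 1-based vectors; not necessarily the package's exact code):

# He-initialized weights, zero biases, for layer sizes [n_0, n_1, ..., n_L]
layers = [784, 128, 64, 10]
W = [randn(Float32, layers[l+1], layers[l]) .* sqrt(2f0 / layers[l])
     for l in 1:length(layers)-1]              # W^[l] has size n_l × n_{l-1}
b = [zeros(Float32, layers[l+1]) for l in 1:length(layers)-1]   # b^[l] = 0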
MachineLearningCourse.Lecture03.accuracy - Method
accuracy(network::VanillaDNN, X_test, Y_test)

Calculate accuracy of DNN model on test data.

Arguments

  • network::VanillaDNN: Trained network
  • X_test::Vector{Vector{Float32}}: Test input data
  • Y_test::Vector{Vector{Float32}}: Test target labels (one-hot encoded)

Returns

  • Float32: Test accuracy (0.0f0 to 1.0f0)
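A hedged sketch of what this computes: the fraction of test samples whose predicted argmax matches the argmax of the one-hot label (assumes predict from this module and a trained network in scope):

correct = count(argmax(predict(network, x)) == argmax(y)
                for (x, y) in zip(X_test, Y_test))
acc = Float32(correct / length(X_test))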
MachineLearningCourse.Lecture03.backpropagation - Method
backpropagation(network, activations, z_values, y)

Compute gradients using backpropagation algorithm.

Arguments

  • network::VanillaDNN: Neural network structure
  • activations::OffsetVector{Vector{Float32}}: Layer activations from forward pass
  • z_values::Vector{Vector{Float32}}: Linear combinations from forward pass
  • y::Vector{Float32}: True target values

Returns

  • Tuple{Vector{Matrix{Float32}}, Vector{Vector{Float32}}}: (∇W, ∇b)
    • ∇W: Weight gradients for each layer
    • ∇b: Bias gradients for each layer
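As a hedged illustration, assuming a mean-squared-error loss with the linear output layer (the loss function is not stated here), the recursion behind these gradients looks roughly like this sketch, which uses the documented 0-based activations indexing (a^[0] = input):

# Sketch only; not necessarily the exact implementation.
function backprop_sketch(network, activations, z_values, y)
    ∇W = [zero(Wl) for Wl in network.W]
    ∇b = [zero(bl) for bl in network.b]
    δ = activations[network.L] .- y              # δ^[L] = a^[L] - y (MSE assumed)
    for l in network.L:-1:1
        ∇W[l] = δ * activations[l-1]'            # ∂ℒ/∂W^[l] = δ^[l] (a^[l-1])ᵀ
        ∇b[l] = δ                                # ∂ℒ/∂b^[l] = δ^[l]
        l > 1 && (δ = (network.W[l]' * δ) .* Float32.(z_values[l-1] .> 0))
    end
    return ∇W, ∇b
end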
MachineLearningCourse.Lecture03.demo - Method
demo(seed=42, hidden_layers=[128, 64], train_size=5000, test_size=1000, η=0.001, epochs=50, verbose=true)

Vanilla MNIST handwritten digit recognition demonstration.

Parameters

  • seed: Random seed (default: 42)
  • hidden_layers: Dimensions of hidden layers (default: [128, 64])
  • train_size: Number of training samples to use (default: 5000)
  • test_size: Number of test samples to use (default: 1000)
  • η: Learning rate for training (default: 0.001)
  • epochs: Number of training epochs (default: 50)
  • verbose: Print training progress (default: true)
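A hedged usage example (assuming the parameters above are keyword arguments):

using MachineLearningCourse
Lecture03.demo(hidden_layers=[256, 128], epochs=20, verbose=false)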
MachineLearningCourse.Lecture03.display_digit - Method
display_digit(image; title="MNIST Digit")

Display a single MNIST digit image.

Arguments

  • image::Array{Float32, 2}: 28×28 image array with values 0-1
  • title::String: Optional title for the plot

Example

X_train, Y_train, X_test, Y_test = load_mnist_data(100, 50)
# Display first training image
i = 1
display_digit(X_train[:, :, i], title="Digit " * string(argmax(Y_train[:, i]) - 1))
MachineLearningCourse.Lecture03.evaluate - Method
evaluate(network::VanillaDNN, X_test, Y_test, classes)

Comprehensive evaluation of DNN model with confusion matrix and per-class metrics.

Arguments

  • network::VanillaDNN: Trained network
  • X_test::Vector{Vector{Float32}}: Test input data
  • Y_test::Vector{Vector{Float32}}: Test target labels (one-hot encoded)
  • classes: Vector of class labels or number of classes

Returns

  • NamedTuple: (accuracy=Float32, predictions=Vector{Int}, true_labels=Vector{Int}, confusion_matrix=Matrix{Int})
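A hedged usage sketch of the returned NamedTuple (passing 0:9 as the MNIST classes is an assumed form of the classes argument):

res = evaluate(network, X_test, Y_test, 0:9)
res.accuracy           # overall test accuracy
res.confusion_matrix   # 10×10 matrix of prediction counts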
MachineLearningCourse.Lecture03.flux_demo - Method
flux_demo(;train_size=1000, epochs=10)

Minimal MNIST demonstration using Flux.jl and gradient descent.

Arguments

  • train_size: Number of training samples (default: 1000)
  • epochs: Number of training epochs (default: 10)

Returns

  • Trained model and final accuracy
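For reference, a hedged sketch of the kind of Flux.jl model such a demo might build; the actual architecture, loss, and optimiser used by the demo may differ:

using Flux
model = Chain(
    Dense(784 => 128, relu),    # hypothetical layer sizes
    Dense(128 => 64, relu),
    Dense(64 => 10),
)
loss(m, x, y) = Flux.logitcrossentropy(m(x), y)
opt_state = Flux.setup(Descent(0.01), model)   # plain gradient descent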
MachineLearningCourse.Lecture03.forwardpropagation - Method
forwardpropagation(network, x)

Compute forward propagation through the neural network.

Mathematical formulation:

  • z^[l] = W^[l] * a^[l-1] + b^[l]
  • a^[l] = ϕ(z^[l]) for hidden layers; a^[L] = z^[L] for the output layer

Arguments

  • network::VanillaDNN: Neural network structure
  • x::Vector{Float32}: Input vector

Returns

  • Tuple{OffsetVector{Vector{Float32}}, Vector{Vector{Float32}}}: (activations, z_values)
    • activations: [a^[0], a^[1], ..., a^[L]] - activations for each layer (0-based indexing)
    • z_values: [z^[1], z^[2], ..., z^[L]] - linear combinations for each layer
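A hedged sketch of the loop this describes, returning only the final activation for brevity (not necessarily the exact implementation):

function forward_sketch(network, x)
    a = x                                       # a^[0] = input
    for l in 1:network.L
        z = network.W[l] * a .+ network.b[l]    # z^[l] = W^[l] a^[l-1] + b^[l]
        a = l < network.L ? max.(z, 0f0) : z    # ReLU hidden, linear output
    end
    return a                                    # a^[L] = ŷ
end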
MachineLearningCourse.Lecture03.load_mnist_data - Function
load_mnist_data(train_size=5000, test_size=1000)

Load MNIST handwritten digit dataset.

Arguments

  • train_size::Int: Number of training samples to use (default: 5000)
  • test_size::Int: Number of test samples to use (default: 1000)

Returns

  • Tuple: (X_train, Y_train, X_test, Y_test)
    • X_train::Array{Float32, 3}: Training images (28 × 28 × samples), values 0-1
    • Y_train::Matrix{Float32}: Training labels as one-hot encoded matrix (10 × samples)
    • X_test::Array{Float32, 3}: Test images (28 × 28 × samples), values 0-1
    • Y_test::Matrix{Float32}: Test labels as one-hot encoded matrix (10 × samples)

Example Usage - Image Visualization

# Load data
X_train, Y_train, X_test, Y_test = load_mnist_data(100, 50)
# Visualize the first training image together with its label
display_digit(X_train[:, :, 1], title="Digit " * string(argmax(Y_train[:, 1]) - 1))
MachineLearningCourse.Lecture03.predict - Method
predict(network, x)

Make predictions using the trained neural network.

Performs forward propagation to compute network output ŷ.

Arguments

  • network::VanillaDNN: Trained neural network
  • x::Vector{Float32}: Input vector

Returns

  • Vector{Float32}: Network predictions (output layer activations)
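For example, to turn the raw output into a digit class (the argmax(...) - 1 convention matches the display_digit example above):

ŷ = predict(network, x)    # output-layer activations
digit = argmax(ŷ) - 1      # one-hot index 1-10 → digit 0-9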
MachineLearningCourse.Lecture03.train! - Function
train!(network, X, Y, η=0.01, epochs=1000, verbose=true)

Train the neural network using gradient descent.

Implements the complete training algorithm:

  1. Forward propagation
  2. Loss computation
  3. Backpropagation
  4. Parameter update

Arguments

  • network::VanillaDNN: Neural network (modified in-place)
  • X::Vector{Vector{Float32}}: Training input data
  • Y::Vector{Vector{Float32}}: Training target data
  • η::Float32: Learning rate (default: 0.01)
  • epochs::Int: Number of training epochs (default: 1000)
  • verbose::Bool: Print training progress (default: true)

Returns

  • Vector{Float32}: Training losses for each epoch
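A hedged usage example following the positional signature above (X and Y are training data prepared as documented):

network = VanillaDNN([784, 128, 64, 10])
losses = train!(network, X, Y, 0.001f0, 50)   # η = 0.001, 50 epochs
losses[end]                                   # loss after the final epoch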
MachineLearningCourse.Lecture03.update_parameters! - Method
update_parameters!(network, ∇W, ∇b, η)

Update network parameters using gradient descent.

Parameter updates:

  • W^[l] ← W^[l] - η * ∂ℒ/∂W^[l]
  • b^[l] ← b^[l] - η * ∂ℒ/∂b^[l]

Arguments

  • network::VanillaDNN: Neural network (modified in-place)
  • ∇W::Vector{Matrix{Float32}}: Weight gradients
  • ∇b::Vector{Vector{Float32}}: Bias gradients
  • η::Float32: Learning rate
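These updates amount to a loop like the following sketch (assuming the W, b, and L fields documented for VanillaDNN):

for l in 1:network.L
    network.W[l] .-= η .* ∇W[l]   # W^[l] ← W^[l] - η ∂ℒ/∂W^[l]
    network.b[l] .-= η .* ∇b[l]   # b^[l] ← b^[l] - η ∂ℒ/∂b^[l]
end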
MachineLearningCourse.Lecture03.∂ϕ_∂z - Method
∂ϕ_∂z(z)

Derivative of ReLU activation function: ∂ϕ/∂z = 1 if z > 0, 0 if z ≤ 0.

Arguments

  • z::Real: Input value

Returns

  • Float32: Derivative value (1.0f0 if z > 0, 0.0f0 if z ≤ 0)
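The whole function is a single branch; a sketch consistent with the description above:

∂ϕ_∂z(z::Real) = z > 0 ? 1.0f0 : 0.0f0   # ReLU derivative; 0 at z = 0 by convention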