Lecture 03 - Vanilla and Flux.jl deep neural network implementation
MachineLearningCourse.Lecture03 — Module
Lecture03
Vanilla and Flux.jl deep neural network implementation.
Available Functions
- demo(): Vanilla deep learning demo for MNIST handwritten digit recognition
- flux_demo(): Flux.jl deep learning demo for MNIST handwritten digit recognition
Usage
using MachineLearningCourse
Lecture03.demo()

using MachineLearningCourse
Lecture03.flux_demo()

MachineLearningCourse.Lecture03.VanillaDNN — Type
VanillaDNN(layers)
Vanilla Deep Neural Network structure with fully connected layers (educational implementation).
Arguments
layers::Vector{Int}: Number of neurons per layer [input, hidden..., output]
Fields
- layers::OffsetVector{Int}: Layer architecture specification
- W::Vector{Matrix{Float32}}: Weight matrices W^[l] for each layer
- b::Vector{Vector{Float32}}: Bias vectors b^[l] for each layer
- L::Int: Total number of layers (excluding input layer)
Uses He initialization for weights and zero initialization for biases. ReLU activation for hidden layers, linear activation for output layer.
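A plausible reconstruction of the struct and constructor this describes — a sketch consistent with the Fields list and the He scheme above, not the package's exact source:

using OffsetArrays

struct VanillaDNN
    layers::OffsetVector{Int}
    W::Vector{Matrix{Float32}}
    b::Vector{Vector{Float32}}
    L::Int
end

function VanillaDNN(layers::Vector{Int})
    L = length(layers) - 1                       # count excludes the input layer
    # He initialization: weights scaled by sqrt(2 / fan_in), biases zero
    W = [randn(Float32, layers[l+1], layers[l]) .* sqrt(2.0f0 / layers[l]) for l in 1:L]
    b = [zeros(Float32, layers[l+1]) for l in 1:L]
    return VanillaDNN(OffsetVector(layers, 0:L), W, b, L)
end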
Example
# Create network: 784 inputs → 128 hidden → 64 hidden → 10 outputs
network = VanillaDNN([784, 128, 64, 10])

MachineLearningCourse.Lecture03.accuracy — Method
accuracy(network::VanillaDNN, X_test, Y_test)
Calculate accuracy of the DNN model on test data.
Arguments
- network::VanillaDNN: Trained network
- X_test::Vector{Vector{Float32}}: Test input data
- Y_test::Vector{Vector{Float32}}: Test target labels (one-hot encoded)
Returns
Float32: Test accuracy (0.0f0 to 1.0f0)
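For example, after training (X_test and Y_test here follow the vector-of-vectors layout listed above):

acc = accuracy(network, X_test, Y_test)
println("Test accuracy: ", round(acc * 100; digits=2), "%")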
MachineLearningCourse.Lecture03.backpropagation — Method
backpropagation(network, activations, z_values, y)
Compute gradients using the backpropagation algorithm.
Arguments
- network::VanillaDNN: Neural network structure
- activations::OffsetVector{Vector{Float32}}: Layer activations from forward pass
- z_values::Vector{Vector{Float32}}: Linear combinations from forward pass
- y::Vector{Float32}: True target values
Returns
Tuple{Vector{Matrix{Float32}}, Vector{Vector{Float32}}}: (∇W, ∇b)
- ∇W: Weight gradients for each layer
- ∇b: Bias gradients for each layer
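A hedged sketch of the backward recurrence, assuming the mean-squared-error loss implied by the linear output layer (the package's actual loss is not shown here):

# Hypothetical helper, not the package's source: assumes ℒ = ½‖a^[L] - y‖²,
# so the output-layer error is δ^[L] = a^[L] - y.
function backprop_sketch(network, activations, z_values, y)
    L = network.L
    ∇W = Vector{Matrix{Float32}}(undef, L)
    ∇b = Vector{Vector{Float32}}(undef, L)
    δ = activations[L] .- y                       # δ^[L] = a^[L] - y
    for l in L:-1:1
        ∇W[l] = δ * activations[l-1]'             # ∇W^[l] = δ^[l] (a^[l-1])ᵀ
        ∇b[l] = copy(δ)                           # ∇b^[l] = δ^[l]
        if l > 1                                  # back-propagate through ReLU
            δ = (network.W[l]' * δ) .* ∂ϕ_∂z.(z_values[l-1])
        end
    end
    return ∇W, ∇b
end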
MachineLearningCourse.Lecture03.demo — Method
demo(seed=42, hidden_layers=[128, 64], train_size=5000, test_size=1000, η=0.001, epochs=50, verbose=true)
Vanilla MNIST handwritten digit recognition demonstration.
Parameters
- seed: Random seed (default: 42)
- hidden_layers: Dimensions of hidden layers (default: [128, 64])
- train_size: Number of training samples to use (default: 5000)
- test_size: Number of test samples to use (default: 1000)
- η: Learning rate for training (default: 0.001)
- epochs: Number of training epochs (default: 50)
- verbose: Print training progress (default: true)
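A shorter run for experimentation, passing arguments positionally as the signature above suggests (whether keywords are also accepted is not shown):

using MachineLearningCourse
Lecture03.demo(42, [64], 1000, 500, 0.001, 20)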
MachineLearningCourse.Lecture03.display_digit — Method
display_digit(image; title="MNIST Digit")
Display a single MNIST digit image.
Arguments
- image::Array{Float32, 2}: 28×28 image array with values 0-1
- title::String: Optional title for the plot
Example
X_train, Y_train, X_test, Y_test = load_mnist_data(100, 50)
# Display first training image
i = 1
display_digit(X_train[:, :, i], title="Digit " * string(argmax(Y_train[:, i]) - 1))

MachineLearningCourse.Lecture03.evaluate — Method
evaluate(network::VanillaDNN, X_test, Y_test, classes)
Comprehensive evaluation of the DNN model with confusion matrix and per-class metrics.
Arguments
- network::VanillaDNN: Trained network
- X_test::Vector{Vector{Float32}}: Test input data
- Y_test::Vector{Vector{Float32}}: Test target labels (one-hot encoded)
- classes: Vector of class labels or number of classes
Returns
NamedTuple: (accuracy=Float32, predictions=Vector{Int}, true_labels=Vector{Int}, confusion_matrix=Matrix{Int})
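A usage sketch, destructuring the returned NamedTuple (passing the digit labels 0-9 as the classes argument is an assumption):

results = evaluate(network, X_test, Y_test, collect(0:9))
println("Accuracy: ", results.accuracy)
results.confusion_matrix   # 10×10 matrix of prediction counts per class pair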
MachineLearningCourse.Lecture03.flux_demo — Method
flux_demo(; train_size=1000, epochs=10)
Minimal MNIST demonstration using Flux.jl and gradient descent.
Arguments
- train_size: Number of training samples (default: 1000)
- epochs: Number of training epochs (default: 10)
Returns
- Trained model and final accuracy
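For reference, a minimal Flux.jl training loop of the kind flux_demo likely wraps; the model shape, loss, and optimiser here are assumptions, not the package's exact choices:

using Flux

# Hypothetical sketch, not flux_demo's actual body.
# Dummy random data stands in for MNIST: 784 × n inputs, 10 × n one-hot targets.
X = rand(Float32, 784, 64)
Y = Flux.onehotbatch(rand(0:9, 64), 0:9)

model = Chain(Dense(784 => 128, relu), Dense(128 => 10))
loss(m, x, y) = Flux.logitcrossentropy(m(x), y)

opt_state = Flux.setup(Descent(0.01), model)      # plain gradient descent
for epoch in 1:10
    Flux.train!(loss, model, [(X, Y)], opt_state)
end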
MachineLearningCourse.Lecture03.forwardpropagation — Method
forwardpropagation(network, x)
Compute forward propagation through the neural network.
Mathematical formulation:
- z^[l] = W^[l] * a^[l-1] + b^[l]
- a^[l] = ϕ(z^[l]) for hidden layers, a^[L] = z^[L] for the output layer
Arguments
- network::VanillaDNN: Neural network structure
- x::Vector{Float32}: Input vector
Returns
Tuple{OffsetVector{Vector{Float32}}, Vector{Vector{Float32}}}: (activations, z_values)
- activations: [a^[0], a^[1], ..., a^[L]] - activations for each layer
- z_values: [z^[1], z^[2], ..., z^[L]] - linear combinations for each layer
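A hedged sketch of the forward pass that mirrors the formulas above (a hypothetical helper, not the package's source):

using OffsetArrays

function forward_sketch(network, x::Vector{Float32})
    activations = OffsetVector(Vector{Vector{Float32}}(undef, network.L + 1), 0:network.L)
    z_values = Vector{Vector{Float32}}(undef, network.L)
    activations[0] = x                                        # a^[0] = x
    for l in 1:network.L
        z_values[l] = network.W[l] * activations[l-1] .+ network.b[l]
        # ReLU on hidden layers, identity on the output layer
        activations[l] = l == network.L ? z_values[l] : ϕ.(z_values[l])
    end
    return activations, z_values
end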
MachineLearningCourse.Lecture03.load_mnist_data — Function
load_mnist_data(train_size=5000, test_size=1000)
Load the MNIST handwritten digit dataset.
Arguments
- train_size::Int: Number of training samples to use (default: 5000)
- test_size::Int: Number of test samples to use (default: 1000)
Returns
Tuple: (X_train, Y_train, X_test, Y_test)
- X_train::Array{Float32, 3}: Training images (28 × 28 × samples), values 0-1
- Y_train::Matrix{Float32}: Training labels as one-hot encoded matrix (10 × samples)
- X_test::Array{Float32, 3}: Test images (28 × 28 × samples), values 0-1
- Y_test::Matrix{Float32}: Test labels as one-hot encoded matrix (10 × samples)
Example Usage - Image Visualization
# Load data
X_train, Y_train, X_test, Y_test = load_mnist_data(100, 50)
# Visualize the first training image (see display_digit above)
display_digit(X_train[:, :, 1])

MachineLearningCourse.Lecture03.predict — Method
predict(network, x)
Make predictions using the trained neural network.
Performs forward propagation to compute network output ŷ.
Arguments
- network::VanillaDNN: Trained neural network
- x::Vector{Float32}: Input vector
Returns
Vector{Float32}: Network predictions (output layer activations)
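For example, turning the raw output into a digit label (one-hot classes 0-9 assumed, as elsewhere in this lecture):

ŷ = predict(network, vec(X_test[:, :, 1]))   # flatten a 28×28 image to 784 values
digit = argmax(ŷ) - 1                        # indices 1-10 map to digits 0-9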
MachineLearningCourse.Lecture03.train! — Function
train!(network, X, Y, η=0.01, epochs=1000, verbose=true)
Train the neural network using gradient descent.
Implements the complete training algorithm:
- Forward propagation
- Loss computation
- Backpropagation
- Parameter update
Arguments
- network::VanillaDNN: Neural network (modified in-place)
- X::Vector{Vector{Float32}}: Training input data
- Y::Vector{Vector{Float32}}: Training target data
- η::Float32: Learning rate (default: 0.01)
- epochs::Int: Number of training epochs (default: 1000)
- verbose::Bool: Print training progress (default: true)
Returns
Vector{Float32}: Training losses for each epoch
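Putting the pieces together, a hedged end-to-end sketch that reshapes the loaded arrays into the vector-of-vectors layout train! expects:

X_train, Y_train, X_test, Y_test = load_mnist_data(1000, 200)

# Flatten 28×28 images into 784-element vectors; one column vector per label
X = [vec(X_train[:, :, i]) for i in 1:size(X_train, 3)]
Y = [Y_train[:, i] for i in 1:size(Y_train, 2)]

network = VanillaDNN([784, 128, 64, 10])
losses = train!(network, X, Y, 0.001f0, 50)   # Float32 learning rate, positional args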
MachineLearningCourse.Lecture03.update_parameters! — Method
update_parameters!(network, ∇W, ∇b, η)
Update network parameters using gradient descent.
Parameter updates:
- W^[l] ← W^[l] - η * ∂ℒ/∂W^[l]
- b^[l] ← b^[l] - η * ∂ℒ/∂b^[l]
Arguments
- network::VanillaDNN: Neural network (modified in-place)
- ∇W::Vector{Matrix{Float32}}: Weight gradients
- ∇b::Vector{Vector{Float32}}: Bias gradients
- η::Float32: Learning rate
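The body is presumably a direct transcription of the update rule above; a hedged sketch:

# Hypothetical sketch of the in-place update, not the package's exact source
for l in 1:network.L
    network.W[l] .-= η .* ∇W[l]
    network.b[l] .-= η .* ∇b[l]
end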
MachineLearningCourse.Lecture03.ϕ — Method
ϕ(z)
ReLU activation function: ϕ(z) = max(0, z).
Arguments
z::Real: Input value
Returns
Float32: Activated value (0.0f0 if z ≤ 0, z if z > 0)
MachineLearningCourse.Lecture03.∂ϕ_∂z — Method
∂ϕ_∂z(z)
Derivative of ReLU activation function: ∂ϕ/∂z = 1 if z > 0, 0 if z ≤ 0.
Arguments
z::Real: Input value
Returns
Float32: Derivative value (1.0f0 if z > 0, 0.0f0 if z ≤ 0)
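Both functions are one-liners; plausible definitions consistent with the documented behavior (the exact source may differ):

ϕ(z::Real)::Float32 = max(0.0f0, Float32(z))      # ReLU
∂ϕ_∂z(z::Real)::Float32 = z > 0 ? 1.0f0 : 0.0f0   # ReLU derivative (0 at z = 0)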