Lecture 06 - Graph convolutional networks (GCNs) for collaborative filtering

MachineLearningCourse.Lecture06 (Module)
Lecture06

Graph convolutional networks (GCNs) for collaborative filtering.

Available Functions

  • demo(): Graph convolutional network demo for collaborative filtering using MovieLens data
  • movie_explorer(embeddings, data): Interactive menu for exploring movie similarities using trained embeddings.

Usage

using MachineLearningCourse
Lecture06.demo()

or

_, embeddings, data = Lecture06.demo(interactive=false)
Lecture06.movie_explorer(embeddings, data)
source
MachineLearningCourse.Lecture06.GCN (Method)
GCN(graph::Graph; epochs=50, embedding_sizes=[64, 32], η=0.001, batch_size=1024)

Create and train a Graph Convolutional Network.

Arguments

  • graph::Graph: Complete graph with nodes and edges using consecutive indices
  • epochs::Int: Number of training epochs (default: 50)
  • embedding_sizes::Vector{Int}: Embedding dimensions for each layer (default: [64, 32])
  • η::Float64: Learning rate for Adam optimizer (default: 0.001)
  • batch_size::Int: Mini-batch size for SGD (default: 1024)

Returns

  • Tuple: (model, embeddings, losses)
source
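As a sketch of what each layer of such a network computes, the standard GCN propagation rule H′ = σ(Â H W) with a symmetrically normalized adjacency Â can be written in plain Julia. This is the textbook formulation, assumed rather than read from this module's implementation; the function names here are illustrative.

```julia
using LinearAlgebra

# Symmetrically normalized adjacency Â = D^(-1/2) (A + I) D^(-1/2)
# (standard GCN normalization with self-loops; an assumption, not this module's code)
function normalize_adjacency(A::Matrix{Float32})
    Ã = A + I                        # add self-loops
    d = vec(sum(Ã, dims=2))          # node degrees
    D⁻½ = Diagonal(1 ./ sqrt.(d))
    return D⁻½ * Ã * D⁻½
end

# One GCN layer: H′ = σ(Â H W), here with a ReLU nonlinearity
gcn_layer(Â, H, W) = max.(0f0, Â * H * W)

A = Float32[0 1 0; 1 0 1; 0 1 0]     # tiny 3-node path graph
H = rand(Float32, 3, 4)              # initial 4-dimensional node embeddings
W = rand(Float32, 4, 2)              # layer weights mapping 4 → 2 dimensions
H′ = gcn_layer(normalize_adjacency(A), H, W)
size(H′)  # (3, 2): one 2-dimensional embedding per node
```

Stacking such layers with decreasing widths is what `embedding_sizes=[64, 32]` describes.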
MachineLearningCourse.Lecture06.create_adjacency_matrix (Method)
create_adjacency_matrix(graph::Graph; use_weights=false)

Create the adjacency matrix of the graph.

Arguments

  • graph::Graph: Complete graph with nodes and edges using consecutive indices
  • use_weights::Bool: If true, use edge weights; if false, use 1.0 for all edges (default: false)

Returns

  • A: Adjacency matrix
source
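A minimal sketch of what building the adjacency matrix from an edge list involves. The tuple-based edge representation and the `adjacency` helper are hypothetical placeholders, not this module's actual `Graph` type.

```julia
# Hypothetical edge representation: (src, dst, weight) over 1-based node indices
edges = [(1, 2, 4.0f0), (2, 3, 5.0f0), (1, 3, 3.0f0)]
n = 3

function adjacency(n, edges; use_weights=false)
    A = zeros(Float32, n, n)
    for (i, j, w) in edges
        v = use_weights ? w : 1.0f0
        A[i, j] = v
        A[j, i] = v      # undirected user–item graph: keep the matrix symmetric
    end
    return A
end

A  = adjacency(n, edges)                    # binary adjacency
Aw = adjacency(n, edges, use_weights=true)  # rating-weighted adjacency
```

With `use_weights=true` the ratings become edge weights, so Aw[1, 2] carries the 4.0 rating instead of 1.0.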
MachineLearningCourse.Lecture06.demo (Function)
demo(dataset_size="100k"; epochs=100, embedding_sizes=[128, 64], η=0.001, batch_size=1024, interactive=false)

Demonstrate collaborative filtering using GCN on MovieLens data.

Arguments

  • dataset_size: MovieLens dataset size ("100k" or "1m") (default: "100k")
  • epochs: Number of training epochs (default: 100)
  • embedding_sizes::Vector{Int}: Embedding dimensions for each layer (default: [128, 64])
  • η::Float64: Learning rate for Adam optimizer (default: 0.001)
  • batch_size::Int: Mini-batch size for SGD (default: 1024)
  • interactive::Bool: If true, prompt for a movie index to find similar movies (default: false)

Returns

  • Tuple: (model, embeddings, data, losses, test_rmse)

Example

# Run with default 100k dataset
Lecture06.demo()
# Learn embeddings without exploring similar movies
model, embeddings, data = Lecture06.demo(interactive=false)
# Run with 1m dataset and custom architecture (slow)
Lecture06.demo("1m", embedding_sizes=[256, 128, 64])
source
MachineLearningCourse.Lecture06.load_movielens_data (Function)
load_movielens_data(dataset_size="100k"; train_ratio=0.8)

Load MovieLens dataset using MLDatasets.jl and split into train/test sets.

Arguments

  • dataset_size::String: Dataset size ("100k", "1m", "10m", "20m") (default: "100k")
  • train_ratio::Float32: Proportion for training set (default: 0.8f0)

Returns

  • NamedTuple with train_graph, test_graph, n_users, n_items
source
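The train/test split can be sketched as a random partition of the rating edges; `shuffle` comes from Julia's Random stdlib, the toy `ratings` data is illustrative, and the 0.8 ratio matches the documented default.

```julia
using Random

# Toy (user, item, rating) triples standing in for MovieLens edges
ratings = [(u, i, r) for u in 1:5 for (i, r) in [(1, 3.0f0), (2, 4.0f0)]]
train_ratio = 0.8

perm = shuffle(eachindex(ratings))               # random permutation of edge indices
n_train = round(Int, train_ratio * length(ratings))
train_set = ratings[perm[1:n_train]]
test_set  = ratings[perm[n_train+1:end]]

length(train_set), length(test_set)  # (8, 2)
```

Splitting on edges rather than nodes keeps every user and item visible during training while holding out individual ratings for evaluation.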
MachineLearningCourse.Lecture06.movie_explorer (Method)
movie_explorer(embeddings, data)

Interactive menu for exploring movie similarities using trained embeddings.

Arguments

  • embeddings::Matrix{Float32}: Trained node embeddings from GCN
  • data: Dataset containing graph and metadata

Example

_, embeddings, data = Lecture06.demo(interactive=false)
Lecture06.movie_explorer(embeddings, data)
source
MachineLearningCourse.Lecture06.similarities (Method)
similarities(E₁, E₂)

Compute cosine similarities between two embedding matrices.

Arguments

  • E₁::Matrix{Float32}: First set of embeddings (batch_size, embed_dim)
  • E₂::Matrix{Float32}: Second set of embeddings (batch_size, embed_dim)

Returns

  • Vector{Float32}: Cosine similarities
source
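Row-wise cosine similarity between two embedding matrices can be sketched with the LinearAlgebra stdlib; this is a generic implementation of the formula, not this module's code, and the function name is illustrative.

```julia
using LinearAlgebra

# Cosine similarity between corresponding rows of E₁ and E₂:
# cos(a, b) = ⟨a, b⟩ / (‖a‖ ‖b‖)
function cosine_similarities(E₁::Matrix{Float32}, E₂::Matrix{Float32})
    [dot(a, b) / (norm(a) * norm(b)) for (a, b) in zip(eachrow(E₁), eachrow(E₂))]
end

E₁ = Float32[1 0; 0 1]
E₂ = Float32[1 0; 1 0]
s = cosine_similarities(E₁, E₂)  # rows: identical → 1.0, orthogonal → 0.0
```

Cosine similarity ignores embedding magnitude, which is why it is a natural choice for ranking "most similar movies" from learned node embeddings.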
MachineLearningCourse.Lecture06.train! (Method)
train!(model, graph; epochs=100, η=0.001, batch_size=1024)

Train GCN model to predict edge weights using mini-batch gradient descent.

Arguments

  • model::NamedTuple: GCN model with initial_embeddings, gcn_layers, and Â
  • graph::Graph: Training graph with edges using consecutive indices
  • epochs::Int: Number of training epochs (default: 100)
  • η::Float32: Learning rate for Adam optimizer (default: 0.001)
  • batch_size::Int: Mini-batch size for SGD (default: 1024)

Returns

  • Vector{Float32}: Training losses per epoch
source
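The mini-batch training loop can be sketched in a stripped-down form: shuffle the rating edges each epoch, walk over them in batches, and take gradient steps on a squared-error edge-weight loss. Everything here is an assumption for illustration: the real `train!` propagates through the GCN layers and uses Adam, whereas this toy uses plain SGD on fixed-size user/item embedding tables.

```julia
using Random, Statistics

# Toy setting: predict rating r ≈ ⟨u_emb, i_emb⟩ and minimize squared error.
n_users, n_items, dim = 20, 10, 4
U = 0.1f0 * randn(Float32, n_users, dim)     # user embeddings
V = 0.1f0 * randn(Float32, n_items, dim)     # item embeddings
edges = [(rand(1:n_users), rand(1:n_items), rand(1f0:5f0)) for _ in 1:500]

η, batch_size = 0.05f0, 128
losses = Float32[]
for epoch in 1:20
    perm = shuffle(eachindex(edges))         # reshuffle edges every epoch
    batch_losses = Float32[]
    for start in 1:batch_size:length(edges)
        idxs = perm[start:min(start + batch_size - 1, length(edges))]
        for (u, i, r) in edges[idxs]
            pred = sum(U[u, :] .* V[i, :])   # dot-product edge-weight prediction
            err = pred - r
            gu = 2f0 * err .* V[i, :]        # ∂(err²)/∂U[u,:]
            gv = 2f0 * err .* U[u, :]        # ∂(err²)/∂V[i,:]
            U[u, :] .-= η .* gu              # plain SGD step (real code: Adam)
            V[i, :] .-= η .* gv
            push!(batch_losses, err^2)
        end
    end
    push!(losses, mean(batch_losses))        # one loss entry per epoch
end
```

The returned `losses` vector (one mean squared error per epoch) mirrors the per-epoch losses documented above and should trend downward as training proceeds.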