Course Description:
This course introduces modern machine learning techniques for developing and deploying generative
models across a range of data modalities. Topics include autoregressive models for text and image
generation, such as transformers and GPT-style architectures, with sampling carried out sequentially
from conditional distributions. The course also covers image generation approaches including variational
autoencoders (VAEs), generative adversarial networks (GANs), diffusion models, energy-based models,
and flow-based methods. Emphasis is placed on model architecture design, training objectives (such as
maximum likelihood estimation, adversarial losses, and denoising losses), and inference methods including
autoregressive sampling, Langevin dynamics, and Markov chain Monte Carlo (MCMC). Additional topics
include multimodal generation, representation learning, foundation model pre-training and post-training,
alignment, evaluation techniques, and applications.
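To make the autoregressive sampling idea above concrete, here is a minimal PyTorch sketch of sequential sampling from conditional distributions. It assumes a hypothetical model that maps a (1, T) tensor of token ids to (1, T, V) next-token logits; the names and defaults are illustrative, not course code.

import torch

def sample_autoregressive(model, prompt_ids, max_new_tokens=50, temperature=1.0):
    # Each step samples the next token x_t from the conditional p(x_t | x_<t).
    ids = prompt_ids
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :]                      # logits for the next position
        probs = torch.softmax(logits / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # draw one token
        ids = torch.cat([ids, next_id], dim=-1)            # append; condition on it next step
    return ids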
Prerequisites:
Students should have prior coursework in machine learning at the level of CS 446 or equivalent, along with
a strong background in probability and statistics. Mathematical maturity is expected, including familiarity
with advanced linear algebra, multivariable calculus, and abstract probabilistic concepts. Experience with
Python programming and numerical libraries such as NumPy and PyTorch is required. Prior exposure to
generative models is helpful but not necessary.
Class Meeting Time:
Mon, Wed, 9:30am–10:45am, SC 0216.
Lectures will be conducted in person and via Zoom. The Zoom sessions will be recorded.
Instructor: Prof. Tong Zhang
Email: tozhang@illinois.edu
Office Hour: Mon 11:00am – 12:00pm
Office Location: SC 2118 or via Zoom
Teaching Assistant:
Shivanshu Shekhar
Email: shekhar6@illinois.edu
Office Hour: Fri 12:00pm – 1:00pm
Office Location: SC 1112 or via Zoom
Course Material:
Lecture slides (distributed before each lecture)
Reference papers
Course Resources:
Website: https://courses.grainger.illinois.edu/cs498tzu/sp2026/
Gradescope: accessible from Canvas
Course Credit:
Undergraduate: 3
Graduate: 3 or 4
Grading:
3-credit: best four of six homework assignments (55%) + in-class exams (45%)
4-credit: all six homework assignments (55%) + in-class exams (45%)
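As a concrete reading of the grading formulas, the sketch below computes a 3-credit final score in Python. It assumes, beyond what is stated above, that the four counted homeworks are averaged with equal weight, and likewise the exams; the function name and the scores are hypothetical.

def three_credit_grade(homework_scores, exam_scores):
    # Keep the best four of six homework scores (equal weighting is an assumption).
    best_four = sorted(homework_scores, reverse=True)[:4]
    hw_avg = sum(best_four) / 4
    exam_avg = sum(exam_scores) / len(exam_scores)  # exams also assumed equally weighted
    return 0.55 * hw_avg + 0.45 * exam_avg

# Example: homework scores [90, 80, 70, 95, 60, 85] and exam scores [88, 92, 75]
# give 0.55 * 87.5 + 0.45 * 85.0 = 86.375.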
Lecture Topics (tentative):
Lecture 01 (Jan 21): Introduction
Part I: Review of basic neural architectures, optimization, representation learning
Lecture 02 (Jan 26): Fully connected neural networks
Lecture 03 (Jan 28): Stochastic optimization
Lecture 04 (Feb 2): Convolutional neural networks
Homework #1
Lecture 05 (Feb 4): Neural sequence models
Lecture 06 (Feb 9): Encoder, decoder, and autoencoder
Lecture 07 (Feb 11): Variational autoencoder
Homework #2
Lecture 08 (Feb 16): Contrastive learning
Lecture 09 (Feb 18): Part I in-class exam
Part II: Large Language Model Text Generation
Lecture 10 (Feb 23): Transformer
Lecture 11 (Feb 25): Encoder-only, encoder-decoder, and decoder-only transformer models
Lecture 12 (March 2): Advanced transformer designs
Homework #3
Lecture 13 (March 4): Instruction tuning
Lecture 14 (March 9): Scaling laws, chain-of-thought (CoT), and emergent abilities
Lecture 15 (March 11): Resource-efficient training
Homework #4
Spring break
Lecture 16 (March 23): Reinforcement learning from human feedback
Lecture 17 (March 25): Reasoning and verifiable rewards
Lecture 18 (March 30): Evaluation
Lecture 19 (April 1): Part II in-class exam
Part III: Multimodal Models and Image Generation
Lecture 20 (April 6): Visual language models
Lecture 21 (April 8): Autoregressive image generation
Lecture 22 (April 13): Energy-based models
Homework #5
Lecture 23 (April 15): MCMC and Langevin sampling
Lecture 24 (April 20): Generative adversarial networks
Lecture 25 (April 22): Flow matching
Homework #6
Lecture 26 (April 27): U-Net and diffusion transformers
Lecture 27 (April 29): Score matching
Lecture 28 (May 4): Diffusion models
Lecture 29 (May 6): Part III in-class exam