Course Description:
This course gives a rigorous treatment of foundational concepts in learning theory, with an emphasis on
theoretical analysis relevant to modern deep learning. Key topics include generalization analysis,
VC dimension, covering numbers, Rademacher complexity, stochastic gradient descent, common techniques for
lower bound analysis, universal approximation results, optimization in the Neural Tangent Kernel (NTK)
regime, benign overfitting, and mean-field analysis. Evaluation consists of four homework assignments, a
take-home midterm exam, a group final project, and class attendance. The course aims to equip students
with the theoretical foundations necessary to engage with current research literature.
Prerequisites:
Advanced linear algebra
Probability and statistics
Machine learning at the level of CS446
Strong mathematical skills at the level of mathematical statistics and real analysis
Class time:
Mon, Wed, 11:00am–12:15pm, SC 0216.
Lectures will be conducted in person and will not be recorded unless otherwise announced.
Instructor: Prof. Tong Zhang (tozhang@illinois.edu)
Office: SC 2118
Office Hours: Mon, 10:00am–10:50am
Course resources:
Grading:
Four theoretical homework assignments (60%)
One take-home midterm (20%), with a 3-day turnaround time
One group final project (15%), with 3–4 students per group
Each group reads a recent paper related to learning theory, gives a 20–25 minute presentation (5%), and writes an approximately five-page report (10%)
Class attendance, recorded via online sign-in (5%)
Course Material:
Lecture slides (distributed before each lecture)
Reference book: Mathematical Analysis of Machine Learning Algorithms
Paper readings
Lectures (tentative):
Introduction (1 lecture)
Probability inequalities (2 lectures)
Uniform convergence (2 lectures)
Covering numbers (2 lectures)
VC dimension (2 lectures)
Rademacher complexity (2 lectures)
Concentration inequalities (1 lecture)
Model selection (1 lecture)
Lower bounds (2 lectures)
SGD analysis (2 lectures)
Universal approximation (2 lectures)
Neural tangent kernel (2 lectures)
Benign overfitting (2 lectures)
Mean-field analysis (2 lectures)
Presentations (4 lectures)