CS 598 TMC, Fall 2023
Advanced Data Structures
Instructor
Timothy Chan
(tmc "at" illinois.edu, office hours: Thu 3:30-4:30, Siebel 3230)
TA: Yuancheng Yu (yyu51 "at" illinois.edu, office hr: Wed 6:00pm-7:00pm
on zoom)
Meeting Time
Wed & Fri 11:00am-12:15pm, Loomis Lab 136
Course Overview
This is a CS theory/algorithms course, covering selected topics in data structures, which go beyond what are typically taught in 2nd and 3rd-year undergraduate classes. Potential topics include: balanced search trees, priority queues (e.g., Fibonacci heaps), amortized analysis, the union-find problem, hashing, geometric data structures (e.g., range searching), approximate nearest neighbor search (e.g., locality-sensitive hashing), bit-packing techniques (e.g., fusion trees and succinct data structures), persistent data structures, dynamic graph algorithms (e.g., dynamic connectivity and shortest paths), distance oracles, strings and text indexing (e.g., suffix trees), I/O-efficient data structures, and (conditional) lower bounds.
Prerequisites: Solid background in algorithm
design and analysis, at the level of CS 374. (CS 473 is not required, though some previous exposure to randomized algorithms might be helpful.) Undergraduate students who did well in 374 or 473 are welcomed, and may request for an UG petition.
Course Work
- 4 homeworks (problem sets), worth 45%
- HW1 (due Sep 22 Fri 10am) (9/17: minor typo in Q4 fixed; 9/20: typo in Q3, see footnote)
- HW2 (due Oct 13 Fri 10am) (10/2: add some footnotes)
- submit homework on Gradescope
(entry code PWXV3D)
- presentation, worth 15%
- project (reading some papers and writing a report, or implementing and experimentally
evaluating data structures from papers, or doing original theoretical research), worth 40%
(Homeworks, presentations, and projects may be done in groups of at most 3.)
Tentative Outline
- Basics: RMQ, balanced search trees, heaps with decrease-key, union-find, ...
- Integers: hashing, vEB trees, fusion trees, ...
- Geometry: orthogonal range search, point location, approximate nearest neighbors/LSH, ...
- Graphs: dynamic connectivity, dynamic reachability, approximate distance oracles, ...
- Strings: suffix trees/arrays, ...
- Other models: succinct data structures, external memory, streaming/sketching, ...
Lectures
(There is no textbook; I'll provide links to relevant references below, and also scribbles from class.)
- Aug 23: First example: Range min queries (RMQ).
- Aug 25: RMQ cont'd (and LCA).
[Ref: Bender and Farach-Colton (2005),
or Fischer and Heun (2006)]
- Aug 30: Dynamic predecessor search and balanced search trees (2-3 trees etc.).
[Ref: wikipedia on 2-3 trees,
red-black trees,
AVL trees,
treaps,
splay trees,
etc.;
Andersson's AA trees (1993)]
- Sep 1: More balanced search trees ((a,b) trees, weight-balanced trees) and amortized analysis.
[Ref: Rolf Fagerberg's notes on
amortized analysis of (a,b) trees; Michiel Smid's notes on
amortized analysis of weight-balanced trees]
- Sep 6: Priority queues and heaps.
- Sep 8: Quake heaps (alternative to Fibonacci heaps).
[Ref: my paper (2013)]
- Sep 13: Union-find.
- Sep 15: Union-find cont'd (with inverse Ackermann).
[Ref: Ch2 of Tarjan's book,
Jeff's notes,
papers by la Poutre (1990) and Seidel and Sharir (2005)]
- Sep 20: Hashing.
- Sep 22: Hashing cont'd (universal, perfect, Fredman-Komlos-Szemeredi).
[Ref: Sec 8.4-8.5 of Motwani and Raghavan's book, or
Jeff's notes]
- Sep 27: More hashing (cuckoo). Integer predecessor search...
[Ref: Pagh and Rodler's paper (2004) (or the
"undergrad version")]
- Sep 29: Van Emde Boas trees.
[Ref: wikipedia]
- Oct 4: Fusion trees. Orthogonal range searching...
[Ref: Fredman and Willard's paper (1990)]
Other Resources