Fall 2024-CS 539-ECE 526-Distributed Algorithms

Distributed algorithms are algorithms designed to run on multiple processes or mutually distrusting participants, often to increase performance through parallel processing or improve resilience by tolerating faults. Without centralized control, distributed algorithms typically require delicate coordination among processes. The concurrency, non-determinism, and potential faults make distributed algorithms hard to design and analyze compared to single-process algorithms. 

This course covers foundations and recent advances in distributed algorithms. The course studies fundamental problems in distributed computing, including graph algorithms, leader election, synchronization, consensus, broadcast, state machine replication, distributed shared memory, mutual exclusion, shared objects, and distributed cryptography. The course studies foundational models and assumptions regarding shared memory vs. message passing, synchrony vs. asynchrony, reliability of message delivery, crashes and Byzantine faults. The course introduces common algorithm design techniques, including synchronization, cryptographic primitives, leader election, randomization, optimistic cases, simulation, and quorums. 

This is a theory/algorithm course that emphasizes formal modeling, rigorous proofs, theoretical analysis, and fundamental limits in the form of lower bounds and impossibility results. While the course also touches on practical aspects whenever applicable, it will not be a priority, and many topics covered in the course will be of theoretical interest primarily at present.

Prerequisite: None required. However, a course on algorithms (like CS/ECE 374) OR distributed systems (like CS 425 / ECE 428) is strongly recommended. 

Distributed Algorithms (CS539) vs. Distributed Systems (CS425): If you are wondering how to choose between these two classes or if you should take both, here is some information for you. CS425 covers basic concepts ranging from algorithmics to systems design principles that underlie distributed systems, especially those deployed in the industry. CS539 focuses on the theoretical underpinnings of distributed algorithms with an emphasis on their formal modeling, rigorous proofs, and fundamental limits. Neither course is a prerequisite for the other. The courses have a small overlap for the sake of continuity so that you can (if you wish) take both courses in either order and still benefit from each course. Take CS425 first if you want an intro course to distributed systems. Take CS539 first if you love doing algorithmics and proofs. Both courses are intended to be accessible to undergrads (with appropriate prerequisites) as well as grad students at all levels.

Grading: five to six problem sets 60%; midterm 15%, final exams 20%; class participation 5%.    

Resources: No required textbook. The following textbooks and online resources may be helpful.

Lectures: Monday and Wednesday 11 am -- 12:15 pm in 0222 Siebel Center for Computer Science.

Office Hours: Monday 10 - 11 am in 4312 Siebel Center AND after each class in the lecture room.

Tentative Schedule: (subject to change)

Date Topic Suggested Reading
08/26 Introduction Download Introduction
08/28 Leader Election in a Ring Download Leader Election in a Ring An Improved Algorithm for Decentralized Extrema-Finding in Circular Configurations of Processes Links to an external site. 
09/02 No Class (Labor Day)
09/04 No Class (Instructor Away)
09/09 Basic Graph Algorithms Download Basic Graph Algorithms
09/11 Clock Synchronization Download Clock Synchronization
09/16 Synchronizers Download Synchronizers Complexity of Network SynchronizationLinks to an external site (Sections 1-3)
09/18 Causality, Logical Clocks, and Global States Download Causality, Logical Clocks, and Global States Time, Clocks and Ordring of Events in a Distributed System
09/23 Common Knowledge Download Common Knowledge Knowledge and common knowledge in a distributed environment Links to an external site.
09/25 Introduction to Consensus Download Introduction to Consensus The Byzantine Generals Problem (Sections 1-3)
09/30 Time Bounds of Consensus Download Time Bounds of Consensus A Simple Bivalency Proof that t-Resilient Consensus Requires t+1 Rounds
10/02 FLP Impossibility Download FLP Impossibility Impossibility of Distributed Consensus with One Faulty Process    
10/07 Randomized Agreement Download Randomized Agreement Another advantage of free choice        Modern Ben-Or
10/09 Partial Synchrony and Paxos Download Partial Synchrony and Paxos Partial synchrony Links to an external site.     The Part-Time Parliament      Paxos Made Simple
10/14 No Class (extra office hour)  
10/16 Midterm  
10/21 Distributed Shared Memory Download Distributed Shared Memory Sharing memory robustly in message-passing systems (Sections 1-4)
10/23 Shared Registers Download Shared Registers  
11/28 Mutual Exclusion Download Mutual Exclusion Peterson's Mutual Exclusion Algorithm Links to an external site.
11/30 Mutual Exclusion Download Mutual Exclusion
Lamport's Bakery Algorithm Links to an external site.
11/04 Mutual Exclusion Download Mutual Exclusion Mellor-Crummey and Scott Mutual Exclusion Algorithm Links to an external site.
11/06 Wait-free Hierarchy Download Wait-free Hierarchy Wait-free synchronization
11/11 Byzantine Broadcast Download Byzantine Broadcast Dolev-Strong Authenticated Broadcast Links to an external site.   
11/13 Reliable Broadcast Download Reliable Broadcast Bracha's Reliable Broadcast Links to an external site.
11/18 PBFT Download PBFT Practical Byzantine Fault Tolerance (Sections 1-4)
11/20 Fault Tolerance Bounds Download Fault Tolerance Bounds
11/25 No Class (Fall Break)
11/27 No Class (Fall Break)
12/02 Communication Bounds of Consensus Download Communication Bounds of Consensus Bounds on information exchange for Byzantine Agreement Links to an external site.
12/04 Nakamoto Consensus Download Nakamoto Consensus Bitcoin: A Peer-to-Peer Electronic Cash System    Bitcoin's Latency--Security Analysis Made Simple
12/09 No Class (extra office hour)
12/11 Final Exam