Synchronization

Synchronization manages the timing of concurrent operations, typically by making one or more operations wait until some condition is met by another operation.

Of our four broad types of concurrency, synchronization is primarily a concern for threads and for processes that share resources.

Synchronization of threads and shared-resource processes has been a topic of study in computer science at least since the 1960s, and many strategies for synchronizing have been designed, implemented, and studied. These broadly fall into two categories: synchronization primitives and atomic operations. This page gives an overview of the most common components of each.

1 Atomic Operations

Atomic operations are a key concept in lock-free data structures, a family of tools for high-efficiency synchronization in cases where contention between multiple threads for a single resource is rare. Atomic operations resemble other assembly instructions, but they carry an additional guarantee: all of their work will happen without a separate thread accessing the same resources. Any operation or set of operations that guarantees this kind of exclusive access to its resources during its operation is called atomic. Atomic operations are quite a bit less efficient than their non-atomic counterparts and should be used only when atomicity is required.

Common atomic operations include test-and-set, atomic exchange, fetch-and-add, compare-and-swap (often abbreviated CAS), and atomic loads and stores.
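Python does not expose hardware atomics directly, but the semantics of compare-and-swap, and the retry loop that lock-free code builds on top of it, can be sketched as ordinary functions. The `Cell` class and function names below are illustrative, not a real API; on real hardware the entire body of `compare_and_swap` executes as one indivisible instruction.

```python
class Cell:
    """A mutable memory cell (illustrative stand-in for a machine word)."""
    def __init__(self, value):
        self.value = value

def compare_and_swap(cell, expected, new):
    # On real hardware this whole check-and-update is one indivisible step;
    # this Python version only models the semantics, not the atomicity.
    if cell.value == expected:
        cell.value = new
        return True
    return False

def atomic_increment(cell):
    # Classic lock-free retry loop: re-read and retry until no other
    # thread changed the value between our read and our CAS.
    while True:
        old = cell.value
        if compare_and_swap(cell, old, old + 1):
            return

c = Cell(41)
atomic_increment(c)
print(c.value)   # -> 42
```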

2 Synchronization primitives

All synchronization primitives share one defining property: they can block a thread, pausing its execution until some condition is met by another thread.

There are many synchronization primitives; the following are those I see the most often in code I read and write.

2.1 Lock

A lock has two states: locked and unlocked. It also has two functions/methods: acquire and release. Acquire changes the lock from unlocked to locked, blocking until that can be done. Release changes the lock from locked to unlocked.

           locked                        unlocked
acquire    block                         change to locked
release    change to unlocked;           Error
           resume any threads blocked
           on this lock’s acquire

The most common use of locks is to implement mutual exclusion: a chunk of code (called a critical section) that only one thread can access at a time. This tends to look like

lock.acquire()

do things
using only one thread
at a time

lock.release()

Releasing after doing the work is a problem if anything in the critical section can crash or raise an error, leaving the lock locked forever, so it’s often actually written as

lock.acquire()
try:
    do things
    using only one thread
    at a time
finally:
    lock.release()

The finally-based pattern is so common that it often has its own language-level syntax, such as in Java

synchronized(lock) {
    // do things
    // using only one thread
    // at a time
}
// lock has been released here

or in Python

with lock:
    ... # do things
    ... # using only one thread
    ... # at a time
# lock has been released here

Bathroom stalls act like locks: when you acquire it (try to enter the stall) you either get exclusive access to it until you release it (leave the stall) or you block (wait) until it’s available.

The primary function of a lock is to provide mutual exclusion: a thread that acquires the lock excludes others from doing so in a symmetric (mutual) way. Because of this, a lock is often called a mutex.
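As a concrete, runnable version of the critical-section pattern above, here is a Python sketch in which four threads each increment a shared counter many times; the lock ensures that each read-modify-write of `counter` happens one thread at a time, so no increments are lost.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:              # critical section: only one thread at a time
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                  # -> 40000
```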

2.2 Condition Variable

A condition variable is like a lock with a special temporary-release mode. It has acquire and release functions/methods just like a lock, but also wait, notify, and notify_all methods.

wait acts as a temporary release: it releases the lock, then blocks until it has both (a) been notified and (b) reacquired the lock.

            locked                        unlocked
acquire     block                         change to locked
release     change to unlocked;           Error
            resume any threads blocked
            on acquire
wait        release; block until          Error
            notify, then acquire
notify      unblock one thread            unblock one thread
            blocked on wait               blocked on wait
notify_all  unblock all threads           unblock all threads
            blocked on wait               blocked on wait

This may seem strangely complicated, but it allows a particular pattern that is very useful, namely locking until some arbitrarily-complicated condition is met:

cv.acquire()
while not complicated_condition_is_met():
    cv.wait()

do things using only one thread at a time
that change the complicated condition for some thread

cv.notify_all()
cv.release()

Because the acquire/release pair is essential, it is often facilitated with try/finally or a language-level syntax like Python’s:

with cv:
    while not condition_met():
        cv.wait()
    ... # do things
    ... # using only one thread
    ... # at a time
    cv.notify_all()
# lock has been released here

notify or notify_all?

Using notify is generally more efficient than notify_all, but it can also cause deadlock: if several threads are waiting and only some of them would be enabled by the notification, the one thread that gets unblocked (chosen more or less arbitrarily) may be unable to make progress, leaving every thread blocked. There’s much more to say here, but for a first course in synchronization it’s probably safer to use the notify_all version unless you are sure notify will work.

I sometimes eat at a buffet where each dish has a condition variable-like interface. Only one person can access a given dish at a time (making it like a lock), but if that dish is empty they can wait for it to be refilled. Workers can also refill the dishes.

In other words, as a customer with a specific food I like I run this:

food_station.acquire()
while not food_station.has_food():
    food_station.wait()
food_station.serve(me)
food_station.notify()
food_station.release()

The worker who refills the food uses it more like a lock, kicking off this process periodically as more food is cooked and ready to serve:

food_station.acquire()
food_station.refill()
food_station.notify()
food_station.release()
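The buffet scenario above can be made runnable with Python’s threading.Condition. The `FoodStation` class and its method names are made up for illustration; the customer waits inside the standard acquire/check/wait loop, and the worker’s refill plays the notify role.

```python
import threading

class FoodStation:
    """Illustrative buffet dish guarded by a condition variable."""
    def __init__(self):
        self.cv = threading.Condition()
        self.servings = 0

    def serve_customer(self):
        with self.cv:                    # acquire
            while self.servings == 0:    # wait until there is food
                self.cv.wait()
            self.servings -= 1
        # lock is released on leaving the with-block

    def refill(self, n):
        with self.cv:
            self.servings += n
            self.cv.notify_all()         # wake any waiting customers

station = FoodStation()
customer = threading.Thread(target=station.serve_customer)
customer.start()          # customer may block waiting for food
station.refill(3)         # worker refills and notifies
customer.join()
print(station.servings)   # -> 2  (one serving taken from three)
```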

Some implementations of condition variables treat them as two elements: a standard Lock that handles the acquire and release and a separate condition variable that handles the wait, notify, and notify_all. Others (including Python and this page) treat the lock as a superclass or field of the condition variable, recognizing that the split-type condition variable does not make sense without a corresponding lock.

One particular way of using a condition variable, called a monitor, ties the various condition variable operations to changes in attribute values of an object, creating one type of synchronized, thread-safe class.

2.3 Reader-writer lock

A reader-writer lock can be locked in two ways. If acquired in exclusive or write mode, it acts just like a regular lock. If acquired in shared or read mode, any number of threads can acquire it simultaneously, but all must release it before it can be acquired in exclusive/write mode.

Reader-writer locks are more common for file systems than for threads, where the file itself acts as the data you lock on. In both C (via #include <sys/file.h>) and Python (via import fcntl) the flock function takes two arguments: the file to lock or unlock and an enumeration value that is one of LOCK_SH (acquire in shared/read mode), LOCK_EX (acquire in exclusive/write mode), or LOCK_UN (release); LOCK_NB can be or-ed with either acquire mode to fail immediately instead of blocking.

Reader-writer locks can be implemented for threads using a condition variable and a count of the number of readers holding the lock. Many reader-writer locks prioritize writers over readers by also having a count of the number of writers waiting for the lock. An example of such an implementation can be found on Wikipedia.
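Here is a minimal sketch of such an implementation in Python, using a condition variable and a count of readers. Unlike the writer-prioritizing versions mentioned above, this simple form has no waiting-writer count, so a steady stream of readers can starve a writer.

```python
import threading

class ReadWriteLock:
    """A simple reader-writer lock built from a condition variable."""
    def __init__(self):
        self.cv = threading.Condition()
        self.readers = 0        # threads currently holding the lock in shared mode
        self.writer = False     # True while a thread holds the lock exclusively

    def acquire_read(self):
        with self.cv:
            while self.writer:
                self.cv.wait()
            self.readers += 1

    def release_read(self):
        with self.cv:
            self.readers -= 1
            if self.readers == 0:
                self.cv.notify_all()   # a waiting writer may now proceed

    def acquire_write(self):
        with self.cv:
            while self.writer or self.readers > 0:
                self.cv.wait()
            self.writer = True

    def release_write(self):
        with self.cv:
            self.writer = False
            self.cv.notify_all()       # wake both waiting readers and writers

rw = ReadWriteLock()
rw.acquire_read(); rw.acquire_read()   # two readers share the lock
rw.release_read(); rw.release_read()
rw.acquire_write()                     # now one writer holds it exclusively
rw.release_write()
```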

Museums operate a kind of reader-writer lock. Visitors acquire the lock in shared mode when they arrive and release it when they leave, allowing any number of visitors to browse the exhibits at the same time. The curator acquires the lock in exclusive mode to set up a new exhibit and releases it when the exhibit is ready, ensuring that visitors can’t get in the way of the exhibit installation.

2.4 Semaphore

Semaphores were among the first synchronization primitives used in computing. They come in two types: binary and counting.

Binary semaphores are used as a type of communication between threads: one thread blocks on it, then later another thread unblocks it.

Counting semaphores are used like locks that can be acquired several times before they block.
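Both uses can be sketched with Python’s threading.Semaphore. The first half shows a binary semaphore used for signaling (one thread blocks until another releases it); the second shows a counting semaphore acting as a lock that can be acquired several times before blocking.

```python
import threading

# Binary semaphore as signaling: starts at 0, so acquire() blocks
# until some other thread calls release().
ready = threading.Semaphore(0)
results = []

def worker():
    results.append("work done")
    ready.release()              # signal the waiting thread

threading.Thread(target=worker).start()
ready.acquire()                  # blocks until the worker signals
print(results[0])                # -> work done

# Counting semaphore: up to 3 threads may hold it at once.
pool = threading.Semaphore(3)
pool.acquire(); pool.acquire(); pool.acquire()   # a 4th acquire would block
pool.release()                                   # give one slot back
```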

I’ve never used either type of semaphore outside of low-level systems code like implementing operating-system functionality.

2.5 Barrier

A barrier acts in some ways like the opposite of a lock. A lock blocks if any other threads have acquired it; a barrier blocks until several threads are trying to pass it, then lets them all through at once.

Barriers are created with a count of the number n of threads needed to pass the barrier and have just one function: wait, which blocks the first n − 1 threads that call it and then lets them all unblock when the nth thread calls wait.

There was a ride at the amusement park near my childhood home that operated like a barrier. Each car sat four people, and the workers would only let a car go once it was full. Once there was no line and I was the second person to arrive and sit in a car; the two of us waited until two more people arrived before the worker let our car go.

Barriers are less versatile than some other synchronization primitives, but very helpful in specific situations. Two specific situations I’ve seen barriers used in repeatedly are handling a setup phase before a work phase and collecting intermediate results before deciding what to do next.

Setup phase barriers look like this:

b = Barrier(n)
have n threads each do:
    setup
    b.wait()
    do work

In this design, all threads will finish their setup before any of them start working, ensuring that no work messages are dropped because they are sent to a thread that’s not set up yet or the like.
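The setup-phase pattern above can be written directly with Python’s threading.Barrier. In this sketch each thread logs its setup and work steps; because no thread passes `b.wait()` until all have logged setup, the first n log entries are always setup entries.

```python
import threading

n = 4
b = threading.Barrier(n)
log = []   # list.append is safe to call from multiple threads in CPython

def worker(i):
    log.append(("setup", i))   # setup phase
    b.wait()                   # no thread starts work until all finish setup
    log.append(("work", i))    # work phase

threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
for t in threads:
    t.start()
for t in threads:
    t.join()

ordered = all(tag == "setup" for tag, _ in log[:n])
print(ordered)                 # -> True
```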

Intermediate result barriers look like this:

b = Barrier(n)
have n threads each do:
    do part of the job
    b.wait()
    if thread 1:
        look at preliminary results
        make a decision what to do next
    b.wait()
    do rest of job

In this design all threads finish the first part, then that completed first part is used by one thread, and only once that thread is done do the threads pick up with the next part of their work.
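The two-wait pattern above also maps directly onto threading.Barrier, which Python makes reusable across phases. In this sketch (with made-up work: each thread computes a square, and thread 0 sums them), the first wait collects all partial results and the second holds everyone until the decision is made.

```python
import threading

n = 3
b = threading.Barrier(n)
partials = [0] * n
decision = {}

def worker(i):
    partials[i] = (i + 1) ** 2    # do part of the job
    b.wait()                      # wait until every partial result is in
    if i == 0:                    # one thread inspects preliminary results
        decision["total"] = sum(partials)
    b.wait()                      # others wait until the decision is made
    # the rest of the job can now rely on decision["total"]

threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(decision["total"])          # -> 14  (1 + 4 + 9)
```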