Synchronization manages the timing of concurrent operations, typically by making one or more operations wait until some condition is met by another operation.
Of our four broad types of concurrency, it is threads and shared-resource processes that most need synchronization. Synchronization of threads and shared-resource processes has been a topic of study in computer science at least since the 1960s, and many strategies for synchronizing have been designed, implemented, and studied. These broadly fall into two categories: synchronization primitives and atomic operations. This page gives an overview of the most common components of each.
Atomic operations are a key concept in lock-free data structures, a family of tools for high-efficiency synchronization in cases where contention between threads for a single resource is rare. Atomic operations resemble other assembly instructions, but they have an additional guarantee that all of their work will happen without a separate thread accessing the same resources. Any operation or set of operations that guarantees this kind of exclusive access to its resources during its operation is called atomic.
Atomic operations are quite a bit less efficient than their non-atomic counterparts and should be used only when atomicity is required.
Common atomic operations include:
Test-and-set, a single-bit operation that (a) sets a bit in memory to 1 and (b) returns whether that bit was changed by this operation (i.e., whether it was previously 0). This is one of the simplest of the atomic operations to implement in hardware, and many other atomic operations and synchronization primitives are built off of it. I’ve never used it outside of a class that describes how to implement synchronization, but it’s referred to with enough frequency to be worth knowing about.
Compare-and-set, a multi-bit version of test-and-set which only modifies a value if it had a specific previous value before the modification; in other words, something like

```c
if (p[0] == oldvalue) { p[0] = newvalue; return 1; }
else { return 0; }
```
Compare-and-set is widely used to implement lock-free data structures, such as linked lists that can be modified by multiple threads concurrently. The core idea of such approaches is something like this: read the current value, compute an updated value from it, and compare-and-set the result into place; if the compare-and-set fails because another thread got there first, retry from the beginning.
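A minimal sketch of that retry loop, pushing onto a lock-free stack. Python has no hardware compare-and-set instruction, so the `AtomicReference` class below simulates one with a lock; the names are mine, not a standard API:

```python
import threading

class AtomicReference:
    """Simulates a hardware compare-and-set (illustrative only;
    real lock-free code uses a CPU compare-and-set instruction)."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value is expected:  # unchanged since we looked?
                self._value = new
                return True
            return False

class Node:
    def __init__(self, value, next):
        self.value, self.next = value, next

head = AtomicReference(None)  # top of the stack

def push(value):
    while True:                      # retry on failure
        old = head.get()             # snapshot the current head
        if head.compare_and_set(old, Node(value, old)):
            return                   # nobody changed head: success
        # another thread changed head between get and CAS: start over
```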
There are many nuances to this idea, but the core of it (retry on failure) shows why atomic operations are best for situations where threads fighting over a single resource are rare.
Atomic modification operations, like `+=`. Under the hood, `x += 1` involves (a) loading `x`’s current value from memory, (b) computing the sum of that value and 1, and (c) writing the result back to memory. With a non-atomic version, there is the potential that some other thread will modify `x`’s memory after step (a), then have its work lost when (c) overwrites that memory. With an atomic version, that kind of race condition is prevented by excluding any other thread from accessing `x`’s memory between steps (a) and (c).
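Python has no atomic `+=` on plain variables, but a lock gives the same all-three-steps-at-once guarantee; a sketch:

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    with lock:        # no other thread can run steps (a)-(c) meanwhile
        counter += 1  # load, add 1, store -- now effectively atomic

threads = [threading.Thread(target=safe_increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now exactly 100; without the lock, some of the 100
# increments could be lost to the race described above
```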
All synchronization primitives share a common job: they make one or more threads wait until some condition is met by another thread.
There are many synchronization primitives; the following are those I see the most often in code I read and write. From Python’s `threading` library:
- A lock is single-user: once some thread `acquire`s it, other threads trying to do so have to wait for the first thread to `release` it.
- A reentrant lock lets a thread that has `acquire`d it `acquire` it again; each `acquire` must be matched with a `release` before the lock is fully released and another thread can `acquire` it.
- A condition variable is a lock you can temporarily relinquish: `wait`ing both releases the lock and causes the thread to wait until some other thread holding the lock `notify`s waiting threads that they can try to `acquire` it again.
- A reader-writer lock has two modes. The exclusive mode acts like a regular lock. The shared mode lets any number of threads acquire it simultaneously; they all have to release it before it can be acquired in exclusive mode. (Reader-writer locks are not in Python’s `threading` library, but are in its `fcntl` library and in several other languages’ threading libraries.)
- A semaphore is an atomic counter: `acquire` decrements it and `release` increments it. It cannot become negative: if a thread tries to `acquire` it when it is 0, that thread instead waits for some other thread to `release` it.
- An event has two modes: when it is `set`, threads that `wait` on it immediately proceed without waiting; when it is `clear`, threads that `wait` on it wait until it is `set`.
- A barrier queues up threads that `wait` on it until there are a pre-determined number waiting, then releases them all at once.
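A small sketch of the last two primitives in Python’s `threading` library, using an event to start two workers at once and a barrier to wait for them to finish (the worker logic is invented for illustration):

```python
import threading

start = threading.Event()        # begins in the clear state
finished = threading.Barrier(3)  # 2 workers + the main thread
results = []

def worker(n):
    start.wait()             # blocks until start.set() is called
    results.append(n * n)    # do this worker's (toy) job
    finished.wait()          # waits here until all 3 parties arrive

for n in (2, 3):
    threading.Thread(target=worker, args=(n,)).start()

start.set()      # both workers proceed at once
finished.wait()  # main thread is the barrier's third party
# results now holds 4 and 9, in whichever order the workers ran
```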
A lock has two states: locked and unlocked. It also has two functions/methods: acquire and release. Acquire changes the lock from unlocked to locked, blocking until that can be done. Release changes the lock from locked to unlocked.
| | locked | unlocked |
|---|---|---|
| acquire | block | change to locked |
| release | change to unlocked; resume any threads blocked on this lock’s acquire | Error |
The most common use of locks is to implement mutual exclusion: a chunk of code (called a critical section) that only one thread can access at a time. This tends to look like
```python
lock.acquire()
# do things
# using only one thread
# at a time
lock.release()
```
Releasing after doing the work has potential problems if one of the things in the critical section can crash, so it’s often actually written as
```python
lock.acquire()
try:
    ...  # do things
    ...  # using only one thread
    ...  # at a time
finally:
    lock.release()
```
The finally-based pattern is so common that it often has its own language-level syntax, such as in Java
```java
synchronized(lock) {
    // do things
    // using only one thread
    // at a time
}
// lock has been released here
```
or in Python
```python
with lock:
    ...  # do things
    ...  # using only one thread
    ...  # at a time
...  # lock has been released here
```
Bathroom stalls act like locks: when you acquire it (try to enter the stall) you either get exclusive access to it until you release it (leave the stall) or you block (wait) until it’s available.
The primary function of a lock is to provide mutual exclusion: a thread that acquires the lock excludes others from doing so in a symmetric (mutual) way. Because of this, a lock is often called a mutex.
A condition variable is like a lock with a special temporary-release mode. It has `acquire` and `release` functions/methods just like a lock, but also `wait`, `notify`, and `notify_all` methods. `wait` is like a temporary release: it releases the lock, then blocks until the thread is both (a) notified and (b) able to reacquire the lock.
| | locked | unlocked |
|---|---|---|
| acquire | block | change to locked |
| release | change to unlocked; resume any threads blocked on acquire | Error |
| wait | release; block until notify, then acquire | Error |
| notify | unblock one thread blocked on wait | unblock one thread blocked on wait |
| notify_all | unblock all threads blocked on wait | unblock all threads blocked on wait |
This may seem strangely complicated, but it allows a particular pattern that is very useful, namely locking until some arbitrarily-complicated condition is met:
```python
cv.acquire()
while not complicated_condition_is_met():
    cv.wait()
...  # do things using only one thread at a time
...  # that change the complicated condition for some thread
cv.notify_all()
cv.release()
```
Because the acquire/release pair is essential, it is often facilitated with try/finally or a language-level syntax like Python’s:
```python
with cv:
    while not condition_met():
        cv.wait()
    ...  # do things
    ...  # using only one thread
    ...  # at a time
    cv.notify_all()
...  # lock has been released here
```
`notify` or `notify_all`?

Using `notify` is generally more efficient than `notify_all`, but if several threads are waiting and only a few of them will be enabled by the notification, it can also cause deadlock: the randomly-selected thread that is unblocked may be unable to make progress, leaving all threads blocked. There’s much more to say here, but for a first course in synchronization it’s probably safer to use the `notify_all` version unless you are sure `notify` will work.
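As a concrete instance of the pattern above, a sketch of a single shared queue where consumers wait on a condition variable and producers `notify_all` (the function names are mine, not a standard API):

```python
import threading
import collections

items = collections.deque()
cv = threading.Condition()

def produce(item):
    with cv:
        items.append(item)
        cv.notify_all()   # wake all waiters; each re-checks its condition

def consume():
    with cv:
        while not items:  # the "complicated condition" check
            cv.wait()     # temporarily release; block until notified
        return items.popleft()
```

A waiting consumer is woken by `produce`, reacquires the lock, re-checks the condition, and only then takes the item.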
I sometimes eat at a buffet where each dish has a condition variable-like interface. Only one person can access a given dish at a time (making it like a lock), but if that dish is empty they can wait for it to be refilled. Workers can also refill the dishes.
In other words, as a customer with a specific food I like, I run this:

```python
food_station.acquire()
while not food_station.has_food():
    food_station.wait()
food_station.serve(me)
food_station.notify()
food_station.release()
```
The worker who refills the food uses it more like a lock, kicking off this process periodically as more food is cooked and ready to serve:
```python
food_station.acquire()
food_station.refill()
food_station.notify()
food_station.release()
```
Some implementations of condition variables treat them as two elements: a standard Lock that handles the acquire and release and a separate condition variable that handles the wait, notify, and notify_all. Others (including Python and this page) treat the lock as a superclass or field of the condition variable, recognizing that the split-type condition variable does not make sense without a corresponding lock.
One particular way of using a condition variable, called a monitor, ties the various condition variable operations to changes in attribute values of an object, creating one type of synchronized, thread-safe class.
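A monitor might look something like this sketch: a bounded counter whose methods all synchronize on one condition variable, with waiting tied to the object’s attribute values (the class is invented for illustration):

```python
import threading

class BoundedCounter:
    """A tiny monitor: every method holds the same condition
    variable, and waiting is tied to the attribute values."""
    def __init__(self, limit):
        self._cv = threading.Condition()
        self._value = 0
        self._limit = limit

    def increment(self):
        with self._cv:
            while self._value >= self._limit:
                self._cv.wait()     # block until below the limit
            self._value += 1
            self._cv.notify_all()   # the value changed: wake waiters

    def decrement(self):
        with self._cv:
            while self._value <= 0:
                self._cv.wait()     # block until above zero
            self._value -= 1
            self._cv.notify_all()
```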
A reader-writer lock can be locked in two ways. If acquired in `exclusive` or `write` mode, it acts just like a regular lock. If acquired in `shared` or `read` mode, any number of threads can acquire it simultaneously, but all must release it before it can be acquired in exclusive/write mode.
Reader-writer locks are more common for file systems than for threads, where the file itself acts as the data you lock on. In both C (via `#include <sys/file.h>`) and Python (via `import fcntl`) the `flock` function takes two arguments: the file to lock or unlock and an enumeration value that is one of
- `LOCK_UN` to unlock
- `LOCK_EX` to lock in exclusive mode, blocking if any other thread has this locked in any mode
- `LOCK_SH` to lock in shared mode, blocking if another thread has this locked in exclusive mode

Reader-writer locks can be implemented for threads using a condition variable and a count of the number of readers holding the lock. Many reader-writer locks prioritize writers over readers by also having a count of the number of writers waiting for the lock. An example of such an implementation can be found on Wikipedia.
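A sketch of that condition-variable-based implementation, including the writer-priority count (illustrative, not production-ready; for instance, it does not let a thread upgrade from read to write):

```python
import threading

class ReadWriteLock:
    """Reader-writer lock built from one condition variable,
    a reader count, and a count of waiting writers."""
    def __init__(self):
        self._cv = threading.Condition()
        self._readers = 0          # threads holding shared mode
        self._writer = False       # True while held in exclusive mode
        self._writers_waiting = 0  # queued writers block new readers

    def acquire_read(self):
        with self._cv:
            while self._writer or self._writers_waiting > 0:
                self._cv.wait()
            self._readers += 1

    def release_read(self):
        with self._cv:
            self._readers -= 1
            if self._readers == 0:
                self._cv.notify_all()  # last reader out: writers may go

    def acquire_write(self):
        with self._cv:
            self._writers_waiting += 1
            while self._writer or self._readers > 0:
                self._cv.wait()
            self._writers_waiting -= 1
            self._writer = True

    def release_write(self):
        with self._cv:
            self._writer = False
            self._cv.notify_all()  # wake both readers and writers
```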
Museums operate a kind of reader-writer lock. Visitors acquire the lock in shared mode when they arrive and release it when they leave, allowing any number of visitors to browse the exhibits at the same time. The curator acquires the lock in exclusive mode to set up a new exhibit and releases it when the exhibit is ready, ensuring that visitors can’t get in the way of the exhibit installation.
Semaphores were among the first synchronization primitives to be used in computing. They come in two types (binary and counting).
Binary semaphores are used as a type of communication between threads: one thread blocks on it, then later another thread unblocks it.
Counting semaphores are used like locks that can be acquired several times before they block.
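A sketch of a counting semaphore in use, limiting how many threads can be “inside” a resource at once (the worker and its counters are invented for illustration):

```python
import threading
import time

pool = threading.Semaphore(3)   # counting semaphore starting at 3
state = threading.Lock()
active = 0   # threads currently inside
peak = 0     # most threads ever inside at once

def worker():
    global active, peak
    pool.acquire()              # decrement; blocks while the count is 0
    with state:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)            # hold the semaphore briefly
    with state:
        active -= 1
    pool.release()              # increment; may unblock a waiting thread

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# peak is at most 3: never more than 3 workers inside at once
```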
I’ve never used either type of semaphore outside of low-level systems code like implementing operating-system functionality.