Synchronization manages the timing of concurrent operations, typically by making one or more operations wait until some condition is met by another operation.
Of our four broad types of concurrency, threads and shared-resource processes are the ones that most often need synchronization. Synchronization of threads and shared-resource processes has been a topic of study in computer science at least since the 1960s, and many strategies for synchronizing have been designed, implemented, and studied. These broadly fall into two categories: synchronization primitives and atomic operations. This page gives an overview of the most common components of each.
Atomic operations are a key concept in lock-free data structures, a family of tools for high-efficiency synchronization in cases where contention between multiple threads for a single resource is rare. Atomic operations resemble other assembly instructions, but they come with an additional guarantee: all of their work will happen without any other thread accessing the same resources. Any operation or set of operations that guarantees this kind of exclusive access to its resources while it runs is called atomic.
Atomic operations are quite a bit less efficient than their non-atomic counterparts and should be used only when atomicity is required.
Common atomic operations include
Test-and-set, a single-bit operation that (a) sets a bit in memory to `1` and (b) returns whether that bit was changed by this operation (i.e., whether it was previously `0`).
This is one of the simplest of the atomic operations to implement in hardware, and many other atomic operations and synchronization primitives are built off of it. I’ve never used it outside of a class that describes how to implement synchronization, but it’s referred to with enough frequency to be worth knowing about.
Compare-and-set, a multi-bit version of test-and-set which only modifies a value if it had a specific previous value before the modification; in other words, something like
if (p[0] == oldvalue) { p[0] = newvalue; return 1; }
else { return 0; }
Compare-and-set is widely used to implement lock-free data structures, such as linked lists that can be modified by multiple threads concurrently. The core idea of such approaches is something like: (1) read the current value, (2) compute a desired new value from it, (3) use compare-and-set to install the new value only if the value read in step 1 is still there, and (4) if the compare-and-set fails, start over from step 1.
There are many nuances to this idea, but the core of it (retry on failure) shows why atomic operations are best for situations where threads fighting over a single resource are rare.
Atomic modification operations, like `+=`. Under the hood, `x += 1` involves (a) loading `x`'s current value from memory, (b) computing the sum of that value and `1`, and (c) writing the result back to memory. With a non-atomic version, there is the potential that some other thread will modify `x`'s memory after step (a), then have its work lost when (c) overwrites that memory. An atomic version prevents that kind of race condition by barring any other thread from accessing `x`'s memory between steps (a) and (c).
Atomic operations are available in C via `#include <stdatomic.h>`, but they are not implemented on all platforms, so portable code that uses them needs non-atomic fallback code for when they are unavailable. Because of this, we do not show more about how to program with atomic operations here.
All synchronization primitives have the following properties:
There are many synchronization primitives; the following are those I see the most often in code I read and write.
In C, the most common threading library is called pthreads. All pthreads synchronization primitives should be kept at a fixed, never-changing memory address for their entire lifespan and passed to functions by pointer, not by value; some, but not all, implementations use the address of the primitive as part of the logic it implements. The primitive must be initialized before use (with `pthread_TYPE_init`) and destroyed after use (with `pthread_TYPE_destroy`), where TYPE is the primitive's type.
The following primitives are provided by the `pthread.h` library:
A mutex is single-user: once some thread `lock`s it, other threads trying to do so have to wait for the first thread to `unlock` it. Relevant functions are `pthread_mutex_init`, `pthread_mutex_lock`, `pthread_mutex_unlock`, and `pthread_mutex_destroy`.
A condition variable must be paired with a mutex and lets code temporarily relinquish the mutex: `wait`ing on the condition variable both releases the mutex and causes the thread to wait until some other thread holding the mutex `signal`s or `broadcast`s waiting threads that they can try to `lock` it again. Relevant functions are `pthread_cond_init`, `pthread_cond_wait`, `pthread_cond_broadcast`, `pthread_cond_signal`, and `pthread_cond_destroy`.
A reader-writer lock is like a mutex with two modes. The write mode acts like a regular mutex. The read mode lets any number of threads acquire it simultaneously; they all have to release it before it can be acquired in write mode. Relevant functions are `pthread_rwlock_init`, `pthread_rwlock_rdlock`, `pthread_rwlock_wrlock`, `pthread_rwlock_unlock`, and `pthread_rwlock_destroy`.
A mutex has two states: locked and unlocked. It also has two functions/methods: lock and unlock. Lock changes the mutex from unlocked to locked, blocking until that can be done. Unlock changes the mutex from locked to unlocked.
| | locked | unlocked |
|---|---|---|
| lock | block | change to locked |
| unlock | change to unlocked; resume any threads blocked on this lock | Error |
The most common use of a mutex is to implement mutual exclusion – it’s even named after that concept. Mutual exclusion creates a chunk of code (called a critical section) that only one thread can access at a time. This tends to look like
pthread_mutex_lock(m);
// do things
// using only one thread
// at a time
pthread_mutex_unlock(m);
It’s very important to always unlock; be wary of code that has a `return` or the like between the lock and the unlock.
The risk of forgetting to unlock a mutex along some code path has led programming languages from the 1990s and later to have language-level syntax that guarantees an unlock.
For example, in Java
synchronized(m) {
// do things
// using only one thread
// at a time
}
// m has been released here, even on exception or return
or in Python
with m:
    # do things
    # using only one thread
    # at a time
# m has been released here, even on exception or return
Bathroom stalls act like mutexes: when you `lock` one (try to enter the stall) you either get exclusive access to it until you `unlock` it (leave the stall) or you block (wait) until it’s available.
Some other languages call mutexes `lock`s and/or use the verbs `acquire` and `release` instead of `lock` and `unlock`.
A condition variable isn’t a variable and doesn’t check any conditions. Rather, it is a tool used to support a particular type of synchronized checking of conditions.
Condition variables are always paired with mutexes. Multiple condition variables may share a single mutex, but trying to use a single condition variable with different mutexes can create problems.
A condition variable works with its mutex to create a special temporary-release mode. The condition variable has wait, broadcast, and signal methods.
The `wait` method atomically releases the mutex and suspends the calling thread until some other thread `broadcast`s (or, in some cases, `signal`s), after which the waiting thread re-acquires the mutex before `wait` returns.
The difference between broadcast and signal is that broadcast wakes up all threads that are waiting on the condition variable, while signal wakes up just one such thread, chosen arbitrarily. When in doubt, we recommend broadcast over signal.
All three methods (wait, broadcast, and signal) are designed to be called from a thread that has locked the corresponding mutex; for `wait` this is mandatory, and for `broadcast` and `signal` it is the simplest pattern to reason about.
The primary use of condition variables is the following specific pattern for waiting until some arbitrarily-complicated condition is met:
pthread_mutex_lock(m);
while (!complicated_condition_is_met()) {
pthread_cond_wait(cv, m);
}
// do things using only one thread at a time
// that change the complicated condition for some thread
pthread_cond_broadcast(cv);
pthread_mutex_unlock(m);
The code
while (!complicated_condition_is_met()) {
pthread_cond_wait(cv, m);
}
might make more sense if we expand it out with the things the wait does:
lock the mutex
repeat as long as the condition is not met:
unlock the mutex
wait for notification that something has changed
lock the mutex
...
unlock the mutex
In other words, a waiting thread repeatedly lets other threads take the mutex and change things, re-checking the condition each time it wakes, and only proceeds (with the mutex held) once the condition is met.
I sometimes eat at a buffet where each dish has a condition variable-like interface. Only one person can access a given dish at a time (making it like a mutex), but if that dish is empty they can wait for it to be refilled. Workers can also refill the dishes.
In other words, as a customer with a specific food I like I run this:
pthread_mutex_lock(food_station.mutex);
while (!food_station.has_food()) {
    pthread_cond_wait(food_station.condvar, food_station.mutex);
}
food_station.serve(me);
pthread_mutex_unlock(food_station.mutex);
The worker who refills the food uses it more like a lock, kicking off this process periodically as more food is cooked and ready to serve:
pthread_mutex_lock(food_station.mutex);
food_station.refill();
pthread_cond_broadcast(food_station.condvar);
pthread_mutex_unlock(food_station.mutex);
Because every condition variable needs a mutex, some implementations of condition variables integrate the mutex into the condition variable object.
One particular way of using a condition variable, called a monitor, ties the various condition variable operations to changes in attribute values of an object, creating one type of synchronized, thread-safe class.
A reader-writer lock can be acquired in two ways. If acquired in exclusive or write mode, it acts just like a regular mutex. If acquired in shared or read mode, any number of threads can acquire it simultaneously, but all must release it before it can be acquired in exclusive/write mode.
| | read-locked | write-locked | unlocked |
|---|---|---|---|
| rdlock | proceed, counting reader | block | change to read-locked |
| wrlock | block | block | change to write-locked |
| unlock | decrement count; if 0, change to unlocked* | change to unlocked* | Error |
* when changing to unlocked, if there’s a blocked thread let one of them unblock and acquire the lock.
Reader-writer locks often add some additional logic to adjust the relative priority of readers and writers.
For the reader-writer lock logic shown in the table above, if there’s a never-ending supply of readers then the count will never reach zero and the writers will never unblock. This is called write starvation.
If we change it so that rdlock blocks if any wrlock is blocked, we won’t have write starvation, but now how we pick which thread to unblock matters. If we always unblock wrlocks first, readers will have to wait for writers, possibly leading to read starvation. If we unblock first-come, first-served, readers might not get as much parallelism as we wish. And so on: there’s no single best rule, and different implementations make different choices to optimize for different use cases.
Museums operate a kind of reader-writer lock. Visitors acquire the lock in shared mode when they arrive and release it when they leave, allowing any number of visitors to browse the exhibits at the same time.
pthread_rwlock_rdlock(exhibit);
view_exhibit();
pthread_rwlock_unlock(exhibit);
The curator acquires the lock in exclusive mode to set up a new exhibit and releases it when the exhibit is ready, ensuring that visitors can’t get in the way of the exhibit installation.
pthread_rwlock_wrlock(exhibit);
update_exhibit();
pthread_rwlock_unlock(exhibit);