Locking
=======

Locking is well-known and the common use cases are straightforward: Any
CPU holding a given lock sees any changes previously seen or made by any
CPU before it previously released that same lock. This last sentence
is the only part of this document that most developers will need to read.

However, developers who would like to also access lock-protected shared
variables outside of their corresponding locks should continue reading.


Locking and Prior Accesses
--------------------------

The basic rule of locking is worth repeating:

        Any CPU holding a given lock sees any changes previously seen
        or made by any CPU before it previously released that same lock.

Note that this statement is a bit stronger than "Any CPU holding a
given lock sees all changes made by any CPU during the time that CPU was
previously holding this same lock". For example, consider the following
pair of code fragments:

        /* See MP+polocks.litmus. */
        void CPU0(void)
        {
                WRITE_ONCE(x, 1);
                spin_lock(&mylock);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                spin_unlock(&mylock);
                r1 = READ_ONCE(x);
        }

The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
then both r0 and r1 must be set to the value 1. This also has the
consequence that if the final value of r0 is equal to 1, then the final
value of r1 must also be equal to 1. In contrast, the weaker rule would
say nothing about the final value of r1.


Locking and Subsequent Accesses
-------------------------------

The converse to the basic rule also holds: Any CPU holding a given
lock will not see any changes that will be made by any CPU after it
subsequently acquires this same lock. This converse statement is
illustrated by the following litmus test:

        /* See MP+porevlocks.litmus. */
        void CPU0(void)
        {
                r0 = READ_ONCE(y);
                spin_lock(&mylock);
                r1 = READ_ONCE(x);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                spin_unlock(&mylock);
                WRITE_ONCE(y, 1);
        }

This converse to the basic rule guarantees that if CPU0() acquires
mylock before CPU1(), then both r0 and r1 must be set to the value 0.
This also has the consequence that if the final value of r1 is equal
to 0, then the final value of r0 must also be equal to 0. In contrast,
the weaker rule would say nothing about the final value of r0.

These examples show only a single pair of CPUs, but the effects of the
basic locking rule extend across multiple acquisitions of a given lock
across multiple CPUs.
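
For example, consider the following three-CPU sketch. This one is not
among the litmus tests shipped with the kernel source tree; it simply
reuses the variable and lock names from the examples above for
illustration:

        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                if (READ_ONCE(x) == 1)
                        WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                spin_unlock(&mylock);
                r1 = READ_ONCE(x);
        }

If r0 is equal to 1, then CPU1() must have seen CPU0()'s store to "x"
while holding mylock, and CPU2() must have acquired mylock after CPU1()
released it. The basic rule therefore chains across all three critical
sections, guaranteeing that r1 is also equal to 1.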


Double-Checked Locking
----------------------

It is well known that more than just a lock is required to make
double-checked locking work correctly. This litmus test illustrates
one incorrect approach:
| |
| /* See Documentation/litmus-tests/locking/DCL-broken.litmus. */ |
| void CPU0(void) |
| { |
| r0 = READ_ONCE(flag); |
| if (r0 == 0) { |
| spin_lock(&lck); |
| r1 = READ_ONCE(flag); |
| if (r1 == 0) { |
| WRITE_ONCE(data, 1); |
| WRITE_ONCE(flag, 1); |
| } |
| spin_unlock(&lck); |
| } |
| r2 = READ_ONCE(data); |
| } |
        /* CPU1() is exactly the same as CPU0(). */

There are two problems. First, there is no ordering between the first
READ_ONCE() of "flag" and the READ_ONCE() of "data". Second, there is
no ordering between the two WRITE_ONCE() calls. It should therefore be
no surprise that "r2" can be zero, and a quick herd7 run confirms this.

One way to fix this is to use smp_load_acquire() and smp_store_release()
as shown in this corrected version:

        /* See Documentation/litmus-tests/locking/DCL-fixed.litmus. */
        void CPU0(void)
        {
                r0 = smp_load_acquire(&flag);
                if (r0 == 0) {
                        spin_lock(&lck);
                        r1 = READ_ONCE(flag);
                        if (r1 == 0) {
                                WRITE_ONCE(data, 1);
                                smp_store_release(&flag, 1);
                        }
                        spin_unlock(&lck);
                }
                r2 = READ_ONCE(data);
        }
        /* CPU1() is exactly the same as CPU0(). */

The smp_load_acquire() guarantees that its load from "flag" will
be ordered before the READ_ONCE() from "data", thus solving the first
problem. The smp_store_release() guarantees that its store will be
ordered after the WRITE_ONCE() to "data", solving the second problem.
The smp_store_release() pairs with the smp_load_acquire(), thus ensuring
that the ordering provided by each actually takes effect. Again, a
quick herd7 run confirms this.

In short, if you access a lock-protected variable without holding the
corresponding lock, you will need to provide additional ordering, in
this case, via the smp_load_acquire() and the smp_store_release().
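
To make the pattern concrete, here is a sketch of how the corrected
double-checked-locking idiom might look in conventional kernel-style C.
The function and variable names are purely illustrative rather than
taken from any existing kernel API:

        /* Hypothetical once-only initialization, names for illustration. */
        static int data;
        static int flag;
        static DEFINE_SPINLOCK(lck);

        int get_data(void)
        {
                /* Unlocked fast path: acquire load orders the later read of data. */
                if (smp_load_acquire(&flag) == 0) {
                        spin_lock(&lck);
                        if (READ_ONCE(flag) == 0) {
                                WRITE_ONCE(data, 1);
                                /* Release store orders the prior write of data. */
                                smp_store_release(&flag, 1);
                        }
                        spin_unlock(&lck);
                }
                return READ_ONCE(data);
        }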


Ordering Provided by a Lock to CPUs Not Holding That Lock
---------------------------------------------------------

It is not necessarily the case that accesses ordered by locking will be
seen as ordered by CPUs not holding that lock. Consider this example:

        /* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                WRITE_ONCE(z, 2);
                smp_mb();
                r1 = READ_ONCE(x);
        }

Counter-intuitive though it might be, it is quite possible to have
the final value of r0 be 1, the final value of z be 2, and the final
value of r1 be 0. The reason for this surprising outcome is that CPU2()
never acquired the lock, and thus did not fully benefit from the lock's
ordering properties.

Ordering can be extended to CPUs not holding the lock by careful use
of smp_mb__after_spinlock():

        /* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                smp_mb__after_spinlock();
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                WRITE_ONCE(z, 2);
                smp_mb();
                r1 = READ_ONCE(x);
        }

This addition of smp_mb__after_spinlock() strengthens the lock
acquisition sufficiently to rule out the counter-intuitive outcome in
which the final value of r0 is 1, the final value of z is 2, and the
final value of r1 is 0.


No Roach-Motel Locking!
-----------------------

This example requires familiarity with the herd7 "filter" clause, so
please read up on that topic in litmus-tests.txt.

It is tempting to allow memory-reference instructions to be pulled
into a critical section, but this cannot be allowed in the general case.
For example, consider a spin loop preceding a lock-based critical section.
Now, herd7 does not model spin loops, but we can emulate one with two
loads, with a "filter" clause to constrain the first to return the
initial value and the second to return the updated value, as shown below:

        /* See Documentation/litmus-tests/locking/RM-fixed.litmus. */
        void CPU0(void)
        {
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                WRITE_ONCE(x, 1);
                spin_unlock(&lck);
        }

        void CPU1(void)
        {
                r0 = READ_ONCE(x);
                r1 = READ_ONCE(x);
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

        filter (1:r0=0 /\ 1:r1=1)
        exists (1:r2=1)

The variable "x" is the control variable for the emulated spin loop.
CPU0() sets it to "1" while holding the lock, and CPU1() emulates the
spin loop by reading it twice, first into "1:r0" (which should get the
initial value "0") and then into "1:r1" (which should get the updated
value "1").

The "filter" clause takes this into account, constraining "1:r0" to
equal "0" and "1:r1" to equal "1".

Then the "exists" clause checks to see if CPU1() acquired its lock first,
which should not happen given the filter clause because CPU0() updates
"x" while holding the lock. And herd7 confirms this.

But suppose that the compiler was permitted to reorder the spin loop
into CPU1()'s critical section, like this:

        /* See Documentation/litmus-tests/locking/RM-broken.litmus. */
        void CPU0(void)
        {
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                WRITE_ONCE(x, 1);
                spin_unlock(&lck);
        }

        void CPU1(void)
        {
                spin_lock(&lck);
                r0 = READ_ONCE(x);
                r1 = READ_ONCE(x);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

        filter (1:r0=0 /\ 1:r1=1)
        exists (1:r2=1)
| If "1:r0" is equal to "0", "1:r1" can never equal "1" because CPU0() |
| cannot update "x" while CPU1() holds the lock. And herd7 confirms this, |
| showing zero executions matching the "filter" criteria. |

And this is why Linux-kernel lock and unlock primitives must prevent
code from entering critical sections. It is not sufficient to only
prevent code from leaving them.
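
To see what would go wrong in real code, consider the pattern that the
emulated spin loop stands in for, sketched here with the same
illustrative names as the litmus tests above:

        /* What CPU1() emulates: spin, then enter the critical section. */
        while (!READ_ONCE(x))
                cpu_relax();
        spin_lock(&lck);
        r2 = atomic_inc_return(&y);
        spin_unlock(&lck);

If this spin loop could be pulled into the critical section, CPU1()
would spin forever waiting for a store to "x" that CPU0() cannot make
until it acquires the lock that CPU1() would never release: deadlock.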