|  | MARKING SHARED-MEMORY ACCESSES | 
|  | ============================== | 
|  |  | 
|  | This document provides guidelines for marking intentionally concurrent | 
|  | normal accesses to shared memory, that is "normal" as in accesses that do | 
|  | not use read-modify-write atomic operations.  It also describes how to | 
|  | document these accesses, both with comments and with special assertions | 
|  | processed by the Kernel Concurrency Sanitizer (KCSAN).  This discussion | 
|  | builds on an earlier LWN article [1]. | 
|  |  | 
|  |  | 
|  | ACCESS-MARKING OPTIONS | 
|  | ====================== | 
|  |  | 
|  | The Linux kernel provides the following access-marking options: | 
|  |  | 
|  | 1.	Plain C-language accesses (unmarked), for example, "a = b;" | 
|  |  | 
|  | 2.	Data-race marking, for example, "data_race(a = b);" | 
|  |  | 
|  | 3.	READ_ONCE(), for example, "a = READ_ONCE(b);" | 
|  | The various forms of atomic_read() also fit in here. | 
|  |  | 
|  | 4.	WRITE_ONCE(), for example, "WRITE_ONCE(a, b);" | 
|  | The various forms of atomic_set() also fit in here. | 
|  |  | 
|  |  | 
|  | These may be used in combination, as shown in this admittedly improbable | 
|  | example: | 
|  |  | 
|  | WRITE_ONCE(a, b + data_race(c + d) + READ_ONCE(e)); | 
|  |  | 
|  | Neither plain C-language accesses nor data_race() (#1 and #2 above) place | 
|  | any sort of constraint on the compiler's choice of optimizations [2]. | 
|  | In contrast, READ_ONCE() and WRITE_ONCE() (#3 and #4 above) restrict the | 
|  | compiler's use of code-motion and common-subexpression optimizations. | 
|  | Therefore, if a given access is involved in an intentional data race, | 
|  | using READ_ONCE() for loads and WRITE_ONCE() for stores is usually | 
|  | preferable to data_race(), which in turn is usually preferable to plain | 
|  | C-language accesses.  It is permissible to combine #2 and #3, for example, | 
|  | data_race(READ_ONCE(a)), which will both restrict compiler optimizations | 
|  | and disable KCSAN diagnostics. | 
|  |  | 
|  | KCSAN will complain about many types of data races involving plain | 
|  | C-language accesses, but marking all accesses involved in a given data | 
|  | race with one of data_race(), READ_ONCE(), or WRITE_ONCE(), will prevent | 
|  | KCSAN from complaining.  Of course, lack of KCSAN complaints does not | 
|  | imply correct code.  Therefore, please take a thoughtful approach | 
|  | when responding to KCSAN complaints.  Churning the code base with | 
|  | ill-considered additions of data_race(), READ_ONCE(), and WRITE_ONCE() | 
|  | is unhelpful. | 
|  |  | 
|  | In fact, the following sections describe situations where use of | 
|  | data_race() and even plain C-language accesses is preferable to | 
|  | READ_ONCE() and WRITE_ONCE(). | 
|  |  | 
|  |  | 
|  | Use of the data_race() Macro | 
|  | ---------------------------- | 
|  |  | 
|  | Here are some situations where data_race() should be used instead of | 
|  | READ_ONCE() and WRITE_ONCE(): | 
|  |  | 
|  | 1.	Data-racy loads from shared variables whose values are used only | 
|  | for diagnostic purposes. | 
|  |  | 
|  | 2.	Data-racy reads whose values are checked against marked reload. | 
|  |  | 
|  | 3.	Reads whose values feed into error-tolerant heuristics. | 
|  |  | 
|  | 4.	Writes setting values that feed into error-tolerant heuristics. | 
|  |  | 
|  |  | 
|  | Data-Racy Reads for Approximate Diagnostics | 
|  |  | 
|  | Approximate diagnostics include lockdep reports, monitoring/statistics | 
|  | (including /proc and /sys output), WARN*()/BUG*() checks whose return | 
|  | values are ignored, and other situations where reads from shared variables | 
|  | are not an integral part of the core concurrency design. | 
|  |  | 
|  | In fact, use of data_race() instead READ_ONCE() for these diagnostic | 
|  | reads can enable better checking of the remaining accesses implementing | 
|  | the core concurrency design.  For example, suppose that the core design | 
|  | prevents any non-diagnostic reads from shared variable x from running | 
|  | concurrently with updates to x.  Then using plain C-language writes | 
|  | to x allows KCSAN to detect reads from x from within regions of code | 
|  | that fail to exclude the updates.  In this case, it is important to use | 
|  | data_race() for the diagnostic reads because otherwise KCSAN would give | 
|  | false-positive warnings about these diagnostic reads. | 
|  |  | 
|  | If it is necessary to both restrict compiler optimizations and disable | 
|  | KCSAN diagnostics, use both data_race() and READ_ONCE(), for example, | 
|  | data_race(READ_ONCE(a)). | 
|  |  | 
|  | In theory, plain C-language loads can also be used for this use case. | 
|  | However, in practice this will have the disadvantage of causing KCSAN | 
|  | to generate false positives because KCSAN will have no way of knowing | 
|  | that the resulting data race was intentional. | 
|  |  | 
|  |  | 
|  | Data-Racy Reads That Are Checked Against Marked Reload | 
|  |  | 
|  | The values from some reads are not implicitly trusted.  They are instead | 
|  | fed into some operation that checks the full value against a later marked | 
|  | load from memory, which means that the occasional arbitrarily bogus value | 
|  | is not a problem.  For example, if a bogus value is fed into cmpxchg(), | 
|  | all that happens is that this cmpxchg() fails, which normally results | 
|  | in a retry.  Unless the race condition that resulted in the bogus value | 
|  | recurs, this retry will with high probability succeed, so no harm done. | 
|  |  | 
|  | However, please keep in mind that a data_race() load feeding into | 
|  | a cmpxchg_relaxed() might still be subject to load fusing on some | 
|  | architectures.  Therefore, it is best to capture the return value from | 
|  | the failing cmpxchg() for the next iteration of the loop, an approach | 
|  | that provides the compiler much less scope for mischievous optimizations. | 
|  | Capturing the return value from cmpxchg() also saves a memory reference | 
|  | in many cases. | 
|  |  | 
|  | In theory, plain C-language loads can also be used for this use case. | 
|  | However, in practice this will have the disadvantage of causing KCSAN | 
|  | to generate false positives because KCSAN will have no way of knowing | 
|  | that the resulting data race was intentional. | 
|  |  | 
|  |  | 
|  | Reads Feeding Into Error-Tolerant Heuristics | 
|  |  | 
|  | Values from some reads feed into heuristics that can tolerate occasional | 
|  | errors.  Such reads can use data_race(), thus allowing KCSAN to focus on | 
|  | the other accesses to the relevant shared variables.  But please note | 
|  | that data_race() loads are subject to load fusing, which can result in | 
|  | consistent errors, which in turn are quite capable of breaking heuristics. | 
|  | Therefore use of data_race() should be limited to cases where some other | 
|  | code (such as a barrier() call) will force the occasional reload. | 
|  |  | 
|  | Note that this use case requires that the heuristic be able to handle | 
|  | any possible error.  In contrast, if the heuristics might be fatally | 
|  | confused by one or more of the possible erroneous values, use READ_ONCE() | 
|  | instead of data_race(). | 
|  |  | 
|  | In theory, plain C-language loads can also be used for this use case. | 
|  | However, in practice this will have the disadvantage of causing KCSAN | 
|  | to generate false positives because KCSAN will have no way of knowing | 
|  | that the resulting data race was intentional. | 
|  |  | 
|  |  | 
|  | Writes Setting Values Feeding Into Error-Tolerant Heuristics | 
|  |  | 
|  | The values read into error-tolerant heuristics come from somewhere, | 
|  | for example, from sysfs.  This means that some code in sysfs writes | 
|  | to this same variable, and these writes can also use data_race(). | 
|  | After all, if the heuristic can tolerate the occasional bogus value | 
|  | due to compiler-mangled reads, it can also tolerate the occasional | 
|  | compiler-mangled write, at least assuming that the proper value is in | 
|  | place once the write completes. | 
|  |  | 
|  | Plain C-language stores can also be used for this use case.  However, | 
|  | in kernels built with CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n, this | 
|  | will have the disadvantage of causing KCSAN to generate false positives | 
|  | because KCSAN will have no way of knowing that the resulting data race | 
|  | was intentional. | 
|  |  | 
|  |  | 
|  | Use of Plain C-Language Accesses | 
|  | -------------------------------- | 
|  |  | 
|  | Here are some example situations where plain C-language accesses should | 
|  | used instead of READ_ONCE(), WRITE_ONCE(), and data_race(): | 
|  |  | 
|  | 1.	Accesses protected by mutual exclusion, including strict locking | 
|  | and sequence locking. | 
|  |  | 
|  | 2.	Initialization-time and cleanup-time accesses.	This covers a | 
|  | wide variety of situations, including the uniprocessor phase of | 
|  | system boot, variables to be used by not-yet-spawned kthreads, | 
|  | structures not yet published to reference-counted or RCU-protected | 
|  | data structures, and the cleanup side of any of these situations. | 
|  |  | 
|  | 3.	Per-CPU variables that are not accessed from other CPUs. | 
|  |  | 
|  | 4.	Private per-task variables, including on-stack variables, some | 
|  | fields in the task_struct structure, and task-private heap data. | 
|  |  | 
|  | 5.	Any other loads for which there is not supposed to be a concurrent | 
|  | store to that same variable. | 
|  |  | 
|  | 6.	Any other stores for which there should be neither concurrent | 
|  | loads nor concurrent stores to that same variable. | 
|  |  | 
|  | But note that KCSAN makes two explicit exceptions to this rule | 
|  | by default, refraining from flagging plain C-language stores: | 
|  |  | 
|  | a.	No matter what.  You can override this default by building | 
|  | with CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n. | 
|  |  | 
|  | b.	When the store writes the value already contained in | 
|  | that variable.	You can override this default by building | 
|  | with CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n. | 
|  |  | 
|  | c.	When one of the stores is in an interrupt handler and | 
|  | the other in the interrupted code.  You can override this | 
|  | default by building with CONFIG_KCSAN_INTERRUPT_WATCHER=y. | 
|  |  | 
|  | Note that it is important to use plain C-language accesses in these cases, | 
|  | because doing otherwise prevents KCSAN from detecting violations of your | 
|  | code's synchronization rules. | 
|  |  | 
|  |  | 
|  | ACCESS-DOCUMENTATION OPTIONS | 
|  | ============================ | 
|  |  | 
|  | It is important to comment marked accesses so that people reading your | 
|  | code, yourself included, are reminded of the synchronization design. | 
|  | However, it is even more important to comment plain C-language accesses | 
|  | that are intentionally involved in data races.  Such comments are | 
|  | needed to remind people reading your code, again, yourself included, | 
|  | of how the compiler has been prevented from optimizing those accesses | 
|  | into concurrency bugs. | 
|  |  | 
|  | It is also possible to tell KCSAN about your synchronization design. | 
|  | For example, ASSERT_EXCLUSIVE_ACCESS(foo) tells KCSAN that any | 
|  | concurrent access to variable foo by any other CPU is an error, even | 
|  | if that concurrent access is marked with READ_ONCE().  In addition, | 
|  | ASSERT_EXCLUSIVE_WRITER(foo) tells KCSAN that although it is OK for there | 
|  | to be concurrent reads from foo from other CPUs, it is an error for some | 
|  | other CPU to be concurrently writing to foo, even if that concurrent | 
|  | write is marked with data_race() or WRITE_ONCE(). | 
|  |  | 
|  | Note that although KCSAN will call out data races involving either | 
|  | ASSERT_EXCLUSIVE_ACCESS() or ASSERT_EXCLUSIVE_WRITER() on the one hand | 
|  | and data_race() writes on the other, KCSAN will not report the location | 
|  | of these data_race() writes. | 
|  |  | 
|  |  | 
|  | EXAMPLES | 
|  | ======== | 
|  |  | 
|  | As noted earlier, the goal is to prevent the compiler from destroying | 
|  | your concurrent algorithm, to help the human reader, and to inform | 
|  | KCSAN of aspects of your concurrency design.  This section looks at a | 
|  | few examples showing how this can be done. | 
|  |  | 
|  |  | 
|  | Lock Protection With Lockless Diagnostic Access | 
|  | ----------------------------------------------- | 
|  |  | 
|  | For example, suppose a shared variable "foo" is read only while a | 
|  | reader-writer spinlock is read-held, written only while that same | 
|  | spinlock is write-held, except that it is also read locklessly for | 
|  | diagnostic purposes.  The code might look as follows: | 
|  |  | 
|  | int foo; | 
|  | DEFINE_RWLOCK(foo_rwlock); | 
|  |  | 
|  | void update_foo(int newval) | 
|  | { | 
|  | write_lock(&foo_rwlock); | 
|  | foo = newval; | 
|  | do_something(newval); | 
|  | write_unlock(&foo_rwlock); | 
|  | } | 
|  |  | 
|  | int read_foo(void) | 
|  | { | 
|  | int ret; | 
|  |  | 
|  | read_lock(&foo_rwlock); | 
|  | do_something_else(); | 
|  | ret = foo; | 
|  | read_unlock(&foo_rwlock); | 
|  | return ret; | 
|  | } | 
|  |  | 
|  | void read_foo_diagnostic(void) | 
|  | { | 
|  | pr_info("Current value of foo: %d\n", data_race(foo)); | 
|  | } | 
|  |  | 
|  | The reader-writer lock prevents the compiler from introducing concurrency | 
|  | bugs into any part of the main algorithm using foo, which means that | 
|  | the accesses to foo within both update_foo() and read_foo() can (and | 
|  | should) be plain C-language accesses.  One benefit of making them be | 
|  | plain C-language accesses is that KCSAN can detect any erroneous lockless | 
|  | reads from or updates to foo.  The data_race() in read_foo_diagnostic() | 
|  | tells KCSAN that data races are expected, and should be silently | 
|  | ignored.  This data_race() also tells the human reading the code that | 
|  | read_foo_diagnostic() might sometimes return a bogus value. | 
|  |  | 
|  | If it is necessary to suppress compiler optimization and also detect | 
|  | buggy lockless writes, read_foo_diagnostic() can be updated as follows: | 
|  |  | 
|  | void read_foo_diagnostic(void) | 
|  | { | 
|  | pr_info("Current value of foo: %d\n", data_race(READ_ONCE(foo))); | 
|  | } | 
|  |  | 
|  | Alternatively, given that KCSAN is to ignore all accesses in this function, | 
|  | this function can be marked __no_kcsan and the data_race() can be dropped: | 
|  |  | 
|  | void __no_kcsan read_foo_diagnostic(void) | 
|  | { | 
|  | pr_info("Current value of foo: %d\n", READ_ONCE(foo)); | 
|  | } | 
|  |  | 
|  | However, in order for KCSAN to detect buggy lockless writes, your kernel | 
|  | must be built with CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n.  If you | 
|  | need KCSAN to detect such a write even if that write did not change | 
|  | the value of foo, you also need CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n. | 
|  | If you need KCSAN to detect such a write happening in an interrupt handler | 
|  | running on the same CPU doing the legitimate lock-protected write, you | 
|  | also need CONFIG_KCSAN_INTERRUPT_WATCHER=y.  With some or all of these | 
|  | Kconfig options set properly, KCSAN can be quite helpful, although | 
|  | it is not necessarily a full replacement for hardware watchpoints. | 
|  | On the other hand, neither are hardware watchpoints a full replacement | 
|  | for KCSAN because it is not always easy to tell hardware watchpoint to | 
|  | conditionally trap on accesses. | 
|  |  | 
|  |  | 
|  | Lock-Protected Writes With Lockless Reads | 
|  | ----------------------------------------- | 
|  |  | 
|  | For another example, suppose a shared variable "foo" is updated only | 
|  | while holding a spinlock, but is read locklessly.  The code might look | 
|  | as follows: | 
|  |  | 
|  | int foo; | 
|  | DEFINE_SPINLOCK(foo_lock); | 
|  |  | 
|  | void update_foo(int newval) | 
|  | { | 
|  | spin_lock(&foo_lock); | 
|  | WRITE_ONCE(foo, newval); | 
|  | ASSERT_EXCLUSIVE_WRITER(foo); | 
|  | do_something(newval); | 
|  | spin_unlock(&foo_wlock); | 
|  | } | 
|  |  | 
|  | int read_foo(void) | 
|  | { | 
|  | do_something_else(); | 
|  | return READ_ONCE(foo); | 
|  | } | 
|  |  | 
|  | Because foo is read locklessly, all accesses are marked.  The purpose | 
|  | of the ASSERT_EXCLUSIVE_WRITER() is to allow KCSAN to check for a buggy | 
|  | concurrent lockless write. | 
|  |  | 
|  |  | 
|  | Lock-Protected Writes With Heuristic Lockless Reads | 
|  | --------------------------------------------------- | 
|  |  | 
|  | For another example, suppose that the code can normally make use of | 
|  | a per-data-structure lock, but there are times when a global lock | 
|  | is required.  These times are indicated via a global flag.  The code | 
|  | might look as follows, and is based loosely on nf_conntrack_lock(), | 
|  | nf_conntrack_all_lock(), and nf_conntrack_all_unlock(): | 
|  |  | 
|  | bool global_flag; | 
|  | DEFINE_SPINLOCK(global_lock); | 
|  | struct foo { | 
|  | spinlock_t f_lock; | 
|  | int f_data; | 
|  | }; | 
|  |  | 
|  | /* All foo structures are in the following array. */ | 
|  | int nfoo; | 
|  | struct foo *foo_array; | 
|  |  | 
|  | void do_something_locked(struct foo *fp) | 
|  | { | 
|  | /* This works even if data_race() returns nonsense. */ | 
|  | if (!data_race(global_flag)) { | 
|  | spin_lock(&fp->f_lock); | 
|  | if (!smp_load_acquire(&global_flag)) { | 
|  | do_something(fp); | 
|  | spin_unlock(&fp->f_lock); | 
|  | return; | 
|  | } | 
|  | spin_unlock(&fp->f_lock); | 
|  | } | 
|  | spin_lock(&global_lock); | 
|  | /* global_lock held, thus global flag cannot be set. */ | 
|  | spin_lock(&fp->f_lock); | 
|  | spin_unlock(&global_lock); | 
|  | /* | 
|  | * global_flag might be set here, but begin_global() | 
|  | * will wait for ->f_lock to be released. | 
|  | */ | 
|  | do_something(fp); | 
|  | spin_unlock(&fp->f_lock); | 
|  | } | 
|  |  | 
|  | void begin_global(void) | 
|  | { | 
|  | int i; | 
|  |  | 
|  | spin_lock(&global_lock); | 
|  | WRITE_ONCE(global_flag, true); | 
|  | for (i = 0; i < nfoo; i++) { | 
|  | /* | 
|  | * Wait for pre-existing local locks.  One at | 
|  | * a time to avoid lockdep limitations. | 
|  | */ | 
|  | spin_lock(&fp->f_lock); | 
|  | spin_unlock(&fp->f_lock); | 
|  | } | 
|  | } | 
|  |  | 
|  | void end_global(void) | 
|  | { | 
|  | smp_store_release(&global_flag, false); | 
|  | spin_unlock(&global_lock); | 
|  | } | 
|  |  | 
|  | All code paths leading from the do_something_locked() function's first | 
|  | read from global_flag acquire a lock, so endless load fusing cannot | 
|  | happen. | 
|  |  | 
|  | If the value read from global_flag is true, then global_flag is | 
|  | rechecked while holding ->f_lock, which, if global_flag is now false, | 
|  | prevents begin_global() from completing.  It is therefore safe to invoke | 
|  | do_something(). | 
|  |  | 
|  | Otherwise, if either value read from global_flag is true, then after | 
|  | global_lock is acquired global_flag must be false.  The acquisition of | 
|  | ->f_lock will prevent any call to begin_global() from returning, which | 
|  | means that it is safe to release global_lock and invoke do_something(). | 
|  |  | 
|  | For this to work, only those foo structures in foo_array[] may be passed | 
|  | to do_something_locked().  The reason for this is that the synchronization | 
|  | with begin_global() relies on momentarily holding the lock of each and | 
|  | every foo structure. | 
|  |  | 
|  | The smp_load_acquire() and smp_store_release() are required because | 
|  | changes to a foo structure between calls to begin_global() and | 
|  | end_global() are carried out without holding that structure's ->f_lock. | 
|  | The smp_load_acquire() and smp_store_release() ensure that the next | 
|  | invocation of do_something() from do_something_locked() will see those | 
|  | changes. | 
|  |  | 
|  |  | 
|  | Lockless Reads and Writes | 
|  | ------------------------- | 
|  |  | 
|  | For another example, suppose a shared variable "foo" is both read and | 
|  | updated locklessly.  The code might look as follows: | 
|  |  | 
|  | int foo; | 
|  |  | 
|  | int update_foo(int newval) | 
|  | { | 
|  | int ret; | 
|  |  | 
|  | ret = xchg(&foo, newval); | 
|  | do_something(newval); | 
|  | return ret; | 
|  | } | 
|  |  | 
|  | int read_foo(void) | 
|  | { | 
|  | do_something_else(); | 
|  | return READ_ONCE(foo); | 
|  | } | 
|  |  | 
|  | Because foo is accessed locklessly, all accesses are marked.  It does | 
|  | not make sense to use ASSERT_EXCLUSIVE_WRITER() in this case because | 
|  | there really can be concurrent lockless writers.  KCSAN would | 
|  | flag any concurrent plain C-language reads from foo, and given | 
|  | CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n, also any concurrent plain | 
|  | C-language writes to foo. | 
|  |  | 
|  |  | 
|  | Lockless Reads and Writes, But With Single-Threaded Initialization | 
|  | ------------------------------------------------------------------ | 
|  |  | 
|  | For yet another example, suppose that foo is initialized in a | 
|  | single-threaded manner, but that a number of kthreads are then created | 
|  | that locklessly and concurrently access foo.  Some snippets of this code | 
|  | might look as follows: | 
|  |  | 
|  | int foo; | 
|  |  | 
|  | void initialize_foo(int initval, int nkthreads) | 
|  | { | 
|  | int i; | 
|  |  | 
|  | foo = initval; | 
|  | ASSERT_EXCLUSIVE_ACCESS(foo); | 
|  | for (i = 0; i < nkthreads; i++) | 
|  | kthread_run(access_foo_concurrently, ...); | 
|  | } | 
|  |  | 
|  | /* Called from access_foo_concurrently(). */ | 
|  | int update_foo(int newval) | 
|  | { | 
|  | int ret; | 
|  |  | 
|  | ret = xchg(&foo, newval); | 
|  | do_something(newval); | 
|  | return ret; | 
|  | } | 
|  |  | 
|  | /* Also called from access_foo_concurrently(). */ | 
|  | int read_foo(void) | 
|  | { | 
|  | do_something_else(); | 
|  | return READ_ONCE(foo); | 
|  | } | 
|  |  | 
|  | The initialize_foo() uses a plain C-language write to foo because there | 
|  | are not supposed to be concurrent accesses during initialization.  The | 
|  | ASSERT_EXCLUSIVE_ACCESS() allows KCSAN to flag buggy concurrent unmarked | 
|  | reads, and the ASSERT_EXCLUSIVE_ACCESS() call further allows KCSAN to | 
|  | flag buggy concurrent writes, even if:  (1) Those writes are marked or | 
|  | (2) The kernel was built with CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=y. | 
|  |  | 
|  |  | 
|  | Checking Stress-Test Race Coverage | 
|  | ---------------------------------- | 
|  |  | 
|  | When designing stress tests it is important to ensure that race conditions | 
|  | of interest really do occur.  For example, consider the following code | 
|  | fragment: | 
|  |  | 
|  | int foo; | 
|  |  | 
|  | int update_foo(int newval) | 
|  | { | 
|  | return xchg(&foo, newval); | 
|  | } | 
|  |  | 
|  | int xor_shift_foo(int shift, int mask) | 
|  | { | 
|  | int old, new, newold; | 
|  |  | 
|  | newold = data_race(foo); /* Checked by cmpxchg(). */ | 
|  | do { | 
|  | old = newold; | 
|  | new = (old << shift) ^ mask; | 
|  | newold = cmpxchg(&foo, old, new); | 
|  | } while (newold != old); | 
|  | return old; | 
|  | } | 
|  |  | 
|  | int read_foo(void) | 
|  | { | 
|  | return READ_ONCE(foo); | 
|  | } | 
|  |  | 
|  | If it is possible for update_foo(), xor_shift_foo(), and read_foo() to be | 
|  | invoked concurrently, the stress test should force this concurrency to | 
|  | actually happen.  KCSAN can evaluate the stress test when the above code | 
|  | is modified to read as follows: | 
|  |  | 
|  | int foo; | 
|  |  | 
|  | int update_foo(int newval) | 
|  | { | 
|  | ASSERT_EXCLUSIVE_ACCESS(foo); | 
|  | return xchg(&foo, newval); | 
|  | } | 
|  |  | 
|  | int xor_shift_foo(int shift, int mask) | 
|  | { | 
|  | int old, new, newold; | 
|  |  | 
|  | newold = data_race(foo); /* Checked by cmpxchg(). */ | 
|  | do { | 
|  | old = newold; | 
|  | new = (old << shift) ^ mask; | 
|  | ASSERT_EXCLUSIVE_ACCESS(foo); | 
|  | newold = cmpxchg(&foo, old, new); | 
|  | } while (newold != old); | 
|  | return old; | 
|  | } | 
|  |  | 
|  |  | 
|  | int read_foo(void) | 
|  | { | 
|  | ASSERT_EXCLUSIVE_ACCESS(foo); | 
|  | return READ_ONCE(foo); | 
|  | } | 
|  |  | 
|  | If a given stress-test run does not result in KCSAN complaints from | 
|  | each possible pair of ASSERT_EXCLUSIVE_ACCESS() invocations, the | 
|  | stress test needs improvement.  If the stress test was to be evaluated | 
|  | on a regular basis, it would be wise to place the above instances of | 
|  | ASSERT_EXCLUSIVE_ACCESS() under #ifdef so that they did not result in | 
|  | false positives when not evaluating the stress test. | 
|  |  | 
|  |  | 
|  | REFERENCES | 
|  | ========== | 
|  |  | 
|  | [1] "Concurrency bugs should fear the big bad data-race detector (part 2)" | 
|  | https://lwn.net/Articles/816854/ | 
|  |  | 
|  | [2] "Who's afraid of a big bad optimizing compiler?" | 
|  | https://lwn.net/Articles/793253/ |