diff options
Diffstat (limited to '')
-rw-r--r-- | tools/memory-model/Documentation/explanation.txt | 95 |
1 files changed, 76 insertions, 19 deletions
diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt index 5d72f3112e56..ee819a402b69 100644 --- a/tools/memory-model/Documentation/explanation.txt +++ b/tools/memory-model/Documentation/explanation.txt @@ -485,6 +485,57 @@ have R ->po X. It wouldn't make sense for a computation to depend somehow on a value that doesn't get loaded from shared memory until later in the code! +Here's a trick question: When is a dependency not a dependency? Answer: +When it is purely syntactic rather than semantic. We say a dependency +between two accesses is purely syntactic if the second access doesn't +actually depend on the result of the first. Here is a trivial example: + + r1 = READ_ONCE(x); + WRITE_ONCE(y, r1 * 0); + +There appears to be a data dependency from the load of x to the store +of y, since the value to be stored is computed from the value that was +loaded. But in fact, the value stored does not really depend on +anything since it will always be 0. Thus the data dependency is only +syntactic (it appears to exist in the code) but not semantic (the +second access will always be the same, regardless of the value of the +first access). Given code like this, a compiler could simply discard +the value returned by the load from x, which would certainly destroy +any dependency. (The compiler is not permitted to eliminate entirely +the load generated for a READ_ONCE() -- that's one of the nice +properties of READ_ONCE() -- but it is allowed to ignore the load's +value.) + +It's natural to object that no one in their right mind would write +code like the above. However, macro expansions can easily give rise +to this sort of thing, in ways that often are not apparent to the +programmer. + +Another mechanism that can lead to purely syntactic dependencies is +related to the notion of "undefined behavior". Certain program +behaviors are called "undefined" in the C language specification, +which means that when they occur there are no guarantees at all about +the outcome. Consider the following example: + + int a[1]; + int i; + + r1 = READ_ONCE(i); + r2 = READ_ONCE(a[r1]); + +Access beyond the end or before the beginning of an array is one kind +of undefined behavior. Therefore the compiler doesn't have to worry +about what will happen if r1 is nonzero, and it can assume that r1 +will always be zero regardless of the value actually loaded from i. +(If the assumption turns out to be wrong the resulting behavior will +be undefined anyway, so the compiler doesn't care!) Thus the value +from the load can be discarded, breaking the address dependency. + +The LKMM is unaware that purely syntactic dependencies are different +from semantic dependencies and therefore mistakenly predicts that the +accesses in the two examples above will be ordered. This is another +example of how the compiler can undermine the memory model. Be warned. + THE READS-FROM RELATION: rf, rfi, and rfe ----------------------------------------- @@ -1813,15 +1864,16 @@ spin_trylock() -- we can call these things lock-releases and lock-acquires -- have two properties beyond those of ordinary releases and acquires. -First, when a lock-acquire reads from a lock-release, the LKMM -requires that every instruction po-before the lock-release must -execute before any instruction po-after the lock-acquire. This would -naturally hold if the release and acquire operations were on different -CPUs, but the LKMM says it holds even when they are on the same CPU. -For example: +First, when a lock-acquire reads from or is po-after a lock-release, +the LKMM requires that every instruction po-before the lock-release +must execute before any instruction po-after the lock-acquire. This +would naturally hold if the release and acquire operations were on +different CPUs and accessed the same lock variable, but the LKMM says +it also holds when they are on the same CPU, even if they access +different lock variables. For example: int x, y; - spinlock_t s; + spinlock_t s, t; P0() { @@ -1830,9 +1882,9 @@ For example: spin_lock(&s); r1 = READ_ONCE(x); spin_unlock(&s); - spin_lock(&s); + spin_lock(&t); r2 = READ_ONCE(y); - spin_unlock(&s); + spin_unlock(&t); } P1() @@ -1842,10 +1894,10 @@ For example: WRITE_ONCE(x, 1); } -Here the second spin_lock() reads from the first spin_unlock(), and -therefore the load of x must execute before the load of y. Thus we -cannot have r1 = 1 and r2 = 0 at the end (this is an instance of the -MP pattern). +Here the second spin_lock() is po-after the first spin_unlock(), and +therefore the load of x must execute before the load of y, even though +the two locking operations use different locks. Thus we cannot have +r1 = 1 and r2 = 0 at the end (this is an instance of the MP pattern). This requirement does not apply to ordinary release and acquire fences, only to lock-related operations. For instance, suppose P0() @@ -1872,13 +1924,13 @@ instructions in the following order: and thus it could load y before x, obtaining r2 = 0 and r1 = 1. -Second, when a lock-acquire reads from a lock-release, and some other -stores W and W' occur po-before the lock-release and po-after the -lock-acquire respectively, the LKMM requires that W must propagate to -each CPU before W' does. For example, consider: +Second, when a lock-acquire reads from or is po-after a lock-release, +and some other stores W and W' occur po-before the lock-release and +po-after the lock-acquire respectively, the LKMM requires that W must +propagate to each CPU before W' does. For example, consider: int x, y; - spinlock_t x; + spinlock_t s; P0() { @@ -1908,7 +1960,12 @@ each CPU before W' does. For example, consider: If r1 = 1 at the end then the spin_lock() in P1 must have read from the spin_unlock() in P0. Hence the store to x must propagate to P2 -before the store to y does, so we cannot have r2 = 1 and r3 = 0. +before the store to y does, so we cannot have r2 = 1 and r3 = 0. But +if P1 had used a lock variable different from s, the writes could have +propagated in either order. (On the other hand, if the code in P0 and +P1 had all executed on a single CPU, as in the example before this +one, then the writes would have propagated in order even if the two +critical sections used different lock variables.) These two special requirements for lock-release and lock-acquire do not arise from the operational model. Nevertheless, kernel developers |