Documentation/atomic_t: Document forward progress expectations

Add a few words on forward progress; there's been quite a bit of confusion
on the subject.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lkml.kernel.org/r/YQK9ziyogxTH0m9H@hirez.programming.kicks-ass.net
author: Peter Zijlstra <peterz@infradead.org> 2021-07-29 16:17:20 +0200
committer: Peter Zijlstra <peterz@infradead.org> 2021-08-04 15:16:47 +0200
commit: 55bccf1f93e4bf1b3209cc8648ab53f10f4601a5 (patch)
tree: 697a88da3461421a2105eff98d32c7ebc549f809
parent: locking/atomic: simplify non-atomic wrappers (diff)
1 files changed, 53 insertions, 0 deletions
diff --git a/Documentation/atomic_t.txt b/Documentation/atomic_t.txt
index a9c1e2b39b15..0f1ffa03db09 100644
--- a/Documentation/atomic_t.txt
+++ b/Documentation/atomic_t.txt
@@ -312,3 +312,56 @@ Usage:
 
 NB. try_cmpxchg() also generates better code on some platforms (notably x86)
 where the function more closely matches the hardware instruction.
+
+
+FORWARD PROGRESS
+----------------
+
+In general strong forward progress is expected of all unconditional atomic
+operations -- those in the Arithmetic and Bitwise classes and xchg(). However
+a fair amount of code also requires forward progress from the conditional
+atomic operations.
+
+Specifically 'simple' cmpxchg() loops are expected to not starve one another
+indefinitely. However, this is not evident on LL/SC architectures, because
+while an LL/SC architecture 'can/should/must' provide forward progress
+guarantees between competing LL/SC sections, such a guarantee does not
+transfer to cmpxchg() implemented using LL/SC. Consider:
+
+  old = atomic_read(&v);
+  do {
+    new = func(old);
+  } while (!atomic_try_cmpxchg(&v, &old, new));
+
+which on LL/SC becomes something like:
+
+  old = atomic_read(&v);
+  do {
+    new = func(old);
+  } while (!({
+    asm volatile ("1: LL  %[oldval], %[v]\n"
+                  "   CMP %[oldval], %[old]\n"
+                  "   BNE 2f\n"
+                  "   SC  %[new], %[v]\n"
+                  "   BNE 1b\n"
+                  "2:\n"
+                  : [oldval] "=&r" (oldval), [v] "m" (v)
+                  : [old] "r" (old), [new] "r" (new)
+                  : "memory");
+    success = (oldval == old);
+    if (!success)
+      old = oldval;
+    success; }));
+
+However, even the forward branch from the failed compare can cause the LL/SC
+to fail on some architectures, let alone whatever the compiler makes of the C
+loop body. As a result there is no guarantee whatsoever that the cacheline
+containing @v will stay on the local CPU and that progress is made.
+
+Even native CAS architectures can fail to provide forward progress for their
+primitive (see Sparc64 for an example).
+
+Such implementations are strongly encouraged to add exponential backoff loops
+to a failed CAS in order to ensure some progress. Affected architectures are
+also strongly encouraged to inspect/audit the atomic fallbacks, refcount_t and
+their locking primitives.
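
The exponential backoff recommended at the end of the patch can be illustrated with a minimal userspace sketch. It is not kernel code and not part of the patch: it assumes C11 <stdatomic.h> in place of the kernel's atomic_t API, and the names spin_delay(), BACKOFF_MAX_SPINS and update_with_backoff() are hypothetical, chosen only for this illustration.

```c
#include <stdatomic.h>

/* Hypothetical cap on the backoff delay, in spin iterations. */
#define BACKOFF_MAX_SPINS 1024u

/* Hypothetical stand-in for a cpu_relax()-style pause: burn a few cycles. */
static void spin_delay(unsigned int spins)
{
	for (volatile unsigned int i = 0; i < spins; i++)
		;
}

/* Example update function, standing in for func(old) in the text. */
static int func(int old)
{
	return old + 1;
}

/*
 * Apply func() to *v with a compare-and-swap loop. After every failed
 * CAS, wait for a number of spins that doubles up to BACKOFF_MAX_SPINS,
 * then reload the value and retry. Returns the value that was replaced.
 */
static int update_with_backoff(atomic_int *v)
{
	unsigned int spins = 1;
	int old = atomic_load_explicit(v, memory_order_relaxed);

	for (;;) {
		int new = func(old);

		if (atomic_compare_exchange_weak(v, &old, new))
			return old;

		/* Contended (or spurious) failure: back off before retrying. */
		spin_delay(spins);
		if (spins < BACKOFF_MAX_SPINS)
			spins <<= 1;

		/* The value seen by the failed CAS may be stale by now. */
		old = atomic_load_explicit(v, memory_order_relaxed);
	}
}
```

The doubling delay spreads competing retries apart in time, which gives one contender a better chance of completing its update. The cap only bounds the worst-case wait; a suitable value would depend on the platform.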