rqspinlock: Hardcode cond_acquire loops for arm64

Currently, for rqspinlock usage, the implementations of smp_cond_load_acquire (and thus, atomic_cond_read_acquire) are susceptible to stalls on arm64, because they do not guarantee that the conditional expression will be re-evaluated if the address being loaded from is not written to by other CPUs. When support for event streams (which unblock stuck WFE-based loops every ~100us) is absent, we may end up stuck forever. This is a problem for us, as we need to repeatedly invoke RES_CHECK_TIMEOUT in the spin loop to break out when the timeout expires.

Let us import the smp_cond_load_acquire_timewait implementation Ankur is proposing in [0], and then fall back to it once it is merged.

While we rely on the implementation to amortize the cost of sampling check_timeout for us, this will not happen when event stream support is unavailable. This is not the common case, and it would be difficult to fit our logic in the time_expr_ns >= time_limit_ns comparison, hence just let it be.

[0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com

Cc: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250316040541.108729-9-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
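
To make the amortization point concrete, here is a minimal userspace sketch, not the kernel code. It polls a word that no other thread ever writes and compares a timeout check that runs on every poll with one that runs only when a spin counter wraps to zero. Every demo_* name is hypothetical, the fake clock stands in for a real time source, and the 16-bit width of the spin counter is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rqspinlock_timeout; names are illustrative. */
struct demo_timeout {
	uint64_t now;         /* fake clock, advanced by one per expensive check */
	uint64_t deadline;    /* fake clock value at which the wait gives up */
	uint16_t spin;        /* assumed 16-bit counter, wraps to 0 every 65536 polls */
	unsigned long checks; /* number of expensive checks performed */
	unsigned long polls;  /* number of times the wait condition was re-evaluated */
};

/* The "expensive" check; a real implementation would sample a clock here. */
static int demo_check_timeout(struct demo_timeout *ts)
{
	assert(ts->deadline > 0);
	ts->checks++;
	return ++ts->now >= ts->deadline ? -ETIMEDOUT : 0;
}

/* Amortized: run the expensive check only when the spin counter is 0. */
static int demo_check_amortized(struct demo_timeout *ts, int *ret)
{
	if (!ts->spin++)
		*ret = demo_check_timeout(ts);
	return *ret;
}

/* Unamortized: run the expensive check on every poll. */
static int demo_check_every_time(struct demo_timeout *ts, int *ret)
{
	*ret = demo_check_timeout(ts);
	return *ret;
}

/*
 * Poll *word until it becomes nonzero or the check reports a timeout.
 * The loop only terminates on timeout if the check is actually re-run,
 * which is the guarantee the commit message says is missing on arm64.
 */
static int demo_wait_nonzero(_Atomic int *word, struct demo_timeout *ts,
			     int (*check)(struct demo_timeout *, int *))
{
	int ret = 0;

	while (!atomic_load_explicit(word, memory_order_acquire)) {
		ts->polls++;
		if (check(ts, &ret))
			return ret;
	}
	return 0;
}
```

With a deadline of three fake clock ticks, both variants perform three expensive checks, but the amortized one re-evaluates the condition far more often between them. That is acceptable for a busy loop, and wasteful when the waiting primitive already paces the re-evaluation.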
author: Kumar Kartikeya Dwivedi <memxor@gmail.com> 2025-03-15 21:05:24 -0700
committer: Alexei Starovoitov <ast@kernel.org> 2025-03-19 08:03:04 -0700
commit: ebababcd03729db14b2dd911d6600af84415509c (patch)
tree: 74eafacaf3d4a8447b52b705cf40d145b4698b67 /kernel
parent: rqspinlock: Add support for timeouts (diff)
1 files changed, 15 insertions, 0 deletions
diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
index 0d8964b4d44a..d429b923b58f 100644
--- a/kernel/bpf/rqspinlock.c
+++ b/kernel/bpf/rqspinlock.c
@@ -92,12 +92,21 @@ static noinline int check_timeout(struct rqspinlock_timeout *ts)
 	return 0;
 }
 
+/*
+ * Do not amortize with spins when res_smp_cond_load_acquire is defined,
+ * as the macro does internal amortization for us.
+ */
+#ifndef res_smp_cond_load_acquire
 #define RES_CHECK_TIMEOUT(ts, ret)                    \
 	({                                            \
 		if (!(ts).spin++)                     \
 			(ret) = check_timeout(&(ts)); \
 		(ret);                                \
 	})
+#else
+#define RES_CHECK_TIMEOUT(ts, ret, mask)	      \
+	({ (ret) = check_timeout(&(ts)); })
+#endif
 
 /*
  * Initialize the 'spin' member.
@@ -118,6 +127,12 @@ static noinline int check_timeout(struct rqspinlock_timeout *ts)
  */
 static DEFINE_PER_CPU_ALIGNED(struct qnode, rqnodes[_Q_MAX_NODES]);
 
+#ifndef res_smp_cond_load_acquire
+#define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire(v, c)
+#endif
+
+#define res_atomic_cond_read_acquire(v, c) res_smp_cond_load_acquire(&(v)->counter, (c))
+
 /**
  * resilient_queued_spin_lock_slowpath - acquire the queued spinlock
  * @lock: Pointer to queued spinlock structure
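
The second hunk relies on an override pattern: a generic definition is used unless an earlier header has already defined the macro. A rough userspace sketch of that pattern follows. It is illustrative only: every demo_* name is hypothetical, the condition is passed as a callback rather than as an expression over VAL as the kernel macro does, and folding an "expired" test into the condition is an assumption about how callers might combine the wait with a timeout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Condition callback: returns nonzero when the wait should stop. */
typedef int (*demo_cond_fn)(int val, void *arg);

/* Generic fallback: reload with acquire ordering until cond() holds. */
static int demo_generic_cond_load_acquire(_Atomic int *ptr, demo_cond_fn cond,
					  void *arg)
{
	int val;

	assert(ptr != NULL && cond != NULL);
	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_acquire);
		if (cond(val, arg))
			break;
	}
	return val;
}

/*
 * An architecture-specific header could define demo_cond_load_acquire
 * before this point; only if none did is the generic version used.
 */
#ifndef demo_cond_load_acquire
#define demo_cond_load_acquire(ptr, cond, arg) \
	demo_generic_cond_load_acquire(ptr, cond, arg)
#endif

/* Hypothetical counterpart of atomic_t, wrapped the same way as in the hunk. */
struct demo_atomic {
	_Atomic int counter;
};

#define demo_atomic_cond_read_acquire(v, cond, arg) \
	demo_cond_load_acquire(&(v)->counter, cond, arg)

/* Stop when the value reads 0, or when the caller's poll budget runs out. */
static int demo_unlocked_or_expired(int val, void *arg)
{
	int *budget = arg;

	return val == 0 || --(*budget) <= 0;
}
```

Because the budget test is part of the condition, the wait can only give up if the condition is evaluated again, which is why the choice of underlying wait primitive matters.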