From e85e6a21b2b5f31148cc3f2e785262b37c3e1ec7 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 10 Jan 2019 15:30:15 -0800 Subject: rcu: Unconditionally expedite during suspend/hibernate The rcu_pm_notify() function refuses to switch to/from expedited grace periods on systems with more than 256 CPUs due to the serialized initialization of expedited grace periods. However, expedited grace periods are now initialized in parallel, removing this concern. This commit therefore removes the checks from rcu_pm_notify(), so that expedited grace periods are used unconditionally during suspend/resume and hibernate/wake operations. As always, real-time workloads wishing to completely avoid expedited grace periods can use the rcupdate.rcu_normal= kernel parameter. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index acd6ccf56faf..95e3250b7b6e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3559,13 +3559,11 @@ static int rcu_pm_notify(struct notifier_block *self, switch (action) { case PM_HIBERNATION_PREPARE: case PM_SUSPEND_PREPARE: - if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */ - rcu_expedite_gp(); + rcu_expedite_gp(); break; case PM_POST_HIBERNATION: case PM_POST_SUSPEND: - if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */ - rcu_unexpedite_gp(); + rcu_unexpedite_gp(); break; default: break; -- cgit v1.2.3-59-g8ed1b From 671a63517cf983ad8eaa324167165cef245ab744 Mon Sep 17 00:00:00 2001 From: "Joel Fernandes (Google)" Date: Sat, 19 Jan 2019 11:14:18 -0500 Subject: rcu: Avoid unnecessary softirq when system is idle When there are no callbacks pending on an idle system, I noticed that RCU softirq is continuously firing. During this the cpu_no_qs is set to false, and core_needs_qs is set to true indefinitely. This causes rcu_process_callbacks to be repeatedly called, even though the node corresponding to the CPU has that CPU's mask bit cleared and the system is idle. I believe the race is when such mask clearing is done during idle CPU scan of the quiescent state forcing stage in the kthread instead of the softirq. Since the rnp mask is cleared, but the flags on the CPU's rdp are not cleared, the CPU thinks it still needs to report to core RCU. Cure this by clearing the core_needs_qs flag when the CPU detects that its node is already updated which will avoid the unwanted softirq raises to the benefit of real-time systems. Test: Ran rcutorture for various tree RCU configs. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 1 + 1 file changed, 1 insertion(+) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 95e3250b7b6e..2f78a115d34c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2296,6 +2296,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) } mask = rdp->grpmask; if ((rnp->qsmask & mask) == 0) { + rdp->core_needs_qs = false; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } else { rdp->core_needs_qs = false; -- cgit v1.2.3-59-g8ed1b From 18d7e40679ef574e428f893101be1c0035e95ee3 Mon Sep 17 00:00:00 2001 From: Cyrill Gorcunov Date: Thu, 24 Jan 2019 21:14:37 +0300 Subject: rcu: rcu_qs -- Use raise_softirq_irqoff to not save irqs twice The rcu_qs is disabling IRQs by self so no need to do the same in raise_softirq but instead we can save some cycles using raise_softirq_irqoff directly. CC: Paul E. McKenney Signed-off-by: Cyrill Gorcunov Signed-off-by: Paul E. McKenney --- kernel/rcu/tiny.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index 911bd9076d43..477b4eb44af5 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -52,7 +52,7 @@ void rcu_qs(void) local_irq_save(flags); if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) { rcu_ctrlblk.donetail = rcu_ctrlblk.curtail; - raise_softirq(RCU_SOFTIRQ); + raise_softirq_irqoff(RCU_SOFTIRQ); } local_irq_restore(flags); } -- cgit v1.2.3-59-g8ed1b From 884157cef0acf05648fe921d80c680afababb428 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 11 Feb 2019 07:21:29 -0800 Subject: rcu: Make exit_rcu() handle non-preempted RCU readers The purpose of exit_rcu() is to handle cases where buggy code causes a task to exit within an RCU read-side critical section. It currently does that in the case where said RCU read-side critical section was preempted at least once, but fails to handle cases where preemption did not occur. This case needs to be handled because otherwise the final context switch away from the exiting task will incorrectly behave as if task exit were instead a preemption of an RCU read-side critical section, and will therefore queue the exiting task. The exiting task will have exited, and thus won't ever execute rcu_read_unlock(), which means that it will remain queued forever, blocking all subsequent grace periods, and eventually resulting in OOM. Although this is arguably better than letting grace periods proceed and having a later rcu_read_unlock() access the now-freed task structure that once belonged to the exiting tasks, it would obviously be better to correctly handle this case. This commit therefore sets ->rcu_read_lock_nesting to 1 in that case, so that the subsequence call to __rcu_read_unlock() causes the exiting task to exit its dangling RCU read-side critical section. Note that deferred quiescent states need not be considered. The reason is that removing the task from the ->blkd_tasks[] list in the call to rcu_preempt_deferred_qs() handles the per-task component of any deferred quiescent state, and all other components of any deferred quiescent state are associated with the CPU, which isn't going anywhere until some later CPU-hotplug operation, which will report any remaining deferred quiescent states from within the rcu_report_dead() function. Note also that negative values of ->rcu_read_lock_nesting need not be considered. First, these won't show up in exit_rcu() unless there is a serious bug in RCU, and second, setting ->rcu_read_lock_nesting sets the state so that the RCU read-side critical section will be exited normally. Again, this code has no effect unless there has been some prior bug that prevents a task from leaving an RCU read-side critical section before exiting. Furthermore, there have been no reports of the bug fixed by this commit appearing in production. This commit is therefore absolutely -not- recommended for backporting to -stable. Reported-by: ABHISHEK DUBEY Reported-by: BHARATH Y MOURYA Reported-by: Aravinda Prasad Signed-off-by: Paul E. McKenney Tested-by: ABHISHEK DUBEY --- kernel/rcu/tree_plugin.h | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 97dba50f6fb2..d408661d5fb7 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -804,19 +804,25 @@ static void rcu_flavor_sched_clock_irq(int user) /* * Check for a task exiting while in a preemptible-RCU read-side - * critical section, clean up if so. No need to issue warnings, - * as debug_check_no_locks_held() already does this if lockdep - * is enabled. + * critical section, clean up if so. No need to issue warnings, as + * debug_check_no_locks_held() already does this if lockdep is enabled. + * Besides, if this function does anything other than just immediately + * return, there was a bug of some sort. Spewing warnings from this + * function is like as not to simply obscure important prior warnings. */ void exit_rcu(void) { struct task_struct *t = current; - if (likely(list_empty(¤t->rcu_node_entry))) + if (unlikely(!list_empty(¤t->rcu_node_entry))) { + t->rcu_read_lock_nesting = 1; + barrier(); + t->rcu_read_unlock_special.b.blocked = true; + } else if (unlikely(t->rcu_read_lock_nesting)) { + t->rcu_read_lock_nesting = 1; + } else { return; - t->rcu_read_lock_nesting = 1; - barrier(); - t->rcu_read_unlock_special.b.blocked = true; + } __rcu_read_unlock(); rcu_preempt_deferred_qs(current); } -- cgit v1.2.3-59-g8ed1b From 3ffe3d1adc0b6cfb9b24db9995a537bb0aa30a8b Mon Sep 17 00:00:00 2001 From: Liu Song Date: Thu, 21 Feb 2019 22:13:27 +0800 Subject: rcu: Set rcutree.kthread_prio sysfs access to read-only The rcutree.kthread_prio kernel-boot parameter is used to set the priority for boost (rcub), per-CPU (rcuc), and grace-period (rcu_preempt or rcu_sched) kthreads. It is also used by rcutorture to check whether it is possible to meaningfully test RCU priority boosting. However, all of these cases will either ignore or be confused by any post-boot changes to rcutree.kthread_prio. Note that the user really can change the priorities of all of these kthreads using chrt, given sufficient privileges. Therefore, the read-write nature of sysfs access to rcutree.kthread_prio is thus at best an attractive nuisance. This commit therefore changes sysfs access to rcutree.kthread_prio to be read-only. Signed-off-by: Liu Song Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2f78a115d34c..296131450414 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -149,7 +149,7 @@ static void sync_sched_exp_online_cleanup(int cpu); /* rcuc/rcub kthread realtime priority */ static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0; -module_param(kthread_prio, int, 0644); +module_param(kthread_prio, int, 0444); /* Delay in jiffies for grace-period initialization delays, debug only. */ -- cgit v1.2.3-59-g8ed1b From b2eb85b49a576515fb845cb12568b173c2bedffc Mon Sep 17 00:00:00 2001 From: Akira Yokosawa Date: Sat, 2 Mar 2019 17:25:19 +0900 Subject: rcu: Move common code out of if-else block As the result of recent addition of "rdp->core_needs_qs = false;" in the "if" block, now both branches of the if-else have the same assignment. Factor it out and reduce line count. Signed-off-by: Akira Yokosawa Cc: Joel Fernandes Signed-off-by: Paul E. McKenney Acked-by: Joel Fernandes (Google) --- kernel/rcu/tree.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 296131450414..5aefd36ac648 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2295,12 +2295,10 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) return; } mask = rdp->grpmask; + rdp->core_needs_qs = false; if ((rnp->qsmask & mask) == 0) { - rdp->core_needs_qs = false; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } else { - rdp->core_needs_qs = false; - /* * This GP can't end until cpu checks in, so all of our * callbacks can be processed during the next GP. -- cgit v1.2.3-59-g8ed1b From da8739f23fadf05809c6c37c327367b229467045 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 5 Mar 2019 15:28:19 -0800 Subject: rcu: Allow rcu_nocbs= to specify all CPUs Currently, the rcu_nocbs= kernel boot parameter requires that a specific list of CPUs be specified, and has no way to say "all of them". As noted by user RavFX in a comment to Phoronix topic 1002538, this is an inconvenient side effect of the removal of the RCU_NOCB_CPU_ALL Kconfig option. This commit therefore enables the rcu_nocbs= kernel boot parameter to be given the string "all", as in "rcu_nocbs=all" to specify that all CPUs on the system are to have their RCU callbacks offloaded. Another approach would be to make cpulist_parse() check for "all", but there are uses of cpulist_parse() that do other checking, which could conflict with an "all". This commit therefore focuses on the specific use of cpulist_parse() in rcu_nocb_setup(). Just a note to other people who would like changes to Linux-kernel RCU: If you send your requests to me directly, they might get fixed somewhat faster. RavFX's comment was posted on January 22, 2018 and I first saw it on March 5, 2019. And the only reason that I found it -at- -all- was that I was looking for projects using RCU, and my search engine showed me that Phoronix comment quite by accident. Your choice, though! ;-) Signed-off-by: Paul E. McKenney --- Documentation/admin-guide/kernel-parameters.txt | 4 +++- kernel/rcu/tree_plugin.h | 5 ++++- 2 files changed, 7 insertions(+), 2 deletions(-) (limited to 'kernel') diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 2b8ee90bb644..d377a2166b79 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3623,7 +3623,9 @@ see CONFIG_RAS_CEC help text. rcu_nocbs= [KNL] - The argument is a cpu list, as described above. + The argument is a cpu list, as described above, + except that the string "all" can be used to + specify every CPU on the system. In kernels built with CONFIG_RCU_NOCB_CPU=y, set the specified list of CPUs to be no-callback CPUs. diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index d408661d5fb7..ed4a6dabf31d 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1776,7 +1776,10 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp) static int __init rcu_nocb_setup(char *str) { alloc_bootmem_cpumask_var(&rcu_nocb_mask); - cpulist_parse(str, rcu_nocb_mask); + if (!strcasecmp(str, "all")) + cpumask_setall(rcu_nocb_mask); + else + cpulist_parse(str, rcu_nocb_mask); return 1; } __setup("rcu_nocbs=", rcu_nocb_setup); -- cgit v1.2.3-59-g8ed1b From 497e42600b69aca1b799c840d2cfc7ad60bb8017 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 6 Mar 2019 14:47:56 -0800 Subject: rcu: Report error for bad rcu_nocbs= parameter values This commit prints a console message when cpulist_parse() reports a bad list of CPUs, and sets all CPUs' bits in that case. The reason for setting all CPUs' bits is that this is the safe(r) choice for real-time workloads, which would normally be the ones using the rcu_nocbs= kernel boot parameter. Either way, later RCU console log messages list the actual set of CPUs whose RCU callbacks will be offloaded. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index ed4a6dabf31d..f0aeb7416dcc 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1772,14 +1772,22 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp) */ -/* Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters. */ +/* + * Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters. + * The string after the "rcu_nocbs=" is either "all" for all CPUs, or a + * comma-separated list of CPUs and/or CPU ranges. If an invalid list is + * given, a warning is emitted and all CPUs are offloaded. + */ static int __init rcu_nocb_setup(char *str) { alloc_bootmem_cpumask_var(&rcu_nocb_mask); if (!strcasecmp(str, "all")) cpumask_setall(rcu_nocb_mask); else - cpulist_parse(str, rcu_nocb_mask); + if (cpulist_parse(str, rcu_nocb_mask)) { + pr_warn("rcu_nocbs= bad CPU range, all CPUs set\n"); + cpumask_setall(rcu_nocb_mask); + } return 1; } __setup("rcu_nocbs=", rcu_nocb_setup); -- cgit v1.2.3-59-g8ed1b From 0f58d2ac2c87d27006c2b610668ebc4ff1f7c3ba Mon Sep 17 00:00:00 2001 From: Neeraj Upadhyay Date: Fri, 8 Mar 2019 15:16:18 +0530 Subject: rcu: Fix self-wakeups for grace-period kthread The current rcu_gp_kthread_wake() function uses in_interrupt() and thus does a self-wakeup from all interrupt contexts, including the pointless case where the GP kthread happens to be running with bottom halves disabled, along with the impossible case where the GP kthread is running within an NMI handler (you are not supposed to invoke rcu_gp_kthread_wake() from within an NMI handler. This commit therefore replaces the in_interrupt() with in_irq(), so that the self-wakeups happen only from handlers for hardware interrupts and softirqs. This also makes the code match the comment. Signed-off-by: Neeraj Upadhyay Signed-off-by: Paul E. McKenney Acked-by: Steven Rostedt (VMware) --- kernel/rcu/tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 5aefd36ac648..139fa1f5c537 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1585,7 +1585,7 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp) static void rcu_gp_kthread_wake(void) { if ((current == rcu_state.gp_kthread && - !in_interrupt() && !in_serving_softirq()) || + !in_irq() && !in_serving_softirq()) || !READ_ONCE(rcu_state.gp_flags) || !rcu_state.gp_kthread) return; -- cgit v1.2.3-59-g8ed1b From 6973032a602ee678c98644a30d57ebf9c72dd6d3 Mon Sep 17 00:00:00 2001 From: Neeraj Upadhyay Date: Mon, 11 Mar 2019 15:16:11 +0530 Subject: rcu: Default jiffies_to_sched_qs to jiffies_till_sched_qs The current code only calls adjust_jiffies_till_sched_qs() if jiffies_till_sched_qs is left at its default value, so when the jiffies_till_sched_qs kernel-boot parameter actually is specified, jiffies_to_sched_qs will be left with the value zero, which will result in useless slowdowns of cond_resched(). This commit therefore changes rcu_init_geometry() to unconditionally invoke adjust_jiffies_till_sched_qs(), which ensures that jiffies_to_sched_qs will be initialized in all cases, thus maintaining good cond_resched() performance. Signed-off-by: Neeraj Upadhyay Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 139fa1f5c537..466299c3d2da 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3739,8 +3739,7 @@ static void __init rcu_init_geometry(void) jiffies_till_first_fqs = d; if (jiffies_till_next_fqs == ULONG_MAX) jiffies_till_next_fqs = d; - if (jiffies_till_sched_qs == ULONG_MAX) - adjust_jiffies_till_sched_qs(); + adjust_jiffies_till_sched_qs(); /* If the compile-time values are accurate, just leave. */ if (rcu_fanout_leaf == RCU_FANOUT_LEAF && -- cgit v1.2.3-59-g8ed1b From 85f2b60c4321b088ba08ec9a05b8a7b68e4ada2a Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 11 Mar 2019 15:45:13 -0700 Subject: rcu: Update jiffies_to_sched_qs and adjust_jiffies_till_sched_qs() comments This commit better documents the jiffies_to_sched_qs default-value strategy used by adjust_jiffies_till_sched_qs() Reported-by: Joel Fernandes Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 466299c3d2da..e117732bcd5d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -406,7 +406,7 @@ static bool rcu_kick_kthreads; */ static ulong jiffies_till_sched_qs = ULONG_MAX; module_param(jiffies_till_sched_qs, ulong, 0444); -static ulong jiffies_to_sched_qs; /* Adjusted version of above if not default */ +static ulong jiffies_to_sched_qs; /* See adjust_jiffies_till_sched_qs(). */ module_param(jiffies_to_sched_qs, ulong, 0444); /* Display only! */ /* @@ -424,6 +424,7 @@ static void adjust_jiffies_till_sched_qs(void) WRITE_ONCE(jiffies_to_sched_qs, jiffies_till_sched_qs); return; } + /* Otherwise, set to third fqs scan, but bound below on large system. */ j = READ_ONCE(jiffies_till_first_fqs) + 2 * READ_ONCE(jiffies_till_next_fqs); if (j < HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV) -- cgit v1.2.3-59-g8ed1b From 5d8a752e31aaa4c9703f201956a40b45ed791217 Mon Sep 17 00:00:00 2001 From: Zhouyi Zhou Date: Wed, 20 Mar 2019 03:33:00 +0000 Subject: rcu: Fix force_qs_rnp() header comment Previously, threads blocked on offlining CPUS were migrated to the root rcu_node structure, thus requiring RCU priority boosting on this structure. However, since commit d19fb8d1f3f6 ("rcu: Don't migrate blocked tasks even if all corresponding CPUs offline"), RCU does not migrate blocked tasks. Consequently, RCU no longer does RCU priority boosting on the root rcu_node structure as of commit 1be0085b515e ("rcu: Don't initiate RCU priority boosting on root rcu_node"). This commit therefore brings comments for the force_qs_rnp() function's header comment in line with this new no-root-boosting reality. Signed-off-by: Zhouyi Zhou [ paulmck: Also remove obsolete comment on suppressing new grace periods. ] Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e117732bcd5d..abc8512ceb5f 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2548,11 +2548,11 @@ void rcu_sched_clock_irq(int user) } /* - * Scan the leaf rcu_node structures, processing dyntick state for any that - * have not yet encountered a quiescent state, using the function specified. - * Also initiate boosting for any threads blocked on the root rcu_node. - * - * The caller must have suppressed start of new grace periods. + * Scan the leaf rcu_node structures. For each structure on which all + * CPUs have reported a quiescent state and on which there are tasks + * blocking the current grace period, initiate RCU priority boosting. + * Otherwise, invoke the specified function to check dyntick state for + * each CPU that has not yet reported a quiescent state. */ static void force_qs_rnp(int (*f)(struct rcu_data *rdp)) { -- cgit v1.2.3-59-g8ed1b From a2badefa8574a7a424a72f67a3bd43248141f76a Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 21 Mar 2019 16:29:50 -0700 Subject: rcu: Eliminate redundant NULL-pointer check Because rcu_wake_cond() checks for a null task_struct pointer, there is no need for its callers to do so. This commit eliminates the redundant check. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index f0aeb7416dcc..81d3cd821891 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1191,8 +1191,6 @@ static int rcu_boost_kthread(void *arg) static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags) __releases(rnp->lock) { - struct task_struct *t; - raw_lockdep_assert_held_rcu_node(rnp); if (!rcu_preempt_blocked_readers_cgp(rnp) && rnp->exp_tasks == NULL) { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); @@ -1206,9 +1204,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags) if (rnp->exp_tasks == NULL) rnp->boost_tasks = rnp->gp_tasks; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - t = rnp->boost_kthread_task; - if (t) - rcu_wake_cond(t, rnp->boost_kthread_status); + rcu_wake_cond(rnp->boost_kthread_task, + rnp->boost_kthread_status); } else { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } -- cgit v1.2.3-59-g8ed1b From f1a98045abd824df833354b309b5fa64ff95f792 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 23 Mar 2019 09:19:12 -0700 Subject: rcu: Fix typo in tree_exp.h comment This commit changes a rcu_exp_handler() comment from rcu_preempt_defer_qs() to rcu_preempt_deferred_qs() in order to better match reality. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_exp.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 4c2a0189e748..ec4fb93a5dbe 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -648,7 +648,7 @@ static void rcu_exp_handler(void *unused) * * If the CPU is fully enabled (or if some buggy RCU-preempt * read-side critical section is being used from idle), just - * invoke rcu_preempt_defer_qs() to immediately report the + * invoke rcu_preempt_deferred_qs() to immediately report the * quiescent state. We cannot use rcu_read_unlock_special() * because we are in an interrupt handler, which will cause that * function to take an early exit without doing anything. -- cgit v1.2.3-59-g8ed1b From add0d37b4f1e77de7d170ece43c8d765572a1eab Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 26 Mar 2019 10:22:22 -0700 Subject: rcu: Correct READ_ONCE()/WRITE_ONCE() for ->rcu_read_unlock_special The task_struct structure's ->rcu_read_unlock_special field is only ever read or written by the owning task, but it is accessed both at process and interrupt levels. It may therefore be accessed using plain reads and writes while interrupts are disabled, but must be accessed using READ_ONCE() and WRITE_ONCE() or better otherwise. This commit makes a few adjustments to align with this discipline. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_plugin.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index ec4fb93a5dbe..1ee0782213b8 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -633,7 +633,7 @@ static void rcu_exp_handler(void *unused) raw_spin_lock_irqsave_rcu_node(rnp, flags); if (rnp->expmask & rdp->grpmask) { rdp->deferred_qs = true; - WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true); + t->rcu_read_unlock_special.b.exp_hint = true; } raw_spin_unlock_irqrestore_rcu_node(rnp, flags); return; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 81d3cd821891..6ddb3c05e88f 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -285,7 +285,7 @@ static void rcu_qs(void) TPS("cpuqs")); __this_cpu_write(rcu_data.cpu_no_qs.b.norm, false); barrier(); /* Coordinate with rcu_flavor_sched_clock_irq(). */ - current->rcu_read_unlock_special.b.need_qs = false; + WRITE_ONCE(current->rcu_read_unlock_special.b.need_qs, false); } } @@ -817,7 +817,7 @@ void exit_rcu(void) if (unlikely(!list_empty(¤t->rcu_node_entry))) { t->rcu_read_lock_nesting = 1; barrier(); - t->rcu_read_unlock_special.b.blocked = true; + WRITE_ONCE(t->rcu_read_unlock_special.b.blocked, true); } else if (unlikely(t->rcu_read_lock_nesting)) { t->rcu_read_lock_nesting = 1; } else { -- cgit v1.2.3-59-g8ed1b From 5cdfd174ea6c2dc1d331b61bdc9572698658600a Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 12 Feb 2019 10:44:33 -0800 Subject: srcu: Check for in-flight callbacks in _cleanup_srcu_struct() If someone fails to drain the corresponding SRCU callbacks (for example, by failing to invoke srcu_barrier()) before invoking either cleanup_srcu_struct() or cleanup_srcu_struct_quiesced(), the resulting diagnostic is an ambiguous use-after-free diagnostic, and even then only if you are running something like KASAN. This commit therefore improves SRCU diagnostics by adding checks for in-flight callbacks at _cleanup_srcu_struct() time. Note that these diagnostics can still be defeated, for example, by invoking call_srcu() concurrently with cleanup_srcu_struct(). Which is a really bad idea, but sometimes all too easy to do. But even then, these diagnostics have at least some probability of catching the problem. Reported-by: Sagi Grimberg Reported-by: Bart Van Assche Signed-off-by: Paul E. McKenney Tested-by: Bart Van Assche --- kernel/rcu/srcutree.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'kernel') diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index a60b8ba9e1ac..4f30f3ecabc1 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -387,6 +387,8 @@ void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced) del_timer_sync(&sdp->delay_work); flush_work(&sdp->work); } + if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist))) + return; /* Forgot srcu_barrier(), so just leak it! */ } if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) || WARN_ON(srcu_readers_active(ssp))) { -- cgit v1.2.3-59-g8ed1b From f5ad3991493c69d203d42b94d32349b54c58a3f1 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 13 Feb 2019 13:54:37 -0800 Subject: srcu: Remove cleanup_srcu_struct_quiesced() The cleanup_srcu_struct_quiesced() function was added because NVME used WQ_MEM_RECLAIM workqueues and SRCU did not, which meant that NVME workqueues waiting on SRCU workqueues could result in deadlocks during low-memory conditions. However, SRCU now also has WQ_MEM_RECLAIM workqueues, so there is no longer a potential for deadlock. Furthermore, it turns out to be extremely hard to use cleanup_srcu_struct_quiesced() correctly due to the fact that SRCU callback invocation accesses the srcu_struct structure's per-CPU data area just after callbacks are invoked. Therefore, the usual practice of using srcu_barrier() to wait for callbacks to be invoked before invoking cleanup_srcu_struct_quiesced() fails because SRCU's callback-invocation workqueue handler might be delayed, which can result in cleanup_srcu_struct_quiesced() being invoked (and thus freeing the per-CPU data) before the SRCU's callback-invocation workqueue handler is finished using that per-CPU data. Nor is this a theoretical problem: KASAN emitted use-after-free warnings because of this problem on actual runs. In short, NVME can now safely invoke cleanup_srcu_struct(), which avoids the use-after-free scenario. And cleanup_srcu_struct_quiesced() is quite difficult to use safely. This commit therefore removes cleanup_srcu_struct_quiesced(), switching its sole user back to cleanup_srcu_struct(). This effectively reverts the following pair of commits: f7194ac32ca2 ("srcu: Add cleanup_srcu_struct_quiesced()") 4317228ad9b8 ("nvme: Avoid flush dependency in delete controller flow") Reported-by: Bart Van Assche Signed-off-by: Paul E. McKenney Reviewed-by: Bart Van Assche Tested-by: Bart Van Assche --- drivers/nvme/host/core.c | 2 +- include/linux/srcu.h | 36 +----------------------------------- kernel/rcu/rcutorture.c | 7 +------ kernel/rcu/srcutiny.c | 9 +++------ kernel/rcu/srcutree.c | 30 ++++++++++++------------------ 5 files changed, 18 insertions(+), 66 deletions(-) (limited to 'kernel') diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 470601980794..739c5b4830d7 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -388,7 +388,7 @@ static void nvme_free_ns_head(struct kref *ref) nvme_mpath_remove_disk(head); ida_simple_remove(&head->subsys->ns_ida, head->instance); list_del_init(&head->entry); - cleanup_srcu_struct_quiesced(&head->srcu); + cleanup_srcu_struct(&head->srcu); nvme_put_subsystem(head->subsys); kfree(head); } diff --git a/include/linux/srcu.h b/include/linux/srcu.h index c495b2d51569..e432cc92c73d 100644 --- a/include/linux/srcu.h +++ b/include/linux/srcu.h @@ -56,45 +56,11 @@ struct srcu_struct { }; void call_srcu(struct srcu_struct *ssp, struct rcu_head *head, void (*func)(struct rcu_head *head)); -void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced); +void cleanup_srcu_struct(struct srcu_struct *ssp); int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp); void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp); void synchronize_srcu(struct srcu_struct *ssp); -/** - * cleanup_srcu_struct - deconstruct a sleep-RCU structure - * @ssp: structure to clean up. - * - * Must invoke this after you are finished using a given srcu_struct that - * was initialized via init_srcu_struct(), else you leak memory. - */ -static inline void cleanup_srcu_struct(struct srcu_struct *ssp) -{ - _cleanup_srcu_struct(ssp, false); -} - -/** - * cleanup_srcu_struct_quiesced - deconstruct a quiesced sleep-RCU structure - * @ssp: structure to clean up. - * - * Must invoke this after you are finished using a given srcu_struct that - * was initialized via init_srcu_struct(), else you leak memory. Also, - * all grace-period processing must have completed. - * - * "Completed" means that the last synchronize_srcu() and - * synchronize_srcu_expedited() calls must have returned before the call - * to cleanup_srcu_struct_quiesced(). It also means that the callback - * from the last call_srcu() must have been invoked before the call to - * cleanup_srcu_struct_quiesced(), but you can use srcu_barrier() to help - * with this last. Violating these rules will get you a WARN_ON() splat - * (with high probability, anyway), and will also cause the srcu_struct - * to be leaked. - */ -static inline void cleanup_srcu_struct_quiesced(struct srcu_struct *ssp) -{ - _cleanup_srcu_struct(ssp, true); -} - #ifdef CONFIG_DEBUG_LOCK_ALLOC /** diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index f14d1b18a74f..d2b226110835 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -592,12 +592,7 @@ static void srcu_torture_init(void) static void srcu_torture_cleanup(void) { - static DEFINE_TORTURE_RANDOM(rand); - - if (torture_random(&rand) & 0x800) - cleanup_srcu_struct(&srcu_ctld); - else - cleanup_srcu_struct_quiesced(&srcu_ctld); + cleanup_srcu_struct(&srcu_ctld); srcu_ctlp = &srcu_ctl; /* In case of a later rcutorture run. */ } diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c index 5d4a39a6505a..44d6606b8325 100644 --- a/kernel/rcu/srcutiny.c +++ b/kernel/rcu/srcutiny.c @@ -76,19 +76,16 @@ EXPORT_SYMBOL_GPL(init_srcu_struct); * Must invoke this after you are finished using a given srcu_struct that * was initialized via init_srcu_struct(), else you leak memory. */ -void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced) +void cleanup_srcu_struct(struct srcu_struct *ssp) { WARN_ON(ssp->srcu_lock_nesting[0] || ssp->srcu_lock_nesting[1]); - if (quiesced) - WARN_ON(work_pending(&ssp->srcu_work)); - else - flush_work(&ssp->srcu_work); + flush_work(&ssp->srcu_work); WARN_ON(ssp->srcu_gp_running); WARN_ON(ssp->srcu_gp_waiting); WARN_ON(ssp->srcu_cb_head); WARN_ON(&ssp->srcu_cb_head != ssp->srcu_cb_tail); } -EXPORT_SYMBOL_GPL(_cleanup_srcu_struct); +EXPORT_SYMBOL_GPL(cleanup_srcu_struct); /* * Removes the count for the old reader from the appropriate element of diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 4f30f3ecabc1..9b761e546de8 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -360,8 +360,14 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp) return SRCU_INTERVAL; } -/* Helper for cleanup_srcu_struct() and cleanup_srcu_struct_quiesced(). */ -void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced) +/** + * cleanup_srcu_struct - deconstruct a sleep-RCU structure + * @ssp: structure to clean up. + * + * Must invoke this after you are finished using a given srcu_struct that + * was initialized via init_srcu_struct(), else you leak memory. + */ +void cleanup_srcu_struct(struct srcu_struct *ssp) { int cpu; @@ -369,24 +375,12 @@ void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced) return; /* Just leak it! */ if (WARN_ON(srcu_readers_active(ssp))) return; /* Just leak it! */ - if (quiesced) { - if (WARN_ON(delayed_work_pending(&ssp->work))) - return; /* Just leak it! */ - } else { - flush_delayed_work(&ssp->work); - } + flush_delayed_work(&ssp->work); for_each_possible_cpu(cpu) { struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu); - if (quiesced) { - if (WARN_ON(timer_pending(&sdp->delay_work))) - return; /* Just leak it! */ - if (WARN_ON(work_pending(&sdp->work))) - return; /* Just leak it! */ - } else { - del_timer_sync(&sdp->delay_work); - flush_work(&sdp->work); - } + del_timer_sync(&sdp->delay_work); + flush_work(&sdp->work); if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist))) return; /* Forgot srcu_barrier(), so just leak it! */ } @@ -399,7 +393,7 @@ void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced) free_percpu(ssp->sda); ssp->sda = NULL; } -EXPORT_SYMBOL_GPL(_cleanup_srcu_struct); +EXPORT_SYMBOL_GPL(cleanup_srcu_struct); /* * Counts the new reader in the appropriate per-CPU element of the -- cgit v1.2.3-59-g8ed1b From 10462d6f58fb6dbde7563e9343505d98d5bfba3d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 16:10:57 -0800 Subject: rcu: Move RCU CPU stall-warning code out of update.c The RCU CPU stall-warning code for normal grace periods is currently scattered across three files, due to earlier Tiny RCU support for RCU CPU stall warnings and for old Kconfig options that have long since been retired. Given that it is hard for the lead RCU maintainer to find relevant stall-warning code, it would be good to consolidate it. This commit starts this process by moving stall-warning code from kernel/rcu/update.c to a new kernel/rcu/tree_stall.h file. Note that the definitions of rcu_cpu_stall_suppress and rcu_cpu_stall_timeout must remain in kernel/rcu/update.h to provide compatibility for kernel boot parameter lists. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcu.h | 1 + kernel/rcu/tree.c | 1 + kernel/rcu/tree_stall.h | 63 +++++++++++++++++++++++++++++++++++++++++++++++++ kernel/rcu/update.c | 59 +-------------------------------------------- 4 files changed, 66 insertions(+), 58 deletions(-) create mode 100644 kernel/rcu/tree_stall.h (limited to 'kernel') diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index acee72c0b24b..4b58c907b4b7 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -233,6 +233,7 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head) #ifdef CONFIG_RCU_STALL_COMMON extern int rcu_cpu_stall_suppress; +extern int rcu_cpu_stall_timeout; int rcu_jiffies_till_stall_check(void); #define rcu_ftrace_dump_stall_suppress() \ diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index acd6ccf56faf..424d50ccf9e6 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3858,5 +3858,6 @@ void __init rcu_init(void) srcu_init(); } +#include "tree_stall.h" #include "tree_exp.h" #include "tree_plugin.h" diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h new file mode 100644 index 000000000000..682189f4d083 --- /dev/null +++ b/kernel/rcu/tree_stall.h @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * RCU CPU stall warnings for normal RCU grace periods + * + * Copyright IBM Corporation, 2019 + * + * Author: Paul E. McKenney + */ + + +#ifdef CONFIG_PROVE_RCU +#define RCU_STALL_DELAY_DELTA (5 * HZ) +#else +#define RCU_STALL_DELAY_DELTA 0 +#endif + +int rcu_jiffies_till_stall_check(void) +{ + int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout); + + /* + * Limit check must be consistent with the Kconfig limits + * for CONFIG_RCU_CPU_STALL_TIMEOUT. + */ + if (till_stall_check < 3) { + WRITE_ONCE(rcu_cpu_stall_timeout, 3); + till_stall_check = 3; + } else if (till_stall_check > 300) { + WRITE_ONCE(rcu_cpu_stall_timeout, 300); + till_stall_check = 300; + } + return till_stall_check * HZ + RCU_STALL_DELAY_DELTA; +} +EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check); + +void rcu_sysrq_start(void) +{ + if (!rcu_cpu_stall_suppress) + rcu_cpu_stall_suppress = 2; +} + +void rcu_sysrq_end(void) +{ + if (rcu_cpu_stall_suppress == 2) + rcu_cpu_stall_suppress = 0; +} + +static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr) +{ + rcu_cpu_stall_suppress = 1; + return NOTIFY_DONE; +} + +static struct notifier_block rcu_panic_block = { + .notifier_call = rcu_panic, +}; + +static int __init check_cpu_stall_init(void) +{ + atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block); + return 0; +} +early_initcall(check_cpu_stall_init); diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index cbaa976c5945..c3bf44ba42e5 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -424,68 +424,11 @@ EXPORT_SYMBOL_GPL(do_trace_rcu_torture_read); #endif #ifdef CONFIG_RCU_STALL_COMMON - -#ifdef CONFIG_PROVE_RCU -#define RCU_STALL_DELAY_DELTA (5 * HZ) -#else -#define RCU_STALL_DELAY_DELTA 0 -#endif - int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */ EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress); -static int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT; - module_param(rcu_cpu_stall_suppress, int, 0644); +int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT; module_param(rcu_cpu_stall_timeout, int, 0644); - -int rcu_jiffies_till_stall_check(void) -{ - int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout); - - /* - * Limit check must be consistent with the Kconfig limits - * for CONFIG_RCU_CPU_STALL_TIMEOUT. - */ - if (till_stall_check < 3) { - WRITE_ONCE(rcu_cpu_stall_timeout, 3); - till_stall_check = 3; - } else if (till_stall_check > 300) { - WRITE_ONCE(rcu_cpu_stall_timeout, 300); - till_stall_check = 300; - } - return till_stall_check * HZ + RCU_STALL_DELAY_DELTA; -} -EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check); - -void rcu_sysrq_start(void) -{ - if (!rcu_cpu_stall_suppress) - rcu_cpu_stall_suppress = 2; -} - -void rcu_sysrq_end(void) -{ - if (rcu_cpu_stall_suppress == 2) - rcu_cpu_stall_suppress = 0; -} - -static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr) -{ - rcu_cpu_stall_suppress = 1; - return NOTIFY_DONE; -} - -static struct notifier_block rcu_panic_block = { - .notifier_call = rcu_panic, -}; - -static int __init check_cpu_stall_init(void) -{ - atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block); - return 0; -} -early_initcall(check_cpu_stall_init); - #endif /* #ifdef CONFIG_RCU_STALL_COMMON */ #ifdef CONFIG_TASKS_RCU -- cgit v1.2.3-59-g8ed1b From 3fc3d1709fc75995ee09ad4f35f160cf360b397b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 16:34:47 -0800 Subject: rcu: Move RCU CPU stall-warning code out of tree_plugin.h The RCU CPU stall-warning code for normal grace periods is currently scattered across two files, due to earlier Tiny RCU support for RCU CPU stall warnings and for old Kconfig options that have long since been retired. Given that it is hard for the lead RCU maintainer to find relevant stall-warning code, it would be good to consolidate it. This commit continues this process by moving stall-warning code from kernel/rcu/tree_plugin.c to a new kernel/rcu/tree_stall.h file. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 90 --------------------------------------------- kernel/rcu/tree_stall.h | 95 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 95 insertions(+), 90 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 97dba50f6fb2..7fa3bc4d481b 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -642,79 +642,6 @@ static void rcu_read_unlock_special(struct task_struct *t) rcu_preempt_deferred_qs_irqrestore(t, flags); } -/* - * Dump detailed information for all tasks blocking the current RCU - * grace period on the specified rcu_node structure. - */ -static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp) -{ - unsigned long flags; - struct task_struct *t; - - raw_spin_lock_irqsave_rcu_node(rnp, flags); - if (!rcu_preempt_blocked_readers_cgp(rnp)) { - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - return; - } - t = list_entry(rnp->gp_tasks->prev, - struct task_struct, rcu_node_entry); - list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { - /* - * We could be printing a lot while holding a spinlock. - * Avoid triggering hard lockup. - */ - touch_nmi_watchdog(); - sched_show_task(t); - } - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); -} - -/* - * Dump detailed information for all tasks blocking the current RCU - * grace period. - */ -static void rcu_print_detail_task_stall(void) -{ - struct rcu_node *rnp = rcu_get_root(); - - rcu_print_detail_task_stall_rnp(rnp); - rcu_for_each_leaf_node(rnp) - rcu_print_detail_task_stall_rnp(rnp); -} - -static void rcu_print_task_stall_begin(struct rcu_node *rnp) -{ - pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):", - rnp->level, rnp->grplo, rnp->grphi); -} - -static void rcu_print_task_stall_end(void) -{ - pr_cont("\n"); -} - -/* - * Scan the current list of tasks blocked within RCU read-side critical - * sections, printing out the tid of each. - */ -static int rcu_print_task_stall(struct rcu_node *rnp) -{ - struct task_struct *t; - int ndetected = 0; - - if (!rcu_preempt_blocked_readers_cgp(rnp)) - return 0; - rcu_print_task_stall_begin(rnp); - t = list_entry(rnp->gp_tasks->prev, - struct task_struct, rcu_node_entry); - list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { - pr_cont(" P%d", t->pid); - ndetected++; - } - rcu_print_task_stall_end(); - return ndetected; -} - /* * Scan the current list of tasks blocked within RCU read-side critical * sections, printing out the tid of each that is blocking the current @@ -979,23 +906,6 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t) } static void rcu_preempt_deferred_qs(struct task_struct *t) { } -/* - * Because preemptible RCU does not exist, we never have to check for - * tasks blocked within RCU read-side critical sections. - */ -static void rcu_print_detail_task_stall(void) -{ -} - -/* - * Because preemptible RCU does not exist, we never have to check for - * tasks blocked within RCU read-side critical sections. - */ -static int rcu_print_task_stall(struct rcu_node *rnp) -{ - return 0; -} - /* * Because preemptible RCU does not exist, we never have to check for * tasks blocked within RCU read-side critical sections that are diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 682189f4d083..6f5f94944f49 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -61,3 +61,98 @@ static int __init check_cpu_stall_init(void) return 0; } early_initcall(check_cpu_stall_init); + +#ifdef CONFIG_PREEMPT + +/* + * Dump detailed information for all tasks blocking the current RCU + * grace period on the specified rcu_node structure. + */ +static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp) +{ + unsigned long flags; + struct task_struct *t; + + raw_spin_lock_irqsave_rcu_node(rnp, flags); + if (!rcu_preempt_blocked_readers_cgp(rnp)) { + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + return; + } + t = list_entry(rnp->gp_tasks->prev, + struct task_struct, rcu_node_entry); + list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { + /* + * We could be printing a lot while holding a spinlock. + * Avoid triggering hard lockup. + */ + touch_nmi_watchdog(); + sched_show_task(t); + } + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); +} + +/* + * Dump detailed information for all tasks blocking the current RCU + * grace period. + */ +static void rcu_print_detail_task_stall(void) +{ + struct rcu_node *rnp = rcu_get_root(); + + rcu_print_detail_task_stall_rnp(rnp); + rcu_for_each_leaf_node(rnp) + rcu_print_detail_task_stall_rnp(rnp); +} + +static void rcu_print_task_stall_begin(struct rcu_node *rnp) +{ + pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):", + rnp->level, rnp->grplo, rnp->grphi); +} + +static void rcu_print_task_stall_end(void) +{ + pr_cont("\n"); +} + +/* + * Scan the current list of tasks blocked within RCU read-side critical + * sections, printing out the tid of each. + */ +static int rcu_print_task_stall(struct rcu_node *rnp) +{ + struct task_struct *t; + int ndetected = 0; + + if (!rcu_preempt_blocked_readers_cgp(rnp)) + return 0; + rcu_print_task_stall_begin(rnp); + t = list_entry(rnp->gp_tasks->prev, + struct task_struct, rcu_node_entry); + list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { + pr_cont(" P%d", t->pid); + ndetected++; + } + rcu_print_task_stall_end(); + return ndetected; +} + +#else /* #ifdef CONFIG_PREEMPT */ + +/* + * Because preemptible RCU does not exist, we never have to check for + * tasks blocked within RCU read-side critical sections. + */ +static void rcu_print_detail_task_stall(void) +{ +} + +/* + * Because preemptible RCU does not exist, we never have to check for + * tasks blocked within RCU read-side critical sections. + */ +static int rcu_print_task_stall(struct rcu_node *rnp) +{ + return 0; +} +#endif /* #else #ifdef CONFIG_PREEMPT */ -- cgit v1.2.3-59-g8ed1b From 32255d51b6ed00de2b88970ceea8db0ec3bae6f8 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 16:57:41 -0800 Subject: rcu: Move RCU CPU stall-warning code out of tree.c This commit completes the process of consolidating the code for RCU CPU stall warnings for normal grace periods by moving the remaining such code from kernel/rcu/tree.c to kernel/rcu/tree_stall.h. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 291 ----------------------------------------------- kernel/rcu/tree.h | 10 +- kernel/rcu/tree_stall.h | 292 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 299 insertions(+), 294 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 424d50ccf9e6..001dd05f6e38 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -102,8 +102,6 @@ int rcu_num_lvls __read_mostly = RCU_NUM_LVLS; /* Number of rcu_nodes at specified level. */ int num_rcu_lvl[] = NUM_RCU_LVL_INIT; int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */ -/* panic() on RCU Stall sysctl. */ -int sysctl_panic_on_rcu_stall __read_mostly; /* Commandeer a sysrq key to dump RCU's tree. */ static bool sysrq_rcu; module_param(sysrq_rcu, bool, 0444); @@ -1167,295 +1165,6 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) return 0; } -static void record_gp_stall_check_time(void) -{ - unsigned long j = jiffies; - unsigned long j1; - - rcu_state.gp_start = j; - j1 = rcu_jiffies_till_stall_check(); - /* Record ->gp_start before ->jiffies_stall. */ - smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */ - rcu_state.jiffies_resched = j + j1 / 2; - rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); -} - -/* - * Complain about starvation of grace-period kthread. - */ -static void rcu_check_gp_kthread_starvation(void) -{ - struct task_struct *gpk = rcu_state.gp_kthread; - unsigned long j; - - j = jiffies - READ_ONCE(rcu_state.gp_activity); - if (j > 2 * HZ) { - pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n", - rcu_state.name, j, - (long)rcu_seq_current(&rcu_state.gp_seq), - READ_ONCE(rcu_state.gp_flags), - gp_state_getname(rcu_state.gp_state), rcu_state.gp_state, - gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1); - if (gpk) { - pr_err("RCU grace-period kthread stack dump:\n"); - sched_show_task(gpk); - wake_up_process(gpk); - } - } -} - -/* - * Dump stacks of all tasks running on stalled CPUs. First try using - * NMIs, but fall back to manual remote stack tracing on architectures - * that don't support NMI-based stack dumps. The NMI-triggered stack - * traces are more accurate because they are printed by the target CPU. - */ -static void rcu_dump_cpu_stacks(void) -{ - int cpu; - unsigned long flags; - struct rcu_node *rnp; - - rcu_for_each_leaf_node(rnp) { - raw_spin_lock_irqsave_rcu_node(rnp, flags); - for_each_leaf_node_possible_cpu(rnp, cpu) - if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) - if (!trigger_single_cpu_backtrace(cpu)) - dump_cpu_task(cpu); - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - } -} - -/* - * If too much time has passed in the current grace period, and if - * so configured, go kick the relevant kthreads. - */ -static void rcu_stall_kick_kthreads(void) -{ - unsigned long j; - - if (!rcu_kick_kthreads) - return; - j = READ_ONCE(rcu_state.jiffies_kick_kthreads); - if (time_after(jiffies, j) && rcu_state.gp_kthread && - (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) { - WARN_ONCE(1, "Kicking %s grace-period kthread\n", - rcu_state.name); - rcu_ftrace_dump(DUMP_ALL); - wake_up_process(rcu_state.gp_kthread); - WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ); - } -} - -static void panic_on_rcu_stall(void) -{ - if (sysctl_panic_on_rcu_stall) - panic("RCU Stall\n"); -} - -static void print_other_cpu_stall(unsigned long gp_seq) -{ - int cpu; - unsigned long flags; - unsigned long gpa; - unsigned long j; - int ndetected = 0; - struct rcu_node *rnp = rcu_get_root(); - long totqlen = 0; - - /* Kick and suppress, if so configured. */ - rcu_stall_kick_kthreads(); - if (rcu_cpu_stall_suppress) - return; - - /* - * OK, time to rat on our buddy... - * See Documentation/RCU/stallwarn.txt for info on how to debug - * RCU CPU stall warnings. - */ - pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name); - print_cpu_stall_info_begin(); - rcu_for_each_leaf_node(rnp) { - raw_spin_lock_irqsave_rcu_node(rnp, flags); - ndetected += rcu_print_task_stall(rnp); - if (rnp->qsmask != 0) { - for_each_leaf_node_possible_cpu(rnp, cpu) - if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { - print_cpu_stall_info(cpu); - ndetected++; - } - } - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - } - - print_cpu_stall_info_end(); - for_each_possible_cpu(cpu) - totqlen += rcu_get_n_cbs_cpu(cpu); - pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", - smp_processor_id(), (long)(jiffies - rcu_state.gp_start), - (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); - if (ndetected) { - rcu_dump_cpu_stacks(); - - /* Complain about tasks blocking the grace period. */ - rcu_print_detail_task_stall(); - } else { - if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) { - pr_err("INFO: Stall ended before state dump start\n"); - } else { - j = jiffies; - gpa = READ_ONCE(rcu_state.gp_activity); - pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n", - rcu_state.name, j - gpa, j, gpa, - READ_ONCE(jiffies_till_next_fqs), - rcu_get_root()->qsmask); - /* In this case, the current CPU might be at fault. */ - sched_show_task(current); - } - } - /* Rewrite if needed in case of slow consoles. */ - if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) - WRITE_ONCE(rcu_state.jiffies_stall, - jiffies + 3 * rcu_jiffies_till_stall_check() + 3); - - rcu_check_gp_kthread_starvation(); - - panic_on_rcu_stall(); - - rcu_force_quiescent_state(); /* Kick them all. */ -} - -static void print_cpu_stall(void) -{ - int cpu; - unsigned long flags; - struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_node *rnp = rcu_get_root(); - long totqlen = 0; - - /* Kick and suppress, if so configured. */ - rcu_stall_kick_kthreads(); - if (rcu_cpu_stall_suppress) - return; - - /* - * OK, time to rat on ourselves... - * See Documentation/RCU/stallwarn.txt for info on how to debug - * RCU CPU stall warnings. - */ - pr_err("INFO: %s self-detected stall on CPU", rcu_state.name); - print_cpu_stall_info_begin(); - raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags); - print_cpu_stall_info(smp_processor_id()); - raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags); - print_cpu_stall_info_end(); - for_each_possible_cpu(cpu) - totqlen += rcu_get_n_cbs_cpu(cpu); - pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n", - jiffies - rcu_state.gp_start, - (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); - - rcu_check_gp_kthread_starvation(); - - rcu_dump_cpu_stacks(); - - raw_spin_lock_irqsave_rcu_node(rnp, flags); - /* Rewrite if needed in case of slow consoles. */ - if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) - WRITE_ONCE(rcu_state.jiffies_stall, - jiffies + 3 * rcu_jiffies_till_stall_check() + 3); - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - - panic_on_rcu_stall(); - - /* - * Attempt to revive the RCU machinery by forcing a context switch. - * - * A context switch would normally allow the RCU state machine to make - * progress and it could be we're stuck in kernel space without context - * switches for an entirely unreasonable amount of time. - */ - set_tsk_need_resched(current); - set_preempt_need_resched(); -} - -static void check_cpu_stall(struct rcu_data *rdp) -{ - unsigned long gs1; - unsigned long gs2; - unsigned long gps; - unsigned long j; - unsigned long jn; - unsigned long js; - struct rcu_node *rnp; - - if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || - !rcu_gp_in_progress()) - return; - rcu_stall_kick_kthreads(); - j = jiffies; - - /* - * Lots of memory barriers to reject false positives. - * - * The idea is to pick up rcu_state.gp_seq, then - * rcu_state.jiffies_stall, then rcu_state.gp_start, and finally - * another copy of rcu_state.gp_seq. These values are updated in - * the opposite order with memory barriers (or equivalent) during - * grace-period initialization and cleanup. Now, a false positive - * can occur if we get an new value of rcu_state.gp_start and a old - * value of rcu_state.jiffies_stall. But given the memory barriers, - * the only way that this can happen is if one grace period ends - * and another starts between these two fetches. This is detected - * by comparing the second fetch of rcu_state.gp_seq with the - * previous fetch from rcu_state.gp_seq. - * - * Given this check, comparisons of jiffies, rcu_state.jiffies_stall, - * and rcu_state.gp_start suffice to forestall false positives. - */ - gs1 = READ_ONCE(rcu_state.gp_seq); - smp_rmb(); /* Pick up ->gp_seq first... */ - js = READ_ONCE(rcu_state.jiffies_stall); - smp_rmb(); /* ...then ->jiffies_stall before the rest... */ - gps = READ_ONCE(rcu_state.gp_start); - smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */ - gs2 = READ_ONCE(rcu_state.gp_seq); - if (gs1 != gs2 || - ULONG_CMP_LT(j, js) || - ULONG_CMP_GE(gps, js)) - return; /* No stall or GP completed since entering function. */ - rnp = rdp->mynode; - jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; - if (rcu_gp_in_progress() && - (READ_ONCE(rnp->qsmask) & rdp->grpmask) && - cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { - - /* We haven't checked in, so go dump stack. */ - print_cpu_stall(); - - } else if (rcu_gp_in_progress() && - ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && - cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { - - /* They had a few time units to dump stack, so complain. */ - print_other_cpu_stall(gs2); - } -} - -/** - * rcu_cpu_stall_reset - prevent further stall warnings in current grace period - * - * Set the stall-warning timeout way off into the future, thus preventing - * any RCU CPU stall-warning messages from appearing in the current set of - * RCU grace periods. - * - * The caller must disable hard irqs. - */ -void rcu_cpu_stall_reset(void) -{ - WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2); -} - /* Trace-event wrapper function for trace_rcu_future_grace_period. */ static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp, unsigned long gp_seq_req, const char *s) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index bb4f995f2d3f..3c4e26fff806 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -393,15 +393,13 @@ static const char *tp_rcu_varname __used __tracepoint_string = rcu_name; int rcu_dynticks_snap(struct rcu_data *rdp); -/* Forward declarations for rcutree_plugin.h */ +/* Forward declarations for tree_plugin.h */ static void rcu_bootup_announce(void); static void rcu_qs(void); static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); #ifdef CONFIG_HOTPLUG_CPU static bool rcu_preempt_has_tasks(struct rcu_node *rnp); #endif /* #ifdef CONFIG_HOTPLUG_CPU */ -static void rcu_print_detail_task_stall(void); -static int rcu_print_task_stall(struct rcu_node *rnp); static int rcu_print_task_exp_stall(struct rcu_node *rnp); static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp); static void rcu_flavor_sched_clock_irq(int user); @@ -445,3 +443,9 @@ static void rcu_bind_gp_kthread(void); static bool rcu_nohz_full_cpu(void); static void rcu_dynticks_task_enter(void); static void rcu_dynticks_task_exit(void); + +/* Forward declarations for tree_stall.h */ +static void rcu_print_detail_task_stall(void); +static int rcu_print_task_stall(struct rcu_node *rnp); +static void record_gp_stall_check_time(void); +static void check_cpu_stall(struct rcu_data *rdp); diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 6f5f94944f49..e0e73f493363 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -8,6 +8,9 @@ */ +/* panic() on RCU Stall sysctl. */ +int sysctl_panic_on_rcu_stall __read_mostly; + #ifdef CONFIG_PROVE_RCU #define RCU_STALL_DELAY_DELTA (5 * HZ) #else @@ -156,3 +159,292 @@ static int rcu_print_task_stall(struct rcu_node *rnp) return 0; } #endif /* #else #ifdef CONFIG_PREEMPT */ + +static void record_gp_stall_check_time(void) +{ + unsigned long j = jiffies; + unsigned long j1; + + rcu_state.gp_start = j; + j1 = rcu_jiffies_till_stall_check(); + /* Record ->gp_start before ->jiffies_stall. */ + smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */ + rcu_state.jiffies_resched = j + j1 / 2; + rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); +} + +/* + * Complain about starvation of grace-period kthread. + */ +static void rcu_check_gp_kthread_starvation(void) +{ + struct task_struct *gpk = rcu_state.gp_kthread; + unsigned long j; + + j = jiffies - READ_ONCE(rcu_state.gp_activity); + if (j > 2 * HZ) { + pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n", + rcu_state.name, j, + (long)rcu_seq_current(&rcu_state.gp_seq), + READ_ONCE(rcu_state.gp_flags), + gp_state_getname(rcu_state.gp_state), rcu_state.gp_state, + gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1); + if (gpk) { + pr_err("RCU grace-period kthread stack dump:\n"); + sched_show_task(gpk); + wake_up_process(gpk); + } + } +} + +/* + * Dump stacks of all tasks running on stalled CPUs. First try using + * NMIs, but fall back to manual remote stack tracing on architectures + * that don't support NMI-based stack dumps. The NMI-triggered stack + * traces are more accurate because they are printed by the target CPU. + */ +static void rcu_dump_cpu_stacks(void) +{ + int cpu; + unsigned long flags; + struct rcu_node *rnp; + + rcu_for_each_leaf_node(rnp) { + raw_spin_lock_irqsave_rcu_node(rnp, flags); + for_each_leaf_node_possible_cpu(rnp, cpu) + if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) + if (!trigger_single_cpu_backtrace(cpu)) + dump_cpu_task(cpu); + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + } +} + +/* + * If too much time has passed in the current grace period, and if + * so configured, go kick the relevant kthreads. + */ +static void rcu_stall_kick_kthreads(void) +{ + unsigned long j; + + if (!rcu_kick_kthreads) + return; + j = READ_ONCE(rcu_state.jiffies_kick_kthreads); + if (time_after(jiffies, j) && rcu_state.gp_kthread && + (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) { + WARN_ONCE(1, "Kicking %s grace-period kthread\n", + rcu_state.name); + rcu_ftrace_dump(DUMP_ALL); + wake_up_process(rcu_state.gp_kthread); + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ); + } +} + +static void panic_on_rcu_stall(void) +{ + if (sysctl_panic_on_rcu_stall) + panic("RCU Stall\n"); +} + +static void print_other_cpu_stall(unsigned long gp_seq) +{ + int cpu; + unsigned long flags; + unsigned long gpa; + unsigned long j; + int ndetected = 0; + struct rcu_node *rnp = rcu_get_root(); + long totqlen = 0; + + /* Kick and suppress, if so configured. */ + rcu_stall_kick_kthreads(); + if (rcu_cpu_stall_suppress) + return; + + /* + * OK, time to rat on our buddy... + * See Documentation/RCU/stallwarn.txt for info on how to debug + * RCU CPU stall warnings. + */ + pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name); + print_cpu_stall_info_begin(); + rcu_for_each_leaf_node(rnp) { + raw_spin_lock_irqsave_rcu_node(rnp, flags); + ndetected += rcu_print_task_stall(rnp); + if (rnp->qsmask != 0) { + for_each_leaf_node_possible_cpu(rnp, cpu) + if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { + print_cpu_stall_info(cpu); + ndetected++; + } + } + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + } + + print_cpu_stall_info_end(); + for_each_possible_cpu(cpu) + totqlen += rcu_get_n_cbs_cpu(cpu); + pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", + smp_processor_id(), (long)(jiffies - rcu_state.gp_start), + (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); + if (ndetected) { + rcu_dump_cpu_stacks(); + + /* Complain about tasks blocking the grace period. */ + rcu_print_detail_task_stall(); + } else { + if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) { + pr_err("INFO: Stall ended before state dump start\n"); + } else { + j = jiffies; + gpa = READ_ONCE(rcu_state.gp_activity); + pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n", + rcu_state.name, j - gpa, j, gpa, + READ_ONCE(jiffies_till_next_fqs), + rcu_get_root()->qsmask); + /* In this case, the current CPU might be at fault. */ + sched_show_task(current); + } + } + /* Rewrite if needed in case of slow consoles. */ + if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) + WRITE_ONCE(rcu_state.jiffies_stall, + jiffies + 3 * rcu_jiffies_till_stall_check() + 3); + + rcu_check_gp_kthread_starvation(); + + panic_on_rcu_stall(); + + rcu_force_quiescent_state(); /* Kick them all. */ +} + +static void print_cpu_stall(void) +{ + int cpu; + unsigned long flags; + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); + struct rcu_node *rnp = rcu_get_root(); + long totqlen = 0; + + /* Kick and suppress, if so configured. */ + rcu_stall_kick_kthreads(); + if (rcu_cpu_stall_suppress) + return; + + /* + * OK, time to rat on ourselves... + * See Documentation/RCU/stallwarn.txt for info on how to debug + * RCU CPU stall warnings. + */ + pr_err("INFO: %s self-detected stall on CPU", rcu_state.name); + print_cpu_stall_info_begin(); + raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags); + print_cpu_stall_info(smp_processor_id()); + raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags); + print_cpu_stall_info_end(); + for_each_possible_cpu(cpu) + totqlen += rcu_get_n_cbs_cpu(cpu); + pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n", + jiffies - rcu_state.gp_start, + (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); + + rcu_check_gp_kthread_starvation(); + + rcu_dump_cpu_stacks(); + + raw_spin_lock_irqsave_rcu_node(rnp, flags); + /* Rewrite if needed in case of slow consoles. */ + if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) + WRITE_ONCE(rcu_state.jiffies_stall, + jiffies + 3 * rcu_jiffies_till_stall_check() + 3); + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + + panic_on_rcu_stall(); + + /* + * Attempt to revive the RCU machinery by forcing a context switch. + * + * A context switch would normally allow the RCU state machine to make + * progress and it could be we're stuck in kernel space without context + * switches for an entirely unreasonable amount of time. + */ + set_tsk_need_resched(current); + set_preempt_need_resched(); +} + +static void check_cpu_stall(struct rcu_data *rdp) +{ + unsigned long gs1; + unsigned long gs2; + unsigned long gps; + unsigned long j; + unsigned long jn; + unsigned long js; + struct rcu_node *rnp; + + if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || + !rcu_gp_in_progress()) + return; + rcu_stall_kick_kthreads(); + j = jiffies; + + /* + * Lots of memory barriers to reject false positives. + * + * The idea is to pick up rcu_state.gp_seq, then + * rcu_state.jiffies_stall, then rcu_state.gp_start, and finally + * another copy of rcu_state.gp_seq. These values are updated in + * the opposite order with memory barriers (or equivalent) during + * grace-period initialization and cleanup. Now, a false positive + * can occur if we get an new value of rcu_state.gp_start and a old + * value of rcu_state.jiffies_stall. But given the memory barriers, + * the only way that this can happen is if one grace period ends + * and another starts between these two fetches. This is detected + * by comparing the second fetch of rcu_state.gp_seq with the + * previous fetch from rcu_state.gp_seq. + * + * Given this check, comparisons of jiffies, rcu_state.jiffies_stall, + * and rcu_state.gp_start suffice to forestall false positives. + */ + gs1 = READ_ONCE(rcu_state.gp_seq); + smp_rmb(); /* Pick up ->gp_seq first... */ + js = READ_ONCE(rcu_state.jiffies_stall); + smp_rmb(); /* ...then ->jiffies_stall before the rest... */ + gps = READ_ONCE(rcu_state.gp_start); + smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */ + gs2 = READ_ONCE(rcu_state.gp_seq); + if (gs1 != gs2 || + ULONG_CMP_LT(j, js) || + ULONG_CMP_GE(gps, js)) + return; /* No stall or GP completed since entering function. */ + rnp = rdp->mynode; + jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; + if (rcu_gp_in_progress() && + (READ_ONCE(rnp->qsmask) & rdp->grpmask) && + cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { + + /* We haven't checked in, so go dump stack. */ + print_cpu_stall(); + + } else if (rcu_gp_in_progress() && + ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && + cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { + + /* They had a few time units to dump stack, so complain. */ + print_other_cpu_stall(gs2); + } +} + +/** + * rcu_cpu_stall_reset - prevent further stall warnings in current grace period + * + * Set the stall-warning timeout way off into the future, thus preventing + * any RCU CPU stall-warning messages from appearing in the current set of + * RCU grace periods. + * + * The caller must disable hard irqs. + */ +void rcu_cpu_stall_reset(void) +{ + WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2); +} -- cgit v1.2.3-59-g8ed1b From 21d0d79ab051bf9facb9960a30e58b93a31c75a5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 20:36:45 -0800 Subject: rcu: Inline RCU task stall-warning helper functions The rcu_print_detail_task_stall(), rcu_print_task_stall_begin(), and rcu_print_task_stall_end() functions were defined to allow long-gone Kconfig options to provide an abbreviated RCU CPU stall warning printout. This commit saves a few lines of code by inlining them into their sole callers. While in the area, a useless call of rcu_print_detail_task_stall_rnp() on the root rcu_node structure was eliminated. If there is only one rcu_node structure, its tasks get printed twice, but if there are more, the root rcu_node structure is guaranteed to have an empty list of blocked tasks, hence the uselessness. (Long ago, root rcu_node structures with non-empty ->blkd_tasks lists could happen, but no longer.) Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 1 - kernel/rcu/tree_stall.h | 36 +++++++----------------------------- 2 files changed, 7 insertions(+), 30 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 3c4e26fff806..c6df9a13dd06 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -445,7 +445,6 @@ static void rcu_dynticks_task_enter(void); static void rcu_dynticks_task_exit(void); /* Forward declarations for tree_stall.h */ -static void rcu_print_detail_task_stall(void); static int rcu_print_task_stall(struct rcu_node *rnp); static void record_gp_stall_check_time(void); static void check_cpu_stall(struct rcu_data *rdp); diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index e0e73f493363..b476786b8ef7 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -94,30 +94,6 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp) raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } -/* - * Dump detailed information for all tasks blocking the current RCU - * grace period. - */ -static void rcu_print_detail_task_stall(void) -{ - struct rcu_node *rnp = rcu_get_root(); - - rcu_print_detail_task_stall_rnp(rnp); - rcu_for_each_leaf_node(rnp) - rcu_print_detail_task_stall_rnp(rnp); -} - -static void rcu_print_task_stall_begin(struct rcu_node *rnp) -{ - pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):", - rnp->level, rnp->grplo, rnp->grphi); -} - -static void rcu_print_task_stall_end(void) -{ - pr_cont("\n"); -} - /* * Scan the current list of tasks blocked within RCU read-side critical * sections, printing out the tid of each. @@ -129,14 +105,15 @@ static int rcu_print_task_stall(struct rcu_node *rnp) if (!rcu_preempt_blocked_readers_cgp(rnp)) return 0; - rcu_print_task_stall_begin(rnp); + pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):", + rnp->level, rnp->grplo, rnp->grphi); t = list_entry(rnp->gp_tasks->prev, struct task_struct, rcu_node_entry); list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { pr_cont(" P%d", t->pid); ndetected++; } - rcu_print_task_stall_end(); + pr_cont("\n"); return ndetected; } @@ -146,7 +123,7 @@ static int rcu_print_task_stall(struct rcu_node *rnp) * Because preemptible RCU does not exist, we never have to check for * tasks blocked within RCU read-side critical sections. */ -static void rcu_print_detail_task_stall(void) +static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp) { } @@ -253,7 +230,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) unsigned long gpa; unsigned long j; int ndetected = 0; - struct rcu_node *rnp = rcu_get_root(); + struct rcu_node *rnp; long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -291,7 +268,8 @@ static void print_other_cpu_stall(unsigned long gp_seq) rcu_dump_cpu_stacks(); /* Complain about tasks blocking the grace period. */ - rcu_print_detail_task_stall(); + rcu_for_each_leaf_node(rnp) + rcu_print_detail_task_stall_rnp(rnp); } else { if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) { pr_err("INFO: Stall ended before state dump start\n"); -- cgit v1.2.3-59-g8ed1b From d87cda5094585b7a0f62075de68266cb9c1b35ca Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 20:51:49 -0800 Subject: rcu: Move rcu_print_task_exp_stall() to tree_exp.h Because expedited CPU stall warnings are contained within the kernel/rcu/tree_exp.h file, rcu_print_task_exp_stall() should live there too. This commit carries out the required code motion. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_exp.h | 32 ++++++++++++++++++++++++++++++++ kernel/rcu/tree_plugin.h | 31 ------------------------------- 2 files changed, 32 insertions(+), 31 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 4c2a0189e748..7be3e085ddd6 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -10,6 +10,7 @@ #include static void rcu_exp_handler(void *unused); +static int rcu_print_task_exp_stall(struct rcu_node *rnp); /* * Record the start of an expedited grace period. @@ -670,6 +671,27 @@ static void sync_sched_exp_online_cleanup(int cpu) { } +/* + * Scan the current list of tasks blocked within RCU read-side critical + * sections, printing out the tid of each that is blocking the current + * expedited grace period. + */ +static int rcu_print_task_exp_stall(struct rcu_node *rnp) +{ + struct task_struct *t; + int ndetected = 0; + + if (!rnp->exp_tasks) + return 0; + t = list_entry(rnp->exp_tasks->prev, + struct task_struct, rcu_node_entry); + list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { + pr_cont(" P%d", t->pid); + ndetected++; + } + return ndetected; +} + #else /* #ifdef CONFIG_PREEMPT_RCU */ /* Invoked on each online non-idle CPU for expedited quiescent state. */ @@ -709,6 +731,16 @@ static void sync_sched_exp_online_cleanup(int cpu) WARN_ON_ONCE(ret); } +/* + * Because preemptible RCU does not exist, we never have to check for + * tasks blocked within RCU read-side critical sections that are + * blocking the current expedited grace period. + */ +static int rcu_print_task_exp_stall(struct rcu_node *rnp) +{ + return 0; +} + #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ /** diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7fa3bc4d481b..72519c57f656 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -642,27 +642,6 @@ static void rcu_read_unlock_special(struct task_struct *t) rcu_preempt_deferred_qs_irqrestore(t, flags); } -/* - * Scan the current list of tasks blocked within RCU read-side critical - * sections, printing out the tid of each that is blocking the current - * expedited grace period. - */ -static int rcu_print_task_exp_stall(struct rcu_node *rnp) -{ - struct task_struct *t; - int ndetected = 0; - - if (!rnp->exp_tasks) - return 0; - t = list_entry(rnp->exp_tasks->prev, - struct task_struct, rcu_node_entry); - list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) { - pr_cont(" P%d", t->pid); - ndetected++; - } - return ndetected; -} - /* * Check that the list of blocked tasks for the newly completed grace * period is in fact empty. It is a serious bug to complete a grace @@ -906,16 +885,6 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t) } static void rcu_preempt_deferred_qs(struct task_struct *t) { } -/* - * Because preemptible RCU does not exist, we never have to check for - * tasks blocked within RCU read-side critical sections that are - * blocking the current expedited grace period. - */ -static int rcu_print_task_exp_stall(struct rcu_node *rnp) -{ - return 0; -} - /* * Because there is no preemptible RCU, there can be no readers blocked, * so there is no need to check for blocked tasks. So check only for -- cgit v1.2.3-59-g8ed1b From 40e69ac7d0c5a19ea14656bc3131c55719baec96 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 20:58:58 -0800 Subject: rcu: Inline RCU stall-warning info helper functions The print_cpu_stall_info_begin() and print_cpu_stall_info_end() print a single character each onto the console, and are a holdover from a time when RCU CPU stall warning messages could be abbreviated using a long-gone Kconfig option. This commit therefore adds these single characters to already-printed strings in the calling functions, and then eliminates both print_cpu_stall_info_begin() and print_cpu_stall_info_end(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 2 -- kernel/rcu/tree_plugin.h | 12 ------------ kernel/rcu/tree_stall.h | 12 ++++-------- 3 files changed, 4 insertions(+), 22 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index c6df9a13dd06..d73472af49e7 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -416,9 +416,7 @@ static void rcu_prepare_for_idle(void); static bool rcu_preempt_has_tasks(struct rcu_node *rnp); static bool rcu_preempt_need_deferred_qs(struct task_struct *t); static void rcu_preempt_deferred_qs(struct task_struct *t); -static void print_cpu_stall_info_begin(void); static void print_cpu_stall_info(int cpu); -static void print_cpu_stall_info_end(void); static void zero_cpu_stall_ticks(struct rcu_data *rdp); static bool rcu_nocb_cpu_needs_barrier(int cpu); static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 72519c57f656..2df5bb04fd7a 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1550,12 +1550,6 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu) #endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */ -/* Initiate the stall-info list. */ -static void print_cpu_stall_info_begin(void) -{ - pr_cont("\n"); -} - /* * Print out diagnostic information for the specified stalled CPU. * @@ -1606,12 +1600,6 @@ static void print_cpu_stall_info(int cpu) fast_no_hz); } -/* Terminate the stall-info list. */ -static void print_cpu_stall_info_end(void) -{ - pr_err("\t"); -} - /* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ static void zero_cpu_stall_ticks(struct rcu_data *rdp) { diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index b476786b8ef7..7ef3b596e45f 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -243,8 +243,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) * See Documentation/RCU/stallwarn.txt for info on how to debug * RCU CPU stall warnings. */ - pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name); - print_cpu_stall_info_begin(); + pr_err("INFO: %s detected stalls on CPUs/tasks:\n", rcu_state.name); rcu_for_each_leaf_node(rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); ndetected += rcu_print_task_stall(rnp); @@ -258,10 +257,9 @@ static void print_other_cpu_stall(unsigned long gp_seq) raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } - print_cpu_stall_info_end(); for_each_possible_cpu(cpu) totqlen += rcu_get_n_cbs_cpu(cpu); - pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", + pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", smp_processor_id(), (long)(jiffies - rcu_state.gp_start), (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); if (ndetected) { @@ -314,15 +312,13 @@ static void print_cpu_stall(void) * See Documentation/RCU/stallwarn.txt for info on how to debug * RCU CPU stall warnings. */ - pr_err("INFO: %s self-detected stall on CPU", rcu_state.name); - print_cpu_stall_info_begin(); + pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name); raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags); print_cpu_stall_info(smp_processor_id()); raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags); - print_cpu_stall_info_end(); for_each_possible_cpu(cpu) totqlen += rcu_get_n_cbs_cpu(cpu); - pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n", + pr_cont("\t(t=%lu jiffies g=%ld q=%lu)\n", jiffies - rcu_state.gp_start, (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); -- cgit v1.2.3-59-g8ed1b From 59b73a27681c5841440391f970a9a085228ba975 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 11 Jan 2019 21:05:17 -0800 Subject: rcu: Move FAST_NO_HZ stall-warning code to tree_stall.h This commit further consolidates the stall-warning code by moving print_cpu_stall_info() and its helper functions along with zero_cpu_stall_ticks() to kernel/rcu/tree_stall.h. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 1 - kernel/rcu/tree_plugin.h | 80 ------------------------------------------------ kernel/rcu/tree_stall.h | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 81 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index d73472af49e7..49bf3b00bb50 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -416,7 +416,6 @@ static void rcu_prepare_for_idle(void); static bool rcu_preempt_has_tasks(struct rcu_node *rnp); static bool rcu_preempt_need_deferred_qs(struct task_struct *t); static void rcu_preempt_deferred_qs(struct task_struct *t); -static void print_cpu_stall_info(int cpu); static void zero_cpu_stall_ticks(struct rcu_data *rdp); static bool rcu_nocb_cpu_needs_barrier(int cpu); static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 2df5bb04fd7a..a1f9d7c15bd8 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1528,86 +1528,6 @@ static void rcu_cleanup_after_idle(void) #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ -#ifdef CONFIG_RCU_FAST_NO_HZ - -static void print_cpu_stall_fast_no_hz(char *cp, int cpu) -{ - struct rcu_data *rdp = &per_cpu(rcu_data, cpu); - - sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c", - rdp->last_accelerate & 0xffff, jiffies & 0xffff, - ".l"[rdp->all_lazy], - ".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)], - ".D"[!rdp->tick_nohz_enabled_snap]); -} - -#else /* #ifdef CONFIG_RCU_FAST_NO_HZ */ - -static void print_cpu_stall_fast_no_hz(char *cp, int cpu) -{ - *cp = '\0'; -} - -#endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */ - -/* - * Print out diagnostic information for the specified stalled CPU. - * - * If the specified CPU is aware of the current RCU grace period, then - * print the number of scheduling clock interrupts the CPU has taken - * during the time that it has been aware. Otherwise, print the number - * of RCU grace periods that this CPU is ignorant of, for example, "1" - * if the CPU was aware of the previous grace period. - * - * Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info. - */ -static void print_cpu_stall_info(int cpu) -{ - unsigned long delta; - char fast_no_hz[72]; - struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); - char *ticks_title; - unsigned long ticks_value; - - /* - * We could be printing a lot while holding a spinlock. Avoid - * triggering hard lockup. - */ - touch_nmi_watchdog(); - - ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq); - if (ticks_value) { - ticks_title = "GPs behind"; - } else { - ticks_title = "ticks this GP"; - ticks_value = rdp->ticks_this_gp; - } - print_cpu_stall_fast_no_hz(fast_no_hz, cpu); - delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq); - pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n", - cpu, - "O."[!!cpu_online(cpu)], - "o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)], - "N."[!!(rdp->grpmask & rdp->mynode->qsmaskinitnext)], - !IS_ENABLED(CONFIG_IRQ_WORK) ? '?' : - rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' : - "!."[!delta], - ticks_value, ticks_title, - rcu_dynticks_snap(rdp) & 0xfff, - rdp->dynticks_nesting, rdp->dynticks_nmi_nesting, - rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), - READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, - fast_no_hz); -} - -/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ -static void zero_cpu_stall_ticks(struct rcu_data *rdp) -{ - rdp->ticks_this_gp = 0; - rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); - WRITE_ONCE(rdp->last_fqs_resched, jiffies); -} - #ifdef CONFIG_RCU_NOCB_CPU /* diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 7ef3b596e45f..19b915380a6f 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -223,6 +223,86 @@ static void panic_on_rcu_stall(void) panic("RCU Stall\n"); } +#ifdef CONFIG_RCU_FAST_NO_HZ + +static void print_cpu_stall_fast_no_hz(char *cp, int cpu) +{ + struct rcu_data *rdp = &per_cpu(rcu_data, cpu); + + sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c", + rdp->last_accelerate & 0xffff, jiffies & 0xffff, + ".l"[rdp->all_lazy], + ".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)], + ".D"[!rdp->tick_nohz_enabled_snap]); +} + +#else /* #ifdef CONFIG_RCU_FAST_NO_HZ */ + +static void print_cpu_stall_fast_no_hz(char *cp, int cpu) +{ + *cp = '\0'; +} + +#endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */ + +/* + * Print out diagnostic information for the specified stalled CPU. + * + * If the specified CPU is aware of the current RCU grace period, then + * print the number of scheduling clock interrupts the CPU has taken + * during the time that it has been aware. Otherwise, print the number + * of RCU grace periods that this CPU is ignorant of, for example, "1" + * if the CPU was aware of the previous grace period. + * + * Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info. + */ +static void print_cpu_stall_info(int cpu) +{ + unsigned long delta; + char fast_no_hz[72]; + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); + char *ticks_title; + unsigned long ticks_value; + + /* + * We could be printing a lot while holding a spinlock. Avoid + * triggering hard lockup. + */ + touch_nmi_watchdog(); + + ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq); + if (ticks_value) { + ticks_title = "GPs behind"; + } else { + ticks_title = "ticks this GP"; + ticks_value = rdp->ticks_this_gp; + } + print_cpu_stall_fast_no_hz(fast_no_hz, cpu); + delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq); + pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n", + cpu, + "O."[!!cpu_online(cpu)], + "o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)], + "N."[!!(rdp->grpmask & rdp->mynode->qsmaskinitnext)], + !IS_ENABLED(CONFIG_IRQ_WORK) ? '?' : + rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' : + "!."[!delta], + ticks_value, ticks_title, + rcu_dynticks_snap(rdp) & 0xfff, + rdp->dynticks_nesting, rdp->dynticks_nmi_nesting, + rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), + READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, + fast_no_hz); +} + +/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ +static void zero_cpu_stall_ticks(struct rcu_data *rdp) +{ + rdp->ticks_this_gp = 0; + rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); + WRITE_ONCE(rdp->last_fqs_resched, jiffies); +} + static void print_other_cpu_stall(unsigned long gp_seq) { int cpu; -- cgit v1.2.3-59-g8ed1b From e23344c2ca42d0083596bb39964675bef00ad691 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 12 Jan 2019 09:35:44 -0800 Subject: rcu: Organize functions in tree_stall.h This commit does only code movement, removal of now-unneeded forward declarations, and addition of comments. It organizes the functions that implement RCU CPU stall warnings for normal grace periods into three categories: 1. Control of RCU CPU stall warnings, including computing timeouts. 2. Interaction of stall warnings with grace periods. 3. Actual printing of the RCU CPU stall-warning messages. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 1 - kernel/rcu/tree_stall.h | 180 ++++++++++++++++++++++++++---------------------- 2 files changed, 97 insertions(+), 84 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 49bf3b00bb50..099410dbcbe9 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -442,6 +442,5 @@ static void rcu_dynticks_task_enter(void); static void rcu_dynticks_task_exit(void); /* Forward declarations for tree_stall.h */ -static int rcu_print_task_stall(struct rcu_node *rnp); static void record_gp_stall_check_time(void); static void check_cpu_stall(struct rcu_data *rdp); diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 19b915380a6f..03ed47883d8a 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -7,6 +7,9 @@ * Author: Paul E. McKenney */ +////////////////////////////////////////////////////////////////////////////// +// +// Controlling CPU stall warnings, including delay calculation. /* panic() on RCU Stall sysctl. */ int sysctl_panic_on_rcu_stall __read_mostly; @@ -17,6 +20,7 @@ int sysctl_panic_on_rcu_stall __read_mostly; #define RCU_STALL_DELAY_DELTA 0 #endif +/* Limit-check stall timeouts specified at boottime and runtime. */ int rcu_jiffies_till_stall_check(void) { int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout); @@ -36,6 +40,7 @@ int rcu_jiffies_till_stall_check(void) } EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check); +/* Don't do RCU CPU stall warnings during long sysrq printouts. */ void rcu_sysrq_start(void) { if (!rcu_cpu_stall_suppress) @@ -48,6 +53,7 @@ void rcu_sysrq_end(void) rcu_cpu_stall_suppress = 0; } +/* Don't print RCU CPU stall warnings during a kernel panic. */ static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr) { rcu_cpu_stall_suppress = 1; @@ -65,6 +71,78 @@ static int __init check_cpu_stall_init(void) } early_initcall(check_cpu_stall_init); +/* If so specified via sysctl, panic, yielding cleaner stall-warning output. */ +static void panic_on_rcu_stall(void) +{ + if (sysctl_panic_on_rcu_stall) + panic("RCU Stall\n"); +} + +/** + * rcu_cpu_stall_reset - prevent further stall warnings in current grace period + * + * Set the stall-warning timeout way off into the future, thus preventing + * any RCU CPU stall-warning messages from appearing in the current set of + * RCU grace periods. + * + * The caller must disable hard irqs. + */ +void rcu_cpu_stall_reset(void) +{ + WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2); +} + +////////////////////////////////////////////////////////////////////////////// +// +// Interaction with RCU grace periods + +/* Start of new grace period, so record stall time (and forcing times). */ +static void record_gp_stall_check_time(void) +{ + unsigned long j = jiffies; + unsigned long j1; + + rcu_state.gp_start = j; + j1 = rcu_jiffies_till_stall_check(); + /* Record ->gp_start before ->jiffies_stall. */ + smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */ + rcu_state.jiffies_resched = j + j1 / 2; + rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); +} + +/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ +static void zero_cpu_stall_ticks(struct rcu_data *rdp) +{ + rdp->ticks_this_gp = 0; + rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); + WRITE_ONCE(rdp->last_fqs_resched, jiffies); +} + +/* + * If too much time has passed in the current grace period, and if + * so configured, go kick the relevant kthreads. + */ +static void rcu_stall_kick_kthreads(void) +{ + unsigned long j; + + if (!rcu_kick_kthreads) + return; + j = READ_ONCE(rcu_state.jiffies_kick_kthreads); + if (time_after(jiffies, j) && rcu_state.gp_kthread && + (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) { + WARN_ONCE(1, "Kicking %s grace-period kthread\n", + rcu_state.name); + rcu_ftrace_dump(DUMP_ALL); + wake_up_process(rcu_state.gp_kthread); + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ); + } +} + +////////////////////////////////////////////////////////////////////////////// +// +// Printing RCU CPU stall warnings + #ifdef CONFIG_PREEMPT /* @@ -137,43 +215,6 @@ static int rcu_print_task_stall(struct rcu_node *rnp) } #endif /* #else #ifdef CONFIG_PREEMPT */ -static void record_gp_stall_check_time(void) -{ - unsigned long j = jiffies; - unsigned long j1; - - rcu_state.gp_start = j; - j1 = rcu_jiffies_till_stall_check(); - /* Record ->gp_start before ->jiffies_stall. */ - smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */ - rcu_state.jiffies_resched = j + j1 / 2; - rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); -} - -/* - * Complain about starvation of grace-period kthread. - */ -static void rcu_check_gp_kthread_starvation(void) -{ - struct task_struct *gpk = rcu_state.gp_kthread; - unsigned long j; - - j = jiffies - READ_ONCE(rcu_state.gp_activity); - if (j > 2 * HZ) { - pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n", - rcu_state.name, j, - (long)rcu_seq_current(&rcu_state.gp_seq), - READ_ONCE(rcu_state.gp_flags), - gp_state_getname(rcu_state.gp_state), rcu_state.gp_state, - gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1); - if (gpk) { - pr_err("RCU grace-period kthread stack dump:\n"); - sched_show_task(gpk); - wake_up_process(gpk); - } - } -} - /* * Dump stacks of all tasks running on stalled CPUs. First try using * NMIs, but fall back to manual remote stack tracing on architectures @@ -196,33 +237,6 @@ static void rcu_dump_cpu_stacks(void) } } -/* - * If too much time has passed in the current grace period, and if - * so configured, go kick the relevant kthreads. - */ -static void rcu_stall_kick_kthreads(void) -{ - unsigned long j; - - if (!rcu_kick_kthreads) - return; - j = READ_ONCE(rcu_state.jiffies_kick_kthreads); - if (time_after(jiffies, j) && rcu_state.gp_kthread && - (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) { - WARN_ONCE(1, "Kicking %s grace-period kthread\n", - rcu_state.name); - rcu_ftrace_dump(DUMP_ALL); - wake_up_process(rcu_state.gp_kthread); - WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ); - } -} - -static void panic_on_rcu_stall(void) -{ - if (sysctl_panic_on_rcu_stall) - panic("RCU Stall\n"); -} - #ifdef CONFIG_RCU_FAST_NO_HZ static void print_cpu_stall_fast_no_hz(char *cp, int cpu) @@ -295,12 +309,26 @@ static void print_cpu_stall_info(int cpu) fast_no_hz); } -/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ -static void zero_cpu_stall_ticks(struct rcu_data *rdp) +/* Complain about starvation of grace-period kthread. */ +static void rcu_check_gp_kthread_starvation(void) { - rdp->ticks_this_gp = 0; - rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); - WRITE_ONCE(rdp->last_fqs_resched, jiffies); + struct task_struct *gpk = rcu_state.gp_kthread; + unsigned long j; + + j = jiffies - READ_ONCE(rcu_state.gp_activity); + if (j > 2 * HZ) { + pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n", + rcu_state.name, j, + (long)rcu_seq_current(&rcu_state.gp_seq), + READ_ONCE(rcu_state.gp_flags), + gp_state_getname(rcu_state.gp_state), rcu_state.gp_state, + gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1); + if (gpk) { + pr_err("RCU grace-period kthread stack dump:\n"); + sched_show_task(gpk); + wake_up_process(gpk); + } + } } static void print_other_cpu_stall(unsigned long gp_seq) @@ -488,17 +516,3 @@ static void check_cpu_stall(struct rcu_data *rdp) print_other_cpu_stall(gs2); } } - -/** - * rcu_cpu_stall_reset - prevent further stall warnings in current grace period - * - * Set the stall-warning timeout way off into the future, thus preventing - * any RCU CPU stall-warning messages from appearing in the current set of - * RCU grace periods. - * - * The caller must disable hard irqs. - */ -void rcu_cpu_stall_reset(void) -{ - WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2); -} -- cgit v1.2.3-59-g8ed1b From 7ac1907c9e7ba5f80d3c298fe9d2dbf620566a49 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 14 Jan 2019 10:19:20 -0800 Subject: rcu: Move irq-disabled stall-warning checking to tree_stall.h The rcu_iw_handler() function's sole purpose in life is to indicate whether a stalled CPU had interrupts disabled, so it belongs in kernel/rcu/tree_stall.h. This commit therefore makes that move, clarifying its header comment while in the area. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 21 --------------------- kernel/rcu/tree.h | 1 + kernel/rcu/tree_stall.h | 20 ++++++++++++++++++++ 3 files changed, 21 insertions(+), 21 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 001dd05f6e38..929531ed168c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1031,27 +1031,6 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp) return 0; } -/* - * Handler for the irq_work request posted when a grace period has - * gone on for too long, but not yet long enough for an RCU CPU - * stall warning. Set state appropriately, but just complain if - * there is unexpected state on entry. - */ -static void rcu_iw_handler(struct irq_work *iwp) -{ - struct rcu_data *rdp; - struct rcu_node *rnp; - - rdp = container_of(iwp, struct rcu_data, rcu_iw); - rnp = rdp->mynode; - raw_spin_lock_rcu_node(rnp); - if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) { - rdp->rcu_iw_gp_seq = rnp->gp_seq; - rdp->rcu_iw_pending = false; - } - raw_spin_unlock_rcu_node(rnp); -} - /* * Return true if the specified CPU has passed through a quiescent * state by virtue of being in or having passed through an dynticks diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 099410dbcbe9..f882ce3ca5a5 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -443,4 +443,5 @@ static void rcu_dynticks_task_exit(void); /* Forward declarations for tree_stall.h */ static void record_gp_stall_check_time(void); +static void rcu_iw_handler(struct irq_work *iwp); static void check_cpu_stall(struct rcu_data *rdp); diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 03ed47883d8a..526e223e41ce 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -139,6 +139,26 @@ static void rcu_stall_kick_kthreads(void) } } +/* + * Handler for the irq_work request posted about halfway into the RCU CPU + * stall timeout, and used to detect excessive irq disabling. Set state + * appropriately, but just complain if there is unexpected state on entry. + */ +static void rcu_iw_handler(struct irq_work *iwp) +{ + struct rcu_data *rdp; + struct rcu_node *rnp; + + rdp = container_of(iwp, struct rcu_data, rcu_iw); + rnp = rdp->mynode; + raw_spin_lock_rcu_node(rnp); + if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) { + rdp->rcu_iw_gp_seq = rnp->gp_seq; + rdp->rcu_iw_pending = false; + } + raw_spin_unlock_rcu_node(rnp); +} + ////////////////////////////////////////////////////////////////////////////// // // Printing RCU CPU stall warnings -- cgit v1.2.3-59-g8ed1b From b51bcbbf16ef0ea352e8b924dd8638112e4037a5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 15 Jan 2019 07:01:33 -0800 Subject: rcu: Move forward-progress checkers into tree_stall.h This commit further consolidates stall-warning functionality by moving forward-progress checkers into kernel/rcu/tree_stall.h, updating a comment or two while in the area. More specifically, this commit moves show_rcu_gp_kthreads(), rcu_check_gp_start_stall(), rcu_fwd_progress_check(), sysrq_rcu, sysrq_show_rcu(), sysrq_rcudump_op, and rcu_sysrq_init() from kernel/rcu/tree.c to kernel/rcu/tree_stall.h. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 166 ---------------------------------------------- kernel/rcu/tree.h | 2 + kernel/rcu/tree_stall.h | 171 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 173 insertions(+), 166 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 929531ed168c..4cb1ebd93b0c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -102,9 +102,6 @@ int rcu_num_lvls __read_mostly = RCU_NUM_LVLS; /* Number of rcu_nodes at specified level. */ int num_rcu_lvl[] = NUM_RCU_LVL_INIT; int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */ -/* Commandeer a sysrq key to dump RCU's tree. */ -static bool sysrq_rcu; -module_param(sysrq_rcu, bool, 0444); /* * The rcu_scheduler_active variable is initialized to the value @@ -510,74 +507,6 @@ static const char *gp_state_getname(short gs) return gp_state_names[gs]; } -/* - * Show the state of the grace-period kthreads. - */ -void show_rcu_gp_kthreads(void) -{ - int cpu; - unsigned long j; - unsigned long ja; - unsigned long jr; - unsigned long jw; - struct rcu_data *rdp; - struct rcu_node *rnp; - - j = jiffies; - ja = j - READ_ONCE(rcu_state.gp_activity); - jr = j - READ_ONCE(rcu_state.gp_req_activity); - jw = j - READ_ONCE(rcu_state.gp_wake_time); - pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n", - rcu_state.name, gp_state_getname(rcu_state.gp_state), - rcu_state.gp_state, - rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL, - ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq), - (long)READ_ONCE(rcu_state.gp_seq), - (long)READ_ONCE(rcu_get_root()->gp_seq_needed), - READ_ONCE(rcu_state.gp_flags)); - rcu_for_each_node_breadth_first(rnp) { - if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed)) - continue; - pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n", - rnp->grplo, rnp->grphi, (long)rnp->gp_seq, - (long)rnp->gp_seq_needed); - if (!rcu_is_leaf_node(rnp)) - continue; - for_each_leaf_node_possible_cpu(rnp, cpu) { - rdp = per_cpu_ptr(&rcu_data, cpu); - if (rdp->gpwrap || - ULONG_CMP_GE(rcu_state.gp_seq, - rdp->gp_seq_needed)) - continue; - pr_info("\tcpu %d ->gp_seq_needed %ld\n", - cpu, (long)rdp->gp_seq_needed); - } - } - /* sched_show_task(rcu_state.gp_kthread); */ -} -EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads); - -/* Dump grace-period-request information due to commandeered sysrq. */ -static void sysrq_show_rcu(int key) -{ - show_rcu_gp_kthreads(); -} - -static struct sysrq_key_op sysrq_rcudump_op = { - .handler = sysrq_show_rcu, - .help_msg = "show-rcu(y)", - .action_msg = "Show RCU tree", - .enable_mask = SYSRQ_ENABLE_DUMP, -}; - -static int __init rcu_sysrq_init(void) -{ - if (sysrq_rcu) - return register_sysrq_key('y', &sysrq_rcudump_op); - return 0; -} -early_initcall(rcu_sysrq_init); - /* * Send along grace-period-related data for rcutorture diagnostics. */ @@ -2323,101 +2252,6 @@ void rcu_force_quiescent_state(void) } EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); -/* - * This function checks for grace-period requests that fail to motivate - * RCU to come out of its idle mode. - */ -void -rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, - const unsigned long gpssdelay) -{ - unsigned long flags; - unsigned long j; - struct rcu_node *rnp_root = rcu_get_root(); - static atomic_t warned = ATOMIC_INIT(0); - - if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || - ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed)) - return; - j = jiffies; /* Expensive access, and in common case don't get here. */ - if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || - time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || - atomic_read(&warned)) - return; - - raw_spin_lock_irqsave_rcu_node(rnp, flags); - j = jiffies; - if (rcu_gp_in_progress() || - ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || - time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || - time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || - atomic_read(&warned)) { - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - return; - } - /* Hold onto the leaf lock to make others see warned==1. */ - - if (rnp_root != rnp) - raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */ - j = jiffies; - if (rcu_gp_in_progress() || - ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || - time_before(j, rcu_state.gp_req_activity + gpssdelay) || - time_before(j, rcu_state.gp_activity + gpssdelay) || - atomic_xchg(&warned, 1)) { - raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */ - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - return; - } - WARN_ON(1); - if (rnp_root != rnp) - raw_spin_unlock_rcu_node(rnp_root); - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - show_rcu_gp_kthreads(); -} - -/* - * Do a forward-progress check for rcutorture. This is normally invoked - * due to an OOM event. The argument "j" gives the time period during - * which rcutorture would like progress to have been made. - */ -void rcu_fwd_progress_check(unsigned long j) -{ - unsigned long cbs; - int cpu; - unsigned long max_cbs = 0; - int max_cpu = -1; - struct rcu_data *rdp; - - if (rcu_gp_in_progress()) { - pr_info("%s: GP age %lu jiffies\n", - __func__, jiffies - rcu_state.gp_start); - show_rcu_gp_kthreads(); - } else { - pr_info("%s: Last GP end %lu jiffies ago\n", - __func__, jiffies - rcu_state.gp_end); - preempt_disable(); - rdp = this_cpu_ptr(&rcu_data); - rcu_check_gp_start_stall(rdp->mynode, rdp, j); - preempt_enable(); - } - for_each_possible_cpu(cpu) { - cbs = rcu_get_n_cbs_cpu(cpu); - if (!cbs) - continue; - if (max_cpu < 0) - pr_info("%s: callbacks", __func__); - pr_cont(" %d: %lu", cpu, cbs); - if (cbs <= max_cbs) - continue; - max_cbs = cbs; - max_cpu = cpu; - } - if (max_cpu >= 0) - pr_cont("\n"); -} -EXPORT_SYMBOL_GPL(rcu_fwd_progress_check); - /* Perform RCU core processing work for the current CPU. */ static __latent_entropy void rcu_core(struct softirq_action *unused) { diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index f882ce3ca5a5..e253d11af3c4 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -445,3 +445,5 @@ static void rcu_dynticks_task_exit(void); static void record_gp_stall_check_time(void); static void rcu_iw_handler(struct irq_work *iwp); static void check_cpu_stall(struct rcu_data *rdp); +static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, + const unsigned long gpssdelay); diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 526e223e41ce..9e3db08d02bc 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -536,3 +536,174 @@ static void check_cpu_stall(struct rcu_data *rdp) print_other_cpu_stall(gs2); } } + +////////////////////////////////////////////////////////////////////////////// +// +// RCU forward-progress mechanisms, including of callback invocation. + + +/* + * Show the state of the grace-period kthreads. + */ +void show_rcu_gp_kthreads(void) +{ + int cpu; + unsigned long j; + unsigned long ja; + unsigned long jr; + unsigned long jw; + struct rcu_data *rdp; + struct rcu_node *rnp; + + j = jiffies; + ja = j - READ_ONCE(rcu_state.gp_activity); + jr = j - READ_ONCE(rcu_state.gp_req_activity); + jw = j - READ_ONCE(rcu_state.gp_wake_time); + pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n", + rcu_state.name, gp_state_getname(rcu_state.gp_state), + rcu_state.gp_state, + rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL, + ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq), + (long)READ_ONCE(rcu_state.gp_seq), + (long)READ_ONCE(rcu_get_root()->gp_seq_needed), + READ_ONCE(rcu_state.gp_flags)); + rcu_for_each_node_breadth_first(rnp) { + if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed)) + continue; + pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n", + rnp->grplo, rnp->grphi, (long)rnp->gp_seq, + (long)rnp->gp_seq_needed); + if (!rcu_is_leaf_node(rnp)) + continue; + for_each_leaf_node_possible_cpu(rnp, cpu) { + rdp = per_cpu_ptr(&rcu_data, cpu); + if (rdp->gpwrap || + ULONG_CMP_GE(rcu_state.gp_seq, + rdp->gp_seq_needed)) + continue; + pr_info("\tcpu %d ->gp_seq_needed %ld\n", + cpu, (long)rdp->gp_seq_needed); + } + } + /* sched_show_task(rcu_state.gp_kthread); */ +} +EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads); + +/* + * This function checks for grace-period requests that fail to motivate + * RCU to come out of its idle mode. + */ +static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, + const unsigned long gpssdelay) +{ + unsigned long flags; + unsigned long j; + struct rcu_node *rnp_root = rcu_get_root(); + static atomic_t warned = ATOMIC_INIT(0); + + if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || + ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed)) + return; + j = jiffies; /* Expensive access, and in common case don't get here. */ + if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || + time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || + atomic_read(&warned)) + return; + + raw_spin_lock_irqsave_rcu_node(rnp, flags); + j = jiffies; + if (rcu_gp_in_progress() || + ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || + time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || + time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || + atomic_read(&warned)) { + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + return; + } + /* Hold onto the leaf lock to make others see warned==1. */ + + if (rnp_root != rnp) + raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */ + j = jiffies; + if (rcu_gp_in_progress() || + ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || + time_before(j, rcu_state.gp_req_activity + gpssdelay) || + time_before(j, rcu_state.gp_activity + gpssdelay) || + atomic_xchg(&warned, 1)) { + raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */ + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + return; + } + WARN_ON(1); + if (rnp_root != rnp) + raw_spin_unlock_rcu_node(rnp_root); + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + show_rcu_gp_kthreads(); +} + +/* + * Do a forward-progress check for rcutorture. This is normally invoked + * due to an OOM event. The argument "j" gives the time period during + * which rcutorture would like progress to have been made. + */ +void rcu_fwd_progress_check(unsigned long j) +{ + unsigned long cbs; + int cpu; + unsigned long max_cbs = 0; + int max_cpu = -1; + struct rcu_data *rdp; + + if (rcu_gp_in_progress()) { + pr_info("%s: GP age %lu jiffies\n", + __func__, jiffies - rcu_state.gp_start); + show_rcu_gp_kthreads(); + } else { + pr_info("%s: Last GP end %lu jiffies ago\n", + __func__, jiffies - rcu_state.gp_end); + preempt_disable(); + rdp = this_cpu_ptr(&rcu_data); + rcu_check_gp_start_stall(rdp->mynode, rdp, j); + preempt_enable(); + } + for_each_possible_cpu(cpu) { + cbs = rcu_get_n_cbs_cpu(cpu); + if (!cbs) + continue; + if (max_cpu < 0) + pr_info("%s: callbacks", __func__); + pr_cont(" %d: %lu", cpu, cbs); + if (cbs <= max_cbs) + continue; + max_cbs = cbs; + max_cpu = cpu; + } + if (max_cpu >= 0) + pr_cont("\n"); +} +EXPORT_SYMBOL_GPL(rcu_fwd_progress_check); + +/* Commandeer a sysrq key to dump RCU's tree. */ +static bool sysrq_rcu; +module_param(sysrq_rcu, bool, 0444); + +/* Dump grace-period-request information due to commandeered sysrq. */ +static void sysrq_show_rcu(int key) +{ + show_rcu_gp_kthreads(); +} + +static struct sysrq_key_op sysrq_rcudump_op = { + .handler = sysrq_show_rcu, + .help_msg = "show-rcu(y)", + .action_msg = "Show RCU tree", + .enable_mask = SYSRQ_ENABLE_DUMP, +}; + +static int __init rcu_sysrq_init(void) +{ + if (sysrq_rcu) + return register_sysrq_key('y', &sysrq_rcudump_op); + return 0; +} +early_initcall(rcu_sysrq_init); -- cgit v1.2.3-59-g8ed1b From 6c70e9cd5f3c6d93f3e1da6d101073e898f39170 Mon Sep 17 00:00:00 2001 From: Neeraj Upadhyay Date: Fri, 8 Mar 2019 11:57:48 -0800 Subject: rcu: Fix nohz status in stall warning The Documentation/RCU/stallwarn.txt file says that stall warnings print "D" if dyntick-idle processing is enabled, but the code in print_cpu_stall_fast_no_hz() prints "." instead. This commit therefore reverses the sense of the test to make the code match the documentation. Signed-off-by: Neeraj Upadhyay Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_stall.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index 9e3db08d02bc..f65a73a97323 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -267,7 +267,7 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu) rdp->last_accelerate & 0xffff, jiffies & 0xffff, ".l"[rdp->all_lazy], ".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)], - ".D"[!rdp->tick_nohz_enabled_snap]); + ".D"[!!rdp->tick_nohz_enabled_snap]); } #else /* #ifdef CONFIG_RCU_FAST_NO_HZ */ -- cgit v1.2.3-59-g8ed1b From 24aca4aea4f0179e0e56cf9ec610c27d07702945 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 22 Jan 2019 19:23:00 -0800 Subject: torture: Don't try to offline the last CPU If there is only one online CPU, it doesn't make sense to try to offline it, as any such attempt is guaranteed to fail. This commit therefore check for this condition and refuses to attempt the nonsensical. Reported-by: Su Yue Signed-off-by: Paul E. McKenney Tested-By: Su Yue --- kernel/torture.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'kernel') diff --git a/kernel/torture.c b/kernel/torture.c index 8faa1a9aaeb9..17b2be9bde12 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -88,6 +88,8 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes, if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu)) return false; + if (num_online_cpus() <= 1) + return false; /* Can't offline the last CPU. */ if (verbose > 1) pr_alert("%s" TORTURE_FLAG -- cgit v1.2.3-59-g8ed1b From a3b0e1e59ef1757488ef05b66bc376eaf7b06ada Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 28 Feb 2019 15:06:13 -0800 Subject: rcutorture: Make rcutorture_extend_mask() comment match the code The code actually rarely uses more than one type of RCU read-side protection, as is actually desired given that we need some reasonable probability of preempting RCU read-side critical sections, which cannot happen with multiple types of protection. This comment therefore adjusts the comment. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index f14d1b18a74f..2453229ba15a 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1160,7 +1160,7 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) unsigned long randmask2 = randmask1 >> 3; WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT); - /* Most of the time lots of bits, half the time only one bit. */ + /* Mostly only one bit (need preemption!), sometimes lots of bits. */ if (!(randmask1 & 0x7)) mask = mask & randmask2; else -- cgit v1.2.3-59-g8ed1b From f47cb1bb0da23162f4c17e0c0023df4889ecb492 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 28 Feb 2019 15:59:43 -0800 Subject: rcutorture: Remove ->ext_irq_conflict field Back when there was a separate RCU-bh flavor, the ->ext_irq_conflict field was used to prevent executing local_bh_enable() while interrupts were disabled. However, there is no longer an RCU-bh flavor, so this commit removes the no-longer-needed ->ext_irq_conflict field. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 5 ----- 1 file changed, 5 deletions(-) (limited to 'kernel') diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 2453229ba15a..21ab3c7eb221 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -299,7 +299,6 @@ struct rcu_torture_ops { int irq_capable; int can_boost; int extendables; - int ext_irq_conflict; const char *name; }; @@ -1170,10 +1169,6 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) || (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH)))) mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH; - if ((mask & RCUTORTURE_RDR_IRQ) && - !(mask & cur_ops->ext_irq_conflict) && - (oldmask & cur_ops->ext_irq_conflict)) - mask |= cur_ops->ext_irq_conflict; /* Or if readers object. */ return mask ?: RCUTORTURE_RDR_RCU; } -- cgit v1.2.3-59-g8ed1b From d44ac1bebc47e46d62019808a893582f56496a98 Mon Sep 17 00:00:00 2001 From: Neeraj Upadhyay Date: Sat, 9 Mar 2019 00:40:45 +0530 Subject: rcutorture: Fix expected forward progress duration in OOM notifier The rcutorture_oom_notify() function has a misplaced close parenthesis that results in increasingly long delays in rcu_fwd_progress_check()'s checking for various RCU forward-progress problems. This commit therefore puts the parenthesis in the right place. Signed-off-by: Neeraj Upadhyay Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 21ab3c7eb221..b42682b94cb7 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1843,7 +1843,7 @@ static int rcutorture_oom_notify(struct notifier_block *self, WARN(1, "%s invoked upon OOM during forward-progress testing.\n", __func__); rcu_torture_fwd_cb_hist(); - rcu_fwd_progress_check(1 + (jiffies - READ_ONCE(rcu_fwd_startat) / 2)); + rcu_fwd_progress_check(1 + (jiffies - READ_ONCE(rcu_fwd_startat)) / 2); WRITE_ONCE(rcu_fwd_emergency_stop, true); smp_mb(); /* Emergency stop before free and wait to avoid hangs. */ pr_info("%s: Freed %lu RCU callbacks.\n", -- cgit v1.2.3-59-g8ed1b From b813afae7ab6a5e91b4e16cc567331d9c2ae1f04 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 21 Mar 2019 09:27:28 -0700 Subject: rcutorture: Fix cleanup path for invalid torture_type strings If the specified rcutorture.torture_type is not in the rcu_torture_init() function's torture_ops[] array, rcutorture prints some console messages and then invokes rcu_torture_cleanup() to set state so that a future torture test can run. However, rcu_torture_cleanup() also attempts to end the test that didn't actually start, and in doing so relies on the value of cur_ops, a value that is not particularly relevant in this case. This can result in confusing output or even follow-on failures due to attempts to use facilities that have not been properly initialized. This commit therefore sets the value of cur_ops to NULL in this case and inserts a check near the beginning of rcu_torture_cleanup(), thus avoiding relying on an irrelevant cur_ops value. Reported-by: kernel test robot Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'kernel') diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index b42682b94cb7..e3c0f57ab0aa 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -2089,6 +2089,10 @@ rcu_torture_cleanup(void) cur_ops->cb_barrier(); return; } + if (!cur_ops) { + torture_cleanup_end(); + return; + } rcu_torture_barrier_cleanup(); torture_stop_kthread(rcu_torture_fwd_prog, fwd_prog_task); @@ -2262,6 +2266,7 @@ rcu_torture_init(void) pr_cont("\n"); WARN_ON(!IS_MODULE(CONFIG_RCU_TORTURE_TEST)); firsterr = -EINVAL; + cur_ops = NULL; goto unwind; } if (cur_ops->fqs == NULL && fqs_duration != 0) { -- cgit v1.2.3-59-g8ed1b From ad092c027713a68a34168942a5ef422e42e039f4 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 21 Mar 2019 10:26:41 -0700 Subject: rcuperf: Fix cleanup path for invalid perf_type strings If the specified rcuperf.perf_type is not in the rcu_perf_init() function's perf_ops[] array, rcuperf prints some console messages and then invokes rcu_perf_cleanup() to set state so that a future torture test can run. However, rcu_perf_cleanup() also attempts to end the test that didn't actually start, and in doing so relies on the value of cur_ops, a value that is not particularly relevant in this case. This can result in confusing output or even follow-on failures due to attempts to use facilities that have not been properly initialized. This commit therefore sets the value of cur_ops to NULL in this case and inserts a check near the beginning of rcu_perf_cleanup(), thus avoiding relying on an irrelevant cur_ops value. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcuperf.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'kernel') diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c index c29761152874..7a6890b23c5f 100644 --- a/kernel/rcu/rcuperf.c +++ b/kernel/rcu/rcuperf.c @@ -494,6 +494,10 @@ rcu_perf_cleanup(void) if (torture_cleanup_begin()) return; + if (!cur_ops) { + torture_cleanup_end(); + return; + } if (reader_tasks) { for (i = 0; i < nrealreaders; i++) @@ -614,6 +618,7 @@ rcu_perf_init(void) pr_cont("\n"); WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST)); firsterr = -EINVAL; + cur_ops = NULL; goto unwind; } if (cur_ops->init) -- cgit v1.2.3-59-g8ed1b From a9d6938ddb7f892552013b93e4842fc1a538628d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 21 Mar 2019 13:30:01 -0700 Subject: locktorture: NULL cxt.lwsa and cxt.lrsa to allow bad-arg detection Currently, lock_torture_cleanup() uses the values of cxt.lwsa and cxt.lrsa to detect bad parameters that prevented locktorture from initializing, let alone running. In this case, lock_torture_cleanup() does no cleanup aside from invoking torture_cleanup_begin() and torture_cleanup_end(), as required to permit future torture tests to run. However, this heuristic fails if the run with bad parameters was preceded by a previous run that actually ran: In this case, both cxt.lwsa and cxt.lrsa will remain non-zero, which means that the current lock_torture_cleanup() invocation will be unable to detect the fact that it should skip cleanup, which can result in charming outcomes such as double frees. This commit therefore NULLs out both cxt.lwsa and cxt.lrsa at the end of any run that actually ran. Signed-off-by: Paul E. McKenney Cc: Davidlohr Bueso Cc: Josh Triplett --- kernel/locking/locktorture.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'kernel') diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index ad40a2617063..80a463d31a8d 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -829,7 +829,9 @@ static void lock_torture_cleanup(void) "End of test: SUCCESS"); kfree(cxt.lwsa); + cxt.lwsa = NULL; kfree(cxt.lrsa); + cxt.lrsa = NULL; end: torture_cleanup_end(); -- cgit v1.2.3-59-g8ed1b