aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2015-01-15Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', 'preempt.2015.01.06a', 'srcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEADPaul E. McKenney47-567/+562
doc.2015.01.07a: Documentation updates. fixes.2015.01.15a: Miscellaneous fixes. preempt.2015.01.06a: Changes to handling of lists of preempted tasks. srcu.2015.01.06a: SRCU updates. stall.2015.01.16a: RCU CPU stall-warning updates and fixes. torture.2015.01.11a: RCU torture-test updates and fixes.
2015-01-15rcu: Initialize tiny RCU stall-warning timeouts at bootPaul E. McKenney1-0/+2
The current tiny RCU stall-warning code assumes that the jiffies counter starts at zero, however, it is sometimes initialized to other values, for example, -30,000. This commit therefore changes rcu_init() to invoke reset_cpu_stall_ticks() for both flavors of RCU to initialize the stall-warning times properly at boot. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-15rcu: Fix RCU CPU stall detection in tiny implementationMiroslav Benes1-4/+3
The tiny RCU CPU stall detection depends on *rcp->curtail not being NULL. It is however a tail pointer and thus NULL by definition. Instead we should check rcp->rcucblist for the presence of pending callbacks which need to be processed. With this fix INFO about the stall is printed and jiffies_stall (jiffies at next stall) correctly updated. Note that the check for pending callback is necessary to avoid spurious warnings if there are no pendings callbacks. Signed-off-by: Miroslav Benes <mbenes@suse.cz> [ paulmck: Fused identical "if" statements, ported to -rcu. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-15rcu: Add GP-kthread-starvation checks to CPU stall warningsPaul E. McKenney2-1/+29
This commit adds a message that is printed if the relevant grace-period kthread has not been able to run for the two seconds preceding the stall warning. (The two seconds is double the maximum interval between successive bouts of quiescent-state forcing.) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-15rcu: Make cond_resched_rcu_qs() apply to normal RCU flavorsPaul E. McKenney7-25/+65
Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used in places where it would be useful for it to apply to the normal RCU flavors, rcu_preempt, rcu_sched, and rcu_bh. This is especially the case for workloads that aggressively overload the system, particularly those that generate large numbers of RCU updates on systems running NO_HZ_FULL CPUs. This commit therefore communicates quiescent states from cond_resched_rcu_qs() to the normal RCU flavors. Note that it is unfortunately necessary to leave the old ->passed_quiesce mechanism in place to allow quiescent states that apply to only one flavor to be recorded. (Yes, we could decrement ->rcu_qs_ctr_snap in that case, but that is not so good for debugging of RCU internals.) In addition, if one of the RCU flavor's grace period has stalled, this will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight quiescent state visible from other CPUs. Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu() was used in preemptible code. ]
2015-01-15rcu: Optionally run grace-period kthreads at real-time priorityPaul E. McKenney3-8/+27
Recent testing has shown that under heavy load, running RCU's grace-period kthreads at real-time priority can improve performance (according to 0day test robot) and reduce the incidence of RCU CPU stall warnings. However, most systems do just fine with the default non-realtime priorities for these kthreads, and it does not make sense to expose the entire user base to any risk stemming from this change, given that this change is of use only to a few users running extremely heavy workloads. Therefore, this commit allows users to specify realtime priorities for the grace-period kthreads, but leaves them running SCHED_OTHER by default. The realtime priority may be specified at build time via the RCU_KTHREAD_PRIO Kconfig parameter, or at boot time via the rcutree.kthread_prio parameter. Either way, 0 says to continue the default SCHED_OTHER behavior and values from 1-99 specify that priority of SCHED_FIFO behavior. Note that a value of 0 is not permitted when the RCU_BOOST Kconfig parameter is specified. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-14ksoftirqd: Use new cond_resched_rcu_qs() functionPaul E. McKenney1-6/+1
Simplify run_ksoftirqd() by using the new cond_resched_rcu_qs() function that conditionally reschedules, but unconditionally supplies an RCU quiescent state. This commit is separate from the previous commit by Calvin Owens because Calvin's approach can be backported, while this commit cannot be. The reason that this commit cannot be backported is that cond_resched_rcu_qs() does not always provide the needed quiescent state in earlier kernels. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-14ksoftirqd: Enable IRQs and call cond_resched() before poking RCUCalvin Owens1-1/+5
While debugging an issue with excessive softirq usage, I encountered the following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread infrastructure"): [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ] ...but despite this note, the patch still calls RCU with IRQs disabled. This seemingly innocuous change caused a significant regression in softirq CPU usage on the sending side of a large TCP transfer (~1 GB/s): when introducing 0.01% packet loss, the softirq usage would jump to around 25%, spiking as high as 50%. Before the change, the usage would never exceed 5%. Moving the call to rcu_note_context_switch() after the cond_sched() call, as it was originally before the hotplug patch, completely eliminated this problem. Signed-off-by: Calvin Owens <calvinowens@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Add more diagnostics in rcu_barrier() test failure casePaul E. McKenney1-0/+3
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10torture: Flag console.log file to prevent holdovers from earlier runsPaul E. McKenney1-0/+1
A system misconfiguration that prevents qemu from running at all (for example, a missing dynamically linked library) will keep the console.log file from the previous run. This can fool the developer into thinking that this failed run actually completed correctly. This commit therefore overwrites the console.log file just before launching qemu. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10torture: Add "-enable-kvm -soundhw pcspk" to qemu command linePaul E. McKenney1-4/+4
More recent qemu implementations really want "-enable-kvm", and the "-soundhw pcspk" makes the script a bit less dependent on odd audio libraries being installed. This commit therefore adds both to the default qemu command line. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Handle different mpstat versionsPaul E. McKenney1-1/+1
The mpstat command recently added the %gnice column, which messes up the cpu2use.sh script's idle-CPU calculations. This commit therefore uses $NF instead of $12 to select the last (%idle) column. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Check from beginning to end of grace periodPaul E. McKenney5-43/+95
Currently, rcutorture's Reader Batch checks measure from the end of the previous grace period to the end of the current one. This commit tightens up these checks by measuring from the start and end of the same grace period. This involves adding rcu_batches_started() and friends corresponding to the existing rcu_batches_completed() and friends. We leave SRCU alone for the moment, as it does not yet have a way of tracking both ends of its grace periods. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcu: Remove redundant rcu_batches_completed() declarationPaul E. McKenney1-1/+0
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Drop rcu_torture_completed() and friendsPaul E. McKenney1-13/+3
Now that the return type of rcu_batches_completed() and friends matches that of the rcu_torture_ops structure's ->completed field, the wrapper functions can be deleted. This commit carries out that deletion, while also wiring "sched"'s ->completed field to rcu_batches_completed_sched(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcu: Provide rcu_batches_completed_sched() for TINY_RCUPaul E. McKenney1-0/+8
A bug in rcutorture has caused it to ignore completed batches. In preparation for fixing that bug, this commit provides TINY_RCU with the required rcu_batches_completed_sched(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Use unsigned for Reader Batch computationsPaul E. McKenney1-9/+9
The counter returned by the various ->completed functions is subject to overflow, which means that subtracting two such counters might result in overflow, which invokes undefined behavior in the C standard. This commit therefore changes these functions and variables to unsigned to avoid this undefined behavior. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Make build-output parsing correctly flag RCU's warningsPaul E. McKenney1-7/+13
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcu: Make _batches_completed() functions return unsigned longPaul E. McKenney5-11/+11
Long ago, the various ->completed fields were of type long, but now are unsigned long due to signed-integer-overflow concerns. However, the various _batches_completed() functions remained of type long, even though their only purpose in life is to return the corresponding ->completed field. This patch cleans this up by changing these functions' return types to unsigned long. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Issue warnings on close calls due to Reader Batch blowsPaul E. McKenney1-0/+18
Normal rcutorture checking overestimates grace periods somewhat due to the fact that there is a delay from a grace-period request until the start of the corresponding grace period and another delay from the end of that grace period to notification of the requestor. This means that rcutorture's detection of RCU bugs is less sensitive than it might be. It turns out that rcutorture also checks the underlying grace-period "completed" counter (displayed in Reader Batch output), which in theory allows rcutorture to do exact checks. In practice, memory misordering (by both compiler and CPU) can result in false positives. However, experience on x86 shows that these false positives are quite rare, occurring less than one time per 1,000 hours of testing. This commit therefore does the exact checking, giving a warning if any Reader Batch blows happen, and flagging an error if they happen more often than once every three hours in long tests. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-07documentation: Fix smp typo in memory-barriers.txtDavidlohr Bueso1-1/+1
Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-07documentation: Record limitations of bitfields and small variablesPaul E. McKenney1-0/+44
This commit documents the fact that it is not safe to use bitfields as shared variables in synchronization algorithms. It also documents that CPUs must be able to concurrently load from and store to adjacent one-byte and two-byte variables, which is in fact required by the C11 standard (Section 3.14). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Handle gpnum/completed wrap while dyntick idlePaul E. McKenney3-6/+15
Subtle race conditions can result if a CPU stays in dyntick-idle mode long enough for the ->gpnum and ->completed fields to wrap. For example, consider the following sequence of events: o CPU 1 encounters a quiescent state while waiting for grace period 5 to complete, but then enters dyntick-idle mode. o While CPU 1 is in dyntick-idle mode, the grace-period counters wrap around so that the grace period number is now 4. o Just as CPU 1 exits dyntick-idle mode, grace period 4 completes and grace period 5 begins. o The quiescent state that CPU 1 passed through during the old grace period 5 looks like it applies to the new grace period 5. Therefore, the new grace period 5 completes without CPU 1 having passed through a quiescent state. This could clearly be a fatal surprise to any long-running RCU read-side critical section that happened to be running on CPU 1 at the time. At one time, this was not a problem, given that it takes significant time for the grace-period counters to overflow even on 32-bit systems. However, with the advent of NO_HZ_FULL and SMP embedded systems, arbitrarily long idle periods are now becoming quite feasible. It is therefore time to close this race. This commit therefore avoids this race condition by having the quiescent-state forcing code detect when a CPU is falling too far behind, and setting a new rcu_data field ->gpwrap when this happens. Whenever this new ->gpwrap field is set, the CPU's ->gpnum and ->completed fields are known to be untrustworthy, and can be ignored, along with any associated quiescent states. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Improve diagnostics for spurious RCU CPU stall warningsPaul E. McKenney3-5/+34
The current RCU CPU stall warning code will print "Stall ended before state dump start" any time that the stall-warning code is triggered on a CPU that has already reported a quiescent state for the current grace period and if all quiescent states have been reported for the current grace period. However, a true stall can result in these symptoms, for example, by preventing RCU's grace-period kthreads from ever running This commit therefore checks for this condition, reporting the end of the stall only if one of the grace-period counters has actually advanced. Otherwise, it reports the last time that the grace-period kthread made meaningful progress. (In normal situations, the grace-period kthread should make meaningful progress at least every jiffies_till_next_fqs jiffies.) Reported-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Miroslav Benes <mbenes@suse.cz>
2015-01-06rcu: Make RCU_CPU_STALL_INFO include number of fqs attemptsPaul E. McKenney3-1/+5
One way that an RCU CPU stall warning can happen is if the grace-period kthread is not allowed to execute. One proxy for this kthread's forward progress is the number of force-quiescent-state (fqs) scans. This commit therefore adds the number of fqs scans to the RCU CPU stall warning printouts when CONFIG_RCU_CPU_STALL_INFO=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcutorture: Add checks for stall ending before dump startPaul E. McKenney1-1/+1
The current rcutorture scripting checks for actual stalls (via the "Call Trace:" check), but fails to spot the case where a stall ends just as it is being detected. This commit therefore adds a check for this case. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Set default to RCU_CPU_STALL_INFO=yPaul E. McKenney1-1/+1
The RCU_CPU_STALL_INFO code has been in for quite some time, and has proven reliable. This commit therefore enables it by default. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Make SRCU optional by using CONFIG_SRCUPranith Kumar23-1/+34
SRCU is not necessary to be compiled by default in all cases. For tinification efforts not compiling SRCU unless necessary is desirable. The current patch tries to make compiling SRCU optional by introducing a new Kconfig option CONFIG_SRCU which is selected when any of the components making use of SRCU are selected. If we do not select CONFIG_SRCU, srcu.o will not be compiled at all. text data bss dec hex filename 2007 0 0 2007 7d7 kernel/rcu/srcu.o Size of arch/powerpc/boot/zImage changes from text data bss dec hex filename 831552 64180 23944 919676 e087c arch/powerpc/boot/zImage : before 829504 64180 23952 917636 e0084 arch/powerpc/boot/zImage : after so the savings are about ~2000 bytes. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Josh Triplett <josh@joshtriplett.org> CC: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]
2015-01-06rcu: Combine DEFINE_SRCU() and DEFINE_STATIC_SRCU()Paul E. McKenney1-6/+4
The DEFINE_SRCU() and DEFINE_STATIC_SRCU() definitions are quite similar, so this commit combines them, saving a bit of code and removing redundancy. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Expand SRCU ->completed to 64 bitsPaul E. McKenney2-3/+3
When rcutorture used only the low-order 32 bits of the grace-period number, it was not a problem for SRCU to use a 32-bit completed field. However, rcutorture now uses the full 64 bits on 64-bit systems, so this commit converts SRCU's ->completed field to unsigned long so as to provide 64 bits on 64-bit systems. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Remove redundant callback-list initializationPaul E. McKenney1-3/+0
The RCU callback lists are initialized in both rcu_boot_init_percpu_data() and rcu_init_percpu_data(). The former is intended for initializing immutable data, so this commit removes the initialization from rcu_boot_init_percpu_data() and leaves it in rcu_init_percpu_data(). This change prepares for permitting callbacks to be queued very early in boot. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't scan root rcu_node structure for stalled tasksPaul E. McKenney1-9/+0
Now that blocked tasks are no longer migrated to the root rcu_node structure, there is no need to scan the root rcu_node structure for blocked tasks stalling the current grace period. This commit therefore removes this scan. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Revert "Allow post-unlock reference for rt_mutex" to avoid priority-inversionLai Jiangshan2-12/+1
The patch dfeb9765ce3c ("Allow post-unlock reference for rt_mutex") ensured rcu-boost safe even the rt_mutex has post-unlock reference. But rt_mutex allowing post-unlock reference is definitely a bug and it was fixed by the commit 27e35715df54 ("rtmutex: Plug slow unlock race"). This fix made the previous patch (dfeb9765ce3c) useless. And even worse, the priority-inversion introduced by the the previous patch still exists. rcu_read_unlock_special() { rt_mutex_unlock(&rnp->boost_mtx); /* Priority-Inversion: * the current task had been deboosted and preempted as a low * priority task immediately, it could wait long before reschedule in, * and the rcu-booster also waits on this low priority task and sleeps. * This priority-inversion makes rcu-booster can't work * as expected. */ complete(&rnp->boost_completion); } Just revert the patch to avoid it. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Note quiescent state when CPU goes offlinePaul E. McKenney1-1/+1
The rcu_cleanup_dead_cpu() function (called after a CPU has gone completely offline) has not reported a quiescent state because there was probably at least one synchronize_rcu() between the time the CPU went offline and the CPU_DEAD notifier, and this would have detected the CPU's offline state via quiescent-state forcing. However, the plan is for CPUs to take themselves offline, at which point it makes sense for them to report their own quiescent state. This commit makes this change in preparation for the new CPU-hotplug setup. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't bother affinitying rcub kthreads away from offline CPUsPaul E. McKenney1-5/+1
When rcu_boost_kthread_setaffinity() sees that all CPUs for a given rcu_node structure are now offline, it affinities the corresponding RCU-boost ("rcub") kthread away from those CPUs. This is pointless because the kthread cannot run on those offline CPUs in any case. This commit therefore removes this unneeded code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't initiate RCU priority boosting on root rcu_nodePaul E. McKenney1-6/+0
Because there is no longer any preempted tasks on the root rcu_node, and because there is no longer ever an rcub kthread for the root rcu_node, this commit drops the code in force_qs_rnp() that attempts to awaken the non-existent root rcub kthread. This is strictly a performance enhancement, removing a root rcu_node ->lock acquisition and release along with some tests in rcu_initiate_boost(), ending with the test that notes that there is no rcub kthread. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't spawn rcub kthreads on root rcu_node structurePaul E. McKenney1-6/+2
Now that offlining CPUs no longer moves leaf rcu_node structures' ->blkd_tasks lists to the root, there is no way for the root rcu_node structure's ->blkd_task list to be nonempty, unless the root node is also the sole leaf node. This commit therefore refrains from creating an rcub kthread for the root rcu_node structure unless it is also the sole leaf. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Make use of rcu_preempt_has_tasks()Paul E. McKenney1-3/+3
Given that there is now arcu_preempt_has_tasks() function that checks to see if the ->blkd_tasks list is non-empty, this commit makes use of it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Shorten irq-disable region in rcu_cleanup_dead_cpu()Paul E. McKenney1-2/+2
Now that we are not migrating callbacks, there is no need to hold the ->orphan_lock across the the ->qsmaskinit bit-clearing process. This commit therefore releases ->orphan_lock immediately after adopting the orphaned RCU callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't migrate blocked tasks even if all corresponding CPUs offlinePaul E. McKenney3-160/+4
When the last CPU associated with a given leaf rcu_node structure goes offline, something must be done about the tasks queued on that rcu_node structure. Each of these tasks has been preempted on one of the leaf rcu_node structure's CPUs while in an RCU read-side critical section that it have not yet exited. Handling these tasks is the job of rcu_preempt_offline_tasks(), which migrates them from the leaf rcu_node structure to the root rcu_node structure. Unfortunately, this migration has to be done one task at a time because each tasks allegiance must be shifted from the original leaf rcu_node to the root, so that future attempts to deal with these tasks will acquire the root rcu_node structure's ->lock rather than that of the leaf. Worse yet, this migration must be done with interrupts disabled, which is not so good for realtime response, especially given that there is no bound on the number of tasks on a given rcu_node structure's list. (OK, OK, there is a bound, it is just that it is unreasonably large, especially on 64-bit systems.) This was not considered a problem back when rcu_preempt_offline_tasks() was first written because realtime systems were assumed not to do CPU-hotplug operations while real-time applications were running. This assumption has proved of dubious validity given that people are starting to run multiple realtime applications on a single SMP system and that it is common practice to offline then online a CPU before starting its real-time application in order to clear extraneous processing off of that CPU. So we now need CPU hotplug operations to avoid undue latencies. This commit therefore avoids migrating these tasks, instead letting them be dequeued one by one from the original leaf rcu_node structure by rcu_read_unlock_special(). This means that the clearing of bits from the upper-level rcu_node structures must be deferred until the last such task has been dequeued, because otherwise subsequent grace periods won't wait on them. This commit has the beneficial side effect of simplifying the CPU-hotplug code for TREE_PREEMPT_RCU, especially in CONFIG_RCU_BOOST builds. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Make rcu_read_unlock_special() propagate ->qsmaskinit bit clearingPaul E. McKenney2-3/+17
This commit causes rcu_read_unlock_special() to propagate ->qsmaskinit bit clearing up the rcu_node tree once a given rcu_node structure's blkd_tasks list becomes empty. This is the final commit in preparation for the rework of RCU priority boosting: It enables preempted tasks to remain queued on their rcu_node structure even after all of that rcu_node structure's CPUs have gone offline. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Abstract rcu_cleanup_dead_rnp() from rcu_cleanup_dead_cpu()Paul E. McKenney3-19/+66
This commit abstracts rcu_cleanup_dead_rnp() from rcu_cleanup_dead_cpu() in preparation for the rework of RCU priority boosting. This new function will be invoked from rcu_read_unlock_special() in the reworked scheme, which is why rcu_cleanup_dead_rnp() assumes that the leaf rcu_node structure's ->qsmaskinit field has already been updated. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Rename "empty" to "empty_norm" in preparation for boost reworkPaul E. McKenney1-3/+3
This commit undertakes a simple variable renaming to make way for some rework of RCU priority boosting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Protect rcu_boost() lockless accesses with ACCESS_ONCE()Paul E. McKenney1-1/+2
This commit prevents random compiler optimizations by applying ACCESS_ONCE() to lockless accesses. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Remove "select IRQ_WORK" from config TREE_RCULai Jiangshan2-3/+0
The 48a7639ce80c ("rcu: Make callers awaken grace-period kthread") removed the irq_work_queue(), so the TREE_RCU doesn't need irq work any more. This commit therefore updates RCU's Kconfig and Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Fix rcu_barrier() race that could result in too-short waitPaul E. McKenney3-39/+36
The rcu_barrier() no-callbacks check for no-CBs CPUs has race conditions. It checks a given CPU's lists of callbacks, and if all three no-CBs lists are empty, ignores that CPU. However, these three lists could potentially be empty even when callbacks are present if the check executed just as the callbacks were being moved from one list to another. It turns out that recent versions of rcutorture can spot this race. This commit plugs this hole by consolidating the per-list counts of no-CBs callbacks into a single count, which is incremented before the corresponding callback is posted and after it is invoked. Then rcu_barrier() checks this single count to reliably determine whether the corresponding CPU has no-CBs callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06hotplugcpu: Avoid deadlocks by waking active_writerDavid Hildenbrand1-33/+23
Commit b2c4623dcd07 ("rcu: More on deadlock between CPU hotplug and expedited grace periods") introduced another problem that can easily be reproduced by starting/stopping cpus in a loop. E.g.: for i in `seq 5000`; do echo 1 > /sys/devices/system/cpu/cpu1/online echo 0 > /sys/devices/system/cpu/cpu1/online done Will result in: INFO: task /cpu_start_stop:1 blocked for more than 120 seconds. Call Trace: ([<00000000006a028e>] __schedule+0x406/0x91c) [<0000000000130f60>] cpu_hotplug_begin+0xd0/0xd4 [<0000000000130ff6>] _cpu_up+0x3e/0x1c4 [<0000000000131232>] cpu_up+0xb6/0xd4 [<00000000004a5720>] device_online+0x80/0xc0 [<00000000004a57f0>] online_store+0x90/0xb0 ... And a deadlock. Problem is that if the last ref in put_online_cpus() can't get the cpu_hotplug.lock the puts_pending count is incremented, but a sleeping active_writer might never be woken up, therefore never exiting the loop in cpu_hotplug_begin(). This fix removes puts_pending and turns refcount into an atomic variable. We also introduce a wait queue for the active_writer, to avoid possible races and use-after-free. There is no need to take the lock in put_online_cpus() anymore. Can't reproduce it with this fix. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rculist: Fix sparse warningYing Xue1-8/+8
This fixes the following sparse warnings: make C=1 CF=-D__CHECK_ENDIAN__ net/ipv6/addrconf.o net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces) net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces) net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces) net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces) To silence these spare complaints, an RCU annotation should be added to "next" pointer of hlist_node structure through hlist_next_rcu() macro when iterating over a hlist with hlist_for_each_entry_continue_rcu_bh(). By the way, this commit also resolves the same error appearing in hlist_for_each_entry_continue_rcu(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06tiny_rcu: Directly force QS when call_rcu_[bh|sched]() on idle_taskLai Jiangshan4-99/+14
For RCU in UP, context-switch = QS = GP, thus we can force a context-switch when any call_rcu_[bh|sched]() is happened on idle_task. After doing so, rcu_idle/irq_enter/exit() are useless, so we can simply make these functions empty. More important, this change does not change the functionality logically. Note: raise_softirq(RCU_SOFTIRQ)/rcu_sched_qs() in rcu_idle_enter() and outmost rcu_irq_exit() will have to wake up the ksoftirqd (due to in_interrupt() == 0). Before this patch After this patch: call_rcu_sched() in idle; call_rcu_sched() in idle set resched do other stuffs; do other stuffs outmost rcu_irq_exit() outmost rcu_irq_exit() (empty function) (or rcu_idle_enter()) (or rcu_idle_enter(), also empty function) start to resched. (see above) rcu_sched_qs() rcu_sched_qs() QS,and GP and advance cb QS,and GP and advance cb wake up the ksoftirqd wake up the ksoftirqd set resched resched to ksoftirqd (or other) resched to ksoftirqd (or other) These two code patches are almost the same. Size changed after patched: size kernel/rcu/tiny-old.o kernel/rcu/tiny-patched.o text data bss dec hex filename 3449 206 8 3663 e4f kernel/rcu/tiny-old.o 2406 144 8 2558 9fe kernel/rcu/tiny-patched.o Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcupdate: Replace smp_read_barrier_depends() with lockless_dereference()Pranith Kumar1-5/+5
Recently lockless_dereference() was added which can be used in place of hard-coding smp_read_barrier_depends(). The following PATCH makes the change. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>