path: root/kernel/sched_fair.c
author		Max Krasnyansky <maxk@qualcomm.com>	2008-07-15 04:43:49 -0700
committer	Ingo Molnar <mingo@elte.hu>	2008-07-18 13:22:25 +0200
commit		e761b7725234276a802322549cee5255305a0930 (patch)
tree		27b351a7d5fc9a93590e0effce1c5adb1bfcebc0 /kernel/sched_fair.c
parent		sched: rework of "prioritize non-migratable tasks over migratable ones" (diff)
download	linux-dev-e761b7725234276a802322549cee5255305a0930.tar.xz
		linux-dev-e761b7725234276a802322549cee5255305a0930.zip
cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)
This is based on Linus' idea of creating cpu_active_map, which prevents the scheduler load balancer from migrating tasks to a CPU that is going down. It allows us to simplify the domain management code and avoid unnecessary domain rebuilds during cpu hotplug event handling.

Please ignore the cpusets part for now. It needs some more work in order to avoid crazy lock nesting. I did, however, simplify and unify the domain reinitialization logic: we now simply call partition_sched_domains() in all cases. This means that we're using exactly the same code paths as in the cpusets case, and hence the tests below cover cpusets too. Cpuset changes to make rebuild_sched_domains() callable from various contexts are in a separate patch (right after this one).

This not only boots but also easily handles

  while true; do make clean; make -j 8; done

and

  while true; do on-off-cpu 1; done

at the same time (on-off-cpu 1 simply does "echo 0/1 > /sys/.../cpu1/online"). Surprisingly, the box (dual-core Core2) is quite usable. In fact I'm typing this on it right now in gnome-terminal, and things are moving along just fine. Also, this is running with most of the debug features enabled (lockdep, mutex, etc); no BUG_ONs or lockdep complaints so far.

I believe I addressed all of Dmitry's comments on Linus' original version. I changed both the fair and rt balancers to mask out non-active cpus, and replaced cpu_is_offline() with !cpu_active() in the main scheduler code where it made sense (to me).

Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
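For context, the commit message relies on two pieces introduced elsewhere in this patch: the cpu_active_map cpumask and the cpu_active() test. A minimal sketch of the intended semantics follows; the exact definitions live in include/linux/cpumask.h and the hotplug notifier code, so the lines below are an illustration, not the diff itself:

/*
 * Sketch only, not part of this hunk: cpu_active_map is a global
 * cpumask that is cleared for a CPU before it goes down, and
 * cpu_active() tests membership in it.
 */
extern cpumask_t cpu_active_map;

#define cpu_active(cpu)		cpu_isset((cpu), cpu_active_map)

/* On CPU_DOWN_PREPARE: stop the balancer targeting the CPU early,
 * before it actually goes offline. */
cpu_clear(cpu, cpu_active_map);

/* On CPU_DOWN_FAILED / CPU_ONLINE: make the CPU a valid migration
 * target again. */
cpu_set(cpu, cpu_active_map);

Clearing the bit early is the point of the design: migrations toward the dying CPU stop at DOWN_PREPARE time, so the hotplug path no longer needs to tear down and rebuild sched domains just to fence off one CPU.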
Diffstat (limited to 'kernel/sched_fair.c')
-rw-r--r--	kernel/sched_fair.c	3
1 file changed, 3 insertions(+), 0 deletions(-)
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index f2aa987027d6..d924c679dfac 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1004,6 +1004,8 @@ static void yield_task_fair(struct rq *rq)
* not idle and an idle cpu is available. The span of cpus to
* search starts with cpus closest then further out as needed,
* so we always favor a closer, idle cpu.
+ * Domains may include CPUs that are not usable for migration,
+ * hence we need to mask them out (cpu_active_map)
*
* Returns the CPU we should wake onto.
*/
@@ -1031,6 +1033,7 @@ static int wake_idle(int cpu, struct task_struct *p)
|| ((sd->flags & SD_WAKE_IDLE_FAR)
&& !task_hot(p, task_rq(p)->clock, sd))) {
cpus_and(tmp, sd->span, p->cpus_allowed);
+ cpus_and(tmp, tmp, cpu_active_map);
for_each_cpu_mask(i, tmp) {
if (idle_cpu(i)) {
if (i != task_cpu(p)) {
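
The net effect of the one-line addition above: wake_idle() now only considers CPUs that sit in the intersection of the domain span, the task's affinity mask, and the active map. Restated as a sketch, using the same old-style cpumask API as the hunk:

cpumask_t tmp;
cpus_and(tmp, sd->span, p->cpus_allowed);	/* domain span AND task affinity */
cpus_and(tmp, tmp, cpu_active_map);		/* drop CPUs on their way down */
/* only CPUs present in all three masks are candidate idle wake targets */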