sched: RT-balance, avoid overloading

This patch changes how a waking RT task searches for a run queue: if the task currently running on its run queue is an RT task, it tries to pick another run queue instead.

The reason is that RT tasks behave differently than normal tasks. Preempting a normal task to run an RT task to keep its cache hot is fine, because the preempted non-RT task may wait on that same runqueue to run again unless the migration thread comes along and pulls it off.

RT tasks behave differently. If one is preempted, it makes an active effort to continue to run. So when a high priority task preempts a lower priority RT task, that lower RT task will quickly try to run on another runqueue. This causes the lower RT task to replace its nice hot cache (and TLB) with a completely cold one, all in the hope that the new high priority RT task will keep its cache hot.

Remember that this high priority RT task was just woken up. It has likely been sleeping for several milliseconds, and will end up with a cold cache anyway. RT tasks run until they voluntarily stop or are preempted by a higher priority task, so it is unlikely that the woken RT task will have a hot cache to wake up to. Pushing off a lower RT task is therefore just killing its cache for no good reason.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
author: Steven Rostedt <srostedt@redhat.com> 2008-01-25 21:08:12 +0100
committer: Ingo Molnar <mingo@elte.hu> 2008-01-25 21:08:12 +0100
commit: e1f47d891c0f00769d6d40ac5740f943e998d089 (patch)
tree: ccf402b5b5a8377af811afb288c39e2e136f1700 /kernel/sched_rt.c
parent: sched: wake-balance fixes (diff)
download: linux-dev-e1f47d891c0f00769d6d40ac5740f943e998d089.tar.xz
linux-dev-e1f47d891c0f00769d6d40ac5740f943e998d089.zip
1 files changed, 16 insertions, 4 deletions
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 87d7b3ff3861..9becc3710b60 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -160,11 +160,23 @@ static int select_task_rq_rt(struct task_struct *p, int sync)
 	struct rq *rq = task_rq(p);
 
 	/*
-	 * If the task will not preempt the RQ, try to find a better RQ
-	 * before we even activate the task
+	 * If the current task is an RT task, then
+	 * try to see if we can wake this RT task up on another
+	 * runqueue. Otherwise simply start this RT task
+	 * on its current runqueue.
+	 *
+	 * We want to avoid overloading runqueues. Even if
+	 * the RT task is of higher priority than the current RT task.
+	 * RT tasks behave differently than other tasks. If
+	 * one gets preempted, we try to push it off to another queue.
+	 * So trying to keep a preempting RT task on the same
+	 * cache hot CPU will force the running RT task to
+	 * a cold CPU. So we waste all the cache for the lower
+	 * RT task in hopes of saving some of a RT task
+	 * that is just being woken and probably will have
+	 * cold cache anyway.
 	 */
-	if ((p->prio >= rq->rt.highest_prio)
-	    && (p->nr_cpus_allowed > 1)) {
+	if (unlikely(rt_task(rq->curr))) {
 		int cpu = find_lowest_rq(p);
 
 		return (cpu == -1) ? task_cpu(p) : cpu;
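
The behavioral difference between the old and new conditions can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, `model_find_lowest_rq()` is a simplified stand-in for the kernel's `find_lowest_rq()` (it ignores CPU affinity masks), and the priority constants follow the usual kernel convention that a lower `prio` value means higher priority and values below 100 are RT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RT_PRIO 100 /* assumption: prio < 100 means RT */
#define MODEL_IDLE_PRIO   140 /* assumption: an idle CPU ranks lowest */

/* Hypothetical, stripped-down stand-ins for task_struct and rq. */
struct model_task {
	int prio;            /* lower value = higher priority */
	int nr_cpus_allowed;
	int cpu;             /* CPU the task last ran on */
};

struct model_rq {
	struct model_task *curr; /* NULL models an idle CPU */
	int highest_prio;        /* best RT prio queued, 100 if none */
};

static bool model_rt_task(const struct model_task *t)
{
	return t != NULL && t->prio < MODEL_MAX_RT_PRIO;
}

/*
 * Simplified stand-in for find_lowest_rq(): return the CPU whose
 * current task has the lowest priority among those that p would
 * outrank, or -1 if p outranks none of them.
 */
static int model_find_lowest_rq(const struct model_rq *rqs, int ncpus,
				const struct model_task *p)
{
	int best = -1;
	int best_prio = p->prio;

	assert(ncpus > 0);
	for (int cpu = 0; cpu < ncpus; cpu++) {
		int prio = rqs[cpu].curr ? rqs[cpu].curr->prio
					 : MODEL_IDLE_PRIO;
		if (prio > best_prio) {
			best = cpu;
			best_prio = prio;
		}
	}
	return best;
}

/* Old rule: look elsewhere only if p would NOT preempt its run queue. */
static int model_select_old(const struct model_rq *rqs, int ncpus,
			    const struct model_task *p)
{
	const struct model_rq *rq = &rqs[p->cpu];

	if (p->prio >= rq->highest_prio && p->nr_cpus_allowed > 1) {
		int cpu = model_find_lowest_rq(rqs, ncpus, p);
		return (cpu == -1) ? p->cpu : cpu;
	}
	return p->cpu;
}

/* New rule: look elsewhere whenever the running task is an RT task. */
static int model_select_new(const struct model_rq *rqs, int ncpus,
			    const struct model_task *p)
{
	const struct model_rq *rq = &rqs[p->cpu];

	if (model_rt_task(rq->curr)) {
		int cpu = model_find_lowest_rq(rqs, ncpus, p);
		return (cpu == -1) ? p->cpu : cpu;
	}
	return p->cpu;
}
```

In this model, take a two-CPU system where CPU 0 runs an RT task at prio 50, CPU 1 runs a normal task at prio 120, and an RT task at prio 10 wakes on CPU 0. The old rule keeps the waking task on CPU 0, preempting the prio-50 task and forcing it to migrate. The new rule sends the waking task to CPU 1, leaving the prio-50 task and its cache undisturbed.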