path: root/kernel/sched/cputime.c
author     Oleg Nesterov <oleg@redhat.com>          2020-05-19 19:25:06 +0200
committer  Peter Zijlstra <peterz@infradead.org>    2020-06-15 14:10:00 +0200
commit     3dc167ba5729ddd2d8e3fa1841653792c295d3f1 (patch)
tree       d3348dfe2edc313740bfd0b348d91d36726f9cc1 /kernel/sched/cputime.c
parent     Linux 5.8-rc1 (diff)
sched/cputime: Improve cputime_adjust()
People report that utime and stime from /proc/<pid>/stat become very wrong when the numbers are big enough, especially if you watch these counters incrementally.

Specifically, the current implementation of: stime*rtime/total, results in a saw-tooth function on top of the desired line, where the teeth grow in size the larger the values become. IOW, it has a relative error.

The result is that, when watching incrementally as time progresses (for large values), we'll see periods of pure stime or utime increase, irrespective of the actual ratio we're striving for.

Replace scale_stime() with a math64.h helper: mul_u64_u64_div_u64() that is far more accurate. This also allows architectures to override the implementation -- for instance they can opt for the old algorithm if this new one turns out to be too expensive for them.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200519172506.GA317395@hirez.programming.kicks-ass.net
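For illustration only (not part of the commit): a minimal userspace sketch comparing the removed precision-dropping loop against the exact 64x64->128 multiply and 128/64 divide that mul_u64_u64_div_u64() performs. The helper names old_scale_stime() and exact_scale_stime(), the use of unsigned __int128 (gcc/clang on a 64-bit host), and the sample counter values are all assumptions made for this demo; with inputs large enough that stime*rtime no longer fits in 64 bits, the two results can diverge, which is the relative error described above.

/*
 * Userspace demo, not kernel code: old_scale_stime() mirrors the removed
 * kernel helper, exact_scale_stime() stands in for what
 * mul_u64_u64_div_u64() computes, using the compiler's unsigned __int128.
 */
#include <stdio.h>
#include <stdint.h>

static uint64_t old_scale_stime(uint64_t stime, uint64_t rtime, uint64_t total)
{
	for (;;) {
		if (stime > rtime) {		/* keep rtime the bigger of the two */
			uint64_t tmp = stime;
			stime = rtime;
			rtime = tmp;
		}
		if (total >> 32)		/* 'total' must fit in 32 bits */
			goto drop_precision;
		if (!(rtime >> 32))		/* rtime (and stime) fit: done */
			break;
		if (stime >> 31)		/* cannot rebalance any further */
			goto drop_precision;
		stime <<= 1;			/* grow stime, shrink rtime */
		rtime >>= 1;
		continue;
drop_precision:
		rtime >>= 1;			/* drop a bit from rtime and total */
		total >>= 1;
	}
	/* 32x32->64 multiply followed by a 64/32 divide, as in the old code */
	return (uint64_t)(uint32_t)stime * (uint32_t)rtime / (uint32_t)total;
}

static uint64_t exact_scale_stime(uint64_t stime, uint64_t rtime, uint64_t total)
{
	/* Full-width multiply and divide; no precision is dropped. */
	return (uint64_t)((unsigned __int128)stime * rtime / total);
}

int main(void)
{
	/* Counters large enough that stime * rtime overflows 64 bits. */
	uint64_t stime = 400000000000000ULL;		/* ~4.6 days in ns */
	uint64_t utime = 100000000000000000ULL;		/* ~3.2 years in ns */
	uint64_t rtime = stime + utime + 9999999ULL;

	printf("old scale_stime():           %llu\n",
	       (unsigned long long)old_scale_stime(stime, rtime, stime + utime));
	printf("mul_u64_u64_div_u64()-style: %llu\n",
	       (unsigned long long)exact_scale_stime(stime, rtime, stime + utime));
	return 0;
}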
Diffstat (limited to 'kernel/sched/cputime.c')
-rw-r--r--  kernel/sched/cputime.c  46
1 file changed, 1 insertion(+), 45 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index ff9435dee1df..5a55d2300452 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -520,50 +520,6 @@ void account_idle_ticks(unsigned long ticks)
}
/*
- * Perform (stime * rtime) / total, but avoid multiplication overflow by
- * losing precision when the numbers are big.
- */
-static u64 scale_stime(u64 stime, u64 rtime, u64 total)
-{
-	u64 scaled;
-
-	for (;;) {
-		/* Make sure "rtime" is the bigger of stime/rtime */
-		if (stime > rtime)
-			swap(rtime, stime);
-
-		/* Make sure 'total' fits in 32 bits */
-		if (total >> 32)
-			goto drop_precision;
-
-		/* Does rtime (and thus stime) fit in 32 bits? */
-		if (!(rtime >> 32))
-			break;
-
-		/* Can we just balance rtime/stime rather than dropping bits? */
-		if (stime >> 31)
-			goto drop_precision;
-
-		/* We can grow stime and shrink rtime and try to make them both fit */
-		stime <<= 1;
-		rtime >>= 1;
-		continue;
-
-drop_precision:
-		/* We drop from rtime, it has more bits than stime */
-		rtime >>= 1;
-		total >>= 1;
-	}
-
-	/*
-	 * Make sure gcc understands that this is a 32x32->64 multiply,
-	 * followed by a 64/32->64 divide.
-	 */
-	scaled = div_u64((u64) (u32) stime * (u64) (u32) rtime, (u32)total);
-	return scaled;
-}
-
-/*
* Adjust tick based cputime random precision against scheduler runtime
* accounting.
*
@@ -622,7 +578,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
goto update;
}
-	stime = scale_stime(stime, rtime, stime + utime);
+	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
update:
/*