sched: Add cluster scheduler level for x86

There are x86 CPU architectures (e.g. Jacobsville) where the L2 cache is
shared among a cluster of cores instead of being exclusive to a single
core.

To prevent oversubscription of the L2 cache, load should be balanced
between such L2 clusters, especially for tasks with no shared data.

On benchmarks such as the SPECrate mcf test, this change boosts
performance, especially on a medium-load system on Jacobsville. On a
Jacobsville that has 24 Atom cores, arranged into 6 clusters of 4 cores
each, the benchmark numbers are as follows:

 Improvement over baseline kernel for mcf_r
 copies    run time    base rate
 1         -0.1%       -0.2%
 6         25.1%       25.1%
 12        18.8%       19.0%
 24         0.3%        0.3%

So this looks pretty good. In terms of the system's task distribution,
some pretty bad clumping can be seen for the vanilla kernel without the
L2 cluster domain in the 6 and 12 copies cases. With the extra domain
for the cluster, the load does get evened out between the clusters.

Note that this patch isn't a universal win, as spreading isn't
necessarily a win, particularly for workloads that can benefit from
packing.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210924085104.44806-4-21cnbao@gmail.com
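To make the clumping argument in the commit message concrete, here is a toy user-space model of the 24-core, 6-cluster Jacobsville layout. It is not the kernel's load balancer: `spread_tasks()`, `pack_tasks()` and `max_load()` are hypothetical helpers, and `pack_tasks()` models a worst-case clumped placement as an assumption, not the measured behaviour of the vanilla kernel.

```c
#define NR_CPUS          24  /* Jacobsville example from the commit message */
#define CPUS_PER_CLUSTER 4   /* cores sharing one L2 */
#define NR_CLUSTERS      (NR_CPUS / CPUS_PER_CLUSTER)

/* Hypothetical: put each new task on the currently least-loaded cluster. */
static void spread_tasks(int ntasks, int load[NR_CLUSTERS])
{
	for (int c = 0; c < NR_CLUSTERS; c++)
		load[c] = 0;

	for (int t = 0; t < ntasks; t++) {
		int best = 0;

		for (int c = 1; c < NR_CLUSTERS; c++)
			if (load[c] < load[best])
				best = c;
		load[best]++;
	}
}

/* Hypothetical: fill one cluster completely before using the next one. */
static void pack_tasks(int ntasks, int load[NR_CLUSTERS])
{
	for (int c = 0; c < NR_CLUSTERS; c++)
		load[c] = 0;

	for (int t = 0; t < ntasks; t++)
		load[(t / CPUS_PER_CLUSTER) % NR_CLUSTERS]++;
}

/* Largest number of tasks competing for a single L2 cache. */
static int max_load(const int load[NR_CLUSTERS])
{
	int max = 0;

	for (int c = 0; c < NR_CLUSTERS; c++)
		if (load[c] > max)
			max = load[c];
	return max;
}
```

In this model, 6 tasks share an L2 four ways when packed but have one L2 each when spread, while 24 tasks load every cluster equally either way. That is at least consistent with the table, where the gain is large at 6 and 12 copies and close to zero at 24.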
author: Tim Chen <tim.c.chen@linux.intel.com> 2021-09-24 20:51:04 +1200
committer: Peter Zijlstra <peterz@infradead.org> 2021-10-15 11:25:16 +0200
commit: 66558b730f2533cc2bf2b74d51f5f80b81e2bad0 (patch)
tree: ce653b4f61e022621c7affef860d50cad17aa218 /arch/x86/include/asm/topology.h
parent: sched: Add cluster scheduler level in core and related Kconfig for ARM64 (diff)
1 files changed, 3 insertions, 0 deletions
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 9239399e5491..cc164777e661 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -103,6 +103,7 @@ static inline void setup_node_to_cpumask_map(void) { }
 #include <asm-generic/topology.h>
 
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
+extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
 #define topology_logical_package_id(cpu)	(cpu_data(cpu).logical_proc_id)
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).phys_proc_id)
@@ -113,7 +114,9 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
 extern unsigned int __max_die_per_package;
 
 #ifdef CONFIG_SMP
+#define topology_cluster_id(cpu)		(per_cpu(cpu_l2c_id, cpu))
 #define topology_die_cpumask(cpu)		(per_cpu(cpu_die_map, cpu))
+#define topology_cluster_cpumask(cpu)		(cpu_clustergroup_mask(cpu))
 #define topology_core_cpumask(cpu)		(per_cpu(cpu_core_map, cpu))
 #define topology_sibling_cpumask(cpu)		(per_cpu(cpu_sibling_map, cpu))
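The two new macros give each CPU a cluster ID and a mask of the CPUs that share its L2. The following is a minimal user-space sketch of that relationship, not kernel code: the `toy_*` names are hypothetical, the mask is a plain 32-bit integer rather than a `struct cpumask`, and it assumes the 24-CPU, 4-per-cluster layout from the commit message with contiguously numbered CPUs in each cluster, which real systems need not follow.

```c
#include <stdint.h>

#define TOY_NR_CPUS          24  /* assumed: Jacobsville example */
#define TOY_CPUS_PER_CLUSTER 4   /* assumed: cores sharing one L2 */

/* Toy stand-in for topology_cluster_id(): assumes contiguous numbering. */
static inline int toy_cluster_id(int cpu)
{
	return cpu / TOY_CPUS_PER_CLUSTER;
}

/*
 * Toy stand-in for topology_cluster_cpumask(): bit n is set when CPU n
 * belongs to the same cluster as @cpu.
 */
static inline uint32_t toy_cluster_cpumask(int cpu)
{
	uint32_t one_cluster = (1u << TOY_CPUS_PER_CLUSTER) - 1;

	return one_cluster << (toy_cluster_id(cpu) * TOY_CPUS_PER_CLUSTER);
}

/* Non-zero when CPUs @a and @b would share an L2 in this toy layout. */
static inline int toy_share_l2(int a, int b)
{
	return (toy_cluster_cpumask(a) >> b) & 1u;
}
```

For example, `toy_cluster_cpumask(5)` yields `0xF0` (CPUs 4 to 7), so `toy_share_l2(4, 7)` is true while `toy_share_l2(3, 4)` is false. In the kernel, the ID comes from the per-CPU `cpu_l2c_id` variable and the mask from `cpu_clustergroup_mask()`, as the diff shows.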