aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-05-27 16:48:47 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-05-27 16:48:47 -0700
commitc89756bcf406af313d191cfe3709e7c175c5b0cd (patch)
tree46259271bfd32051a26a9c5f26c455960ffbdf51 /include
parentMerge tag 'acpi-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm (diff)
parentMerge branch 'pm-tools' (diff)
downloadwireguard-linux-c89756bcf406af313d191cfe3709e7c175c5b0cd.tar.xz
wireguard-linux-c89756bcf406af313d191cfe3709e7c175c5b0cd.zip
Merge tag 'pm-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki: "Once again, the changes are dominated by cpufreq updates, but this time the majority of them are cpufreq core changes, mostly related to the introduction of policy locking guards and __free() usage, and fixes related to boost handling. Still, there is also a significant update of the intel_pstate driver making it register an energy model when running on a hybrid platform which is used for enabling energy-aware scheduling (EAS) if the driver operates in the passive mode (and schedutil is used as the cpufreq governor for all CPUs which is the passive mode default). There are some amd-pstate driver updates too, for a good measure, including the "Requested CPU Min frequency" BIOS option support and new online/offline callbacks. In the cpuidle space, the most significant change is the addition of a C1 demotion on/off sysfs knob to intel_idle which should help some users to configure their systems more precisely. There is also the conversion of the PSCI cpuidle driver to a faux device one and there are two small updates of cpuidle governors. Device power management is also modified quite a bit, especially the handling of devices with asynchronous suspend and resume enabled during system transitions. They are now going to be handled more asynchronously during suspend transitions and somewhat less aggressively during resume transitions. Apart from the above, the operating performance points (OPP) library is now going to use mutex locking guards and scope-based cleanup helpers and there is the usual bunch of assorted fixes and code cleanups. Specifics: - Fix potential division-by-zero error in em_compute_costs() (Yaxiong Tian) - Fix typos in energy model documentation and example driver code (Moon Hee Lee, Atul Kumar Pant) - Rearrange the energy model management code and add a new function for adjusting a CPU energy model after adjusting the capacity of the given CPU to it (Rafael Wysocki) - Refactor cpufreq_online(), add and use cpufreq policy locking guards, use __free() in policy reference counting, and clean up core cpufreq code on top of that (Rafael Wysocki) - Fix boost handling on CPU suspend/resume and sysfs updates (Viresh Kumar) - Fix des_perf clamping with max_perf in amd_pstate_update() (Dhananjay Ugwekar) - Add offline, online and suspend callbacks to the amd-pstate driver, rename and use the existing amd_pstate_epp callbacks in it (Dhananjay Ugwekar) - Add support for the "Requested CPU Min frequency" BIOS option to the amd-pstate driver (Dhananjay Ugwekar) - Reset amd-pstate driver mode after running selftests (Swapnil Sapkal) - Avoid shadowing ret in amd_pstate_ut_check_driver() (Nathan Chancellor) - Add helper for governor checks to the schedutil cpufreq governor and move cpufreq-specific EAS checks to cpufreq (Rafael Wysocki) - Populate the cpu_capacity sysfs entries from the intel_pstate driver after registering asym capacity support (Ricardo Neri) - Add support for enabling Energy-aware scheduling (EAS) to the intel_pstate driver when operating in the passive mode on a hybrid platform (Rafael Wysocki) - Drop redundant cpus_read_lock() from store_local_boost() in the cpufreq core (Seyediman Seyedarab) - Replace sscanf() with kstrtouint() in the cpufreq code and use a symbol instead of a raw number in it (Bowen Yu) - Add support for autonomous CPU performance state selection to the CPPC cpufreq driver (Lifeng Zheng) - OPP: Add dev_pm_opp_set_level() (Praveen Talari) - Introduce scope-based cleanup headers and mutex locking guards in OPP core (Viresh Kumar) - Switch OPP to use kmemdup_array() (Zhang Enpei) - Optimize bucket assignment when next_timer_ns equals KTIME_MAX in the menu cpuidle governor (Zhongqiu Han) - Convert the cpuidle PSCI driver to a faux device one (Sudeep Holla) - Add C1 demotion on/off sysfs knob to the intel_idle driver (Artem Bityutskiy) - Fix typos in two comments in the teo cpuidle governor (Atul Kumar Pant) - Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja Kalla) - Move debug runtime PM attributes to runtime_attrs[] (Rafael Wysocki) - Add new devm_ functions for enabling runtime PM and runtime PM reference counting (Bence Csókás) - Remove size arguments from strscpy() calls in the hibernation core code (Thorsten Blum) - Adjust the handling of devices with asynchronous suspend enabled during system suspend and resume to start resuming them immediately after resuming their parents and to start suspending such a device immediately after suspending its first child (Rafael Wysocki) - Adjust messages printed during tasks freezing to avoid using pr_cont() (Andrew Sayers, Paul Menzel) - Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan Zhang) - Add missing wakeup source attribute relax_count to sysfs and remove the space character at the end ofi the string produced by pm_show_wakelocks() (Zijun Hu) - Add configurable pm_test delay for hibernation (Zihuan Zhang) - Disable asynchronous suspend in ucsi_ccg_probe() to prevent the cypd4226 device on Tegra boards from suspending prematurely (Jon Hunter) - Unbreak printing PM debug messages during hibernation and clean up some related code (Rafael Wysocki) - Add a systemd service to run cpupower and change cpupower binding's Makefile to use -lcpupower (John B. Wyatt IV, Francesco Poli)" * tag 'pm-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits) cpufreq: CPPC: Add support for autonomous selection cpufreq: Update sscanf() to kstrtouint() cpufreq: Replace magic number OPP: switch to use kmemdup_array() PM: freezer: Rewrite restarting tasks log to remove stray *done.* PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn() cpufreq: drop redundant cpus_read_lock() from store_local_boost() cpupower: do not install files to /etc/default/ cpupower: do not call systemctl at install time cpupower: do not write DESTDIR to cpupower.service PM: sleep: Introduce pm_sleep_transition_in_progress() cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver() cpufreq: intel_pstate: Document hybrid processor support cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache cpufreq: intel_pstate: EAS support for hybrid platforms PM: EM: Introduce em_adjust_cpu_capacity() PM: EM: Move CPU capacity check to em_adjust_new_capacity() PM: EM: Documentation: Fix typos in example driver code cpufreq: Drop policy locking from cpufreq_policy_is_good_for_eas() PM: sleep: Introduce pm_suspend_in_progress() ...
Diffstat (limited to 'include')
-rw-r--r--include/linux/arch_topology.h8
-rw-r--r--include/linux/cpufreq.h22
-rw-r--r--include/linux/energy_model.h2
-rw-r--r--include/linux/pm_opp.h32
-rw-r--r--include/linux/suspend.h9
-rw-r--r--include/linux/topology.h9
6 files changed, 66 insertions, 16 deletions
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 2222e8b03ff4..d72d6e5aa200 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -14,14 +14,6 @@ int topology_update_cpu_topology(void);
struct device_node;
bool topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu);
-DECLARE_PER_CPU(unsigned long, cpu_scale);
-
-static inline unsigned long topology_get_cpu_scale(int cpu)
-{
- return per_cpu(cpu_scale, cpu);
-}
-
-void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
DECLARE_PER_CPU(unsigned long, capacity_freq_ref);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 7a5b391dcc01..95f3807c8c55 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -170,6 +170,12 @@ struct cpufreq_policy {
struct notifier_block nb_max;
};
+DEFINE_GUARD(cpufreq_policy_write, struct cpufreq_policy *,
+ down_write(&_T->rwsem), up_write(&_T->rwsem))
+
+DEFINE_GUARD(cpufreq_policy_read, struct cpufreq_policy *,
+ down_read(&_T->rwsem), up_read(&_T->rwsem))
+
/*
* Used for passing new cpufreq policy data to the cpufreq driver's ->verify()
* callback for sanitization. That callback is only expected to modify the min
@@ -235,9 +241,6 @@ void disable_cpufreq(void);
u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy);
-struct cpufreq_policy *cpufreq_cpu_acquire(unsigned int cpu);
-void cpufreq_cpu_release(struct cpufreq_policy *policy);
-int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
void refresh_frequency_limits(struct cpufreq_policy *policy);
void cpufreq_update_policy(unsigned int cpu);
void cpufreq_update_limits(unsigned int cpu);
@@ -395,7 +398,7 @@ struct cpufreq_driver {
unsigned int (*get)(unsigned int cpu);
/* Called to update policy limits on firmware notifications. */
- void (*update_limits)(unsigned int cpu);
+ void (*update_limits)(struct cpufreq_policy *policy);
/* optional */
int (*bios_limit)(int cpu, unsigned int *limit);
@@ -647,6 +650,15 @@ module_exit(__governor##_exit)
struct cpufreq_governor *cpufreq_default_governor(void);
struct cpufreq_governor *cpufreq_fallback_governor(void);
+#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
+bool sugov_is_governor(struct cpufreq_policy *policy);
+#else
+static inline bool sugov_is_governor(struct cpufreq_policy *policy)
+{
+ return false;
+}
+#endif
+
static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy)
{
if (policy->max < policy->cur)
@@ -1225,6 +1237,8 @@ void cpufreq_generic_init(struct cpufreq_policy *policy,
struct cpufreq_frequency_table *table,
unsigned int transition_latency);
+bool cpufreq_ready_for_eas(const struct cpumask *cpu_mask);
+
static inline void cpufreq_register_em_with_opp(struct cpufreq_policy *policy)
{
dev_pm_opp_of_register_em(get_cpu_device(policy->cpu),
diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index d8eabbf86a5b..7fa1eb3cc823 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -179,6 +179,7 @@ int em_dev_compute_costs(struct device *dev, struct em_perf_state *table,
int em_dev_update_chip_binning(struct device *dev);
int em_update_performance_limits(struct em_perf_domain *pd,
unsigned long freq_min_khz, unsigned long freq_max_khz);
+void em_adjust_cpu_capacity(unsigned int cpu);
void em_rebuild_sched_domains(void);
/**
@@ -403,6 +404,7 @@ int em_update_performance_limits(struct em_perf_domain *pd,
{
return -EINVAL;
}
+static inline void em_adjust_cpu_capacity(unsigned int cpu) {}
static inline void em_rebuild_sched_domains(void) {}
#endif
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index c247317aae38..cf477beae4bb 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -11,6 +11,7 @@
#ifndef __LINUX_OPP_H__
#define __LINUX_OPP_H__
+#include <linux/cleanup.h>
#include <linux/energy_model.h>
#include <linux/err.h>
#include <linux/notifier.h>
@@ -100,7 +101,7 @@ struct dev_pm_opp_data {
#if defined(CONFIG_PM_OPP)
struct opp_table *dev_pm_opp_get_opp_table(struct device *dev);
-void dev_pm_opp_get_opp_table_ref(struct opp_table *opp_table);
+struct opp_table *dev_pm_opp_get_opp_table_ref(struct opp_table *opp_table);
void dev_pm_opp_put_opp_table(struct opp_table *opp_table);
unsigned long dev_pm_opp_get_bw(struct dev_pm_opp *opp, bool peak, int index);
@@ -161,7 +162,7 @@ struct dev_pm_opp *dev_pm_opp_find_bw_ceil(struct device *dev,
struct dev_pm_opp *dev_pm_opp_find_bw_floor(struct device *dev,
unsigned int *bw, int index);
-void dev_pm_opp_get(struct dev_pm_opp *opp);
+struct dev_pm_opp *dev_pm_opp_get(struct dev_pm_opp *opp);
void dev_pm_opp_put(struct dev_pm_opp *opp);
int dev_pm_opp_add_dynamic(struct device *dev, struct dev_pm_opp_data *opp);
@@ -196,6 +197,7 @@ int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
void dev_pm_opp_remove_table(struct device *dev);
void dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask);
int dev_pm_opp_sync_regulators(struct device *dev);
+
#else
static inline struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
{
@@ -207,7 +209,10 @@ static inline struct opp_table *dev_pm_opp_get_opp_table_indexed(struct device *
return ERR_PTR(-EOPNOTSUPP);
}
-static inline void dev_pm_opp_get_opp_table_ref(struct opp_table *opp_table) {}
+static inline struct opp_table *dev_pm_opp_get_opp_table_ref(struct opp_table *opp_table)
+{
+ return opp_table;
+}
static inline void dev_pm_opp_put_opp_table(struct opp_table *opp_table) {}
@@ -345,7 +350,10 @@ static inline struct dev_pm_opp *dev_pm_opp_find_bw_floor(struct device *dev,
return ERR_PTR(-EOPNOTSUPP);
}
-static inline void dev_pm_opp_get(struct dev_pm_opp *opp) {}
+static inline struct dev_pm_opp *dev_pm_opp_get(struct dev_pm_opp *opp)
+{
+ return opp;
+}
static inline void dev_pm_opp_put(struct dev_pm_opp *opp) {}
@@ -573,6 +581,12 @@ static inline int dev_pm_opp_of_find_icc_paths(struct device *dev, struct opp_ta
}
#endif
+/* Scope based cleanup macro for OPP reference counting */
+DEFINE_FREE(put_opp, struct dev_pm_opp *, if (!IS_ERR_OR_NULL(_T)) dev_pm_opp_put(_T))
+
+/* Scope based cleanup macro for OPP table reference counting */
+DEFINE_FREE(put_opp_table, struct opp_table *, if (!IS_ERR_OR_NULL(_T)) dev_pm_opp_put_opp_table(_T))
+
/* OPP Configuration helpers */
static inline int dev_pm_opp_add(struct device *dev, unsigned long freq,
@@ -704,4 +718,14 @@ static inline unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
return dev_pm_opp_get_freq_indexed(opp, 0);
}
+static inline int dev_pm_opp_set_level(struct device *dev, unsigned int level)
+{
+ struct dev_pm_opp *opp __free(put_opp) = dev_pm_opp_find_level_exact(dev, level);
+
+ if (IS_ERR(opp))
+ return PTR_ERR(opp);
+
+ return dev_pm_opp_set_opp(dev, opp);
+}
+
#endif /* __LINUX_OPP_H__ */
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index da6ebca3ff77..b1c76c8f2c82 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -298,6 +298,11 @@ static inline void s2idle_set_ops(const struct platform_s2idle_ops *ops) {}
static inline void s2idle_wake(void) {}
#endif /* !CONFIG_SUSPEND */
+static inline bool pm_suspend_in_progress(void)
+{
+ return pm_suspend_target_state != PM_SUSPEND_ON;
+}
+
/* struct pbe is used for creating lists of pages that should be restored
* atomically during the resume from disk, because the page frames they have
* occupied before the suspend are in use.
@@ -470,6 +475,8 @@ extern void pm_print_active_wakeup_sources(void);
extern unsigned int lock_system_sleep(void);
extern void unlock_system_sleep(unsigned int);
+extern bool pm_sleep_transition_in_progress(void);
+
#else /* !CONFIG_PM_SLEEP */
static inline int register_pm_notifier(struct notifier_block *nb)
@@ -498,6 +505,8 @@ static inline void pm_system_irq_wakeup(unsigned int irq_number) {}
static inline unsigned int lock_system_sleep(void) { return 0; }
static inline void unlock_system_sleep(unsigned int flags) {}
+static inline bool pm_sleep_transition_in_progress(void) { return false; }
+
#endif /* !CONFIG_PM_SLEEP */
#ifdef CONFIG_PM_SLEEP_DEBUG
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 24e715f0f6d2..cd6b4bdc9cfd 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -332,4 +332,13 @@ sched_numa_hop_mask(unsigned int node, unsigned int hops)
!IS_ERR_OR_NULL(mask); \
__hops++)
+DECLARE_PER_CPU(unsigned long, cpu_scale);
+
+static inline unsigned long topology_get_cpu_scale(int cpu)
+{
+ return per_cpu(cpu_scale, cpu);
+}
+
+void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
+
#endif /* _LINUX_TOPOLOGY_H */