aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2021-02-20 21:42:18 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2021-02-20 21:42:18 -0800
commit02f9fc286e039d0bef7284fb1200ee755b525bde (patch)
tree2ce396501ee2590ca7e37c3ea6f058d164b41474 /Documentation
parentMerge tag 'staging-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging (diff)
parentMerge branches 'powercap' and 'pm-misc' (diff)
downloadlinux-dev-02f9fc286e039d0bef7284fb1200ee755b525bde.tar.xz
linux-dev-02f9fc286e039d0bef7284fb1200ee755b525bde.zip
Merge tag 'pm-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki: "These add a new power capping facility allowing aggregate power constraints to be applied to sets of devices in a distributed manner, add a new CPU ID to the RAPL power capping driver and improve it, drop a cpufreq driver belonging to a platform that is not supported any more, drop two redundant cpufreq driver flags, update cpufreq drivers (intel_pstate, brcmstb-avs, qcom-hw), update the operating performance points (OPP) framework (code cleanups, new helpers, devfreq-related modifications), clean up devfreq, extend the PM clock layer, update the cpupower utility and make assorted janitorial changes. Specifics: - Add new power capping facility called DTPM (Dynamic Thermal Power Management), based on the existing power capping framework, to allow aggregate power constraints to be applied to sets of devices in a distributed manner, along with a CPU backend driver based on the Energy Model (Daniel Lezcano, Dan Carpenter, Colin Ian King). - Add AlderLake Mobile support to the Intel RAPL power capping driver and make it use the topology interface when laying out the system topology (Zhang Rui, Yunfeng Ye). - Drop the cpufreq tango driver belonging to a platform that is not supported any more (Arnd Bergmann). - Drop the redundant CPUFREQ_STICKY and CPUFREQ_PM_NO_WARN cpufreq driver flags (Viresh Kumar). - Update cpufreq drivers: * Fix max CPU frequency discovery in the intel_pstate driver and make janitorial changes in it (Chen Yu, Rafael Wysocki, Nigel Christian). * Fix resource leaks in the brcmstb-avs-cpufreq driver (Christophe JAILLET). * Make the tegra20 driver use the resource-managed API (Dmitry Osipenko). * Enable boost support in the qcom-hw driver (Shawn Guo). - Update the operating performance points (OPP) framework: * Clean up the OPP core (Dmitry Osipenko, Viresh Kumar). * Extend the OPP API by adding new helpers to it (Dmitry Osipenko, Viresh Kumar). * Allow required OPPs to be used for devfreq devices and update the devfreq governor code accordingly (Saravana Kannan). * Prepare the framework for introducing new dev_pm_opp_set_opp() helper (Viresh Kumar). * Drop dev_pm_opp_set_bw() and update related drivers (Viresh Kumar). * Allow lazy linking of required-OPPs (Viresh Kumar). - Simplify and clean up devfreq somewhat (Lukasz Luba, Yang Li, Pierre Kuo). - Update the generic power domains (genpd) framework: * Use device's next wakeup to determine domain idle state (Lina Iyer). * Improve initialization and debug (Dmitry Osipenko). * Simplify computations (Abaci Team). - Make janitorial changes in the core code handling system sleep and PM-runtime (Bhaskar Chowdhury, Bjorn Helgaas, Rikard Falkeborn, Zqiang). - Update the MAINTAINERS entry for the exynos cpuidle driver and drop DEBUG definition from intel_idle (Krzysztof Kozlowski, Tom Rix). - Extend the PM clock layer to cover clocks that must sleep (Nicolas Pitre). - Update the cpupower utility: * Update cpupower command, add support for AMD family 0x19 and clean up the code to remove many of the family checks to make future family updates easier (Nathan Fontenot, Robert Richter). * Add Makefile dependencies for install targets to allow building cpupower in parallel rather than serially (Ivan Babrou). - Make janitorial changes in power management Kconfig (Lukasz Luba)" * tag 'pm-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits) MAINTAINERS: cpuidle: exynos: include header in file pattern powercap: intel_rapl: Use topology interface in rapl_init_domains() powercap: intel_rapl: Use topology interface in rapl_add_package() PM: sleep: Constify static struct attribute_group PM: Kconfig: remove unneeded "default n" options PM: EM: update Kconfig description and drop "default n" option cpufreq: Remove unused flag CPUFREQ_PM_NO_WARN cpufreq: Remove CPUFREQ_STICKY flag PM / devfreq: Add required OPPs support to passive governor PM / devfreq: Cache OPP table reference in devfreq OPP: Add function to look up required OPP's for a given OPP PM / devfreq: rk3399_dmc: Remove unneeded semicolon opp: Replace ENOTSUPP with EOPNOTSUPP opp: Fix "foo * bar" should be "foo *bar" opp: Don't ignore clk_get() errors other than -ENOENT opp: Update bandwidth requirements based on scaling up/down opp: Allow lazy-linking of required-opps opp: Remove dev_pm_opp_set_bw() devfreq: tegra30: Migrate to dev_pm_opp_set_opp() drm: msm: Migrate to dev_pm_opp_set_opp() ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/power/index.rst1
-rw-r--r--Documentation/power/powercap/dtpm.rst212
-rw-r--r--Documentation/power/runtime_pm.rst14
3 files changed, 220 insertions, 7 deletions
diff --git a/Documentation/power/index.rst b/Documentation/power/index.rst
index ced8a8007434..a0f5244fb427 100644
--- a/Documentation/power/index.rst
+++ b/Documentation/power/index.rst
@@ -30,6 +30,7 @@ Power Management
userland-swsusp
powercap/powercap
+ powercap/dtpm
regulator/consumer
regulator/design
diff --git a/Documentation/power/powercap/dtpm.rst b/Documentation/power/powercap/dtpm.rst
new file mode 100644
index 000000000000..a38dee3d815b
--- /dev/null
+++ b/Documentation/power/powercap/dtpm.rst
@@ -0,0 +1,212 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================================
+Dynamic Thermal Power Management framework
+==========================================
+
+On the embedded world, the complexity of the SoC leads to an
+increasing number of hotspots which need to be monitored and mitigated
+as a whole in order to prevent the temperature to go above the
+normative and legally stated 'skin temperature'.
+
+Another aspect is to sustain the performance for a given power budget,
+for example virtual reality where the user can feel dizziness if the
+performance is capped while a big CPU is processing something else. Or
+reduce the battery charging because the dissipated power is too high
+compared with the power consumed by other devices.
+
+The user space is the most adequate place to dynamically act on the
+different devices by limiting their power given an application
+profile: it has the knowledge of the platform.
+
+The Dynamic Thermal Power Management (DTPM) is a technique acting on
+the device power by limiting and/or balancing a power budget among
+different devices.
+
+The DTPM framework provides an unified interface to act on the
+device power.
+
+Overview
+========
+
+The DTPM framework relies on the powercap framework to create the
+powercap entries in the sysfs directory and implement the backend
+driver to do the connection with the power manageable device.
+
+The DTPM is a tree representation describing the power constraints
+shared between devices, not their physical positions.
+
+The nodes of the tree are a virtual description aggregating the power
+characteristics of the children nodes and their power limitations.
+
+The leaves of the tree are the real power manageable devices.
+
+For instance::
+
+ SoC
+ |
+ `-- pkg
+ |
+ |-- pd0 (cpu0-3)
+ |
+ `-- pd1 (cpu4-5)
+
+The pkg power will be the sum of pd0 and pd1 power numbers::
+
+ SoC (400mW - 3100mW)
+ |
+ `-- pkg (400mW - 3100mW)
+ |
+ |-- pd0 (100mW - 700mW)
+ |
+ `-- pd1 (300mW - 2400mW)
+
+When the nodes are inserted in the tree, their power characteristics are propagated to the parents::
+
+ SoC (600mW - 5900mW)
+ |
+ |-- pkg (400mW - 3100mW)
+ | |
+ | |-- pd0 (100mW - 700mW)
+ | |
+ | `-- pd1 (300mW - 2400mW)
+ |
+ `-- pd2 (200mW - 2800mW)
+
+Each node have a weight on a 2^10 basis reflecting the percentage of power consumption along the siblings::
+
+ SoC (w=1024)
+ |
+ |-- pkg (w=538)
+ | |
+ | |-- pd0 (w=231)
+ | |
+ | `-- pd1 (w=794)
+ |
+ `-- pd2 (w=486)
+
+ Note the sum of weights at the same level are equal to 1024.
+
+When a power limitation is applied to a node, then it is distributed along the children given their weights. For example, if we set a power limitation of 3200mW at the 'SoC' root node, the resulting tree will be::
+
+ SoC (w=1024) <--- power_limit = 3200mW
+ |
+ |-- pkg (w=538) --> power_limit = 1681mW
+ | |
+ | |-- pd0 (w=231) --> power_limit = 378mW
+ | |
+ | `-- pd1 (w=794) --> power_limit = 1303mW
+ |
+ `-- pd2 (w=486) --> power_limit = 1519mW
+
+
+Flat description
+----------------
+
+A root node is created and it is the parent of all the nodes. This
+description is the simplest one and it is supposed to give to user
+space a flat representation of all the devices supporting the power
+limitation without any power limitation distribution.
+
+Hierarchical description
+------------------------
+
+The different devices supporting the power limitation are represented
+hierarchically. There is one root node, all intermediate nodes are
+grouping the child nodes which can be intermediate nodes also or real
+devices.
+
+The intermediate nodes aggregate the power information and allows to
+set the power limit given the weight of the nodes.
+
+User space API
+==============
+
+As stated in the overview, the DTPM framework is built on top of the
+powercap framework. Thus the sysfs interface is the same, please refer
+to the powercap documentation for further details.
+
+ * power_uw: Instantaneous power consumption. If the node is an
+ intermediate node, then the power consumption will be the sum of all
+ children power consumption.
+
+ * max_power_range_uw: The power range resulting of the maximum power
+ minus the minimum power.
+
+ * name: The name of the node. This is implementation dependent. Even
+ if it is not recommended for the user space, several nodes can have
+ the same name.
+
+ * constraint_X_name: The name of the constraint.
+
+ * constraint_X_max_power_uw: The maximum power limit to be applicable
+ to the node.
+
+ * constraint_X_power_limit_uw: The power limit to be applied to the
+ node. If the value contained in constraint_X_max_power_uw is set,
+ the constraint will be removed.
+
+ * constraint_X_time_window_us: The meaning of this file will depend
+ on the constraint number.
+
+Constraints
+-----------
+
+ * Constraint 0: The power limitation is immediately applied, without
+ limitation in time.
+
+Kernel API
+==========
+
+Overview
+--------
+
+The DTPM framework has no power limiting backend support. It is
+generic and provides a set of API to let the different drivers to
+implement the backend part for the power limitation and create the
+power constraints tree.
+
+It is up to the platform to provide the initialization function to
+allocate and link the different nodes of the tree.
+
+A special macro has the role of declaring a node and the corresponding
+initialization function via a description structure. This one contains
+an optional parent field allowing to hook different devices to an
+already existing tree at boot time.
+
+For instance::
+
+ struct dtpm_descr my_descr = {
+ .name = "my_name",
+ .init = my_init_func,
+ };
+
+ DTPM_DECLARE(my_descr);
+
+The nodes of the DTPM tree are described with dtpm structure. The
+steps to add a new power limitable device is done in three steps:
+
+ * Allocate the dtpm node
+ * Set the power number of the dtpm node
+ * Register the dtpm node
+
+The registration of the dtpm node is done with the powercap
+ops. Basically, it must implements the callbacks to get and set the
+power and the limit.
+
+Alternatively, if the node to be inserted is an intermediate one, then
+a simple function to insert it as a future parent is available.
+
+If a device has its power characteristics changing, then the tree must
+be updated with the new power numbers and weights.
+
+Nomenclature
+------------
+
+ * dtpm_alloc() : Allocate and initialize a dtpm structure
+
+ * dtpm_register() : Add the dtpm node to the tree
+
+ * dtpm_unregister() : Remove the dtpm node from the tree
+
+ * dtpm_update_power() : Update the power characteristics of the dtpm node
diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
index 0553008b6279..d9c777b18f7a 100644
--- a/Documentation/power/runtime_pm.rst
+++ b/Documentation/power/runtime_pm.rst
@@ -579,7 +579,7 @@ should be used. Of course, for this purpose the device's runtime PM has to be
enabled earlier by calling pm_runtime_enable().
Note, if the device may execute pm_runtime calls during the probe (such as
-if it is registers with a subsystem that may call back in) then the
+if it is registered with a subsystem that may call back in) then the
pm_runtime_get_sync() call paired with a pm_runtime_put() call will be
appropriate to ensure that the device is not put back to sleep during the
probe. This can happen with systems such as the network device layer.
@@ -587,11 +587,11 @@ probe. This can happen with systems such as the network device layer.
It may be desirable to suspend the device once ->probe() has finished.
Therefore the driver core uses the asynchronous pm_request_idle() to submit a
request to execute the subsystem-level idle callback for the device at that
-time. A driver that makes use of the runtime autosuspend feature, may want to
+time. A driver that makes use of the runtime autosuspend feature may want to
update the last busy mark before returning from ->probe().
Moreover, the driver core prevents runtime PM callbacks from racing with the bus
-notifier callback in __device_release_driver(), which is necessary, because the
+notifier callback in __device_release_driver(), which is necessary because the
notifier is used by some subsystems to carry out operations affecting the
runtime PM functionality. It does so by calling pm_runtime_get_sync() before
driver_sysfs_remove() and the BUS_NOTIFY_UNBIND_DRIVER notifications. This
@@ -603,7 +603,7 @@ calling pm_runtime_suspend() from their ->remove() routines, the driver core
executes pm_runtime_put_sync() after running the BUS_NOTIFY_UNBIND_DRIVER
notifications in __device_release_driver(). This requires bus types and
drivers to make their ->remove() callbacks avoid races with runtime PM directly,
-but also it allows of more flexibility in the handling of devices during the
+but it also allows more flexibility in the handling of devices during the
removal of their drivers.
Drivers in ->remove() callback should undo the runtime PM changes done
@@ -693,7 +693,7 @@ that the device appears to be runtime-suspended and its state is fine, so it
may be left in runtime suspend provided that all of its descendants are also
left in runtime suspend. If that happens, the PM core will not execute any
system suspend and resume callbacks for all of those devices, except for the
-complete callback, which is then entirely responsible for handling the device
+.complete() callback, which is then entirely responsible for handling the device
as appropriate. This only applies to system suspend transitions that are not
related to hibernation (see Documentation/driver-api/pm/devices.rst for more
information).
@@ -706,7 +706,7 @@ out the following operations:
right before executing the subsystem-level .prepare() callback for it and
pm_runtime_barrier() is called for every device right before executing the
subsystem-level .suspend() callback for it. In addition to that the PM core
- calls __pm_runtime_disable() with 'false' as the second argument for every
+ calls __pm_runtime_disable() with 'false' as the second argument for every
device right before executing the subsystem-level .suspend_late() callback
for it.
@@ -783,7 +783,7 @@ driver/base/power/generic_ops.c:
`int pm_generic_restore_noirq(struct device *dev);`
- invoke the ->restore_noirq() callback provided by the device's driver
-These functions are the defaults used by the PM core, if a subsystem doesn't
+These functions are the defaults used by the PM core if a subsystem doesn't
provide its own callbacks for ->runtime_idle(), ->runtime_suspend(),
->runtime_resume(), ->suspend(), ->suspend_noirq(), ->resume(),
->resume_noirq(), ->freeze(), ->freeze_noirq(), ->thaw(), ->thaw_noirq(),