aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-class-devfreq54
-rw-r--r--Documentation/devicetree/bindings/devfreq/exynos-bus.txt71
-rw-r--r--Documentation/devicetree/bindings/opp/opp.txt54
-rw-r--r--Documentation/driver-api/thermal/power_allocator.rst12
-rw-r--r--Documentation/power/energy-model.rst30
-rw-r--r--Documentation/scheduler/sched-energy.rst5
6 files changed, 195 insertions, 31 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-devfreq b/Documentation/ABI/testing/sysfs-class-devfreq
index b8ebff4b1c4c..386bc230a33d 100644
--- a/Documentation/ABI/testing/sysfs-class-devfreq
+++ b/Documentation/ABI/testing/sysfs-class-devfreq
@@ -37,20 +37,6 @@ Description:
The /sys/class/devfreq/.../target_freq shows the next governor
predicted target frequency of the corresponding devfreq object.
-What: /sys/class/devfreq/.../polling_interval
-Date: September 2011
-Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
-Description:
- The /sys/class/devfreq/.../polling_interval shows and sets
- the requested polling interval of the corresponding devfreq
- object. The values are represented in ms. If the value is
- less than 1 jiffy, it is considered to be 0, which means
- no polling. This value is meaningless if the governor is
- not polling; thus. If the governor is not using
- devfreq-provided central polling
- (/sys/class/devfreq/.../central_polling is 0), this value
- may be useless.
-
What: /sys/class/devfreq/.../trans_stat
Date: October 2012
Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
@@ -66,14 +52,6 @@ Description:
echo 0 > /sys/class/devfreq/.../trans_stat
-What: /sys/class/devfreq/.../userspace/set_freq
-Date: September 2011
-Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
-Description:
- The /sys/class/devfreq/.../userspace/set_freq shows and
- sets the requested frequency for the devfreq object if
- userspace governor is in effect.
-
What: /sys/class/devfreq/.../available_frequencies
Date: October 2012
Contact: Nishanth Menon <nm@ti.com>
@@ -110,6 +88,35 @@ Description:
The max_freq overrides min_freq because max_freq may be
used to throttle devices to avoid overheating.
+What: /sys/class/devfreq/.../polling_interval
+Date: September 2011
+Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+ The /sys/class/devfreq/.../polling_interval shows and sets
+ the requested polling interval of the corresponding devfreq
+ object. The values are represented in ms. If the value is
+ less than 1 jiffy, it is considered to be 0, which means
+ no polling. This value is meaningless if the governor is
+ not polling; thus. If the governor is not using
+ devfreq-provided central polling
+ (/sys/class/devfreq/.../central_polling is 0), this value
+ may be useless.
+
+ A list of governors that support the node:
+ - simple_ondmenad
+ - tegra_actmon
+
+What: /sys/class/devfreq/.../userspace/set_freq
+Date: September 2011
+Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+ The /sys/class/devfreq/.../userspace/set_freq shows and
+ sets the requested frequency for the devfreq object if
+ userspace governor is in effect.
+
+ A list of governors that support the node:
+ - userspace
+
What: /sys/class/devfreq/.../timer
Date: July 2020
Contact: Chanwoo Choi <cw00.choi@samsung.com>
@@ -122,3 +129,6 @@ Description:
echo deferrable > /sys/class/devfreq/.../timer
echo delayed > /sys/class/devfreq/.../timer
+
+ A list of governors that support the node:
+ - simple_ondemand
diff --git a/Documentation/devicetree/bindings/devfreq/exynos-bus.txt b/Documentation/devicetree/bindings/devfreq/exynos-bus.txt
index e71f752cc18f..bcaa2c08ac11 100644
--- a/Documentation/devicetree/bindings/devfreq/exynos-bus.txt
+++ b/Documentation/devicetree/bindings/devfreq/exynos-bus.txt
@@ -51,6 +51,19 @@ Optional properties only for parent bus device:
- exynos,saturation-ratio: the percentage value which is used to calibrate
the performance count against total cycle count.
+Optional properties for the interconnect functionality (QoS frequency
+constraints):
+- #interconnect-cells: should be 0.
+- interconnects: as documented in ../interconnect.txt, describes a path at the
+ higher level interconnects used by this interconnect provider.
+ If this interconnect provider is directly linked to a top level interconnect
+ provider the property contains only one phandle. The provider extends
+ the interconnect graph by linking its node to a node registered by provider
+ pointed to by first phandle in the 'interconnects' property.
+
+- samsung,data-clock-ratio: ratio of the data throughput in B/s to minimum data
+ clock frequency in Hz, default value is 8 when this property is missing.
+
Detailed correlation between sub-blocks and power line according to Exynos SoC:
- In case of Exynos3250, there are two power line as following:
VDD_MIF |--- DMC
@@ -135,7 +148,7 @@ Detailed correlation between sub-blocks and power line according to Exynos SoC:
|--- PERIC (Fixed clock rate)
|--- FSYS (Fixed clock rate)
-Example1:
+Example 1:
Show the AXI buses of Exynos3250 SoC. Exynos3250 divides the buses to
power line (regulator). The MIF (Memory Interface) AXI bus is used to
transfer data between DRAM and CPU and uses the VDD_MIF regulator.
@@ -184,7 +197,7 @@ Example1:
|L5 |200000 |200000 |400000 |300000 | ||1000000 |
----------------------------------------------------------
-Example2 :
+Example 2:
The bus of DMC (Dynamic Memory Controller) block in exynos3250.dtsi
is listed below:
@@ -419,3 +432,57 @@ Example2 :
devfreq = <&bus_leftbus>;
status = "okay";
};
+
+Example 3:
+ An interconnect path "bus_display -- bus_leftbus -- bus_dmc" on
+ Exynos4412 SoC with video mixer as an interconnect consumer device.
+
+ soc {
+ bus_dmc: bus_dmc {
+ compatible = "samsung,exynos-bus";
+ clocks = <&clock CLK_DIV_DMC>;
+ clock-names = "bus";
+ operating-points-v2 = <&bus_dmc_opp_table>;
+ samsung,data-clock-ratio = <4>;
+ #interconnect-cells = <0>;
+ };
+
+ bus_leftbus: bus_leftbus {
+ compatible = "samsung,exynos-bus";
+ clocks = <&clock CLK_DIV_GDL>;
+ clock-names = "bus";
+ operating-points-v2 = <&bus_leftbus_opp_table>;
+ #interconnect-cells = <0>;
+ interconnects = <&bus_dmc>;
+ };
+
+ bus_display: bus_display {
+ compatible = "samsung,exynos-bus";
+ clocks = <&clock CLK_ACLK160>;
+ clock-names = "bus";
+ operating-points-v2 = <&bus_display_opp_table>;
+ #interconnect-cells = <0>;
+ interconnects = <&bus_leftbus &bus_dmc>;
+ };
+
+ bus_dmc_opp_table: opp_table1 {
+ compatible = "operating-points-v2";
+ /* ... */
+ }
+
+ bus_leftbus_opp_table: opp_table3 {
+ compatible = "operating-points-v2";
+ /* ... */
+ };
+
+ bus_display_opp_table: opp_table4 {
+ compatible = "operating-points-v2";
+ /* .. */
+ };
+
+ &mixer {
+ compatible = "samsung,exynos4212-mixer";
+ interconnects = <&bus_display &bus_dmc>;
+ /* ... */
+ };
+ };
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 9847dfeeffcb..08b3da4736cf 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -65,7 +65,9 @@ Required properties:
- OPP nodes: One or more OPP nodes describing voltage-current-frequency
combinations. Their name isn't significant but their phandle can be used to
- reference an OPP.
+ reference an OPP. These are mandatory except for the case where the OPP table
+ is present only to indicate dependency between devices using the opp-shared
+ property.
Optional properties:
- opp-shared: Indicates that device nodes using this OPP Table Node's phandle
@@ -568,3 +570,53 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
};
};
};
+
+Example 7: Single cluster Quad-core ARM cortex A53, OPP points from firmware,
+distinct clock controls but two sets of clock/voltage/current lines.
+
+/ {
+ cpus {
+ #address-cells = <2>;
+ #size-cells = <0>;
+
+ cpu@0 {
+ compatible = "arm,cortex-a53";
+ reg = <0x0 0x100>;
+ next-level-cache = <&A53_L2>;
+ clocks = <&dvfs_controller 0>;
+ operating-points-v2 = <&cpu_opp0_table>;
+ };
+ cpu@1 {
+ compatible = "arm,cortex-a53";
+ reg = <0x0 0x101>;
+ next-level-cache = <&A53_L2>;
+ clocks = <&dvfs_controller 1>;
+ operating-points-v2 = <&cpu_opp0_table>;
+ };
+ cpu@2 {
+ compatible = "arm,cortex-a53";
+ reg = <0x0 0x102>;
+ next-level-cache = <&A53_L2>;
+ clocks = <&dvfs_controller 2>;
+ operating-points-v2 = <&cpu_opp1_table>;
+ };
+ cpu@3 {
+ compatible = "arm,cortex-a53";
+ reg = <0x0 0x103>;
+ next-level-cache = <&A53_L2>;
+ clocks = <&dvfs_controller 3>;
+ operating-points-v2 = <&cpu_opp1_table>;
+ };
+
+ };
+
+ cpu_opp0_table: opp0_table {
+ compatible = "operating-points-v2";
+ opp-shared;
+ };
+
+ cpu_opp1_table: opp1_table {
+ compatible = "operating-points-v2";
+ opp-shared;
+ };
+};
diff --git a/Documentation/driver-api/thermal/power_allocator.rst b/Documentation/driver-api/thermal/power_allocator.rst
index 67b6a3297238..aa5f66552d6f 100644
--- a/Documentation/driver-api/thermal/power_allocator.rst
+++ b/Documentation/driver-api/thermal/power_allocator.rst
@@ -71,7 +71,9 @@ to the speed-grade of the silicon. `sustainable_power` is therefore
simply an estimate, and may be tuned to affect the aggressiveness of
the thermal ramp. For reference, the sustainable power of a 4" phone
is typically 2000mW, while on a 10" tablet is around 4500mW (may vary
-depending on screen size).
+depending on screen size). It is possible to have the power value
+expressed in an abstract scale. The sustained power should be aligned
+to the scale used by the related cooling devices.
If you are using device tree, do add it as a property of the
thermal-zone. For example::
@@ -269,3 +271,11 @@ won't be very good. Note that this is not particular to this
governor, step-wise will also misbehave if you call its throttle()
faster than the normal thermal framework tick (due to interrupts for
example) as it will overreact.
+
+Energy Model requirements
+=========================
+
+Another important thing is the consistent scale of the power values
+provided by the cooling devices. All of the cooling devices in a single
+thermal zone should have power values reported either in milli-Watts
+or scaled to the same 'abstract scale'.
diff --git a/Documentation/power/energy-model.rst b/Documentation/power/energy-model.rst
index a6fb986abe3c..60ac091d3b0d 100644
--- a/Documentation/power/energy-model.rst
+++ b/Documentation/power/energy-model.rst
@@ -20,6 +20,21 @@ possible source of information on its own, the EM framework intervenes as an
abstraction layer which standardizes the format of power cost tables in the
kernel, hence enabling to avoid redundant work.
+The power values might be expressed in milli-Watts or in an 'abstract scale'.
+Multiple subsystems might use the EM and it is up to the system integrator to
+check that the requirements for the power value scale types are met. An example
+can be found in the Energy-Aware Scheduler documentation
+Documentation/scheduler/sched-energy.rst. For some subsystems like thermal or
+powercap power values expressed in an 'abstract scale' might cause issues.
+These subsystems are more interested in estimation of power used in the past,
+thus the real milli-Watts might be needed. An example of these requirements can
+be found in the Intelligent Power Allocation in
+Documentation/driver-api/thermal/power_allocator.rst.
+Kernel subsystems might implement automatic detection to check whether EM
+registered devices have inconsistent scale (based on EM internal flag).
+Important thing to keep in mind is that when the power values are expressed in
+an 'abstract scale' deriving real energy in milli-Joules would not be possible.
+
The figure below depicts an example of drivers (Arm-specific here, but the
approach is applicable to any architecture) providing power costs to the EM
framework, and interested clients reading the data from it::
@@ -73,7 +88,7 @@ Drivers are expected to register performance domains into the EM framework by
calling the following API::
int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
- struct em_data_callback *cb, cpumask_t *cpus);
+ struct em_data_callback *cb, cpumask_t *cpus, bool milliwatts);
Drivers must provide a callback function returning <frequency, power> tuples
for each performance state. The callback function provided by the driver is free
@@ -81,6 +96,10 @@ to fetch data from any relevant location (DT, firmware, ...), and by any mean
deemed necessary. Only for CPU devices, drivers must specify the CPUs of the
performance domains using cpumask. For other devices than CPUs the last
argument must be set to NULL.
+The last argument 'milliwatts' is important to set with correct value. Kernel
+subsystems which use EM might rely on this flag to check if all EM devices use
+the same scale. If there are different scales, these subsystems might decide
+to: return warning/error, stop working or panic.
See Section 3. for an example of driver implementing this
callback, and kernel/power/energy_model.c for further documentation on this
API.
@@ -156,7 +175,8 @@ EM framework::
37 nr_opp = foo_get_nr_opp(policy);
38
39 /* And register the new performance domain */
- 40 em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);
- 41
- 42 return 0;
- 43 }
+ 40 em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus,
+ 41 true);
+ 42
+ 43 return 0;
+ 44 }
diff --git a/Documentation/scheduler/sched-energy.rst b/Documentation/scheduler/sched-energy.rst
index 001e09c95e1d..afe02d394402 100644
--- a/Documentation/scheduler/sched-energy.rst
+++ b/Documentation/scheduler/sched-energy.rst
@@ -350,6 +350,11 @@ independent EM framework in Documentation/power/energy-model.rst.
Please also note that the scheduling domains need to be re-built after the
EM has been registered in order to start EAS.
+EAS uses the EM to make a forecasting decision on energy usage and thus it is
+more focused on the difference when checking possible options for task
+placement. For EAS it doesn't matter whether the EM power values are expressed
+in milli-Watts or in an 'abstract scale'.
+
6.3 - Energy Model complexity
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^