aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/thermal/thermal_core.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2021-07-10Merge tag 'thermal-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linuxLinus Torvalds1-1/+1
Pull thermal updates from Daniel Lezcano: - Add rk3568 sensor support (Finley Xiao) - Add missing MODULE_DEVICE_TABLE for the Spreadtrum sensor (Chunyan Zhang) - Export additionnal attributes for the int340x thermal processor (Srinivas Pandruvada) - Add SC7280 compatible for the tsens driver (Rajeshwari Ravindra Kamble) - Fix kernel documentation for thermal_zone_device_unregister() and use devm_platform_get_and_ioremap_resource() (Yang Yingliang) - Fix coefficient calculations for the rcar_gen3 sensor driver (Niklas Söderlund) - Fix shadowing variable rcar_gen3_ths_tj_1 (Geert Uytterhoeven) - Add missing of_node_put() for the iMX and Spreadtrum sensors (Krzysztof Kozlowski) - Add tegra3 thermal sensor DT bindings (Dmitry Osipenko) - Stop the thermal zone monitoring when unregistering it to prevent a temperature update without the 'get_temp' callback (Dmitry Osipenko) - Add rk3568 DT bindings, convert bindings to yaml schemas and add the corresponding compatible in the Rockchip sensor (Ezequiel Garcia) - Add the sc8180x compatible for the Qualcomm tsensor (Bjorn Andersson) - Use the find_first_zero_bit() function instead of custom code (Andy Shevchenko) - Fix the kernel doc for the device cooling device (Yang Li) - Reorg the processor thermal int340x to set the scene for the PCI mmio driver (Srinivas Pandruvada) - Add PCI MMIO driver for the int340x processor thermal driver (Srinivas Pandruvada) - Add hwmon sensors for the mediatek sensor (Frank Wunderlich) - Fix warning for return value reported by Smatch for the int340x thermal processor (Srinivas Pandruvada) - Fix wrong register access and decoding for the int340x thermal processor (Srinivas Pandruvada) * tag 'thermal-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (23 commits) thermal/drivers/int340x/processor_thermal: Fix tcc setting thermal/drivers/int340x/processor_thermal: Fix warning for return value thermal/drivers/mediatek: Add sensors-support thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver thermal/drivers/int340x/processor_thermal: Split enumeration and processing part thermal: devfreq_cooling: Fix kernel-doc thermal/drivers/intel/intel_soc_dts_iosf: Switch to use find_first_zero_bit() dt-bindings: thermal: tsens: Add sc8180x compatible dt-bindings: rockchip-thermal: Support the RK3568 SoC compatible dt-bindings: thermal: convert rockchip-thermal to json-schema thermal/core/thermal_of: Stop zone device before unregistering it dt-bindings: thermal: Add binding for Tegra30 thermal sensor thermal/drivers/sprd: Add missing of_node_put for loop iteration thermal/drivers/imx_sc: Add missing of_node_put for loop iteration thermal/drivers/rcar_gen3_thermal: Do not shadow rcar_gen3_ths_tj_1 thermal/drivers/rcar_gen3_thermal: Fix coefficient calculations thermal/drivers/st: Use devm_platform_get_and_ioremap_resource() thermal/core: Correct function name thermal_zone_device_unregister() dt-bindings: thermal: tsens: Add compatible string to TSENS binding for SC7280 thermal/drivers/int340x: processor_thermal: Export additional attributes ...
2021-06-21thermal: Use generic HW-protection shutdown APIMatti Vaittinen1-59/+4
The hardware shutdown function was exported from kernel/reboot for other subsystems to use. Logic is copied from the thermal_core. The protection mutex is replaced by an atomic_t to allow calls also from an IRQ context. Also the WARN() was replaced by pr_emerg() based on discussions here: https://lore.kernel.org/lkml/YJuPwAZroVZ%2Fw633@alley/ and here: https://lore.kernel.org/linux-iommu/20210331093104.383705-4-geert+renesas@glider.be/ Use the exported API instead of implementing own just for the thermal_core. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/5531e89d9e710f5d10e7cdce3ee58957335b9e03.1622628333.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-12thermal/core: Correct function name thermal_zone_device_unregister()Yang Yingliang1-1/+1
Fix the following make W=1 kernel build warning: drivers/thermal/thermal_core.c:1376: warning: expecting prototype for thermal_device_unregister(). Prototype was for thermal_zone_device_unregister() instead Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210517051020.3463536-1-yangyingliang@huawei.com
2021-04-22thermal/core: Remove thermal_notify_frameworkThara Gopinath1-18/+0
thermal_notify_framework just updates for a single trip point where as thermal_zone_device_update does other bookkeeping like updating the temperature of the thermal zone and setting the next trip point. The only driver that was using thermal_notify_framework was updated in the previous patch to use thermal_zone_device_update instead. Since there are no users for thermal_notify_framework remove it. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210122023406.3500424-3-thara.gopinath@linaro.org
2021-04-15thermal/core: Fix memory leak in the error pathDaniel Lezcano1-0/+1
Fix the following error: smatch warnings: drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: possible memory leak of 'cdev' by freeing the cdev when exiting the function in the error path. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319202257.890848-1-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/core: Use a char pointer for the cooling device nameDaniel Lezcano1-16/+22
We want to have any kind of name for the cooling devices as we do no longer want to rely on auto-numbering. Let's replace the cooling device's fixed array by a char pointer to be allocated dynamically when registering the cooling device, so we don't limit the length of the name. Rework the error path at the same time as we have to rollback the allocations in case of error. Tested with a dummy device having the name: "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch" A village on the island of Anglesey (Wales), known to have the longest name in Europe. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20210314111333.16551-1-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove pointless thermal_zone_device_reset() functionDaniel Lezcano1-7/+1
The function thermal_zone_device_reset() is called in the thermal_zone_device_register() which allocates and initialize the structure. The passive field is already zero-ed by the allocation, the function is useless. Call directly thermal_zone_device_init() instead and thermal_zone_device_reset(). Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201222181110.1231977-1-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove ms based delay fieldsDaniel Lezcano1-3/+1
The code does no longer use the ms unit based fields to set the delays as they are replaced by the jiffies. Remove them and replace their user to use the jiffies version instead. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Peter Kästle <peter@piie.net> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201216220337.839878-3-daniel.lezcano@linaro.org
2021-01-19thermal/core: Use precomputed jiffies for the pollingDaniel Lezcano1-10/+5
The delays are also stored in jiffies based unit. Use them instead of the ms. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201216220337.839878-2-daniel.lezcano@linaro.org
2021-01-19thermal/core: Precompute the delays from msecs to jiffiesDaniel Lezcano1-0/+3
The delays are stored in ms units and when the polling function is called this delay is converted into jiffies at each call. Instead of doing the conversion again and again, compute the jiffies at init time and use the value directly when setting the polling. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201216220337.839878-1-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove THERMAL_TRIPS_NONE testDaniel Lezcano1-1/+1
The last site calling the thermal_zone_bind_cooling_device() function with the THERMAL_TRIPS_NONE parameter was removed. We can get rid of this test as no user of this function is calling this function with this parameter. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201214233811.485669-5-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove unused functions rebind/unbind exceptionDaniel Lezcano1-37/+0
The functions thermal_zone_device_rebind_exception and thermal_zone_device_unbind_exception are not used from anywhere. Remove that code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201214233811.485669-2-daniel.lezcano@linaro.org
2021-01-07thermal/core: Remove notify opsDaniel Lezcano1-3/+0
With the removal of the notifys user in a previous patches, the ops is no longer needed, remove it. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20201210121514.25760-5-daniel.lezcano@linaro.org
2020-12-11thermal/core: Add critical and hot opsDaniel Lezcano1-16/+27
Currently there is no way to the sensors to directly call an ops in interrupt mode without calling thermal_zone_device_update assuming all the trip points are defined. A sensor may want to do something special if a trip point is hot or critical. This patch adds the critical and hot ops to the thermal zone device, so a sensor can directly invoke them or let the thermal framework to call the sensor specific ones. Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20201210121514.25760-2-daniel.lezcano@linaro.org
2020-12-11thermal/core: Emit a warning if the thermal zone is updated without opsDaniel Lezcano1-1/+2
The actual code is silently ignoring a thermal zone update when a driver is requesting it without a get_temp ops set. That looks not correct, as the caller should not have called this function if the thermal zone is unable to read the temperature. That makes the code less robust as the check won't detect the driver is inconsistently using the thermal API and that does not help to improve the framework as these circumvolutions hide the problem at the source. In order to detect the situation when it happens, let's add a warning when the update is requested without the get_temp() ops set. Any warning emitted will have to be fixed at the source of the problem: the caller must not call thermal_zone_device_update if there is not get_temp callback set. Cc: Thara Gopinath <thara.gopinath@linaro.org> Cc: Amit Kucheria <amitk@kernel.org> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20201210121514.25760-1-daniel.lezcano@linaro.org
2020-10-27drivers/thermal/core: Optimize trip points checkBernard Zhao1-6/+3
The trip points are checked one by one with multiple condition branches where one condition is enough to disable the trip point. Merge all these conditions in a single 'OR' statement. Signed-off-by: Bernard Zhao <bernard@vivo.com> Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201027013743.62392-1-bernard@vivo.com [dlezcano] Changed patch description
2020-10-27thermal: core: Move power_actor_set_power into IPALukasz Luba1-41/+0
Since the power actor section has one function power_actor_set_power() move it into Intelligent Power Allocation (IPA). There is no other user of that helper function. It would also allow to remove the check of cdev_is_power_actor() because the code which calls it in IPA already does the needed check. Make the function static since only IPA use it. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201015112441.4056-5-lukasz.luba@arm.com
2020-10-27thermal: core: Remove unused functions in power actor sectionLukasz Luba1-47/+0
Since the Intelligent Power Allocation (IPA) uses different way to get minimum and maximum power for a given cooling device, the helper functions are not needed. There is no other code which uses them, so remove the helper functions. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201015112441.4056-4-lukasz.luba@arm.com
2020-10-26thermal: core: Add upper and lower limits to power_actor_set_powerMichael Kao1-1/+1
The upper and lower limits of thermal throttle state in the DT do not apply to the Intelligent Power Allocation (IPA) governor. Add the clamping for cooling device upper and lower limits in the power_actor_set_power() used by IPA. Signed-off-by: Michael Kao <michael.kao@mediatek.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201007024332.30322-1-michael.kao@mediatek.com
2020-10-12thermal: cooling: Remove unused variable *tzzhuguangqing1-7/+5
1. devfreq_cooling.c: The variable *tz is not used in devfreq_cooling_get_requested_power(), devfreq_cooling_state2power() and devfreq_cooling_power2state(). 2. cpufreq_cooling.c: After 84fe2cab48590, the variable *tz is not used anymore in cpufreq_get_requested_power(), cpufreq_state2power() and cpufreq_power2state(). Remove the variable *tz. Signed-off-by: zhuguangqing <zhuguangqing@xiaomi.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200914071101.13575-1-zhuguangqing83@gmail.com
2020-10-12thermal: core: remove unnecessary mutex_init()Qinglang Miao1-1/+0
The mutex poweroff_lock is initialized statically. It is unnecessary to initialize by mutex_init(). Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200916062139.191233-1-miaoqinglang@huawei.com
2020-09-04thermal: core: Fix use-after-free in thermal_zone_device_unregister()Dmitry Osipenko1-2/+3
The user-after-free bug in thermal_zone_device_unregister() is reported by KASAN. It happens because struct thermal_zone_device is released during of device_unregister() invocation, and hence the "tz" variable shouldn't be touched by thermal_notify_tz_delete(tz->id). Fixes: 55cdf0a283b8 ("thermal: core: Add notifications call in the framework") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200817235854.26816-1-digetx@gmail.com
2020-07-29thermal: core: Add thermal zone enable/disable notificationDaniel Lezcano1-0/+5
Now the calls to enable/disable a thermal zone are centralized in a call to a function, we can add in these the corresponding netlink notifications. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200727231033.26512-1-daniel.lezcano@linaro.org
2020-07-24thermal: core: Fix thermal zone lookup by IDThierry Reding1-3/+5
When a thermal zone is looked up by an ID and no zone is found matching that ID, the thermal_zone_get_by_id() function will return a pointer to the thermal zone list head which isn't actually a valid thermal zone. This can lead to a subsequent crash because a valid pointer is returned to the called, but dereferencing that pointer as struct thermal_zone is not safe. Fixes: 329b064fbd13 ("thermal: core: Get thermal zone by id") Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200724170105.2705467-1-thierry.reding@gmail.com
2020-07-21thermal: core: Move initialization after core initcallDaniel Lezcano1-1/+1
The generic netlink is initialized at subsys_initcall, so far after the thermal init routine and the thermal generic netlink family initialization. On ŝome platforms, that leads to a memory corruption. The fix was sent to netdev@ to move the genetlink framework initialization at core_initcall. Move the thermal core initialization to postcore level which is very close to core level. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200717164217.18819-2-daniel.lezcano@linaro.org
2020-07-21thermal: netlink: Improve the initcall orderingDaniel Lezcano1-0/+4
The initcalls like to play joke. In our case, the thermal-netlink initcall is called after the thermal-core initcall but this one sends a notification before the former is initialized. No issue was spotted, but it could lead to a memory corruption, so instead of relying on the core_initcall for the thermal-netlink, let's initialize directly from the thermal-core init routine, so we have full control of the init ordering. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200717164217.18819-1-daniel.lezcano@linaro.org
2020-07-07thermal: core: Add notifications call in the frameworkDaniel Lezcano1-0/+21
The generic netlink protocol is implemented but the different notification functions are not yet connected to the core code. These changes add the notification calls in the different corresponding places. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20200706105538.2159-4-daniel.lezcano@linaro.org
2020-07-07thermal: core: Get thermal zone by idDaniel Lezcano1-0/+14
The next patch will introduce the generic netlink protocol to handle events, sampling and command from the thermal framework. In order to deal with the thermal zone, it uses its unique identifier to characterize it in the message. Passing an integer is more efficient than passing an entire string. This change provides a function returning back a thermal zone pointer corresponding to the identifier passed as parameter. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20200706105538.2159-2-daniel.lezcano@linaro.org
2020-07-07thermal: core: Add helpers to browse the cdev, tz and governor listDaniel Lezcano1-0/+51
The cdev, tz and governor list, as well as their respective locks are statically defined in the thermal_core.c file. In order to give a sane access to these list, like browsing all the thermal zones or all the cooling devices, let's define a set of helpers where we pass a callback as a parameter to be called for each thermal entity. We keep the self-encapsulation and ensure the locks are correctly taken when looking at the list. Acked-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706105538.2159-1-daniel.lezcano@linaro.org
2020-07-07thermal: Make thermal_zone_device_is_enabled() available to core onlyAndrzej Pietrasiewicz1-1/+0
This function is not needed by drivers. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200703104354.19657-4-andrzej.p@collabora.com
2020-06-29thermal: Rename set_mode() to change_mode()Andrzej Pietrasiewicz1-2/+2
set_mode() is only called when tzd's mode is about to change. Actual setting is performed in thermal_core, in thermal_zone_device_set_mode(). The meaning of set_mode() callback is actually to notify the driver about the mode being changed and giving the driver a chance to oppose such change. To better reflect the purpose of the method rename it to change_mode() Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-12-andrzej.p@collabora.com
2020-06-29thermal: core: Stop polling DISABLED thermal devicesAndrzej Pietrasiewicz1-2/+14
Polling DISABLED devices is not desired, as all such "disabled" devices are meant to be handled by userspace. This patch introduces and uses should_stop_polling() to decide whether the device should be polled or not. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-10-andrzej.p@collabora.com
2020-06-29thermal: Use mode helpers in driversAndrzej Pietrasiewicz1-1/+1
Use thermal_zone_device_{en|dis}able() and thermal_zone_device_is_enabled(). Consequently, all set_mode() implementations in drivers: - can stop modifying tzd's "mode" member, - shall stop taking tzd's lock, as it is taken in the helpers - shall stop calling thermal_zone_device_update() as it is called in the helpers - can assume they are called when the mode truly changes, so checks to verify that can be dropped Not providing set_mode() by a driver no longer prevents the core from being able to set tzd's mode, so the relevant check in mode_store() is removed. Other comments: - acpi/thermal.c: tz->thermal_zone->mode will be updated only after we return from set_mode(), so use function parameter in thermal_set_mode() instead, no need to call acpi_thermal_check() in set_mode() - thermal/imx_thermal.c: regmap writes and mode assignment are done in thermal_zone_device_{en|dis}able() and set_mode() callback - thermal/intel/intel_quark_dts_thermal.c: soc_dts_{en|dis}able() are a part of set_mode() callback, so they don't need to modify tzd->mode, and don't need to fall back to the opposite mode if unsuccessful, as the return value will be propagated to thermal_zone_device_{en|dis}able() and ultimately tzd's member will not be changed in thermal_zone_device_set_mode(). - thermal/of-thermal.c: no need to set zone->mode to DISABLED in of_parse_thermal_zones() as a tzd is kzalloc'ed so mode is DISABLED anyway Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-8-andrzej.p@collabora.com
2020-06-29thermal: Add mode helpersAndrzej Pietrasiewicz1-0/+53
Prepare for making the drivers not access tzd's private members. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> [staticize thermal_zone_device_set_mode()] Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-7-andrzej.p@collabora.com
2020-06-29thermal: remove get_mode() operation of driversAndrzej Pietrasiewicz1-6/+1
get_mode() is now redundant, as the state is stored in struct thermal_zone_device. Consequently the "mode" attribute in sysfs can always be visible, because it is always possible to get the mode from struct tzd. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-6-andrzej.p@collabora.com
2020-05-22thermal/core: Replace module.h with export.hAmit Kucheria1-1/+1
Thermal core cannot be modular, remove the unnecessary module.h include and replace with export.h to handle EXPORT_SYMBOL family of macros. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/33af23406dcdb0c62dae1e6401446b997ccb449f.1589199124.git.amit.kucheria@linaro.org
2020-05-22thermal/core: Get rid of MODULE_* tagsAmit Kucheria1-4/+0
The thermal framework can no longer be compiled as a module as of commit 554b3529fe01 ("thermal/drivers/core: Remove the module Kconfig's option"). Remove the MODULE_* tags. Rui is mentioned in the copyright line at the top of the file and the license is mentioned in the SPDX tags. So no loss of information. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/74339a09a55f8f3d86c4074fc2bf853a302d6186.1589199124.git.amit.kucheria@linaro.org
2020-04-14thermal: core: Remove pointless debug tracesDaniel Lezcano1-6/+0
The last temperature and the current temperature are show via a dev_debug. The line before, those temperature are also traced. It is pointless to duplicate the traces for the temperatures, remove the dev_dbg traces. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200331165449.30355-2-daniel.lezcano@linaro.org
2019-11-14thermal: Fix deadlock in thermal thermal_zone_device_checkWei Wang1-2/+2
1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid a use-after-free issue. However, cancel_delayed_work_sync could be called insides the WQ causing deadlock. [54109.642398] c0 1162 kworker/u17:1 D 0 11030 2 0x00000000 [54109.642437] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.642447] c0 1162 Call trace: [54109.642456] c0 1162 __switch_to+0x138/0x158 [54109.642467] c0 1162 __schedule+0xba4/0x1434 [54109.642480] c0 1162 schedule_timeout+0xa0/0xb28 [54109.642492] c0 1162 wait_for_common+0x138/0x2e8 [54109.642511] c0 1162 flush_work+0x348/0x40c [54109.642522] c0 1162 __cancel_work_timer+0x180/0x218 [54109.642544] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.642553] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.642563] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.642574] c0 1162 process_one_work+0x3cc/0x69c [54109.642583] c0 1162 worker_thread+0x49c/0x7c0 [54109.642593] c0 1162 kthread+0x17c/0x1b0 [54109.642602] c0 1162 ret_from_fork+0x10/0x18 [54109.643051] c0 1162 kworker/u17:2 D 0 16245 2 0x00000000 [54109.643067] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.643077] c0 1162 Call trace: [54109.643085] c0 1162 __switch_to+0x138/0x158 [54109.643095] c0 1162 __schedule+0xba4/0x1434 [54109.643104] c0 1162 schedule_timeout+0xa0/0xb28 [54109.643114] c0 1162 wait_for_common+0x138/0x2e8 [54109.643122] c0 1162 flush_work+0x348/0x40c [54109.643131] c0 1162 __cancel_work_timer+0x180/0x218 [54109.643141] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.643150] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.643159] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.643167] c0 1162 process_one_work+0x3cc/0x69c [54109.643177] c0 1162 worker_thread+0x49c/0x7c0 [54109.643186] c0 1162 kthread+0x17c/0x1b0 [54109.643195] c0 1162 ret_from_fork+0x10/0x18 [54109.644500] c0 1162 cat D 0 7766 1 0x00000001 [54109.644515] c0 1162 Call trace: [54109.644524] c0 1162 __switch_to+0x138/0x158 [54109.644536] c0 1162 __schedule+0xba4/0x1434 [54109.644546] c0 1162 schedule_preempt_disabled+0x80/0xb0 [54109.644555] c0 1162 __mutex_lock+0x3a8/0x7f0 [54109.644563] c0 1162 __mutex_lock_slowpath+0x14/0x20 [54109.644575] c0 1162 thermal_zone_get_temp+0x84/0x360 [54109.644586] c0 1162 temp_show+0x30/0x78 [54109.644609] c0 1162 dev_attr_show+0x5c/0xf0 [54109.644628] c0 1162 sysfs_kf_seq_show+0xcc/0x1a4 [54109.644636] c0 1162 kernfs_seq_show+0x48/0x88 [54109.644656] c0 1162 seq_read+0x1f4/0x73c [54109.644664] c0 1162 kernfs_fop_read+0x84/0x318 [54109.644683] c0 1162 __vfs_read+0x50/0x1bc [54109.644692] c0 1162 vfs_read+0xa4/0x140 [54109.644701] c0 1162 SyS_read+0xbc/0x144 [54109.644708] c0 1162 el0_svc_naked+0x34/0x38 [54109.845800] c0 1162 D 720.000s 1->7766->7766 cat [panic] Fixes: 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") Cc: stable@vger.kernel.org Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-11-07thermal: Initialize thermal subsystem earlierAmit Kucheria1-1/+1
Now that the thermal framework is built-in, in order to facilitate thermal mitigation as early as possible in the boot cycle, move the thermal framework initialization to core_initcall. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/f8ff0ab4a8e9c2eca5a26fb2256365b26cb326ce.1571656015.git.amit.kucheria@linaro.org
2019-11-07thermal: Remove netlink supportAmit Kucheria1-100/+1
There are no users of netlink messages for thermal inside the kernel. Remove the code and adjust the documentation. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/8ff02cf62186c7a54fff325fad40a2e9ca3affa6.1571656014.git.amit.kucheria@linaro.org
2019-09-24thermal: Add some error messagesAmit Kucheria1-4/+13
When registering a thermal zone device, we currently return -EINVAL in four cases. This makes it a little hard to debug the real cause of the failure. Print some error messages to make it easier for developer to figure out what happened. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24thermal: Fix use-after-free when unregistering thermal zone deviceIdo Schimmel1-1/+1
thermal_zone_device_unregister() cancels the delayed work that polls the thermal zone, but it does not wait for it to finish. This is racy with respect to the freeing of the thermal zone device, which can result in a use-after-free [1]. Fix this by waiting for the delayed work to finish before freeing the thermal zone device. Note that thermal_zone_device_set_polling() is never invoked from an atomic context, so it is safe to call cancel_delayed_work_sync() that can block. [1] [ +0.002221] ================================================================== [ +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0 [ +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17 [ +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701 [ +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 [ +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check [ +0.000012] Call Trace: [ +0.000021] dump_stack+0xa9/0x10e [ +0.000020] print_address_description.cold.2+0x9/0x25e [ +0.000018] __kasan_report.cold.3+0x78/0x9d [ +0.000016] kasan_report+0xe/0x20 [ +0.000016] __mutex_lock+0x1076/0x11c0 [ +0.000014] step_wise_throttle+0x72/0x150 [ +0.000018] handle_thermal_trip+0x167/0x760 [ +0.000019] thermal_zone_device_update+0x19e/0x5f0 [ +0.000019] process_one_work+0x969/0x16f0 [ +0.000017] worker_thread+0x91/0xc40 [ +0.000014] kthread+0x33d/0x400 [ +0.000015] ret_from_fork+0x3a/0x50 [ +0.000020] Allocated by task 1: [ +0.000015] save_stack+0x19/0x80 [ +0.000015] __kasan_kmalloc.constprop.4+0xc1/0xd0 [ +0.000014] kmem_cache_alloc_trace+0x152/0x320 [ +0.000015] thermal_zone_device_register+0x1b4/0x13a0 [ +0.000015] mlxsw_thermal_init+0xc92/0x23d0 [ +0.000014] __mlxsw_core_bus_device_register+0x659/0x11b0 [ +0.000013] mlxsw_core_bus_device_register+0x3d/0x90 [ +0.000013] mlxsw_pci_probe+0x355/0x4b0 [ +0.000014] local_pci_probe+0xc3/0x150 [ +0.000013] pci_device_probe+0x280/0x410 [ +0.000013] really_probe+0x26a/0xbb0 [ +0.000013] driver_probe_device+0x208/0x2e0 [ +0.000013] device_driver_attach+0xfe/0x140 [ +0.000013] __driver_attach+0x110/0x310 [ +0.000013] bus_for_each_dev+0x14b/0x1d0 [ +0.000013] driver_register+0x1c0/0x400 [ +0.000015] mlxsw_sp_module_init+0x5d/0xd3 [ +0.000014] do_one_initcall+0x239/0x4dd [ +0.000013] kernel_init_freeable+0x42b/0x4e8 [ +0.000012] kernel_init+0x11/0x18b [ +0.000013] ret_from_fork+0x3a/0x50 [ +0.000015] Freed by task 581: [ +0.000013] save_stack+0x19/0x80 [ +0.000014] __kasan_slab_free+0x125/0x170 [ +0.000013] kfree+0xf3/0x310 [ +0.000013] thermal_release+0xc7/0xf0 [ +0.000014] device_release+0x77/0x200 [ +0.000014] kobject_put+0x1a8/0x4c0 [ +0.000014] device_unregister+0x38/0xc0 [ +0.000014] thermal_zone_device_unregister+0x54e/0x6a0 [ +0.000014] mlxsw_thermal_fini+0x184/0x35a [ +0.000014] mlxsw_core_bus_device_unregister+0x10a/0x640 [ +0.000013] mlxsw_devlink_core_bus_device_reload+0x92/0x210 [ +0.000015] devlink_nl_cmd_reload+0x113/0x1f0 [ +0.000014] genl_family_rcv_msg+0x700/0xee0 [ +0.000013] genl_rcv_msg+0xca/0x170 [ +0.000013] netlink_rcv_skb+0x137/0x3a0 [ +0.000012] genl_rcv+0x29/0x40 [ +0.000013] netlink_unicast+0x49b/0x660 [ +0.000013] netlink_sendmsg+0x755/0xc90 [ +0.000013] __sys_sendto+0x3de/0x430 [ +0.000013] __x64_sys_sendto+0xe2/0x1b0 [ +0.000013] do_syscall_64+0xa4/0x4d0 [ +0.000013] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ +0.000017] The buggy address belongs to the object at ffff8881e48e0008 which belongs to the cache kmalloc-2k of size 2048 [ +0.000012] The buggy address is located 1096 bytes inside of 2048-byte region [ffff8881e48e0008, ffff8881e48e0808) [ +0.000007] The buggy address belongs to the page: [ +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0 [ +0.000020] flags: 0x200000000010200(slab|head) [ +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0 [ +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000 [ +0.000007] page dumped because: kasan: bad access detected [ +0.000012] Memory state around the buggy address: [ +0.000012] ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] >ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000008] ^ [ +0.000012] ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000007] ================================================================== Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer") Reported-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24thermal/drivers/core: Use put_device() if device_register() failsYue Hu1-12/+13
Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Clean up the rollback block also. Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-27thermal/drivers/core: Use governor table to initializeDaniel Lezcano1-23/+29
Now that the governor table is in place and the macro allows to browse the table, declare the governor so the entry is added in the governor table in the init section. The [un]register_thermal_governors function does no longer need to use the exported [un]register thermal governor's specific function which in turn call the [un]register_thermal_governor. The governors are fully self-encapsulated. The cyclic dependency is no longer needed, remove it. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-16Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linuxLinus Torvalds1-19/+12
Pull thermal management updates from Zhang Rui: - Remove the 'module' Kconfig option for thermal subsystem framework because the thermal framework are required to be ready as early as possible to avoid overheat at boot time (Daniel Lezcano) - Fix a bug that thermal framework pokes disabled thermal zones upon resume (Wei Wang) - A couple of cleanups and trivial fixes on int340x thermal drivers (Srinivas Pandruvada, Zhang Rui, Sumeet Pawnikar) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: drivers: thermal: processor_thermal: Downgrade error message mlxsw: Remove obsolete dependency on THERMAL=m hwmon/drivers/core: Simplify complex dependency thermal/drivers/core: Fix typo in the option name thermal/drivers/core: Remove depends on THERMAL in Kconfig thermal/drivers/core: Remove module unload code thermal/drivers/core: Remove the module Kconfig's option thermal: core: skip update disabled thermal zones after suspend thermal: make device_register's type argument const thermal: intel: int340x: processor_thermal_device: simplify to get driver data thermal/int3403_thermal: favor _TMP instead of PTYP
2019-05-14thermal: Introduce devm_thermal_of_cooling_device_registerGuenter Roeck1-0/+49
thermal_of_cooling_device_register() and thermal_cooling_device_register() are typically called from driver probe functions, and thermal_cooling_device_unregister() is called from remove functions. This makes both a perfect candidate for device managed functions. Introduce devm_thermal_of_cooling_device_register(). This function can also be used to replace thermal_cooling_device_register() by passing a NULL pointer as device node. The new function requires both struct device * and struct device_node * as parameters since the struct device_node * parameter is not always identical to dev->of_node. Don't introduce a device managed remove function since it is not needed at this point. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-07Merge branches 'thermal-core', 'thermal-built-it' and 'thermal-intel' into nextZhang Rui1-16/+1
2019-05-06thermal/drivers/core: Remove module unload codeDaniel Lezcano1-16/+1
Now the thermal core is no longer compiled as a module. Remove the unloading module code and move the unregister function to the __init section. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-06thermal: core: skip update disabled thermal zones after suspendWei Wang1-0/+8
It is unnecessary to update disabled thermal zones post suspend and sometimes leads error/warning in bad behaved thermal drivers. Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>