aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2021-04-20hwmon: Add driver for fsp-3y PSUs and PDUsVáclav Kubernát5-0/+293
This patch adds support for these devices: - YH-5151E - the PDU - YM-2151E - the PSU The device datasheet says that the devices support PMBus 1.2, but in my testing, a lot of the commands aren't supported and if they are, they sometimes behave strangely or inconsistently. For example, writes to the PAGE command requires using PEC, otherwise the write won't work and the page won't switch, even though, the standard says that PEC is optional. On the other hand, writes to SMBALERT don't require PEC. Because of this, the driver is mostly reverse engineered with the help of a tool called pmbus_peek written by David Brownell (and later adopted by my colleague Jan Kundrát). The device also has some sort of a timing issue when switching pages, which is explained further in the code. Because of this, the driver support is limited. It exposes only the values that have been tested to work correctly. Signed-off-by: Václav Kubernát <kubernat@cesnet.cz> Link: https://lore.kernel.org/r/20210414080019.3530794-1-kubernat@cesnet.cz [groeck: Fixed up "missing braces around initializer" from 0-day] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (intel-m10-bmc-hwmon) add sensor support of Intel D5005 cardMatthew Gerlach2-0/+132
Like the Intel N3000 card, the Intel D5005 has a MAX10 based BMC. This commit adds support for the D5005 sensors that are monitored by the MAX10 BMC. Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@linux.intel.com> Acked-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20210413225835.459662-3-matthew.gerlach@linux.intel.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (sch5627) Split sch5627_update_device()Armin Wolf1-33/+69
An error in sch5627_update_device() could cause sch5627_read() to fail even if the error did not affect the target sensor type. Split sch5627_update_device() to prevent that. Tested on a Fujitsu Esprimo P720. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210411164225.11967-3-W_Armin@gmx.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (sch5627) Convert to hwmon_device_register_with_info()Armin Wolf1-232/+112
hwmon_device_register() is deprecated. Convert driver to use hwmon_device_register_with_info() and remove sysfs attributes which are now being handled by the hwmon subsystem. Channel handling was inspired by corsair-cpro. Tested on a Fujitsu Esprimo P720. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210411164225.11967-2-W_Armin@gmx.de [groeck: Replaced 0 with NULL] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (nct6683) remove useless functionJiapeng Chong1-11/+0
Fix the following clang warning: drivers/hwmon/nct6683.c:491:19: warning: unused function 'in_to_reg' [-Wunused-function]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/1618293770-55307-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (dell-smm) Add Dell Latitude E7440 to fan control whitelistSebastian Oechsle1-0/+8
Allow manual PWM control on Dell Latitude E7440. Reviewed-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20210411095741.zmllsuc7pevdipvy@icloud.com Signed-off-by: Sebastian Oechsle <setboolean@icloud.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20MAINTAINERS: Add keyword pattern for hwmon registration functionsGuenter Roeck1-0/+1
A pattern match for hardware monitoring registration functions ensures that hardware monitoring maintainers are copied whenever hardware monitoring drivers are added to the tree. Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (mlxreg-fan) Add support for fan drawers capability and present registersVadim Pasternak1-1/+50
Add support for fan drawer's capability and present registers in order to set mapping between the fan drawers and tachometers. Some systems are equipped with fan drawers with one tachometer inside. Others with fan drawers with several tachometers inside. Using present register along with tachometer-to-drawer mapping allows to skip reading missed tachometers and expose input for them as zero, instead of exposing fault code returned by hardware. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20210322172237.2213584-1-vadimp@nvidia.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (pmbus/tps53679) Add support for TI TPS53676Erik Rosen3-5/+63
Add support for TI TPS53676 controller to the tps53679 pmbus driver The driver uses the USER_DATA_03 register to figure out how many phases are enabled and to which channel they are assigned, and sets the number of pages and phases accordingly. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Link: https://lore.kernel.org/r/20210322193734.75127-3-erik.rosen@metormote.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20dt-bindings: Add trivial device entry for TPS53676Erik Rosen1-0/+2
Add trivial device entry for TPS53676 Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210318212441.69050-2-erik.rosen@metormote.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (ftsteutates) Rudimentary typo fixesBhaskar Chowdhury1-1/+1
s/Temprature/Temperature/ s/revsion/revision/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210323043438.1321903-1-unixbhaskar@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (pmbus) Add driver for BluTek BPA-RS600Chris Packham5-0/+257
The BPA-RS600 is a compact 600W AC to DC removable power supply module. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210317040231.21490-3-chris.packham@alliedtelesis.co.nz [groeck: Added bpa-rs600 to index.rst] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20dt-bindings: Add vendor prefix and trivial device for BluTek BPA-RS600Chris Packham2-0/+4
Add vendor prefix "blutek" for BluTek Power. Add trivial device entry for BPA-RS600. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210317040231.21490-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: replace snprintf in show functions with sysfs_emitGuenter Roeck24-150/+151
coccicheck complains about the use of snprintf() in sysfs show functions. drivers/hwmon/ina3221.c:701:8-16: WARNING: use scnprintf or sprintf This results in a large number of patch submissions. Fix it all in one go using the following coccinelle rules. Use sysfs_emit instead of scnprintf or sprintf since that makes more sense. @depends on patch@ identifier show, dev, attr, buf; @@ ssize_t show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - snprintf(buf, \( PAGE_SIZE \| PAGE_SIZE - 1 \), + sysfs_emit(buf, ...); ...> } @depends on patch@ identifier show, dev, attr, buf, rc; @@ ssize_t show(struct device *dev, struct device_attribute *attr, char *buf) { <... rc = - snprintf(buf, \( PAGE_SIZE \| PAGE_SIZE - 1 \), + sysfs_emit(buf, ...); ...> } While at it, remove unnecessary braces and as well as unnecessary else after return statements to address checkpatch warnings in the resulting patch. Cc: Zihao Tang <tangzihao1@hisilicon.com> Cc: Jay Fang <f.fangjian@huawei.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (pmbus) Replace - with _ in device names before registrationChris Packham1-1/+7
The hwmon sysfs ABI requires that the `name` property doesn't include any dashes. But when the pmbus code picks the name up from the device tree it quite often does. Replace '-' with '_' before registering the device. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20210317040231.21490-2-chris.packham@alliedtelesis.co.nz Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: add driver for NZXT Kraken X42/X52/X62/X72Jonas Malaco6-0/+295
These are "all-in-one" CPU liquid coolers that can be monitored and controlled through a proprietary USB HID protocol. While the models have differently sized radiators and come with varying numbers of fans, they are all indistinguishable at the software level. The driver exposes fan/pump speeds and coolant temperature through the standard hwmon sysfs interface. Fan and pump control, while supported by the devices, are not currently exposed. The firmware accepts up to 61 trip points per channel (fan/pump), but the same set of trip temperatures has to be maintained for both; with pwmX_auto_point_Y_temp attributes, users would need to maintain this invariant themselves. Instead, fan and pump control, as well as LED control (which the device also supports for 9 addressable RGB LEDs on the CPU water block) are left for existing and already mature user-space tools, which can still be used alongside the driver, thanks to hidraw. A link to one, which I also maintain, is provided in the documentation. The implementation is based on USB traffic analysis. It has been runtime tested on x86_64, both as a built-in driver and as a module. Signed-off-by: Jonas Malaco <jonas@protocubo.io> Link: https://lore.kernel.org/r/20210319045544.416138-1-jonas@protocubo.io [groeck: Removed unnecessary spinlock.h include] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (ina2xx) Convert sysfs sprintf/snprintf family to sysfs_emitZihao Tang1-5/+4
Fix the following coccicheck warning: drivers/hwmon/ina2xx.c:313:8-16: WARNING: use scnprintf or sprintf drivers/hwmon/ina2xx.c:453:8-16: WARNING: use scnprintf or sprintf drivers/hwmon/ina2xx.c:484:8-16: WARNING: use scnprintf or sprintf drivers/hwmon/ina2xx.c:540:8-16: WARNING: use scnprintf or sprintf Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com> Signed-off-by: Jay Fang <f.fangjian@huawei.com> Link: https://lore.kernel.org/r/1615892457-35501-1-git-send-email-f.fangjian@huawei.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: Use kobj_to_dev()Guenter Roeck6-12/+12
coccinelle complains about WARNING opportunity for kobj_to_dev() in several files, resulting in one-by-one patch submissions. Handle all remaining instances in one go. Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (ds1621) Use kobj_to_dev()Tian Tao1-1/+1
fixed the following coccicheck: ./drivers/hwmon/ds1621.c:329:60-61: WARNING opportunity for kobj_to_dev(). Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1616032504-59817-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (ftsteutates) Fix spelling typozuoqilin1-1/+1
Change 'revsion' to 'revision'. Signed-off-by: zuoqilin <zuoqilin@yulong.com> Link: https://lore.kernel.org/r/20210318124637.1331-1-zuoqilin1@163.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (corsair-psu) add support for critical valuesWilken Gottwalt2-56/+282
Adds support for reading the critical values of the temperature sensors and the rail sensors (voltage and current) once and caches them. Updates the naming of the constants following a more clear scheme. Also updates the documentation and fixes some typos. Updates is_visible and ops_read functions to be more readable. The new sensors output of a Corsair HX850i will look like this: corsairpsu-hid-3-1 Adapter: HID adapter v_in: 230.00 V v_out +12v: 12.14 V (crit min = +8.41 V, crit max = +15.59 V) v_out +5v: 5.03 V (crit min = +3.50 V, crit max = +6.50 V) v_out +3.3v: 3.30 V (crit min = +2.31 V, crit max = +4.30 V) psu fan: 0 RPM vrm temp: +46.2°C (crit = +70.0°C) case temp: +39.8°C (crit = +70.0°C) power total: 152.00 W power +12v: 108.00 W power +5v: 41.00 W power +3.3v: 5.00 W curr +12v: 9.00 A (crit max = +85.00 A) curr +5v: 8.31 A (crit max = +40.00 A) curr +3.3v: 1.62 A (crit max = +40.00 A) Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/YFNg6vGk3sQmyqgB@monster.powergraphx.local Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (pmbus/stpddc60) Add ST STPDDC60 pmbus driverErik Rosen6-0/+357
Add hardware monitoring support for ST STPDDC60 Unversal Digital Multicell Controller. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Link: https://lore.kernel.org/r/20210218115249.28513-3-erik.rosen@metormote.com [groeck: Fixed whitespace error in Makefile] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (pmbus) Add pmbus_set_update() function to set update flagErik Rosen2-0/+12
For the STPDDC60 chip, the vout alarm-limits are represented as an offset relative to the commanded output voltage. This means that the limits are dynamic and must not be cached by the pmbus driver. This patch adds a pmbus_set_sensor() function to pmbus_core to be able to set the update flag on selected sensors after auto-detection of limit attributes. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Link: https://lore.kernel.org/r/20210218115249.28513-2-erik.rosen@metormote.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (nct6683) Support NCT6686DJiqi Li1-2/+9
Add support for NCT6686D chip used in the Lenovo P620. Signed-off-by: Jiqi Li <lijq9@lenovo.com> Reviewed-by: Mark Pearson <markpearson@lenovo.com> Link: https://lore.kernel.org/r/20210304104421.1912934-1-lijq9@lenovo.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (pmbus) Add driver for Infineon IR36021Chris Packham5-0/+153
The IR36021 is a dual‐loop digital multi‐phase buck controller. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20210301035954.16713-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20dt-bindings: trivial-devices: Add infineon,ir36021Chris Packham1-0/+2
Add infineon,ir36021 to trivial-devices.yaml. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210301035954.16713-2-chris.packham@alliedtelesis.co.nz Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (corsair-psu) Update calculation of LINEAR11 valuesWilken Gottwalt1-22/+8
Changes the way how LINEAR11 values are calculated. The new method increases the precision of 2-3 digits. old method: corsairpsu-hid-3-1 Adapter: HID adapter v_in: 230.00 V v_out +12v: 12.00 V v_out +5v: 5.00 V v_out +3.3v: 3.00 V psu fan: 0 RPM vrm temp: +44.0°C case temp: +37.0°C power total: 152.00 W power +12v: 112.00 W power +5v: 38.00 W power +3.3v: 5.00 W curr in: N/A curr +12v: 9.00 A curr +5v: 7.00 A curr +3.3v: 1000.00 mA new method: corsairpsu-hid-3-1 Adapter: HID adapter v_in: 230.00 V v_out +12v: 12.16 V v_out +5v: 5.01 V v_out +3.3v: 3.30 V psu fan: 0 RPM vrm temp: +44.5°C case temp: +37.8°C power total: 148.00 W power +12v: 108.00 W power +5v: 37.00 W power +3.3v: 4.50 W curr in: N/A curr +12v: 9.25 A curr +5v: 7.50 A curr +3.3v: 1.50 A Co-developed-by: Jack Doan <me@jackdoan.com> Signed-off-by: Jack Doan <me@jackdoan.com> Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/YDoSMqFbgoTXyoru@monster.powergraphx.local Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: Switch to using the new API kobj_to_dev()Yang Li1-1/+1
fixed the following coccicheck: ./drivers/hwmon/hwmon.c:82:60-61: WARNING opportunity for kobj_to_dev() Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/1614071667-5665-1-git-send-email-yang.lee@linux.alibaba.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (adm9240) Convert to devm_hwmon_device_register_with_info APIGuenter Roeck1-495/+445
Also use regmap for register caching. This change reduces code and data size by more than 40%. While at it, fixed some warnings reported by checkpatch. Cc: Chris Packham <Chris.Packham@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (adm9240) Store i2c device instead of client in local dataGuenter Roeck1-15/+14
We only use the pointer to i2c_client to access &client->dev. Store the device pointer directly instead of retrieving it from i2c_client. Cc: Chris Packham <Chris.Packham@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20hwmon: (adm9240) Drop log messages from detect functionGuenter Roeck1-12/+5
Not detecting a chip in the detect function is normal and should not generate any log messages, much less error messages. Cc: Chris Packham <Chris.Packham@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-03-14Linux 5.12-rc3Linus Torvalds1-1/+1
2021-03-14prctl: fix PR_SET_MM_AUXV kernel stack leakAlexey Dobriyan1-1/+1
Doing a prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1); will copy 1 byte from userspace to (quite big) on-stack array and then stash everything to mm->saved_auxv. AT_NULL terminator will be inserted at the very end. /proc/*/auxv handler will find that AT_NULL terminator and copy original stack contents to userspace. This devious scheme requires CAP_SYS_RESOURCE. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13zram: fix broken page writebackMinchan Kim1-3/+3
commit 0d8359620d9b ("zram: support page writeback") introduced two problems. It overwrites writeback_store's return value as kstrtol's return value, which makes return value zero so user could see zero as return value of write syscall even though it wrote data successfully. It also breaks index value in the loop in that it doesn't increase the index any longer. It means it can write only first starting block index so user couldn't write all idle pages in the zram so lose memory saving chance. This patch fixes those issues. Link: https://lkml.kernel.org/r/20210312173949.2197662-2-minchan@kernel.org Fixes: 0d8359620d9b("zram: support page writeback") Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Amos Bianchi <amosbianchi@google.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: John Dias <joaodias@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13zram: fix return value on writeback_storeMinchan Kim1-3/+8
writeback_store's return value is overwritten by submit_bio_wait's return value. Thus, writeback_store will return zero since there was no IO error. In the end, write syscall from userspace will see the zero as return value, which could make the process stall to keep trying the write until it will succeed. Link: https://lkml.kernel.org/r/20210312173949.2197662-1-minchan@kernel.org Fixes: 3b82a051c101("drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store") Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: John Dias <joaodias@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13mm/memcg: set memcg when splitting pageZhou Guanghui1-0/+1
As described in the split_page() comment, for the non-compound high order page, the sub-pages must be freed individually. If the memcg of the first page is valid, the tail pages cannot be uncharged when be freed. For example, when alloc_pages_exact is used to allocate 1MB continuous physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is set). When make_alloc_exact free the unused 1MB and free_pages_exact free the applied 1MB, actually, only 4KB(one page) is uncharged. Therefore, the memcg of the tail page needs to be set when splitting a page. Michel: There are at least two explicit users of __GFP_ACCOUNT with alloc_exact_pages added recently. See 7efe8ef274024 ("KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT") and c419621873713 ("KVM: s390: Add memcg accounting to KVM allocations"), so this is not just a theoretical issue. Link: https://lkml.kernel.org/r/20210304074053.65527-3-zhouguanghui1@huawei.com Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Rui Xiang <rui.xiang@huawei.com> Cc: Tianhong Ding <dingtianhong@huawei.com> Cc: Weilong Chen <chenweilong@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argumentZhou Guanghui3-14/+9
Rename mem_cgroup_split_huge_fixup to split_page_memcg and explicitly pass in page number argument. In this way, the interface name is more common and can be used by potential users. In addition, the complete info(memcg and flag) of the memcg needs to be set to the tail pages. Link: https://lkml.kernel.org/r/20210304074053.65527-2-zhouguanghui1@huawei.com Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Tianhong Ding <dingtianhong@huawei.com> Cc: Weilong Chen <chenweilong@huawei.com> Cc: Rui Xiang <rui.xiang@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) signSergei Trofimovich1-1/+1
In https://bugs.gentoo.org/769614 Dmitry noticed that `ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly. The bug is in mismatch between get/set errors: static inline long syscall_get_error(struct task_struct *task, struct pt_regs *regs) { return regs->r10 == -1 ? regs->r8:0; } static inline long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs) { return regs->r8; } static inline void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, int error, long val) { if (error) { /* error < 0, but ia64 uses > 0 return value */ regs->r8 = -error; regs->r10 = -1; } else { regs->r8 = val; regs->r10 = 0; } } Tested on v5.10 on rx3600 machine (ia64 9040 CPU). Link: https://lkml.kernel.org/r/20210221002554.333076-2-slyfox@gentoo.org Link: https://bugs.gentoo.org/769614 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Reported-by: Dmitry V. Levin <ldv@altlinux.org> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13ia64: fix ia64_syscall_get_set_arguments() for break-based syscallsSergei Trofimovich1-6/+18
In https://bugs.gentoo.org/769614 Dmitry noticed that `ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called via glibc's syscall() wrapper. ia64 has two ways to call syscalls from userspace: via `break` and via `eps` instructions. The difference is in stack layout: 1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8} 2. `break` uses userspace stack frame: may be locals (glibc provides one), in{0..7} == out{0..8}. Both work fine in syscall handling cde itself. But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to re-extract syscall arguments but it does not account for locals. The change always skips locals registers. It should not change `eps` path as kernel's handler already enforces locals=0 and fixes `break`. Tested on v5.10 on rx3600 machine (ia64 9040 CPU). Link: https://lkml.kernel.org/r/20210221002554.333076-1-slyfox@gentoo.org Link: https://bugs.gentoo.org/769614 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Reported-by: Dmitry V. Levin <ldv@altlinux.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13mm/userfaultfd: fix memory corruption due to writeprotectNadav Amit1-0/+8
Userfaultfd self-test fails occasionally, indicating a memory corruption. Analyzing this problem indicates that there is a real bug since mmap_lock is only taken for read in mwriteprotect_range() and defers flushes, and since there is insufficient consideration of concurrent deferred TLB flushes in wp_page_copy(). Although the PTE is flushed from the TLBs in wp_page_copy(), this flush takes place after the copy has already been performed, and therefore changes of the page are possible between the time of the copy and the time in which the PTE is flushed. To make matters worse, memory-unprotection using userfaultfd also poses a problem. Although memory unprotection is logically a promotion of PTE permissions, and therefore should not require a TLB flush, the current userrfaultfd code might actually cause a demotion of the architectural PTE permission: when userfaultfd_writeprotect() unprotects memory region, it unintentionally *clears* the RW-bit if it was already set. Note that this unprotecting a PTE that is not write-protected is a valid use-case: the userfaultfd monitor might ask to unprotect a region that holds both write-protected and write-unprotected PTEs. The scenario that happens in selftests/vm/userfaultfd is as follows: cpu0 cpu1 cpu2 ---- ---- ---- [ Writable PTE cached in TLB ] userfaultfd_writeprotect() [ write-*unprotect* ] mwriteprotect_range() mmap_read_lock() change_protection() change_protection_range() ... change_pte_range() [ *clear* “write”-bit ] [ defer TLB flushes ] [ page-fault ] ... wp_page_copy() cow_user_page() [ copy page ] [ write to old page ] ... set_pte_at_notify() A similar scenario can happen: cpu0 cpu1 cpu2 cpu3 ---- ---- ---- ---- [ Writable PTE cached in TLB ] userfaultfd_writeprotect() [ write-protect ] [ deferred TLB flush ] userfaultfd_writeprotect() [ write-unprotect ] [ deferred TLB flush] [ page-fault ] wp_page_copy() cow_user_page() [ copy page ] ... [ write to page ] set_pte_at_notify() This race exists since commit 292924b26024 ("userfaultfd: wp: apply _PAGE_UFFD_WP bit"). Yet, as Yu Zhao pointed, these races became apparent since commit 09854ba94c6a ("mm: do_wp_page() simplification") which made wp_page_copy() more likely to take place, specifically if page_count(page) > 1. To resolve the aforementioned races, check whether there are pending flushes on uffd-write-protected VMAs, and if there are, perform a flush before doing the COW. Further optimizations will follow to avoid during uffd-write-unprotect unnecassary PTE write-protection and TLB flushes. Link: https://lkml.kernel.org/r/20210304095423.3825684-1-namit@vmware.com Fixes: 09854ba94c6a ("mm: do_wp_page() simplification") Signed-off-by: Nadav Amit <namit@vmware.com> Suggested-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> [5.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13kasan: fix KASAN_STACK dependency for HW_TAGSAndrey Konovalov1-0/+1
There's a runtime failure when running HW_TAGS-enabled kernel built with GCC on hardware that doesn't support MTE. GCC-built kernels always have CONFIG_KASAN_STACK enabled, even though stack instrumentation isn't supported by HW_TAGS. Having that config enabled causes KASAN to issue MTE-only instructions to unpoison kernel stacks, which causes the failure. Fix the issue by disallowing CONFIG_KASAN_STACK when HW_TAGS is used. (The commit that introduced CONFIG_KASAN_HW_TAGS specified proper dependency for CONFIG_KASAN_STACK_ENABLE but not for CONFIG_KASAN_STACK.) Link: https://lkml.kernel.org/r/59e75426241dbb5611277758c8d4d6f5f9298dac.1615215441.git.andreyknvl@google.com Fixes: 6a63a63ff1ac ("kasan: introduce CONFIG_KASAN_HW_TAGS") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOCAndrey Konovalov1-2/+6
Currently, kasan_free_nondeferred_pages()->kasan_free_pages() is called after debug_pagealloc_unmap_pages(). This causes a crash when debug_pagealloc is enabled, as HW_TAGS KASAN can't set tags on an unmapped page. This patch puts kasan_free_nondeferred_pages() before debug_pagealloc_unmap_pages() and arch_free_page(), which can also make the page unavailable. Link: https://lkml.kernel.org/r/24cd7db274090f0e5bc3adcdc7399243668e3171.1614987311.git.andreyknvl@google.com Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13mm/madvise: replace ptrace attach requirement for process_madviseSuren Baghdasaryan1-1/+12
process_madvise currently requires ptrace attach capability. PTRACE_MODE_ATTACH gives one process complete control over another process. It effectively removes the security boundary between the two processes (in one direction). Granting ptrace attach capability even to a system process is considered dangerous since it creates an attack surface. This severely limits the usage of this API. The operations process_madvise can perform do not affect the correctness of the operation of the target process; they only affect where the data is physically located (and therefore, how fast it can be accessed). What we want is the ability for one process to influence another process in order to optimize performance across the entire system while leaving the security boundary intact. Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata and CAP_SYS_NICE for influencing process performance. Link: https://lkml.kernel.org/r/20210303185807.2160264-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Jann Horn <jannh@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tim Murray <timmurray@google.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <stable@vger.kernel.org> [5.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13include/linux/sched/mm.h: use rcu_dereference in in_vfork()Matthew Wilcox (Oracle)1-1/+2
Fix a sparse warning by using rcu_dereference(). Technically this is a bug and a sufficiently aggressive compiler could reload the `real_parent' pointer outside the protection of the rcu lock (and access freed memory), but I think it's pretty unlikely to happen. Link: https://lkml.kernel.org/r/20210221194207.1351703-1-willy@infradead.org Fixes: b18dc5f291c0 ("mm, oom: skip vforked tasks from being selected") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13kfence: fix reports if constant function prefixes existMarco Elver1-6/+12
Some architectures prefix all functions with a constant string ('.' on ppc64). Add ARCH_FUNC_PREFIX, which may optionally be defined in <asm/kfence.h>, so that get_stack_skipnr() can work properly. Link: https://lkml.kernel.org/r/f036c53d-7e81-763c-47f4-6024c6c5f058@csgroup.eu Link: https://lkml.kernel.org/r/20210304144000.1148590-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocationsMarco Elver1-1/+1
cache_alloc_debugcheck_after() performs checks on an object, including adjusting the returned pointer. None of this should apply to KFENCE objects. While for non-bulk allocations, the checks are skipped when we allocate via KFENCE, for bulk allocations cache_alloc_debugcheck_after() is called via cache_alloc_debugcheck_after_bulk(). Fix it by skipping cache_alloc_debugcheck_after() for KFENCE objects. Link: https://lkml.kernel.org/r/20210304205256.2162309-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13kfence: fix printk format for ptrdiff_tMarco Elver1-6/+6
Use %td for ptrdiff_t. Link: https://lkml.kernel.org/r/3abbe4c9-16ad-c168-a90f-087978ccd8f7@csgroup.eu Link: https://lkml.kernel.org/r/20210303121157.3430807-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Jann Horn <jannh@google.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*Arnd Bergmann1-0/+6
Separating compiler-clang.h from compiler-gcc.h inadventently dropped the definitions of the three HAVE_BUILTIN_BSWAP macros, which requires falling back to the open-coded version and hoping that the compiler detects it. Since all versions of clang support the __builtin_bswap interfaces, add back the flags and have the headers pick these up automatically. This results in a 4% improvement of compilation speed for arm defconfig. Note: it might also be worth revisiting which architectures set CONFIG_ARCH_USE_BUILTIN_BSWAP for one compiler or the other, today this is set on six architectures (arm32, csky, mips, powerpc, s390, x86), while another ten architectures define custom helpers (alpha, arc, ia64, m68k, mips, nios2, parisc, sh, sparc, xtensa), and the rest (arm64, h8300, hexagon, microblaze, nds32, openrisc, riscv) just get the unoptimized version and rely on the compiler to detect it. A long time ago, the compiler builtins were architecture specific, but nowadays, all compilers that are able to build the kernel have correct implementations of them, though some may not be as optimized as the inline asm versions. The patch that dropped the optimization landed in v4.19, so as discussed it would be fairly safe to backport this revert to stable kernels to the 4.19/5.4/5.10 stable kernels, but there is a remaining risk for regressions, and it has no known side-effects besides compile speed. Link: https://lkml.kernel.org/r/20210226161151.2629097-1-arnd@kernel.org Link: https://lore.kernel.org/lkml/20210225164513.3667778-1-arnd@kernel.org/ Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Miguel Ojeda <ojeda@kernel.org> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Guo Ren <guoren@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Marco Elver <elver@google.com> Cc: Arvind Sankar <nivedita@alum.mit.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13MAINTAINERS: exclude uapi directories in API/ABI sectionVlastimil Babka1-2/+2
Commit 7b4693e644cb ("MAINTAINERS: add uapi directories to API/ABI section") added include/uapi/ and arch/*/include/uapi/ so that patches modifying them CC linux-api. However that was already done in the past and resulted in too much noise and thus later removed, as explained in b14fd334ff3d ("MAINTAINERS: trim the file triggers for ABI/API") To prevent another round of addition and removal in the future, change the entries to X: (explicit exclusion) for documentation purposes, although they are not subdirectories of broader included directories, as there is apparently no defined way to add plain comments in subsystem sections. Link: https://lkml.kernel.org/r/20210301100255.25229-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> Acked-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13binfmt_misc: fix possible deadlock in bm_register_writeLior Ribak1-15/+14
There is a deadlock in bm_register_write: First, in the begining of the function, a lock is taken on the binfmt_misc root inode with inode_lock(d_inode(root)). Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call open_exec on the user-provided interpreter. open_exec will call a path lookup, and if the path lookup process includes the root of binfmt_misc, it will try to take a shared lock on its inode again, but it is already locked, and the code will get stuck in a deadlock To reproduce the bug: $ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register backtrace of where the lock occurs (#5): 0 schedule () at ./arch/x86/include/asm/current.h:15 1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992 2 0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213 3 __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222 4 down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355 5 0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783 6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177 7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366 8 0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396 9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913 10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948 11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682 12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF ", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603 13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF ", count=19) at fs/read_write.c:658 14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46 15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120 To solve the issue, the open_exec call is moved to before the write lock is taken by bm_register_write Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers") Signed-off-by: Lior Ribak <liorribak@gmail.com> Acked-by: Helge Deller <deller@gmx.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>