linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2020-03-24	i2c: stm32f7: switch to I²C generic property parsing	Andy Shevchenko	1	-31/+26
	Switch to the new generic functions: i2c_parse_fw_timings(). While here, replace hard coded values with standard bus frequency definitions. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Alain Volmat <alain.volmat@st.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-24	i2c: rcar: Consolidate timings calls in rcar_i2c_clock_calculate()	Andy Shevchenko	1	-9/+9
	Move i2c_parse_fw_timings() to rcar_i2c_clock_calculate() to consolidate timings calls in one place. While here, replace hard coded values with standard bus frequency definitions. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-24	i2c: core: Allow override timing properties with 0	Andy Shevchenko	2	-14/+18
	Some drivers may allow to override properties with 0 value when defaults are not in use, thus, replace memset() with corresponding per property update. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-24	i2c: core: Provide generic definitions for bus frequencies	Andy Shevchenko	3	-5/+13
	There are few maximum bus frequencies being used in the I²C core code. Provide generic definitions for bus frequencies and use them in the core. The drivers may use predefined constants where it is appropriate. Some of them are already using these under slightly different names. We will convert them later to use newly introduced defines. Note, the name of modes are chosen to follow well established naming scheme [1]. These definitions will also help to avoid typos in the numbers that may lead to subtle errors. [1]: https://en.wikipedia.org/wiki/I%C2%B2C#Differences_between_modes Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-22	i2c: mxs: Use dma_request_chan() instead dma_request_slave_channel()	Peter Ujfalusi	1	-3/+3
	dma_request_slave_channel() is a wrapper on top of dma_request_chan() eating up the error code. By using dma_request_chan() directly the driver can support deferred probing against DMA. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-21	i2c: imx: remove duplicate print after platform_get_irq()	Tang Bin	1	-3/+1
	We don't need dev_err() message because when something goes wrong, platform_get_irq() has print an error message itself, so we should remove duplicate dev_err(). Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-21	i2c: designware: Fix spelling typos in the comments	Andy Shevchenko	5	-8/+8
	Fix spelling typos in the comments with help of `codespell`. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-21	i2c: designware: Discard i2c_dw_read_comp_param() function	Serge Semin	2	-7/+0
	There is no code left in the kernel which would be using the function. So just remove it. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-21	i2c: designware: Detect the FIFO size in the common code	Serge Semin	5	-24/+27
	The problem with detecting the FIFO depth in the platform driver is that in order to implement this we have to access the controller IC_COMP_PARAM_1 register. Currently it's done before the i2c_dw_set_reg_access() method execution, which is errors prone since the method determines the registers endianness and access mode and we can't use dw_readl/dw_writel accessors before this information is retrieved. We also can't move the i2c_dw_set_reg_access() function invocation to after the master/slave probe functions call (when endianness and access mode are determined), since the FIFO depth information is used by them for initializations. So in order to fix the problem we have no choice but to move the FIFO size detection methods to the common code and call it at the probe stage. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-20	i2c: dev: Fix the race between the release of i2c_dev and cdev	Kevin Hao	1	-22/+26
	The struct cdev is embedded in the struct i2c_dev. In the current code, we would free the i2c_dev struct directly in put_i2c_dev(), but the cdev is manged by a kobject, and the release of it is not predictable. So it is very possible that the i2c_dev is freed before the cdev is entirely released. We can easily get the following call trace with CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled. ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38 WARNING: CPU: 19 PID: 1 at lib/debugobjects.c:325 debug_print_object+0xb0/0xf0 Modules linked in: CPU: 19 PID: 1 Comm: swapper/0 Tainted: G W 5.2.20-yocto-standard+ #120 Hardware name: Marvell OcteonTX CN96XX board (DT) pstate: 80c00089 (Nzcv daIf +PAN +UAO) pc : debug_print_object+0xb0/0xf0 lr : debug_print_object+0xb0/0xf0 sp : ffff00001292f7d0 x29: ffff00001292f7d0 x28: ffff800b82151788 x27: 0000000000000001 x26: ffff800b892c0000 x25: ffff0000124a2558 x24: 0000000000000000 x23: ffff00001107a1d8 x22: ffff0000116b5088 x21: ffff800bdc6afca8 x20: ffff000012471ae8 x19: ffff00001168f2c8 x18: 0000000000000010 x17: 00000000fd6f304b x16: 00000000ee79de43 x15: ffff800bc0e80568 x14: 79616c6564203a74 x13: 6e6968207473696c x12: 5f72656d6974203a x11: ffff0000113f0018 x10: 0000000000000000 x9 : 000000000000001f x8 : 0000000000000000 x7 : ffff0000101294cc x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000001 x3 : 00000000ffffffff x2 : 0000000000000000 x1 : 387fc15c8ec0f200 x0 : 0000000000000000 Call trace: debug_print_object+0xb0/0xf0 __debug_check_no_obj_freed+0x19c/0x228 debug_check_no_obj_freed+0x1c/0x28 kfree+0x250/0x440 put_i2c_dev+0x68/0x78 i2cdev_detach_adapter+0x60/0xc8 i2cdev_notifier_call+0x3c/0x70 notifier_call_chain+0x8c/0xe8 blocking_notifier_call_chain+0x64/0x88 device_del+0x74/0x380 device_unregister+0x54/0x78 i2c_del_adapter+0x278/0x2d0 unittest_i2c_bus_remove+0x3c/0x80 platform_drv_remove+0x30/0x50 device_release_driver_internal+0xf4/0x1c0 driver_detach+0x58/0xa0 bus_remove_driver+0x84/0xd8 driver_unregister+0x34/0x60 platform_driver_unregister+0x20/0x30 of_unittest_overlay+0x8d4/0xbe0 of_unittest+0xae8/0xb3c do_one_initcall+0xac/0x450 do_initcall_level+0x208/0x224 kernel_init_freeable+0x2d8/0x36c kernel_init+0x18/0x108 ret_from_fork+0x10/0x1c irq event stamp: 3934661 hardirqs last enabled at (3934661): [<ffff00001009fa04>] debug_exception_exit+0x4c/0x58 hardirqs last disabled at (3934660): [<ffff00001009fb14>] debug_exception_enter+0xa4/0xe0 softirqs last enabled at (3934654): [<ffff000010081d94>] __do_softirq+0x46c/0x628 softirqs last disabled at (3934649): [<ffff0000100b4a1c>] irq_exit+0x104/0x118 This is a common issue when using cdev embedded in a struct. Fortunately, we already have a mechanism to solve this kind of issue. Please see commit 233ed09d7fda ("chardev: add helper function to register char devs with a struct device") for more detail. In this patch, we choose to embed the struct device into the i2c_dev, and use the API provided by the commit 233ed09d7fda to make sure that the release of i2c_dev and cdev are in sequence. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-13	i2c: qcom-geni: Drop of_platform.h include	Stephen Boyd	1	-1/+0
	This driver doesn't call any DT platform functions like of_platform_*(). Remove the include as it isn't used. Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-13	i2c: qcom-geni: Grow a dev pointer to simplify code	Stephen Boyd	1	-31/+26
	Some lines are long here. Use a struct dev pointer to shorten lines and simplify code. The clk_get() call can fail because of EPROBE_DEFER problems too, so just remove the error print message because it isn't useful. Finally, platform_get_irq() already prints an error so just remove that error message. Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-13	i2c: qcom-geni: Let firmware specify irq trigger flags	Stephen Boyd	1	-2/+2
	We don't need to force IRQF_TRIGGER_HIGH here as the DT or ACPI tables should take care of this for us. Just use 0 instead so that we use the flags from the firmware. Also, remove specify dev_name() for the irq name so that we can get better information in /proc/interrupts about which device is generating interrupts. Cc: Alok Chauhan <alokc@codeaurora.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-13	i2c: stm32f7: do not backup read-only PECR register	Alain Volmat	1	-4/+0
	The PECR register provides received packet computed PEC value. It makes no sense restoring its value after a reset, and anyway, as read-only register it cannot be restored. Fixes: ea6dd25deeb5 ("i2c: stm32f7: add PM_SLEEP suspend/resume support") Signed-off-by: Alain Volmat <alain.volmat@st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: smbus: remove outdated references to irq level triggers	Wolfram Sang	2	-10/+0
	IRQ levels are now handled within the IRQ core. Remove the forgotten references from the documentation. Fixes: 9b9f2b8bc2ac ("i2c: i2c-smbus: Use threaded irq for smbalert") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: convert SMBus alert setup function to return an ERRPTR	Wolfram Sang	7	-27/+35
	Only few drivers use this call, so drivers and I2C core are converted at once with this patch. By simply using i2c_new_client_device() instead of i2c_new_device(), we easily can return an ERRPTR for this function as well. To make out of tree users aware that something changed, the function is renamed to i2c_new_smbus_alert_device(). Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: stm32f7: add a new st, stm32mp15-i2c compatible	Alain Volmat	1	-6/+33
	Add a new stm32mp15 specific compatible to handle FastMode+ registers handling which is different on the stm32mp15 compared to the stm32f7 or stm32h7. Indeed, on the stm32mp15, the FastMode+ set and clear registers are separated while on the other platforms (F7 or H7) the control is done in a unique register. Signed-off-by: Alain Volmat <alain.volmat@st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: stm32f7: disable/restore Fast Mode Plus bits in low power modes	Alain Volmat	1	-13/+39
	Defer the initial enabling of the Fast Mode Plus bits after the stm32f7_i2c_setup_timing call in probe function in order to avoid enabling them if speed is downgraded. Clear & restore the Fast Mode Plus bits in the suspend/resume handlers of the driver. Signed-off-by: Alain Volmat <alain.volmat@st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: brcmstb: Support BCM2711 HDMI BSC controllers	Maxime Ripard	1	-0/+33
	The HDMI blocks in the BCM2771 have an i2c controller to retrieve the EDID. This block is split into two parts, the BSC and the AUTO_I2C, lying in two separate register areas. The AUTO_I2C block has a mailbox-like interface and will take away the BSC control from the CPU if enabled. However, the BSC is the actually the same controller than the one supported by the brcmstb driver, and the AUTO_I2C doesn't really bring any immediate benefit. Let's use the BSC then, but let's also tie the AUTO_I2C registers with a separate compatible so that we can enable AUTO_I2C if needed in the future. The AUTO_I2C is enabled by default at boot though, so we first need to release the BSC from the AUTO_I2C control. Cc: Kamal Dasu <kdasu.kdev@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: bcm-kernel-feedback-list@broadcom.com Cc: linux-i2c@vger.kernel.org Signed-off-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	dt-bindings: i2c: brcmstb: Add BCM2711 BSC/AUTO-I2C binding	Maxime Ripard	1	-1/+39
	The HDMI blocks in the BCM2771 have an i2c controller to retrieve the EDID. This block is split into two parts, the BSC and the AUTO_I2C, lying in two separate register areas. The AUTO_I2C block has a mailbox-like interface and will take away the BSC control from the CPU if enabled. However, the BSC is the actually the same controller than the one supported by the brcmstb driver, and the AUTO_I2C doesn't really bring any immediate benefit. We can model it in the DT as a single device with two register range, which will allow us to use or or the other in the driver without changing anything in the DT. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	dt-bindings: i2c: brcmstb: Convert the BRCMSTB binding to a schema	Maxime Ripard	3	-27/+60
	Switch the DT binding to a YAML schema to enable the DT validation. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: omap: use devm_platform_ioremap_resource()	chenqiwu	1	-3/+1
	Use a new API devm_platform_ioremap_resource() to simplify code. Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> Tested-by: Luca Ceresoli <luca@lucaceresoli.net> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: use kobj_to_dev() API	chenqiwu	1	-2/+2
	Use kobj_to_dev() API instead of container_of(). Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-10	i2c: powermac: correct comment about custom handling	Wolfram Sang	1	-8/+7
	The comment had some flaws which are now fixed: - the prefix is 'MAC' not 'AAPL' - no kernel coding style and too short length - 'we do' instead of 'we to' Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-02-26	i2c: dev: keep sorting of includes	Wolfram Sang	1	-1/+1
	Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-02-26	i2c: stm32f7: allow controller to be wakeup-source	Alain Volmat	1	-21/+86
	Allow the i2c-stm32f7 controller to become a wakeup-source of the system. In such case, when a slave is registered to the I2C controller, receiving a I2C message targeting that registered slave address wakes up the suspended system. In order to be able to wake-up, the I2C controller DT node must have the property wakeup-source defined and a slave must be registered. Signed-off-by: Alain Volmat <alain.volmat@st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-02-26	i2c: imx: implement master_xfer_atomic callback	Stefan Lengfeld	1	-41/+105
	Rework the read and write code paths in the driver to support operation in atomic contexts. To achieve this, the driver must not rely on IRQs and not call schedule(), e.g. via a sleep routine, in these cases. With this patch the driver supports normal operation, DMA transfers and now the polling mode or also called sleep-free or IRQ-less operation. It makes the code not simpler or easier to read, but atomic I2C transfers are needed on some hardware configurations, e.g. to trigger reboots on an external PMIC chip. Signed-off-by: Stefan Lengfeld <contact@stefanchrist.eu> [m.felsch@pengutronix.de: integrate https://patchwork.ozlabs.org/patch/1085943/ review feedback] [m.felsch@pengutronix.de: adapt commit message] Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Lengfeld <contact@stefanchrist.eu> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-02-26	i2c: at91: implement i2c bus recovery	Kamel Bouhara	2	-0/+82
	Implement i2c bus recovery when slaves devices might hold SDA low. In this case re-assign SCL/SDA to gpios and issue 9 dummy clock pulses until the slave release SDA. Signed-off-by: Kamel Bouhara <kamel.bouhara@bootlin.com> Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-02-26	dt-bindings: i2c: at91: document optional bus recovery properties	Kamel Bouhara	1	-0/+10
	The at91 I2C controller can support bus recovery by re-assigning SCL and SDA to gpios. Add the optional pinctrl and gpio properties to do so. Signed-off-by: Kamel Bouhara <kamel.bouhara@bootlin.com> [codrin.ciubotariu@microchip.com: rebased] Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-02-23	Linux 5.6-rc3	Linus Torvalds	1	-1/+1

2020-02-23	csky: Replace <linux/clk-provider.h> by <linux/of_clk.h>	Geert Uytterhoeven	1	-1/+1
	The C-Sky platform code is not a clock provider, and just needs to call of_clk_init(). Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-22	io_uring: fix __io_iopoll_check deadlock in io_sq_thread	Xiaoguang Wang	1	-18/+9
	Since commit a3a0e43fd770 ("io_uring: don't enter poll loop if we have CQEs pending"), if we already events pending, we won't enter poll loop. In case SETUP_IOPOLL and SETUP_SQPOLL are both enabled, if app has been terminated and don't reap pending events which are already in cq ring, and there are some reqs in poll_list, io_sq_thread will enter __io_iopoll_check(), and find pending events, then return, this loop will never have a chance to exit. I have seen this issue in fio stress tests, to fix this issue, let io_sq_thread call io_iopoll_getevents() with argument 'min' being zero, and remove __io_iopoll_check(). Fixes: a3a0e43fd770 ("io_uring: don't enter poll loop if we have CQEs pending") Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-21	ext4: fix mount failure with quota configured as module	Jan Kara	1	-1/+1
	When CONFIG_QFMT_V2 is configured as a module, the test in ext4_feature_set_ok() fails and so mount of filesystems with quota or project features fails. Fix the test to use IS_ENABLED macro which works properly even for modules. Link: https://lore.kernel.org/r/20200221100835.9332-1-jack@suse.cz Fixes: d65d87a07476 ("ext4: improve explanation of a mount failure caused by a misconfigured kernel") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-02-21	jbd2: fix ocfs2 corrupt when clearing block group bits	wangyan	1	-2/+6
	I found a NULL pointer dereference in ocfs2_block_group_clear_bits(). The running environment: kernel version: 4.19 A cluster with two nodes, 5 luns mounted on two nodes, and do some file operations like dd/fallocate/truncate/rm on every lun with storage network disconnection. The fallocate operation on dm-23-45 caused an null pointer dereference. The information of NULL pointer dereference as follows: [577992.878282] JBD2: Error -5 detected when updating journal superblock for dm-23-45. [577992.878290] Aborting journal on device dm-23-45. ... [577992.890778] JBD2: Error -5 detected when updating journal superblock for dm-24-46. [577992.890908] __journal_remove_journal_head: freeing b_committed_data [577992.890916] (fallocate,88392,52):ocfs2_extend_trans:474 ERROR: status = -30 [577992.890918] __journal_remove_journal_head: freeing b_committed_data [577992.890920] (fallocate,88392,52):ocfs2_rotate_tree_right:2500 ERROR: status = -30 [577992.890922] __journal_remove_journal_head: freeing b_committed_data [577992.890924] (fallocate,88392,52):ocfs2_do_insert_extent:4382 ERROR: status = -30 [577992.890928] (fallocate,88392,52):ocfs2_insert_extent:4842 ERROR: status = -30 [577992.890928] __journal_remove_journal_head: freeing b_committed_data [577992.890930] (fallocate,88392,52):ocfs2_add_clusters_in_btree:4947 ERROR: status = -30 [577992.890933] __journal_remove_journal_head: freeing b_committed_data [577992.890939] __journal_remove_journal_head: freeing b_committed_data [577992.890949] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 [577992.890950] Mem abort info: [577992.890951] ESR = 0x96000004 [577992.890952] Exception class = DABT (current EL), IL = 32 bits [577992.890952] SET = 0, FnV = 0 [577992.890953] EA = 0, S1PTW = 0 [577992.890954] Data abort info: [577992.890955] ISV = 0, ISS = 0x00000004 [577992.890956] CM = 0, WnR = 0 [577992.890958] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000f8da07a9 [577992.890960] [0000000000000020] pgd=0000000000000000 [577992.890964] Internal error: Oops: 96000004 [#1] SMP [577992.890965] Process fallocate (pid: 88392, stack limit = 0x00000000013db2fd) [577992.890968] CPU: 52 PID: 88392 Comm: fallocate Kdump: loaded Tainted: G W OE 4.19.36 #1 [577992.890969] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019 [577992.890971] pstate: 60400009 (nZCv daif +PAN -UAO) [577992.891054] pc : _ocfs2_free_suballoc_bits+0x63c/0x968 [ocfs2] [577992.891082] lr : _ocfs2_free_suballoc_bits+0x618/0x968 [ocfs2] [577992.891084] sp : ffff0000c8e2b810 [577992.891085] x29: ffff0000c8e2b820 x28: 0000000000000000 [577992.891087] x27: 00000000000006f3 x26: ffffa07957b02e70 [577992.891089] x25: ffff807c59d50000 x24: 00000000000006f2 [577992.891091] x23: 0000000000000001 x22: ffff807bd39abc30 [577992.891093] x21: ffff0000811d9000 x20: ffffa07535d6a000 [577992.891097] x19: ffff000001681638 x18: ffffffffffffffff [577992.891098] x17: 0000000000000000 x16: ffff000080a03df0 [577992.891100] x15: ffff0000811d9708 x14: 203d207375746174 [577992.891101] x13: 73203a524f525245 x12: 20373439343a6565 [577992.891103] x11: 0000000000000038 x10: 0101010101010101 [577992.891106] x9 : ffffa07c68a85d70 x8 : 7f7f7f7f7f7f7f7f [577992.891109] x7 : 0000000000000000 x6 : 0000000000000080 [577992.891110] x5 : 0000000000000000 x4 : 0000000000000002 [577992.891112] x3 : ffff000001713390 x2 : 2ff90f88b1c22f00 [577992.891114] x1 : ffff807bd39abc30 x0 : 0000000000000000 [577992.891116] Call trace: [577992.891139] _ocfs2_free_suballoc_bits+0x63c/0x968 [ocfs2] [577992.891162] _ocfs2_free_clusters+0x100/0x290 [ocfs2] [577992.891185] ocfs2_free_clusters+0x50/0x68 [ocfs2] [577992.891206] ocfs2_add_clusters_in_btree+0x198/0x5e0 [ocfs2] [577992.891227] ocfs2_add_inode_data+0x94/0xc8 [ocfs2] [577992.891248] ocfs2_extend_allocation+0x1bc/0x7a8 [ocfs2] [577992.891269] ocfs2_allocate_extents+0x14c/0x338 [ocfs2] [577992.891290] __ocfs2_change_file_space+0x3f8/0x610 [ocfs2] [577992.891309] ocfs2_fallocate+0xe4/0x128 [ocfs2] [577992.891316] vfs_fallocate+0x11c/0x250 [577992.891317] ksys_fallocate+0x54/0x88 [577992.891319] __arm64_sys_fallocate+0x28/0x38 [577992.891323] el0_svc_common+0x78/0x130 [577992.891325] el0_svc_handler+0x38/0x78 [577992.891327] el0_svc+0x8/0xc My analysis process as follows: ocfs2_fallocate __ocfs2_change_file_space ocfs2_allocate_extents ocfs2_extend_allocation ocfs2_add_inode_data ocfs2_add_clusters_in_btree ocfs2_insert_extent ocfs2_do_insert_extent ocfs2_rotate_tree_right ocfs2_extend_rotate_transaction ocfs2_extend_trans jbd2_journal_restart jbd2__journal_restart /* handle->h_transaction is NULL, * is_handle_aborted(handle) is true / handle->h_transaction = NULL; start_this_handle return -EROFS; ocfs2_free_clusters _ocfs2_free_clusters _ocfs2_free_suballoc_bits ocfs2_block_group_clear_bits ocfs2_journal_access_gd __ocfs2_journal_access jbd2_journal_get_undo_access / I think jbd2_write_access_granted() will * return true, because do_get_write_access() * will return -EROFS. / if (jbd2_write_access_granted(...)) return 0; do_get_write_access / handle->h_transaction is NULL, it will * return -EROFS here, so do_get_write_access() * was not called. / if (is_handle_aborted(handle)) return -EROFS; / bh2jh(group_bh) is NULL, caused NULL pointer dereference / undo_bg = (struct ocfs2_group_desc ) bh2jh(group_bh)->b_committed_data; If handle->h_transaction == NULL, then jbd2_write_access_granted() does not really guarantee that journal_head will stay around, not even speaking of its b_committed_data. The bh2jh(group_bh) can be removed after ocfs2_journal_access_gd() and before call "bh2jh(group_bh)->b_committed_data". So, we should move is_handle_aborted() check from do_get_write_access() into jbd2_journal_get_undo_access() and jbd2_journal_get_write_access() before the call to jbd2_write_access_granted(). Link: https://lore.kernel.org/r/f72a623f-b3f1-381a-d91d-d22a1c83a336@huawei.com Signed-off-by: Yan Wang <wangyan122@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org
2020-02-21	ext4: fix race between writepages and enabling EXT4_EXTENTS_FL	Eric Biggers	2	-9/+23
	If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running on it, the following warning in ext4_add_complete_io() can be hit: WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120 Here's a minimal reproducer (not 100% reliable) (root isn't required): while true; do sync done & while true; do rm -f file touch file chattr -e file echo X >> file chattr +e file done The problem is that in ext4_writepages(), ext4_should_dioread_nolock() (which only returns true on extent-based files) is checked once to set the number of reserved journal credits, and also again later to select the flags for ext4_map_blocks() and copy the reserved journal handle to ext4_io_end::handle. But if EXT4_EXTENTS_FL is being concurrently set, the first check can see dioread_nolock disabled while the later one can see it enabled, causing the reserved handle to unexpectedly be NULL. Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races related to doing so as well, fix this by synchronizing changing EXT4_EXTENTS_FL with ext4_writepages() via the existing s_writepages_rwsem (previously called s_journal_flag_rwsem). This was originally reported by syzbot without a reproducer at https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf, but now that dioread_nolock is the default I also started seeing this when running syzkaller locally. Link: https://lore.kernel.org/r/20200219183047.47417-3-ebiggers@kernel.org Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org
2020-02-21	ext4: rename s_journal_flag_rwsem to s_writepages_rwsem	Eric Biggers	3	-11/+11
	In preparation for making s_journal_flag_rwsem synchronize ext4_writepages() with changes to both the EXTENTS and JOURNAL_DATA flags (rather than just JOURNAL_DATA as it does currently), rename it to s_writepages_rwsem. Link: https://lore.kernel.org/r/20200219183047.47417-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org
2020-02-21	ext4: fix potential race between s_flex_groups online resizing and access	Suraj Jitindar Singh	5	-37/+76
	During an online resize an array of s_flex_groups structures gets replaced so it can get enlarged. If there is a concurrent access to the array and this memory has been reused then this can lead to an invalid memory access. The s_flex_group array has been converted into an array of pointers rather than an array of structures. This is to ensure that the information contained in the structures cannot get out of sync during a resize due to an accessor updating the value in the old structure after it has been copied but before the array pointer is updated. Since the structures them- selves are no longer copied but only the pointers to them this case is mitigated. Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443 Link: https://lore.kernel.org/r/20200221053458.730016-4-tytso@mit.edu Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-02-21	MAINTAINERS: use tabs for SAFESETID	Randy Dunlap	1	-4/+4
	Use tabs for indentation instead of spaces for SAFESETID. All (!) other entries in MAINTAINERS use tabs (according to my simple grepping). Link: http://lkml.kernel.org/r/2bb2e52a-2694-816d-57b4-6cabfadd6c1a@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Micah Morton <mortonm@chromium.org> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	lib/stackdepot.c: fix global out-of-bounds in stack_slabs	Alexander Potapenko	1	-2/+6
	Walter Wu has reported a potential case in which init_stack_slab() is called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been initialized. In that case init_stack_slab() will overwrite stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in a memory corruption. Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com Fixes: cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Walter Wu <walter-zh.wu@mediatek.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM	Wei Yang	1	-1/+1
	When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page() doesn't work before sparse_init_one_section() is called. This leads to a crash when hotplug memory: BUG: unable to handle page fault for address: 0000000006400000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G W 5.5.0-next-20200205+ #343 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:__memset+0x24/0x30 Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87 RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000 RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000 RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000 R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280 FS: 0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: sparse_add_section+0x1c9/0x26a __add_pages+0xbf/0x150 add_pages+0x12/0x60 add_memory_resource+0xc8/0x210 __add_memory+0x62/0xb0 acpi_memory_device_add+0x13f/0x300 acpi_bus_attach+0xf6/0x200 acpi_bus_scan+0x43/0x90 acpi_device_hotplug+0x275/0x3d0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x1a7/0x370 worker_thread+0x30/0x380 kthread+0x112/0x130 ret_from_fork+0x35/0x40 We should use memmap as it did. On x86 the impact is limited to x86_32 builds, or x86_64 configurations that override the default setting for SPARSEMEM_VMEMMAP. Other memory hotplug archs (arm64, ia64, and ppc) also default to SPARSEMEM_VMEMMAP=y. [dan.j.williams@intel.com: changelog update] {rppt@linux.ibm.com: changelog update] Link: http://lkml.kernel.org/r/20200219030454.4844-1-bhe@redhat.com Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	mm/vmscan.c: don't round up scan size for online memory cgroup	Gavin Shan	1	-3/+6
	Commit 68600f623d69 ("mm: don't miss the last page because of round-off error") makes the scan size round up to @denominator regardless of the memory cgroup's state, online or offline. This affects the overall reclaiming behavior: the corresponding LRU list is eligible for reclaiming only when its size logically right shifted by @sc->priority is bigger than zero in the former formula. For example, the inactive anonymous LRU list should have at least 0x4000 pages to be eligible for reclaiming when we have 60/12 for swappiness/priority and without taking scan/rotation ratio into account. After the roundup is applied, the inactive anonymous LRU list becomes eligible for reclaiming when its size is bigger than or equal to 0x1000 in the same condition. (0x4000 >> 12) * 60 / (60 + 140 + 1) = 1 ((0x1000 >> 12) * 60) + 200) / (60 + 140 + 1) = 1 aarch64 has 512MB huge page size when the base page size is 64KB. The memory cgroup that has a huge page is always eligible for reclaiming in that case. The reclaiming is likely to stop after the huge page is reclaimed, meaing the further iteration on @sc->priority and the silbing and child memory cgroups will be skipped. The overall behaviour has been changed. This fixes the issue by applying the roundup to offlined memory cgroups only, to give more preference to reclaim memory from offlined memory cgroup. It sounds reasonable as those memory is unlikedly to be used by anyone. The issue was found by starting up 8 VMs on a Ampere Mustang machine, which has 8 CPUs and 16 GB memory. Each VM is given with 2 vCPUs and 2GB memory. It took 264 seconds for all VMs to be completely up and 784MB swap is consumed after that. With this patch applied, it took 236 seconds and 60MB swap to do same thing. So there is 10% performance improvement for my case. Note that KSM is disable while THP is enabled in the testing. total used free shared buff/cache available Mem: 16196 10065 2049 16 4081 3749 Swap: 8175 784 7391 total used free shared buff/cache available Mem: 16196 11324 3656 24 1215 2936 Swap: 8175 60 8115 Link: http://lkml.kernel.org/r/20200211024514.8730-1-gshan@redhat.com Fixes: 68600f623d69 ("mm: don't miss the last page because of round-off error") Signed-off-by: Gavin Shan <gshan@redhat.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: <stable@vger.kernel.org> [4.20+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	lib/string.c: update match_string() doc-strings with correct behavior	Alexandru Ardelean	1	-0/+16
	There were a few attempts at changing behavior of the match_string() helpers (i.e. 'match_string()' & 'sysfs_match_string()'), to change & extend the behavior according to the doc-string. But the simplest approach is to just fix the doc-strings. The current behavior is fine as-is, and some bugs were introduced trying to fix it. As for extending the behavior, new helpers can always be introduced if needed. The match_string() helpers behave more like 'strncmp()' in the sense that they go up to n elements or until the first NULL element in the array of strings. This change updates the doc-strings with this info. Link: http://lkml.kernel.org/r/20200213072722.8249-1-alexandru.ardelean@analog.com Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: "Tobin C . Harding" <tobin@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	mm/memcontrol.c: lost css_put in memcg_expand_shrinker_maps()	Vasily Averin	1	-1/+3
	for_each_mem_cgroup() increases css reference counter for memory cgroup and requires to use mem_cgroup_iter_break() if the walk is cancelled. Link: http://lkml.kernel.org/r/c98414fb-7e1f-da0f-867a-9340ec4bd30b@virtuozzo.com Fixes: 0a4465d34028 ("mm, memcg: assign memcg-aware shrinkers bitmap to memcg") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	mm/swapfile.c: fix a comment in sys_swapon()	Christoph Hellwig	1	-1/+1
	claim_swapfile now always takes i_rwsem. Link: http://lkml.kernel.org/r/20200114161225.309792-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	scripts/get_maintainer.pl: deprioritize old Fixes: addresses	Douglas Anderson	1	-4/+4
	Recently, I found that get_maintainer was causing me to send emails to the old addresses for maintainers. Since I usually just trust the output of get_maintainer to know the right email address, I didn't even look carefully and fired off two patch series that went to the wrong place. Oops. The problem was introduced recently when trying to add signatures from Fixes. The problem was that these email addresses were added too early in the process of compiling our list of places to send. Things added to the list earlier are considered more canonical and when we later added maintainer entries we ended up deduplicating to the old address. Here are two examples using mainline commits (to make it easier to replicate) for the two maintainers that I messed up recently: $ git format-patch d8549bcd0529~..d8549bcd0529 $ ./scripts/get_maintainer.pl 0001-clk-Add-clk_hw.patch \| grep Boyd Stephen Boyd <sboyd@codeaurora.org>... $ git format-patch 6d1238aa3395~..6d1238aa3395 $ ./scripts/get_maintainer.pl 0001-arm64-dts-qcom-qcs404.patch \| grep Andy Andy Gross <andy.gross@linaro.org> Let's move the adding of addresses from Fixes: to the end since the email addresses from these are much more likely to be older. After this patch the above examples get the right addresses for the two examples. Link: http://lkml.kernel.org/r/20200127095001.1.I41fba9f33590bfd92cd01960161d8384268c6569@changeid Fixes: 2f5bd343694e ("scripts/get_maintainer.pl: add signatures from Fixes: <badcommit> lines in commit message") Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Joe Perches <joe@perches.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Andy Gross <agross@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	get_maintainer: remove uses of P: for maintainer name	Joe Perches	1	-24/+0
	Commit 1ca84ed6425f ("MAINTAINERS: Reclaim the P: tag for Maintainer Entry Profile") changed the use of the "P:" tag from "Person" to "Profile (ie: special subsystem coding styles and characteristics)" Change how get_maintainer.pl parses the "P:" tag to match. Link: http://lkml.kernel.org/r/ca53823fc5d25c0be32ad937d0207a0589c08643.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Dan Williams <dan.j.william@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	selftests/vm: add missed tests in run_vmtests	SeongJae Park	1	-0/+33
	The commits introducing 'mlock-random-test'[1], 'map_fiex_noreplace'[2], and 'thuge-gen'[3] have not added those in the 'run_vmtests' script and thus the 'run_tests' command of kselftests doesn't run those. This commit adds those in the script. 'gup_benchmark' and 'transhuge-stress' are also not included in the 'run_vmtests', but this commit does not add those because those are for performance measurement rather than pass/fail tests. [1] commit 26b4224d9961 ("selftests: expanding more mlock selftest") [2] commit 91cbacc34512 ("tools/testing/selftests/vm/map_fixed_noreplace.c: add test for MAP_FIXED_NOREPLACE") [3] commit fcc1f2d5dd34 ("selftests: add a test program for variable huge page sizes in mmap/shmget") Link: http://lkml.kernel.org/r/20200206085144.29126-1-sj38.park@gmail.com Signed-off-by: SeongJae Park <sjpark@amazon.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap	Christian Borntraeger	1	-2/+2
	QEMU has a funny new build error message when I use the upstream kernel headers: CC block/file-posix.o In file included from /home/cborntra/REPOS/qemu/include/qemu/timer.h:4, from /home/cborntra/REPOS/qemu/include/qemu/timed-average.h:29, from /home/cborntra/REPOS/qemu/include/block/accounting.h:28, from /home/cborntra/REPOS/qemu/include/block/block_int.h:27, from /home/cborntra/REPOS/qemu/block/file-posix.c:30: /usr/include/linux/swab.h: In function `__swab': /home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:34: error: "sizeof" is not defined, evaluates to 0 [-Werror=undef] 20 \| #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE) \| ^~~~~~ /home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:41: error: missing binary operator before token "(" 20 \| #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE) \| ^ cc1: all warnings being treated as errors make: *** [/home/cborntra/REPOS/qemu/rules.mak:69: block/file-posix.o] Error 1 rm tests/qemu-iotests/socket_scm_helper.o This was triggered by commit d5767057c9a ("uapi: rename ext2_swab() to swab() and share globally in swab.h"). That patch is doing #include <asm/bitsperlong.h> but it uses BITS_PER_LONG. The kernel file asm/bitsperlong.h provide only __BITS_PER_LONG. Let us use the __ variant in swap.h Link: http://lkml.kernel.org/r/20200213142147.17604-1-borntraeger@de.ibm.com Fixes: d5767057c9a ("uapi: rename ext2_swab() to swab() and share globally in swab.h") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Allison Randal <allison@lohutok.net> Cc: Joe Perches <joe@perches.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: William Breathitt Gray <vilhelm.gray@gmail.com> Cc: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"	Ioanna Alifieraki	1	-4/+2
	This reverts commit a97955844807e327df11aa33869009d14d6b7de0. Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") removes a lock that is needed. This leads to a process looping infinitely in exit_sem() and can also lead to a crash. There is a reproducer available in [1] and with the commit reverted the issue does not reproduce anymore. Using the reproducer found in [1] is fairly easy to reach a point where one of the child processes is looping infinitely in exit_sem between for(;;) and if (semid == -1) block, while it's trying to free its last sem_undo structure which has already been freed by freeary(). Each sem_undo struct is on two lists: one per semaphore set (list_id) and one per process (list_proc). The list_id list tracks undos by semaphore set, and the list_proc by process. Undo structures are removed either by freeary() or by exit_sem(). The freeary function is invoked when the user invokes a syscall to remove a semaphore set. During this operation freeary() traverses the list_id associated with the semaphore set and removes the undo structures from both the list_id and list_proc lists. For this case, exit_sem() is called at process exit. Each process contains a struct sem_undo_list (referred to as "ulp") which contains the head for the list_proc list. When the process exits, exit_sem() traverses this list to remove each sem_undo struct. As in freeary(), whenever a sem_undo struct is removed from list_proc, it is also removed from the list_id list. Removing elements from list_id is safe for both exit_sem() and freeary() due to sem_lock(). Removing elements from list_proc is not safe; freeary() locks &un->ulp->lock when it performs list_del_rcu(&un->list_proc) but exit_sem() does not (locking was removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"). This can result in the following situation while executing the reproducer [1] : Consider a child process in exit_sem() and the parent in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)). - The list_proc for the child contains the last two undo structs A and B (the rest have been removed either by exit_sem() or freeary()). - The semid for A is 1 and semid for B is 2. - exit_sem() removes A and at the same time freeary() removes B. - Since A and B have different semid sem_lock() will acquire different locks for each process and both can proceed. The bug is that they remove A and B from the same list_proc at the same time because only freeary() acquires the ulp lock. When exit_sem() removes A it makes ulp->list_proc.next to point at B and at the same time freeary() removes B setting B->semid=-1. At the next iteration of for(;;) loop exit_sem() will try to remove B. The only way to break from for(;;) is for (&un->list_proc == &ulp->list_proc) to be true which is not. Then exit_sem() will check if B->semid=-1 which is and will continue looping in for(;;) until the memory for B is reallocated and the value at B->semid is changed. At that point, exit_sem() will crash attempting to unlink B from the lists (this can be easily triggered by running the reproducer [1] a second time). To prove this scenario instrumentation was added to keep information about each sem_undo (un) struct that is removed per process and per semaphore set (sma). CPU0 CPU1 [caller holds sem_lock(sma for A)] ... freeary() exit_sem() ... ... ... sem_lock(sma for B) spin_lock(A->ulp->lock) ... list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc) Undo structures A and B have different semid and sem_lock() operations proceed. However they belong to the same list_proc list and they are removed at the same time. This results into ulp->list_proc.next pointing to the address of B which is already removed. After reverting commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") the issue was no longer reproducible. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779 Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Herton R. Krzesinski <herton@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: <malat@debian.org> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jay Vosburgh <jay.vosburgh@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-21	y2038: hide timeval/timespec/itimerval/itimerspec types	Arnd Bergmann	2	-10/+14
	There are no in-kernel users remaining, but there may still be users that include linux/time.h instead of sys/time.h from user space, so leave the types available to user space while hiding them from kernel space. Only the __kernel_old_* versions of these types remain now. Link: http://lkml.kernel.org/r/20200110154232.4104492-4-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>