linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2019-02-24	i2c: ocores: turn incomplete kdoc into a comment	Wolfram Sang	1	-3/+3
	gcc complains, rightfully so, I think: drivers/i2c/busses/i2c-ocores.c:32: warning: Cannot understand * @process_lock: protect I2C transfer process. on line 32 - I thought it was a doc line Make it a simple comment. Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-23	i2c: designware: Do not allow i2c_dw_xfer() calls while suspended	Hans de Goede	4	-1/+17
	On most Intel Bay- and Cherry-Trail systems the PMIC is connected over I2C and the PMIC is accessed through various means by the _PS0 and _PS3 ACPI methods (power on / off methods) of various devices. This leads to suspend/resume ordering problems where a device may be resumed and get its _PS0 method executed before the I2C controller is resumed. On Cherry Trail this leads to errors like these:
	i2c_designware 808622C1:06: controller timed out
	ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion]
	ACPI Error: Method parse/execution failed \_SB.P18W._ON, AE_ERROR
	video LNXVIDEO:00: Failed to change power state to D0
	But on Bay Trail this caused I2C reads to seem to succeed, but they end up returning wrong data, which ends up getting written back by the typical read-modify-write cycle done to turn on various power-resources. Debugging the problems caused by this silent data corruption is quite nasty. This commit adds a check which disallows i2c_dw_xfer() calls to happen until the controller's resume method has completed. Which turns the silent data corruption into getting these errors in dmesg instead:
	i2c_designware 80860F41:04: Error i2c_dw_xfer call while suspended
	ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion]
	ACPI Error: Method parse/execution failed \_SB.PCI0.GFX0._PS0, AE_ERROR
	Which is much better. Note the above errors are an example of issues which this patch will help to debug, the actual fix requires fixing the suspend order and this has been fixed by a different commit. Note the setting / clearing of the suspended flag in the suspend / resume methods is NOT protected by i2c_lock_bus(). This is intentional as these methods get called from i2c_dw_xfer() (through pm_runtime_get/put) and i2c_dw_xfer() is called with the i2c_bus_lock held, so otherwise we would deadlock.
This means that there is a theoretical race between a non runtime suspend and the suspended check in i2c_dw_xfer(), this is not a problem since normally we should not hit the race and this check is primarily a debugging tool so hitting the check if there are suspend/resume ordering problems does not need to be 100% reliable. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
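The check described above can be sketched in plain userspace C. This is a minimal model, not the driver's code: the struct, the function names and the choice of -ESHUTDOWN as the error code are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the controller state. */
struct dw_i2c_sketch {
	bool suspended;	/* set by suspend, cleared once resume has completed */
};

/* Stand-in for the real transfer; it only reports how many messages ran. */
static int sketch_do_transfer(struct dw_i2c_sketch *dev, int num_msgs)
{
	(void)dev;
	return num_msgs;
}

/* Refuse transfers while the controller is marked suspended. */
static int sketch_xfer(struct dw_i2c_sketch *dev, int num_msgs)
{
	if (dev->suspended)
		return -ESHUTDOWN;	/* assumed error code */
	return sketch_do_transfer(dev, num_msgs);
}

/* The flag is flipped without taking any bus lock, as the message explains. */
static void sketch_suspend(struct dw_i2c_sketch *dev) { dev->suspended = true; }
static void sketch_resume(struct dw_i2c_sketch *dev) { dev->suspended = false; }
```

A caller that hits the guard gets an explicit error instead of a transfer that silently returns wrong data.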
2019-02-23	i2c: tegra: Only display error messages if DMA setup fails	Jonathan Hunter	1	-4/+6
	Commit 86c92b9965ff ("i2c: tegra: Add DMA support") added DMA support to the Tegra I2C driver for Tegra devices that support the APB DMA controller. One side-effect of this change is that even for Tegra devices that do not have an APB DMA controller and hence, cannot support DMA transfers for I2C transactions, the following error messages are still displayed ... ERR KERN tegra-i2c 31c0000.i2c: cannot use DMA: -19 ERR KERN tegra-i2c 31c0000.i2c: falling back to PIO There is no point displaying the above messages for devices that do not have an APB DMA controller and so fix this by returning from the tegra_i2c_init_dma() function if 'has_apb_dma' is not true. Furthermore, if CONFIG_TEGRA20_APB_DMA is not set, then rather than printing an error message, print a debug message as for whatever reason this could be intentional. Fixes: 86c92b9965ff ("i2c: tegra: Add DMA support") Signed-off-by: Jonathan Hunter <jonathanh@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-23	i2c: gpio: fault-injector: add 'inject_panic' injector	Wolfram Sang	2	-1/+56
	Add a fault injector simulating a Kernel panic happening after starting a transfer. Read the docs for its usage. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-23	i2c: gpio: fault-injector: add 'lose_arbitration' injector	Wolfram Sang	2	-0/+99
	Add a fault injector simulating 'arbitration lost' from multi-master setups. Read the docs for its usage. A helper function for future fault injectors using SCL interrupts is created to achieve this. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-23	i2c: tegra: remove multi-master support	Sowjanya Komatineni	1	-2/+2
	Multi-master support is defeatured on Tegra210 and Tegra186 due to known bugs. This patch removes multi-master support for Tegra210 and Tegra186 I2C HW feature. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-23	i2c: tegra: remove master fifo support on tegra186	Sowjanya Komatineni	1	-1/+1
	Tegra186 does not have master FIFO control register and instead uses FIFO control register like prior Tegra chipset. This patch fixes this and prevents crashing during boot when accessing FIFO control registers. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-15	i2c: tegra: change phrasing, "fallbacking" to "falling back"	Colin Ian King	1	-2/+2
	The phrasing in two dev_err messages is using "fallbacking", which is less understandable than "falling back", so fix this up. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-15	i2c: expand minor range when registering chrdev region	Chengguang Xu	1	-1/+1
	Actually, total amount of available minor number for a single major is MINORMASK + 1. So expand minor range when registering chrdev region. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> [wsa: fixed typo in commit message] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
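The arithmetic behind "MINORMASK + 1" can be checked with a short sketch. The two macro values below are assumed to mirror the kernel's kdev_t.h definitions (20 minor bits); the helper name is hypothetical.

```c
/* Assumed to match the kernel's definitions in include/linux/kdev_t.h. */
#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)

/* Minor numbers run from 0 to MINORMASK inclusive, so the count is one more. */
static unsigned int sketch_minors_per_major(void)
{
	return MINORMASK + 1;
}
```

Registering only MINORMASK minors would leave the highest minor number out of the region, which is the off-by-one the commit addresses.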
2019-02-15	i2c: aspeed: Add multi-master use case support	Jae Hyun Yoo	1	-26/+93
	In a multi-master environment, this driver's master cannot know exactly when a peer master sends data to this driver's slave, so it can happen that this master tries sending data through the master_xfer function while slave data from a peer master is still being processed, or that a slave xfer is started by a peer immediately after it queues a master command. To support multi-master use cases properly, this H/W provides arbitration at the physical level and also provides priority-based command handling to avoid conflicts in a multi-master environment, meaning that if a master event and a slave event happen at the same time, the H/W will handle the higher-priority event first and the pending event will be handled when the bus comes back to the idle state. To support this H/W feature properly, this patch adds the 'pending' state of master and its handling code so that the pending master xfer can be continued properly after the slave operation. Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: core-smbus: don't trace smbus_reply data on errors	John Sperbeck	2	-4/+4
	If an smbus transfer fails, there's no guarantee that the output buffer was written. So, avoid trying to show the output buffer when tracing after an error. This was 'mostly harmless', but would trip up kasan checking if left-over cruft in byte 0 is a large length, causing us to read from unwritten memory. Signed-off-by: John Sperbeck <jsperbeck@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: ocores: Add support for bus clock via platform data	Andrew Lunn	2	-1/+5
	Add the I2C bus clock speed to the platform data structure. If not set, default to 100KHz as before. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: ocores: Add support for IO mapper registers.	Andrew Lunn	1	-3/+29
	Some implementations of the OCORES i2c bus master use IO mapped registers. Add support for getting the IO registers from the platform data, and register accessor functions. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: ocores: checkpatch fixes	Federico Vaga	1	-11/+18
	Miscellaneous style fixes from checkpatch Signed-off-by: Federico Vaga <federico.vaga@cern.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: ocores: add SPDX tag	Federico Vaga	2	-8/+2
	It adds the SPDX tag and it removes the old text about the GPLv2. Signed-off-by: Federico Vaga <federico.vaga@cern.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: ocores: add polling interface	Federico Vaga	1	-21/+161
	This driver assumes that an interrupt line is always available for the I2C master. This is not always the case and this patch adds support for a polling version. Report from Andrew Lunn: I did some timing tests for this. On my box, we request a udelay of 80uS. The kernel actually delays for about 79uS. We then spin in ocores_wait() for an additional 10-11uS, which is 3 to 4 iterations. There are actually 9 bits on the wire, not 8, since there is an ACK/NACK bit after the actual data transfer. So I changed the delay to (9 * 1000) / i2c->bus_clock_khz. That resulted in ocores_wait() mostly not looping at all. But for reading a 4K AT24 EEPROM, it increased the read time by 10ms, from 424ms to 434ms. So we should probably keep with 8. Signed-off-by: Federico Vaga <federico.vaga@cern.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
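The delay figures in the report follow from the formula quoted in it, (bits * 1000) / bus_clock_khz. A small sketch, with a hypothetical helper name, reproduces them:

```c
/*
 * Time in microseconds to clock `bits` bits at `bus_clock_khz` kHz,
 * using the integer formula from the report: (bits * 1000) / khz.
 */
static unsigned int sketch_byte_time_us(unsigned int bits,
					unsigned int bus_clock_khz)
{
	return (bits * 1000) / bus_clock_khz;
}
```

At 100 kHz this gives 80 us for 8 bits and 90 us for 9 bits, matching the requested udelay of 80uS and the roughly 10uS of extra spinning that the report observed.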
2019-02-14	i2c: ocores: do not handle IRQ if IF is not set	Federico Vaga	1	-3/+6
	If the Interrupt Flag (IF) is not set, we should not handle the IRQ: - the line can be shared with other devices - it can be a spurious interrupt To avoid reading twice the status register, the ocores_process() function expects it to be read by the caller. Signed-off-by: Federico Vaga <federico.vaga@cern.ch> Acked-by: Peter Korsgaard <peter@korsgaard.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
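The pattern above, reading the status once and bailing out when IF is clear, can be sketched as follows. All names are hypothetical and the bit position of IF is an assumption; the status byte is passed in to stand for the single register read done by the caller.

```c
/* Assumed bit position of the Interrupt Flag in the status register. */
#define SKETCH_STAT_IF	0x01

enum sketch_irqreturn { SKETCH_IRQ_NONE, SKETCH_IRQ_HANDLED };

/* Hypothetical controller state; counts how often processing ran. */
struct sketch_i2c {
	int processed;
};

/* Stand-in for the processing step; it receives the status already read. */
static void sketch_process(struct sketch_i2c *i2c, unsigned char stat)
{
	(void)stat;
	i2c->processed++;
}

/* Interrupt handler model: ignore the IRQ unless IF is set. */
static enum sketch_irqreturn sketch_isr(struct sketch_i2c *i2c,
					unsigned char stat)
{
	if (!(stat & SKETCH_STAT_IF))
		return SKETCH_IRQ_NONE;	/* shared line or spurious interrupt */

	sketch_process(i2c, stat);
	return SKETCH_IRQ_HANDLED;
}
```

Returning a "not mine" result lets other handlers on a shared line claim the interrupt.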
2019-02-14	i2c: ocores: stop transfer on timeout	Federico Vaga	1	-9/+45
	Detecting a timeout is ok, but we also need to assert a STOP command on the bus in order to prevent it from generating interrupts when there are no ongoing transfers. Example: very long transmission.
	1. ocores_xfer: START a transfer
	2. ocores_isr : handle byte by byte the transfer
	3. ocores_xfer: goes in timeout [[bugfix here]]
	4. ocores_xfer: return to I2C subsystem and to the I2C driver
	5. I2C driver : it may clean up the i2c_msg memory
	6. ocores_isr : receives another interrupt (pending bytes to be transferred) but the i2c_msg memory is invalid now
	So, since the transfer was too long, we have to detect the timeout and STOP the transfer. Another point is that we have a critical region here. When handling the timeout condition we may have a running IRQ handler. For this reason I introduce a spinlock. To make the locking easier to understand I have:
	- added a new function to handle timeout
	- modified the current ocores_process() function in order to be protected by the new spinlock
	Like this it is obvious at first sight that this locking serializes the execution of ocores_process() and ocores_process_timeout() Signed-off-by: Federico Vaga <federico.vaga@cern.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: tegra: add i2c interface timing support	Sowjanya Komatineni	1	-30/+159
	This patch adds I2C interface timing registers support for proper bus rate configuration along with meeting the I2C spec setup and hold times based on the tuning performed on Tegra210, Tegra186 and Tegra194 platforms. I2C_INTERFACE_TIMING_0 register contains TLOW and THIGH field and Tegra I2C controller design uses them as a part of internal clock divisor. I2C_INTERFACE_TIMING_1 register contains the setup and hold times for start and stop conditions. Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: tegra: update transfer timeout	Sowjanya Komatineni	1	-4/+11
	Tegra194 allows a maximum of 64K bytes, and Tegra186 and prior allow a maximum of 4K bytes, of transfer per packet. A one-second timeout is not enough for transfers of more than 10K bytes at the STD bus rate. This patch updates the I2C transfer timeout based on the transfer size and I2C bus rate to allow enough time for the max transfer size at lower bus speeds. Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
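A rough estimate shows why a fixed one-second timeout falls short. The sketch below is not the driver's formula: it assumes 9 bit times per byte (8 data bits plus ACK/NACK), ignores start/stop and header overhead, and uses a hypothetical helper name.

```c
/*
 * Estimated time on the wire in milliseconds, rounded up, assuming 9 bit
 * times per byte and no other overhead. Both assumptions are illustrative.
 */
static unsigned long long sketch_wire_time_ms(unsigned long long bytes,
					      unsigned long long bus_rate_hz)
{
	unsigned long long bit_ms = bytes * 9ULL * 1000ULL;

	return (bit_ms + bus_rate_hz - 1) / bus_rate_hz;
}
```

Under these assumptions, 10K bytes at 100 kHz already takes about 922 ms and 12K bytes takes about 1106 ms, so a timeout that scales with size and bus rate is needed for the larger packets.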
2019-02-14	i2c: tegra: Add DMA support	Sowjanya Komatineni	1	-49/+383
	This patch adds DMA support for Tegra I2C. Tegra I2C TX and RX FIFO depth is 8 words. PIO mode is used for transfer sizes up to the max FIFO depth and DMA mode is used for transfer sizes higher than the max FIFO depth to save CPU overhead. PIO mode needs full intervention of the CPU to fill or empty FIFOs and also needs to service multiple data request interrupts for the same transaction. This adds delay between data bytes of the same transfer when the CPU is fully loaded, and some slave devices have an internal timeout for no bus activity and stop the transaction to avoid a bus hang. DMA mode is helpful in such cases. DMA mode is also helpful for large transfers when downloading or uploading FW over I2C to some external devices. Tegra210 and prior Tegra chips use the APBDMA driver, which is replaced with GPCDMA on Tegra186 and Tegra194. This patch uses the has_apb_dma flag in hw_feature to differentiate the DMA driver change between Tegra chipsets. The APBDMA driver is registered at module-init level and this patch also has a change to register the I2C driver at module-init level rather than subsys-init to avoid deferring I2C probe till the APBDMA driver is registered. Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
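The PIO/DMA choice described above reduces to a size comparison against the FIFO capacity. In this sketch the 4-byte word size is an assumption (giving 32 bytes for 8 words), packet header accounting is left out, and the names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_FIFO_DEPTH_WORDS	8	/* from the commit message */
#define SKETCH_BYTES_PER_WORD	4	/* assumed word size */
#define SKETCH_PIO_MAX_BYTES	(SKETCH_FIFO_DEPTH_WORDS * SKETCH_BYTES_PER_WORD)

/* Use DMA only when the transfer does not fit in the FIFO in one go. */
static bool sketch_use_dma(size_t xfer_bytes)
{
	return xfer_bytes > SKETCH_PIO_MAX_BYTES;
}
```

Small transfers stay in PIO mode, where DMA setup cost would outweigh any saving; larger ones go to DMA to avoid repeated FIFO-service interrupts.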
2019-02-14	i2c: tegra: update maximum transfer size	Sowjanya Komatineni	1	-2/+6
	Tegra194 supports maximum 64K bytes per packet including 12 bytes of packet header irrespective of PIO or DMA mode transfer. This patch updates Tegra194 max write length to account for packet header size for transfers. Cc: stable@vger.kernel.org # 4.20+ Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	i2c: tegra: fix maximum transfer size	Sowjanya Komatineni	1	-1/+1
	Tegra186 and prior supports maximum 4K bytes per packet transfer including 12 bytes of packet header. This patch fixes max write length limit to account packet header size for transfers. Cc: stable@vger.kernel.org # 4.4+ Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
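The limit follows from subtracting the packet header from the per-packet maximum. A minimal sketch, with hypothetical names and the 4K and 12-byte figures taken from the message above:

```c
#define SKETCH_PACKET_HEADER_SIZE	12	/* bytes, from the commit message */

/* Largest payload that still fits in one packet once the header is counted. */
static unsigned int sketch_max_payload(unsigned int max_packet_bytes)
{
	return max_packet_bytes - SKETCH_PACKET_HEADER_SIZE;
}
```

For a 4K packet this yields 4084 bytes of payload; allowing a full 4096-byte write would overrun the packet by the size of the header.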
2019-02-14	i2c: tegra: add bus clear Master Support	Sowjanya Komatineni	1	-0/+81
	Bus clear feature of Tegra I2C controller helps to recover from bus hang when I2C master loses the bus arbitration due to the slave device holding SDA LOW continuously for some unknown reasons. Per I2C specification, the device that held the bus LOW should release it within 9 clock pulses. During bus clear operation, Tegra I2C controller sends 9 clock pulses and terminates the transaction with STOP condition. Upon successful bus clear operation, bus goes to idle state and driver retries the transaction. Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
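The recovery sequence can be modelled with a toy slave that releases SDA after seeing some number of clock pulses. This is only an illustration of the idea; the Tegra controller performs the pulses in hardware, and every name below is hypothetical.

```c
#include <stdbool.h>

#define SKETCH_BUS_CLEAR_PULSES	9	/* pulse count from the commit message */

/* Toy model of a slave that holds SDA low until it has seen enough clocks. */
struct sketch_stuck_slave {
	unsigned int pulses_needed;
	unsigned int pulses_seen;
};

static bool sketch_sda_is_low(const struct sketch_stuck_slave *s)
{
	return s->pulses_seen < s->pulses_needed;
}

/* Send the fixed burst of clock pulses; report whether SDA was released. */
static bool sketch_bus_clear(struct sketch_stuck_slave *s)
{
	for (unsigned int i = 0; i < SKETCH_BUS_CLEAR_PULSES; i++)
		s->pulses_seen++;

	return !sketch_sda_is_low(s);
}
```

When the function reports success the bus is idle again and the original transaction can be retried; a slave that needs more than 9 pulses is outside what this recovery can handle.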
2019-02-14	i2c: tegra: sort all the include headers alphabetically	Sowjanya Komatineni	1	-11/+8
	This patch sorts all the include headers alphabetically for the I2C Tegra driver. Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-14	eeprom: at24: implement support for 'num-addresses' property	Bartosz Golaszewski	1	-5/+8
	If the device node defines 'num-addresses', let it override the default behavior. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-02-14	dt-bindings: at24: add the 'num-addresses' property	Bartosz Golaszewski	1	-0/+3
	Currently the at24 driver only creates additional i2c dummies for atmel,24c00 and it's hard-coded. Some other chips (like for example Microchip's 24AA02T) also take more slave addresses despite being otherwise compatible with already supported variants. Add a new property to the device tree binding document that defines the total number of i2c slave addresses taken by the device. The addresses are counted starting from the one in the reg property. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-02-14	eeprom: at24: remove at24_platform_data	Bartosz Golaszewski	3	-148/+75
	There are no more users of at24_platform_data. Remove the relevant header and modify the driver code to not use it anymore. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-02-10	Linux 5.0-rc6	Linus Torvalds	1	-1/+1

2019-02-10	x86/mm: Make set_pmd_at() paravirt aware	Juergen Gross	1	-1/+1
	set_pmd_at() calls native_set_pmd() unconditionally on x86. This was fine as long as only huge page entries were written via set_pmd_at(), as Xen pv guests don't support those. Commit 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") introduced a usage of set_pmd_at() possible on pv guests, leading to failures like: BUG: unable to handle kernel paging request at ffff888023e26778 #PF error: [PROT] [WRITE] RIP: e030:move_page_tables+0x7c1/0xae0 move_vma.isra.3+0xd1/0x2d0 __se_sys_mremap+0x3c6/0x5b0 do_syscall_64+0x49/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Make set_pmd_at() paravirt aware by just letting it use set_pmd(). Fixes: 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Cc: sstabellini@kernel.org Cc: hpa@zytor.com Cc: bp@alien8.de Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
2019-02-08	i2c: rcar: refactor TCYC handling	Wolfram Sang	1	-9/+6
	The latest documentation made it clear that we need to initialize the TCYC value independently of DMA. The old code used TCYC06 (wrongly) for non-DMA transfers. The new code sets TCYC up independently from DMA. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	i2c: gpio: merge two very similar comments	Wolfram Sang	1	-12/+5
	I think it is clear enough if we have the explanation once and make it clear it is applicable for both SCL and SDA. Reword it a little with the help of Simon's native language skills :) Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	i2c: sh_mobile: use new clock calculation formulas for Gen2	Wolfram Sang	1	-5/+5
	We measured the clock on a Lager and an Ebisu board. The new formula gives better results for both. So after Gen3, switch to this formula for all Gen2 SoCs. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	i2c: sh_mobile: use new clock calculation formulas for Gen3	Wolfram Sang	1	-3/+3
	We could finally measure the clock on an Ebisu board. The new formula gives way better results, i.e. 100kHz instead of 106kHz and 400kHz instead of 387kHz. Switch to these formulas for all Gen3 SoCs. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	i2c: sh_mobile: sort compatible entries	Wolfram Sang	1	-2/+2
	Makes it easier to add new ones. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	i2c: cbus-gpio: Switch to use GPIO descriptors	Linus Walleij	3	-85/+40
	This augments the CBUS GPIO I2C driver to use GPIO descriptors for clock, sel and data. We drop the platform data that was only used for carrying GPIO numbers and use machine descriptor tables instead. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	MAINTAINERS: Update the ocores i2c bus driver maintainer, etc	Andrew Lunn	1	-0/+2
	The listed maintainer has not been responding to emails for a while. Add myself as a second maintainer. Add the platform data include file, which was not listed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-02-08	blk-mq: remove duplicated definition of blk_mq_freeze_queue	Liu Bo	1	-1/+0
	As the prototype has been defined in "include/linux/blk-mq.h", the one in "block/blk-mq.h" can be removed then. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-08	Blk-iolatency: warn on negative inflight IO counter	Liu Bo	1	-1/+3
	This is to catch any unexpected negative value of inflight IO counter. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-08	blk-iolatency: fix IO hang due to negative inflight counter	Liu Bo	1	-7/+45
	Our test reported the following stack, and vmcore showed that ->inflight counter is -1.
	[ffffc9003fcc38d0] __schedule at ffffffff8173d95d
	[ffffc9003fcc3958] schedule at ffffffff8173de26
	[ffffc9003fcc3970] io_schedule at ffffffff810bb6b6
	[ffffc9003fcc3988] blkcg_iolatency_throttle at ffffffff813911cb
	[ffffc9003fcc3a20] rq_qos_throttle at ffffffff813847f3
	[ffffc9003fcc3a48] blk_mq_make_request at ffffffff8137468a
	[ffffc9003fcc3b08] generic_make_request at ffffffff81368b49
	[ffffc9003fcc3b68] submit_bio at ffffffff81368d7d
	[ffffc9003fcc3bb8] ext4_io_submit at ffffffffa031be00 [ext4]
	[ffffc9003fcc3c00] ext4_writepages at ffffffffa03163de [ext4]
	[ffffc9003fcc3d68] do_writepages at ffffffff811c49ae
	[ffffc9003fcc3d78] __filemap_fdatawrite_range at ffffffff811b6188
	[ffffc9003fcc3e30] filemap_write_and_wait_range at ffffffff811b6301
	[ffffc9003fcc3e60] ext4_sync_file at ffffffffa030cee8 [ext4]
	[ffffc9003fcc3ea8] vfs_fsync_range at ffffffff8128594b
	[ffffc9003fcc3ee8] do_fsync at ffffffff81285abd
	[ffffc9003fcc3f18] sys_fsync at ffffffff81285d50
	[ffffc9003fcc3f28] do_syscall_64 at ffffffff81003c04
	[ffffc9003fcc3f50] entry_SYSCALL_64_after_swapgs at ffffffff81742b8e
	The ->inflight counter may be negative (-1) if
	1) blk-iolatency was disabled when the IO was issued,
	2) blk-iolatency was enabled before this IO reached its endio,
	3) the ->inflight counter is decreased from 0 to -1 in endio()
	In fact the hang can be easily reproduced by the below script,
	H=/sys/fs/cgroup/unified/
	P=/sys/fs/cgroup/unified/test
	echo "+io" > $H/cgroup.subtree_control
	mkdir -p $P
	echo $$ > $P/cgroup.procs
	xfs_io -f -d -c "pwrite 0 4k" /dev/sdg
	echo "`cat /sys/block/sdg/dev` target=1000000" > $P/io.latency
	xfs_io -f -d -c "pwrite 0 4k" /dev/sdg
	This fixes the problem by freezing the queue so that while enabling/disabling iolatency, there is no inflight rq running. Note that quiesce_queue is not needed as this is only updating the iolatency configuration, which dispatching the request_queue doesn't care about.
	Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
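The three conditions listed in the message can be replayed with a toy model of the counter. The struct and functions are hypothetical and single-threaded; they only show how accounting that is enabled between issue and completion drives the count below zero.

```c
#include <stdbool.h>

/* Toy model of the accounting state; not the kernel's data structure. */
struct sketch_iolat {
	bool enabled;
	int inflight;
};

/* Issue path: the IO is only counted if accounting is enabled right now. */
static void sketch_issue(struct sketch_iolat *s)
{
	if (s->enabled)
		s->inflight++;
}

/* Completion path: decrements whenever accounting is enabled at endio time. */
static void sketch_endio(struct sketch_iolat *s)
{
	if (s->enabled)
		s->inflight--;
}
```

Issuing with accounting off, enabling it, and then completing the IO leaves the counter at -1. Freezing the queue around the enable/disable step, as the commit does, rules out that interleaving because no request is in flight while the setting changes.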
2019-02-08	MAINTAINERS: unify reference to xen-devel list	Lukas Bulwahn	1	-1/+1
	In the linux kernel MAINTAINERS file, largely "xen-devel@lists.xenproject.org (moderated for non-subscribers)" is used to refer to the xen-devel mailing list. The DRM DRIVERS FOR XEN section entry mentions xen-devel@lists.xen.org instead, but that is just the same mailing list as the mailing list above. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-02-08	x86/mm/cpa: Fix set_mce_nospec()	Peter Zijlstra	1	-25/+25
	The recent commit fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") accidentally made the call to make_addr_canonical_again() go away, which breaks set_mce_nospec(). Re-instate the call to convert the address back into canonical form right before invoking either CLFLUSH or INVLPG. Rename the function while at it to be shorter (and less MAGA). Fixes: fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Rik van Riel <riel@surriel.com> Link: https://lkml.kernel.org/r/20190208120859.GH32511@hirez.programming.kicks-ass.net
2019-02-08	futex: Handle early deadlock return correctly	Thomas Gleixner	2	-15/+50
	commit 56222b212e8e ("futex: Drop hb->lock before enqueueing on the rtmutex") changed the locking rules in the futex code so that the hash bucket lock is no longer held while the waiter is enqueued into the rtmutex wait list. This made the lock and the unlock path symmetric, but unfortunately the possible early exit from __rt_mutex_proxy_start() due to a detected deadlock was not updated accordingly. That allows a concurrent unlocker to observe inconsistent state which triggers the warning in the unlock path.
	futex_lock_pi()                         futex_unlock_pi()
	  lock(hb->lock)
	  queue(hb_waiter)                      lock(hb->lock)
	  lock(rtmutex->wait_lock)
	  unlock(hb->lock)
	                                        // acquired hb->lock
	                                        hb_waiter = futex_top_waiter()
	                                        lock(rtmutex->wait_lock)
	  __rt_mutex_proxy_start()
	     ---> fail
	          remove(rtmutex_waiter);
	     ---> returns -EDEADLOCK
	  unlock(rtmutex->wait_lock)
	                                        // acquired wait_lock
	                                        wake_futex_pi()
	                                        rt_mutex_next_owner()
	                                        --> returns NULL
	                                           --> WARN
	  lock(hb->lock)
	  unqueue(hb_waiter)
	The problem is caused by the remove(rtmutex_waiter) in the failure case of __rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the hash bucket but no waiter on the rtmutex, i.e. inconsistent state. The original commit handles this correctly for the other early return cases (timeout, signal) by delaying the removal of the rtmutex waiter until the returning task reacquired the hash bucket lock. Treat the failure case of __rt_mutex_proxy_start() in the same way and let the existing cleanup code handle the eventual handover of the rtmutex gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter removal for the failure case, so that the other callsites are still operating correctly. Add proper comments to the code so all these details are fully documented. Thanks to Peter for helping with the analysis and writing the really valuable code comments.
Fixes: 56222b212e8e ("futex: Drop hb->lock before enqueueing on the rtmutex") Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Stefan Liebler <stli@linux.ibm.com> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de
2019-02-08	futex: Fix barrier comment	Davidlohr Bueso	1	-2/+2
	The current comment for the barrier that guarantees that waiter increment is always before taking the hb spinlock (barrier (A)) needs to be fixed as it is misplaced. This is obviously referring to hb_waiters_inc, which is a full barrier. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190206185602.949-1-dave@stgolabs.net
2019-02-07	net: dsa: b53: Fix for failure when irq is not defined in dt	Arun Parameswaran	1	-3/+0
	Fixes the issues with non BCM58XX chips in the b53 driver failing, when the irq is not specified in the device tree. Removed the check for BCM58XX in b53_srab_prepare_irq(), so the 'port->irq' will be set to '-ENXIO' if the irq is not specified in the device tree. Fixes: 16994374a6fc ("net: dsa: b53: Make SRAB driver manage port interrupts") Fixes: b2ddc48a81b5 ("net: dsa: b53: Do not fail when IRQ are not initialized") Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07	blktrace: Show requests without sector	Jan Kara	1	-1/+7
	Currently, blktrace will not show requests that don't have any data as rq->__sector is initialized to -1 which is out of device range and thus discarded by act_log_check(). This is most notably the case for cache flush requests sent to the device. Fix the problem by making blk_rq_trace_sector() return 0 for requests without initialized sector. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
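The fix amounts to mapping the "no sector" sentinel to 0 before the value reaches the trace path. A userspace sketch with hypothetical types and names, covering only the uninitialized-sector case described above:

```c
#include <stdint.h>

typedef uint64_t sketch_sector_t;

/* Sentinel used for requests that carry no data, e.g. cache flushes. */
#define SKETCH_NO_SECTOR	((sketch_sector_t)-1)

struct sketch_request {
	sketch_sector_t sector;
};

/* Sector to report when tracing: 0 for requests without a valid sector. */
static sketch_sector_t sketch_trace_sector(const struct sketch_request *rq)
{
	if (rq->sector == SKETCH_NO_SECTOR)
		return 0;
	return rq->sector;
}
```

Reporting 0 keeps the value inside the device range, so a range check that would discard the sentinel lets the flush request through to the trace.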
2019-02-07	mips: cm: reprime error cause	Vladimir Kondratiev	1	-1/+1
	According to the documentation ---cut--- The GCR_ERROR_CAUSE.ERR_TYPE field and the GCR_ERROR_MULT.ERR_TYPE fields can be cleared by either a reset or by writing the current value of GCR_ERROR_CAUSE.ERR_TYPE to the GCR_ERROR_CAUSE.ERR_TYPE register. ---cut--- Do exactly this. Original value of cm_error may be safely written back; it clears error cause and keeps other bits untouched. Fixes: 3885c2b463f6 ("MIPS: CM: Add support for reporting CM cache errors") Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@linux.intel.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v4.3+
2019-02-07	mips: loongson64: remove unreachable(), fix loongson_poweroff().	Yifeng Li	1	-1/+6
	On my Yeeloong 8089, I noticed the machine fails to shut down properly, and often, the function mach_prepare_reboot() is unexpectedly executed, thus the machine reboots instead. A wait loop is needed to ensure the system is in a well-defined state before going down. In commit 997e93d4df16 ("MIPS: Hang more efficiently on halt/powerdown/restart"), a general superset of the wait loop for all platforms is already provided, so we don't need to implement our own. This commit simply removes the unreachable() compiler macro after mach_prepare_reboot(), thus allowing the execution of machine_hang(). My test shows that the machine is now able to shut down successfully. Please note that there are two different bugs preventing the machine from shutting down, another work-in-progress commit is needed to fix a lockup in cpufreq / i8259 driver, please read Reference, this commit does not fix that bug. Reference: https://lkml.org/lkml/2019/2/5/908 Signed-off-by: Yifeng Li <tomli@tomli.me> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Cc: Huacai Chen <chenhc@lemote.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: stable@vger.kernel.org # v4.17+
2019-02-07	sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()	Hangbin Liu	1	-1/+2
	If we disabled IPv6 from the kernel command line (ipv6.disable=1), we should not call ip6_err_gen_icmpv6_unreach(). This: ip link add sit1 type sit local 192.0.2.1 remote 192.0.2.2 ttl 1 ip link set sit1 up ip addr add 198.51.100.1/24 dev sit1 ping 198.51.100.2 if IPv6 is disabled at boot time, will crash the kernel. v2: there's no need to use in6_dev_get(), use __in6_dev_get() instead, as we only need to check that idev exists and we are under rcu_read_lock() (from netif_receive_skb_internal()). Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: ca15a078bd90 ("sit: generate icmpv6 error when receiving icmpv4 error") Cc: Oussama Ghorbel <ghorbel@pivasoftware.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07	geneve: should not call rt6_lookup() when ipv6 was disabled	Hangbin Liu	1	-3/+7
	When we add a new GENEVE device with IPv6 remote, checking only for IS_ENABLED(CONFIG_IPV6) is not enough as we may disable IPv6 in the kernel command line (ipv6.disable=1), and calling rt6_lookup() would cause a NULL pointer dereference. v2: - don't mix declarations and code (reported by Stefano Brivio, Eric Dumazet) - there's no need to use in6_dev_get() as we only need to check that idev exists (reported by David Ahern). This is under RTNL, so we can simply use __in6_dev_get() instead (Stefano, Eric). Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: c40e89fd358e9 ("geneve: configure MTU based on a lower device") Cc: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>