linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-11-01	ntb: idt: Add basic hwmon sysfs interface	Serge Semin	3	-1/+206
	IDT PCIe switches provide an embedded temperature sensor working within [0; 127.5]C with resolution of 0.5C. They also can generate a PCIe upstream interrupt in case if the temperature passes through specified thresholds. Since this thresholds interface is very broken the created hwmon-sysfs interface exposes only the next set of hwmon nodes: current input temperature, lowest and highest values measured, history resetting, value offset. HWmon alarm interface isn't provided. IDT PCIe switch also've got an ADC/filter settings of the sensor. This driver doesn't expose them to the hwmon-sysfs interface at the moment, except the offset node. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-11-01	ntb: idt: Alter temperature read method	Serge Semin	2	-17/+152
	In order to create a hwmon interface for the IDT PCIe-switch temperature sensor the already available reader method should be improved. Particularly we need to redesign it so one would be able to read temperature/offset values from registers of the passed types. Since IDT sensor interface provides temperature in unsigned format 0:7:1 (7 bits for real value and one for fraction) we also need to have helpers for the typical sysfs temperature data type conversion to and from this format. Even though the IDT PCIe-switch provided temperature offset got the same but signed type it can be translated by these methods too. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-11-01	compat: Cleanup in_compat_syscall() callers	Dmitry Safonov	1	-12/+4
	Now that in_compat_syscall() is consistent on all architectures and does not longer report true on native i686, the workarounds (ifdeffery and helpers) can be removed. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Lutomirsky <luto@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Stultz <john.stultz@linaro.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-efi@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/r/20181012134253.23266-3-dima@arista.com
2018-11-01	irqchip/irq-mvebu-sei: Fix a NULL vs IS_ERR() bug in probe function	Dan Carpenter	1	-2/+2
	The devm_ioremap_resource() function never returns NULL, it returns error pointers. Fixes: 61ce8d8d8a81 ("irqchip/irq-mvebu-sei: Add new driver for Marvell SEI") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Gregory Clement <gregory.clement@bootlin.com> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20181013102246.GD16086@mwanda
2018-10-31	net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules	Niklas Cassel	1	-1/+1
	When building stmmac, it is only possible to select CONFIG_DWMAC_GENERIC, or any of the glue drivers, when CONFIG_STMMAC_PLATFORM is set. The only exception is CONFIG_STMMAC_PCI. When calling of_mdiobus_register(), it will call our ->reset() callback, which is set to stmmac_mdio_reset(). Most of the code in stmmac_mdio_reset() is protected by a "#if defined(CONFIG_STMMAC_PLATFORM)", which will evaluate to false when CONFIG_STMMAC_PLATFORM=m. Because of this, the phy reset gpio will only be pulled when stmmac is built as built-in, but not when built as modules. Fix this by using "#if IS_ENABLED()" instead of "#if defined()". Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue	David S. Miller	14	-37/+76
	Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-10-31 This series contains a various collection of fixes. Miroslav Lichvar from Red Hat or should I say IBM now? Updates the PHC timecounter interval for igb so that it gets updated at least once every 550 seconds. Ngai-Mint provides a fix for fm10k to prevent a soft lockup or system crash by adding a new condition to determine if the SM mailbox is in the correct state before proceeding. Jake provides several fm10k fixes, first one marks complier aborts as non-fatal since on some platforms trigger machine check errors when the compile aborts. Added missing device ids to the in-kernel driver. Due to the recent fixes, bumped the driver version. I (Jeff Kirsher) fixed a XFRM_ALGO dependency for both ixgbe and ixgbevf. This fix was based on the original work from Arnd Bergmann, which only fixed ixgbe. Mitch provides a fix for i40e/avf to update the status codes, which resolves an issue between a mis-match between i40e and the iavf driver, which also supports the ice LAN driver. Radoslaw fixes the ixgbe where the driver is logging a message about spoofed packets detected when the VF is re-started with a different MAC address. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	ntb_netdev: Simplify remove with client device drvdata	Aaron Sierra	1	-25/+3
	Replace the elaborate private structure global linked-list used in ntb_netdev_probe() and ntb_netdev_remove() by stashing our private data in the NTB transport client device. Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-10-31	NTB: transport: Try harder to alloc an aligned MW buffer	Aaron Sierra	1	-23/+63
	Be a little wasteful if the (likely CMA) message window buffer is not suitably aligned after our first attempt; allocate a buffer twice as big as we need and manually align our MW buffer within it. This was needed on Intel Broadwell DE platforms with intel_iommu=off Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-10-31	ntb: ntb_transport: Mark expected switch fall-throughs	Gustavo A. R. Silva	1	-0/+2
	In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1373888 ("Missing break in switch") Addresses-Coverity-ID: 1373889 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Allen Hubbe <allenbh@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-10-31	ntb: idt: Set PCIe bus address to BARLIMITx	Serge Semin	1	-1/+1
	IDT NTB driver sets the upper limit of actual translation address being written to the corresponding memory window setup. It is achieved by BARLIMITx register initialization. Needless to say, that the register works within PCIe bus address space. In general CPU and PCIe address spaces are different. It means, that addresses used for Memory TLPs routine can be different from CPU addresses. While in most of cases they are the same, there are exceptions when the proper mapping must be performed to have the portable driver code. There used to be a virt_to_bus()/bus_to_virt() interface for this purpose. But it's deprecated now. It was also a mistake to use pci_resource_start() since the return address of the method is at the CPU address space. In order to achieve the desired purpose we need to use pci_bus_address() helper. This method shall return a PCIe bus base address of the corresponding BAR resource. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Allen Hubbe <allenbh@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-10-31	Merge tag 'tag-chrome-platform-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform	Linus Torvalds	7	-11/+179
	Pull chrome-platform updates from Benson Leung: - Move mfd/cros_ec_lpc* includes to drivers/platform from mfd - Adding a new interrupt path for cros_ec_lpc * tag 'tag-chrome-platform-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: platform/chrome: chromeos_tbmc - Remove unneeded const platform/chrome: Add a new interrupt path for cros_ec_lpc mfd: cros_ec: Fix and improve kerneldoc comments. platform/chrome: Move mfd/cros_ec_lpc* includes to drivers/platform.
2018-10-31	i2c: Clear client->irq in i2c_device_remove	Charles Keepax	1	-0/+2
	The IRQ will be mapped in i2c_device_probe only if client->irq is zero and i2c_device_remove does not clear this. When rebinding an I2C device, whos IRQ provider has also been rebound this means that an IRQ mapping will never be created, causing the I2C device to fail to acquire its IRQ. Fix this issue by clearing client->irq in i2c_device_remove, forcing i2c_device_probe to lookup the mapping again. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-10-31	i2c: Remove unnecessary call to irq_find_mapping	Charles Keepax	1	-4/+1
	irq_create_mapping calls irq_find_mapping internally and will use the found mapping if one exists, so there is no need to manually call this from i2c_smbus_host_notify_to_irq. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-10-31	Merge tag 'ceph-for-4.20-rc1' of git://github.com/ceph/ceph-client	Linus Torvalds	1	-12/+16
	Pull ceph updates from Ilya Dryomov: "The highlights are: - a series that fixes some old memory allocation issues in libceph (myself). We no longer allocate memory in places where allocation failures cannot be handled and BUG when the allocation fails. - support for copy_file_range() syscall (Luis Henriques). If size and alignment conditions are met, it leverages RADOS copy-from operation. Otherwise, a local copy is performed. - a patch that reduces memory requirement of ceph_sync_read() from the size of the entire read to the size of one object (Zheng Yan). - fallocate() syscall is now restricted to FALLOC_FL_PUNCH_HOLE (Luis Henriques)" * tag 'ceph-for-4.20-rc1' of git://github.com/ceph/ceph-client: (25 commits) ceph: new mount option to disable usage of copy-from op ceph: support copy_file_range file operation libceph: support the RADOS copy-from operation ceph: add non-blocking parameter to ceph_try_get_caps() libceph: check reply num_data_items in setup_request_data() libceph: preallocate message data items libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls libceph: introduce alloc_watch_request() libceph: assign cookies in linger_submit() libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get() ceph: num_ops is off by one in ceph_aio_retry_work() libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op() ceph: set timeout conditionally in __cap_delay_requeue libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist() libceph: introduce ceph_pagelist_alloc() libceph: osd_req_op_cls_init() doesn't need to take opcode libceph: bump CEPH_MSG_MAX_DATA_LEN ceph: only allow punch hole mode in fallocate ceph: refactor ceph_sync_read() ceph: check if LOOKUPNAME request was aborted when filling trace ...
2018-10-31	NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks	Gustavo A. R. Silva	1	-4/+4
	Both devm_kcalloc() and devm_kzalloc() return NULL on error. They never return error pointers. The use of IS_ERR_OR_NULL is currently applied to the wrong context. Fix this by replacing IS_ERR_OR_NULL with regular NULL checks. Fixes: bf2a952d31d2 ("NTB: Add IDT 89HPESxNTx PCIe-switches support") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-10-31	ntb: intel: fix return value for ndev_vec_mask()	Dave Jiang	1	-1/+1
	ndev_vec_mask() should be returning u64 mask value instead of int. Otherwise the mask value returned can be incorrect for larger vectors. Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Lucas Van <lucas.van@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2018-10-31	ntb_netdev: fix sleep time mismatch	Jon Mason	1	-1/+1
	The tx_time should be in usecs (according to the comment above the variable), but the setting of the timer during the rearming is done in msecs. Change it to match the expected units. Fixes: e74bfeedad08 ("NTB: Add flow control to the ntb_netdev") Suggested-by: Gerd W. Haeussler <gerd.haeussler@cesys-it.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Acked-by: Dave Jiang <dave.jiang@intel.com>
2018-10-31	mlxsw: spectrum: Set minimum shaper on MC TCs	Petr Machata	1	-0/+25
	An MC-aware mode was introduced in commit 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports"). In MC-aware mode, BUM traffic gets a special treatment by being assigned to a separate set of traffic classes 8..15. Pairs of TCs 0 and 8, 1 and 9, etc., are then configured to strictly prioritize the lower-numbered ones. The intention is to prevent BUM traffic from flooding the switch and push out all UC traffic, which would otherwise happen, and instead give UC traffic precedence. However strictly prioritizing UC traffic has the effect that UC overload pushes out all BUM traffic, such as legitimate ARP queries. These packets are kept in queues for a while, but under sustained UC overload, their lifetime eventually expires and these packets are dropped. That is detrimental to network performance as well. Therefore configure the MC TCs (8..15) with minimum shaper of 200Mbps (a minimum permitted value) to allow a trickle of necessary control traffic to get through. Fixes: 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	mlxsw: reg: QEEC: Add minimum shaper fields	Petr Machata	1	-1/+21
	Add QEEC.mise (minimum shaper enable) and QEEC.min_shaper_rate to enable configuration of minimum shaper. Increase the QEEC length to 0x20 as well: that's the length that the register has had for a long time now, but with the configurations that mlxsw typically exercises, the firmware tolerated 0x1C-sized packets. With mise=true however, FW rejects packets unless they have the full required length. Fixes: b9b7cee40579 ("mlxsw: reg: Add QoS ETS Element Configuration register") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset()	Huazhong Tan	1	-0/+5
	Since hclgevf_reset_wait() is used to wait for the hardware to complete the reset, it is not necessary to hold the rtnl_lock during hclgevf_reset_wait(). So this patch releases the lock for the duration of hclgevf_reset_wait(). Fixes: 6988eb2a9b77 ("net: hns3: Add support to reset the enet/ring mgmt layer") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for rtnl_lock's range in the hclge_reset()	Huazhong Tan	1	-0/+3
	Since hclge_reset_wait() is used to wait for the hardware to complete the reset, it is not necessary to hold the rtnl_lock during hclge_reset_wait(). So this patch releases the lock for the duration of hclge_reset_wait(). Fixes: 6d4fab39533f ("net: hns3: Reset net device with rtnl_lock") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for handling mailbox while the command queue reinitialized	Huazhong Tan	1	-0/+6
	In a multi-core machine, the mailbox service and reset service will be executed at the same time. The reset service will re-initialize the command queue, before that, the mailbox handler can only get some invalid messages. The HCLGE_STATE_CMD_DISABLE flag means that the command queue is not available and needs to be reinitialized. Therefore, when the mailbox handler recognizes this flag, it should not process the command. Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: fix incorrect return value/type of some functions	Huazhong Tan	6	-53/+85
	There are some functions that, when they fail to send the command, need to return the corresponding error value to its caller. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Fixes: 681ec3999b3d ("net: hns3: fix for vlan table lost problem when resetting") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read	Huazhong Tan	1	-2/+2
	When there is a PHY, the driver needs to complete some operations through MDIO during reset reinitialization, so HCLGE_STATE_CMD_DISABLE is more suitable than HCLGE_STATE_RST_HANDLING to prevent the MDIO operation from being sent during the hardware reset. Fixes: b50ae26c57cb ("net: hns3: never send command queue message to IMP when reset) Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for is_valid_csq_clean_head()	Huazhong Tan	1	-6/+6
	The HEAD pointer of the hardware command queue maybe equal to the command queue's next_to_use in the driver, so that does not belong to the invalid HEAD pointer, since the hardware may not process the command in time, causing the HEAD pointer to be too late to update. The variables' name in this function is unreadable, so give them a more readable one. Fixes: 3ff504908f95 ("net: hns3: fix a dead loop in hclge_cmd_csq_clean") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring()	Huazhong Tan	2	-6/+0
	It is not necessary to reset the queue in the hns3_uninit_all_ring(), since the queue is stopped in the down operation, and will be reset in the up operation. And the judgment of the HCLGE_STATE_RST_HANDLING flag in the hclge_reset_tqp() is not correct, because we need to reset tqp during pf reset, otherwise it may cause queue not being reset to working state problem. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for the initialization of command queue's spin lock	Huazhong Tan	1	-4/+10
	The spin lock of the command queue only need to be initialized once when the driver initializes the command queue. It is not necessary to initialize the spin lock when resetting. At the same time, the modification of the queue member should be performed after acquiring the lock. Fixes: 3efb960f056d ("net: hns3: Refactor the initialization of command queue") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for reporting unknown vector0 interrupt repeatly problem	Huazhong Tan	1	-1/+1
	The current driver supports handling two vector0 interrupts, reset and mailbox. When the hardware reports an interrupt of another type of interrupt source, if the driver does not process the interrupt, but enables the interrupt, the hardware will repeatedly report the unknown interrupt. Therefore, the driver enables the vector0 interrupt after clearing the known type of interrupt source. Other conditions are not enabled. Fixes: cd8c5c269b1d ("net: hns3: Fix for hclge_reset running repeatly problem") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: bugfix for buffer not free problem during resetting	Huazhong Tan	1	-3/+21
	When hns3_get_ring_config()/hns3_queue_to_ring()/ hns3_get_vector_ring_chain() failed during resetting, the allocated memory has not been freed before these three functions return. So this patch adds error handler in these functions to fix it. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: add error handler for hns3_nic_init_vector_data()	Huazhong Tan	1	-2/+8
	When hns3_nic_init_vector_data() fails to map ring to vector, it should cancel the netif_napi_add() that has been successfully done and then exits. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net/mlx5e: fix csum adjustments caused by RXFCS	Eric Dumazet	1	-36/+9
	As shown by Dmitris, we need to use csum_block_add() instead of csum_add() when adding the FCS contribution to skb csum. Before 4.18 (more exactly commit 88078d98d1bb "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), the whole skb csum was thrown away, so RXFCS changes were ignored. Then before commit d55bef5059dd ("net: fix pskb_trim_rcsum_slow() with odd trim offset") both mlx5 and pskb_trim_rcsum_slow() bugs were canceling each other. Now we fixed pskb_trim_rcsum_slow() we need to fix mlx5. Note that this patch also rewrites mlx5e_get_fcs() to : - Use skb_header_pointer() instead of reinventing it. - Use __get_unaligned_cpu32() to avoid possible non aligned accesses as Dmitris pointed out. Fixes: 902a545904c7 ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation") Reported-by: Paweł Staszewski <pstaszewski@itcare.pl> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eran Ben Elisha <eranbe@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Dimitris Michailidis <dmichail@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Paweł Staszewski <pstaszewski@itcare.pl> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Tested-By: Maria Pasechnik <mariap@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	vhost: Fix Spectre V1 vulnerability	Jason Wang	1	-0/+2
	The idx in vhost_vring_ioctl() was controlled by userspace, hence a potential exploitation of the Spectre variant 1 vulnerability. Fixing this by sanitizing idx before using it to index d->vqs. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	net: hns3: fix spelling mistake "intrerrupt" -> "interrupt"	Colin Ian King	1	-1/+1
	Trivial fix to spelling mistake in dev_err message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31	Merge tag 'fbdev-v4.20' of https://github.com/bzolnier/linux	Linus Torvalds	26	-617/+158
	Pull fbdev updates from Bartlomiej Zolnierkiewicz: "No major changes to the subsystem itself, mainly fb drivers fixes & cleanups (atyfb & udlfb updates stand out from the rest) + removal of no longer needed old clps711xfb driver. Details: - update atyfb driver - improvements for ATI Mach64 chips: detect the dot clock divider correctly on Sparc, fix display corruptions (due to endianness issues and improper reading of accelerator registers), optimize scrolling performance and also fix debugging printks (Mikulas Patocka) - rewrite USB unplug handling in udlfb driver using framebuffer subsystem reference counting (Mikulas Patocka) - fix support for native-mode display-timings in atmel_lcdfb driver (Sam Ravnborg) - fix information leak & add missing access_ok() checks in sbuslib (Dan Carpenter) - allow using GPIO expanders that can sleep in ssd1307fb driver (Michal Vokáč) - convert omapfb driver to use GPIO descriptors instead of GPIO numbers for Amstrad Delta board (Janusz Krzysztofik) - fix broken Kconfig menu dependencies (Randy Dunlap) - convert fbdev subsystem to use %pOFn instead of device_node.name (Rob Herring) - remove the dead old CLPS711x LCD support driver (the new CLPS711x LCD support driver is still available) - misc fixes (Jia-Ju Bai, Gustavo A. R. Silva) - misc cleanups (Mehdi Bounya, Nathan Chancellor, YueHaibing)" * tag 'fbdev-v4.20' of https://github.com/bzolnier/linux: (22 commits) video: fbdev: remove redundant 'default n' from Kconfig-s video: fbdev: remove dead old CLPS711x LCD support driver Revert "video: ssd1307fb: Do not hard code active-low reset sequence" video: fbdev: arcfb: mark expected switch fall-through pxa168fb: remove set but not used variables 'mi' video: ssd1307fb: Do not hard code active-low reset sequence video: ssd1307fb: Use gpiod_set_value_cansleep() for reset fbdev: fix broken menu dependencies video: fbdev: sis: Remove unnecessary parentheses and commented code video: fbdev: omapfb: lcd_ams_delta: use GPIO lookup table fbdev: sbuslib: integer overflow in sbusfb_ioctl_helper() fbdev: sbuslib: use checked version of put_user() fbdev: Convert to using %pOFn instead of device_node.name atmel_lcdfb: support native-mode display-timings Video: vgastate: fixed a spacing coding style atyfb: fix debugging printks mach64: optimize wait_for_fifo mach64: fix image corruption due to reading accelerator registers mach64: fix display corruption on big endian machines mach64: detect the dot clock divider correctly on sparc ...
2018-10-31	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux	Linus Torvalds	1	-3/+6
	Pull thermal management updates from Zhang Rui: - Fix a use-after-free issue when unregistering a thermal cooling device (Dmitry Osipenko) - use power_efficient_wq for thermal worker to save more power (Jeson Gao) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: core: using power_efficient_wq for thermal worker thermal: core: Fix use-after-free in thermal_cooling_device_destroy_sysfs
2018-10-31	EDAC, skx: Fix randconfig builds	Borislav Petkov	1	-1/+1
	The driver depends on the ADXL component glue and selects it. However, ADXL itself implicitly depends on ACPI and in nonsensical randconfig builds like this: # CONFIG_ACPI is not set CONFIG_ACPI_ADXL=y where ACPI is not enabled, the build fails with: drivers/edac/skx_edac.o: In function `skx_mce_check_error': skx_edac.c:(.text+0xab): undefined reference to `adxl_decode' drivers/edac/skx_edac.o: In function `skx_init': skx_edac.c:(.init.text+0x8bf): undefined reference to `adxl_get_component_names' make: *** [vmlinux] Error 1 Add stubs for that case so that the build succeeds. CONFIG_ACPI=n doesn't make any sense for real configurations but this fix will at least silence randconfig builds. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tony Luck <tony.luck@intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
2018-10-31	Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux	Linus Torvalds	219	-3478/+18371
	Pull clk updates from Stephen Boyd: "This time it looks like a quieter release cycle in the clk tree. I guess that's because of summer time holidays/vacations. The biggest change in the diffstat is in the Qualcomm clk driver, where they got support for CPUs and handful of SoCs. After that, the at91 driver got a major rewrite for newer DT bindings that should make things easier going forward and the TI code moved to a clockdomain based design. The long tail is mostly small driver updates for newer clks and some simpler SoC clock drivers such as the Hisilicon and imx support. In the core framework, we only have two small changes this time. One is a new clk API to get all clks for a device with the bulk clk APIs. This allows drivers that don't care about doing anything besides turning on all the clks to just clk_get() them all and turn them on. The other change is the beginning of a way to support save and restore of clk settings in the clk framework. TI is the only user right now, but we will want to expand upon this design in the future to support more save and restore of clk registers. At least this gets us started and works well enough for one SoC, but there's more work in the future. Core: - clk_bulk_get_all() API and friends to get all the clks for a device - Basic clk state save/restore hooks New Drivers: - Renesas RZ/A2 (R7S9210) SoC, including early clocks - Rensas RZ/G1N (R8A7744) and RZ/G2E (R8A774C0) SoCs - Rensas RZ/G2M (r8a774a1) SoC - Qualcomm Krait CPU clk support - Qualcomm QCS404 GCC support - Qualcomm SDM660 GCC support - Qualcomm SDM845 camera clock controller - Ingenic jz4725b CGU - Hisilicon 3670 SoC support - TI SCI clks on K3 SoCs - iMX6 MMDC clks - Reset Controller (RMU) support for Actions Semi Owl S900 and S700 SoCs Updates: - Rework at91 PMC clock driver for new DT bindings - Nvidia Tegra clk driver MBIST workaround fix - S2RAM support for Marvell mvebu periph clks - Use updated printk format for OF node names - Fix TI code to only search DT subnodes - Various static analysis finds - Tag various drivers with SPDX license tags - Support dynamic frequency switching (DFS) on qcom SDM845 GCC - Only use s2mps11 dt-binding defines instead of redefining them in the driver - Add some more missing clks to qcom MSM8996 GCC - Quad SPI clks on qcom SDM845 - Add support for CMT timer clocks on R-Car V3H - Add support for SHDI and various timer clocks on R-Car V3M - Improve OSC and RCLK (watchdog) handling on R-Car Gen3 SoCs - Amlogic clk-pll driver improvements and updates - Amlogic axg audio controller system clocks - Register Amlogic meson8b clock controller early - Add support for SATA and Fine Display Processor (FDP) clocks on R-Car M3-N - Consolidation of system suspend related code in Exynos, S5P, S3C SoC clk drivers - Fixes for system suspend support on Exynos542x (Odroid boards) and Exynos5433 SoC - Remove obsoleted Exynos4212 ISP clock definitions - Migrated TI am3/4/5 and dra7 SoCs to clockdomain based design - TI RTC+DDR sleep mode support for clock save/restore - Allwinner A64 display engine support and fixes - Allwinner A83t display engine support and fixes" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (186 commits) clk: qcom: Remove unused arrays in SDM845 GCC clk: fixed-rate: fix of_node_get-put imbalance clk: s2mps11: Add used attribute to s2mps11_dt_match clk: qcom: gcc-sdm660: Add MODULE_LICENSE clk: qcom: Add safe switch hook for krait mux clocks dt-bindings: clock: Document qcom,krait-cc clk: qcom: Add Krait clock controller driver dt-bindings: arm: Document qcom,kpss-gcc clk: qcom: Add KPSS ACC/GCC driver clk: qcom: Add support for Krait clocks clk: qcom: Add IPQ806X's HFPLLs clk: qcom: Add MSM8960/APQ8064's HFPLLs dt-bindings: clock: Document qcom,hfpll clk: qcom: Add HFPLL driver clk: qcom: Add support for High-Frequency PLLs (HFPLLs) ARM: Add Krait L2 register accessor functions clk: imx6q: add mmdc0 ipg clock clk: imx6sl: add mmdc ipg clocks clk: imx6sll: add mmdc1 ipg clock clk: imx6sx: add mmdc1 ipg clock ...
2018-10-31	ixgbe: fix MAC anti-spoofing filter after VFLR	Radoslaw Tyl	1	-1/+3
	This change resolves a driver bug where the driver is logging a message that says "Spoofed packets detected". This can occur on the PF (host) when a VF has VLAN+MACVLAN enabled and is re-started with a different MAC address. MAC and VLAN anti-spoofing filters are to be enabled together. Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: Piotr Skajewski <piotrx.skajewski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	Merge tag 'vfio-v4.20-rc1.v2' of git://github.com/awilliam/linux-vfio	Linus Torvalds	3	-4/+37
	Pull VFIO updates from Alex Williamson: - EDID interfaces for vfio devices supporting display extensions (Gerd Hoffmann) - Generically select Type-1 IOMMU model support on ARM/ARM64 (Geert Uytterhoeven) - Quirk for VFs reporting INTx pin (Alex Williamson) - Fix error path memory leak in MSI support (Li Qiang) * tag 'vfio-v4.20-rc1.v2' of git://github.com/awilliam/linux-vfio: vfio: add edid support to mbochs sample driver vfio: add edid api for display (vgpu) devices. drivers/vfio: Allow type-1 IOMMU instantiation with all ARM/ARM64 IOMMUs vfio/pci: Mask buggy SR-IOV VF INTx support vfio/pci: Fix potential memory leak in vfio_msi_cap_len
2018-10-31	i40e: Update status codes	Mitch Williams	1	-1/+1
	Add a few new status code which will be used by the ice driver, and rename a few to make them more consistent. Error code are mapped to similar values as in i40e_status.h, so as to be compatible with older VF drivers not using this status enum. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	Merge tag 'media/v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media	Linus Torvalds	69	-345/+4245
	Pull new experimental media request API from Mauro Carvalho Chehab: "A new media request API This API is needed to support device drivers that can dynamically change their parameters for each new frame. The latest versions of Google camera and codec HAL depends on such feature. At this stage, it supports only stateless codecs. It has been discussed for a long time (at least over the last 3-4 years), and we finally reached to something that seem to work. This series contain both the API and core changes required to support it and a new m2m decoder driver (cedrus). As the current API is still experimental, the only real driver using it (cedrus) was added at staging[1]. We intend to keep it there for a while, in order to test the API. Only when we're sure that this API works for other cases (like encoders), we'll move this driver out of staging and set the API into a stone. [1] We added support for the vivid virtual driver (used only for testing) to it too, as it makes easier to test the API for the ones that don't have the cedrus hardware" * tag 'media/v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (53 commits) media: dt-bindings: Document the Rockchip VPU bindings media: platform: Add Cedrus VPU decoder driver media: dt-bindings: media: Document bindings for the Cedrus VPU driver media: v4l: Add definition for the Sunxi tiled NV12 format media: v4l: Add definitions for MPEG-2 slice format and metadata media: videobuf2-core: Rework and rename helper for request buffer count media: v4l2-ctrls.c: initialize an error return code with zero media: v4l2-compat-ioctl32.c: add missing documentation for a field media: media-request: update documentation media: media-request: EPERM -> EACCES/EBUSY media: v4l2-ctrls: improve media_request_(un)lock_for_update media: v4l2-ctrls: use media_request_(un)lock_for_access media: media-request: add media_request_(un)lock_for_access media: vb2: set reqbufs/create_bufs capabilities media: videodev2.h: add new capabilities for buffer types media: buffer.rst: only set V4L2_BUF_FLAG_REQUEST_FD for QBUF media: v4l2-ctrls: return -EACCES if request wasn't completed media: media-request: return -EINVAL for invalid request_fds media: vivid: add request support media: vivid: add mc ...
2018-10-31	ixgbe/ixgbevf: fix XFRM_ALGO dependency	Jeff Kirsher	7	-12/+30
	Based on the original work from Arnd Bergmann. When XFRM_ALGO is not enabled, the new ixgbe IPsec code produces a link error: drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.o: In function `ixgbe_ipsec_vf_add_sa': ixgbe_ipsec.c:(.text+0x1266): undefined reference to `xfrm_aead_get_byname' Simply selecting XFRM_ALGO from here causes circular dependencies, so to fix it, we probably want this slightly more complex solution that is similar to what other drivers with XFRM offload do: A separate Kconfig symbol now controls whether we include the IPsec offload code. To keep the old behavior, this is left as 'default y'. The dependency in XFRM_OFFLOAD still causes a circular dependency but is not actually needed because this symbol is not user visible, so removing that dependency on top makes it all work. CC: Arnd Bergmann <arnd@arndb.de> CC: Shannon Nelson <shannon.nelson@oracle.com> Fixes: eda0333ac293 ("ixgbe: add VF IPsec management") Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2018-10-31	fm10k: bump driver version to match out-of-tree release	Jacob Keller	1	-1/+1
	The upstream and out-of-tree drivers are once again at comparable functionality. It's been a while since we updated the upstream driver version, so bump it now. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	fm10k: add missing device IDs to the upstream driver	Jacob Keller	2	-0/+4
	The device IDs for the Ethernet SDI Adapter devices were never added to the upstream driver. The IDs are already in the pci.ids database, and are supported by the out-of-tree driver. Add the device IDs now, so that the upstream driver can recognize and load these devices. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	fm10k: ensure completer aborts are marked as non-fatal after a resume	Jacob Keller	1	-20/+28
	VF drivers can trigger PCIe completer aborts any time they read a queue that they don't own. Even in nominal circumstances, it is not possible to prevent the VF driver from reading queues it doesn't own. VF drivers may attempt to read queues it previously owned, but which it no longer does due to a PF reset. Normally these completer aborts aren't an issue. However, on some platforms these trigger machine check errors. This is true even if we lower their severity from fatal to non-fatal. Indeed, we already have code for lowering the severity. We could attempt to mask these errors conditionally around resets, which is the most common time they would occur. However this would essentially be a race between the PF and VF drivers, and we may still occasionally see machine check exceptions on these strictly configured platforms. Instead, mask the errors entirely any time we resume VFs. By doing so, we prevent the completer aborts from being sent to the parent PCIe device, and thus these strict platforms will not upgrade them into machine check errors. Additionally, we don't lose any information by masking these errors, because we'll still report VFs which attempt to access queues via the FUM_BAD_VF_QACCESS errors. Without this change, on platforms where completer aborts cause machine check exceptions, the VF reading queues it doesn't own could crash the host system. Masking the completer abort prevents this, so we should mask it for good, and not just around a PCIe reset. Otherwise malicious or misconfigured VFs could cause the host system to crash. Because we are masking the error entirely, there is little reason to also keep setting the severity bit, so that code is also removed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	fm10k: fix SM mailbox full condition	Ngai-Mint Kwan	1	-1/+2
	Current condition will always incorrectly report a full SM mailbox if an IES API application is not running. Due to this, the "fm10k_service_task" will be infinitely queued into the driver's workqueue. This, in turn, will cause a "kworker" thread to report 100% CPU utilization and might cause "soft lockup" events or system crashes. To fix this issue, a new condition is added to determine if the SM mailbox is in the correct state of FM10K_STATE_OPEN before proceeding. In other words, an instance of the IES API must be running. If there is, the remainder of the flow stays the same which is to determine if the SM mailbox capacity has been exceeded or not and take appropriate action. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	igb: shorten maximum PHC timecounter update interval	Miroslav Lichvar	1	-1/+7
	The timecounter needs to be updated at least once per ~550 seconds in order to avoid a 40-bit SYSTIM timestamp to be misinterpreted as an old timestamp. Since commit 500462a9d ("timers: Switch to a non-cascading wheel"), scheduling of delayed work seems to be less accurate and a requested delay of 540 seconds may actually be longer than 550 seconds. Shorten the delay to 480 seconds to be sure the timecounter is updated in time. This fixes an issue with HW timestamps on 82580/I350/I354 being off by ~1100 seconds for few seconds every ~9 minutes. Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-31	mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock	David Hildenbrand	1	-12/+1
	There seem to be some problems as result of 30467e0b3be ("mm, hotplug: fix concurrent memory hot-add deadlock"), which tried to fix a possible lock inversion reported and discussed in [1] due to the two locks a) device_lock() b) mem_hotplug_lock While add_memory() first takes b), followed by a) during bus_probe_device(), onlining of memory from user space first took a), followed by b), exposing a possible deadlock. In [1], and it was decided to not make use of device_hotplug_lock, but rather to enforce a locking order. The problems I spotted related to this: 1. Memory block device attributes: While .state first calls mem_hotplug_begin() and the calls device_online() - which takes device_lock() - .online does no longer call mem_hotplug_begin(), so effectively calls online_pages() without mem_hotplug_lock. 2. device_online() should be called under device_hotplug_lock, however onlining memory during add_memory() does not take care of that. In addition, I think there is also something wrong about the locking in 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages() without locks. This was introduced after 30467e0b3be. And skimming over the code, I assume it could need some more care in regards to locking (e.g. device_online() called without device_hotplug_lock. This will be addressed in the following patches. Now that we hold the device_hotplug_lock when - adding memory (e.g. via add_memory()/add_memory_resource()) - removing memory (e.g. via remove_memory()) - device_online()/device_offline() We can move mem_hotplug_lock usage back into online_pages()/offline_pages(). Why is mem_hotplug_lock still needed? Essentially to make get_online_mems()/put_online_mems() be very fast (relying on device_hotplug_lock would be very slow), and to serialize against addition of memory that does not create memory block devices (hmm). [1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/ 2015-February/065324.html This patch is partly based on a patch by Vitaly Kuznetsov. Link: http://lkml.kernel.org/r/20180925091457.28651-4-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31	mm/memory_hotplug: make add_memory() take the device_hotplug_lock	David Hildenbrand	3	-3/+11
	add_memory() currently does not take the device_hotplug_lock, however is aleady called under the lock from arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c to synchronize against CPU hot-remove and similar. In general, we should hold the device_hotplug_lock when adding memory to synchronize against online/offline request (e.g. from user space) - which already resulted in lock inversions due to device_lock() and mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory hot-add deadlock"). add_memory()/add_memory_resource() will create memory block devices, so this really feels like the right thing to do. Holding the device_hotplug_lock makes sure that a memory block device can really only be accessed (e.g. via .online/.state) from user space, once the memory has been fully added to the system. The lock is not held yet in drivers/xen/balloon.c arch/powerpc/platforms/powernv/memtrace.c drivers/s390/char/sclp_cmd.c drivers/hv/hv_balloon.c So, let's either use the locked variants or take the lock. Don't export add_memory_resource(), as it once was exported to be used by XEN, which is never built as a module. If somebody requires it, we also have to export a locked variant (as device_hotplug_lock is never exported). Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mathieu Malaterre <malat@debian.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31	mm/memory_hotplug: make remove_memory() take the device_hotplug_lock	David Hildenbrand	1	-1/+1
	Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3. Reading through the code and studying how mem_hotplug_lock is to be used, I noticed that there are two places where we can end up calling device_online()/device_offline() - online_pages()/offline_pages() without the mem_hotplug_lock. And there are other places where we call device_online()/device_offline() without the device_hotplug_lock. While e.g. echo "online" > /sys/devices/system/memory/memory9/state is fine, e.g. echo 1 > /sys/devices/system/memory/memory9/online Will not take the mem_hotplug_lock. However the device_lock() and device_hotplug_lock. E.g. via memory_probe_store(), we can end up calling add_memory()->online_pages() without the device_hotplug_lock. So we can have concurrent callers in online_pages(). We e.g. touch in online_pages() basically unprotected zone->present_pages then. Looks like there is a longer history to that (see Patch #2 for details), and fixing it to work the way it was intended is not really possible. We would e.g. have to take the mem_hotplug_lock in device/base/core.c, which sounds wrong. Summary: We had a lock inversion on mem_hotplug_lock and device_lock(). More details can be found in patch 3 and patch 6. I propose the general rules (documentation added in patch 6): 1. add_memory/add_memory_resource() must only be called with device_hotplug_lock. 2. remove_memory() must only be called with device_hotplug_lock. This is already documented and holds for all callers. 3. device_online()/device_offline() must only be called with device_hotplug_lock. This is already documented and true for now in core code. Other callers (related to memory hotplug) have to be fixed up. 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/ online_pages/offline_pages. To me, this looks way cleaner than what we have right now (and easier to verify). And looking at the documentation of remove_memory, using lock_device_hotplug also for add_memory() feels natural. This patch (of 6): remove_memory() is exported right now but requires the device_hotplug_lock, which is not exported. So let's provide a variant that takes the lock and only export that one. The lock is already held in arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c arch/powerpc/platforms/powernv/memtrace.c Apart from that, there are not other users in the tree. Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>