linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-10-21	xarray: Add XArray load operation	Matthew Wilcox	1	-0/+1
	The xa_load function brings with it a lot of infrastructure; xa_empty(), xa_is_err(), and large chunks of the XArray advanced API that are used to implement xa_load. As the test-suite demonstrates, it is possible to use the XArray functions on a radix tree. The radix tree functions depend on the GFP flags being stored in the root of the tree, so it's not possible to use the radix tree functions on an XArray. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21	xarray: Add documentation	Matthew Wilcox	2	-0/+405
	This is documentation on how to use the XArray, not details about its internal implementation. Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Josef Bacik <jbacik@fb.com>
2018-10-21	xarray: Define struct xa_node	Matthew Wilcox	5	-73/+77
	This is a direct replacement for struct radix_tree_node. A couple of struct members have changed name, so convert those. Use a #define so that radix tree users continue to work without change. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-21	xarray: Add definition of struct xarray	Matthew Wilcox	12	-68/+181
	This is a direct replacement for struct radix_tree_root. Some of the struct members have changed name; convert those, and use a #define so that radix_tree users continue to work without change. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29	xarray: Change definition of sibling entries	Matthew Wilcox	6	-50/+121
	Instead of storing a pointer to the slot containing the canonical entry, store the offset of the slot. Produces slightly more efficient code (~300 bytes) and simplifies the implementation. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29	xarray: Replace exceptional entries	Matthew Wilcox	26	-232/+278
	Introduce xarray value entries and tagged pointers to replace radix tree exceptional entries. This is a slight change in encoding to allow the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a value entry). It is also a change in emphasis; exceptional entries are intimidating and different. As the comment explains, you can choose to store values or pointers in the xarray and they are both first-class citizens. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29	idr: Permit any valid kernel pointer to be stored	Matthew Wilcox	3	-10/+78
	An upcoming change to the encoding of internal entries will set the bottom two bits to 0b10. Unfortunately, m68k only aligns some data structures to 2 bytes, so the IDR will interpret them as internal entries and things will go badly wrong. Change the radix tree so that it stops either when the node indicates that it's the bottom of the tree (shift == 0) or when the entry is not an internal entry. This means we cannot insert an arbitrary kernel pointer as a multiorder entry, but the IDR does not permit multiorder entries. Annoyingly, this means the IDR can no longer take advantage of the radix tree's ability to store a single entry at offset 0 without allocating memory. A pointer which is 2-byte aligned cannot be stored directly in the root as it would be indistinguishable from a node, so we must allocate a node in order to store a 2-byte pointer at index 0. The idr_replace() function does not take a GFP flags argument, so cannot allocate memory. If a user inserts a 4-byte aligned pointer at index 0 and then replaces it with a 2-byte aligned pointer, we must be able to store it. Arbitrary pointer values are still not permitted; pointers of the form 2 + (i * 4) for values of i between 0 and 1023 are reserved for the implementation. These are not valid kernel pointers as they would point into the zero page. This change does cause a runtime memory consumption regression for the IDA. I will recover that later. Signed-off-by: Matthew Wilcox <willy@infradead.org> Tested-by: Guenter Roeck <linux@roeck-us.net>
2018-09-29	Update email address	Matthew Wilcox	9	-11/+18
	Redirect some older email addresses that are in the git logs. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-09-29	cpufreq: qcom-kryo: Fix section annotations	Nathan Chancellor	1	-2/+2
	There is currently a warning when building the Kryo cpufreq driver into the kernel image: WARNING: vmlinux.o(.text+0x8aa424): Section mismatch in reference from the function qcom_cpufreq_kryo_probe() to the function .init.text:qcom_cpufreq_kryo_get_msm_id() The function qcom_cpufreq_kryo_probe() references the function __init qcom_cpufreq_kryo_get_msm_id(). This is often because qcom_cpufreq_kryo_probe lacks a __init annotation or the annotation of qcom_cpufreq_kryo_get_msm_id is wrong. Remove the '__init' annotation from qcom_cpufreq_kryo_get_msm_id so that there is no more mismatch warning. Additionally, Nick noticed that the remove function was marked as '__init' when it should really be marked as '__exit'. Fixes: 46e2856b8e18 (cpufreq: Add Kryo CPU scaling driver) Fixes: 5ad7346b4ae2 (cpufreq: kryo: Add module remove and exit) Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.18+ <stable@vger.kernel.org> # 4.18+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-09-28	perf/core: Add sanity check to deal with pinned event failure	Reinette Chatre	1	-0/+6
	It is possible that a failure can occur during the scheduling of a pinned event. The initial portion of perf_event_read_local() contains the various error checks an event should pass before it can be considered valid. Ensure that the potential scheduling failure of a pinned event is checked for and have a credible error. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: acme@kernel.org Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/6486385d1f30336e9973b24c8c65f5079543d3d3.1537377064.git.reinette.chatre@intel.com
2018-09-28	xen/blkfront: correct purging of persistent grants	Juergen Gross	1	-2/+2
	Commit a46b53672b2c2e3770b38a4abf90d16364d2584b ("xen/blkfront: cleanup stale persistent grants") introduced a regression as purged persistent grants were not pu into the list of free grants again. Correct that. Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-28	Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"	Jens Axboe	1	-1/+3
	Fix didn't work for all cases, reverting to add a (hopefully) better fix. This reverts commit f151ba989d149bbdfc90e5405724bbea094f9b17. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-28	selftests/powerpc: Fix Makefiles for headers_install change	Michael Ellerman	17	-0/+17
	Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk") introduced a requirement that Makefiles more than one level below the selftests directory need to define top_srcdir, but it didn't update any of the powerpc Makefiles. This broke building all the powerpc selftests with eg: make[1]: Entering directory '/src/linux/tools/testing/selftests/powerpc' BUILD_TARGET=/src/linux/tools/testing/selftests/powerpc/alignment; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C alignment all make[2]: Entering directory '/src/linux/tools/testing/selftests/powerpc/alignment' ../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory make[2]: *** No rule to make target '../../../../scripts/subarch.include'. make[2]: Failed to remake makefile '../../../../scripts/subarch.include'. Makefile:38: recipe for target 'alignment' failed Fix it by setting top_srcdir in the affected Makefiles. Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-27	blk-mq: I/O and timer unplugs are inverted in blktrace	Ilya Dryomov	1	-2/+2
	trace_block_unplug() takes true for explicit unplugs and false for implicit unplugs. schedule() unplugs are implicit and should be reported as timer unplugs. While correct in the legacy code, this has been inverted in blk-mq since 4.11. Cc: stable@vger.kernel.org Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers") Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-27	x86/boot: Fix kexec booting failure in the SEV bit detection code	Kairui Song	1	-19/+0
	Commit 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active") can occasionally cause system resets when kexec-ing a second kernel even if SEV is not active. That's because get_sev_encryption_bit() uses 32-bit rIP-relative addressing to read the value of enc_bit - a variable which caches a previously detected encryption bit position - but kexec may allocate the early boot code to a higher location, beyond the 32-bit addressing limit. In this case, garbage will be read and get_sev_encryption_bit() will return the wrong value, leading to accessing memory with the wrong encryption setting. Therefore, remove enc_bit, and thus get rid of the need to do 32-bit rIP-relative addressing in the first place. [ bp: massage commit message heavily. ] Fixes: 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active") Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-kernel@vger.kernel.org Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: brijesh.singh@amd.com Cc: kexec@lists.infradead.org Cc: dyoung@redhat.com Cc: bhe@redhat.com Cc: ghook@redhat.com Link: https://lkml.kernel.org/r/20180927123845.32052-1-kasong@redhat.com
2018-09-27	bcache: add separate workqueue for journal_write to avoid deadlock	Guoju Fang	3	-3/+12
	After write SSD completed, bcache schedules journal_write work to system_wq, which is a public workqueue in system, without WQ_MEM_RECLAIM flag. system_wq is also a bound wq, and there may be no idle kworker on current processor. Creating a new kworker may unfortunately need to reclaim memory first, by shrinking cache and slab used by vfs, which depends on bcache device. That's a deadlock. This patch create a new workqueue for journal_write with WQ_MEM_RECLAIM flag. It's rescuer thread will work to avoid the deadlock. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-27	drm/amd/display: Fix Edid emulation for linux	Bhawanpreet Lakha	3	-7/+137
	[Why] EDID emulation didn't work properly for linux, as we stop programming if nothing is connected physically. [How] We get a flag from DRM when we want to do edid emulation. We check if this flag is true and nothing is connected physically, if so we only program the front end using VIRTUAL_SIGNAL. Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27	drm/amd/display: Fix Vega10 lightup on S3 resume	Roman Li	3	-18/+1
	[Why] There have been a few reports of Vega10 display remaining blank after S3 resume. The regression is caused by workaround for mode change on Vega10 - skip set_bandwidth if stream count is 0. As a result we skipped dispclk reset on suspend, thus on resume we may skip the clock update assuming it hasn't been changed. On some systems it causes display blank or 'out of range'. [How] Revert "drm/amd/display: Fix Vega10 black screen after mode change" Verified that it hadn't cause mode change regression. Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27	drm/amdgpu: Fix vce work queue was not cancelled when suspend	Rex Zhu	2	-3/+4
	The vce cancel_delayed_work_sync never be called. driver call the function in error path. This caused the A+A suspend hang when runtime pm enebled. As we will visit the smu in the idle queue. this will cause smu hang because the dgpu has been suspend, and the dgpu also will be waked up. As the smu has been hang, so the dgpu resume will failed. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2018-09-27	Revert "drm/panel: Add device_link from panel device to DRM device"	Linus Walleij	2	-11/+0
	This reverts commit 0c08754b59da5557532d946599854e6df28edc22. commit 0c08754b59da ("drm/panel: Add device_link from panel device to DRM device") creates a circular dependency under these circumstances: 1. The panel depends on dsi-host because it is MIPI-DSI child device. 2. dsi-host depends on the drm parent device (connector->dev->dev) this should be allowed. 3. drm parent dev (connector->dev->dev) depends on the panel after this patch. This makes the dependency circular and while it appears it does not affect any in-tree drivers (they do not seem to have dsi hosts depending on the same parent device) this does not seem right. As noted in a response from Andrzej Hajda, the intent is likely to make the panel dependent on the DRM device (connector->dev) not its parent. But we have no way of doing that since the DRM device doesn't contain any struct device on its own (arguably it should). Revert this until a proper approach is figured out. Cc: Jyri Sarha <jsarha@ti.com> Cc: Eric Anholt <eric@anholt.net> Cc: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20180927124130.9102-1-linus.walleij@linaro.org
2018-09-27	xen/blkfront: When purging persistent grants, keep them in the buffer	Boris Ostrovsky	1	-3/+1
	Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") added support for purging persistent grants when they are not in use. As part of the purge, the grants were removed from the grant buffer, This eventually causes the buffer to become empty, with BUG_ON triggered in get_free_grant(). This can be observed even on an idle system, within 20-30 minutes. We should keep the grants in the buffer when purging, and only free the grant ref. Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-27	clocksource/drivers/timer-atmel-pit: Properly handle error cases	Alexandre Belloni	1	-6/+14
	The smatch utility reports a possible leak: smatch warnings: drivers/clocksource/timer-atmel-pit.c:183 at91sam926x_pit_dt_init() warn: possible memory leak of 'data' Ensure data is freed before exiting with an error. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2018-09-26	block: fix deadline elevator drain for zoned block devices	Damien Le Moal	1	-1/+1
	When the deadline scheduler is used with a zoned block device, writes to a zone will be dispatched one at a time. This causes the warning message: deadline: forced dispatching is broken (nr_sorted=X), please report this to be displayed when switching to another elevator with the legacy I/O path while write requests to a zone are being retained in the scheduler queue. Prevent this message from being displayed when executing elv_drain_elevator() for a zoned block device. __blk_drain_queue() will loop until all writes are dispatched and completed, resulting in the desired elevator queue drain without extensive modifications to the deadline code itself to handle forced-dispatch calls. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Fixes: 8dc8146f9c92 ("deadline-iosched: Introduce zone locking support") Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-26	ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge	Mika Westerberg	1	-5/+6
	HP 6730b laptop has an ethernet NIC connected to one of the PCIe root ports. The root ports themselves are native PCIe hotplug capable. Now, during boot after PCI devices are scanned the BIOS triggers ACPI bus check directly to the NIC: ACPI: \_SB_.PCI0.RP06.NIC_: Bus check in hotplug_event() It is not clear why it is sending bus check but regardless the ACPI hotplug notify handler calls enable_slot() directly (instead of going through acpiphp_check_bridge() as there is no bridge), which ends up handling special case for non-hotplug bridges with native PCIe hotplug. This results a crash of some kind but the reporter only sees black screen so it is hard to figure out the exact spot and what actually happens. Based on a few fix proposals it was tracked to crash somewhere inside pci_assign_unassigned_bridge_resources(). In any case we should not really be in that special branch at all because the ACPI notify happened to a slot that is not a PCI bridge (it is just a regular PCI device). Fix this so that we only go to that special branch if we are calling enable_slot() for a bridge (e.g., the ACPI notification was for the bridge). Link: https://bugzilla.kernel.org/show_bug.cgi?id=201127 Fixes: 84c8b58ed3ad ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug") Reported-by: Peter Anemone <peter.anemone@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org # v4.18+
2018-09-26	drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set	Jason Ekstrand	1	-0/+5
	We attempt to get fences earlier in the hopes that everything will already have fences and no callbacks will be needed. If we do succeed in getting a fence, getting one a second time will result in a duplicate ref with no unref. This is causing memory leaks in Vulkan applications that create a lot of fences; playing for a few hours can, apparently, bring down the system. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107899 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20180926071703.15257-1-jason.ekstrand@intel.com
2018-09-26	iommu/amd: Return devid as alias for ACPI HID devices	Arindam Nath	1	-0/+6
	ACPI HID devices do not actually have an alias for them in the IVRS. But dev_data->alias is still used for indexing into the IOMMU device table for devices being handled by the IOMMU. So for ACPI HID devices, we simply return the corresponding devid as an alias, as parsed from IVRS table. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Fixes: 2bf9a0a12749 ('iommu/amd: Add iommu support for ACPI HID devices') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-09-25	blk-mq: Allow blocking queue tag iter callbacks	Keith Busch	1	-9/+4
	A recent commit runs tag iterator callbacks under the rcu read lock, but existing callbacks do not satisfy the non-blocking requirement. The commit intended to prevent an iterator from accessing a queue that's being modified. This patch fixes the original issue by taking a queue reference instead of reading it, which allows callbacks to make blocking calls. Fixes: f5bbbbe4d6357 ("blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter") Acked-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-25	nvme: properly propagate errors in nvme_mpath_init	Susobhan Dey	1	-2/+4
	Signed-off-by: Susobhan Dey <susobhan.dey@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-25	dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration	Christoph Hellwig	1	-0/+3
	The patch adding the infrastructure failed to actually add the symbol declaration, oops.. Fixes: faef87723a ("dma-noncoherent: add a arch_sync_dma_for_cpu_all hook") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paul Burton <paul.burton@mips.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-25	RDMA/core: Set right entry state before releasing reference	Parav Pandit	1	-34/+34
	Currently add_modify_gid() for IB link layer has followong issue in cache update path. When GID update event occurs, core releases reference to the GID table without updating its state and/or entry pointer. CPU-0 CPU-1 ------ ----- ib_cache_update() IPoIB ULP add_modify_gid() [..] put_gid_entry() refcnt = 0, but state = valid, entry is valid. (work item is not yet executed). ipoib_create_ah() rdma_create_ah() rdma_get_gid_attr() <-- Tries to acquire gid_attr which has refcnt = 0. This is incorrect. GID entry state and entry pointer is provides the accurate GID enty state. Such fields must be updated with rwlock to protect against readers and, such fields must be in sane state before refcount can drop to zero. Otherwise above race condition can happen leading to use-after-free situation. Following backtrace has been observed when cache update for an IB port is triggered while IPoIB ULP is creating an AH. Therefore, when updating GID entry, first mark a valid entry as invalid through state and set the barrier so that no callers can acquired the GID entry, followed by release reference to it. refcount_t: increment on 0; use-after-free. WARNING: CPU: 4 PID: 29106 at lib/refcount.c:153 refcount_inc_checked+0x30/0x50 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] RIP: 0010:refcount_inc_checked+0x30/0x50 RSP: 0018:ffff8802ad36f600 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000008 RDI: ffffffff86710100 RBP: ffff8802d6e60a30 R08: ffffed005d67bf8b R09: ffffed005d67bf8b R10: 0000000000000001 R11: ffffed005d67bf8a R12: ffff88027620cee8 R13: ffff8802d6e60988 R14: ffff8802d6e60a78 R15: 0000000000000202 FS: 0000000000000000(0000) GS:ffff8802eb200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3ab35e5c88 CR3: 00000002ce84a000 CR4: 00000000000006e0 IPv6: ADDRCONF(NETDEV_CHANGE): ib1: link becomes ready Call Trace: rdma_get_gid_attr+0x220/0x310 [ib_core] ? lock_acquire+0x145/0x3a0 rdma_fill_sgid_attr+0x32c/0x470 [ib_core] rdma_create_ah+0x89/0x160 [ib_core] ? rdma_fill_sgid_attr+0x470/0x470 [ib_core] ? ipoib_create_ah+0x52/0x260 [ib_ipoib] ipoib_create_ah+0xf5/0x260 [ib_ipoib] ipoib_mcast_join_complete+0xbbe/0x2540 [ib_ipoib] Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25	IB/mlx5: Destroy the DEVX object upon error flow	Yishai Hadas	1	-1/+4
	Upon DEVX object creation the object must be destroyed upon a follows error flow. Fixes: 7efce3691d33 ("IB/mlx5: Add obj create and destroy functionality") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25	IB/uverbs: Free uapi on destroy	Mark Bloch	1	-0/+1
	Make sure we free struct uverbs_api once we clean the radix tree. It was allocated by uverbs_alloc_api(). Fixes: 9ed3e5f44772 ("IB/uverbs: Build the specs into a radix tree at runtime") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25	powerpc/numa: Use associativity if VPHN hcall is successful	Srikar Dronamraju	1	-1/+3
	Currently associativity is used to lookup node-id even if the preceding VPHN hcall failed. However this can cause CPU to be made part of the wrong node, (most likely to be node 0). This is because VPHN is not enabled on KVM guests. With 2ea6263 ("powerpc/topology: Get topology for shared processors at boot"), associativity is used to set to the wrong node. Hence KVM guest topology is broken. For example : A 4 node KVM guest before would have reported. [root@localhost ~]# numactl -H available: 4 nodes (0-3) node 0 cpus: 0 1 2 3 node 0 size: 1746 MB node 0 free: 1604 MB node 1 cpus: 4 5 6 7 node 1 size: 2044 MB node 1 free: 1765 MB node 2 cpus: 8 9 10 11 node 2 size: 2044 MB node 2 free: 1837 MB node 3 cpus: 12 13 14 15 node 3 size: 2044 MB node 3 free: 1903 MB node distances: node 0 1 2 3 0: 10 40 40 40 1: 40 10 40 40 2: 40 40 10 40 3: 40 40 40 10 Would now report: [root@localhost ~]# numactl -H available: 4 nodes (0-3) node 0 cpus: 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 0 size: 1746 MB node 0 free: 1244 MB node 1 cpus: node 1 size: 2044 MB node 1 free: 2032 MB node 2 cpus: 1 node 2 size: 2044 MB node 2 free: 2028 MB node 3 cpus: node 3 size: 2044 MB node 3 free: 2032 MB node distances: node 0 1 2 3 0: 10 40 40 40 1: 40 10 40 40 2: 40 40 10 40 3: 40 40 40 10 Fix this by skipping associativity lookup if the VPHN hcall failed. Fixes: 2ea626306810 ("powerpc/topology: Get topology for shared processors at boot") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-25	powerpc/tm: Avoid possible userspace r1 corruption on reclaim	Michael Neuling	1	-1/+8
	Current we store the userspace r1 to PACATMSCRATCH before finally saving it to the thread struct. In theory an exception could be taken here (like a machine check or SLB miss) that could write PACATMSCRATCH and hence corrupt the userspace r1. The SLB fault currently doesn't touch PACATMSCRATCH, but others do. We've never actually seen this happen but it's theoretically possible. Either way, the code is fragile as it is. This patch saves r1 to the kernel stack (which can't fault) before we turn MSR[RI] back on. PACATMSCRATCH is still used but only with MSR[RI] off. We then copy r1 from the kernel stack to the thread struct once we have MSR[RI] back on. Suggested-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-25	powerpc/tm: Fix userspace r13 corruption	Michael Neuling	1	-2/+9
	When we treclaim we store the userspace checkpointed r13 to a scratch SPR and then later save the scratch SPR to the user thread struct. Unfortunately, this doesn't work as accessing the user thread struct can take an SLB fault and the SLB fault handler will write the same scratch SPRG that now contains the userspace r13. To fix this, we store r13 to the kernel stack (which can't fault) before we access the user thread struct. Found by running P8 guest + powervm + disable_1tb_segments + TM. Seen as a random userspace segfault with r13 looking like a kernel address. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-25	iommu/vt-d: Handle memory shortage on pasid table allocation	Lu Baolu	2	-4/+4
	Pasid table memory allocation could return failure due to memory shortage. Limit the pasid table size to 1MiB because current 8MiB contiguous physical memory allocation can be hard to come by. W/o a PASID table, the device could continue to work with only shared virtual memory impacted. So, let's go ahead with context mapping even the memory allocation for pasid table failed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107783 Fixes: cc580e41260d ("iommu/vt-d: Per PCI device pasid table interfaces") Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Reported-and-tested-by: Pelton Kyle D <kyle.d.pelton@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-09-25	Revert "uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name"	Lubomir Rintel	2	-2/+2
	This changes UAPI, breaking iwd and libell: ell/key.c: In function 'kernel_dh_compute': ell/key.c:205:38: error: 'struct keyctl_dh_params' has no member named 'private'; did you mean 'dh_private'? struct keyctl_dh_params params = { .private = private, ^~~~~~~ dh_private This reverts commit 8a2336e549d385bb0b46880435b411df8d8200e8. Fixes: 8a2336e549d3 ("uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name") Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: David Howells <dhowells@redhat.com> cc: Randy Dunlap <rdunlap@infradead.org> cc: Mat Martineau <mathew.j.martineau@linux.intel.com> cc: Stephan Mueller <smueller@chronox.de> cc: James Morris <jmorris@namei.org> cc: "Serge E. Hallyn" <serge@hallyn.com> cc: Mat Martineau <mathew.j.martineau@linux.intel.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: <stable@vger.kernel.org> Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-25	Revert "pinctrl: intel: Do pin translation when lock IRQ"	Mika Westerberg	1	-32/+0
	This reverts commit 55aedef50d4d810670916d9fce4a40d5da2079e7. Commit 55aedef50d4d ("pinctrl: intel: Do pin translation when lock IRQ") added special translation from GPIO number to hardware pin number to irq_reqres/relres hooks to avoid failure when IRQs are requested. The actual failure happened inside gpiochip_lock_as_irq() because it calls gpiod_get_direction() and pinctrl-intel.c::intel_gpio_get_direction() implementation originally missed the translation so the two hooks made it work by skipping the ->get_direction() call entirely (it overwrote the default GPIOLIB provided functions). The proper fix that adds translation to GPIO callbacks was merged with commit 96147db1e1df ("pinctrl: intel: Do pin translation in other GPIO operations as well"). This allows us to use the default GPIOLIB provided functions again. In addition as find out by Benjamin Tissoires the two functions (intel_gpio_irq_reqres()/intel_gpio_irq_relres()) now cause problems of their own because they operate on pin numbers and pass that pin number to gpiochip_lock_as_irq() which actually expects a GPIO number. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199911 Fixes: 55aedef50d4d ("pinctrl: intel: Do pin translation when lock IRQ") Reported-and-tested-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-09-25	pinctrl: cannonlake: Fix HOSTSW_OWN register offset of H variant	Mika Westerberg	1	-13/+20
	It turns out the HOSTSW_OWN register offset is different between LP and H variants. The latter should use 0xc0 instead so fix that. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199911 Fixes: a663ccf0fea1 ("pinctrl: intel: Add Intel Cannon Lake PCH-H pin controller support") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-09-25	pinctrl/amd: poll InterruptEnable bits in amd_gpio_irq_set_type	Daniel Kurtz	1	-10/+23
	From the AMD BKDG, if WAKE_INT_MASTER_REG.MaskStsEn is set, a software write to the debounce registers of any gpio will block wake/interrupt status generation for all gpios for a length of time that depends on WAKE_INT_MASTER_REG.MaskStsLength[11:0]. During this period the Interrupt Delivery bit (INTERRUPT_ENABLE) will read as 0. In commit 4c1de0414a1340 ("pinctrl/amd: poll InterruptEnable bits in enable_irq") we tried to fix this same "gpio Interrupts are blocked immediately after writing debounce registers" problem, but incorrectly assumed it only affected the gpio whose debounce was being configured and not ALL gpios. To solve this for all gpios, we move the polling loop from amd_gpio_irq_enable() to amd_gpio_irq_set_type(), while holding the gpio spinlock. This ensures that another gpio operation (e.g. amd_gpio_irq_unmask()) can read a temporarily disabled IRQ and incorrectly disable it while trying to modify some other register bits. Fixes: 4c1de0414a1340 pinctrl/amd: poll InterruptEnable bits in enable_irq Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-09-25	iommu/rockchip: Free irqs in shutdown handler	Heiko Stuebner	1	-0/+6
	In the iommu's shutdown handler we disable runtime-pm which could result in the irq-handler running unclocked and since commit 3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework") we warn about that fact. This can cause warnings on shutdown on some Rockchip machines, so free the irqs in the shutdown handler before we disable runtime-pm. Reported-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Fixes: 3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>