wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2021-01-13	x86/xen: Don't register Xen IPIs when they aren't going to be used	David Woodhouse	1	-2/+2
	In the case where xen_have_vector_callback is false, we still register the IPI vectors in xen_smp_intr_init() for the secondary CPUs even though they aren't going to be used. Stop doing that. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-5-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-13	x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery	David Woodhouse	2	-1/+14
	It's useful to be able to test non-vector event channel delivery, to make sure Linux will work properly on older Xen which doesn't have it. It's also useful for those working on Xen and Xen-compatible hypervisors, because there are guest kernels still in active use which use PCI INTX even when vector delivery is available. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-13	xen: Set platform PCI device INTX affinity to CPU0	David Woodhouse	1	-0/+7
	With INTX or GSI delivery, Xen uses the event channel structures of CPU0. If the interrupt gets handled by Linux on a different CPU, then no events are seen as pending. Rather than introducing locking to allow other CPUs to process CPU0's events, just ensure that the PCI interrupts happens only on CPU0. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-3-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-13	xen: Fix event channel callback via INTX/GSI	David Woodhouse	7	-35/+70
	For a while, event channel notification via the PCI platform device has been broken, because we attempt to communicate with xenstore before we even have notifications working, with the xs_reset_watches() call in xs_init(). We tend to get away with this on Xen versions below 4.0 because we avoid calling xs_reset_watches() anyway, because xenstore might not cope with reading a non-existent key. And newer Xen does have the vector callback support, so we rarely fall back to INTX/GSI delivery. To fix it, clean up a bit of the mess of xs_init() and xenbus_probe() startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM case, deferring it to be called from xenbus_probe() in the XS_HVM case instead. Then fix up the invocation of xenbus_probe() to happen either from its device_initcall if the callback is available early enough, or when the callback is finally set up. This means that the hack of calling xenbus_probe() from a workqueue after the first interrupt, or directly from the PCI platform device setup, is no longer needed. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210113132606.422794-2-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-13	xen/privcmd: allow fetching resource sizes	Roger Pau Monne	1	-6/+19
	Allow issuing an IOCTL_PRIVCMD_MMAP_RESOURCE ioctl with num = 0 and addr = 0 in order to fetch the size of a specific resource. Add a shortcut to the default map resource path, since fetching the size requires no address to be passed in, and thus no VMA to setup. This is missing from the initial implementation, and causes issues when mapping resources that don't have fixed or known sizes. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> Cc: stable@vger.kernel.org # >= 4.18 Link: https://lore.kernel.org/r/20210112115358.23346-1-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-19	xen: Kconfig: remove X86_64 depends from XEN_512GB	Jason Andryuk	1	-1/+1
	commit bfda93aee0ec ("xen: Kconfig: nest Xen guest options") accidentally re-added X86_64 as a dependency to XEN_512GB. It was originally removed in commit a13f2ef168cb ("x86/xen: remove 32-bit Xen PV guest support"). Remove it again. Fixes: bfda93aee0ec ("xen: Kconfig: nest Xen guest options") Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201216140838.16085-1-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16	xen/manage: Fix fall-through warnings for Clang	Gustavo A. R. Silva	1	-0/+1
	In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/5cfc00b1d8ed68eb2c2b6317806a0aa7e57d27f1.1605896060.git.gustavoars@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16	xen-blkfront: Fix fall-through warnings for Clang	Gustavo A. R. Silva	1	-0/+1
	In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/33057688012c34dd60315ad765ff63f070e98c0c.1605896059.git.gustavoars@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16	xen: remove trailing semicolon in macro definition	Tom Rix	1	-1/+1
	The macro use will already have a semicolon. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201127160707.2622061-1-trix@redhat.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16	xen: Kconfig: nest Xen guest options	Jason Andryuk	1	-13/+13
	Moving XEN_512GB allows it to nest under XEN_PV. That also allows XEN_PVH to nest under XEN as a sibling to XEN_PV and XEN_PVHVM giving: [] Xen guest support [] Xen PV guest support [] Limit Xen pv-domain memory to 512GB [] Xen PV Dom0 support [] Xen PVHVM guest support [] Xen PVH guest support Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201014175342.152712-3-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16	xen: Remove Xen PVH/PVHVM dependency on PCI	Jason Andryuk	2	-7/+13
	A Xen PVH domain doesn't have a PCI bus or devices, so it doesn't need PCI support built in. Currently, XEN_PVH depends on XEN_PVHVM which depends on PCI. Introduce XEN_PVHVM_GUEST as a toplevel item and change XEN_PVHVM to a hidden variable. This allows XEN_PVH to depend on XEN_PVHVM without PCI while XEN_PVHVM_GUEST depends on PCI. In drivers/xen, compile platform-pci depending on XEN_PVHVM_GUEST since that pulls in the PCI dependency for linking. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201014175342.152712-2-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16	x86/xen: Convert to DEFINE_SHOW_ATTRIBUTE	Qinglang Miao	1	-11/+1
	Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200917125547.104472-1-miaoqinglang@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-14	xen-blkback: set ring->xenblkd to NULL after kthread_stop()	Pawel Wieczorkiewicz	1	-0/+1
	When xen_blkif_disconnect() is called, the kernel thread behind the block interface is stopped by calling kthread_stop(ring->xenblkd). The ring->xenblkd thread pointer being non-NULL determines if the thread has been already stopped. Normally, the thread's function xen_blkif_schedule() sets the ring->xenblkd to NULL, when the thread's main loop ends. However, when the thread has not been started yet (i.e. wake_up_process() has not been called on it), the xen_blkif_schedule() function would not be called yet. In such case the kthread_stop() call returns -EINTR and the ring->xenblkd remains dangling. When this happens, any consecutive call to xen_blkif_disconnect (for example in frontend_changed() callback) leads to a kernel crash in kthread_stop() (e.g. NULL pointer dereference in exit_creds()). This is XSA-350. Cc: <stable@vger.kernel.org> # 4.12 Fixes: a24fa22ce22a ("xen/blkback: don't use xen_blkif_get() in xen-blkback kthread") Reported-by: Olivier Benjamin <oliben@amazon.com> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-14	xenbus/xenbus_backend: Disallow pending watch messages	SeongJae Park	1	-0/+7
	'xenbus_backend' watches 'state' of devices, which is writable by guests. Hence, if guests intensively updates it, dom0 will have lots of pending events that exhausting memory of dom0. In other words, guests can trigger dom0 memory pressure. This is known as XSA-349. However, the watch callback of it, 'frontend_changed()', reads only 'state', so doesn't need to have the pending events. To avoid the problem, this commit disallows pending watch messages for 'xenbus_backend' using the 'will_handle()' watch callback. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-14	xen/xenbus: Count pending messages for each watch	SeongJae Park	2	-11/+20
	This commit adds a counter of pending messages for each watch in the struct. It is used to skip unnecessary pending messages lookup in 'unregister_xenbus_watch()'. It could also be used in 'will_handle' callback. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-14	xen/xenbus/xen_bus_type: Support will_handle watch callback	SeongJae Park	2	-1/+4
	This commit adds support of the 'will_handle' watch callback for 'xen_bus_type' users. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-14	xen/xenbus: Add 'will_handle' callback support in xenbus_watch_path()	SeongJae Park	6	-7/+17
	Some code does not directly make 'xenbus_watch' object and call 'register_xenbus_watch()' but use 'xenbus_watch_path()' instead. This commit adds support of 'will_handle' callback in the 'xenbus_watch_path()' and it's wrapper, 'xenbus_watch_pathfmt()'. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-14	xen/xenbus: Allow watches discard events before queueing	SeongJae Park	4	-1/+16
	If handling logics of watch events are slower than the events enqueue logic and the events can be created from the guests, the guests could trigger memory pressure by intensively inducing the events, because it will create a huge number of pending events that exhausting the memory. Fortunately, some watch events could be ignored, depending on its handler callback. For example, if the callback has interest in only one single path, the watch wouldn't want multiple pending events. Or, some watches could ignore events to same path. To let such watches to volutarily help avoiding the memory pressure situation, this commit introduces new watch callback, 'will_handle'. If it is not NULL, it will be called for each new event just before enqueuing it. Then, if the callback returns false, the event will be discarded. No watch is using the callback for now, though. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-13	Linux 5.10	Linus Torvalds	1	-1/+1

2020-12-12	md: change mddev 'chunk_sectors' from int to unsigned	Mike Snitzer	1	-2/+2
	Commit e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"") exposed compiler warnings introduced by commit e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10"): In file included from ./include/linux/kernel.h:14, from ./include/asm-generic/bug.h:20, from ./arch/x86/include/asm/bug.h:93, from ./include/linux/bug.h:5, from ./include/linux/mmdebug.h:5, from ./include/linux/gfp.h:5, from ./include/linux/slab.h:15, from drivers/md/dm-raid.c:8: drivers/md/dm-raid.c: In function ‘raid_io_hints’: ./include/linux/minmax.h:18:28: warning: comparison of distinct pointer types lacks a cast (!!(sizeof((typeof(x) )1 == (typeof(y) )1))) ^~ ./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’ (__typecheck(x, y) && __no_side_effects(x, y)) ^~~~~~~~~~~ ./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’ __builtin_choose_expr(__safe_cmp(x, y), \ ^~~~~~~~~~ ./include/linux/minmax.h:51:19: note: in expansion of macro ‘__careful_cmp’ #define min(x, y) __careful_cmp(x, y, <) ^~~~~~~~~~~~~ ./include/linux/minmax.h:84:39: note: in expansion of macro ‘min’ __x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); }) ^~~ drivers/md/dm-raid.c:3739:33: note: in expansion of macro ‘min_not_zero’ limits->max_discard_sectors = min_not_zero(rs->md.chunk_sectors, ^~~~~~~~~~~~ Fix this by changing the chunk_sectors member of 'struct mddev' from int to 'unsigned int' to match the type used for the 'chunk_sectors' member of 'struct queue_limits'. Various MD code still uses 'int' but none of it appears to ever make use of signed int; and storing positive signed int in unsigned is perfectly safe. Reported-by: Song Liu <songliubraving@fb.com> Fixes: e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"") Fixes: e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10") Cc: stable@vger,kernel.org # e0910c8e4f87 was marked for stable@ Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-12	x86/kprobes: Fix optprobe to detect INT3 padding correctly	Masami Hiramatsu	1	-2/+20
	Commit 7705dc855797 ("x86/vmlinux: Use INT3 instead of NOP for linker fill bytes") changed the padding bytes between functions from NOP to INT3. However, when optprobe decodes a target function it finds INT3 and gives up the jump optimization. Instead of giving up any INT3 detection, check whether the rest of the bytes to the end of the function are INT3. If all of them are INT3, those come from the linker. In that case, continue the optprobe jump optimization. [ bp: Massage commit message. ] Fixes: 7705dc855797 ("x86/vmlinux: Use INT3 instead of NOP for linker fill bytes") Reported-by: Adam Zabrocki <pi3@pi3.com.pl> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/160767025681.3880685.16021570341428835411.stgit@devnote2
2020-12-11	Input: goodix - add upside-down quirk for Teclast X98 Pro tablet	Simon Beginn	1	-0/+12
	The touchscreen on the Teclast x98 Pro is also mounted upside-down in relation to the display orientation. Signed-off-by: Simon Beginn <linux@simonmicro.de> Signed-off-by: Bastien Nocera <hadess@hadess.net> Link: https://lore.kernel.org/r/20201117004253.27A5A27EFD@localhost Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-12-11	tools/kvm_stat: Exempt time-based counters	Stefan Raspl	1	-1/+5
	The new counters halt_poll_success_ns and halt_poll_fail_ns do not count events. Instead they provide a time, and mess up our statistics. Therefore, we should exclude them. Removal is currently implemented with an exempt list. If more counters like these appear, we can think about a more general rule like excluding all fields name "*_ns", in case that's a standing convention. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Tested-and-reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20201208210829.101324-1-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-11	KVM: mmu: Fix SPTE encoding of MMIO generation upper half	Maciej S. Szmigiero	3	-10/+21
	Commit cae7ed3c2cb0 ("KVM: x86: Refactor the MMIO SPTE generation handling") cleaned up the computation of MMIO generation SPTE masks, however it introduced a bug how the upper part was encoded: SPTE bits 52-61 were supposed to contain bits 10-19 of the current generation number, however a missing shift encoded bits 1-10 there instead (mostly duplicating the lower part of the encoded generation number that then consisted of bits 1-9). In the meantime, the upper part was shrunk by one bit and moved by subsequent commits to become an upper half of the encoded generation number (bits 9-17 of bits 0-17 encoded in a SPTE). In addition to the above, commit 56871d444bc4 ("KVM: x86: fix overlap between SPTE_MMIO_MASK and generation") has changed the SPTE bit range assigned to encode the generation number and the total number of bits encoded but did not update them in the comment attached to their defines, nor in the KVM MMU doc. Let's do it here, too, since it is too trivial thing to warrant a separate commit. Fixes: cae7ed3c2cb0 ("KVM: x86: Refactor the MMIO SPTE generation handling") Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <156700708db2a5296c5ed7a8b9ac71f1e9765c85.1607129096.git.maciej.szmigiero@oracle.com> Cc: stable@vger.kernel.org [Reorganize macros so that everything is computed from the bit ranges. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-11	bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers	Andrii Nakryiko	4	-8/+8
	Remove bpf_ prefix, which causes these helpers to be reported in verifier dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Lets fix it as long as it is still possible before UAPI freezes on these helpers. Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	mm/hugetlb: clear compound_nr before freeing gigantic pages	Gerald Schaefer	1	-0/+1
	Commit 1378a5ee451a ("mm: store compound_nr as well as compound_order") added compound_nr counter to first tail struct page, overlaying with page->mapping. The overlay itself is fine, but while freeing gigantic hugepages via free_contig_range(), a "bad page" check will trigger for non-NULL page->mapping on the first tail page: BUG: Bad page state in process bash pfn:380001 page:00000000c35f0856 refcount:0 mapcount:0 mapping:00000000126b68aa index:0x0 pfn:0x380001 aops:0x0 flags: 0x3ffff00000000000() raw: 3ffff00000000000 0000000000000100 0000000000000122 0000000100000000 raw: 0000000000000000 0000000000000000 ffffffff00000000 0000000000000000 page dumped because: non-NULL mapping Modules linked in: CPU: 6 PID: 616 Comm: bash Not tainted 5.10.0-rc7-next-20201208 #1 Hardware name: IBM 3906 M03 703 (LPAR) Call Trace: show_stack+0x6e/0xe8 dump_stack+0x90/0xc8 bad_page+0xd6/0x130 free_pcppages_bulk+0x26a/0x800 free_unref_page+0x6e/0x90 free_contig_range+0x94/0xe8 update_and_free_page+0x1c4/0x2c8 free_pool_huge_page+0x11e/0x138 set_max_huge_pages+0x228/0x300 nr_hugepages_store_common+0xb8/0x130 kernfs_fop_write+0xd2/0x218 vfs_write+0xb0/0x2b8 ksys_write+0xac/0xe0 system_call+0xe6/0x288 Disabling lock debugging due to kernel taint This is because only the compound_order is cleared in destroy_compound_gigantic_page(), and compound_nr is set to 1U << order == 1 for order 0 in set_compound_order(page, 0). Fix this by explicitly clearing compound_nr for first tail page after calling set_compound_order(page, 0). Link: https://lkml.kernel.org/r/20201208182813.66391-2-gerald.schaefer@linux.ibm.com Fixes: 1378a5ee451a ("mm: store compound_nr as well as compound_order") Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: <stable@vger.kernel.org> [5.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	kasan: fix object remaining in offline per-cpu quarantine	Kuan-Ying Lee	1	-0/+39
	We hit this issue in our internal test. When enabling generic kasan, a kfree()'d object is put into per-cpu quarantine first. If the cpu goes offline, object still remains in the per-cpu quarantine. If we call kmem_cache_destroy() now, slub will report "Objects remaining" error. ============================================================================= BUG test_module_slab (Not tainted): Objects remaining in test_module_slab on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Slab 0x(____ptrval____) objects=34 used=1 fp=0x(____ptrval____) flags=0x2ffff00000010200 CPU: 3 PID: 176 Comm: cat Tainted: G B 5.10.0-rc1-00007-g4525c8781ec0-dirty #10 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x2b0 show_stack+0x18/0x68 dump_stack+0xfc/0x168 slab_err+0xac/0xd4 __kmem_cache_shutdown+0x1e4/0x3c8 kmem_cache_destroy+0x68/0x130 test_version_show+0x84/0xf0 module_attr_show+0x40/0x60 sysfs_kf_seq_show+0x128/0x1c0 kernfs_seq_show+0xa0/0xb8 seq_read+0x1f0/0x7e8 kernfs_fop_read+0x70/0x338 vfs_read+0xe4/0x250 ksys_read+0xc8/0x180 __arm64_sys_read+0x44/0x58 el0_svc_common.constprop.0+0xac/0x228 do_el0_svc+0x38/0xa0 el0_sync_handler+0x170/0x178 el0_sync+0x174/0x180 INFO: Object 0x(____ptrval____) @offset=15848 INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 pid=172 stack_trace_save+0x9c/0xd0 set_track+0x64/0xf0 alloc_debug_processing+0x104/0x1a0 ___slab_alloc+0x628/0x648 __slab_alloc.isra.0+0x2c/0x58 kmem_cache_alloc+0x560/0x588 test_version_show+0x98/0xf0 module_attr_show+0x40/0x60 sysfs_kf_seq_show+0x128/0x1c0 kernfs_seq_show+0xa0/0xb8 seq_read+0x1f0/0x7e8 kernfs_fop_read+0x70/0x338 vfs_read+0xe4/0x250 ksys_read+0xc8/0x180 __arm64_sys_read+0x44/0x58 el0_svc_common.constprop.0+0xac/0x228 kmem_cache_destroy test_module_slab: Slab cache still has objects Register a cpu hotplug function to remove all objects in the offline per-cpu quarantine when cpu is going offline. Set a per-cpu variable to indicate this cpu is offline. [qiang.zhang@windriver.com: fix slab double free when cpu-hotplug] Link: https://lkml.kernel.org/r/20201204102206.20237-1-qiang.zhang@windriver.com Link: https://lkml.kernel.org/r/1606895585-17382-2-git-send-email-Kuan-Ying.Lee@mediatek.com Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Guangye Yang <guangye.yang@mediatek.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Nicholas Tang <nicholas.tang@mediatek.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Qian Cai <qcai@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	elfcore: fix building with clang	Arnd Bergmann	3	-27/+22
	kernel/elfcore.c only contains weak symbols, which triggers a bug with clang in combination with recordmcount: Cannot find symbol for section 2: .text. kernel/elfcore.o: failed Move the empty stubs into linux/elfcore.h as inline functions. As only two architectures use these, just use the architecture specific Kconfig symbols to key off the declaration. Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Barret Rhoden <brho@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	initramfs: fix clang build failure	Arnd Bergmann	1	-1/+1
	There is only one function in init/initramfs.c that is in the .text section, and it is marked __weak. When building with clang-12 and the integrated assembler, this leads to a bug with recordmcount: ./scripts/recordmcount "init/initramfs.o" Cannot find symbol for section 2: .text. init/initramfs.o: failed I'm not quite sure what exactly goes wrong, but I notice that this function is only ever called from an __init function, and normally inlined. Marking it __init as well is clearly correct and it leads to recordmcount no longer complaining. Link: https://lkml.kernel.org/r/20201204165742.3815221-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Barret Rhoden <brho@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	kbuild: avoid static_assert for genksyms	Arnd Bergmann	1	-0/+5
	genksyms does not know or care about the _Static_assert() built-in, and sometimes falls back to ignoring the later symbols, which causes undefined behavior such as WARNING: modpost: EXPORT symbol "ethtool_set_ethtool_phy_ops" [vmlinux] version generation failed, symbol will not be versioned. ld: net/ethtool/common.o: relocation R_AARCH64_ABS32 against `__crc_ethtool_set_ethtool_phy_ops' can not be used when making a shared object net/ethtool/common.o:(_ftrace_annotated_branch+0x0): dangerous relocation: unsupported relocation Redefine static_assert for genksyms to avoid that. Link: https://lkml.kernel.org/r/20201203230955.1482058-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Kees Cook <keescook@chromium.org> Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	selftest/fpu: avoid clang warning	Arnd Bergmann	1	-1/+2
	With extra warnings enabled, clang complains about the redundant -mhard-float argument: clang: error: argument unused during compilation: '-mhard-float' [-Werror,-Wunused-command-line-argument] Move this into the gcc-only part of the Makefile. Link: https://lkml.kernel.org/r/20201203223652.1320700-1-arnd@kernel.org Fixes: 4185b3b92792 ("selftests/fpu: Add an FPU selftest") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Petteri Aimonen <jpa@git.mail.kapsi.fi> Cc: Borislav Petkov <bp@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	proc: use untagged_addr() for pagemap_read addresses	Miles Chen	1	-2/+6
	When we try to visit the pagemap of a tagged userspace pointer, we find that the start_vaddr is not correct because of the tag. To fix it, we should untag the userspace pointers in pagemap_read(). I tested with 5.10-rc4 and the issue remains. Explanation from Catalin in [1]: "Arguably, that's a user-space bug since tagged file offsets were never supported. In this case it's not even a tag at bit 56 as per the arm64 tagged address ABI but rather down to bit 47. You could say that the problem is caused by the C library (malloc()) or whoever created the tagged vaddr and passed it to this function. It's not a kernel regression as we've never supported it. Now, pagemap is a special case where the offset is usually not generated as a classic file offset but rather derived by shifting a user virtual address. I guess we can make a concession for pagemap (only) and allow such offset with the tag at bit (56 - PAGE_SHIFT + 3)" My test code is based on [2]: A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8 userspace program: uint64 OsLayer::VirtualToPhysical(void vaddr) { uint64 frame, paddr, pfnmask, pagemask; int pagesize = sysconf(_SC_PAGESIZE); off64_t off = ((uintptr_t)vaddr) / pagesize 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0 int fd = open(kPagemapPath, O_RDONLY); ... if (lseek64(fd, off, SEEK_SET) != off \|\| read(fd, &frame, 8) != 8) { int err = errno; string errtxt = ErrorString(err); if (fd >= 0) close(fd); return 0; } ... } kernel fs/proc/task_mmu.c: static ssize_t pagemap_read(struct file file, char __user buf, size_t count, loff_t ppos) { ... src = ppos; svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54 start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000 end_vaddr = mm->task_size; /* watch out for wraparound */ // svpfn == 0xb400007662f54 // (mm->task_size >> PAGE) == 0x8000000 if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4 start_vaddr = end_vaddr; ret = 0; while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr int len; unsigned long end; ... } ... } [1] https://lore.kernel.org/patchwork/patch/1343258/ [2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158 Link: https://lkml.kernel.org/r/20201204024347.8295-1-miles.chen@mediatek.com Signed-off-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Will Deacon <will@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com> Cc: <stable@vger.kernel.org> [5.4-] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	revert "mm/filemap: add static for function __add_to_page_cache_locked"	Andrew Morton	1	-1/+1
	Revert commit 3351b16af494 ("mm/filemap: add static for function __add_to_page_cache_locked") due to incompatibility with ALLOW_ERROR_INJECTION which result in build errors. Link: https://lkml.kernel.org/r/CAADnVQJ6tmzBXvtroBuEH6QA0H+q7yaSKxrVvVxhqr3KBZdEXg@mail.gmail.com Tested-by: Justin Forbes <jmforbes@linuxtx.org> Tested-by: Greg Thelen <gthelen@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Michal Kubecek <mkubecek@suse.cz> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Tony Luck <tony.luck@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	Input: cm109 - do not stomp on control URB	Dmitry Torokhov	1	-2/+5
	We need to make sure we are not stomping on the control URB that was issued when opening the device when attempting to toggle buzzer. To do that we need to mark it as pending in cm109_open(). Reported-and-tested-by: syzbot+150f793ac5bc18eee150@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-12-11	mtd: rawnand: xway: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: d525914b5bd8 ("mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-10-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: socrates: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: b36bf0a0fe5d ("mtd: rawnand: socrates: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-9-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: plat_nand: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 612e048e6aab ("mtd: rawnand: plat_nand: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-8-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: pasemi: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 8fc6f1f042b2 ("mtd: rawnand: pasemi: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-7-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: orion: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Reported-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Fixes: 553508cec2e8 ("mtd: rawnand: orion: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-6-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: mpc5121: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 6dd09f775b72 ("mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-5-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: gpio: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: f6341f6448e0 ("mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-4-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: au1550: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: dbffc8ccdf3a ("mtd: rawnand: au1550: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-3-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: ams-delta: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 59d93473323a ("mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-2-miquel.raynal@bootlin.com
2020-12-11	Revert "scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()"	Andrea Parri (Microsoft)	1	-5/+0
	This reverts commit 3b8c72d076c42bf27284cda7b2b2b522810686f8. Dexuan reported a regression where StorVSC fails to probe a device (and where, consequently, the VM may fail to boot). The root-cause analysis led to a long-standing race condition that is exposed by the validation /commit in question. Let's put the new validation aside until a proper solution for that race condition is in place. Link: https://lore.kernel.org/r/20201211131404.21359-1-parri.andrea@gmail.com Fixes: 3b8c72d076c4 ("scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()") Cc: Dexuan Cui <decui@microsoft.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-10	RISC-V: Define get_cycles64() regardless of M-mode	Palmer Dabbelt	1	-2/+2
	The timer driver uses get_cycles64() unconditionally to obtain the current time. A recent refactoring lost the common definition for some configs, which is now the only one we need. Fixes: d5be89a8d118 ("RISC-V: Resurrect the MMIO timer implementation for M-mode systems") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-12-11	drm/i915/display: Go softly softly on initial modeset failure	Chris Wilson	1	-1/+1
	Reduce the module/device probe error into a mere debug to hide issues where the initial modeset is failing (after lies told by hw probe) and the system hangs with a livelock in cleaning up the failed commit. Reported-by: H.J. Lu <hjl.tools@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=210619 Fixes: b3bf99daaee9 ("drm/i915/display: Defer initial modeset until after GGTT is initialised") Fixes: ccc9e67ab26f ("drm/i915/display: Defer initial modeset until after GGTT is initialised") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201210230741.17140-1-chris@chris-wilson.co.uk
2020-12-10	x86/apic/vector: Fix ordering in vector assignment	Thomas Gleixner	1	-10/+14
	Prarit reported that depending on the affinity setting the ' irq $N: Affinity broken due to vector space exhaustion.' message is showing up in dmesg, but the vector space on the CPUs in the affinity mask is definitely not exhausted. Shung-Hsi provided traces and analysis which pinpoints the problem: The ordering of trying to assign an interrupt vector in assign_irq_vector_any_locked() is simply wrong if the interrupt data has a valid node assigned. It does: 1) Try the intersection of affinity mask and node mask 2) Try the node mask 3) Try the full affinity mask 4) Try the full online mask Obviously #2 and #3 are in the wrong order as the requested affinity mask has to take precedence. In the observed cases #1 failed because the affinity mask did not contain CPUs from node 0. That made it allocate a vector from node 0, thereby breaking affinity and emitting the misleading message. Revert the order of #2 and #3 so the full affinity mask without the node intersection is tried before actually affinity is broken. If no node is assigned then only the full affinity mask and if that fails the full online mask is tried. Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor") Reported-by: Prarit Bhargava <prarit@redhat.com> Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87ft4djtyp.fsf@nanos.tec.linutronix.de
2020-12-10	NFS: Disable READ_PLUS by default	Anna Schumaker	2	-1/+10
	We've been seeing failures with xfstests generic/091 and generic/263 when using READ_PLUS. I've made some progress on these issues, and the tests fail later on but still don't pass. Let's disable READ_PLUS by default until we can work out what is going on. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-12-10	NFSv4.2: Fix 5 seconds delay when doing inter server copy	Dai Ngo	1	-1/+1
	Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5 seconds delay regardless of the size of the copy. The delay is from nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential fails because the seqid in both nfs4_state and nfs4_stateid are 0. Fix __nfs42_ssc_open to delay setting of NFS_OPEN_STATE in nfs4_state, until after the call to update_open_stateid, to indicate this is the 1st open. This fix is part of a 2 patches, the other patch is the fix in the source server to return the stateid for COPY_NOTIFY request with seqid 1 instead of 0. Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-12-10	NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation	Chuck Lever	2	-9/+13
	By switching to an XFS-backed export, I am able to reproduce the ibcomp worker crash on my client with xfstests generic/013. For the failing LISTXATTRS operation, xdr_inline_pages() is called with page_len=12 and buflen=128. - When ->send_request() is called, rpcrdma_marshal_req() does not set up a Reply chunk because buflen is smaller than the inline threshold. Thus rpcrdma_convert_iovs() does not get invoked at all and the transport's XDRBUF_SPARSE_PAGES logic is not invoked on the receive buffer. - During reply processing, rpcrdma_inline_fixup() tries to copy received data into rq_rcv_buf->pages because page_len is positive. But there are no receive pages because rpcrdma_marshal_req() never allocated them. The result is that the ibcomp worker faults and dies. Sometimes that causes a visible crash, and sometimes it results in a transport hang without other symptoms. RPC/RDMA's XDRBUF_SPARSE_PAGES support is not entirely correct, and should eventually be fixed or replaced. However, my preference is that upper-layer operations should explicitly allocate their receive buffers (using GFP_KERNEL) when possible, rather than relying on XDRBUF_SPARSE_PAGES. Reported-by: Olga kornievskaia <kolga@netapp.com> Suggested-by: Olga kornievskaia <kolga@netapp.com> Fixes: c10a75145feb ("NFSv4.2: add the extended attribute proc functions.") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Olga kornievskaia <kolga@netapp.com> Reviewed-by: Frank van der Linden <fllinden@amazon.com> Tested-by: Olga kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>