wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2021-04-16	Drivers: hv: vmbus: Use after free in __vmbus_open()	Dan Carpenter	1	-1/+1
	The "open_info" variable is added to the &vmbus_connection.chn_msg_list, but the error handling frees "open_info" without removing it from the list. This will result in a use after free. First remove it from the list, and then free it. Fixes: 6f3d791f3006 ("Drivers: hv: vmbus: Fix rescind handling issues") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/YHV3XLCot6xBS44r@mwanda Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-14	Drivers: hv: vmbus: remove unused function	Jiapeng Chong	1	-9/+0
	Fix the following clang warning: drivers/hv/ring_buffer.c:89:1: warning: unused function 'hv_set_next_read_location' [-Wunused-function]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1618381282-119135-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-02	Drivers: hv: vmbus: Remove unused linux/version.h header	Qiheng Lin	1	-1/+0
	That file is not needed in hv.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qiheng Lin <linqiheng@huawei.com> Link: https://lore.kernel.org/r/20210331060646.2471-1-linqiheng@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-02	x86/hyperv: remove unused linux/version.h header	Zheng Yongjun	1	-1/+0
	That header is not needed in hv_proc.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yongjun Zheng <zhengyongjun3@huawei.com> Link: https://lore.kernel.org/r/20210326064942.3263776-1-zhengyongjun3@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-24	x86/Hyper-V: Support for free page reporting	Sunil Muthuswamy	6	-8/+180
	Linux has support for free page reporting now (36e66c554b5c) for virtualized environment. On Hyper-V when virtually backed VMs are configured, Hyper-V will advertise cold memory discard capability, when supported. This patch adds the support to hook into the free page reporting infrastructure and leverage the Hyper-V cold memory discard hint hypercall to report/free these pages back to the host. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Tested-by: Matheus Castello <matheus@castello.eng.br> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/SN4PR2101MB0880121FA4E2FEC67F35C1DCC0649@SN4PR2101MB0880.namprd21.prod.outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-24	x86/hyperv: Fix unused variable 'hi' warning in hv_apic_read	Xu Yihang	1	-0/+2
	Fixes the following W=1 kernel build warning(s): arch/x86/hyperv/hv_apic.c:58:15: warning: variable ‘hi’ set but not used [-Wunused-but-set-variable] Compiled with CONFIG_HYPERV enabled: make allmodconfig ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu- make W=1 arch/x86/hyperv/hv_apic.o ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu- HV_X64_MSR_EOI occupies bit 31:0 and HV_X64_MSR_TPR occupies bit 7:0, which means the higher 32 bits are not really used. Cast the variable hi to void to silence this warning. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xu Yihang <xuyihang@huawei.com> Link: https://lore.kernel.org/r/20210323025013.191533-1-xuyihang@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-24	x86/hyperv: Fix unused variable 'msr_val' warning in hv_qlock_wait	Xu Yihang	1	-2/+6
	Fixes the following W=1 kernel build warning(s): arch/x86/hyperv/hv_spinlock.c:28:16: warning: variable ‘msr_val’ set but not used [-Wunused-but-set-variable] unsigned long msr_val; As Hypervisor Top-Level Functional Specification states in chapter 7.5 Virtual Processor Idle Sleep State, "A partition which possesses the AccessGuestIdleMsr privilege (refer to section 4.2.2) may trigger entry into the virtual processor idle sleep state through a read to the hypervisor-defined MSR HV_X64_MSR_GUEST_IDLE". That means only a read of the MSR is necessary. The returned value msr_val is not used. Cast it to void to silence this warning. Reference: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xu Yihang <xuyihang@huawei.com> Link: https://lore.kernel.org/r/20210323024302.174434-1-xuyihang@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-24	hv: hyperv.h: a few mundane typo fixes	Bhaskar Chowdhury	1	-4/+4
	s/sructure/structure/ s/extention/extension/ s/offerred/offered/ s/adversley/adversely/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210321233108.3885240-1-unixbhaskar@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-22	drivers: hv: Fix EXPORT_SYMBOL and tab spaces issue	Vasanth	1	-4/+3
	1.Fixed EXPORT_SYMBOL should be follow immediately function/variable. 2.Fixed code tab spaces issue. Signed-off-by: Vasanth M <vasanth3g@gmail.com> Link: https://lore.kernel.org/r/20210310052155.39460-1-vasanth3g@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-22	Drivers: hv: vmbus: Drop error message when 'No request id available'	Andrea Parri (Microsoft)	1	-1/+0
	Running out of request IDs on a channel essentially produces the same effect as running out of space in the ring buffer, in that -EAGAIN is returned. The error message in hv_ringbuffer_write() should either be dropped (since we don't output a message when the ring buffer is full) or be made conditional/debug-only. Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Fixes: e8b7db38449ac ("Drivers: hv: vmbus: Add vmbus_requestor data structure for VMBus hardening") Link: https://lore.kernel.org/r/20210301191348.196485-1-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-11	asm-generic/hyperv: Add missing function prototypes per -W1 warnings	Michael Kelley	1	-0/+2
	Add two function prototypes for -W1 warnings generated by the kernel test robot. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1615402069-39462-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts	Michael Kelley	6	-72/+120
	STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt handling machinery is done in code under arch/x86 and Linux IRQs are not used. Adding support for ARM64 means adding equivalent code using per-cpu IRQs under arch/arm64. A better model is to treat per-cpu IRQs as the normal path (which it is for modern architectures), and the x86/x64 path as the exception. Do this by incorporating standard Linux per-cpu IRQ allocation into the main SITMER0 driver code, and bypass it in the x86/x64 exception case. For x86/x64, special case code is retained under arch/x86, but no STIMER0 interrupt handling code is needed under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-11-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	clocksource/drivers/hyper-v: Set clocksource rating based on Hyper-V feature	Michael Kelley	1	-10/+13
	On x86/x64, the TSC clocksource is available in a Hyper-V VM only if Hyper-V provides the TSC_INVARIANT flag. The rating on the Hyper-V Reference TSC page clocksource is currently set so that it will not override the TSC clocksource in this case. Alternatively, if the TSC clocksource is not available, then the Hyper-V clocksource is used. But on ARM64, the Hyper-V Reference TSC page clocksource should override the ARM arch counter, since the Hyper-V clocksource provides scaling and offsetting during live migrations that is not provided for the ARM arch counter. To get the needed behavior for both x86/x64 and ARM64, tweak the logic by defaulting the Hyper-V Reference TSC page clocksource rating to a large value that will always override. If the Hyper-V TSC_INVARIANT flag is set, then reduce the rating so that it will not override the TSC. While the logic for getting there is slightly different, the net result in the normal cases is no functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-10-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	clocksource/drivers/hyper-v: Handle sched_clock differences inline	Michael Kelley	2	-11/+24
	While the Hyper-V Reference TSC code is architecture neutral, the pv_ops.time.sched_clock() function is implemented for x86/x64, but not for ARM64. Current code calls a utility function under arch/x86 (and coming, under arch/arm64) to handle the difference. Change this approach to handle the difference inline based on whether GENERIC_SCHED_CLOCK is present. The new approach removes code under arch/* since the difference is tied more to the specifics of the Linux implementation than to the architecture. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-9-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	clocksource/drivers/hyper-v: Handle vDSO differences inline	Michael Kelley	2	-6/+8
	While the driver for the Hyper-V Reference TSC and STIMERs is architecture neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code calls into utility functions under arch/x86 (and coming, under arch/arm64) to handle the difference. Change this approach to handle the difference inline based on whether VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under arch/* since the difference is tied more to the specifics of the Linux implementation than to the architecture. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-8-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	Drivers: hv: vmbus: Move handling of VMbus interrupts	Michael Kelley	5	-22/+70
	VMbus interrupts are most naturally modelled as per-cpu IRQs. But because x86/x64 doesn't have per-cpu IRQs, the core VMbus interrupt handling machinery is done in code under arch/x86 and Linux IRQs are not used. Adding support for ARM64 means adding equivalent code using per-cpu IRQs under arch/arm64. A better model is to treat per-cpu IRQs as the normal path (which it is for modern architectures), and the x86/x64 path as the exception. Do this by incorporating standard Linux per-cpu IRQ allocation into the main VMbus driver, and bypassing it in the x86/x64 exception case. For x86/x64, special case code is retained under arch/x86, but no VMbus interrupt handling code is needed under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-7-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	Drivers: hv: vmbus: Handle auto EOI quirk inline	Michael Kelley	2	-4/+11
	On x86/x64, Hyper-V provides a flag to indicate auto EOI functionality, but it doesn't on ARM64. Handle this quirk inline instead of calling into code under arch/x86 (and coming, under arch/arm64). No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-6-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	Drivers: hv: vmbus: Move hyperv_report_panic_msg to arch neutral code	Michael Kelley	3	-33/+19
	With the new Hyper-V MSR set function, hyperv_report_panic_msg() can be architecture neutral, so move it out from under arch/x86 and merge into hv_kmsg_dump(). This move also avoids needing a separate implementation under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-5-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	Drivers: hv: Redo Hyper-V synthetic MSR get/set functions	Michael Kelley	7	-100/+110
	Current code defines a separate get and set macro for each Hyper-V synthetic MSR used by the VMbus driver. Furthermore, the get macro can't be converted to a standard function because the second argument is modified in place, which is somewhat bad form. Redo this by providing a single get and a single set function that take a parameter specifying the MSR to be operated on. Fixup usage of the get function. Calling locations are no more complex than before, but the code under arch/x86 and the upcoming code under arch/arm64 is significantly simplified. Also standardize the names of Hyper-V synthetic MSRs that are architecture neutral. But keep the old x86-specific names as aliases that can be removed later when all references (particularly in KVM code) have been cleaned up in a separate patch series. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-4-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	x86/hyper-v: Move hv_message_type to architecture neutral module	Michael Kelley	2	-29/+35
	The definition of enum hv_message_type includes arch neutral and x86/x64-specific values. Ideally there would be a way to put the arch neutral values in an arch neutral module, and the arch specific values in an arch specific module. But C doesn't provide a way to extend enum types. As a compromise, move the entire definition into an arch neutral module, to avoid duplicating the arch neutral values for x86/x64 and for ARM64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-3-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	Drivers: hv: vmbus: Move Hyper-V page allocator to arch neutral code	Michael Kelley	4	-27/+40
	The Hyper-V page allocator functions are implemented in an architecture neutral way. Move them into the architecture neutral VMbus module so a separate implementation for ARM64 is not needed. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-2-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08	drivers: hv: Fix whitespace errors	Vasanth	2	-2/+2
	Fixed checkpatch warning and errors on hv driver. Signed-off-by: Vasanth Mathivanan <vasanth3g@gmail.com> Link: https://lore.kernel.org/r/20210219171311.421961-1-vasanth3g@gmail.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-05	Linux 5.12-rc2	Linus Torvalds	1	-1/+1

2021-03-05	RDMA/rxe: Fix errant WARN_ONCE in rxe_completer()	Bob Pearson	1	-32/+23
	In rxe_comp.c in rxe_completer() the function free_pkt() did not clear skb which triggered a warning at 'done:' and could possibly at 'exit:'. The WARN_ONCE() calls are not actually needed. The call to free_pkt() is moved to the end to clearly show that all skbs are freed. Fixes: 899aba891cab ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()") Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-05	RDMA/rxe: Fix extra deref in rxe_rcv_mcast_pkt()	Bob Pearson	1	-24/+35
	rxe_rcv_mcast_pkt() dropped a reference to ib_device when no error occurred causing an underflow on the reference counter. This code is cleaned up to be clearer and easier to read. Fixes: 899aba891cab ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()") Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-05	RDMA/rxe: Fix missed IB reference counting in loopback	Bob Pearson	1	-1/+9
	When the noted patch below extending the reference taken by rxe_get_dev_from_net() in rxe_udp_encap_recv() until each skb is freed it was not matched by a reference in the loopback path resulting in underflows. Fixes: 899aba891cab ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()") Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-05	io_uring: don't restrict issue_flags for io_openat	Pavel Begunkov	1	-1/+1
	45d189c606292 ("io_uring: replace force_nonblock with flags") did something strange for io_openat() slicing all issue_flags but IO_URING_F_NONBLOCK. Not a bug for now, but better to just forward the flags. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-05	io_uring: make SQPOLL thread parking saner	Jens Axboe	1	-35/+30
	We have this weird true/false return from parking, and then some of the callers decide to look at that. It can lead to unbalanced parks and sqd locking. Have the callers check the thread status once it's parked. We know we have the lock at that point, so it's either valid or it's NULL. Fix race with parking on thread exit. We need to be careful here with ordering of the sdq->lock and the IO_SQ_THREAD_SHOULD_PARK bit. Rename sqd->completion to sqd->parked to reflect that this is the only thing this completion event doesn. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-05	io-wq: kill hashed waitqueue before manager exits	Jens Axboe	1	-4/+5
	If we race with shutting down the io-wq context and someone queueing a hashed entry, then we can exit the manager with it armed. If it then triggers after the manager has exited, we can have a use-after-free where io_wqe_hash_wake() attempts to wake a now gone manager process. Move the killing of the hashed write queue into the manager itself, so that we know we've killed it before the task exits. Fixes: e941894eae31 ("io-wq: make buffered file write hashed work map per-ctx") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-05	io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return	Jens Axboe	1	-0/+1
	The callback can only be armed, if we get -EIOCBQUEUED returned. It's important that we clear the WAITQ bit for other cases, otherwise we can queue for async retry and filemap will assume that we're armed and return -EAGAIN instead of just blocking for the IO. Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-05	io_uring: don't keep looping for more events if we can't flush overflow	Jens Axboe	1	-3/+12
	It doesn't make sense to wait for more events to come in, if we can't even flush the overflow we already have to the ring. Return -EBUSY for that condition, just like we do for attempts to submit with overflow pending. Cc: stable@vger.kernel.org # 5.11 Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-05	io_uring: move to using create_io_thread()	Jens Axboe	3	-109/+54
	This allows us to do task creation and setup without needing to use completions to try and synchronize with the starting thread. Get rid of the old io_wq_fork_thread() wrapper, and the 'wq' and 'worker' startup completion events - we can now do setup before the task is running. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-05	nvmet: model_number must be immutable once set	Max Gurtovoy	4	-45/+50
	In case we have already established connection to nvmf target, it shouldn't be allowed to change the model_number. E.g. if someone will identify ctrl and get model_number of "my_model" later on will change the model_numbel via configfs to "my_new_model" this will break the NVMe specification for "Get Log Page – Persistent Event Log" that refers to Model Number as: "This field contains the same value as reported in the Model Number field of the Identify Controller data structure, bytes 63:24." Although it doesn't mentioned explicitly that this field can't be changed, we can assume it. So allow setting this field only once: using configfs or in the first identify ctrl operation. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-05	nvme-fabrics: fix kato initialization	Martin George	1	-1/+4
	Currently kato is initialized to NVME_DEFAULT_KATO for both discovery & i/o controllers. This is a problem specifically for non-persistent discovery controllers since it always ends up with a non-zero kato value. Fix this by initializing kato to zero instead, and ensuring various controllers are assigned appropriate kato values as follows: non-persistent controllers - kato set to zero persistent controllers - kato set to NVMF_DEV_DISC_TMO (or any positive int via nvme-cli) i/o controllers - kato set to NVME_DEFAULT_KATO (or any positive int via nvme-cli) Signed-off-by: Martin George <marting@netapp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-05	nvme-hwmon: Return error code when registration fails	Daniel Wagner	1	-0/+1
	The hwmon pointer wont be NULL if the registration fails. Though the exit code path will assign it to ctrl->hwmon_device. Later nvme_hwmon_exit() will try to free the invalid pointer. Avoid this by returning the error code from hwmon_device_register_with_info(). Fixes: ed7770f66286 ("nvme/hwmon: rework to avoid devm allocation") Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-05	nvme-pci: add quirks for Lexar 256GB SSD	Pascal Terjan	1	-0/+3
	Add the NVME_QUIRK_NO_NS_DESC_LIST and NVME_QUIRK_IGNORE_DEV_SUBNQN quirks for this buggy device. Reported and tested in https://bugs.mageia.org/show_bug.cgi?id=28417 Signed-off-by: Pascal Terjan <pterjan@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-05	nvme-pci: mark Kingston SKC2000 as not supporting the deepest power state	Zoltán Böszörményi	1	-0/+2
	My 2TB SKC2000 showed the exact same symptoms that were provided in 538e4a8c57 ("nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs"), i.e. a complete NVME lockup that needed cold boot to get it back. According to some sources, the A2000 is simply a rebadged SKC2000 with a slightly optimized firmware. Adding the SKC2000 PCI ID to the quirk list with the same workaround as the A2000 made my laptop survive a 5 hours long Yocto bootstrap buildfest which reliably triggered the SSD lockup previously. Signed-off-by: Zoltán Böszörményi <zboszor@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-05	nvme-pci: mark Seagate Nytro XM1440 as QUIRK_NO_NS_DESC_LIST.	Julian Einwag	1	-1/+2
	The kernel fails to fully detect these SSDs, only the character devices are present: [ 10.785605] nvme nvme0: pci function 0000:04:00.0 [ 10.876787] nvme nvme1: pci function 0000:81:00.0 [ 13.198614] nvme nvme0: missing or invalid SUBNQN field. [ 13.198658] nvme nvme1: missing or invalid SUBNQN field. [ 13.206896] nvme nvme0: Shutdown timeout set to 20 seconds [ 13.215035] nvme nvme1: Shutdown timeout set to 20 seconds [ 13.225407] nvme nvme0: 16/0/0 default/read/poll queues [ 13.233602] nvme nvme1: 16/0/0 default/read/poll queues [ 13.239627] nvme nvme0: Identify Descriptors failed (8194) [ 13.246315] nvme nvme1: Identify Descriptors failed (8194) Adding the NVME_QUIRK_NO_NS_DESC_LIST fixes this problem. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205679 Signed-off-by: Julian Einwag <jeinwag-nvme@marcapo.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-04	scsi: iscsi: Verify lengths on passthrough PDUs	Chris Leech	1	-0/+9
	Open-iSCSI sends passthrough PDUs over netlink, but the kernel should be verifying that the provided PDU header and data lengths fall within the netlink message to prevent accessing beyond that in memory. Cc: stable@vger.kernel.org Reported-by: Adam Nichols <adam@grimm-co.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04	scsi: iscsi: Ensure sysfs attributes are limited to PAGE_SIZE	Chris Leech	2	-83/+90
	As the iSCSI parameters are exported back through sysfs, it should be enforcing that they never are more than PAGE_SIZE (which should be more than enough) before accepting updates through netlink. Change all iSCSI sysfs attributes to use sysfs_emit(). Cc: stable@vger.kernel.org Reported-by: Adam Nichols <adam@grimm-co.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04	scsi: iscsi: Restrict sessions and handles to admin capabilities	Lee Duncan	1	-0/+6
	Protect the iSCSI transport handle, available in sysfs, by requiring CAP_SYS_ADMIN to read it. Also protect the netlink socket by restricting reception of messages to ones sent with CAP_SYS_ADMIN. This disables normal users from being able to end arbitrary iSCSI sessions. Cc: stable@vger.kernel.org Reported-by: Adam Nichols <adam@grimm-co.com> Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04	kernel: provide create_io_thread() helper	Jens Axboe	2	-0/+32
	Provide a generic helper for setting up an io_uring worker. Returns a task_struct so that the caller can do whatever setup is needed, then call wake_up_new_task() to kick it into gear. Add a kernel_clone_args member, io_thread, which tells copy_process() to mark the task with PF_IO_WORKER. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-04	io_uring: reliably cancel linked timeouts	Pavel Begunkov	1	-0/+1
	Linked timeouts are fired asynchronously (i.e. soft-irq), and use generic cancellation paths to do its stuff, including poking into io-wq. The problem is that it's racy to access tctx->io_wq, as io_uring_task_cancel() and others may be happening at this exact moment. Mark linked timeouts with REQ_F_INLIFGHT for now, making sure there are no timeouts before io-wq destraction. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-04	io_uring: cancel-match based on flags	Pavel Begunkov	1	-2/+2
	Instead of going into request internals, like checking req->file->f_op, do match them based on REQ_F_INFLIGHT, it's set only when we want it to be reliably cancelled. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-04	dm verity: fix FEC for RS roots unaligned to block size	Milan Broz	1	-11/+12
	Optional Forward Error Correction (FEC) code in dm-verity uses Reed-Solomon code and should support roots from 2 to 24. The error correction parity bytes (of roots lengths per RS block) are stored on a separate device in sequence without any padding. Currently, to access FEC device, the dm-verity-fec code uses dm-bufio client with block size set to verity data block (usually 4096 or 512 bytes). Because this block size is not divisible by some (most!) of the roots supported lengths, data repair cannot work for partially stored parity bytes. This fix changes FEC device dm-bufio block size to "roots << SECTOR_SHIFT" where we can be sure that the full parity data is always available. (There cannot be partial FEC blocks because parity must cover whole sectors.) Because the optional FEC starting offset could be unaligned to this new block size, we have to use dm_bufio_set_sector_offset() to configure it. The problem is easily reproduced using veritysetup, e.g. for roots=13: # create verity device with RS FEC dd if=/dev/urandom of=data.img bs=4096 count=8 status=none veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=13 \| awk '/^Root hash/{ print $3 }' >roothash # create an erasure that should be always repairable with this roots setting dd if=/dev/zero of=data.img conv=notrunc bs=1 count=8 seek=4088 status=none # try to read it through dm-verity veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=13 $(cat roothash) dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer # wait for possible recursive recovery in kernel udevadm settle veritysetup close test With this fix, errors are properly repaired. device-mapper: verity-fec: 7:1: FEC 0: corrected 8 errors ... Without it, FEC code usually ends on unrecoverable failure in RS decoder: device-mapper: verity-fec: 7:1: FEC 0: failed to correct: -74 ... This problem is present in all kernels since the FEC code's introduction (kernel 4.5). It is thought that this problem is not visible in Android ecosystem because it always uses a default RS roots=2. Depends-on: a14e5ec66a7a ("dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size") Signed-off-by: Milan Broz <gmazyland@gmail.com> Tested-by: Jérôme Carretero <cJ-ko@zougloub.eu> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Cc: stable@vger.kernel.org # 4.5+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-03-04	dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size	Mikulas Patocka	1	-0/+4
	dm_bufio_get_device_size returns the device size in blocks. Before returning the value, we must subtract the nubmer of starting sectors. The number of starting sectors may not be divisible by block size. Note that currently, no target is using dm_bufio_set_sector_offset and dm_bufio_get_device_size simultaneously, so this change has no effect. However, an upcoming dm-verity-fec fix needs this change. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Milan Broz <gmazyland@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-03-04	btrfs: zoned: do not account freed region of read-only block group as zone_unusable	Naohiro Aota	1	-1/+6
	We migrate zone unusable bytes to read-only bytes when a block group is set to read-only, and account all the free region as bytes_readonly. Thus, we should not increase block_group->zone_unusable when the block group is read-only. Fixes: 169e0da91a21 ("btrfs: zoned: track unusable bytes for zones") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-04	btrfs: zoned: use sector_t for zone sectors	Naohiro Aota	1	-2/+2
	We need to use sector_t for zone_sectors, or it would set the zone size to zero when the size >= 4GB (= 2^24 sectors) by shifting the zone_sectors value by SECTOR_SHIFT. We're assuming zones sizes up to 8GiB. Fixes: 5b316468983d ("btrfs: get zone information of zoned block devices") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-04	tracing: Fix comment about the trace_event_call flags	Steven Rostedt (VMware)	1	-9/+2
	In the declaration of the struct trace_event_call, the flags has the bits defined in the comment above it. But these bits are also defined by the TRACE_EVENT_FL_* enums just above the declaration of the struct. As the comment about the flags in the struct has become stale and incorrect, just replace it with a reference to the TRACE_EVENT_FL_* enum above. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04	tracing: Skip selftests if tracing is disabled	Steven Rostedt (VMware)	1	-0/+6
	If tracing is disabled for some reason (traceoff_on_warning, command line, etc), the ftrace selftests are guaranteed to fail, as their results are defined by trace data in the ring buffers. If the ring buffers are turned off, the tests will fail, due to lack of data. Because tracing being disabled is for a specific reason (warning, user decided to, etc), it does not make sense to enable tracing to run the self tests, as the test output may corrupt the reason for the tracing to be disabled. Instead, simply skip the self tests and report that they are being skipped due to tracing being disabled. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>