wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2025-01-15	USB: gadget: Use str_enable_disable-like helpers	Krzysztof Kozlowski	11	-17/+27
	Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-5-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	USB: phy: Use str_enable_disable-like helpers	Krzysztof Kozlowski	3	-3/+6
	Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-4-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	USB: typec: Use str_enable_disable-like helpers	Krzysztof Kozlowski	6	-21/+27
	Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-3-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	USB: host: Use str_enable_disable-like helpers	Krzysztof Kozlowski	5	-7/+12
	Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-2-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	USB: Replace own str_plural with common one	Krzysztof Kozlowski	2	-20/+11
	Use existing str_plural() helper from string_choices.h to reduce amount of duplicated code. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-1-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-13	usb: phy: Remove API devm_usb_put_phy()	Zijun Hu	3	-32/+0
	Static devm_usb_phy_match() is only called by API devm_usb_put_phy(), and the API has no caller now. Remove the API and the static function. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20250112-remove_api-v1-1-49cc8f792ac9@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-13	usb: typec: tcpm: set SRC_SEND_CAPABILITIES timeout to PD_T_SENDER_RESPONSE	Jos Wang	1	-1/+1
	As PD2.0 spec ("8.3.3.2.3 PE_SRC_Send_Capabilities state"), after the Source receives the GoodCRC Message from the Sink in response to the Source_Capabilities message, it should start the SenderResponseTimer, after the timer times out, the state machine transitions to the HARD_RESET state. Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)") Cc: stable@vger.kernel.org Signed-off-by: Jos Wang <joswang@lenovo.com> Reviewed-by: Badhri Jagan Sridharan <badhri@google.com> Link: https://lore.kernel.org/r/20250105135245.7493-1-joswang1221@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-12	Linux 6.13-rc7	Linus Torvalds	1	-1/+1

2025-01-12	KVM: e500: perform hugepage check after looking up the PFN	Paolo Bonzini	1	-109/+69
	e500 KVM tries to bypass __kvm_faultin_pfn() in order to map VM_PFNMAP VMAs as huge pages. This is a Bad Idea because VM_PFNMAP VMAs could become noncontiguous as a result of callsto remap_pfn_range(). Instead, use the already existing host PTE lookup to retrieve a valid host-side mapping level after __kvm_faultin_pfn() has returned. Then find the largest size that will satisfy the guest's request while staying within a single host PTE. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12	KVM: e500: map readonly host pages for read	Paolo Bonzini	1	-2/+3
	The new __kvm_faultin_pfn() function is upset by the fact that e500 KVM ignores host page permissions - __kvm_faultin requires a "writable" outgoing argument, but e500 KVM is nonchalantly passing NULL. If the host page permissions do not include writability, the shadow TLB entry is forcibly mapped read-only. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12	KVM: e500: track host-writability of pages	Paolo Bonzini	2	-4/+13
	Add the possibility of marking a page so that the UW and SW bits are force-cleared. This is stored in the private info so that it persists across multiple calls to kvmppc_e500_setup_stlbe. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12	KVM: e500: use shadow TLB entry as witness for writability	Paolo Bonzini	1	-4/+3
	kvmppc_e500_ref_setup is returning whether the guest TLB entry is writable, which is than passed to kvm_release_faultin_page. This makes little sense for two reasons: first, because the function sets up the private data for the page and the return value feels like it has been bolted on the side; second, because what really matters is whether the _shadow_ TLB entry is writable. If it is not writable, the page can be released as non-dirty. Shift from using tlbe_is_writable(gtlbe) to doing the same check on the shadow TLB entry. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12	KVM: e500: always restore irqs	Paolo Bonzini	1	-2/+2
	If find_linux_pte fails, IRQs will not be restored. This is unlikely to happen in practice since it would have been reported as hanging hosts, but it should of course be fixed anyway. Cc: stable@vger.kernel.org Reported-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-11	usb: host: xhci-plat: Assign shared_hcd->rsrc_start	WangYuli	1	-0/+2
	When inserting a USB device, examining hcd->rsrc_start can be helpful in identifying which hcd is mounted, as the physical address represented here is typically unique. The following code snippet demonstrates this: struct usb_hcd *hcd = bus_to_hcd(udev->bus); unsigned long long usb_hcd_addr = (unsigned long long)hcd->rsrc_start; However, this approach has limitations now. For USB hosts with an MMIO interface, the effectiveness of this method is restricted to USB 2.0. Because commit 3429e91a661e ("usb: host: xhci: add platform driver support") assigned res->start to hcd->rsrc_start. But shared_hcd->rsrc_start remains unassigned, which is also necessary in certain scenarios. Fixes: 3429e91a661e ("usb: host: xhci: add platform driver support") Co-developed-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Link: https://lore.kernel.org/r/186B9F56972457B4+20250107133854.172309-1-wangyuli@uniontech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: dwc3: omap: Fix devm_regulator_get_optional() error handling	Rob Herring (Arm)	1	-2/+5
	Commit 533561a8aad5 ("usb: dwc3: omap: Use devm_regulator_get_optional()") assumed NULL was returned, but devm_regulator_get_optional() returns -ENODEV rather than NULL like other *_get_optional() functions. Fixes: 533561a8aad5 ("usb: dwc3: omap: Use devm_regulator_get_optional()") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20250109170748.3852439-1-robh@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: dwc3: st: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()	Raphael Gallais-Pou	1	-4/+2
	Letting the compiler remove these functions when the kernel is built without CONFIG_PM_SLEEP support is simpler and less error prone than the use of #ifdef based kernel configuration guards. Signed-off-by: Raphael Gallais-Pou <rgallaispou@gmail.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20250109152055.52931-1-rgallaispou@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: dwc3-am62: Fix an OF node leak in phy_syscon_pll_refclk()	Joe Hattori	1	-0/+1
	phy_syscon_pll_refclk() leaks an OF node obtained by of_parse_phandle_with_fixed_args(), thus add an of_node_put() call. Cc: stable <stable@kernel.org> Fixes: e8784c0aec03 ("drivers: usb: dwc3: Add AM62 USB wrapper driver") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20250109001638.70033-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: dwc3: Skip resume if pm_runtime_set_active() fails	Ray Chi	1	-1/+4
	When the system begins to enter suspend mode, dwc3_suspend() is called by PM suspend. There is a problem that if someone interrupt the system suspend process between dwc3_suspend() and pm_suspend() of its parent device, PM suspend will be canceled and attempt to resume suspended devices so that dwc3_resume() will be called. However, dwc3 and its parent device (like the power domain or glue driver) may already be suspended by runtime PM in fact. If this sutiation happened, the pm_runtime_set_active() in dwc3_resume() will return an error since parent device was suspended. This can lead to unexpected behavior if DWC3 proceeds to execute dwc3_resume_common(). EX. RPM suspend: ... -> dwc3_runtime_suspend() -> rpm_suspend() of parent device ... PM suspend: ... -> dwc3_suspend() -> pm_suspend of parent device ^ interrupt, so resume suspended device ... <- dwc3_resume() <-/ ^ pm_runtime_set_active() returns error To prevent the problem, this commit will skip dwc3_resume_common() and return the error if pm_runtime_set_active() fails. Fixes: 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common") Cc: stable <stable@kernel.org> Signed-off-by: Ray Chi <raychi@google.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20250106082240.3822059-1-raychi@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: typec: cros-ec-ucsi: Mark cros_ucsi_ops static/const	Stephen Boyd	1	-1/+1
	This structure isn't used outside this file and can be marked const so that it gets moved to read-only memory. Cc: Pavan Holla <pholla@chromium.org> Cc: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Cc: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Łukasz Bartosik <ukaszb@chromium.org> Link: https://lore.kernel.org/r/20250109220655.1677506-3-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: typec: cros-ec-ucsi: Add newlines to printk messages	Stephen Boyd	1	-16/+12
	These printks need newlines. Add them and convert to dev_err_probe() in cros_ucsi_probe() to simplify code. Cc: Pavan Holla <pholla@chromium.org> Cc: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Cc: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Łukasz Bartosik <ukaszb@chromium.org> Link: https://lore.kernel.org/r/20250109220655.1677506-2-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	dt-bindings: usb: Correct indentation and style in DTS example	Krzysztof Kozlowski	15	-164/+164
	DTS example in the bindings should be indented with 2- or 4-spaces and aligned with opening '- \|', so correct any differences like 3-spaces or mixtures 2- and 4-spaces in one binding. No functional changes here, but saves some comments during reviews of new patches built on existing code. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com> Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au> # aspeed,usb-vhub.yaml Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Francesco Dolcini <francesco.dolcini@toradex.com> # ti,tusb73x0-pci.yaml Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> # brcm,bdc.yaml Acked-by: Benjamin Bara <benjamin.bara@skidata.com> # cypress,hx3.yaml Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> # renesas Link: https://lore.kernel.org/r/20250107131015.246461-1-krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-10	MAINTAINERS: powerpc: Update my status	Michael Ellerman	1	-3/+2
	Maddy is taking over the day-to-day maintenance of powerpc. I will still be around to help, and as a backup. Re-order the main POWERPC list to put Maddy first to reflect that. KVM/powerpc patches will be handled by Maddy via the powerpc tree with review from Nick, so replace myself with Maddy there. Remove myself from BPF, leaving Hari & Christophe as maintainers. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-01-10	smb: client: sync the root session and superblock context passwords before automounting	Meetakshi Setiya	1	-1/+18
	In some cases, when password2 becomes the working password, the client swaps the two password fields in the root session struct, but not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit fs context from their parent mounts. Therefore, they might end up getting the passwords in the stale order. The automount should succeed, because the mount function will end up retrying with the actual password anyway. But to reduce these unnecessary session setup retries for automounts, we can sync the parent context's passwords with the root session's passwords before duplicating it to the child's fs context. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-10	sched_ext: idle: Refresh idle masks during idle-to-idle transitions	Andrea Righi	3	-15/+59
	With the consolidation of put_prev_task/set_next_task(), see commit 436f3eed5c69 ("sched: Combine the last put_prev_task() and the first set_next_task()"), we are now skipping the transition between these two functions when the previous and the next tasks are the same. As a result, the scx idle state of a CPU is updated only when transitioning to or from the idle thread. While this is generally correct, it can lead to uneven and inefficient core utilization in certain scenarios [1]. A typical scenario involves proactive wake-ups: scx_bpf_pick_idle_cpu() selects and marks an idle CPU as busy, followed by a wake-up via scx_bpf_kick_cpu(), without dispatching any tasks. In this case, the CPU continues running the idle thread, returns to idle, but remains marked as busy, preventing it from being selected again as an idle CPU (until a task eventually runs on it and releases the CPU). For example, running a workload that uses 20% of each CPU, combined with an scx scheduler using proactive wake-ups, results in the following core utilization: CPU 0: 25.7% CPU 1: 29.3% CPU 2: 26.5% CPU 3: 25.5% CPU 4: 0.0% CPU 5: 25.5% CPU 6: 0.0% CPU 7: 10.5% To address this, refresh the idle state also in pick_task_idle(), during idle-to-idle transitions, but only trigger ops.update_idle() on actual state changes to prevent unnecessary updates to the scx scheduler and maintain balanced state transitions. With this change in place, the core utilization in the previous example becomes the following: CPU 0: 18.8% CPU 1: 19.4% CPU 2: 18.0% CPU 3: 18.7% CPU 4: 19.3% CPU 5: 18.9% CPU 6: 18.7% CPU 7: 19.3% [1] https://github.com/sched-ext/scx/pull/1139 Fixes: 7c65ae81ea86 ("sched_ext: Don't call put_prev_task_scx() before picking the next task") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10	io_uring: don't touch sqd->thread off tw add	Pavel Begunkov	1	-4/+1
	With IORING_SETUP_SQPOLL all requests are created by the SQPOLL task, which means that req->task should always match sqd->thread. Since accesses to sqd->thread should be separately protected, use req->task in io_req_normal_work_add() instead. Note, in the eyes of io_req_normal_work_add(), the SQPOLL task struct is always pinned and alive, and sqd->thread can either be the task or NULL. It's only problematic if the compiler decides to reload the value after the null check, which is not so likely. Cc: stable@vger.kernel.org Cc: Bui Quang Minh <minhquangbui99@gmail.com> Reported-by: lizetao <lizetao1@huawei.com> Fixes: 78f9b61bd8e54 ("io_uring: wake SQPOLL task when task_work is added to an empty queue") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1cbbe72cf32c45a8fee96026463024cd8564a7d7.1736541357.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10	io_uring/sqpoll: zero sqd->thread on tctx errors	Pavel Begunkov	1	-1/+5
	Syzkeller reports: BUG: KASAN: slab-use-after-free in thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341 Read of size 8 at addr ffff88803578c510 by task syz.2.3223/27552 Call Trace: <TASK> ... kasan_report+0x143/0x180 mm/kasan/report.c:602 thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341 thread_group_cputime_adjusted+0xa6/0x340 kernel/sched/cputime.c:639 getrusage+0x1000/0x1340 kernel/sys.c:1863 io_uring_show_fdinfo+0xdfe/0x1770 io_uring/fdinfo.c:197 seq_show+0x608/0x770 fs/proc/fd.c:68 ... That's due to sqd->task not being cleared properly in cases where SQPOLL task tctx setup fails, which can essentially only happen with fault injection to insert allocation errors. Cc: stable@vger.kernel.org Fixes: 1251d2025c3e1 ("io_uring/sqpoll: early exit thread if task_context wasn't allocated") Reported-by: syzbot+3d92cfcfa84070b0a470@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/efc7ec7010784463b2e7466d7b5c02c2cb381635.1736519461.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10	workqueue: warn if delayed_work is queued to an offlined cpu.	Imran Khan	1	-0/+7
	delayed_work submitted to an offlined cpu, will not get executed, after the specified delay if the cpu remains offline. If the cpu never comes online the work will never get executed. checking for online cpu in __queue_delayed_work, does not sound like a good idea because to do this reliably we need hotplug lock and since work may be submitted from atomic contexts, we would have to use cpus_read_trylock. But if trylock fails we would queue the work on any cpu and this may not be optimal because our intended cpu might still be online. Putting a WARN_ON_ONCE for an already offlined cpu, will indicate users of queue_delayed_work_on, if they are (wrongly) trying to queue delayed_work on offlined cpu. Also indicate the problem of using offlined cpu with queue_delayed_work_on, in its description. Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10	poll: kill poll_does_not_wait()	Oleg Nesterov	1	-13/+3
	It no longer has users. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162743.GA18947@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10	sock_poll_wait: kill the no longer necessary barrier after poll_wait()	Oleg Nesterov	1	-10/+7
	Now that poll_wait() provides a full barrier we can remove smp_mb() from sock_poll_wait(). Also, the poll_does_not_wait() check before poll_wait() just adds the unnecessary confusion, kill it. poll_wait() does the same "p && p->_qproc" check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162736.GA18944@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10	io_uring_poll: kill the no longer necessary barrier after poll_wait()	Oleg Nesterov	1	-5/+4
	Now that poll_wait() provides a full barrier we can remove smp_rmb() from io_uring_poll(). In fact I don't think smp_rmb() was correct, it can't serialize LOADs and STOREs. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162730.GA18940@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10	poll_wait: kill the obsolete wait_address check	Oleg Nesterov	1	-1/+1
	This check is historical and no longer needed, wait_address is never NULL. These days we rely on the poll_table->_qproc check. NULL if select/poll is not going to sleep, or it already has a data to report, or all waiters have already been registered after the 1st iteration. However, poll_table *p can be NULL, see p9_fd_poll() for example, so we can't remove the "p != NULL" check. Link: https://lore.kernel.org/all/20250106180325.GF7233@redhat.com/ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162724.GA18926@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10	poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()	Oleg Nesterov	1	-1/+9
	As the comment above waitqueue_active() explains, it can only be used if both waker and waiter have mb()'s that pair with each other. However __pollwait() is broken in this respect. This is not pipe-specific, but let's look at pipe_poll() for example: poll_wait(...); // -> __pollwait() -> add_wait_queue() LOAD(pipe->head); LOAD(pipe->head); In theory these LOAD()'s can leak into the critical section inside add_wait_queue() and can happen before list_add(entry, wq_head), in this case pipe_poll() can race with wakeup_pipe_readers/writers which do smp_mb(); if (waitqueue_active(wq_head)) wake_up_interruptible(wq_head); There are more __pollwait()-like functions (grep init_poll_funcptr), and it seems that at least ep_ptable_queue_proc() has the same problem, so the patch adds smp_mb() into poll_wait(). Link: https://lore.kernel.org/all/20250102163320.GA17691@redhat.com/ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162717.GA18922@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10	xfs: lock dquot buffer before detaching dquot from b_li_list	Darrick J. Wong	1	-1/+2
	We have to lock the buffer before we can delete the dquot log item from the buffer's log item list. Cc: stable@vger.kernel.org # v6.13-rc3 Fixes: acc8f8628c3737 ("xfs: attach dquot buffer to dquot log item buffer") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-10	fs: debugfs: fix open proxy for unsafe files	Johannes Berg	1	-1/+1
	In the previous commit referenced below, I had to split the short fops handling into different proxy fops. This necessitated knowing out-of-band whether or not the ops are short or full, when attempting to convert from fops to allocated fsdata. Unfortunately, I only converted full_proxy_open() which is used for the new full_proxy_open_regular() and full_proxy_open_short(), but forgot about the call in open_proxy_open(), used for debugfs_create_file_unsafe(). Fix that, it never has short fops. Fixes: f8f25893a477 ("fs: debugfs: differentiate short fops with proxy ops") Reported-by: Suresh Kumar Kurmi <suresh.kumar.kurmi@intel.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202501101055.bb8bf3e7-lkp@intel.com Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20250110085826.cd74f3b7a36b.I430c79c82ec3f954c2ff9665753bf6ac9e63eef8@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-10	uprobes: Fix race in uprobe_free_utask	Jiri Olsa	1	-1/+1
	Max Makarov reported kernel panic [1] in perf user callchain code. The reason for that is the race between uprobe_free_utask and bpf profiler code doing the perf user stack unwind and is triggered within uprobe_free_utask function: - after current->utask is freed and - before current->utask is set to NULL general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI RIP: 0010:is_uprobe_at_func_entry+0x28/0x80 ... ? die_addr+0x36/0x90 ? exc_general_protection+0x217/0x420 ? asm_exc_general_protection+0x26/0x30 ? is_uprobe_at_func_entry+0x28/0x80 perf_callchain_user+0x20a/0x360 get_perf_callchain+0x147/0x1d0 bpf_get_stackid+0x60/0x90 bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b ? __smp_call_single_queue+0xad/0x120 bpf_overflow_handler+0x75/0x110 ... asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:__kmem_cache_free+0x1cb/0x350 ... ? uprobe_free_utask+0x62/0x80 ? acct_collect+0x4c/0x220 uprobe_free_utask+0x62/0x80 mm_release+0x12/0xb0 do_exit+0x26b/0xaa0 __x64_sys_exit+0x1b/0x20 do_syscall_64+0x5a/0x80 It can be easily reproduced by running following commands in separate terminals: # while :; do bpftrace -e 'uprobe:/bin/ls:_start { printf("hit\n"); }' -c ls; done # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }' Fixing this by making sure current->utask pointer is set to NULL before we start to release the utask object. [1] https://github.com/grafana/pyroscope/issues/3673 Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe") Reported-by: Max Makarov <maxpain@linux.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250109141440.2692173-1-jolsa@kernel.org
2025-01-10	tracing/kprobes: Fix to free objects when failed to copy a symbol	Masami Hiramatsu (Google)	1	-2/+4
	In __trace_kprobe_create(), if something fails it must goto error block to free objects. But when strdup() a symbol, it returns without that. Fix it to goto the error block to free objects correctly. Link: https://lore.kernel.org/all/173643297743.1514810.2408159540454241947.stgit@devnote2/ Fixes: 6212dd29683e ("tracing/kprobes: Use dyn_event framework for kprobe events") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-01-09	firewall: remove misplaced semicolon from stm32_firewall_get_firewall	guanjing	1	-1/+1
	Remove misplaced colon in stm32_firewall_get_firewall() which results in a syntax error when the code is compiled without CONFIG_STM32_FIREWALL. Fixes: 5c9668cfc6d7 ("firewall: introduce stm32_firewall framework") Signed-off-by: guanjing <guanjing@cmss.chinamobile.com> Reviewed-by: Gatien Chevallier <gatien.chevallier@foss.st.com> Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-09	drivers/perf: riscv: Do not allow invalid raw event config	Atish Patra	1	-2/+5
	The SBI specification allows only lower 48bits of hpmeventX to be configured via SBI PMU. Currently, the driver masks of the higher bits but doesn't return an error. This will lead to an additional SBI call for config matching which should return for an invalid event error in most of the cases. However, if a platform(i.e Rocket and sifive cores) implements a bitmap of all bits in the event encoding this will lead to an incorrect event being programmed leading to user confusion. Report the error to the user if higher bits are set during the event mapping itself to avoid the confusion and save an additional SBI call. Suggested-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-3-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	drivers/perf: riscv: Return error for default case	Atish Patra	1	-2/+3
	If the upper two bits has an invalid valid (0x1), the event mapping is not reliable as it returns an uninitialized variable. Return appropriate value for the default case. Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-2-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	drivers/perf: riscv: Fix Platform firmware event data	Atish Patra	2	-7/+6
	Platform firmware event data field is allowed to be 62 bits for Linux as uppper most two bits are reserved to indicate SBI fw or platform specific firmware events. However, the event data field is masked as per the hardware raw event mask which is not correct. Fix the platform firmware event data field with proper mask. Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	tools: selftests: riscv: Add test count for vstate_prctl	Yong-Xuan Wang	1	-0/+2
	Add the test count to drop the warning message. "Planned tests != run tests (0 != 1)" Fixes: 7cf6198ce22d ("selftests: Test RISC-V Vector prctl interface") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Andy Chiu <AndybnAC@gmail.com> Link: https://lore.kernel.org/r/20241220091730.28006-3-yongxuan.wang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	tools: selftests: riscv: Add pass message for v_initval_nolibc	Yong-Xuan Wang	1	-0/+4
	Add the pass message after we successfully complete the test. Fixes: 5c93c4c72fbc ("selftests: Test RISC-V Vector's first-use handler") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Andy Chiu <AndybnAC@gmail.com> Link: https://lore.kernel.org/r/20241220091730.28006-2-yongxuan.wang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09	rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-7/+32
	As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The per-netns structure can be obtained from the table->data using container_of(), then the 'net' one can be retrieved from the listen socket (if available). Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-9-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-1/+2
	As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is used. Fixes: d1e462a7a5f3 ("sctp: add probe_interval in sysctl and sock/asoc/transport") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-8-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	sctp: sysctl: udp_port: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-1/+1
	As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, but that would increase the size of this fix, while 'sctp.ctl_sock' still needs to be retrieved from 'net' structure. Fixes: 046c052b475e ("sctp: enable udp tunneling socks") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-7-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	sctp: sysctl: auth_enable: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-1/+1
	As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, but that would increase the size of this fix, while 'sctp.ctl_sock' still needs to be retrieved from 'net' structure. Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-6-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	sctp: sysctl: rto_min/max: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-2/+2
	As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.rto_min/max' is used. Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-5-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-1/+2
	As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.sctp_hmac_alg' is used. Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-4-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	mptcp: sysctl: blackhole timeout: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-1/+3
	As mentioned in the previous commit, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'pernet' structure can be obtained from the table->data using container_of(). Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-3-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	mptcp: sysctl: sched: avoid using current->nsproxy	Matthieu Baerts (NGI0)	1	-6/+5
	Using the 'net' structure via 'current' is not recommended for different reasons. First, if the goal is to use it to read or write per-netns data, this is inconsistent with how the "generic" sysctl entries are doing: directly by only using pointers set to the table entry, e.g. table->data. Linked to that, the per-netns data should always be obtained from the table linked to the netns it had been created for, which may not coincide with the reader's or writer's netns. Another reason is that access to current->nsproxy->netns can oops if attempted when current->nsproxy had been dropped when the current task is exiting. This is what syzbot found, when using acct(2): Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] CPU: 1 UID: 0 PID: 5924 Comm: syz-executor Not tainted 6.13.0-rc5-syzkaller-00004-gccb98ccef0e5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125 Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00 RSP: 0018:ffffc900034774e8 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620 RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028 RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040 R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000 R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> proc_sys_call_handler+0x403/0x5d0 fs/proc/proc_sysctl.c:601 __kernel_write_iter+0x318/0xa80 fs/read_write.c:612 __kernel_write+0xf6/0x140 fs/read_write.c:632 do_acct_process+0xcb0/0x14a0 kernel/acct.c:539 acct_pin_kill+0x2d/0x100 kernel/acct.c:192 pin_kill+0x194/0x7c0 fs/fs_pin.c:44 mnt_pin_kill+0x61/0x1e0 fs/fs_pin.c:81 cleanup_mnt+0x3ac/0x450 fs/namespace.c:1366 task_work_run+0x14e/0x250 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xad8/0x2d70 kernel/exit.c:938 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x2576/0x2610 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fee3cb87a6a Code: Unable to access opcode bytes at 0x7fee3cb87a40. RSP: 002b:00007fffcccac688 EFLAGS: 00000202 ORIG_RAX: 0000000000000037 RAX: 0000000000000000 RBX: 00007fffcccac710 RCX: 00007fee3cb87a6a RDX: 0000000000000041 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000003 R08: 00007fffcccac6ac R09: 00007fffcccacac7 R10: 00007fffcccac710 R11: 0000000000000202 R12: 00007fee3cd49500 R13: 00007fffcccac6ac R14: 0000000000000000 R15: 00007fee3cd4b000 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125 Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00 RSP: 0018:ffffc900034774e8 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620 RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028 RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040 R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000 R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess), 1 bytes skipped: 0: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1) 5: 0f 85 fe 02 00 00 jne 0x309 b: 4d 8b a4 24 08 09 00 mov 0x908(%r12),%r12 12: 00 13: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 1a: fc ff df 1d: 49 8d 7c 24 28 lea 0x28(%r12),%rdi 22: 48 89 fa mov %rdi,%rdx 25: 48 c1 ea 03 shr $0x3,%rdx * 29: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2d: 0f 85 cc 02 00 00 jne 0x2ff 33: 4d 8b 7c 24 28 mov 0x28(%r12),%r15 38: 48 rex.W 39: 8d .byte 0x8d 3a: 84 24 c8 test %ah,(%rax,%rcx,8) Here with 'net.mptcp.scheduler', the 'net' structure is not really needed, because the table->data already has a pointer to the current scheduler, the only thing needed from the per-netns data. Simply use 'data', instead of getting (most of the time) the same thing, but from a longer and indirect way. Fixes: 6963c508fd7a ("mptcp: only allow set existing scheduler for net.mptcp.scheduler") Cc: stable@vger.kernel.org Reported-by: syzbot+e364f774c6f57f2c86d1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-2-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>