aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/exported-sql-viewer.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-02-26ext2: Make ext2_params_spec staticJan Kara1-1/+1
It isn't used outside of fs/ext2/super.c. Reported-by: kernel test robot <lkp@intel.com> Fixes: eab61d3260d7 ("ext2: convert to the new mount API") Signed-off-by: Jan Kara <jack@suse.cz>
2025-02-24ext2: create ext2_msg_fc for use during parsingEric Sandeen1-13/+37
Rather than send a NULL sb to ext2_msg, which omits the s_id from messages, create a new ext2_msg_fc which is able to provide this information from the filesystem context *fc when parsing. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250223201014.7541-3-sandeen@redhat.com
2025-02-24ext2: convert to the new mount APIEric Sandeen2-262/+310
Convert ext2 to the new mount API. Note that this makes the sb= option more accepting than it was before; previosly, sb= was only accepted if it was the first specified option. Now it can exist anywhere, and if respecified, the last specified value is used. Parse-time messages here are sent to ext2_msg with a NULL sb, and ext2_msg is adjusted to accept that, as ext4 does today as well. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250223201014.7541-2-sandeen@redhat.com
2025-02-14ext2: Remove reference to bh->b_pageMatthew Wilcox (Oracle)1-1/+1
Buffer heads are attached to folios, not to pages. Also flush_dcache_page() is now deprecated in favour of flush_dcache_folio(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250213182045.2131356-1-willy@infradead.org
2025-02-12isofs: fix KMSAN uninit-value bug in do_isofs_readdir()Qasim Ijaz1-1/+2
In do_isofs_readdir() when assigning the variable "struct iso_directory_record *de" the b_data field of the buffer_head is accessed and an offset is added to it, the size of b_data is 2048 and the offset size is 2047, meaning "de = (struct iso_directory_record *) (bh->b_data + offset);" yields the final byte of the 2048 sized b_data block. The first byte of the directory record (de_len) is then read and found to be 31, meaning the directory record size is 31 bytes long. The directory record is defined by the structure: struct iso_directory_record { __u8 length; // 1 byte __u8 ext_attr_length; // 1 byte __u8 extent[8]; // 8 bytes __u8 size[8]; // 8 bytes __u8 date[7]; // 7 bytes __u8 flags; // 1 byte __u8 file_unit_size; // 1 byte __u8 interleave; // 1 byte __u8 volume_sequence_number[4]; // 4 bytes __u8 name_len; // 1 byte char name[]; // variable size } __attribute__((packed)); The fixed portion of this structure occupies 33 bytes. Therefore, a valid directory record must be at least 33 bytes long (even without considering the variable-length name field). Since de_len is only 31, it is insufficient to contain the complete fixed header. The code later hits the following sanity check that compares de_len against the sum of de->name_len and sizeof(struct iso_directory_record): if (de_len < de->name_len[0] + sizeof(struct iso_directory_record)) { ... } Since the fixed portion of the structure is 33 bytes (up to and including name_len member), a valid record should have de_len of at least 33 bytes; here, however, de_len is too short, and the field de->name_len (located at offset 32) is accessed even though it lies beyond the available 31 bytes. This access on the corrupted isofs data triggers a KASAN uninitialized memory warning. The fix would be to first verify that de_len is at least sizeof(struct iso_directory_record) before accessing any fields like de->name_len. Reported-by: syzbot <syzbot+812641c6c3d7586a1613@syzkaller.appspotmail.com> Tested-by: syzbot <syzbot+812641c6c3d7586a1613@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=812641c6c3d7586a1613 Fixes: 2deb1acc653c ("isofs: fix access to unallocated memory when reading corrupted filesystem") Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250211195900.42406-1-qasdev00@gmail.com
2025-02-11x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-ExitPatrick Bellasi2-8/+16
In [1] the meaning of the synthetic IBPB flags has been redefined for a better separation of concerns: - ENTRY_IBPB -- issue IBPB on entry only - IBPB_ON_VMEXIT -- issue IBPB on VM-Exit only and the Retbleed mitigations have been updated to match this new semantics. Commit [2] was merged shortly before [1], and their interaction was not handled properly. This resulted in IBPB not being triggered on VM-Exit in all SRSO mitigation configs requesting an IBPB there. Specifically, an IBPB on VM-Exit is triggered only when X86_FEATURE_IBPB_ON_VMEXIT is set. However: - X86_FEATURE_IBPB_ON_VMEXIT is not set for "spec_rstack_overflow=ibpb", because before [1] having X86_FEATURE_ENTRY_IBPB was enough. Hence, an IBPB is triggered on entry but the expected IBPB on VM-exit is not. - X86_FEATURE_IBPB_ON_VMEXIT is not set also when "spec_rstack_overflow=ibpb-vmexit" if X86_FEATURE_ENTRY_IBPB is already set. That's because before [1] this was effectively redundant. Hence, e.g. a "retbleed=ibpb spec_rstack_overflow=bpb-vmexit" config mistakenly reports the machine still vulnerable to SRSO, despite an IBPB being triggered both on entry and VM-Exit, because of the Retbleed selected mitigation config. - UNTRAIN_RET_VM won't still actually do anything unless CONFIG_MITIGATION_IBPB_ENTRY is set. For "spec_rstack_overflow=ibpb", enable IBPB on both entry and VM-Exit and clear X86_FEATURE_RSB_VMEXIT which is made superfluous by X86_FEATURE_IBPB_ON_VMEXIT. This effectively makes this mitigation option similar to the one for 'retbleed=ibpb', thus re-order the code for the RETBLEED_MITIGATION_IBPB option to be less confusing by having all features enabling before the disabling of the not needed ones. For "spec_rstack_overflow=ibpb-vmexit", guard this mitigation setting with CONFIG_MITIGATION_IBPB_ENTRY to ensure UNTRAIN_RET_VM sequence is effectively compiled in. Drop instead the CONFIG_MITIGATION_SRSO guard, since none of the SRSO compile cruft is required in this configuration. Also, check only that the required microcode is present to effectively enabled the IBPB on VM-Exit. Finally, update the KConfig description for CONFIG_MITIGATION_IBPB_ENTRY to list also all SRSO config settings enabled by this guard. Fixes: 864bcaa38ee4 ("x86/cpu/kvm: Provide UNTRAIN_RET_VM") [1] Fixes: d893832d0e1e ("x86/srso: Add IBPB on VMEXIT") [2] Reported-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Patrick Bellasi <derkling@google.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-10NFSD: Fix CB_GETATTR status fixChuck Lever1-1/+1
Jeff says: Now that I look, 1b3e26a5ccbf is wrong. The patch on the ml was correct, but the one that got committed is different. It should be: status = decode_cb_op_status(xdr, OP_CB_GETATTR, &cb->cb_status); if (unlikely(status || cb->cb_status)) If "status" is non-zero, decoding failed (usu. BADXDR), but we also want to bail out and not decode the rest of the call if the decoded cb_status is non-zero. That's not happening here, cb_seq_status has already been checked and is non-zero, so this ends up trying to decode the rest of the CB_GETATTR reply when it doesn't exist. Reported-by: Jeff Layton <jlayton@kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219737 Fixes: 1b3e26a5ccbf ("NFSD: fix decoding in nfs4_xdr_dec_cb_getattr") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-02-10NFSD: fix hang in nfsd4_shutdown_callbackDai Ngo1-2/+5
If nfs4_client is in courtesy state then there is no point to send the callback. This causes nfsd4_shutdown_callback to hang since cl_cb_inflight is not 0. This hang lasts about 15 minutes until TCP notifies NFSD that the connection was dropped. This patch modifies nfsd4_run_cb_work to skip the RPC call if nfs4_client is in courtesy state. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Fixes: 66af25799940 ("NFSD: add courteous server support for thread with only delegation") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-02-10nfsd: fix __fh_verify for localioOlga Kornievskaia1-2/+3
__fh_verify() added a call to svc_xprt_set_valid() to help do connection management but during LOCALIO path rqstp argument is NULL, leading to NULL pointer dereferencing and a crash. Fixes: eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection limiting calculations.") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-02-10nfsd: fix uninitialised slot info when a request is retriedNeilBrown1-1/+2
A recent patch moved the assignment of seq->maxslots from before the test for a resent request (which ends with a goto) to after, resulting in it not being run in that case. This results in the server returning bogus "high slot id" and "target high slot id" values. The assignments to ->maxslots and ->target_maxslots need to be *after* the out: label so that the correct values are returned in replies to requests that are served from cache. Fixes: 60aa6564317d ("nfsd: allocate new session-based DRC slots on demand.") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-02-09Linux 6.14-rc2Linus Torvalds1-1/+1
2025-02-09PM: sleep: core: Restrict power.set_active propagationRafael J. Wysocki1-12/+9
Commit 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of parents and children") exposed an issue related to simple_pm_bus_pm_ops that uses pm_runtime_force_suspend() and pm_runtime_force_resume() as bus type PM callbacks for the noirq phases of system-wide suspend and resume. The problem is that pm_runtime_force_suspend() does not distinguish runtime-suspended devices from devices for which runtime PM has never been enabled, so if it sees a device with runtime PM status set to RPM_ACTIVE, it will assume that runtime PM is enabled for that device and so it will attempt to suspend it with the help of its runtime PM callbacks which may not be ready for that. As it turns out, this causes simple_pm_bus_runtime_suspend() to crash due to a NULL pointer dereference. Another problem related to the above commit and simple_pm_bus_pm_ops is that setting runtime PM status of a device handled by the latter to RPM_ACTIVE will actually prevent it from being resumed because pm_runtime_force_resume() only resumes devices with runtime PM status set to RPM_SUSPENDED. To mitigate these issues, do not allow power.set_active to propagate beyond the parent of the device with DPM_FLAG_SMART_SUSPEND set that will need to be resumed, which should be a sufficient stop-gap for the time being, but they will need to be properly addressed in the future because in general during system-wide resume it is necessary to resume all devices in a dependency chain in which at least one device is going to be resumed. Fixes: 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of parents and children") Closes: https://lore.kernel.org/linux-pm/1c2433d4-7e0f-4395-b841-b8eac7c25651@nvidia.com/ Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/6137505.lOV4Wx5bFT@rjwysocki.net
2025-02-08fgraph: Fix set_graph_notrace with setting TRACE_GRAPH_NOTRACE_BITSteven Rostedt1-1/+1
The code was restructured where the function graph notrace code, that would not trace a function and all its children is done by setting a NOTRACE flag when the function that is not to be traced is hit. There's a TRACE_GRAPH_NOTRACE_BIT which defines the bit in the flags and a TRACE_GRAPH_NOTRACE which is the mask with that bit set. But the restructuring used TRACE_GRAPH_NOTRACE_BIT when it should have used TRACE_GRAPH_NOTRACE. For example: # cd /sys/kernel/tracing # echo set_track_prepare stack_trace_save > set_graph_notrace # echo function_graph > current_tracer # cat trace [..] 0) | __slab_free() { 0) | free_to_partial_list() { 0) | arch_stack_walk() { 0) | __unwind_start() { 0) 0.501 us | get_stack_info(); Where a non filter trace looks like: # echo > set_graph_notrace # cat trace 0) | free_to_partial_list() { 0) | set_track_prepare() { 0) | stack_trace_save() { 0) | arch_stack_walk() { 0) | __unwind_start() { Where the filter should look like: # cat trace 0) | free_to_partial_list() { 0) | _raw_spin_lock_irqsave() { 0) 0.350 us | preempt_count_add(); 0) 0.351 us | do_raw_spin_lock(); 0) 2.440 us | } Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250208001511.535be150@batman.local.home Fixes: b84214890a9bc ("function_graph: Move graph notrace bit to shadow stack global var") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-07kbuild: Move -Wenum-enum-conversion to W=2Nathan Chancellor1-1/+4
-Wenum-enum-conversion was strengthened in clang-19 to warn for C, which caused the kernel to move it to W=1 in commit 75b5ab134bb5 ("kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1") because there were numerous instances that would break builds with -Werror. Unfortunately, this is not a full solution, as more and more developers, subsystems, and distributors are building with W=1 as well, so they continue to see the numerous instances of this warning. Since the move to W=1, there have not been many new instances that have appeared through various build reports and the ones that have appeared seem to be following similar existing patterns, suggesting that most instances of this warning will not be real issues. The only alternatives for silencing this warning are adding casts (which is generally seen as an ugly practice) or refactoring the enums to macro defines or a unified enum (which may be undesirable because of type safety in other parts of the code). Move the warning to W=2, where warnings that occur frequently but may be relevant should reside. Cc: stable@vger.kernel.org Fixes: 75b5ab134bb5 ("kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1") Link: https://lore.kernel.org/ZwRA9SOcOjjLJcpi@google.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-08kbuild: install-extmod-build: add missing quotation marks for CC variableWangYuli1-1/+1
While attempting to build a Debian packages with CC="ccache gcc", I saw the following error as builddeb builds linux-headers-$KERNELVERSION: make HOSTCC=ccache gcc VPATH= srcroot=. -f ./scripts/Makefile.build obj=debian/linux-headers-6.14.0-rc1/usr/src/linux-headers-6.14.0-rc1/scripts make[6]: *** No rule to make target 'gcc'. Stop. Upon investigation, it seems that one instance of $(CC) variable reference in ./scripts/package/install-extmod-build was missing quotation marks, causing the above error. Add the missing quotation marks around $(CC) to fix build. Fixes: 5f73e7d0386d ("kbuild: refactor cross-compiling linux-headers package") Co-developed-by: Mingcong Bai <jeffbai@aosc.io> Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Tested-by: WangYuli <wangyuli@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-02-07MAINTAINERS: Remove myselfHector Martin1-1/+0
I no longer have any faith left in the kernel development process or community management approach. Apple/ARM platform development will continue downstream. If I feel like sending some patches upstream in the future myself for whatever subtree I may, or I may not. Anyone who feels like fighting the upstreaming fight themselves is welcome to do so. Signed-off-by: Hector Martin <marcan@marcan.st> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-07MAINTAINERS: Move Pavel to kernel.org addressPavel Machek2-9/+7
I need to filter my emails better, switch to pavel@kernel.org address to help with that. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-07HID: hid-steam: Don't use cancel_delayed_work_sync in IRQ contextVicki Pfau1-1/+1
Lockdep reported that, as steam_do_deck_input_event is called from steam_raw_event inside of an IRQ context, it can lead to issues if that IRQ occurs while the work to be cancelled is running. By using cancel_delayed_work, this issue can be avoided. The exact ordering of the work and the event processing is not super important, so this is safe. Fixes: cd438e57dd05 ("HID: hid-steam: Add gamepad-only mode switched to by holding options") Signed-off-by: Vicki Pfau <vi@endrift.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-02-07HID: hid-steam: Move hidraw input (un)registering to workVicki Pfau1-7/+31
Due to an interplay between locking in the input and hid transport subsystems, attempting to register or deregister the relevant input devices during the hidraw open/close events can lead to a lock ordering issue. Though this shouldn't cause a deadlock, this commit moves the input device manipulation to deferred work to sidestep the issue. Fixes: 385a4886778f6 ("HID: steam: remove input device when a hid client is running.") Signed-off-by: Vicki Pfau <vi@endrift.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-02-07HID: hid-thrustmaster: fix stack-out-of-bounds read in usb_check_int_endpoints()Tulio Fernandes1-1/+1
Syzbot[1] has detected a stack-out-of-bounds read of the ep_addr array from hid-thrustmaster driver. This array is passed to usb_check_int_endpoints function from usb.c core driver, which executes a for loop that iterates over the elements of the passed array. Not finding a null element at the end of the array, it tries to read the next, non-existent element, crashing the kernel. To fix this, a 0 element was added at the end of the array to break the for loop. [1] https://syzkaller.appspot.com/bug?extid=9c9179ac46169c56c1ad Reported-by: syzbot+9c9179ac46169c56c1ad@syzkaller.appspotmail.com Fixes: 50420d7c79c3 ("HID: hid-thrustmaster: Fix warning in thrustmaster_probe by adding endpoint check") Signed-off-by: Túlio Fernandes <tuliomf09@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-02-07HID: apple: fix up the F6 key on the Omoton KB066 keyboardAlex Henrie1-0/+3
The Omoton KB066 is an Apple A1255 keyboard clone (HID product code 05ac:022c). On both keyboards, the F6 key becomes Num Lock when the Fn key is held. But unlike its Apple exemplar, when the Omoton's F6 key is pressed without Fn, it sends the usage code 0xC0301 from the reserved section of the consumer page instead of the standard F6 usage code 0x7003F from the keyboard page. The nonstandard code is translated to KEY_UNKNOWN and becomes useless on Linux. The Omoton KB066 is a pretty popular keyboard, judging from its 29,058 reviews on Amazon at time of writing, so let's account for its quirk to make it more usable. By the way, it would be nice if we could automatically set fnmode to 0 for Omoton keyboards because they handle the Fn key internally and the kernel's Fn key handling creates undesirable side effects such as making F1 and F2 always Brightness Up and Brightness Down in fnmode=1 (the default) or always F1 and F2 in fnmode=2. Unfortunately I don't think there's a way to identify Bluetooth keyboards more specifically than the HID product code which is obviously inaccurate. Users of Omoton keyboards will just have to set fnmode to 0 manually to get full Fn key functionality. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-02-07HID: hid-apple: Apple Magic Keyboard a3203 USB-C supportIevgen Vovk2-0/+6
Add Apple Magic Keyboard 2024 model (with USB-C port) device ID (0320) to those recognized by the hid-apple driver. Keyboard is otherwise compatible with the existing implementation for its earlier 2021 model. Signed-off-by: Ievgen Vovk <YevgenVovk@ukr.net> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-02-07vfs: sanity check the length passed to inode_set_cached_link()Mateusz Guzik1-0/+13
This costs a strlen() call when instatianating a symlink. Preferably it would be hidden behind VFS_WARN_ON (or compatible), but there is no such facility at the moment. With the facility in place the call can be patched out in production kernels. In the meantime, since the cost is being paid unconditionally, use the result to a fixup the bad caller. This is not expected to persist in the long run (tm). Sample splat: bad length passed for symlink [/tmp/syz-imagegen43743633/file0/file0] (got 131109, expected 37) [rest of WARN blurp goes here] Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250204213207.337980-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07pidfs: improve ioctl handlingChristian Brauner1-1/+11
Pidfs supports extensible and non-extensible ioctls. The extensible ioctls need to check for the ioctl number itself not just the ioctl command otherwise both backward- and forward compatibility are broken. The pidfs ioctl handler also needs to look at the type of the ioctl command to guard against cases where "[...] a daemon receives some random file descriptor from a (potentially less privileged) client and expects the FD to be of some specific type, it might call ioctl() on this FD with some type-specific command and expect the call to fail if the FD is of the wrong type; but due to the missing type check, the kernel instead performs some action that userspace didn't expect." (cf. [1]] Link: https://lore.kernel.org/r/20250204-work-pidfs-ioctl-v1-1-04987d239575@kernel.org Link: https://lore.kernel.org/r/CAG48ez2K9A5GwtgqO31u9ZL292we8ZwAA=TJwwEv7wRuJ3j4Lw@mail.gmail.com [1] Fixes: 8ce352818820 ("pidfs: check for valid ioctl commands") Acked-by: Luca Boccassi <luca.boccassi@gmail.com> Reported-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org # v6.13; please backport with 8ce352818820 ("pidfs: check for valid ioctl commands") Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07fsnotify: disable pre-content and permission events by defaultAmir Goldstein1-0/+5
After introducing pre-content events, we had a regression related to disabling huge faults on files that should never have pre-content events enabled. This happened because the default f_mode of allocated files (0) does not disable pre-content events. Pre-content events are disabled in file_set_fsnotify_mode_by_watchers() but internal files may not get to call this helper. Initialize f_mode to disable permission and pre-content events for all files and if needed they will be enabled for the callers of file_set_fsnotify_mode_by_watchers(). Fixes: 20bf82a898b6 ("mm: don't allow huge faults for files with pre content watches") Reported-by: Alex Williamson <alex.williamson@redhat.com> Closes: https://lore.kernel.org/linux-fsdevel/20250131121703.1e4d00a7.alex.williamson@redhat.com/ Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250203223205.861346-4-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07selftests: always check mask returned by statmount(2)Miklos Szeredi1-1/+21
STATMOUNT_MNT_OPTS can actually be missing if there are no options. This is a change of behavior since 75ead69a7173 ("fs: don't let statmount return empty strings"). The other checks shouldn't actually trigger, but add them for correctness and for easier debugging if the test fails. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250129160641.35485-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07fsnotify: disable notification by default for all pseudo filesAmir Goldstein4-2/+24
Most pseudo files are not applicable for fsnotify events at all, let alone to the new pre-content events. Disable notifications to all files allocated with alloc_file_pseudo() and enable legacy inotify events for the specific cases of pipe and socket, which have known users of inotify events. Pre-content events are also kept disabled for sockets and pipes. Fixes: 20bf82a898b6 ("mm: don't allow huge faults for files with pre content watches") Reported-by: Alex Williamson <alex.williamson@redhat.com> Closes: https://lore.kernel.org/linux-fsdevel/20250131121703.1e4d00a7.alex.williamson@redhat.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wi2pThSVY=zhO=ZKxViBj5QCRX-=AS2+rVknQgJnHXDFg@mail.gmail.com/ Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250203223205.861346-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07fs: fix adding security options to statmount.mnt_optMiklos Szeredi1-15/+14
Prepending security options was made conditional on sb->s_op->show_options, but security options are independent of sb options. Fixes: 056d33137bf9 ("fs: prepend statmount.mnt_opts string with security_sb_mnt_opts()") Fixes: f9af549d1fd3 ("fs: export mount options via statmount()") Cc: stable@vger.kernel.org # v6.11 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250129151253.33241-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07fsnotify: use accessor to set FMODE_NONOTIFY_*Amir Goldstein5-13/+25
The FMODE_NONOTIFY_* bits are a 2-bits mode. Open coding manipulation of those bits is risky. Use an accessor file_set_fsnotify_mode() to set the mode. Rename file_set_fsnotify_mode() => file_set_fsnotify_mode_from_watchers() to make way for the simple accessor name. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250203223205.861346-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07lockref: remove count argument of lockref_initAndreas Gruenbacher5-7/+8
All users of lockref_init() now initialize the count to 1, so hardcode that and remove the count argument. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Link: https://lore.kernel.org/r/20250130135624.1899988-4-agruenba@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07gfs2: switch to lockref_init(..., 1)Andreas Gruenbacher1-2/+2
In qd_alloc(), initialize the lockref count to 1 to cover the common case. Compensate for that in gfs2_quota_init() by adjusting the count back down to 0; this only occurs when mounting the filesystem rw. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Link: https://lore.kernel.org/r/20250130135624.1899988-3-agruenba@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07gfs2: use lockref_init for gl_lockrefAndreas Gruenbacher2-2/+1
Move the initialization of gl_lockref from gfs2_init_glock_once() to gfs2_glock_get(). This allows to use lockref_init() there. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Link: https://lore.kernel.org/r/20250130135624.1899988-2-agruenba@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07statmount: let unset strings be emptyMiklos Szeredi1-9/+16
Just like it's normal for unset values to be zero, unset strings should be empty instead of containing random values. It seems to be a typical mistake that the mask returned by statmount is not checked, which can result in various bugs. With this fix, these bugs are prevented, since it is highly likely that userspace would just want to turn the missing mask case into an empty string anyway (most of the recently found cases are of this type). Link: https://lore.kernel.org/all/CAJfpegsVCPfCn2DpM8iiYSS5DpMsLB8QBUCHecoj6s0Vxf4jzg@mail.gmail.com/ Fixes: 68385d77c05b ("statmount: simplify string option retrieval") Fixes: 46eae99ef733 ("add statmount(2) syscall") Cc: stable@vger.kernel.org # v6.8 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250130121500.113446-1-mszeredi@redhat.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07vboxsf: fix building with GCC 15Brahmajit Das1-1/+2
Building with GCC 15 results in build error fs/vboxsf/super.c:24:54: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization] 24 | static const unsigned char VBSF_MOUNT_SIGNATURE[4] = "\000\377\376\375"; | ^~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Due to GCC having enabled -Werror=unterminated-string-initialization[0] by default. Separately initializing each array element of VBSF_MOUNT_SIGNATURE to ensure NUL termination, thus satisfying GCC 15 and fixing the build error. [0]: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wno-unterminated-string-initialization Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com> Link: https://lore.kernel.org/r/20250121162648.1408743-1-brahmajit.xyz@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07fs/stat.c: avoid harmless garbage value problem in vfs_statx_path()Su Hui1-1/+3
Clang static checker(scan-build) warning: fs/stat.c:287:21: warning: The left expression of the compound assignment is an uninitialized value. The computed value will also be garbage. 287 | stat->result_mask |= STATX_MNT_ID_UNIQUE; | ~~~~~~~~~~~~~~~~~ ^ fs/stat.c:290:21: warning: The left expression of the compound assignment is an uninitialized value. The computed value will also be garbage. 290 | stat->result_mask |= STATX_MNT_ID; When vfs_getattr() failed because of security_inode_getattr(), 'stat' is uninitialized. In this case, there is a harmless garbage problem in vfs_statx_path(). It's better to return error directly when vfs_getattr() failed, avoiding garbage value and more clearly. Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20250119025946.1168957-1-suhui@nfschina.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-07timers/migration: Fix off-by-one root mis-connectionFrederic Weisbecker1-1/+9
Before attaching a new root to the old root, the children counter of the new root is checked to verify that only the upcoming CPU's top group have been connected to it. However since the recently added commit b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit") this check is not valid anymore because the old root is pre-accounted as a child to the new root. Therefore after connecting the upcoming CPU's top group to the new root, the children count to be expected must be 2 and not 1 anymore. This omission results in the old root to not be connected to the new root. Then eventually the system may run with more than one top level, which defeats the purpose of a single idle migrator. Also the old root is pre-accounted but not connected upon the new root creation. But it can be connected to the new root later on. Therefore the old root may be accounted twice to the new root. The propagation of such overcommit can end up creating a double final top-level root with a groupmask incorrectly initialized. Although harmless given that the final top level roots will never have a parent to walk up to, this oddity opportunistically reported the core issue: WARNING: CPU: 8 PID: 0 at kernel/time/timer_migration.c:543 tmigr_requires_handle_remote CPU: 8 UID: 0 PID: 0 Comm: swapper/8 RIP: 0010:tmigr_requires_handle_remote Call Trace: <IRQ> ? tmigr_requires_handle_remote ? hrtimer_run_queues update_process_times tick_periodic tick_handle_periodic __sysvec_apic_timer_interrupt sysvec_apic_timer_interrupt </IRQ> Fix the problem by taking the old root into account in the children count of the new root so the connection is not omitted. Also warn when more than one top level group exists to better detect similar issues in the future. Fixes: b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit") Reported-by: Matt Fleming <mfleming@cloudflare.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250205160220.39467-1-frederic@kernel.org
2025-02-07genirq: Remove leading space from irq_chip::irq_print_chip() callbacksGeert Uytterhoeven4-4/+4
The space separator was factored out from the multiple chip name prints, but several irq_chip::irq_print_chip() callbacks still print a leading space. Remove the superfluous double spaces. Fixes: 9d9f204bdf7243bf ("genirq/proc: Add missing space separator back") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/893f7e9646d8933cd6786d5a1ef3eb076d263768.1738764803.git.geert+renesas@glider.be
2025-02-06bcachefs: bch2_bkey_sectors_need_rebalance() now only depends on bch_extent_rebalanceKent Overstreet4-20/+26
Previously, bch2_bkey_sectors_need_rebalance() called bch2_target_accepts_data(), checking whether the target is writable. However, this means that adding or removing devices from a target would change the value of bch2_bkey_sectors_need_rebalance() for an existing extent; this needs to be invariant so that the extent trigger can correctly maintain rebalance_work accounting. Instead, check target_accepts_data() in io_opts_to_rebalance_opts(), before creating the bch_extent_rebalance entry. This fixes (one?) cause of rebalance_work accounting being off. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs: Fix rcu imbalance in bch2_fs_btree_key_cache_exit()Kent Overstreet1-1/+0
Spotted by sparse. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs: Fix discard path journal flushingKent Overstreet8-35/+55
The discard path is supposed to issue journal flushes when there's too many buckets empty buckets that need a journal commit before they can be written to again, but at some point this code seems to have been lost. Bring it back with a new optimization to make sure we don't issue too many journal flushes: the journal now tracks the sequence number of the most recent flush in progress, which the discard path uses when deciding which buckets need a journal flush. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs: fix deadlock in journal_entry_open()Jeongjun Park4-2/+28
In the previous commit b3d82c2f2761, code was added to prevent journal sequence overflow. Among them, the code added to journal_entry_open() uses the bch2_fs_fatal_err_on() function to handle errors. However, __journal_res_get() , which calls journal_entry_open() , calls journal_entry_open() while holding journal->lock , but bch2_fs_fatal_err_on() internally tries to acquire journal->lock , which results in a deadlock. So we need to add a locked helper to handle fatal errors even when the journal->lock is held. Fixes: b3d82c2f2761 ("bcachefs: Guard against journal seq overflow") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs: fix incorrect pointer check in __bch2_subvolume_delete()Jeongjun Park1-1/+6
For some unknown reason, checks on struct bkey_s_c_snapshot and struct bkey_s_c_snapshot_tree pointers are missing. Therefore, I think it would be appropriate to fix the incorrect pointer checking through this patch. Fixes: 4bd06f07bcb5 ("bcachefs: Fixes for snapshot_tree.master_subvol") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs docs: SubmittingPatches.rstKent Overstreet3-0/+100
Add an (initial?) patch submission checklist, focusing mainly on testing. Yes, all patches must be tested, and that starts (but does not end) with the patch author. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06string.h: Use ARRAY_SIZE() for memtostr*()/strtomem*()Kees Cook1-4/+8
The destination argument of memtostr*() and strtomem*() must be a fixed-size char array at compile time, so there is no need to use __builtin_object_size() (which is useful for when an argument is either a pointer or unknown). Instead use ARRAY_SIZE(), which has the benefit of working around a bug in Clang (fixed[1] in 15+) that got __builtin_object_size() wrong sometimes. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501310832.kiAeOt2z-lkp@intel.com/ Suggested-by: Kent Overstreet <kent.overstreet@linux.dev> Link: https://github.com/llvm/llvm-project/commit/d8e0a6d5e9dd2311641f9a8a5d2bf90829951ddc [1] Tested-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06compiler.h: Introduce __must_be_byte_array()Kees Cook1-1/+7
In preparation for adding stricter type checking to the str/mem*() helpers, provide a way to check that a variable is a byte array via __must_be_byte_array(). Suggested-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06compiler.h: Move C string helpers into C-only kernel sectionKees Cook1-13/+13
The C kernel helpers for evaluating C Strings were positioned where they were visible to assembly inclusion, which was not intended. Move them into the kernel and C-only area of the header so future changes won't confuse the assembler. Fixes: d7a516c6eeae ("compiler.h: Fix undefined BUILD_BUG_ON_ZERO()") Fixes: 559048d156ff ("string: Check for "nonstring" attribute on strscpy() arguments") Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-07x86: rust: set rustc-abi=x86-softfloat on rustc>=1.86.0Alice Ryhl1-0/+18
When using Rust on the x86 architecture, we are currently using the unstable target.json feature to specify the compilation target. Rustc is going to change how softfloat is specified in the target.json file on x86, thus update generate_rust_target.rs to specify softfloat using the new option. Note that if you enable this parameter with a compiler that does not recognize it, then that triggers a warning but it does not break the build. [ For future reference, this solves the following error: RUSTC L rust/core.o error: Error loading target specification: target feature `soft-float` is incompatible with the ABI but gets enabled in target spec. Run `rustc --print target-list` for a list of built-in targets - Miguel ] Cc: <stable@vger.kernel.org> # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs). Link: https://github.com/rust-lang/rust/pull/136146 Signed-off-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86 Link: https://lore.kernel.org/r/20250203-rustc-1-86-x86-softfloat-v1-1-220a72a5003e@google.com [ Added 6.13.y too to Cc: stable tag and added reasoning to avoid over-backporting. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-02-06selftests/seccomp: validate uretprobe syscall passes through seccompEyal Birger1-0/+199
The uretprobe syscall is implemented as a performance enhancement on x86_64 by having the kernel inject a call to it on function exit; User programs cannot call this system call explicitly. As such, this syscall is considered a kernel implementation detail and should not be filtered by seccomp. Enhance the seccomp bpf test suite to check that uretprobes can be attached to processes without the killing the process regardless of seccomp policy. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20250202162921.335813-3-eyal.birger@gmail.com [kees: Skip archs without __NR_uretprobe] Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06seccomp: passthrough uretprobe systemcall without filteringEyal Birger1-0/+12
When attaching uretprobes to processes running inside docker, the attached process is segfaulted when encountering the retprobe. The reason is that now that uretprobe is a system call the default seccomp filters in docker block it as they only allow a specific set of known syscalls. This is true for other userspace applications which use seccomp to control their syscall surface. Since uretprobe is a "kernel implementation detail" system call which is not used by userspace application code directly, it is impractical and there's very little point in forcing all userspace applications to explicitly allow it in order to avoid crashing tracked processes. Pass this systemcall through seccomp without depending on configuration. Note: uretprobe is currently only x86_64 and isn't expected to ever be supported in i386. Fixes: ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe") Reported-by: Rafael Buchbinder <rafi@rbk.io> Closes: https://lore.kernel.org/lkml/CAHsH6Gs3Eh8DFU0wq58c_LF8A4_+o6z456J7BidmcVY2AqOnHQ@mail.gmail.com/ Link: https://lore.kernel.org/lkml/20250121182939.33d05470@gandalf.local.home/T/#me2676c378eff2d6a33f3054fed4a5f3afa64e65b Link: https://lore.kernel.org/lkml/20250128145806.1849977-1-eyal.birger@gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20250202162921.335813-2-eyal.birger@gmail.com [kees: minimized changes for easier backporting, tweaked commit log] Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06stackinit: Fix comment for test_small_endGeert Uytterhoeven1-1/+1
In union test_small_end, the small members are three and four. Fixes: e71a29db79da1946 ("stackinit: Add union initialization to selftests") Closes: https://lore.kernel.org/CAMuHMdWvcKOc6v5o3-9-SqP_4oh5-GZQjZZb=-krhY=mVRED_Q@mail.gmail.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/3f8faa2d7d0d6b36571093ab0fb1fd5157abd7bb.1738593178.git.geert+renesas@glider.be Signed-off-by: Kees Cook <kees@kernel.org>