aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/exported-sql-viewer.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-01-27virtio_blk: Add support for transport error recoveryIsrael Rukshin1-3/+25
Add support for proper cleanup and re-initialization of virtio-blk devices during transport reset error recovery flow. This enhancement includes: - Pre-reset handler (reset_prepare) to perform device-specific cleanup - Post-reset handler (reset_done) to re-initialize the device These changes allow the device to recover from various reset scenarios, ensuring proper functionality after a reset event occurs. Without this implementation, the device cannot properly recover from resets, potentially leading to undefined behavior or device malfunction. This feature has been tested using PCI transport with Function Level Reset (FLR) as an example reset mechanism. The reset can be triggered manually via sysfs (echo 1 > /sys/bus/pci/devices/$PCI_ADDR/reset). Signed-off-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <1732690652-3065-3-git-send-email-israelr@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27virtio_pci: Add support for PCIe Function Level ResetIsrael Rukshin3-25/+118
Implement support for Function Level Reset (FLR) in virtio_pci devices. This change adds reset_prepare and reset_done callbacks, allowing drivers to properly handle FLR operations. Without this patch, performing and recovering from an FLR is not possible for virtio_pci devices. This implementation ensures proper FLR handling and recovery for both physical and virtual functions. The device reset can be triggered in case of error or manually via sysfs: echo 1 > /sys/bus/pci/devices/$PCI_ADDR/reset Signed-off-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <1732690652-3065-2-git-send-email-israelr@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27vhost/net: Set num_buffers for virtio 1.0Akihiko Odaki1-1/+4
The specification says the device MUST set num_buffers to 1 if VIRTIO_NET_F_MRG_RXBUF has not been negotiated. Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240915-v1-v1-1-f10d2cb5e759@daynix.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27vdpa/octeon_ep: read vendor-specific PCI capabilityShijith Thotton3-2/+58
Added support to read the vendor-specific PCI capability to identify the type of device being emulated. Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Shijith Thotton <sthotton@marvell.com> Message-Id: <20250103153226.1933479-4-sthotton@marvell.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27virtio-pci: define type and header for PCI vendor dataShijith Thotton1-0/+14
Added macro definition for VIRTIO_PCI_CAP_VENDOR_CFG to identify the PCI vendor data type in the virtio_pci_cap structure. Defined a new struct virtio_pci_vndr_data for the vendor data capability header as per the specification. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Shijith Thotton <sthotton@marvell.com> Message-Id: <20250103153226.1933479-3-sthotton@marvell.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27vdpa/octeon_ep: handle device config change eventsSatha Rao1-0/+8
The first interrupt of the device is used to notify the host about device configuration changes, such as link status updates. The ISR configuration area is updated to indicate a config change event when triggered. Signed-off-by: Satha Rao <skoteshwar@marvell.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Shijith Thotton <sthotton@marvell.com> Message-Id: <20250103153226.1933479-2-sthotton@marvell.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27vdpa/octeon_ep: enable support for multiple interrupts per deviceShijith Thotton3-39/+62
Updated the driver to utilize all the MSI-X interrupt vectors supported by each OCTEON endpoint VF, instead of relying on a single vector. Enabling more interrupts allows packets from multiple rings to be distributed across multiple cores, improving parallelism and performance. Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Shijith Thotton <sthotton@marvell.com> Message-Id: <20250103153226.1933479-1-sthotton@marvell.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27vdpa: solidrun: Replace deprecated PCI functionsPhilipp Stanner1-29/+28
The PCI functions pcim_iomap_regions() pcim_iounmap_regions() pcim_iomap_table() have been deprecated by the PCI subsystem. Replace these functions with their successors pcim_iomap_region() and pcim_iounmap_region(). Signed-off-by: Philipp Stanner <pstanner@redhat.com> Message-Id: <20241219094428.21511-2-phasta@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com>
2025-01-27s390/kdump: virtio-mem kdump support (CONFIG_PROC_VMCORE_DEVICE_RAM)David Hildenbrand2-8/+32
Let's add support for including virtio-mem device RAM in the crash dump, setting NEED_PROC_VMCORE_DEVICE_RAM, and implementing elfcorehdr_fill_device_ram_ptload_elf64(). To avoid code duplication, factor out the code to fill a PT_LOAD entry. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-13-david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27virtio-mem: support CONFIG_PROC_VMCORE_DEVICE_RAMDavid Hildenbrand2-0/+89
Let's implement the get_device_ram() vmcore callback, so architectures that select NEED_PROC_VMCORE_NEED_DEVICE_RAM, like s390 soon, can include that memory in a crash dump. Merge ranges, and process ranges that might contain a mixture of plugged and unplugged, to reduce the total number of ranges. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-12-david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27virtio-mem: remember usable region sizeDavid Hildenbrand1-3/+7
Let's remember the usable region size, which will be helpful in kdump mode next. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-11-david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27virtio-mem: mark device ready before registering callbacks in kdump modeDavid Hildenbrand1-2/+3
After the callbacks are registered we may immediately get a callback. So mark the device ready before registering the callbacks. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-10-david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: introduce PROC_VMCORE_DEVICE_RAM to detect device RAM ranges in 2nd kernelDavid Hildenbrand3-0/+183
s390 allocates+prepares the elfcore hdr in the dump (2nd) kernel, not in the crashed kernel. RAM provided by memory devices such as virtio-mem can only be detected using the device driver; when vmcore_init() is called, these device drivers are usually not loaded yet, or the devices did not get probed yet. Consequently, on s390 these RAM ranges will not be included in the crash dump, which makes the dump partially corrupt and is unfortunate. Instead of deferring the vmcore_init() call, to an (unclear?) later point, let's reuse the vmcore_cb infrastructure to obtain device RAM ranges as the device drivers probe the device and get access to this information. Then, we'll add these ranges to the vmcore, adding more PT_LOAD entries and updating the offsets+vmcore size. Use a separate Kconfig option to be set by an architecture to include this code only if the arch really needs it. Further, we'll make the config depend on the relevant drivers (i.e., virtio_mem) once they implement support (next). The alternative of having a PROVIDE_PROC_VMCORE_DEVICE_RAM config option was dropped for now for simplicity. The current target use case is s390, which only creates an elf64 elfcore, so focusing on elf64 is sufficient. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-9-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: factor out freeing a list of vmcore rangesDavid Hildenbrand2-8/+12
Let's factor it out into include/linux/crash_dump.h, from where we can use it also outside of vmcore.c later. Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-8-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: factor out allocating a vmcore range and adding it to a listDavid Hildenbrand2-19/+16
Let's factor it out into include/linux/crash_dump.h, from where we can use it also outside of vmcore.c later. Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-7-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: move vmcore definitions out of kcore.hDavid Hildenbrand3-23/+23
These vmcore defines are not related to /proc/kcore, move them out. We'll move "struct vmcoredd_node" to vmcore.c, because it is only used internally. While "struct vmcore" is only used internally for now, we're planning on using it from inline functions in crash_dump.h next, so move it to crash_dump.h. While at it, rename "struct vmcore" to "struct vmcore_range", which is a more suitable name and will make the usage of it outside of vmcore.c clearer. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-6-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: prefix all pr_* with "vmcore:"David Hildenbrand1-1/+3
Let's use "vmcore: " as a prefix, converting the single "Kdump: vmcore not initialized" one to effectively be "vmcore: not initialized". Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-5-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: disallow vmcore modifications while the vmcore is openDavid Hildenbrand1-23/+34
The vmcoredd_update_size() call and its effects (size/offset changes) are currently completely unsynchronized, and will cause trouble when performed concurrently, or when done while someone is already reading the vmcore. Let's protect all vmcore modifications by the vmcore_mutex, disallow vmcore modifications while the vmcore is open, and warn on vmcore modifications after the vmcore was already opened once: modifications while the vmcore is open are unsafe, and modifications after the vmcore was opened indicates trouble. Properly synchronize against concurrent opening of the vmcore. No need to grab the mutex during mmap()/read(): after we opened the vmcore, modifications are impossible. It's worth noting that modifications after the vmcore was opened are completely unexpected, so failing if open, and warning if already opened (+closed again) is good enough. This change not only handles concurrent adding of device dumps + concurrent reading of the vmcore properly, it also prepares for other mechanisms that will modify the vmcore. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-4-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: replace vmcoredd_mutex by vmcore_mutexDavid Hildenbrand1-9/+8
Now that we have a mutex that synchronizes against opening of the vmcore, let's use that one to replace vmcoredd_mutex: there is no need to have two separate ones. This is a preparation for properly preventing vmcore modifications after the vmcore was opened. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-3-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-27fs/proc/vmcore: convert vmcore_cb_lock into vmcore_mutexDavid Hildenbrand1-7/+8
We want to protect vmcore modifications from concurrent opening of the vmcore, and also serialize vmcore modification. (a) We can currently modify the vmcore after it was opened. This can happen if a vmcoredd is added after the vmcore module was initialized and already opened by user space. We want to fix that and prepare for new code wanting to serialize against concurrent opening. (b) To handle it cleanly we need to protect the modifications against concurrent opening. As the modifications end up allocating memory and can sleep, we cannot rely on the spinlock. Let's convert the spinlock into a mutex to prepare for further changes. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20241204125444.1734652-2-david@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-26LoongArch: Extend the maximum number of watchpointsTiezhu Yang2-3/+13
The maximum number of load/store watchpoints and fetch instruction watchpoints is 14 each according to LoongArch Reference Manual, so extend the maximum number of watchpoints from 8 to 14 for ptrace. By the way, just simply change 8 to 14 for the definition in struct user_watch_state at the beginning, but it may corrupt uapi, then add a new struct user_watch_state_v2 directly. As far as I can tell, the only users for this struct in the userspace are GDB and LLDB, there are no any problems of software compatibility between the application and kernel according to the analysis. The compatibility problem has been considered while developing and testing. When the applications in the userspace get watchpoint state, the length will be specified which is no bigger than the sizeof struct user_watch_state or user_watch_state_v2, the actual length is assigned as the minimal value of the application and kernel in the generic code of ptrace: kernel/ptrace.c: ptrace_regset(): kiov->iov_len = min(kiov->iov_len, (__kernel_size_t) (regset->n * regset->size)); if (req == PTRACE_GETREGSET) return copy_regset_to_user(task, view, regset_no, 0, kiov->iov_len, kiov->iov_base); else return copy_regset_from_user(task, view, regset_no, 0, kiov->iov_len, kiov->iov_base); For example, there are four kind of combinations, all of them work well. (1) "older kernel + older gdb", the actual length is 8+(8+8+4+4)*8=200; (2) "newer kernel + newer gdb", the actual length is 8+(8+8+4+4)*14=344; (3) "older kernel + newer gdb", the actual length is 8+(8+8+4+4)*8=200; (4) "newer kernel + older gdb", the actual length is 8+(8+8+4+4)*8=200. Link: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#control-and-status-registers-related-to-watchpoints Cc: stable@vger.kernel.org Fixes: 1a69f7a161a7 ("LoongArch: ptrace: Expose hardware breakpoints to debuggers") Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-26LoongArch: Change 8 to 14 for LOONGARCH_MAX_{BRP,WRP}Tiezhu Yang3-4/+76
The maximum number of load/store watchpoints and fetch instruction watchpoints is 14 each according to LoongArch Reference Manual, so change 8 to 14 for the related code. Link: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#control-and-status-registers-related-to-watchpoints Cc: stable@vger.kernel.org Fixes: edffa33c7bb5 ("LoongArch: Add hardware breakpoints/watchpoints support") Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-26LoongArch: Add debugfs entries to switch SFB/TSO stateHuacai Chen4-7/+186
We need to switch SFB (Store Fill Buffer) and TSO (Total Store Order) state at runtime to debug memory management and KVM virtualization, so add two debugfs entries "sfb_state" and "tso_state" under the directory /sys/kernel/debug/loongarch. Query SFB: cat /sys/kernel/debug/loongarch/sfb_state Enable SFB: echo 1 > /sys/kernel/debug/loongarch/sfb_state Disable SFB: echo 0 > /sys/kernel/debug/loongarch/sfb_state Query TSO: cat /sys/kernel/debug/loongarch/tso_state Switch TSO: echo [TSO] > /sys/kernel/debug/loongarch/tso_state Available [TSO] states: 0 (No Load No Store) 1 (All Load No Store) 3 (Same Load No Store) 4 (No Load All Store) 5 (All Load All Store) 7 (Same Load All Store) Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-26LoongArch: Fix warnings during S3 suspendHuacai Chen3-3/+2
The enable_gpe_wakeup() function calls acpi_enable_all_wakeup_gpes(), and the later one may call the preempt_schedule_common() function, resulting in a thread switch and causing the CPU to be in an interrupt enabled state after the enable_gpe_wakeup() function returns, leading to the warnings as follow. [ C0] WARNING: ... at kernel/time/timekeeping.c:845 ktime_get+0xbc/0xc8 [ C0] ... [ C0] Call Trace: [ C0] [<90000000002243b4>] show_stack+0x64/0x188 [ C0] [<900000000164673c>] dump_stack_lvl+0x60/0x88 [ C0] [<90000000002687e4>] __warn+0x8c/0x148 [ C0] [<90000000015e9978>] report_bug+0x1c0/0x2b0 [ C0] [<90000000016478e4>] do_bp+0x204/0x3b8 [ C0] [<90000000025b1924>] exception_handlers+0x1924/0x10000 [ C0] [<9000000000343bbc>] ktime_get+0xbc/0xc8 [ C0] [<9000000000354c08>] tick_sched_timer+0x30/0xb0 [ C0] [<90000000003408e0>] __hrtimer_run_queues+0x160/0x378 [ C0] [<9000000000341f14>] hrtimer_interrupt+0x144/0x388 [ C0] [<9000000000228348>] constant_timer_interrupt+0x38/0x48 [ C0] [<90000000002feba4>] __handle_irq_event_percpu+0x64/0x1e8 [ C0] [<90000000002fed48>] handle_irq_event_percpu+0x20/0x80 [ C0] [<9000000000306b9c>] handle_percpu_irq+0x5c/0x98 [ C0] [<90000000002fd4a0>] generic_handle_domain_irq+0x30/0x48 [ C0] [<9000000000d0c7b0>] handle_cpu_irq+0x70/0xa8 [ C0] [<9000000001646b30>] handle_loongarch_irq+0x30/0x48 [ C0] [<9000000001646bc8>] do_vint+0x80/0xe0 [ C0] [<90000000002aea1c>] finish_task_switch.isra.0+0x8c/0x2a8 [ C0] [<900000000164e34c>] __schedule+0x314/0xa48 [ C0] [<900000000164ead8>] schedule+0x58/0xf0 [ C0] [<9000000000294a2c>] worker_thread+0x224/0x498 [ C0] [<900000000029d2f0>] kthread+0xf8/0x108 [ C0] [<9000000000221f28>] ret_from_kernel_thread+0xc/0xa4 [ C0] [ C0] ---[ end trace 0000000000000000 ]--- The root cause is acpi_enable_all_wakeup_gpes() uses a mutex to protect acpi_hw_enable_all_wakeup_gpes(), and acpi_ut_acquire_mutex() may cause a thread switch. Since there is no longer concurrent execution during loongarch_acpi_suspend(), we can call acpi_hw_enable_all_wakeup_gpes() directly in enable_gpe_wakeup(). The solution is similar to commit 22db06337f590d01 ("ACPI: sleep: Avoid breaking S3 wakeup due to might_sleep()"). Fixes: 366bb35a8e48 ("LoongArch: Add suspend (ACPI S3) support") Signed-off-by: Qunqin Zhao <zhaoqunqin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-26module: sign with sha512 instead of sha1 by defaultThorsten Leemhuis1-0/+1
Switch away from using sha1 for module signing by default and use the more modern sha512 instead, which is what among others Arch, Fedora, RHEL, and Ubuntu are currently using for their kernels. Sha1 has not been considered secure against well-funded opponents since 2005[1]; since 2011 the NIST and other organizations furthermore recommended its replacement[2]. This is why OpenSSL on RHEL9, Fedora Linux 41+[3], and likely some other current and future distributions reject the creation of sha1 signatures, which leads to a build error of allmodconfig configurations: 80A20474797F0000:error:03000098:digital envelope routines:do_sigver_init:invalid digest:crypto/evp/m_sigver.c:342: make[4]: *** [.../certs/Makefile:53: certs/signing_key.pem] Error 1 make[4]: *** Deleting file 'certs/signing_key.pem' make[4]: *** Waiting for unfinished jobs.... make[3]: *** [.../scripts/Makefile.build:478: certs] Error 2 make[2]: *** [.../Makefile:1936: .] Error 2 make[1]: *** [.../Makefile:224: __sub-make] Error 2 make[1]: Leaving directory '...' make: *** [Makefile:224: __sub-make] Error 2 This change makes allmodconfig work again and sets a default that is more appropriate for current and future users, too. Link: https://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html [1] Link: https://csrc.nist.gov/projects/hash-functions [2] Link: https://fedoraproject.org/wiki/Changes/OpenSSLDistrustsha1SigVer [3] Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: kdevops <kdevops@lists.linux.dev> [0] Link: https://github.com/linux-kdevops/linux-modules-kpd/actions/runs/11420092929/job/31775404330 [0] Link: https://lore.kernel.org/r/52ee32c0c92afc4d3263cea1f8a1cdc809728aff.1729088288.git.linux@leemhuis.info Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: Don't fail module loading when setting ro_after_init section RO failedChristophe Leroy1-3/+4
Once module init has succeded it is too late to cancel loading. If setting ro_after_init data section to read-only fails, all we can do is to inform the user through a warning. Reported-by: Thomas Gleixner <tglx@linutronix.de> Closes: https://lore.kernel.org/all/20230915082126.4187913-1-ruanjinjie@huawei.com/ Fixes: d1909c022173 ("module: Don't ignore errors from set_memory_XX()") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/d6c81f38da76092de8aacc8c93c4c65cb0fe48b8.1733427536.git.christophe.leroy@csgroup.eu Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: Split module_enable_rodata_ro()Christophe Leroy3-7/+13
module_enable_rodata_ro() is called twice, once before module init to set rodata sections readonly and once after module init to set rodata_after_init section readonly. The second time, only the rodata_after_init section needs to be set to read-only, no need to re-apply it to already set rodata. Split module_enable_rodata_ro() in two. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/e3b6ff0df7eac281c58bb02cecaeb377215daff3.1733427536.git.christophe.leroy@csgroup.eu Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: sysfs: Use const 'struct bin_attribute'Thomas Weißschuh1-10/+10
The sysfs core is switching to 'const struct bin_attribute's. Prepare for that. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241227-sysfs-const-bin_attr-module-v2-6-e267275f0f37@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: sysfs: Add notes attributes through attribute_groupThomas Weißschuh1-26/+28
A kobject is meant to manage the lifecycle of some resource. However the module sysfs code only creates a kobject to get a "notes" subdirectory in sysfs. This can be achieved easier and cheaper by using a sysfs group. Switch the notes attribute code to such a group, similar to how the section allocation in the same file already works. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241227-sysfs-const-bin_attr-module-v2-5-e267275f0f37@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: sysfs: Simplify section attribute allocationThomas Weißschuh1-8/+10
The existing allocation logic manually stuffs two allocations into one. This is hard to understand and of limited value, given that all the section names are allocated on their own anyways. Une one allocation per datastructure. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241227-sysfs-const-bin_attr-module-v2-4-e267275f0f37@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: sysfs: Drop 'struct module_sect_attr'Thomas Weißschuh1-15/+11
This is now an otherwise empty wrapper around a 'struct bin_attribute', not providing any functionality. Remove it. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241227-sysfs-const-bin_attr-module-v2-3-e267275f0f37@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: sysfs: Drop member 'module_sect_attr::address'Thomas Weißschuh1-5/+2
'struct bin_attribute' already contains the member 'private' to pass custom data to the attribute handlers. Use that instead of the custom 'address' member. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241227-sysfs-const-bin_attr-module-v2-2-e267275f0f37@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: sysfs: Drop member 'module_sect_attrs::nsections'Thomas Weißschuh1-6/+3
The member is only used to iterate over all attributes in free_sect_attrs(). However the attribute group can already be used for that. Use the group and drop 'nsections'. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241227-sysfs-const-bin_attr-module-v2-1-e267275f0f37@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: Constify 'struct module_attribute'Thomas Weißschuh5-34/+34
These structs are never modified, move them to read-only memory. This makes the API clearer and also prepares for the constification of 'struct attribute' itself. While at it, also constify 'modinfo_attrs_count'. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Link: https://lore.kernel.org/r/20241216-sysfs-const-attr-module-v1-3-3790b53e0abf@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: Handle 'struct module_version_attribute' as constThomas Weißschuh2-3/+3
The structure is always read-only due to its placement in the read-only section __modver. Reflect this at its usage sites. Also prepare for the const handling of 'struct module_attribute' itself. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Link: https://lore.kernel.org/r/20241216-sysfs-const-attr-module-v1-2-3790b53e0abf@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26params: Prepare for 'const struct module_attribute *'Thomas Weißschuh1-3/+3
The 'struct module_attribute' sysfs callbacks are about to change to receive a 'const struct module_attribute *' parameter. Prepare for that by avoid casting away the constness through container_of() and using const pointers to 'struct param_attribute'. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Link: https://lore.kernel.org/r/20241216-sysfs-const-attr-module-v1-1-3790b53e0abf@weissschuh.net Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: Put known GPL offenders in an arrayUwe Kleine-König1-9/+14
Instead of repeating the add_taint_module() call for each offender, create an array and loop over that one. This simplifies adding new entries considerably. Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://lore.kernel.org/r/20241115185253.1299264-2-wse@tuxedocomputers.com [ppavlu: make the array const] Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-26module: Extend the preempt disabled section in dereference_symbol_descriptor().Sebastian Andrzej Siewior1-1/+1
dereference_symbol_descriptor() needs to obtain the module pointer belonging to pointer in order to resolve that pointer. The returned mod pointer is obtained under RCU-sched/ preempt_disable() guarantees and needs to be used within this section to ensure that the module is not removed in the meantime. Extend the preempt_disable() section to also cover dereference_module_function_descriptor(). Fixes: 04b8eb7a4ccd9 ("symbol lookup: introduce dereference_symbol_descriptor()") Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Helge Deller <deller@gmx.de> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N Rao <naveen@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250108090457.512198-2-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-01-25mm/compaction: fix UBSAN shift-out-of-bounds warningLiu Shixin1-1/+2
syzkaller reported a UBSAN shift-out-of-bounds warning of (1UL << order) in isolate_freepages_block(). The bogus compound_order can be any value because it is union with flags. Add back the MAX_PAGE_ORDER check to fix the warning. Link: https://lkml.kernel.org/r/20250123021029.2826736-1-liushixin2@huawei.com Fixes: 3da0272a4c7d ("mm/compaction: correctly return failure with bogus compound_order in strict mode") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25s390/mm: add missing ctor/dtor on page table upgradeAlexander Gordeev1-0/+3
Commit 78966b550289 ("s390: pgtable: add statistics for PUD and P4D level page table") misses the call to pagetable_p4d_ctor() against a newly allocated P4D table in crst_table_upgrade(); Commit 68c601de75d8 ("mm: introduce ctor/dtor at PGD level") misses the call to pagetable_pgd_ctor() against a newly allocated PGD and the call to pagetable_dtor() against a newly allocated P4D that is about to be freed on crst_table_upgrade() PGD upgrade fail path. The missed constructors and destructor break (at least) the page table accounting when a process memory space is upgraded. Link: https://lkml.kernel.org/r/20250123160349.200154-1-agordeev@linux.ibm.com Fixes: 78966b550289 ("s390: pgtable: add statistics for PUD and P4D level page table") Fixes: 68c601de75d8 ("mm: introduce ctor/dtor at PGD level") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reported-by: Heiko Carstens <hca@linux.ibm.com> Closes: https://lore.kernel.org/all/20250122074954.8685-A-hca@linux.ibm.com/ Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()Thorsten Blum1-1/+2
Remove hard-coded strings by using the str_on_off() helper function. Link: https://lkml.kernel.org/r/20250116062403.2496-2-thorsten.blum@linux.dev Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25tools: add VM_WARN_ON_VMG definitionSuren Baghdasaryan1-0/+1
vma tests compilation yields the following error: vma.c:732:9: error: implicit declaration of function ‘VM_WARN_ON_VMG’ Fix it by adding missing VM_WARN_ON_VMG() definition. Link: https://lkml.kernel.org/r/20250116181538.759469-1-surenb@google.com Fixes: e3a7ae85f87c ("mm/debug: prefer VM_WARN_ON_VMG() to report VMG debug warnings") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()Thorsten Blum1-3/+3
Remove hard-coded strings by using the str_high_low() helper function. Link: https://lkml.kernel.org/r/20250116204216.106999-2-thorsten.blum@linux.dev Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25seqlock: add missing parameter documentation for raw_seqcount_try_begin()Suren Baghdasaryan1-0/+1
Add missing documentation for raw_seqcount_try_begin() start parameter. Link: https://lkml.kernel.org/r/20250116182730.801497-1-surenb@google.com Fixes: dba4761a3e40 ("seqlock: add raw_seqcount_try_begin") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/all/20250116170522.23e884d5@canb.auug.org.au/ Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_threshJim Zhao1-37/+16
Address the feedback from 39ac99852fca ("mm/page-writeback: raise wb_thresh to prevent write blocking with strictlimit)". The wb_thresh bumping logic is scattered across wb_position_ratio, __wb_calc_thresh, and wb_update_dirty_ratelimit. For consistency, consolidate all wb_thresh bumping logic into __wb_calc_thresh. Link: https://lkml.kernel.org/r/20241121100539.605818-1-jimzhao.ai@gmail.com Signed-off-by: Jim Zhao <jimzhao.ai@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25mm/page_alloc: remove the incorrect and misleading commentYuntao Wang1-7/+0
The comment removed in this patch originally belonged to the build_zonelists_in_zone_order() function, which was introduced by commit f0c0b2b808f2 ("change zonelist order: zonelist order selection logic"). Later, commit c9bff3eebc09 ("mm, page_alloc: rip out ZONELIST_ORDER_ZONE") removed build_zonelists_in_zone_order() but left its comment behind. Subsequently, commit 9d3be21bf9c0 ("mm, page_alloc: simplify zonelist initialization") moved the node_order variable into build_zonelists(), making the comment originally belonged to build_zonelists_in_zone_order() appear as if it were part of build_zonelists(). Remove this misleading comment. Link: https://lkml.kernel.org/r/20250115041634.63387-1-yuntao.wang@linux.dev Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25zram: remove zcomp_stream_put() from write_incompressible_page()Sergey Senozhatsky1-1/+0
We cannot and should not put per-CPU compression stream in write_incompressible_page() because that function never gets any per-CPU streams in the first place. It's zram_write_page() that puts the stream before it calls write_incompressible_page(). Link: https://lkml.kernel.org/r/20250115072003.380567-1-senozhatsky@chromium.org Fixes: 485d11509d6d ("zram: factor out ZRAM_HUGE write") Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25mm: separate move/undo parts from migrate_pages_batch()Byungchul Park1-51/+83
Functionally, no change. This is a preparation for luf mechanism that requires to use separated folio lists for its own handling during migration. Refactored migrate_pages_batch() so as to separate move/undo parts from migrate_pages_batch(). Link: https://lkml.kernel.org/r/20250115103403.11882-1-byungchul@sk.com Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Shivank Garg <shivankg@amd.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25mm/kfence: use str_write_read() helper in get_access_type()Thorsten Blum2-2/+4
Remove hard-coded strings by using the str_write_read() helper function. Link: https://lkml.kernel.org/r/20250115155511.954535-2-thorsten.blum@linux.dev Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()liuye1-0/+1
Release memory before exception branch returns to prevent memory leaks Checking tools/testing/selftests/mm/mkdirty.c ... tools/testing/selftests/mm/mkdirty.c:283:3: error: Memory leak: src [memleak] return; ^ Link: https://lkml.kernel.org/r/20250114023838.48589-1-liuye@kylinos.cn Signed-off-by: liuye <liuye@kylinos.cn> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>