aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2024-12-03module: Convert default symbol namespace to string literalMasahiro Yamada18-21/+21
Commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal") only converted MODULE_IMPORT_NS() and EXPORT_SYMBOL_NS(), leaving DEFAULT_SYMBOL_NAMESPACE as a macro expansion. This commit converts DEFAULT_SYMBOL_NAMESPACE in the same way to avoid annoyance for the default namespace as well. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-03doc: module: revert misconversions for MODULE_IMPORT_NS()Masahiro Yamada3-6/+6
This reverts the misconversions introduced by commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal"). The affected descriptions refer to MODULE_IMPORT_NS() tags in general, rather than suggesting the use of the empty string ("") as the namespace. Fixes: cdd30ebb1b9f ("module: Convert symbol namespace to string literal") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-03scripts/nsdeps: get 'make nsdeps' working againMasahiro Yamada2-3/+3
Since commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal"), when MODULE_IMPORT_NS() is missing, 'make nsdeps' inserts pointless code: MODULE_IMPORT_NS("ns"); Here, "ns" is not a namespace, but the variable in the semantic patch. It must not be quoted. Instead, a string literal must be passed to Coccinelle. Fixes: cdd30ebb1b9f ("module: Convert symbol namespace to string literal") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-03LoongArch: KVM: Protect kvm_io_bus_{read,write}() with SRCUHuacai Chen2-11/+26
When we enable lockdep we get such a warning: ============================= WARNING: suspicious RCU usage 6.12.0-rc7+ #1891 Tainted: G W ----------------------------- arch/loongarch/kvm/../../../virt/kvm/kvm_main.c:5945 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-system-loo/948: #0: 90000001184a00a8 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0xf4/0xe20 [kvm] stack backtrace: CPU: 2 UID: 0 PID: 948 Comm: qemu-system-loo Tainted: G W 6.12.0-rc7+ #1891 Tainted: [W]=WARN Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 900000012c578000 900000012c57b940 0000000000000000 900000012c57b948 9000000007e53788 900000000815bcc8 900000000815bcc0 900000012c57b7b0 0000000000000001 0000000000000001 4b031894b9d6b725 0000000005dec000 9000000100427b00 00000000000003d2 0000000000000001 000000000000002d 0000000000000003 0000000000000030 00000000000003b4 0000000005dec000 0000000000000000 900000000806d000 9000000007e53788 00000000000000b4 0000000000000004 0000000000000004 0000000000000000 0000000000000000 9000000107baf600 9000000008916000 9000000007e53788 9000000005924778 000000001fe001e5 00000000000000b0 0000000000000007 0000000000000000 0000000000071c1d ... Call Trace: [<9000000005924778>] show_stack+0x38/0x180 [<90000000071519c4>] dump_stack_lvl+0x94/0xe4 [<90000000059eb754>] lockdep_rcu_suspicious+0x194/0x240 [<ffff80000221f47c>] kvm_io_bus_read+0x19c/0x1e0 [kvm] [<ffff800002225118>] kvm_emu_mmio_read+0xd8/0x440 [kvm] [<ffff8000022254bc>] kvm_handle_read_fault+0x3c/0xe0 [kvm] [<ffff80000222b3c8>] kvm_handle_exit+0x228/0x480 [kvm] Fix it by protecting kvm_io_bus_{read,write}() with SRCU. Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-02module: Convert symbol namespace to string literalPeter Zijlstra876-2196/+2189
Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ && $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]*\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02platform/x86: asus-nb-wmi: Ignore unknown event 0xCFArmin Wolf1-0/+1
On the Asus X541UAK an unknown event 0xCF is emited when the charger is plugged in. This is caused by the following AML code: If (ACPS ()) { ACPF = One Local0 = 0x58 If (ATKP) { ^^^^ATKD.IANE (0xCF) } } Else { ACPF = Zero Local0 = 0x57 } Notify (AC0, 0x80) // Status Change If (ATKP) { ^^^^ATKD.IANE (Local0) } Sleep (0x64) PNOT () Sleep (0x0A) NBAT (0x80) Ignore the 0xCF event to silence the unknown event warning. Reported-by: Pau Espin Pedrol <pespin@espeweb.net> Closes: https://lore.kernel.org/platform-driver-x86/54d4860b-ec9c-4992-acf6-db3f90388293@espeweb.net Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241123224700.18530-1-W_Armin@gmx.de Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-02platform/x86: asus-wmi: Ignore return value when writing thermal policyArmin Wolf1-9/+2
On some machines like the ASUS Vivobook S14 writing the thermal policy returns the currently writen thermal policy instead of an error code. Ignore the return code to avoid falsely returning an error when the thermal policy was written successfully. Reported-by: auslands-kv@gmx.de Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219517 Fixes: 2daa86e78c49 ("platform/x86: asus_wmi: Support throttle thermal policy") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241124171941.29789-1-W_Armin@gmx.de Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-02platform/x86: samsung-laptop: Match MODULE_DESCRIPTION() to functionalitySedat Dilek1-1/+1
Change module description from "Samsung Backlight driver" to "Samsung Laptop driver" to better match driver's functionality. Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20241123133041.16042-1-sedat.dilek@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-02LoongArch: KVM: Protect kvm_check_requests() with SRCUHuacai Chen1-1/+3
When we enable lockdep we get such a warning: ============================= WARNING: suspicious RCU usage 6.12.0-rc7+ #1891 Tainted: G W ----------------------------- include/linux/kvm_host.h:1043 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-system-loo/948: #0: 90000001184a00a8 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0xf4/0xe20 [kvm] stack backtrace: CPU: 0 UID: 0 PID: 948 Comm: qemu-system-loo Tainted: G W 6.12.0-rc7+ #1891 Tainted: [W]=WARN Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 900000012c578000 900000012c57b920 0000000000000000 900000012c57b928 9000000007e53788 900000000815bcc8 900000000815bcc0 900000012c57b790 0000000000000001 0000000000000001 4b031894b9d6b725 0000000004dec000 90000001003299c0 0000000000000414 0000000000000001 000000000000002d 0000000000000003 0000000000000030 00000000000003b4 0000000004dec000 90000001184a0000 900000000806d000 9000000007e53788 00000000000000b4 0000000000000004 0000000000000004 0000000000000000 0000000000000000 9000000107baf600 9000000008916000 9000000007e53788 9000000005924778 0000000010000044 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d ... Call Trace: [<9000000005924778>] show_stack+0x38/0x180 [<90000000071519c4>] dump_stack_lvl+0x94/0xe4 [<90000000059eb754>] lockdep_rcu_suspicious+0x194/0x240 [<ffff8000022143bc>] kvm_gfn_to_hva_cache_init+0xfc/0x120 [kvm] [<ffff80000222ade4>] kvm_pre_enter_guest+0x3a4/0x520 [kvm] [<ffff80000222b3dc>] kvm_handle_exit+0x23c/0x480 [kvm] Fix it by protecting kvm_check_requests() with SRCU. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-02LoongArch: BPF: Adjust the parameter of emit_jirl()Tiezhu Yang3-5/+15
The branch instructions beq, bne, blt, bge, bltu, bgeu and jirl belong to the format reg2i16, but the sequence of oprand is different for the instruction jirl. So adjust the parameter order of emit_jirl() to make it more readable correspond with the Instruction Set Architecture manual. Here are the instruction formats: beq rj, rd, offs16 bne rj, rd, offs16 blt rj, rd, offs16 bge rj, rd, offs16 bltu rj, rd, offs16 bgeu rj, rd, offs16 jirl rd, rj, offs16 Link: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#branch-instructions Suggested-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-02LoongArch: Add architecture specific huge_pte_clear()Bibo Mao1-0/+10
When executing mm selftests run_vmtests.sh, there is such an error: BUG: Bad page state in process uffd-unit-tests pfn:00000 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x0 flags: 0xffff0000002000(reserved|node=0|zone=0|lastcpupid=0xffff) raw: 00ffff0000002000 ffffbf0000000008 ffffbf0000000008 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set Modules linked in: snd_seq_dummy snd_seq snd_seq_device rfkill vfat fat virtio_balloon efi_pstore virtio_net pstore net_failover failover fuse nfnetlink virtio_scsi virtio_gpu virtio_dma_buf dm_multipath efivarfs CPU: 2 UID: 0 PID: 1913 Comm: uffd-unit-tests Not tainted 6.12.0 #184 Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 Stack : 900000047c8ac000 0000000000000000 9000000000223a7c 900000047c8ac000 900000047c8af690 900000047c8af698 0000000000000000 900000047c8af7d8 900000047c8af7d0 900000047c8af7d0 900000047c8af5b0 0000000000000001 0000000000000001 900000047c8af698 10b3c7d53da40d26 0000010000000000 0000000000000022 0000000fffffffff fffffffffe000000 ffff800000000000 000000000000002f 0000800000000000 000000017a6d4000 90000000028f8940 0000000000000000 0000000000000000 90000000025aa5e0 9000000002905000 0000000000000000 90000000028f8940 ffff800000000000 0000000000000000 0000000000000000 0000000000000000 9000000000223a94 000000012001839c 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d ... Call Trace: [<9000000000223a94>] show_stack+0x5c/0x180 [<9000000001c3fd64>] dump_stack_lvl+0x6c/0xa0 [<900000000056aa08>] bad_page+0x1a0/0x1f0 [<9000000000574978>] free_unref_folios+0xbf0/0xd20 [<90000000004e65cc>] folios_put_refs+0x1a4/0x2b8 [<9000000000599a0c>] free_pages_and_swap_cache+0x164/0x260 [<9000000000547698>] tlb_batch_pages_flush+0xa8/0x1c0 [<9000000000547f30>] tlb_finish_mmu+0xa8/0x218 [<9000000000543cb8>] exit_mmap+0x1a0/0x360 [<9000000000247658>] __mmput+0x78/0x200 [<900000000025583c>] do_exit+0x43c/0xde8 [<9000000000256490>] do_group_exit+0x68/0x110 [<9000000000256554>] sys_exit_group+0x1c/0x20 [<9000000001c413b4>] do_syscall+0x94/0x130 [<90000000002216d8>] handle_syscall+0xb8/0x158 Disabling lock debugging due to kernel taint BUG: non-zero pgtables_bytes on freeing mm: -16384 On LoongArch system, invalid huge pte entry should be invalid_pte_table or a single _PAGE_HUGE bit rather than a zero value. And it should be the same with invalid pmd entry, since pmd_none() is called by function free_pgd_range() and pmd_none() return 0 by huge_pte_clear(). So single _PAGE_HUGE bit is also treated as a valid pte table and free_pte_range() will be called in free_pmd_range(). free_pmd_range() pmd = pmd_offset(pud, addr); do { next = pmd_addr_end(addr, end); if (pmd_none_or_clear_bad(pmd)) continue; free_pte_range(tlb, pmd, addr); } while (pmd++, addr = next, addr != end); Here invalid_pte_table is used for both invalid huge pte entry and pmd entry. Cc: stable@vger.kernel.org Fixes: 09cfefb7fa70 ("LoongArch: Add memory management") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-02LoongArch/irq: Use seq_put_decimal_ull_width() for decimal valuesDavid Wang1-1/+1
Performance improvement for reading /proc/interrupts on LoongArch. On a system with n CPUs and m interrupts, there will be n*m decimal values yielded via seq_printf(.."%10u "..) which is less efficient than seq_put_decimal_ull_width(), stress reading /proc/interrupts indicates ~30% performance improvement with this patch (and its friends). Signed-off-by: David Wang <00107082@163.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-02LoongArch: Fix reserving screen info memory for above-4G firmwareHuacai Chen1-1/+1
Since screen_info.lfb_base is a __u32 type, an above-4G address need an ext_lfb_base to present its higher 32bits. In init_screen_info() we can use __screen_info_lfb_base() to handle this case for reserving screen info memory. Signed-off-by: Xuefeng Zhao <zhaoxuefeng@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds474-494/+484
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-01Linux 6.13-rc1Linus Torvalds1-2/+2
2024-12-01strscpy: write destination buffer only onceLinus Torvalds1-6/+17
The point behind strscpy() was to once and for all avoid all the problems with 'strncpy()' and later broken "fixed" versions like strlcpy() that just made things worse. So strscpy not only guarantees NUL-termination (unlike strncpy), it also doesn't do unnecessary padding at the destination. But at the same time also avoids byte-at-a-time reads and writes by _allowing_ some extra NUL writes - within the size, of course - so that the whole copy can be done with word operations. It is also stable in the face of a mutable source string: it explicitly does not read the source buffer multiple times (so an implementation using "strnlen()+memcpy()" would be wrong), and does not read the source buffer past the size (like the mis-design that is strlcpy does). Finally, the return value is designed to be simple and unambiguous: if the string cannot be copied fully, it returns an actual negative error, making error handling clearer and simpler (and the caller already knows the size of the buffer). Otherwise it returns the string length of the result. However, there was one final stability issue that can be important to callers: the stability of the destination buffer. In particular, the same way we shouldn't read the source buffer more than once, we should avoid doing multiple writes to the destination buffer: first writing a potentially non-terminated string, and then terminating it with NUL at the end does not result in a stable result buffer. Yes, it gives the right result in the end, but if the rule for the destination buffer was that it is _always_ NUL-terminated even when accessed concurrently with updates, the final byte of the buffer needs to always _stay_ as a NUL byte. [ Note that "final byte is NUL" here is literally about the final byte in the destination array, not the terminating NUL at the end of the string itself. There is no attempt to try to make concurrent reads and writes give any kind of consistent string length or contents, but we do want to guarantee that there is always at least that final terminating NUL character at the end of the destination array if it existed before ] This is relevant in the kernel for the tsk->comm[] array, for example. Even without locking (for either readers or writers), we want to know that while the buffer contents may be garbled, it is always a valid C string and always has a NUL character at 'comm[TASK_COMM_LEN-1]' (and never has any "out of thin air" data). So avoid any "copy possibly non-terminated string, and terminate later" behavior, and write the destination buffer only once. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-30printf: Remove unused 'bprintf'Dr. David Alan Gilbert2-24/+0
bprintf() is unused. Remove it. It was added in the commit 4370aa4aa753 ("vsprintf: add binary printf") but as far as I can see was never used, unlike the other two functions in that patch. Link: https://lore.kernel.org/20241002173147.210107-1-linux@treblig.org Reviewed-by: Andy Shevchenko <andy@kernel.org> Acked-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-30tools/power turbostat: 2024.11.30Len Brown2-2/+2
since 2024.07.26: assorted minor bug fixes assorted platform specific tweaks initial RAPL PSYS (SysWatt) support Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add RAPL psys as a built-in counterPatryk Wlazlyn2-10/+85
Introduce the counter as a part of global, platform counters structure. We open the counter for only one cpu, but otherwise treat it as an ordinary RAPL counter, allowing for grouped perf read. The counter is disabled by default, because it's interpretation may require additional, platform specific information, making it unsuitable for general use. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Fix child's argument forwardingPatryk Wlazlyn1-1/+1
Add '+' to optstring when early scanning for --no-msr and --no-perf. It causes option processing to stop as soon as a nonoption argument is encountered, effectively skipping child's arguments. Fixes: 3e4048466c39 ("tools/power turbostat: Add --no-msr option") Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Force --no-perf in --dump modePatryk Wlazlyn1-0/+6
Force the --no-perf early to prevent using it as a source. User asks for raw values, but perf returns them relative to the opening of the file descriptor. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add support for /sys/class/drm/card1Zhang Rui1-9/+29
On some machines, the graphics device is enumerated as /sys/class/drm/card1 instead of /sys/class/drm/card0. The current implementation does not handle this scenario, resulting in the loss of graphics C6 residency and frequency information. Add support for /sys/class/drm/card1, ensuring that turbostat can retrieve and display the graphics columns for these platforms. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Cache graphics sysfs file descriptors during probeZhang Rui1-50/+32
Snapshots of the graphics sysfs knobs are taken based on file descriptors. To optimize this process, open the files and cache the file descriptors during the graphics probe phase. As a result, the previously cached pathnames become redundant and are removed. This change aims to streamline the code without altering its functionality. No functional change intended. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Consolidate graphics sysfs accessZhang Rui1-9/+6
Currently, there is an inconsistency in how graphics sysfs knobs are accessed: graphics residency sysfs knobs are opened and closed for each read, while graphics frequency sysfs knobs are opened once and remain open until turbostat exits. This inconsistency is confusing and adds unnecessary code complexity. Consolidate the access method by opening the sysfs files once and reusing the file pointers for subsequent accesses. This approach simplifies the code and ensures a consistent method for accessing graphics sysfs knobs. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Remove unnecessary fflush() callZhang Rui1-4/+3
The graphics sysfs knobs are read-only, making the use of fflush() before reading them redundant. Remove the unnecessary fflush() call. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Enhance platform divergence descriptionZhang Rui1-28/+30
In various generations, platforms often share a majority of features, diverging only in a few specific aspects. The current approach of using hardcoded values in 'platform_features' structure fails to effectively represent these divergences. To improve the description of platform divergence: 1. Each newly introduced 'platform_features' structure must have a base, typically derived from the previous generation. 2. Platform feature values should be inherited from the base structure rather than being hardcoded. This approach ensures a more accurate and maintainable representation of platform-specific features across different generations. Converts `adl_features` and `lnl_features` to follow this new scheme. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add initial support for GraniteRapids-DZhang Rui1-0/+1
Add initial support for GraniteRapids-D. It shares the same features with SapphireRapids. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Remove PC3 support on LunarlakeZhang Rui1-1/+1
Lunarlake supports CC1/CC6/CC7/PC2/PC6/PC10. Remove PC3 support on Lunarlake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Rename arl_features to lnl_featuresZhang Rui1-2/+2
As ARL shares the same features with ADL/RPL/MTL, now 'arl_features' is used by Lunarlake platform only. Rename 'arl_features' to 'lnl_features'. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add back PC8 support on ArrowlakeZhang Rui1-3/+3
Similar to ADL/RPL/MTL, ARL supports CC1/CC6/CC7/PC2/PC3/PC6/PC8/PC10. Add back PC8 support on Arrowlake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Remove PC7/PC9 support on MTLZhang Rui1-2/+2
Similar to ADL/RPL, MTL support CC1/CC6/CC7/PC2/PC3/PC6/PC8/CP10. Remove PC7/PC9 support on MTL. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Honor --show CPU, even when even when num_cpus=1Patryk Wlazlyn1-2/+2
Honor --show CPU and --show Core when "topo.num_cpus == 1". Previously turbostat assumed that on a 1-CPU system, these columns should never appear. Honoring these flags makes it easier for several programs that parse turbostat output. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Fix trailing '\n' parsingZhang Rui1-0/+3
parse_cpu_string() parses the string input either from command line or from /sys/fs/cgroup/cpuset.cpus.effective to get a list of CPUs that turbostat can run with. The cpu string returned by /sys/fs/cgroup/cpuset.cpus.effective contains a trailing '\n', but strtoul() fails to treat this as an error. That says, for the code below val = ("\n", NULL, 10); val returns 0, and errno is also not set. As a result, CPU0 is erroneously considered as allowed CPU and this causes failures when turbostat tries to run on CPU0. get_counters: Could not migrate to CPU 0 ... turbostat: re-initialized with num_cpus 8, allowed_cpus 5 get_counters: Could not migrate to CPU 0 Add a check to return immediately if '\n' or '\0' is detected. Fixes: 8c3dd2c9e542 ("tools/power/turbostat: Abstrct function for parsing cpu string") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Allow using cpu device in perf counters on hybrid platformsPatryk Wlazlyn2-7/+123
Intel hybrid platforms expose different perf devices for P and E cores. Instead of one, "/sys/bus/event_source/devices/cpu" device, there are "/sys/bus/event_source/devices/{cpu_core,cpu_atom}". This, however makes it more complicated for the user, because most of the counters are available on both and had to be handled manually. This patch allows users to use "virtual" cpu device that is seemingly translated to cpu_core and cpu_atom perf devices, depending on the type of a CPU we are opening the counter for. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Fix column printing for PMT xtal_time countersPatryk Wlazlyn1-3/+3
If the very first printed column was for a PMT counter of type xtal_time we would misalign the column header, because we were always printing the delimiter. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: fix GCC9 build regressionTodd Brandt1-9/+6
Fix build regression seen when using old gcc-9 compiler. Signed-off-by: Todd Brandt <todd.e.brandt@intel.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30PCI/pwrctrl: Unregister platform device only if one actually existsBrian Norris1-2/+7
If a PCI device has an associated device_node with power supplies, pci_bus_add_device() creates platform devices for use by pwrctrl. When the PCI device is removed, pci_stop_dev() uses of_find_device_by_node() to locate the related platform device, then unregisters it. But when we remove a PCI device with no associated device node, dev_of_node(dev) is NULL, and of_find_device_by_node(NULL) returns the first device with "dev->of_node == NULL". The result is that we (a) mistakenly unregister a completely unrelated platform device, leading to issues like the first trace below, and (b) dereference the NULL pointer from dev_of_node() when clearing OF_POPULATED, as in the second trace. Unregister a platform device only if there is one associated with this PCI device. This resolves issues seen when doing: # echo 1 > /sys/bus/pci/devices/.../remove Sample issue from unregistering the wrong platform device: WARNING: CPU: 0 PID: 5095 at drivers/regulator/core.c:5885 regulator_unregister+0x140/0x160 Call trace: regulator_unregister+0x140/0x160 devm_rdev_release+0x1c/0x30 release_nodes+0x68/0x100 devres_release_all+0x98/0xf8 device_unbind_cleanup+0x20/0x70 device_release_driver_internal+0x1f4/0x240 device_release_driver+0x20/0x40 bus_remove_device+0xd8/0x170 device_del+0x154/0x380 device_unregister+0x28/0x88 of_device_unregister+0x1c/0x30 pci_stop_bus_device+0x154/0x1b0 pci_stop_and_remove_bus_device_locked+0x28/0x48 remove_store+0xa0/0xb8 dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 Later NULL pointer dereference for of_node_clear_flag(NULL, OF_POPULATED): Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0 Call trace: pci_stop_bus_device+0x190/0x1b0 pci_stop_and_remove_bus_device_locked+0x28/0x48 remove_store+0xa0/0xb8 dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 Link: https://lore.kernel.org/r/20241126210443.4052876-1-briannorris@chromium.org Fixes: 681725afb6b9 ("PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent") Reported-by: Saurabh Sengar <ssengar@linux.microsoft.com> Closes: https://lore.kernel.org/r/1732890621-19656-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Brian Norris <briannorris@chromium.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-30Revert "serial: sh-sci: Clean sci_ports[0] after at earlycon exit"Greg Kroah-Hartman1-28/+0
This reverts commit 3791ea69a4858b81e0277f695ca40f5aae40f312. It was reported to cause boot-time issues, so revert it for now. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: 3791ea69a485 ("serial: sh-sci: Clean sci_ports[0] after at earlycon exit") Cc: stable <stable@kernel.org> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-30sh: intc: Fix use-after-free bug in register_intc_controller()Dan Carpenter1-1/+1
In the error handling for this function, d is freed without ever removing it from intc_list which would lead to a use after free. To fix this, let's only add it to the list after everything has succeeded. Fixes: 2dcec7a988a1 ("sh: intc: set_irq_wake() support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-11-30sh: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACKHuacai Chen1-1/+1
When CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS are selected, cpu_max_bits_warn() generates a runtime warning similar as below when showing /proc/cpuinfo. Fix this by using nr_cpu_ids (the runtime limit) instead of NR_CPUS to iterate CPUs. [ 3.052463] ------------[ cut here ]------------ [ 3.059679] WARNING: CPU: 3 PID: 1 at include/linux/cpumask.h:108 show_cpuinfo+0x5e8/0x5f0 [ 3.070072] Modules linked in: efivarfs autofs4 [ 3.076257] CPU: 0 PID: 1 Comm: systemd Not tainted 5.19-rc5+ #1052 [ 3.099465] Stack : 9000000100157b08 9000000000f18530 9000000000cf846c 9000000100154000 [ 3.109127] 9000000100157a50 0000000000000000 9000000100157a58 9000000000ef7430 [ 3.118774] 90000001001578e8 0000000000000040 0000000000000020 ffffffffffffffff [ 3.128412] 0000000000aaaaaa 1ab25f00eec96a37 900000010021de80 900000000101c890 [ 3.138056] 0000000000000000 0000000000000000 0000000000000000 0000000000aaaaaa [ 3.147711] ffff8000339dc220 0000000000000001 0000000006ab4000 0000000000000000 [ 3.157364] 900000000101c998 0000000000000004 9000000000ef7430 0000000000000000 [ 3.167012] 0000000000000009 000000000000006c 0000000000000000 0000000000000000 [ 3.176641] 9000000000d3de08 9000000001639390 90000000002086d8 00007ffff0080286 [ 3.186260] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c [ 3.195868] ... [ 3.199917] Call Trace: [ 3.203941] [<90000000002086d8>] show_stack+0x38/0x14c [ 3.210666] [<9000000000cf846c>] dump_stack_lvl+0x60/0x88 [ 3.217625] [<900000000023d268>] __warn+0xd0/0x100 [ 3.223958] [<9000000000cf3c90>] warn_slowpath_fmt+0x7c/0xcc [ 3.231150] [<9000000000210220>] show_cpuinfo+0x5e8/0x5f0 [ 3.238080] [<90000000004f578c>] seq_read_iter+0x354/0x4b4 [ 3.245098] [<90000000004c2e90>] new_sync_read+0x17c/0x1c4 [ 3.252114] [<90000000004c5174>] vfs_read+0x138/0x1d0 [ 3.258694] [<90000000004c55f8>] ksys_read+0x70/0x100 [ 3.265265] [<9000000000cfde9c>] do_syscall+0x7c/0x94 [ 3.271820] [<9000000000202fe4>] handle_syscall+0xc4/0x160 [ 3.281824] ---[ end trace 8b484262b4b8c24c ]--- Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-11-29btrfs: fix lockdep warnings on io_uring encoded readsMark Harmstone2-0/+20
Lockdep doesn't like the fact that btrfs_uring_read_extent() returns to userspace still holding the inode lock, even though we release it once the I/O finishes. Add calls to rwsem_release() and rwsem_acquire_read() to work round this. Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> 34310c442e17 ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29btrfs: ref-verify: fix use-after-free after invalid ref actionFilipe Manana1-0/+1
At btrfs_ref_tree_mod() after we successfully inserted the new ref entry (local variable 'ref') into the respective block entry's rbtree (local variable 'be'), if we find an unexpected action of BTRFS_DROP_DELAYED_REF, we error out and free the ref entry without removing it from the block entry's rbtree. Then in the error path of btrfs_ref_tree_mod() we call btrfs_free_ref_cache(), which iterates over all block entries and then calls free_block_entry() for each one, and there we will trigger a use-after-free when we are called against the block entry to which we added the freed ref entry to its rbtree, since the rbtree still points to the block entry, as we didn't remove it from the rbtree before freeing it in the error path at btrfs_ref_tree_mod(). Fix this by removing the new ref entry from the rbtree before freeing it. Syzbot report this with the following stack traces: BTRFS error (device loop0 state EA): Ref action 2, root 5, ref_root 0, parent 8564736, owner 0, offset 0, num_refs 18446744073709551615 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_insert_empty_items+0x9c/0x1a0 fs/btrfs/ctree.c:4314 btrfs_insert_empty_item fs/btrfs/ctree.h:669 [inline] btrfs_insert_orphan_item+0x1f1/0x320 fs/btrfs/orphan.c:23 btrfs_orphan_add+0x6d/0x1a0 fs/btrfs/inode.c:3482 btrfs_unlink+0x267/0x350 fs/btrfs/inode.c:4293 vfs_unlink+0x365/0x650 fs/namei.c:4469 do_unlinkat+0x4ae/0x830 fs/namei.c:4533 __do_sys_unlinkat fs/namei.c:4576 [inline] __se_sys_unlinkat fs/namei.c:4569 [inline] __x64_sys_unlinkat+0xcc/0xf0 fs/namei.c:4569 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f BTRFS error (device loop0 state EA): Ref action 1, root 5, ref_root 5, parent 0, owner 260, offset 0, num_refs 1 __btrfs_mod_ref+0x76b/0xac0 fs/btrfs/extent-tree.c:2521 update_ref_for_cow+0x96a/0x11f0 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 BTRFS error (device loop0 state EA): Ref action 2, root 5, ref_root 0, parent 8564736, owner 0, offset 0, num_refs 18446744073709551615 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 ================================================================== BUG: KASAN: slab-use-after-free in rb_first+0x69/0x70 lib/rbtree.c:473 Read of size 8 at addr ffff888042d1af38 by task syz.0.0/5329 CPU: 0 UID: 0 PID: 5329 Comm: syz.0.0 Not tainted 6.12.0-rc7-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 rb_first+0x69/0x70 lib/rbtree.c:473 free_block_entry+0x78/0x230 fs/btrfs/ref-verify.c:248 btrfs_free_ref_cache+0xa3/0x100 fs/btrfs/ref-verify.c:917 btrfs_ref_tree_mod+0x139f/0x15e0 fs/btrfs/ref-verify.c:898 btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f996df7e719 RSP: 002b:00007f996ede7038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f996e135f80 RCX: 00007f996df7e719 RDX: 0000000020000180 RSI: 00000000c4009420 RDI: 0000000000000004 RBP: 00007f996dff139e R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f996e135f80 R15: 00007fff79f32e68 </TASK> Allocated by task 5329: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:257 [inline] __kmalloc_cache_noprof+0x19c/0x2c0 mm/slub.c:4295 kmalloc_noprof include/linux/slab.h:878 [inline] kzalloc_noprof include/linux/slab.h:1014 [inline] btrfs_ref_tree_mod+0x264/0x15e0 fs/btrfs/ref-verify.c:701 btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 5329: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_hook mm/slub.c:2342 [inline] slab_free mm/slub.c:4579 [inline] kfree+0x1a0/0x440 mm/slub.c:4727 btrfs_ref_tree_mod+0x136c/0x15e0 btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff888042d1af00 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 56 bytes inside of freed 64-byte region [ffff888042d1af00, ffff888042d1af40) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x42d1a anon flags: 0x4fff00000000000(node=1|zone=1|lastcpupid=0x7ff) page_type: f5(slab) raw: 04fff00000000000 ffff88801ac418c0 0000000000000000 dead000000000001 raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5055, tgid 5055 (dhcpcd-run-hook), ts 40377240074, free_ts 40376848335 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1541 prep_new_page mm/page_alloc.c:1549 [inline] get_page_from_freelist+0x3649/0x3790 mm/page_alloc.c:3459 __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4735 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_slab_page+0x6a/0x140 mm/slub.c:2412 allocate_slab+0x5a/0x2f0 mm/slub.c:2578 new_slab mm/slub.c:2631 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3818 __slab_alloc+0x58/0xa0 mm/slub.c:3908 __slab_alloc_node mm/slub.c:3961 [inline] slab_alloc_node mm/slub.c:4122 [inline] __do_kmalloc_node mm/slub.c:4263 [inline] __kmalloc_noprof+0x25a/0x400 mm/slub.c:4276 kmalloc_noprof include/linux/slab.h:882 [inline] kzalloc_noprof include/linux/slab.h:1014 [inline] tomoyo_encode2 security/tomoyo/realpath.c:45 [inline] tomoyo_encode+0x26f/0x540 security/tomoyo/realpath.c:80 tomoyo_realpath_from_path+0x59e/0x5e0 security/tomoyo/realpath.c:283 tomoyo_get_realpath security/tomoyo/file.c:151 [inline] tomoyo_check_open_permission+0x255/0x500 security/tomoyo/file.c:771 security_file_open+0x777/0x990 security/security.c:3109 do_dentry_open+0x369/0x1460 fs/open.c:945 vfs_open+0x3e/0x330 fs/open.c:1088 do_open fs/namei.c:3774 [inline] path_openat+0x2c84/0x3590 fs/namei.c:3933 page last free pid 5055 tgid 5055 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1112 [inline] free_unref_page+0xcfb/0xf20 mm/page_alloc.c:2642 free_pipe_info+0x300/0x390 fs/pipe.c:860 put_pipe_info fs/pipe.c:719 [inline] pipe_release+0x245/0x320 fs/pipe.c:742 __fput+0x23f/0x880 fs/file_table.c:431 __do_sys_close fs/open.c:1567 [inline] __se_sys_close fs/open.c:1552 [inline] __x64_sys_close+0x7f/0x110 fs/open.c:1552 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Memory state around the buggy address: ffff888042d1ae00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff888042d1ae80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc >ffff888042d1af00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ^ ffff888042d1af80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc ffff888042d1b000: 00 00 00 00 00 fc fc 00 00 00 00 00 fc fc 00 00 Reported-by: syzbot+7325f164162e200000c1@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/673723eb.050a0220.1324f8.00a8.GAE@google.com/T/#u Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29btrfs: add a sanity check for btrfs root in btrfs_search_slot()Lizhi Xu1-1/+5
Syzbot reports a null-ptr-deref in btrfs_search_slot(). The reproducer is using rescue=ibadroots, and the extent tree root is corrupted thus the extent tree is NULL. When scrub tries to search the extent tree to gather the needed extent info, btrfs_search_slot() doesn't check if the target root is NULL or not, resulting the null-ptr-deref. Add sanity check for btrfs root before using it in btrfs_search_slot(). Reported-by: syzbot+3030e17bd57a73d39bd7@syzkaller.appspotmail.com Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots") Link: https://syzkaller.appspot.com/bug?extid=3030e17bd57a73d39bd7 CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Qu Wenruo <wqu@suse.com> Tested-by: syzbot+3030e17bd57a73d39bd7@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29btrfs: don't loop for nowait writes when checking for cross referencesFilipe Manana1-1/+1
When checking for delayed refs when verifying if there are cross references for a data extent, we stop if the path has nowait set and we can't try lock the delayed ref head's mutex, returning -EAGAIN with the goal of making a write fallback to a blocking context. However we ignore the -EAGAIN at btrfs_cross_ref_exist() when check_delayed_ref() returns it, and keep looping instead of immediately returning the -EAGAIN to the caller. Fix this by not looping if we get -EAGAIN and we have a nowait path. Fixes: 26ce91144631 ("btrfs: make can_nocow_extent nowait compatible") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29brd: decrease the number of allocated pages which discardedZhang Xianwei1-1/+3
The number of allocated pages which discarded will not decrease. Fix it. Fixes: 9ead7efc6f3f ("brd: implement discard support") Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241128170056565nPKSz2vsP8K8X2uk2iaDG@zte.com.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-29block, bfq: fix bfqq uaf in bfq_limit_depth()Yu Kuai1-13/+24
Set new allocated bfqq to bic or remove freed bfqq from bic are both protected by bfqd->lock, however bfq_limit_depth() is deferencing bfqq from bic without the lock, this can lead to UAF if the io_context is shared by multiple tasks. For example, test bfq with io_uring can trigger following UAF in v6.6: ================================================================== BUG: KASAN: slab-use-after-free in bfqq_group+0x15/0x50 Call Trace: <TASK> dump_stack_lvl+0x47/0x80 print_address_description.constprop.0+0x66/0x300 print_report+0x3e/0x70 kasan_report+0xb4/0xf0 bfqq_group+0x15/0x50 bfqq_request_over_limit+0x130/0x9a0 bfq_limit_depth+0x1b5/0x480 __blk_mq_alloc_requests+0x2b5/0xa00 blk_mq_get_new_requests+0x11d/0x1d0 blk_mq_submit_bio+0x286/0xb00 submit_bio_noacct_nocheck+0x331/0x400 __block_write_full_folio+0x3d0/0x640 writepage_cb+0x3b/0xc0 write_cache_pages+0x254/0x6c0 write_cache_pages+0x254/0x6c0 do_writepages+0x192/0x310 filemap_fdatawrite_wbc+0x95/0xc0 __filemap_fdatawrite_range+0x99/0xd0 filemap_write_and_wait_range.part.0+0x4d/0xa0 blkdev_read_iter+0xef/0x1e0 io_read+0x1b6/0x8a0 io_issue_sqe+0x87/0x300 io_wq_submit_work+0xeb/0x390 io_worker_handle_work+0x24d/0x550 io_wq_worker+0x27f/0x6c0 ret_from_fork_asm+0x1b/0x30 </TASK> Allocated by task 808602: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_slab_alloc+0x83/0x90 kmem_cache_alloc_node+0x1b1/0x6d0 bfq_get_queue+0x138/0xfa0 bfq_get_bfqq_handle_split+0xe3/0x2c0 bfq_init_rq+0x196/0xbb0 bfq_insert_request.isra.0+0xb5/0x480 bfq_insert_requests+0x156/0x180 blk_mq_insert_request+0x15d/0x440 blk_mq_submit_bio+0x8a4/0xb00 submit_bio_noacct_nocheck+0x331/0x400 __blkdev_direct_IO_async+0x2dd/0x330 blkdev_write_iter+0x39a/0x450 io_write+0x22a/0x840 io_issue_sqe+0x87/0x300 io_wq_submit_work+0xeb/0x390 io_worker_handle_work+0x24d/0x550 io_wq_worker+0x27f/0x6c0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x1b/0x30 Freed by task 808589: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x27/0x40 __kasan_slab_free+0x126/0x1b0 kmem_cache_free+0x10c/0x750 bfq_put_queue+0x2dd/0x770 __bfq_insert_request.isra.0+0x155/0x7a0 bfq_insert_request.isra.0+0x122/0x480 bfq_insert_requests+0x156/0x180 blk_mq_dispatch_plug_list+0x528/0x7e0 blk_mq_flush_plug_list.part.0+0xe5/0x590 __blk_flush_plug+0x3b/0x90 blk_finish_plug+0x40/0x60 do_writepages+0x19d/0x310 filemap_fdatawrite_wbc+0x95/0xc0 __filemap_fdatawrite_range+0x99/0xd0 filemap_write_and_wait_range.part.0+0x4d/0xa0 blkdev_read_iter+0xef/0x1e0 io_read+0x1b6/0x8a0 io_issue_sqe+0x87/0x300 io_wq_submit_work+0xeb/0x390 io_worker_handle_work+0x24d/0x550 io_wq_worker+0x27f/0x6c0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x1b/0x30 Fix the problem by protecting bic_to_bfqq() with bfqd->lock. CC: Jan Kara <jack@suse.cz> Fixes: 76f1df88bbc2 ("bfq: Limit number of requests consumed by each cgroup") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20241129091509.2227136-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-29selftests/hid: fix kfunc inclusions with newer bpftoolBenjamin Tissoires1-8/+11
bpftool now embeds the kfuncs definitions directly in the generated vmlinux.h This is great, but because the selftests dir might be compiled with HID_BPF disabled, we have no guarantees to be able to compile the sources with the generated kfuncs. If we have the kfuncs, because we have the `__not_used` hack, the newly defined kfuncs do not match the ones from vmlinux.h and things go wrong. Prevent vmlinux.h to define its kfuncs and also add the missing `__weak` symbols for our custom kfuncs definitions Link: https://patch.msgid.link/20241128-fix-new-bpftool-v1-1-c9abdf94a719@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-11-29io_uring/tctx: work around xa_store() allocation error issueJens Axboe1-1/+12
syzbot triggered the following WARN_ON: WARNING: CPU: 0 PID: 16 at io_uring/tctx.c:51 __io_uring_free+0xfa/0x140 io_uring/tctx.c:51 which is the WARN_ON_ONCE(!xa_empty(&tctx->xa)); sanity check in __io_uring_free() when a io_uring_task is going through its final put. The syzbot test case includes injecting memory allocation failures, and it very much looks like xa_store() can fail one of its memory allocations and end up with ->head being non-NULL even though no entries exist in the xarray. Until this issue gets sorted out, work around it by attempting to iterate entries in our xarray, and WARN_ON_ONCE() if one is found. Reported-by: syzbot+cc36d44ec9f368e443d3@syzkaller.appspotmail.com Link: https://lore.kernel.org/io-uring/673c1643.050a0220.87769.0066.GAE@google.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-29Revert "s390/mm: Allow large pages for KASAN shadow mapping"Vasily Gorbik1-11/+1
This reverts commit ff123eb7741638d55abf82fac090bb3a543c1e74. Allowing large pages for KASAN shadow mappings isn't inherently wrong, but adding POPULATE_KASAN_MAP_SHADOW to large_allowed() exposes an issue in can_large_pud() and can_large_pmd(). Since commit d8073dc6bc04 ("s390/mm: Allow large pages only for aligned physical addresses"), both can_large_pud() and can_large_pmd() call _pa() to check if large page physical addresses are aligned. However, _pa() has a side effect: it allocates memory in POPULATE_KASAN_MAP_SHADOW mode. This results in massive memory leaks. The proper fix would be to address both large_allowed() and _pa()'s side effects, but for now, revert this change to avoid the leaks. Fixes: ff123eb77416 ("s390/mm: Allow large pages for KASAN shadow mapping") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-29posix-timers: Target group sigqueue to current task only if not exitingFrederic Weisbecker1-3/+4
A sigqueue belonging to a posix timer, which target is not a specific thread but a whole thread group, is preferrably targeted to the current task if it is part of that thread group. However nothing prevents a posix timer event from queueing such a sigqueue from a reaped yet running task. The interruptible code space between exit_notify() and the final call to schedule() is enough for posix_timer_fn() hrtimer to fire. If that happens while the current task is part of the thread group target, it is proposed to handle it but since its sighand pointer may have been cleared already, the sigqueue is dropped even if there are other tasks running within the group that could handle it. As a result posix timers with thread group wide target may miss signals when some of their threads are exiting. Fix this with verifying that the current task hasn't been through exit_notify() before proposing it as a preferred target so as to ensure that its sighand is still here and stable. complete_signal() might still reconsider the choice and find a better target within the group if current has passed retarget_shared_pending() already. Fixes: bcb7ee79029d ("posix-timers: Prefer delivery of signals to the current thread") Reported-by: Anthony Mallet <anthony.mallet@laas.fr> Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241122234811.60455-1-frederic@kernel.org Closes: https://lore.kernel.org/all/26411.57288.238690.681680@gargle.gargle.HOWL