aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/exported-sql-viewer.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-03-17objtool: Use O_CREAT with explicit mode maskIngo Molnar1-1/+1
Recent Ubuntu enforces 3-argument open() with O_CREAT: CC /home/mingo/tip/tools/objtool/builtin-check.o In file included from /usr/include/fcntl.h:341, from builtin-check.c:9: In function ‘open’, inlined from ‘copy_file’ at builtin-check.c:201:11: /usr/include/x86_64-linux-gnu/bits/fcntl2.h:52:11: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments 52 | __open_missing_mode (); | ^~~~~~~~~~~~~~~~~~~~~~ Use 0400 as the most restrictive mode for the new file. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org
2025-03-17objtool: Add CONFIG_OBJTOOL_WERRORJosh Poimboeuf2-0/+12
Objtool warnings can be indicative of crashes, broken live patching, or even boot failures. Ignoring them is not recommended. Add CONFIG_OBJTOOL_WERROR to upgrade objtool warnings to errors by enabling the objtool --Werror option. Also set --backtrace to print the branches leading up to the warning, which can help considerably when debugging certain warnings. To avoid breaking bots too badly for now, make it the default for real world builds only (!COMPILE_TEST). Co-developed-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/3e7c109313ff15da6c80788965cc7450115b0196.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Create backup on error and print argsJosh Poimboeuf3-67/+65
Recreating objtool errors can be a manual process. Kbuild removes the object, so it has to be compiled or linked again before running objtool. Then the objtool args need to be reversed engineered. Make that all easier by automatically making a backup of the object file on error, and print a modified version of the args which can be used to recreate. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/7571e30636359b3e173ce6e122419452bb31882f.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Change "warning:" to "error:" for --WerrorJosh Poimboeuf1-2/+4
This is similar to GCC's behavior and makes it more obvious why the build failed. Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://lore.kernel.org/r/56f0565b15b4b4caa9a08953fa9c679dfa973514.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Add --Werror optionJosh Poimboeuf3-3/+14
Any objtool warning has the potential of reflecting (or triggering) a major bug in the kernel or compiler which could result in crashing the kernel or breaking the livepatch consistency model. In preparation for failing the build on objtool errors/warnings, add a new --Werror option. [ jpoimboe: commit log, comments, error out on fatal errors too ] Co-developed-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/e423ea4ec297f510a108aa6c78b52b9fe30fa8c1.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Add --output optionJosh Poimboeuf5-35/+89
Add option to allow writing the changed binary to a separate file rather than changing it in place. Libelf makes this suprisingly hard, so take the easy way out and just copy the file before editing it. Also steal the -o short option from --orc. Nobody will notice ;-) Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/0da308d42d82b3bbed16a31a72d6bde52afcd6bd.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Upgrade "Linked object detected" warning to errorJosh Poimboeuf1-2/+2
Force the user to fix their cmdline if they forget the '--link' option. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://lore.kernel.org/r/8380bbf3a0fa86e03fd63f60568ae06a48146bc1.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Consolidate option validationJosh Poimboeuf1-44/+24
The option validations are a bit scattered, consolidate them. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://lore.kernel.org/r/8f886502fda1d15f39d7351b70d4ebe5903da627.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Remove --unret dependency on --rethunkJosh Poimboeuf1-5/+0
With unret validation enabled and IBT/LTO disabled, objtool runs on TUs with --rethunk and on vmlinux.o with --unret. So this dependency isn't valid as they don't always run on the same object. This error never triggered before because --unret is always coupled with --noinstr, so the first conditional in opts_valid() returns early due to opts.noinstr being true. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/c6f5635784a28ed4b10ac4307b1858e015e6eff0.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Increase per-function WARN_FUNC() rate limitJosh Poimboeuf3-5/+13
Increase the per-function WARN_FUNC() rate limit from 1 to 2. If the number of warnings for a given function goes beyond 2, print "skipping duplicate warning(s)". This helps root out additional warnings in a function that might be hiding behind the first one. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/aec318d66c037a51c9f376d6fb0e8ff32812a037.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Update documentationJosh Poimboeuf1-42/+53
Fix some outdated information in the objtool doc. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/2552ee8b48631127bf269359647a7389edf5f002.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Improve __noreturn annotation warningJosh Poimboeuf2-8/+6
Clarify what needs to be done to resolve the missing __noreturn warning. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/ab835a35d00bacf8aff0b56257df93f14fdd8224.1741975349.git.jpoimboe@kernel.org
2025-03-17objtool: Fix error handling inconsistencies in check()Josh Poimboeuf1-4/+6
Make sure all fatal errors are funneled through the 'out' label with a negative ret. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://lore.kernel.org/r/0f49d6a27a080b4012e84e6df1e23097f44cc082.1741975349.git.jpoimboe@kernel.org
2025-03-17x86/traps: Make exc_double_fault() consistently noreturnJosh Poimboeuf2-31/+18
The CONFIG_X86_ESPFIX64 version of exc_double_fault() can return to its caller, but the !CONFIG_X86_ESPFIX64 version never does. In the latter case the compiler and/or objtool may consider it to be implicitly noreturn. However, due to the currently inflexible way objtool detects noreturns, a function's noreturn status needs to be consistent across configs. The current workaround for this issue is to suppress unreachable warnings for exc_double_fault()'s callers. Unfortunately that can result in ORC coverage gaps and potentially worse issues like inert static calls and silently disabled CPU mitigations. Instead, prevent exc_double_fault() from ever being implicitly marked noreturn by forcing a return behind a never-taken conditional. Until a more integrated noreturn detection method exists, this is likely the least objectionable workaround. Fixes: 55eeab2a8a11 ("objtool: Ignore exc_double_fault() __noreturn warnings") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Brendan Jackman <jackmanb@google.com> Link: https://lore.kernel.org/r/d1f4026f8dc35d0de6cc61f2684e0cb6484009d1.1741975349.git.jpoimboe@kernel.org
2025-03-12LoongArch: Enable jump table for objtoolTiezhu Yang2-1/+8
For now, it is time to remove -fno-jump-tables to enable jump table for objtool if the compiler has -mannotate-tablejump, otherwise it is better to remain -fno-jump-tables to keep compatibility with older compilers. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20241217010905.13054-8-yangtiezhu@loongson.cn Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-12objtool/LoongArch: Add support for goto tableTiezhu Yang1-2/+30
The objtool program need to analysis the control flow of each object file generated by compiler toolchain, it needs to know all the locations that a branch instruction may jump into, if a jump table is used, objtool has to correlate the jump instruction with the table. On x86 (which is the only port supported by objtool before LoongArch), there is a relocation type on the jump instruction and directly points to the table. But on LoongArch, the relocation is on another kind of instruction prior to the jump instruction, and also with scheduling it is not very easy to tell the offset of that instruction from the jump instruction. Furthermore, because LoongArch has -fsection-anchors (often enabled at -O1 or above) the relocation may actually points to a section anchor instead of the table itself. For the jump table of switch cases, a GCC patch "LoongArch: Add support to annotate tablejump" and a Clang patch "[LoongArch] Add options for annotate tablejump" have been merged into the upstream mainline, it can parse the additional section ".discard.tablejump_annotate" which stores the jump info as pairs of addresses, each pair contains the address of jump instruction and the address of jump table. For the jump table of computed gotos, it is indeed not easy to implement in the compiler, especially if there is more than one computed goto in a function such as ___bpf_prog_run(). objdump kernel/bpf/core.o shows that there are many table jump instructions in ___bpf_prog_run(), but there are no relocations on the table jump instructions and to the table directly on LoongArch. Without the help of compiler, in order to figure out the address of goto table for the special case of ___bpf_prog_run(), since the instruction sequence is relatively single and stable, it makes sense to add a helper find_reloc_of_rodata_c_jump_table() to find the relocation which points to the section ".rodata..c_jump_table". If find_reloc_by_table_annotate() failed, it means there is no relocation info of switch table address in ".rela.discard.tablejump_annotate", then objtool may find the relocation info of goto table ".rodata..c_jump_table" with find_reloc_of_rodata_c_jump_table(). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250211115016.26913-6-yangtiezhu@loongson.cn Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-12objtool/LoongArch: Add support for switch tableTiezhu Yang1-1/+130
The objtool program need to analysis the control flow of each object file generated by compiler toolchain, it needs to know all the locations that a branch instruction may jump into, if a jump table is used, objtool has to correlate the jump instruction with the table. On x86 (which is the only port supported by objtool before LoongArch), there is a relocation type on the jump instruction and directly points to the table. But on LoongArch, the relocation is on another kind of instruction prior to the jump instruction, and also with scheduling it is not very easy to tell the offset of that instruction from the jump instruction. Furthermore, because LoongArch has -fsection-anchors (often enabled at -O1 or above) the relocation may actually points to a section anchor instead of the table itself. The good news is that after continuous analysis and discussion, at last a GCC patch "LoongArch: Add support to annotate tablejump" and a Clang patch "[LoongArch] Add options for annotate tablejump" have been merged into the upstream mainline, the compiler changes make life much easier for switch table support of objtool on LoongArch. By now, there is an additional section ".discard.tablejump_annotate" to store the jump info as pairs of addresses, each pair contains the address of jump instruction and the address of jump table. In order to find switch table, it is easy to parse the relocation section ".rela.discard.tablejump_annotate" to get table_sec and table_offset, the rest process is somehow like x86. Additionally, it needs to get each table size. When compiling on LoongArch, there are unsorted table offsets of rodata if there exist many jump tables, it will get the wrong table end and find the wrong table jump destination instructions in add_jump_table(). Sort the rodata table offset by parsing ".rela.discard.tablejump_annotate" and then get each table size of rodata corresponded with each table jump instruction, it is used to check the table end and will break the process when parsing ".rela.rodata" to avoid getting the wrong jump destination instructions. Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0ee028f55640 Link: https://github.com/llvm/llvm-project/commit/4c2c17756739 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250211115016.26913-5-yangtiezhu@loongson.cn Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-12objtool: Handle PC relative relocation typeTiezhu Yang4-5/+27
For the most part, an absolute relocation type is used for rodata. In the case of STT_SECTION, reloc->sym->offset is always zero, for the other symbol types, reloc_addend(reloc) is always zero, thus it can use a simple statement "reloc->sym->offset + reloc_addend(reloc)" to obtain the symbol offset for various symbol types. When compiling on LoongArch, there exist PC relative relocation types for rodata, it needs to calculate the symbol offset with "S + A - PC" according to the spec of "ELF for the LoongArch Architecture". If there is only one jump table in the rodata, the "PC" is the entry address which is equal with the value of reloc_offset(reloc), at this time, reloc_offset(table) is 0. If there are many jump tables in the rodata, the "PC" is the offset of the jump table's base address which is equal with the value of reloc_offset(reloc) - reloc_offset(table). So for LoongArch, if the relocation type is PC relative, it can use a statement "reloc_offset(reloc) - reloc_offset(table)" to get the "PC" value when calculating the symbol offset with "S + A - PC" for one or many jump tables in the rodata. Add an arch-specific function arch_jump_table_sym_offset() to assign the symbol offset, for the most part that is an absolute relocation, the default value is "reloc->sym->offset + reloc_addend(reloc)" in the weak definition, it can be overridden by each architecture that has different requirements. Link: https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250211115016.26913-4-yangtiezhu@loongson.cn Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-12objtool: Handle different entry size of rodataTiezhu Yang5-1/+41
In the most cases, the entry size of rodata is 8 bytes because the relocation type is 64 bit. There are also 32 bit relocation types, the entry size of rodata should be 4 bytes in this case. Add an arch-specific function arch_reloc_size() to assign the entry size of rodata for x86, powerpc and LoongArch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250211115016.26913-3-yangtiezhu@loongson.cn Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-12objtool: Handle various symbol types of rodataTiezhu Yang1-5/+11
In the relocation section ".rela.rodata" of each .o file compiled with LoongArch toolchain, there are various symbol types such as STT_NOTYPE, STT_OBJECT, STT_FUNC in addition to the usual STT_SECTION, it needs to use reloc symbol offset instead of reloc addend to find the destination instruction in find_jump_table() and add_jump_table(). For the most part, an absolute relocation type is used for rodata. In the case of STT_SECTION, reloc->sym->offset is always zero, and for the other symbol types, reloc_addend(reloc) is always zero, thus it can use a simple statement "reloc->sym->offset + reloc_addend(reloc)" to obtain the symbol offset for various symbol types. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20250211115016.26913-2-yangtiezhu@loongson.cn Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-12objtool: Hide unnecessary compiler error messageDavid Engraf1-1/+1
The check for using old libelf prints an error message when libelf.h is not available but does not abort. This may confuse so hide the compiler error message. Signed-off-by: David Engraf <david.engraf@sysgo.com> Link: https://lore.kernel.org/r/20250203073610.206000-1-david.engraf@sysgo.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-03-09Linux 6.14-rc6Linus Torvalds1-1/+1
2025-03-08x86/microcode/AMD: Add some forgotten models to the SHA checkBorislav Petkov (AMD)1-0/+6
Add some more forgotten models to the SHA check. Fixes: 50cef76d5cb0 ("x86/microcode/AMD: Load only SHA256-checksummed patches") Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Link: https://lore.kernel.org/r/20250307220256.11816-1-bp@kernel.org
2025-03-08LoongArch: KVM: Fix GPA size issue about VMBibo Mao2-1/+11
Physical address space is 48 bit on Loongson-3A5000 physical machine, however it is 47 bit for VM on Loongson-3A5000 system. Size of physical address space of VM is the same with the size of virtual user space (a half) of physical machine. Variable cpu_vabits represents user address space, kernel address space is not included (user space and kernel space are both a half of total). Here cpu_vabits, rather than cpu_vabits - 1, is to represent the size of guest physical address space. Also there is strict checking about page fault GPA address, inject error if it is larger than maximum GPA address of VM. Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: KVM: Reload guest CSR registers after sleepBibo Mao1-0/+7
On host, the HW guest CSR registers are lost after suspend and resume operation. Since last_vcpu of boot CPU still records latest vCPU pointer so that the guest CSR register skips to reload when boot CPU resumes and vCPU is scheduled. Here last_vcpu is cleared so that guest CSR registers will reload from scheduled vCPU context after suspend and resume. Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: KVM: Add interrupt checking for AVECBibo Mao1-1/+1
There is a newly added macro INT_AVEC with CSR ESTAT register, which is bit 14 used for LoongArch AVEC support. AVEC interrupt status bit 14 is supported with macro CSR_ESTAT_IS, so here replace the hard-coded value 0x1fff with macro CSR_ESTAT_IS so that the AVEC interrupt status is also supported by KVM. Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: Set hugetlb mmap base address aligned with pmd sizeBibo Mao1-1/+5
With ltp test case "testcases/bin/hugefork02", there is a dmesg error report message such as: kernel BUG at mm/hugetlb.c:5550! Oops - BUG[#1]: CPU: 0 UID: 0 PID: 1517 Comm: hugefork02 Not tainted 6.14.0-rc2+ #241 Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 pc 90000000004eaf1c ra 9000000000485538 tp 900000010edbc000 sp 900000010edbf940 a0 900000010edbfb00 a1 9000000108d20280 a2 00007fffe9474000 a3 00007ffff3474000 a4 0000000000000000 a5 0000000000000003 a6 00000000003cadd3 a7 0000000000000000 t0 0000000001ffffff t1 0000000001474000 t2 900000010ecd7900 t3 00007fffe9474000 t4 00007fffe9474000 t5 0000000000000040 t6 900000010edbfb00 t7 0000000000000001 t8 0000000000000005 u0 90000000004849d0 s9 900000010edbfa00 s0 9000000108d20280 s1 00007fffe9474000 s2 0000000002000000 s3 9000000108d20280 s4 9000000002b38b10 s5 900000010edbfb00 s6 00007ffff3474000 s7 0000000000000406 s8 900000010edbfa08 ra: 9000000000485538 unmap_vmas+0x130/0x218 ERA: 90000000004eaf1c __unmap_hugepage_range+0x6f4/0x7d0 PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0) PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) Process hugefork02 (pid: 1517, threadinfo=00000000a670eaf4, task=000000007a95fc64) Call Trace: [<90000000004eaf1c>] __unmap_hugepage_range+0x6f4/0x7d0 [<9000000000485534>] unmap_vmas+0x12c/0x218 [<9000000000494068>] exit_mmap+0xe0/0x308 [<900000000025fdc4>] mmput+0x74/0x180 [<900000000026a284>] do_exit+0x294/0x898 [<900000000026aa30>] do_group_exit+0x30/0x98 [<900000000027bed4>] get_signal+0x83c/0x868 [<90000000002457b4>] arch_do_signal_or_restart+0x54/0xfa0 [<90000000015795e8>] irqentry_exit_to_user_mode+0xb8/0x138 [<90000000002572d0>] tlb_do_page_fault_1+0x114/0x1b4 The problem is that base address allocated from hugetlbfs is not aligned with pmd size. Here add a checking for hugetlbfs and align base address with pmd size. After this patch the test case "testcases/bin/hugefork02" passes to run. This is similar to the commit 7f24cbc9c4d42db8a3c8484d1 ("mm/mmap: teach generic_get_unmapped_area{_topdown} to handle hugetlb mappings"). Cc: stable@vger.kernel.org # 6.13+ Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: Set max_pfn with the PFN of the last pageBibo Mao1-0/+3
The current max_pfn equals to zero. In this case, it causes user cannot get some page information through /proc filesystem such as kpagecount. The following message is displayed by stress-ng test suite with command "stress-ng --verbose --physpage 1 -t 1". # stress-ng --verbose --physpage 1 -t 1 stress-ng: error: [1691] physpage: cannot read page count for address 0x134ac000 in /proc/kpagecount, errno=22 (Invalid argument) stress-ng: error: [1691] physpage: cannot read page count for address 0x7ffff207c3a8 in /proc/kpagecount, errno=22 (Invalid argument) stress-ng: error: [1691] physpage: cannot read page count for address 0x134b0000 in /proc/kpagecount, errno=22 (Invalid argument) ... After applying this patch, the kernel can pass the test. # stress-ng --verbose --physpage 1 -t 1 stress-ng: debug: [1701] physpage: [1701] started (instance 0 on CPU 3) stress-ng: debug: [1701] physpage: [1701] exited (instance 0 on CPU 3) stress-ng: debug: [1700] physpage: [1701] terminated (success) Cc: stable@vger.kernel.org # 6.8+ Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: Use polling play_dead() when resuming from hibernationHuacai Chen1-1/+46
When CONFIG_RANDOM_KMALLOC_CACHES or other randomization infrastructrue enabled, the idle_task's stack may different between the booting kernel and target kernel. So when resuming from hibernation, an ACTION_BOOT_CPU IPI wakeup the idle instruction in arch_cpu_idle_dead() and jump to the interrupt handler. But since the stack pointer is changed, the interrupt handler cannot restore correct context. So rename the current arch_cpu_idle_dead() to idle_play_dead(), make it as the default version of play_dead(), and the new arch_cpu_idle_dead() call play_dead() directly. For hibernation, implement an arch-specific hibernate_resume_nonboot_cpu_disable() to use the polling version (idle instruction is replace by nop, and irq is disabled) of play_dead(), i.e. poll_play_dead(), to avoid IPI handler corrupting the idle_task's stack when resuming from hibernation. This solution is a little similar to commit 406f992e4a372dafbe3c ("x86 / hibernate: Use hlt_play_dead() when resuming from hibernation"). Cc: stable@vger.kernel.org Tested-by: Erpeng Xu <xuerpeng@uniontech.com> Tested-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: Eliminate superfluous get_numa_distances_cnt()Yuli Wang1-12/+0
In LoongArch, get_numa_distances_cnt() isn't in use, resulting in a compiler warning. Fix follow errors with clang-18 when W=1e: arch/loongarch/kernel/acpi.c:259:28: error: unused function 'get_numa_distances_cnt' [-Werror,-Wunused-function] 259 | static inline unsigned int get_numa_distances_cnt(struct acpi_table_slit *slit) | ^~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Link: https://lore.kernel.org/all/Z7bHPVUH4lAezk0E@kernel.org/ Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08LoongArch: Convert unreachable() to BUG()Tiezhu Yang1-2/+2
When compiling on LoongArch, there exists the following objtool warning in arch/loongarch/kernel/machine_kexec.o: kexec_reboot() falls through to next function crash_shutdown_secondary() Avoid using unreachable() as it can (and will in the absence of UBSAN) generate fall-through code. Use BUG() so we get a "break BRK_BUG" trap (with unreachable annotation). Cc: stable@vger.kernel.org # 6.12+ Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-03-08x86/mm: Define PTRS_PER_PMD for assembly code tooIngo Molnar1-4/+4
Andy reported the following build warning from head_32.S: In file included from arch/x86/kernel/head_32.S:29: arch/x86/include/asm/pgtable_32.h:59:5: error: "PTRS_PER_PMD" is not defined, evaluates to 0 [-Werror=undef] 59 | #if PTRS_PER_PMD > 1 The reason is that on 2-level i386 paging the folded in PMD's PTRS_PER_PMD constant is not defined in assembly headers, only in generic MM C headers. Instead of trying to fish out the definition from the generic headers, just define it - it even has a comment for it already... Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z8oa8AUVyi2HWfo9@gmail.com
2025-03-07virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutexAlexey Kardashevskiy3-24/+39
Compared to the SNP Guest Request, the "Extended" version adds data pages for receiving certificates. If not enough pages provided, the HV can report to the VM how much is needed so the VM can reallocate and repeat. Commit ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") moved handling of the allocated/desired pages number out of scope of said mutex and create a possibility for a race (multiple instances trying to trigger Extended request in a VM) as there is just one instance of snp_msg_desc per /dev/sev-guest and no locking other than snp_cmd_mutex. Fix the issue by moving the data blob/size and the GHCB input struct (snp_req_data) into snp_guest_req which is allocated on stack now and accessed by the GHCB caller under that mutex. Stop allocating SEV_FW_BLOB_MAX_SIZE in snp_msg_alloc() as only one of four callers needs it. Free the received blob in get_ext_report() right after it is copied to the userspace. Possible future users of snp_send_guest_request() are likely to have different ideas about the buffer size anyways. Fixes: ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307013700.437505-3-aik@amd.com
2025-03-07virt: sev-guest: Allocate request data dynamicallyNikunj A Dadhania1-9/+15
Commit ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") narrowed the command mutex scope to snp_send_guest_request(). However, GET_REPORT, GET_DERIVED_KEY, and GET_EXT_REPORT share the req structure in snp_guest_dev. Without the mutex protection, concurrent requests can overwrite each other's data. Fix it by dynamically allocating the request structure. Fixes: ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") Closes: https://github.com/AMDESE/AMDSEV/issues/265 Reported-by: andreas.stuehrk@yaxi.tech Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307013700.437505-2-aik@amd.com
2025-03-07x86/amd_nb: Use rdmsr_safe() in amd_get_mmconfig_range()Andrew Cooper1-6/+3
Xen doesn't offer MSR_FAM10H_MMIO_CONF_BASE to all guests. This results in the following warning: unchecked MSR access error: RDMSR from 0xc0010058 at rIP: 0xffffffff8101d19f (xen_do_read_msr+0x7f/0xa0) Call Trace: xen_read_msr+0x1e/0x30 amd_get_mmconfig_range+0x2b/0x80 quirk_amd_mmconfig_area+0x28/0x100 pnp_fixup_device+0x39/0x50 __pnp_add_device+0xf/0x150 pnp_add_device+0x3d/0x100 pnpacpi_add_device_handler+0x1f9/0x280 acpi_ns_get_device_callback+0x104/0x1c0 acpi_ns_walk_namespace+0x1d0/0x260 acpi_get_devices+0x8a/0xb0 pnpacpi_init+0x50/0x80 do_one_initcall+0x46/0x2e0 kernel_init_freeable+0x1da/0x2f0 kernel_init+0x16/0x1b0 ret_from_fork+0x30/0x50 ret_from_fork_asm+0x1b/0x30 based on quirks for a "PNP0c01" device. Treating MMCFG as disabled is the right course of action, so no change is needed there. This was most likely exposed by fixing the Xen MSR accessors to not be silently-safe. Fixes: 3fac3734c43a ("xen/pv: support selecting safe/unsafe msr accesses") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307002846.3026685-1-andrew.cooper3@citrix.com
2025-03-06fs/pipe: add simpler helpers for common casesLinus Torvalds7-23/+49
The fix to atomically read the pipe head and tail state when not holding the pipe mutex has caused a number of headaches due to the size change of the involved types. It turns out that we don't have _that_ many places that access these fields directly and were affected, but we have more than we strictly should have, because our low-level helper functions have been designed to have intimate knowledge of how the pipes work. And as a result, that random noise of direct 'pipe->head' and 'pipe->tail' accesses makes it harder to pinpoint any actual potential problem spots remaining. For example, we didn't have a "is the pipe full" helper function, but instead had a "given these pipe buffer indexes and this pipe size, is the pipe full". That's because some low-level pipe code does actually want that much more complicated interface. But most other places literally just want a "is the pipe full" helper, and not having it meant that those places ended up being unnecessarily much too aware of this all. It would have been much better if only the very core pipe code that cared had been the one aware of this all. So let's fix it - better late than never. This just introduces the trivial wrappers for "is this pipe full or empty" and to get how many pipe buffers are used, so that instead of writing if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) the places that literally just want to know if a pipe is full can just say if (pipe_is_full(pipe)) instead. The existing trivial cases were converted with a 'sed' script. This cuts down on the places that access pipe->head and pipe->tail directly outside of the pipe code (and core splice code) quite a lot. The splice code in particular still revels in doing the direct low-level accesses, and the fuse fuse_dev_splice_write() code also seems a bit unnecessarily eager to go very low-level, but it's at least a bit better than it used to be. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-06block: Name the RQF flags enumBreno Leitao1-1/+1
Commit 5f89154e8e9e3445f9b59 ("block: Use enum to define RQF_x bit indexes") converted the RQF flags to an anonymous enum, which was a beneficial change. This patch goes one step further by naming the enum as "rqf_flags". This naming enables exporting these flags to BPF clients, eliminating the need to duplicate these flags in BPF code. Instead, BPF clients can now access the same kernel-side values through CO:RE (Compile Once, Run Everywhere), as shown in this example: rqf_stats = bpf_core_enum_value(enum rqf_flags, __RQF_STATS) Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20250306-rqf_flags-v1-1-bbd64918b406@debian.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-06bcachefs: copygc now skips non-rw devicesKent Overstreet1-13/+12
There's no point in doing copygc on non-rw devices: the fragmentation doesn't matter if we're not writing to them, and we may not have anywhere to put the data on our other devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06bcachefs: Fix bch2_dev_journal_alloc() spuriously failingKent Overstreet1-27/+32
Previously, we fixed journal resize spuriousl failing with -BCH_ERR_open_buckets_empty, but initial journal allocation was missed because it didn't invoke the "block on allocator" loop at all. Factor out the "loop on allocator" code to fix that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06x86/boot: Sanitize boot params before parsing command lineArd Biesheuvel1-0/+2
The 5-level paging code parses the command line to look for the 'no5lvl' string, and does so very early, before sanitize_boot_params() has been called and has been given the opportunity to wipe bogus data from the fields in boot_params that are not covered by struct setup_header, and are therefore supposed to be initialized to zero by the bootloader. This triggers an early boot crash when using syslinux-efi to boot a recent kernel built with CONFIG_X86_5LEVEL=y and CONFIG_EFI_STUB=n, as the 0xff padding that now fills the unused PE/COFF header is copied into boot_params by the bootloader, and interpreted as the top half of the command line pointer. Fix this by sanitizing the boot_params before use. Note that there is no harm in calling this more than once; subsequent invocations are able to spot that the boot_params have already been cleaned up. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> # v6.1+ Link: https://lore.kernel.org/r/20250306155915.342465-2-ardb+git@google.com Closes: https://lore.kernel.org/all/202503041549.35913.ulrich.gemkow@ikr.uni-stuttgart.de
2025-03-06fs/pipe: fix pipe buffer index use in FUSELinus Torvalds1-7/+6
This was another case that Rasmus pointed out where the direct access to the pipe head and tail pointers broke on 32-bit configurations due to the type changes. As with the pipe FIONREAD case, fix it by using the appropriate helper functions that deal with the right pipe index sizing. Reported-by: Rasmus Villemoes <ravi@prevas.dk> Link: https://lore.kernel.org/all/878qpi5wz4.fsf@prevas.dk/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex")Cc: Oleg > Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-06fs/pipe: do not open-code pipe head/tail logic in FIONREADLinus Torvalds1-4/+3
Rasmus points out that we do indeed have other cases of breakage from the type changes that were introduced on 32-bit targets in order to read the pipe head and tail values atomically (commit 3d252160b818: "fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex"). Fix it up by using the proper helper functions that now deal with the pipe buffer index types properly. This makes the code simpler and more obvious. The compiler does the CSE and loop hoisting of the pipe ring size masking that we used to do manually, so open-coding this was never a good idea. Reported-by: Rasmus Villemoes <ravi@prevas.dk> Link: https://lore.kernel.org/all/87cyeu5zgk.fsf@prevas.dk/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex")Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-06fs/pipe: express 'pipe_empty()' in terms of 'pipe_occupancy()'Linus Torvalds1-6/+6
That's what 'pipe_full()' does, so it's more consistent. But more importantly it gets the type limits right when the pipe head and tail are no longer necessarily 'unsigned int'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-06usb: typec: ucsi: Fix NULL pointer accessAndrei Kuchynski1-6/+7
Resources should be released only after all threads that utilize them have been destroyed. This commit ensures that resources are not released prematurely by waiting for the associated workqueue to complete before deallocating them. Cc: stable <stable@kernel.org> Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking") Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250305111739.1489003-2-akuchynski@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06usb: quirks: Add DELAY_INIT and NO_LPM for Prolific Mass Storage Card ReaderMiao Li1-0/+4
When used on Huawei hisi platforms, Prolific Mass Storage Card Reader which the VID:PID is in 067b:2731 might fail to enumerate at boot time and doesn't work well with LPM enabled, combination quirks: USB_QUIRK_DELAY_INIT + USB_QUIRK_NO_LPM fixed the problems. Signed-off-by: Miao Li <limiao@kylinos.cn> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20250304070757.139473-1-limiao870622@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06gpio: rcar: Fix missing of_node_put() callFabrizio Castro1-1/+6
of_parse_phandle_with_fixed_args() requires its caller to call into of_node_put() on the node pointer from the output structure, but such a call is currently missing. Call into of_node_put() to rectify that. Fixes: 159f8a0209af ("gpio-rcar: Add DT support") Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20250305163753.34913-2-fabrizio.castro.jz@renesas.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-06btrfs: fix a leaked chunk map issue in read_one_chunk()Haoxiang Li1-0/+1
Add btrfs_free_chunk_map() to free the memory allocated by btrfs_alloc_chunk_map() if btrfs_add_chunk_map() fails. Fixes: 7dc66abb5a47 ("btrfs: use a dedicated data structure for chunk maps") CC: stable@vger.kernel.org Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-03-06kbuild: install-extmod-build: Fix build when specifying KBUILD_OUTPUTInochi Amaoto1-1/+1
Since commit 5f73e7d0386d ("kbuild: refactor cross-compiling linux-headers package"), the linux-headers pacman package fails to build when "O=" is set. The build system complains: /mnt/chroot/linux/scripts/Makefile.build:41: mnt/chroots/linux-mainline/pacman/linux-upstream/pkg/linux-upstream-headers/usr//lib/modules/6.14.0-rc3-00350-g771dba31fffc/build/scripts/Makefile: No such file or directory This is because the "srcroot" variable is set to "." and the "build" variable is set to the absolute path. This makes the "src" variables point to wrong directory. Change the "build" variable to a relative path to "." to fix build. Fixes: 5f73e7d0386d ("kbuild: refactor cross-compiling linux-headers package") Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-03-06net: ipv6: fix missing dst ref drop in ila lwtunnelJustin Iurman1-0/+1
Add missing skb_dst_drop() to drop reference to the old dst before adding the new dst to the skb. Fixes: 79ff2fc31e0f ("ila: Cache a route to translated address") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250305081655.19032-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-06net: ipv6: fix dst ref loop in ila lwtunnelJustin Iurman1-1/+2
This patch follows commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels") and, on a second thought, the same patch is also needed for ila (even though the config that triggered the issue was pathological, but still, we don't want that to happen). Fixes: 79ff2fc31e0f ("ila: Cache a route to translated address") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250304181039.35951-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>