path: root/tools/perf/util
Age | Commit message | Author | Files | Lines
2025-10-08Merge tag 'perf-tools-for-v6.18-1-2025-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-toolsLinus Torvalds79-2477/+4349
Pull perf tools updates from Arnaldo Carvalho de Melo:

- Extended 'perf annotate' with DWARF type information (--code-with-type) integration in the TUI, including a 'T' hotkey to toggle it
- Enhanced 'perf bench mem' with new mmap() workloads and control over page/chunk sizes
- Fix 'perf stat' error handling to correctly display unsupported events
- Improved support for Clang cross-compilation
- Refactored LLVM and Capstone disasm for modularity
- Introduced the :X modifier to exclude an event from automatic regrouping
- Adjusted KVM sampling defaults to use the "cycles" event to prevent failures
- Added comprehensive support for decoding PowerPC Dispatch Trace Log (DTL)
- Updated Arm SPE tracing logic for better analysis of memory and snoop details
- Synchronized Intel PMU events and metrics with TMA 5.1 across multiple processor generations
- Converted dependencies like libperl and libtracefs to be opt-in
- Handle more Rust symbols in kallsyms ('N', debugging)
- Improve the python binding to allow for python based tools to use more of the libraries, add a 'ilist' utility to test those new bindings
- Various 'perf test' fixes
- Kan Liang no longer a perf tools reviewer

* tag 'perf-tools-for-v6.18-1-2025-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (192 commits)
  perf tools: Fix arm64 libjvmti build by generating unistd_64.h
  perf tests: Don't retest sections in "Object code reading"
  perf docs: Document building with Clang
  perf build: Support build with clang
  perf test coresight: Dismiss clang warning for unroll loop thread
  perf test coresight: Dismiss clang warning for thread loop
  perf test coresight: Dismiss clang warning for memcpy thread
  perf build: Disable thread safety analysis for perl header
  perf build: Correct CROSS_ARCH for clang
  perf python: split Clang options when invoking Popen
  tools build: Align warning options with perf
  perf disasm: Remove unused evsel from 'struct annotate_args'
  perf srcline: Fallback between addr2line implementations
  perf disasm: Make ins__scnprintf() and ins__is_nop() static
  perf dso: Clean up read_symbol() error handling
  perf dso: Support BPF programs in dso__read_symbol()
  perf dso: Move read_symbol() from llvm/capstone to dso
  perf llvm: Reduce LLVM initialization
  perf check: Add libLLVM feature
  perf parse-events: Fix parsing of >30kb event strings
  ...
2025-10-06perf build: Disable thread safety analysis for perl headerLeo Yan1-1/+1
When building with perl5, the compiler reports an error:

In file included from /usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:7933:
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/inline.h:298:5: error: mutex 'PL_env_mutex.lock' is not held on every path through here [-Werror,-Wthread-safety-analysis]
  298 | ENV_UNLOCK;
      | ^
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:7091:31: note: expanded from macro 'ENV_UNLOCK'
 7091 | # define ENV_UNLOCK PERL_REENTRANT_UNLOCK("env"...
      | ^
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:6465:7: note: expanded from macro 'PERL_REENTRANT_UNLOCK'
 6465 | } STMT_END
      | ^
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:865:28: note: expanded from macro 'STMT_END'
  865 | # define STMT_END while (0)
      | ^

The error is caused by the perl header rather than by perf code, so disable thread safety analysis when including the header. Although GCC does not support the thread safety analysis option, it silently ignores this negative warning flag.

Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-4-4305590795b2@arm.com Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Bill Wendling <morbo@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: linux-riscv@lists.infradead.org Cc: llvm@lists.linux.dev Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-06perf python: split Clang options when invoking PopenLeo Yan1-1/+4
When passing a list to subprocess.Popen, each element maps to one argv token. The current code bundles multiple Clang flags into a single element, something like:

cmd = ['clang', '--target=x86_64-linux-gnu -fintegrated-as -Wno-cast-function-type-mismatch', 'test-hello.c']

Clang therefore sees one long, invalid option instead of separate flags, and as a result the script cannot capture any log via PIPE. Fix this by using shlex.split() to separate the string so that each option becomes its own argv element. The fixed list will be:

cmd = ['clang', '--target=x86_64-linux-gnu', '-fintegrated-as', '-Wno-cast-function-type-mismatch', 'test-hello.c']

Fixes: 09e6f9f98370 ("perf python: Fix splitting CC into compiler and options") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-2-4305590795b2@arm.com Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Bill Wendling <morbo@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: linux-riscv@lists.infradead.org Cc: llvm@lists.linux.dev Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
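The argv behaviour described in this commit can be illustrated with a minimal sketch. The `cc` string and the way it is partitioned are assumptions made for the example and are not taken from the perf setup script; only `shlex.split()` and the flag values come from the commit message.

```python
import shlex

# Hypothetical CC-style value: a compiler name followed by its flags.
cc = "clang --target=x86_64-linux-gnu -fintegrated-as -Wno-cast-function-type-mismatch"

# Separate the compiler from the remaining option string.
compiler, _, options = cc.partition(" ")

# Buggy form: every flag travels inside one argv element.
bundled = [compiler, options, "test-hello.c"]

# Fixed form: shlex.split() produces one argv element per flag.
split_cmd = [compiler, *shlex.split(options), "test-hello.c"]

print(len(bundled), len(split_cmd))
```

Either list could be handed to subprocess.Popen, but only `split_cmd` presents each flag to the compiler as a separate argument.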
2025-10-06perf disasm: Remove unused evsel from 'struct annotate_args'Ian Rogers2-2/+0
Set in symbol__annotate() but never used. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-06perf srcline: Fallback between addr2line implementationsIan Rogers8-481/+485
Factor the addr2line function implementation into separate source files (addr2line.[ch]) and rename the addr2line function to cmd__addr2line. In srcline, replace the ifdef-ed addr2line implementations with one that first tries the llvm__addr2line implementation, then the deprecated libbfd__addr2line function, and on failure uses cmd__addr2line. If HAVE_LIBLLVM_SUPPORT is enabled, llvm__addr2line will execute against the libLLVM.so it is linked against. If HAVE_LIBLLVM_DYNAMIC is enabled, then libperf-llvm.so (which links against libLLVM.so) will be dlopened. If the dlopen succeeds, the behavior should match HAVE_LIBLLVM_SUPPORT. On failure, cmd__addr2line is used. The dlopen is only tried once. If HAVE_LIBLLVM_DYNAMIC isn't enabled, then llvm__addr2line immediately fails and cmd__addr2line is used. Clean up the dso__free_a2l logic, which is only needed in the non-LLVM version and is moved to addr2line.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
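The fallback order in this commit (LLVM first, then libbfd, then the external addr2line command) follows a general "try each backend until one succeeds" pattern. The sketch below shows that pattern in Python rather than perf's C. Every name in it (`resolve_srcline`, the three stand-in backends, the returned file and line) is hypothetical and is not the perf API.

```python
from typing import Callable, Optional, Sequence, Tuple

SrcLine = Tuple[str, int]                       # (file name, line number)
Backend = Callable[[str, int], Optional[SrcLine]]

def resolve_srcline(backends: Sequence[Backend], dso: str, addr: int) -> Optional[SrcLine]:
    """Return the first successful result, trying the backends in order."""
    for backend in backends:
        result = backend(dso, addr)
        if result is not None:
            return result
    return None

# Stand-in backends: the first two pretend to be unavailable.
def llvm_backend(dso: str, addr: int) -> Optional[SrcLine]:
    return None

def libbfd_backend(dso: str, addr: int) -> Optional[SrcLine]:
    return None

def cmd_backend(dso: str, addr: int) -> Optional[SrcLine]:
    return ("main.c", 42)

order = [llvm_backend, libbfd_backend, cmd_backend]
location = resolve_srcline(order, "a.out", 0x401000)
print(location)
```

Keeping the order in a list makes the preference explicit and lets a backend that is compiled out simply report failure, which mirrors the "immediately fails" behaviour the commit describes for llvm__addr2line without HAVE_LIBLLVM_DYNAMIC.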
2025-10-06perf disasm: Make ins__scnprintf() and ins__is_nop() staticIan Rogers2-6/+3
Reduce the scope of ins__scnprintf() and ins__is_nop() that aren't used outside of disasm.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-06perf dso: Clean up read_symbol() error handlingIan Rogers3-5/+16
Ensure errno is set and return to caller for error handling. Unusually for perf the value isn't negated as expected by symbol__strerror_disassemble(). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-06perf dso: Support BPF programs in dso__read_symbol()Ian Rogers4-42/+80
Set the buffer to the code in the BPF linear info. This enables BPF JIT code disassembly by LLVM and capstone. Move the common but minimal disassmble_bpf_image call to disassemble_objdump so that it is only called after falling back to the objdump option. Similarly move the disassmble_bpf function to disassemble_objdump and rename to disassmble_bpf_libbfd to make it clearer that this support relies on libbfd. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-06perf dso: Move read_symbol() from llvm/capstone to dsoIan Rogers4-128/+97
Move the read_symbol function to dso.h, make the return type const and add a mutable out_buf out parameter. In future changes this will allow a code pointer to be returned without necessarily allocating memory. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-06perf llvm: Reduce LLVM initializationIan Rogers1-12/+21
Move the 3 LLVM initialization routines to be called in a single init_llvm function that has its own bool to avoid repeated initialization. Reduce the scope of triplet and avoid copying strings for x86. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> [ Move init_llvm() under HAVE_LIBLLVM_SUPPORT to fix the build ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-03perf parse-events: Fix parsing of >30kb event stringsIan Rogers1-14/+3
Metrics may generate many event references, particularly uncore ones. The resulting event string may then be >32kb. The parse-events lexer uses "%option reject", which stores backtracking state in a buffer sized at roughly 30kb. If the event string is larger than this, a buffer overflow occurs, typically followed by a crash. "%option reject" was needed for BPF events, which were removed in commit 3d6dfae88917 ("perf parse-events: Remove BPF event support"). As "%option reject" is both a memory and performance cost, let's remove it and fix the parsing case for event strings over ~30kb. Whilst cleaning up "%option reject", make the header files accurately reflect the functions used in the code and tidy up by not requiring yywrap. Measuring on the "PMU JSON event tests", a modest reduction of 0.41% user time and 0.27% max resident size was observed. More importantly, this change fixes parsing large metrics and event strings. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-03perf record: Add ratio-to-prev termThomas Falcon7-2/+105
Provide ratio-to-prev term which allows the user to set the event sample period of two events corresponding to a desired ratio. If using on an Intel x86 platform with Auto Counter Reload support, also set corresponding event's config2 attribute with a bitmask which counters to reset and which counters to sample if the desired ratio is met or exceeded. On other platforms, only the sample period is affected by the ratio-to-prev term. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-03perf bpf-event: Use libbpf version rather than feature checkIan Rogers2-2/+5
The feature check guarding -DHAVE_LIBBPF_STRINGS_SUPPORT is unnecessary, as it is sufficient and easier to use the LIBBPF_CURRENT_VERSION_GEQ macro. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-03perf annotate: Rename TSR_KIND_POINTER to TSR_KIND_PERCPU_POINTERZecheng Li2-4/+4
TSR_KIND_POINTER only represents percpu pointers currently. Rename it to TSR_KIND_PERCPU_POINTER so we can use the TSR_KIND_POINTER to represent pointer to a type. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Zecheng Li <zecheng@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xu Liu <xliuprof@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-03perf stat: Refactor retry/skip/fatal error handlingIan Rogers2-16/+25
For the sake of Intel topdown events commit 9eac5612da1c9102 ("perf stat: Don't skip failing group events") changed 'perf stat' error handling making it so that more errors were fatal and didn't report "<not supported>" events. The change outside of topdown events was unintentional. The notion of "fatal" error handling was introduced in commit e0e6a6ca3ac211cc ("perf stat: Factor out open error handling") and refined in commits like commit cb5ef60067c11cc8 ("perf stat: Error out unsupported group leader immediately") to be an approach for avoiding later assertion failures in the code base. This change fixes those issues and removes the notion of a fatal error on an event. If all events fail to open then a fatal error occurs with the previous fatal error message. This seems to best match the notion of supported events and allowing some errors not to stop 'perf stat', while allowing the truly fatal no event case to terminate the tool early. The evsel->errored flag is only used in the stat code but always just meaning !evsel->supported although there is a comment about it being sticky. Force all evsels to be supported in evsel__init and then clear this when evsel__open fails. When an event is tried the supported is set to true again. This simplifies the notion of whether an evsel is broken. In the get_group_fd code, fail to get a group fd when the evsel isn't supported. If the leader isn't supported then it is also expected that there is no group_fd as the leader will have been skipped. Therefore change the BUG_ON test to be on supported rather than skippable. This corrects the assertion errors that were the reason for the previous fatal error handling. 
Fixes: 9eac5612da1c9102 ("perf stat: Don't skip failing group events") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20251002220727.1889799-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
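The policy described in this commit (an event that fails to open is reported as not supported, and only the case where every event fails is fatal) can be sketched roughly as below. This is a simplified model in Python, not the builtin-stat.c logic: `Event`, `open_events` and the `try_open` callback are hypothetical names, and event groups and retries are left out.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Event:
    name: str
    supported: bool = True  # every event starts out as supported

def open_events(events: List[Event], try_open: Callable[[Event], bool]) -> List[str]:
    """Try each event; fail hard only when none of them could be opened."""
    for ev in events:
        ev.supported = True          # set again each time the event is tried
        if not try_open(ev):
            ev.supported = False     # cleared when opening fails
    if not any(ev.supported for ev in events):
        raise RuntimeError("no events could be opened")
    return [ev.name if ev.supported else f"{ev.name}: <not supported>" for ev in events]

# One working event and one that the (fake) opener rejects.
report = open_events([Event("cycles"), Event("bogus")], lambda ev: ev.name != "bogus")
print(report)
```

With a single `supported` flag there is no separate "errored" state to keep in sync, which is the simplification the commit message argues for.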
2025-10-03perf stat: Move create_perf_stat_counter() to builtin-stat.cIan Rogers2-60/+0
The function create_perf_stat_counter is only used in builtin-stat.c and contains logic about retrying events specific to builtin-stat.c. Move the code to builtin-stat.c to tidy this up. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-03perf namespaces: Avoid get_current_dir_name dependencyIan Rogers4-31/+3
get_current_dir_name is a GNU extension not supported on, for example, Android. There is only one use of it so let's just switch to getcwd to avoid build and other complexity. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf capstone: Remove open_capstone_handleIan Rogers1-28/+6
open_capstone_handle is similar to capstone_init and used only by symbol__disassemble_capstone. symbol__disassemble_capstone_powerpc already uses capstone_init, transition symbol__disassemble_capstone and eliminate open_capstone_handle. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf libbfd: Move libbfd functionality to its own fileIan Rogers10-659/+717
Move symbolization and srcline libbfd dependencies to a separate libbfd.c. This mirrors moving llvm and capstone code. While this code is deprecated as it is part of BUILD_NONDISTRO license incompatible code, moving the code to its own file minimizes disruption in the main files. disasm_bpf.c is moved to libbfd.c also except for symbol__disassemble_bpf_image which is currently more of a placeholder function rather than something that provides disassembly support. demangle-cxx.cpp code isn't migrated as it is very limited. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf llvm: Move llvm functionality into its own fileIan Rogers7-311/+373
LLVM disassembly support was in disasm.c and addr2line support in srcline.c. Move support out of these files into llvm.[ch] and remove LLVM includes from those files. As disassembly routines can fail, make failure the only option without HAVE_LIBLLVM_SUPPORT. For simplicity's sake, duplicate the read_symbol utility function. The intent with moving LLVM support into a single file is that dynamic support, using dlopen for libllvm, can be added in later patches. This can potentially always succeed or fail, so relying on ifdefs isn't sufficient. Using dlopen is a useful option to minimize the perf tools dependencies and potentially size. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf capstone: Move capstone functionality into its own fileIan Rogers5-465/+569
Capstone disassembly support was split between disasm.c and print_insn.c. Move support out of these files into capstone.[ch] and remove include capstone/capstone.h from those files. As disassembly routines can fail, make failure the only option without HAVE_LIBCAPSTONE_SUPPORT. For simplicity's sake, duplicate the read_symbol utility function. The intent with moving capstone support into a single file is that dynamic support, using dlopen for libcapstone, can be added in later patches. This can potentially always succeed or fail, so relying on ifdefs isn't sufficient. Using dlopen is a useful option to minimize the perf tools dependencies and potentially size. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf map: Constify objdump offset/address conversion APIsIan Rogers2-7/+18
Make the map argument const as the conversion act won't modify the map and this allows other callers to use a const struct map. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf tools kvm: Use "cycles" to sample guest for "kvm record" on IntelDapeng Mi1-0/+10
After KVM supports PEBS for guest on Intel platforms (https://lore.kernel.org/all/20220411101946.20262-1-likexu@tencent.com/), host loses the capability to sample guest with PEBS since all PEBS related MSRs are switched to guest value after vm-entry, like IA32_DS_AREA MSR is switched to guest GVA at vm-entry. This would lead to "perf kvm record" fails to sample guest on Intel platforms since "cycles:P" event is used to sample guest by default as below case shows. sudo perf kvm record -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.787 MB perf.data.guest ] So to ensure guest record can be sampled successfully, use "cycles" instead of "cycles:P" to sample guest record by default on Intel platforms. With this patch, the guest record can be sampled successfully. sudo perf kvm record -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.783 MB perf.data.guest (23 samples) ] Fixes: cf8e55fe50df0c02 ("KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64") Reported-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Like Xu <likexu@tencent.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf tools: Add helper x86__is_intel_cpu()Dapeng Mi2-0/+24
Add helper x86__is_intel_cpu() to indicate whether perf is running on an x86 Intel platform. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02perf symbol-minimal: Be more defensive when reading build IDsIan Rogers1-1/+1
The note_data at ptr is read as a nhdr but this may yield out-of-bounds reads if there isn't nhdrs worth of data. Be more defensive before doing the reads. This is motivated by address sanitizer capturing out of bounds reads running "perf top". Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
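A minimal sketch of the kind of check described above, assuming a simplified three-word note header. The names note_hdr_sketch and read_note_hdr() are hypothetical and are not taken from symbol-minimal.c; the point is only that the remaining length is compared against the header size before any field is read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an ELF note header (three 32-bit words). */
struct note_hdr_sketch {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/*
 * Copy a note header out of buf only if a whole header is available at
 * offset off. Returns false rather than reading past the end of the buffer.
 */
static bool read_note_hdr(const unsigned char *buf, size_t len, size_t off,
			  struct note_hdr_sketch *out)
{
	if (off > len || len - off < sizeof(*out))
		return false;
	memcpy(out, buf + off, sizeof(*out));
	return true;
}
```

With a 16-byte buffer, a header at offset 4 still fits (4 + 12 = 16), while offset 5 is rejected without touching memory beyond the buffer.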
2025-10-02perf bpf: Use __builtin_preserve_field_info for GCC compatibilitySam James1-1/+1
When exploring building bpf_skel with GCC's BPF support, there was a build failure because of bpf_core_field_exists vs the mem_hops bitfield: ``` In file included from util/bpf_skel/sample_filter.bpf.c:6: util/bpf_skel/sample_filter.bpf.c: In function 'perf_get_sample': tools/perf/libbpf/include/bpf/bpf_core_read.h:169:42: error: cannot take address of bit-field 'mem_hops' 169 | #define ___bpf_field_ref1(field) (&(field)) | ^ tools/perf/libbpf/include/bpf/bpf_helpers.h:222:29: note: in expansion of macro '___bpf_field_ref1' 222 | #define ___bpf_concat(a, b) a ## b | ^ tools/perf/libbpf/include/bpf/bpf_helpers.h:225:29: note: in expansion of macro '___bpf_concat' 225 | #define ___bpf_apply(fn, n) ___bpf_concat(fn, n) | ^~~~~~~~~~~~~ tools/perf/libbpf/include/bpf/bpf_core_read.h:173:9: note: in expansion of macro '___bpf_apply' 173 | ___bpf_apply(___bpf_field_ref, ___bpf_narg(args))(args) | ^~~~~~~~~~~~ tools/perf/libbpf/include/bpf/bpf_core_read.h:188:39: note: in expansion of macro '___bpf_field_ref' 188 | __builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_EXISTS) | ^~~~~~~~~~~~~~~~ util/bpf_skel/sample_filter.bpf.c:167:29: note: in expansion of macro 'bpf_core_field_exists' 167 | if (bpf_core_field_exists(data->mem_hops)) | ^~~~~~~~~~~~~~~~~~~~~ cc1: error: argument is not a field access ``` ___bpf_field_ref1 was adapted for GCC in 12bbcf8e840f40b82b02981e96e0a5fbb0703ea9 but the trick added for compatibility in 3a8b8fc3174891c4c12f5766d82184a82d4b2e3e isn't compatible with that as an address is used as an argument. Workaround this by calling __builtin_preserve_field_info directly as the bpf_core_field_exists macro does, but without the ___bpf_field_ref use. 
Co-developed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com> Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com> Signed-off-by: Sam James <sam@gentoo.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://gcc.gnu.org/PR121420 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf bpf_counter: Fix handling of cpumap fixing hybridIan Rogers2-17/+12
Don't open evsels on all CPUs, open them just on the CPUs they support. This avoids opening say an e-core event on a p-core and getting a failure - achieve this by getting rid of the "all_cpu_map". In install_pe functions don't use the cpu_map_idx as a CPU number, translate the cpu_map_idx, which is a dense index into the cpu_map skipping holes at the beginning, to a proper CPU number. Before: ``` $ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1 Performance counter stats for 'system wide': <not supported> cpu_atom/cycles/ 566,270,672 cpu_core/cycles/ <not supported> cpu_atom/instructions/ 572,792,836 cpu_core/instructions/ # 1.01 insn per cycle 1.001595384 seconds time elapsed ``` After: ``` $ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1 Performance counter stats for 'system wide': 443,299,201 cpu_atom/cycles/ 1,233,919,737 cpu_core/cycles/ 213,634,112 cpu_atom/instructions/ # 0.48 insn per cycle 2,758,965,527 cpu_core/instructions/ # 2.24 insn per cycle 1.001699485 seconds time elapsed ``` Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: bpf@vger.kernel.org Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Tengda Wu <wutengda@huaweicloud.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
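The index-versus-CPU-number distinction above can be sketched as follows. This is an illustration only: cpu_map_sketch is a hypothetical stand-in for perf's real CPU map type, and the e-core numbering in the usage note is an assumption.

```c
#include <stddef.h>

/* Hypothetical dense CPU map: cpus[i] is the CPU number stored at index i. */
struct cpu_map_sketch {
	size_t nr;
	const int *cpus;
};

/* Translate a dense map index into a CPU number, or -1 if out of range. */
static int cpu_map_sketch__cpu(const struct cpu_map_sketch *map, size_t idx)
{
	if (idx >= map->nr)
		return -1;
	return map->cpus[idx];
}
```

If an e-core PMU covered CPUs 16-19, index 0 of its map would refer to CPU 16. Using the index itself as a CPU number would target CPU 0, which may belong to a different PMU.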
2025-10-01perf bpf_counter: Move header declarations into C codeIan Rogers5-70/+69
Reduce the API surface that is in bpf_counter.h, this helps compiler analysis like unused static function, makes it easier to set a breakpoint and just makes it easier to see the code is self contained. When code is shared between BPF C code, put it inside HAVE_BPF_SKEL. Move transitively found #includes into appropriate C files. No functional change. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Tengda Wu <wutengda@huaweicloud.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf annotate: Use architecture-agnostic register limitSuchit Karunakaran1-5/+8
Remove the arch-specific guard around TYPE_STATE_MAX_REGS and define it as 32 for all architectures. The architecture that perf is built on may not match the architecture that produced the perf.data file, so relying on __powerpc__ or similar is fragile. Using 32 as a fixed upper bound is safe since it is greater than the previous maximum of 16. Add a comment to clarify that TYPE_STATE_MAX_REGS is an arch-independent maximum rather than a build-time choice. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf script: Enable to present DTL entriesAthira Rajeev1-0/+3
The process_event() function in "builtin-script.c" invokes perf_sample__fprintf_synth() for displaying PERF_TYPE_SYNTH type events. if (attr->type == PERF_TYPE_SYNTH && PRINT_FIELD(SYNTH)) perf_sample__fprintf_synth(sample, evsel, fp); perf_sample__fprintf_synth() process the sample depending on the value in evsel->core.attr.config. Introduce perf_sample__fprintf_synth_vpadtl() and invoke this for PERF_SYNTH_POWERPC_VPA_DTL Sample output: ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.300 MB perf.data ] ./perf script perf 13322 [002] 233.835807: sched:sched_switch: perf:13322 [120] R ==> migration/2:27 [0] migration/2 27 [002] 233.835811: sched:sched_migrate_task: comm=perf pid=13322 prio=120 orig_cpu=2 dest_cpu=3 migration/2 27 [002] 233.835818: sched:sched_stat_runtime: comm=migration/2 pid=27 runtime=9214 [ns] migration/2 27 [002] 233.835819: sched:sched_switch: migration/2:27 [0] S ==> swapper/2:0 [120] swapper 0 [002] 233.835822: vpa-dtl: timebase: 338954486062657 dispatch_reason:decrementer_interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:435, ready_to_enqueue_time:0, waiting_to_ready_time:34775058, processor_id: 202 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms]) swapper 0 [001] 233.835886: vpa-dtl: timebase: 338954486095398 dispatch_reason:priv_doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:542, ready_to_enqueue_time:0, waiting_to_ready_time:1245360, processor_id: 201 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms]) Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> 
Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf powerpc: Process the DTL entries in queue and deliver samplesAthira Rajeev1-0/+175
Create samples from DTL entries for displaying in 'perf report' and 'perf script'. When the different PERF_RECORD_XX records are processed from the perf session, powerpc_vpadtl_process_event() is invoked. For each PERF_RECORD_XX record, compare the timestamp of the perf record with the timestamp of the top element in the auxtrace heap. Process the auxtrace queue if the timestamp of the element from the heap is lower than the timestamp of the perf record. A buffer may be only partially processed: if the timestamp of another event is greater than that of the element currently being processed in the queue, processing moves on to the next perf record. So keep track of the position in the buffer to continue processing next time. Update the timestamp of the auxtrace heap with the timestamp of the last processed entry from the auxtrace buffer. Generate a perf sample for each entry in the dispatch trace log. Fill in the sample details: - sample ip is picked from the srr0 field of dtl_entry - sample cpu is picked from processor_id of dtl_entry - sample id is from sample_id of powerpc_vpadtl - cpumode is set to PERF_RECORD_MISC_KERNEL - Additionally save the details in raw_data of the sample. This is to print the relevant fields in perf_sample__fprintf_synth() when called from builtin-script. The sample is processed by calling perf_session__deliver_synth_event() so that it gets included in perf report. Sample Output: ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.300 MB perf.data ] ./perf report # Samples: 321 of event 'vpa-dtl' # Event count (approx.): 321 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................. .............................. 
# 100.00% 100.00% swapper [kernel.kallsyms] [k] plpar_hcall_norets_notrace Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf powerpc: Allocate and setup aux buffer queue to help co-relate with other events across CPU'sAthira Rajeev1-4/+223
When Dispatch Trace Log data is collected along with other events, such as sched tracepoint events, it needs to be correlated and presented interleaved with those events. Perf events can be collected in parallel across CPUs, so events and DTL entries must be processed in timestamp order. An auxtrace_queue is created for each CPU. Data within each queue is in increasing timestamp order. Each auxtrace queue has an array/list of auxtrace buffers. When an auxtrace buffer is processed, its data is mmapped. All auxtrace queues are maintained in the auxtrace heap. Each queue has a queue number and a timestamp. The queues are sorted in the heap by timestamp, so the lowest timestamp (the entries to be processed first) is always on top of the heap. The auxtrace queue needs to be allocated and the heap needs to be populated in sorted timestamp order. The queue needs to be filled with data only once, via the powerpc_vpadtl__update_queues() function. powerpc_vpadtl__setup_queues() iterates through all the entries to allocate and set up the auxtrace queue. To add a queue to the auxtrace heap, the timebase of the first entry of each queue has to be fetched. The first entry in the queue for the VPA DTL PMU holds the boot timebase and frequency, which are needed to compute the timestamp used to correlate with other events. The very next entry is the actual trace data, which provides the timestamp at which the DTL event occurred. The formula used to get the timestamp from a DTL entry is: ((timebase from DTL entry - boot time) / frequency) * 1000000000 powerpc_vpadtl_decode() adds the boot time and frequency to the powerpc_vpadtl_queue structure so that they can be reused. Each dtl_entry is 48 bytes in size. A buffer may be only partially processed (if the timestamp of another event is greater than that of the element currently being processed in the queue, processing moves on to the next event). 
In order to keep track of the position in the buffer, additional fields are added to the powerpc_vpadtl_queue structure. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
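The timebase-to-timestamp formula quoted in this commit can be sketched as below. This is an illustration, not the code from powerpc-vpadtl.c: the function name dtl_tb_to_ns() is hypothetical, and doing the division in floating point (to avoid losing the sub-second part) is an assumption, since the message does not say how the division is carried out.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000.0

/*
 * ((timebase from DTL entry - boot time) / frequency) * 1000000000
 * Returns nanoseconds since boot, or 0 for inputs that make no sense.
 */
static uint64_t dtl_tb_to_ns(uint64_t tb, uint64_t boot_tb, uint64_t tb_freq)
{
	double secs;

	if (tb_freq == 0 || tb < boot_tb)
		return 0;
	/* Assumption: divide as double so fractions of a second survive. */
	secs = (double)(tb - boot_tb) / (double)tb_freq;
	return (uint64_t)(secs * NSEC_PER_SEC_SKETCH);
}
```

With a timebase frequency of 512000000, an entry whose timebase is 256000000 ticks past the boot timebase maps to 500000000 ns, that is, half a second after boot.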
2025-10-01perf powerpc: Add event name as vpa-dtl of PERF_TYPE_SYNTH type to present DTL samplesAthira Rajeev2-0/+77
Dispatch Trace Log details are captured as-is in PERF_RECORD_AUXTRACE records. To present DTL entries as samples, create an event named "vpa-dtl" of type PERF_TYPE_SYNTH. Add the perf_synth_id "PERF_SYNTH_POWERPC_VPA_DTL" as the config value for the event. Create a sample id at a fixed offset from the evsel id. To present the relevant fields from "struct dtl_entry", prepare the entries as events of type PERF_TYPE_SYNTH. Because they are defined as PERF_TYPE_SYNTH, the samples can be printed by perf_sample__fprintf_synth() in builtin-script.c. From powerpc_vpadtl_process_auxtrace_info(), invoke auxtrace_queues__process_index(), which queues the auxtrace buffers by invoking auxtrace_queues__add_event(). Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf powerpc: Process auxtrace events and display in 'perf report -D'Athira Rajeev5-0/+291
Add VPA DTL PMU auxtrace process function for "perf report -D". The auxtrace event processing functions are defined in file "util/powerpc-vpadtl.c". Data structures used includes "struct powerpc_vpadtl_queue", "struct powerpc_vpadtl" to store the auxtrace buffers in queue. Different PERF_RECORD_XXX are generated during recording. PERF_RECORD_AUXTRACE_INFO is processed first since it is of type perf_user_event_type and perf session event delivers perf_session__process_user_event() first. Define function powerpc_vpadtl_process_auxtrace_info() to handle the processing of PERF_RECORD_AUXTRACE_INFO records. In this function, initialize the aux buffer queues using auxtrace_queues__init(). Setup the required infrastructure for aux data processing. The data is collected per CPU and auxtrace_queue is created for each CPU. Define powerpc_vpadtl_process_event() function to process PERF_RECORD_AUXTRACE records. In this, add the event to queue using auxtrace_queues__add_event() and process the buffer in powerpc_vpadtl_dump_event(). The first entry in the buffer with timebase as zero has boot timebase and frequency. Remaining data is of format for "struct powerpc_vpadtl_entry". Define the translation for dispatch_reasons and preempt_reasons, report this when dump trace is invoked via powerpc_vpadtl_dump() Sample output: ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.300 MB perf.data ] ./perf report -D 0 0 0x39b10 [0x30]: PERF_RECORD_AUXTRACE size: 0x690 offset: 0 ref: 0 idx: 0 tid: -1 cpu: 0 . . ... VPA DTL PMU data: size 1680 bytes, entries is 35 . 00000000: boot_tb: 21349649546353231, tb_freq: 512000000 . 00000030: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:7064, ready_to_enqueue_time:187, waiting_to_ready_time:6611773 . 
00000060: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:146, ready_to_enqueue_time:0, waiting_to_ready_time:15359437 . 00000090: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:4868, ready_to_enqueue_time:232, waiting_to_ready_time:5100709 . 000000c0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:179, ready_to_enqueue_time:0, waiting_to_ready_time:30714243 . 000000f0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:197, ready_to_enqueue_time:0, waiting_to_ready_time:15350648 . 00000120: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:213, ready_to_enqueue_time:0, waiting_to_ready_time:15353446 . 00000150: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:212, ready_to_enqueue_time:0, waiting_to_ready_time:15355126 . 00000180: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:6368, ready_to_enqueue_time:164, waiting_to_ready_time:5104665 Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
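The numbers in the dump above are consistent with a fixed 0x30 (48-byte) stride per entry: 1680 bytes / 48 bytes = 35 entries, with the boot timebase record at offset 0. A small sketch of that arithmetic follows; the macro and function names are hypothetical, and the 48-byte size is read off the offsets in the sample output rather than from the source file.

```c
#include <stddef.h>

/* Assumed entry size, matching the 0x30 offset stride in the dump. */
#define DTL_ENTRY_SIZE_SKETCH 48

/* Number of whole entries that fit in an auxtrace buffer of buf_size bytes. */
static size_t dtl_nr_entries(size_t buf_size)
{
	return buf_size / DTL_ENTRY_SIZE_SKETCH;
}

/* Byte offset of entry idx within the buffer (entry 0 holds boot_tb/tb_freq). */
static size_t dtl_entry_offset(size_t idx)
{
	return idx * DTL_ENTRY_SIZE_SKETCH;
}
```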
2025-10-01perf powerpc: Add basic CONFIG_AUXTRACE support for VPA pmu on powerpcAthira Rajeev3-0/+18
The powerpc PMU collecting Dispatch Trace Log (DTL) entries makes use of AUX support in perf infrastructure. The PMU driver has the functionality to collect trace entries in the aux buffer. On the tools side, this data is made available as PERF_RECORD_AUXTRACE records. This record is generated by "perf record" command. To enable the creation of PERF_RECORD_AUXTRACE, add functions to initialize auxtrace records ie "auxtrace_record__init()". Fill in fields for other callbacks like info_priv_size, info_fill, free, recording options etc. Define auxtrace_type as PERF_AUXTRACE_VPA_DTL. Add header file to define vpa dtl pmu specific details. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf tools: Fix duplicated words in documentation and commentsMarkus Heidelberg1-1/+1
- "the the" - "in in" - "a a" Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Markus Heidelberg <m.heidelberg@cab.de> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-30Merge tag 'x86_misc_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull x86 instruction decoder update from Borislav Petkov: - Add instruction decoding support for the XOP-prefixed instruction set present on the AMD Bulldozer uarch [ These instructions don't normally happen, but a X86_NATIVE_CPU build on a bulldozer host can make the compiler then use these unusual instruction encodings ] * tag 'x86_misc_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/insn: Add XOP prefix instructions decoder support
2025-09-30perf bpf: Check libbpf version to use btf_dump_type_data_opts.emit_stringsArnaldo Carvalho de Melo1-0/+2
When building perf with LIBBPF_DYNAMIC=1 on a fedora system with libbpf-devel 1.5 it was breaking with: util/bpf-event.c: In function ‘format_btf_variable’: util/bpf-event.c:291:18: error: ‘const struct btf_dump_type_data_opts’ has no member named ‘emit_strings’ 291 | .emit_strings = 1, | ^~~~~~~~~~~~ util/bpf-event.c:291:33: error: initialized field overwritten [-Werror=override-init] 291 | .emit_strings = 1, | ^ util/bpf-event.c:291:33: note: (near initialization for ‘opts.skip_names’) Check the version before using that feature. Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-30perf bpf: Move the LIBBPF_CURRENT_VERSION_GEQ macro to bpf-utils.hArnaldo Carvalho de Melo2-4/+6
We need it to fix some other libbpf version dependent issues when building with LIBBPF_DYNAMIC=1. Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-30perf bpf-filter: Fix opts declaration on older libbpfsIan Rogers1-0/+8
Building perf with LIBBPF_DYNAMIC (ie not the default static linking of libbpf with perf) is breaking as the libbpf isn't version 1.7 or newer, where dont_enable is added to bpf_perf_event_opts. To avoid this breakage add a compile time version check and don't declare the variable when not present. Fixes: 5e2ac8e8571df54d ("perf bpf-filter: Enable events manually") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: bpf@vger.kernel.org Cc: Hao Ge <gehao@kylinos.cn> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
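A sketch of what a compile-time version gate of this kind can look like. All SKETCH_* names are hypothetical stand-ins: a real build takes the major/minor version from libbpf's headers, and the 1.5 value here simply mirrors the libbpf-devel version mentioned in this series.

```c
/* Hypothetical stand-ins for the version macros a libbpf header provides. */
#define SKETCH_LIBBPF_MAJOR 1
#define SKETCH_LIBBPF_MINOR 5

/* True when the library being built against is at least major.minor. */
#define SKETCH_VERSION_GEQ(major, minor)				\
	(SKETCH_LIBBPF_MAJOR > (major) ||				\
	 (SKETCH_LIBBPF_MAJOR == (major) && SKETCH_LIBBPF_MINOR >= (minor)))

/* Only reference the newer field when the headers are new enough. */
static int have_dont_enable(void)
{
#if SKETCH_VERSION_GEQ(1, 7)
	return 1;	/* field exists, the opts variable may be declared */
#else
	return 0;	/* older library, skip the declaration entirely */
#endif
}
```

Because the check is evaluated by the preprocessor, code that names a missing struct member is never seen by the compiler on older libraries.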
2025-09-29Merge tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds1-1/+1
Pull hardening updates from Kees Cook: "One notable addition is the creation of the 'transitional' keyword for kconfig so CONFIG renaming can go more smoothly. This has been a long-standing deficiency, and with the renaming of CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI support), this came up again. The breadth of the diffstat is mainly this renaming. - Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva) - lkdtm: fortify: Fix potential NULL dereference on kmalloc failure (Junjie Cao) - Add str_assert_deassert() helper (Lad Prabhakar) - gcc-plugins: Remove TODO_verify_il for GCC >= 16 - kconfig: Fix BrokenPipeError warnings in selftests - kconfig: Add transitional symbol attribute for migration support - kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI" * tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lib/string_choices: Add str_assert_deassert() helper kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI kconfig: Add transitional symbol attribute for migration support kconfig: Fix BrokenPipeError warnings in selftests gcc-plugins: Remove TODO_verify_il for GCC >= 16 stddef: Introduce __TRAILING_OVERLAP() stddef: Remove token-pasting in TRAILING_OVERLAP() lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
2025-09-24kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFIKees Cook1-1/+1
The kernel's CFI implementation uses the KCFI ABI specifically, and is not strictly tied to a particular compiler. In preparation for GCC supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with associated options). Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will enable CONFIG_CFI during olddefconfig. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-19perf build-id: Ensure snprintf string is empty when size is 0Ian Rogers1-0/+7
The string result of build_id__snprintf() is unconditionally used in places like dsos__fprintf_buildid_cb(). If the build id has size 0 then this creates a use of uninitialized memory. Add null termination for the size 0 case. A similar fix was written by Jiri Olsa in commit 6311951d4f8f28c4 ("perf tools: Initialize output buffer in build_id__sprintf") but lost in the transition to snprintf. Fixes: fccaaf6fbbc59910 ("perf build-id: Change sprintf functions to snprintf") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
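The size 0 case can be illustrated with a simplified formatter. This is not the actual build_id__snprintf(); the function name and signature below are hypothetical, and the sketch only shows why terminating the buffer up front makes the result safe to print when no bytes are formatted.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format size bytes of data as lowercase hex into bf. The buffer is
 * NUL-terminated before the loop, so it holds an empty string when
 * size is 0 and the loop body never runs.
 */
static int build_id_snprintf_sketch(const unsigned char *data, size_t size,
				    char *bf, size_t bf_size)
{
	size_t offs = 0;

	if (bf_size == 0)
		return 0;
	bf[0] = '\0';
	/* Stop while there is still room for two hex digits plus the NUL. */
	for (size_t i = 0; i < size && offs + 2 < bf_size; i++)
		offs += (size_t)snprintf(bf + offs, bf_size - offs, "%02x",
					 (unsigned int)data[i]);
	return (int)offs;
}
```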
2025-09-19perf evsel: Ensure the fallback message is always written toIan Rogers1-5/+7
The fallback message is unconditionally printed in places like record__open(). If no fallback is attempted this can lead to printing uninitialized data, crashes, etc. Fixes: c0a54341c0e89333 ("perf evsel: Introduce event fallback method") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19perf evsel: Fix uniquification when PMU given without suffixIan Rogers1-10/+18
The PMU name is appearing twice in: ``` $ perf stat -e uncore_imc_free_running/data_total/ -A true Performance counter stats for 'system wide': CPU0 1.57 MiB uncore_imc_free_running_0/uncore_imc_free_running,data_total/ CPU0 1.58 MiB uncore_imc_free_running_1/uncore_imc_free_running,data_total/ 0.000892376 seconds time elapsed ``` Use the pmu_name_len_no_suffix to avoid this problem. Committer testing: After this patch: root@x1:~# perf stat -e uncore_imc_free_running/data_total/ -A true Performance counter stats for 'system wide': CPU0 1.69 MiB uncore_imc_free_running_0/data_total/ CPU0 1.68 MiB uncore_imc_free_running_1/data_total/ 0.002141605 seconds time elapsed root@x1:~# Fixes: 7d45f402d3117e0b ("perf evlist: Make uniquifying counter names consistent") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19 | perf session: Fix handling when buffer exceeds 2 GiB | Leo Yan | 1 | -1/+1
If a user specifies an AUX buffer larger than 2 GiB, the returned size may exceed 0x80000000. Since the err variable is defined as a signed 32-bit integer, such a value overflows and becomes negative.

As a result, the perf record command reports an error:

    0x146e8 [0x30]: failed to process type: 71 [Unknown error 183711232]

Change the type of the err variable to a signed 64-bit integer to accommodate large buffer sizes correctly.

Fixes: d5652d865ea734a1 ("perf session: Add ability to skip 4GiB or more")
Reported-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20250808-perf_fix_big_buffer_size-v1-1-45f45444a9a4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
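
The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the perf code: toy_consume_aux() is a hypothetical stand-in for a handler that returns the number of bytes consumed (non-negative) or a negative error. It assumes the usual GCC/Clang behaviour where an out-of-range conversion to a signed 32-bit integer wraps modulo 2^32, which the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handler: returns the size consumed, never a negative error here. */
int64_t toy_consume_aux(uint64_t size)
{
	assert(size <= (uint64_t)INT64_MAX);
	return (int64_t)size;
}

/* Before: the result is narrowed to 32 bits, so sizes >= 0x80000000 look negative. */
int toy_looks_like_error_s32(uint64_t size)
{
	int32_t err = (int32_t)toy_consume_aux(size);

	return err < 0;
}

/* After: a signed 64-bit err keeps large sizes positive. */
int toy_looks_like_error_s64(uint64_t size)
{
	int64_t err = toy_consume_aux(size);

	return err < 0;
}
```

With a size of 0x80000000, the 32-bit variant reports a spurious error while the 64-bit variant does not; sizes up to 0x7fffffff behave the same in both.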
2025-09-19 | Merge remote-tracking branch 'torvalds/master' into perf-tools-next | Arnaldo Carvalho de Melo | 1 | -4/+5
To pick up the latest perf-tools batch sent by Namhyung Kim for v6.17-rc7.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19 | perf tools: Remove a pointless check | Namhyung Kim | 1 | -3/+0
Static analyser cppcheck says:

    linux-6.16/tools/perf/util/tool_pmu.c:242:15: warning: Opposite inner 'if' condition leads to a dead code block. [oppositeInnerCondition]

Source code is:

    for (thread = 0; thread < nthreads; thread++) {
        if (thread >= nthreads)
            break;

Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-19 | perf dwarf-aux: Fix __die_find_scope_cb() for namespaces | Zecheng Li | 1 | -0/+9
Currently __die_find_scope_cb() goes to check siblings when the DIE doesn't include the given PC. However namespaces don't have a PC and could contain children that have that PC. When we encounter a namespace, we should check both its children and siblings.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Link: https://lore.kernel.org/r/20250825195817.226560-1-zecheng@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
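
The traversal rule described above can be sketched on a toy tree. This is an illustration under stated assumptions, not the libdw-based callback in dwarf-aux.c: struct toy_die, its tags and toy_find_scope() are hypothetical names invented for this example. A DIE whose PC range misses the target is skipped along with its subtree, except for a namespace, which has no PC range of its own and therefore has its children searched before moving on to its siblings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified DIE: not the libdw Dwarf_Die. */
enum toy_tag { TOY_NAMESPACE, TOY_SUBPROGRAM, TOY_LEXICAL_BLOCK };

struct toy_die {
	enum toy_tag tag;
	uint64_t low_pc, high_pc;	/* [low_pc, high_pc); both 0 for namespaces */
	const struct toy_die *child;	/* first child, or NULL */
	const struct toy_die *sibling;	/* next sibling, or NULL */
};

static int toy_has_pc(const struct toy_die *die, uint64_t pc)
{
	assert(die->low_pc <= die->high_pc);
	return die->low_pc <= pc && pc < die->high_pc;
}

/* Return the innermost DIE covering pc in this sibling chain, or NULL. */
const struct toy_die *toy_find_scope(const struct toy_die *die, uint64_t pc)
{
	for (; die; die = die->sibling) {
		const struct toy_die *inner;

		if (die->tag == TOY_NAMESPACE) {
			/* No PC range to test: look inside, then keep walking siblings. */
			inner = toy_find_scope(die->child, pc);
			if (inner)
				return inner;
			continue;
		}

		if (!toy_has_pc(die, pc))
			continue;	/* skip this subtree, try the next sibling */

		inner = toy_find_scope(die->child, pc);
		return inner ? inner : die;
	}
	return NULL;
}
```

In this sketch, a function nested in a namespace is found even though the namespace itself covers no addresses, and a function that follows the namespace as a sibling is still reached when the namespace's children do not match.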
2025-09-19 | perf dwarf-aux: Better variable collection for insn tracking | Zecheng Li | 3 | -1/+21
Use the previously added is_breg_access_indirect() function to determine whether the register + offset stores the variable itself or the struct it points to, and save the result in die_var_type.is_reg_var_addr.

Since we are storing the real types in the stack state, we need to do a type dereference when is_reg_var_addr is set to false for stack/frame registers.

For other gp registers, skip the variable when the register is a pointer to the type. To accept these variables, is_reg_var_addr could be used in a different way: the register would need to be marked as a pointer to the type.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>