| Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Extended 'perf annotate' with DWARF type information
(--code-with-type) integration in the TUI, including a 'T'
hotkey to toggle it
- Enhanced 'perf bench mem' with new mmap() workloads and control
over page/chunk sizes
- Fix 'perf stat' error handling to correctly display unsupported
events
- Improved support for Clang cross-compilation
- Refactored LLVM and Capstone disasm for modularity
- Introduced the :X modifier to exclude an event from automatic
regrouping
- Adjusted KVM sampling defaults to use the "cycles" event to prevent
failures
- Added comprehensive support for decoding PowerPC Dispatch Trace Log
(DTL)
- Updated Arm SPE tracing logic for better analysis of memory and snoop
details
- Synchronized Intel PMU events and metrics with TMA 5.1 across
multiple processor generations
- Converted dependencies like libperl and libtracefs to be opt-in
- Handle more Rust symbols in kallsyms ('N', debugging)
- Improve the python binding to allow for python based tools to use
more of the libraries, add a 'ilist' utility to test those new
bindings
- Various 'perf test' fixes
- Kan Liang no longer a perf tools reviewer
* tag 'perf-tools-for-v6.18-1-2025-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (192 commits)
perf tools: Fix arm64 libjvmti build by generating unistd_64.h
perf tests: Don't retest sections in "Object code reading"
perf docs: Document building with Clang
perf build: Support build with clang
perf test coresight: Dismiss clang warning for unroll loop thread
perf test coresight: Dismiss clang warning for thread loop
perf test coresight: Dismiss clang warning for memcpy thread
perf build: Disable thread safety analysis for perl header
perf build: Correct CROSS_ARCH for clang
perf python: split Clang options when invoking Popen
tools build: Align warning options with perf
perf disasm: Remove unused evsel from 'struct annotate_args'
perf srcline: Fallback between addr2line implementations
perf disasm: Make ins__scnprintf() and ins__is_nop() static
perf dso: Clean up read_symbol() error handling
perf dso: Support BPF programs in dso__read_symbol()
perf dso: Move read_symbol() from llvm/capstone to dso
perf llvm: Reduce LLVM initialization
perf check: Add libLLVM feature
perf parse-events: Fix parsing of >30kb event strings
...
|
|
When build with perl5, it reports error:
In file included from /usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:7933:
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/inline.h:298:5: error:
mutex 'PL_env_mutex.lock' is not held on every path through
here [-Werror,-Wthread-safety-analysis]
298 | ENV_UNLOCK;
| ^
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:7091:31: note:
expanded from macro 'ENV_UNLOCK'
7091 | # define ENV_UNLOCK PERL_REENTRANT_UNLOCK("env"...
| ^
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:6465:7: note:
expanded from macro 'PERL_REENTRANT_UNLOCK'
6465 | } STMT_END
| ^
/usr/lib/perl5/5.42.0/x86_64-linux-thread-multi/CORE/perl.h:865:28: note:
expanded from macro 'STMT_END'
865 | # define STMT_END while (0)
| ^
The error is caused by perl header but not perf code, disable thread
safety analysis if including the header.
Though GCC does not support the thread safety analysis option, this
negative warning flag is silently ignored by it.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-4-4305590795b2@arm.com
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: linux-riscv@lists.infradead.org
Cc: llvm@lists.linux.dev
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When passing a list to subprocess.Popen, each element maps to one argv
token. Current code bundles multiple Clang flags into a single element,
something like:
cmd = ['clang',
'--target=x86_64-linux-gnu -fintegrated-as -Wno-cast-function-type-mismatch',
'test-hello.c']
So Clang only sees one long, invalid option instead of separate flags,
as a result, the script cannot capture any log via PIPE.
Fix this by using shlex.split() to separate the string so each option
becomes its own argv element. The fixed list will be:
cmd = ['clang',
'--target=x86_64-linux-gnu',
'-fintegrated-as',
'-Wno-cast-function-type-mismatch',
'test-hello.c']
Fixes: 09e6f9f98370 ("perf python: Fix splitting CC into compiler and options")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-2-4305590795b2@arm.com
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: linux-riscv@lists.infradead.org
Cc: llvm@lists.linux.dev
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Set in symbol__annotate() but never used.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Factor the addr2line function implementation into separate source files
(addr2line.[ch]) and rename the addr2line function cmd__addr2line. In
srcline replace the ifdef-ed addr2line implementations with one that
first tries the llvm__addr2line implementation, then the deprecated
libbfd__addr2line function and on failure uses cmd__addr2line.
If HAVE_LIBLLVM_SUPPORT is enabled the llvm__addr2line will execute
against the libLLVM.so it is linked against.
If HAVE_LIBLLVM_DYNAMIC is enabled then libperf-llvm.so (that links
against libLLVM.so) will be dlopened. If the dlopen succeeds then the
behavior should match HAVE_LIBLLVM_SUPPORT. On failure cmd__addr2line is
used. The dlopen is only tried once.
If HAVE_LIBLLVM_DYNAMIC isn't enabled then llvm__addr2line immediately
fails and cmd__addr2line is used.
Clean up the dso__free_a2l logic, which is only needed in the non-LLVM
version and moved to addr2line.c.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reduce the scope of ins__scnprintf() and ins__is_nop() that aren't used
outside of disasm.c.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Ensure errno is set and return to caller for error handling.
Unusually for perf the value isn't negated as expected by
symbol__strerror_disassemble().
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Set the buffer to the code in the BPF linear info. This enables BPF
JIT code disassembly by LLVM and capstone.
Move the common but minimal disassmble_bpf_image call to
disassemble_objdump so that it is only called after falling back to the
objdump option.
Similarly move the disassmble_bpf function to disassemble_objdump and
rename to disassmble_bpf_libbfd to make it clearer that this support
relies on libbfd.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the read_symbol function to dso.h, make the return type const and
add a mutable out_buf out parameter.
In future changes this will allow a code pointer to be returned without
necessary allocating memory.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the 3 LLVM initialization routines to be called in a single
init_llvm function that has its own bool to avoid repeated
initialization.
Reduce the scope of triplet and avoid copying strings for x86.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
[ Move init_llvm() under HAVE_LIBLLVM_SUPPORT to fix the build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Metrics may generate many particularly uncore event references. The
resulting event string may then be >32kb. The parse events lex is
using "%option reject" which stores backtracking state in a buffer
sized at roughtly 30kb. If the event string is larger than this then a
buffer overflow and typically a crash happens.
The need for "%option reject" was for BPF events which were removed in
commit 3d6dfae88917 ("perf parse-events: Remove BPF event
support"). As "%option reject" is both a memory and performance cost
let's remove it and fix the parsing case for event strings being over
~30kb.
Whilst cleaning up "%option reject" make the header files accurately
reflect functions used in the code and tidy up not requiring yywrap.
Measuring on the "PMU JSON event tests" a modest reduction of 0.41%
user time and 0.27% max resident size was observed. More importantly
this change fixes parsing large metrics and event strings.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Provide ratio-to-prev term which allows the user to
set the event sample period of two events corresponding
to a desired ratio.
If using on an Intel x86 platform with Auto Counter Reload support, also
set corresponding event's config2 attribute with a bitmask which
counters to reset and which counters to sample if the desired ratio is
met or exceeded.
On other platforms, only the sample period is affected by the
ratio-to-prev term.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The feature check guarded the -DHAVE_LIBBPF_STRINGS_SUPPORT is
unnecessary as it is sufficient and easier to use the
LIBBPF_CURRENT_VERSION_GEQ macro.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
TSR_KIND_POINTER only represents percpu pointers currently. Rename it to
TSR_KIND_PERCPU_POINTER so we can use the TSR_KIND_POINTER to represent
pointer to a type.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For the sake of Intel topdown events commit 9eac5612da1c9102 ("perf
stat: Don't skip failing group events") changed 'perf stat' error
handling making it so that more errors were fatal and didn't report
"<not supported>" events. The change outside of topdown events was
unintentional.
The notion of "fatal" error handling was introduced in commit
e0e6a6ca3ac211cc ("perf stat: Factor out open error handling") and
refined in commits like commit cb5ef60067c11cc8 ("perf stat: Error out
unsupported group leader immediately") to be an approach for avoiding
later assertion failures in the code base.
This change fixes those issues and removes the notion of a fatal error
on an event. If all events fail to open then a fatal error occurs with
the previous fatal error message. This seems to best match the notion of
supported events and allowing some errors not to stop 'perf stat', while
allowing the truly fatal no event case to terminate the tool early.
The evsel->errored flag is only used in the stat code but always just
meaning !evsel->supported although there is a comment about it being
sticky. Force all evsels to be supported in evsel__init and then clear
this when evsel__open fails. When an event is tried the supported is
set to true again. This simplifies the notion of whether an evsel is
broken.
In the get_group_fd code, fail to get a group fd when the evsel isn't
supported. If the leader isn't supported then it is also expected that
there is no group_fd as the leader will have been skipped. Therefore
change the BUG_ON test to be on supported rather than skippable. This
corrects the assertion errors that were the reason for the previous
fatal error handling.
Fixes: 9eac5612da1c9102 ("perf stat: Don't skip failing group events")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20251002220727.1889799-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The function create_perf_stat_counter is only used in builtin-stat.c
and contains logic about retrying events specific to
builtin-stat.c.
Move the code to builtin-stat.c to tidy this up.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
get_current_dir_name is a GNU extension not supported on, for example,
Android. There is only one use of it so let's just switch to getcwd to
avoid build and other complexity.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
open_capstone_handle is similar to capstone_init and used only by
symbol__disassemble_capstone. symbol__disassemble_capstone_powerpc
already uses capstone_init, transition symbol__disassemble_capstone
and eliminate open_capstone_handle.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move symbolization and srcline libbfd dependencies to a separate
libbfd.c. This mirrors moving llvm and capstone code. While this code
is deprecated as it is part of BUILD_NONDISTRO license incompatible
code, moving the code to its own file minimizes disruption in the main
files.
disasm_bpf.c is moved to libbfd.c also except for
symbol__disassemble_bpf_image which is currently more of a placeholder
function rather than something that provides disassembly support.
demangle-cxx.cpp code isn't migrated as it is very limited.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
LLVM disassembly support was in disasm.c and addr2line support in
srcline.c. Move support out of these files into llvm.[ch] and remove
LLVM includes from those files. As disassembly routines can fail, make
failure the only option without HAVE_LIBLLVM_SUPPORT. For simplicity's
sake, duplicate the read_symbol utility function.
The intent with moving LLVM support into a single file is that dynamic
support, using dlopen for libllvm, can be added in later patches. This
can potentially always succeed or fail, so relying on ifdefs isn't
sufficient. Using dlopen is a useful option to minimize the perf tools
dependencies and potentially size.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Capstone disassembly support was split between disasm.c and
print_insn.c. Move support out of these files into capstone.[ch] and
remove include capstone/capstone.h from those files. As disassembly
routines can fail, make failure the only option without
HAVE_LIBCAPSTONE_SUPPORT. For simplicity's sake, duplicate the
read_symbol utility function.
The intent with moving capstone support into a single file is that
dynamic support, using dlopen for libcapstone, can be added in later
patches. This can potentially always succeed or fail, so relying on
ifdefs isn't sufficient. Using dlopen is a useful option to minimize
the perf tools dependencies and potentially size.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make the map argument const as the conversion act won't modify the map
and this allows other callers to use a const struct map.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
After KVM supports PEBS for guest on Intel platforms
(https://lore.kernel.org/all/20220411101946.20262-1-likexu@tencent.com/),
host loses the capability to sample guest with PEBS since all PEBS related
MSRs are switched to guest value after vm-entry, like IA32_DS_AREA MSR is
switched to guest GVA at vm-entry. This would lead to "perf kvm record"
fails to sample guest on Intel platforms since "cycles:P" event is used to
sample guest by default as below case shows.
sudo perf kvm record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data.guest ]
So to ensure guest record can be sampled successfully, use "cycles"
instead of "cycles:P" to sample guest record by default on Intel
platforms. With this patch, the guest record can be sampled
successfully.
sudo perf kvm record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.783 MB perf.data.guest (23 samples) ]
Fixes: cf8e55fe50df0c02 ("KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64")
Reported-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add helper x86__is_intel_cpu() to indicate if it's a x86 intel platform.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The note_data at ptr is read as a nhdr but this may yield
out-of-bounds reads if there isn't nhdrs worth of data.
Be more defensive before doing the reads.
This is motivated by address sanitizer capturing out of bounds reads
running "perf top".
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When exploring building bpf_skel with GCC's BPF support, there was a
build failure because of bpf_core_field_exists vs the mem_hops bitfield:
```
In file included from util/bpf_skel/sample_filter.bpf.c:6:
util/bpf_skel/sample_filter.bpf.c: In function 'perf_get_sample':
tools/perf/libbpf/include/bpf/bpf_core_read.h:169:42: error: cannot take address of bit-field 'mem_hops'
169 | #define ___bpf_field_ref1(field) (&(field))
| ^
tools/perf/libbpf/include/bpf/bpf_helpers.h:222:29: note: in expansion of macro '___bpf_field_ref1'
222 | #define ___bpf_concat(a, b) a ## b
| ^
tools/perf/libbpf/include/bpf/bpf_helpers.h:225:29: note: in expansion of macro '___bpf_concat'
225 | #define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
| ^~~~~~~~~~~~~
tools/perf/libbpf/include/bpf/bpf_core_read.h:173:9: note: in expansion of macro '___bpf_apply'
173 | ___bpf_apply(___bpf_field_ref, ___bpf_narg(args))(args)
| ^~~~~~~~~~~~
tools/perf/libbpf/include/bpf/bpf_core_read.h:188:39: note: in expansion of macro '___bpf_field_ref'
188 | __builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_EXISTS)
| ^~~~~~~~~~~~~~~~
util/bpf_skel/sample_filter.bpf.c:167:29: note: in expansion of macro 'bpf_core_field_exists'
167 | if (bpf_core_field_exists(data->mem_hops))
| ^~~~~~~~~~~~~~~~~~~~~
cc1: error: argument is not a field access
```
___bpf_field_ref1 was adapted for GCC in 12bbcf8e840f40b82b02981e96e0a5fbb0703ea9
but the trick added for compatibility in 3a8b8fc3174891c4c12f5766d82184a82d4b2e3e
isn't compatible with that as an address is used as an argument.
Workaround this by calling __builtin_preserve_field_info directly as the
bpf_core_field_exists macro does, but without the ___bpf_field_ref use.
Co-developed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Signed-off-by: Sam James <sam@gentoo.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://gcc.gnu.org/PR121420
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Don't open evsels on all CPUs, open them just on the CPUs they
support. This avoids opening say an e-core event on a p-core and
getting a failure - achieve this by getting rid of the "all_cpu_map".
In install_pe functions don't use the cpu_map_idx as a CPU number,
translate the cpu_map_idx, which is a dense index into the cpu_map
skipping holes at the beginning, to a proper CPU number.
Before:
```
$ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1
Performance counter stats for 'system wide':
<not supported> cpu_atom/cycles/
566,270,672 cpu_core/cycles/
<not supported> cpu_atom/instructions/
572,792,836 cpu_core/instructions/ # 1.01 insn per cycle
1.001595384 seconds time elapsed
```
After:
```
$ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1
Performance counter stats for 'system wide':
443,299,201 cpu_atom/cycles/
1,233,919,737 cpu_core/cycles/
213,634,112 cpu_atom/instructions/ # 0.48 insn per cycle
2,758,965,527 cpu_core/instructions/ # 2.24 insn per cycle
1.001699485 seconds time elapsed
```
Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: bpf@vger.kernel.org
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reduce the API surface that is in bpf_counter.h, this helps compiler
analysis like unused static function, makes it easier to set a
breakpoint and just makes it easier to see the code is self contained.
When code is shared between BPF C code, put it inside HAVE_BPF_SKEL.
Move transitively found #includes into appropriate C files.
No functional change.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove the arch-specific guard around TYPE_STATE_MAX_REGS and define it
as 32 for all architectures.
The architecture that perf is built on may not match the architecture
that produced the perf.data file, so relying on __powerpc__ or similar
is fragile.
Using 32 as a fixed upper bound is safe since it is greater than the
previous maximum of 16.
Add a comment to clarify that TYPE_STATE_MAX_REGS is an arch-independent
maximum rather than a build-time choice.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The process_event() function in "builtin-script.c" invokes
perf_sample__fprintf_synth() for displaying PERF_TYPE_SYNTH
type events.
if (attr->type == PERF_TYPE_SYNTH && PRINT_FIELD(SYNTH))
perf_sample__fprintf_synth(sample, evsel, fp);
perf_sample__fprintf_synth() process the sample depending on the value
in evsel->core.attr.config. Introduce perf_sample__fprintf_synth_vpadtl()
and invoke this for PERF_SYNTH_POWERPC_VPA_DTL
Sample output:
./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.300 MB perf.data ]
./perf script
perf 13322 [002] 233.835807: sched:sched_switch: perf:13322 [120] R ==> migration/2:27 [0]
migration/2 27 [002] 233.835811: sched:sched_migrate_task: comm=perf pid=13322 prio=120 orig_cpu=2 dest_cpu=3
migration/2 27 [002] 233.835818: sched:sched_stat_runtime: comm=migration/2 pid=27 runtime=9214 [ns]
migration/2 27 [002] 233.835819: sched:sched_switch: migration/2:27 [0] S ==> swapper/2:0 [120]
swapper 0 [002] 233.835822: vpa-dtl: timebase: 338954486062657 dispatch_reason:decrementer_interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:435, ready_to_enqueue_time:0, waiting_to_ready_time:34775058, processor_id: 202 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms])
swapper 0 [001] 233.835886: vpa-dtl: timebase: 338954486095398 dispatch_reason:priv_doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:542, ready_to_enqueue_time:0, waiting_to_ready_time:1245360, processor_id: 201 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms])
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Create samples from DTL entries for displaying in 'perf report'
and 'perf script'.
When the different PERF_RECORD_XX records are processed from perf
session, powerpc_vpadtl_process_event() will be invoked.
For each of the PERF_RECORD_XX record, compare the timestamp of perf
record with timestamp of top element in the auxtrace heap.
Process the auxtrace queue if the timestamp of element from heap is
lower than timestamp from entry in perf record.
Sometimes it could happen that one buffer is only partially processed.
if the timestamp of occurrence of another event is more than currently
processed element in the queue, it will move on to next perf record.
So keep track of position of buffer to continue processing next time.
Update the timestamp of the auxtrace heap with the timestamp of last
processed entry from the auxtrace buffer.
Generate perf sample for each entry in the dispatch trace log.
Fill in the sample details:
- sample ip is picked from srr0 field of dtl_entry
- sample cpu is picked from processor_id of dtl_entry
- sample id is from sample_id of powerpc_vpadtl
- cpumode is set to PERF_RECORD_MISC_KERNEL
- Additionally save the details in raw_data of sample.
This is to print the relevant fields in perf_sample__fprintf_synth()
when called from builtin-script
The sample is processed by calling perf_session__deliver_synth_event()
so that it gets included in perf report.
Sample Output:
./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.300 MB perf.data ]
./perf report
# Samples: 321 of event 'vpa-dtl'
# Event count (approx.): 321
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..............................
#
100.00% 100.00% swapper [kernel.kallsyms] [k] plpar_hcall_norets_notrace
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the Dispatch Trace Log data is collected along with other events
like sched tracepoint events, it needs to be correlated and present
interleaved along with these events.
Perf events can be collected parallely across the CPUs. Hence it needs
to be ensured events/dtl entries are processed in timestamp order.
An auxtrace_queue is created for each CPU.
Data within each queue is in increasing order of timestamp. Each
auxtrace queue has a array/list of auxtrace buffers.
When processing the auxtrace buffer, the data is mmapp'ed.
All auxtrace queues is maintained in auxtrace heap.
Each queue has a queue number and a timestamp.
The queues are sorted/added to head based on the time stamp.
So always the lowest timestamp (entries to be processed first) is on top
of the heap.
The auxtrace queue needs to be allocated and heap needs to be populated
in the sorted order of timestamp.
The queue needs to be filled with data only once via
powerpc_vpadtl__update_queues() function.
powerpc_vpadtl__setup_queues() iterates through all the entries to
allocate and setup the auxtrace queue.
To add to auxtrace heap, it is required to fetch the timebase of first
entry for each of the queue.
The first entry in the queue for VPA DTL PMU has the boot timebase,
frequency details which are needed to get timestamp which is required to
correlate with other events.
The very next entry is the actual trace data that provides timestamp for
occurrence of DTL event.
Formula used to get the timestamp from dtl entry is:
((timbase from DTL entry - boot time) / frequency) * 1000000000
powerpc_vpadtl_decode() adds the boot time and frequency as part of
powerpc_vpadtl_queue structure so that it can be reused.
Each of the dtl_entry is of 48 bytes size. Sometimes it could happen
that one buffer is only partially processed (if the timestamp of
occurrence of another event is more than currently processed element in
queue, it will move on to next event).
In order to keep track of position of buffer, additional fields is added
to powerpc_vpadtl_queue structure.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Dispatch Trace Log details are captured as-is in PERF_RECORD_AUXTRACE
records.
To present dtl entries as samples, create an event with name as
"vpa-dtl" and type PERF_TYPE_SYNTH.
Add perf_synth_id, "PERF_SYNTH_POWERPC_VPA_DTL" as config value for the
event.
Create a sample id to be a fixed offset from evsel id.
To present the relevant fields from the "struct dtl_entry", prepare the
entries as events of type PERF_TYPE_SYNTH.
By defining as PERF_TYPE_SYNTH type, samples can be printed as part of
perf_sample__fprintf_synth in builtin-script.c
From powerpc_vpadtl_process_auxtrace_info(), invoke
auxtrace_queues__process_index() function which will queue the auxtrace
buffers by invoke auxtrace_queues__add_event().
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add VPA DTL PMU auxtrace process function for "perf report -D".
The auxtrace event processing functions are defined in file
"util/powerpc-vpadtl.c".
Data structures used includes "struct powerpc_vpadtl_queue", "struct
powerpc_vpadtl" to store the auxtrace buffers in queue. Different
PERF_RECORD_XXX are generated during recording.
PERF_RECORD_AUXTRACE_INFO is processed first since it is of type
perf_user_event_type and perf session event delivers
perf_session__process_user_event() first.
Define function powerpc_vpadtl_process_auxtrace_info() to handle the
processing of PERF_RECORD_AUXTRACE_INFO records.
In this function, initialize the aux buffer queues using
auxtrace_queues__init().
Setup the required infrastructure for aux data processing.
The data is collected per CPU and auxtrace_queue is created for each
CPU.
Define powerpc_vpadtl_process_event() function to process
PERF_RECORD_AUXTRACE records.
In this, add the event to queue using auxtrace_queues__add_event() and
process the buffer in powerpc_vpadtl_dump_event().
The first entry in the buffer with timebase as zero has boot timebase
and frequency.
Remaining data is of format for "struct powerpc_vpadtl_entry".
Define the translation for dispatch_reasons and preempt_reasons, report
this when dump trace is invoked via powerpc_vpadtl_dump()
Sample output:
./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.300 MB perf.data ]
./perf report -D
0 0 0x39b10 [0x30]: PERF_RECORD_AUXTRACE size: 0x690 offset: 0 ref: 0 idx: 0 tid: -1 cpu: 0
.
. ... VPA DTL PMU data: size 1680 bytes, entries is 35
. 00000000: boot_tb: 21349649546353231, tb_freq: 512000000
. 00000030: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:7064, ready_to_enqueue_time:187, waiting_to_ready_time:6611773
. 00000060: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:146, ready_to_enqueue_time:0, waiting_to_ready_time:15359437
. 00000090: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:4868, ready_to_enqueue_time:232, waiting_to_ready_time:5100709
. 000000c0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:179, ready_to_enqueue_time:0, waiting_to_ready_time:30714243
. 000000f0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:197, ready_to_enqueue_time:0, waiting_to_ready_time:15350648
. 00000120: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:213, ready_to_enqueue_time:0, waiting_to_ready_time:15353446
. 00000150: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:212, ready_to_enqueue_time:0, waiting_to_ready_time:15355126
. 00000180: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:6368, ready_to_enqueue_time:164, waiting_to_ready_time:5104665
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The powerpc PMU collecting Dispatch Trace Log (DTL) entries makes use of
AUX support in perf infrastructure.
The PMU driver has the functionality to collect trace entries in the aux
buffer.
On the tools side, this data is made available as PERF_RECORD_AUXTRACE
records.
This record is generated by "perf record" command.
To enable the creation of PERF_RECORD_AUXTRACE, add functions to
initialize auxtrace records ie "auxtrace_record__init()".
Fill in fields for other callbacks like info_priv_size, info_fill, free,
recording options etc.
Define auxtrace_type as PERF_AUXTRACE_VPA_DTL. Add header file to
define vpa dtl pmu specific details.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
- "the the"
- "in in"
- "a a"
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Markus Heidelberg <m.heidelberg@cab.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pull x86 instruction decoder update from Borislav Petkov:
- Add instruction decoding support for the XOP-prefixed instruction set
present on the AMD Bulldozer uarch
[ These instructions don't normally happen, but a X86_NATIVE_CPU build
on a bulldozer host can make the compiler then use these unusual
instruction encodings ]
* tag 'x86_misc_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/insn: Add XOP prefix instructions decoder support
|
|
When building perf with LIBBPF_DYNAMIC=1 on a fedora system with
libbpf-devel 1.5 I it was breaking with:
util/bpf-event.c: In function ‘format_btf_variable’:
util/bpf-event.c:291:18: error: ‘const struct btf_dump_type_data_opts’ has no member named ‘emit_strings’
291 | .emit_strings = 1,
| ^~~~~~~~~~~~
util/bpf-event.c:291:33: error: initialized field overwritten [-Werror=override-init]
291 | .emit_strings = 1,
| ^
util/bpf-event.c:291:33: note: (near initialization for ‘opts.skip_names’)
Check the version before using that feature.
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need it to fix some other libbpf version dependent issues when
building with LIBBPF_DYNAMIC=1.
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Building perf with LIBBPF_DYNAMIC (ie not the default static linking of
libbpf with perf) is breaking as the libbpf isn't version 1.7 or newer,
where dont_enable is added to bpf_perf_event_opts.
To avoid this breakage add a compile time version check and don't
declare the variable when not present.
Fixes: 5e2ac8e8571df54d ("perf bpf-filter: Enable events manually")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Hao Ge <gehao@kylinos.cn>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pull hardening updates from Kees Cook:
"One notable addition is the creation of the 'transitional' keyword for
kconfig so CONFIG renaming can go more smoothly.
This has been a long-standing deficiency, and with the renaming of
CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI
support), this came up again.
The breadth of the diffstat is mainly this renaming.
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
- lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
(Junjie Cao)
- Add str_assert_deassert() helper (Lad Prabhakar)
- gcc-plugins: Remove TODO_verify_il for GCC >= 16
- kconfig: Fix BrokenPipeError warnings in selftests
- kconfig: Add transitional symbol attribute for migration support
- kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI"
* tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lib/string_choices: Add str_assert_deassert() helper
kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
kconfig: Add transitional symbol attribute for migration support
kconfig: Fix BrokenPipeError warnings in selftests
gcc-plugins: Remove TODO_verify_il for GCC >= 16
stddef: Introduce __TRAILING_OVERLAP()
stddef: Remove token-pasting in TRAILING_OVERLAP()
lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
|
|
The kernel's CFI implementation uses the KCFI ABI specifically, and is
not strictly tied to a particular compiler. In preparation for GCC
supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with
associated options).
Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will
enable CONFIG_CFI during olddefconfig.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The string result of build_id__snprintf() is unconditionally used in
places like dsos__fprintf_buildid_cb(). If the build id has size 0 then
this creates a use of uninitialized memory. Add null termination for the
size 0 case.
A similar fix was written by Jiri Olsa in commit 6311951d4f8f28c4 ("perf
tools: Initialize output buffer in build_id__sprintf") but lost in the
transition to snprintf.
Fixes: fccaaf6fbbc59910 ("perf build-id: Change sprintf functions to snprintf")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The fallback message is unconditionally printed in places like
record__open().
If no fallback is attempted this can lead to printing uninitialized
data, crashes, etc.
Fixes: c0a54341c0e89333 ("perf evsel: Introduce event fallback method")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The PMU name is appearing twice in:
```
$ perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.57 MiB uncore_imc_free_running_0/uncore_imc_free_running,data_total/
CPU0 1.58 MiB uncore_imc_free_running_1/uncore_imc_free_running,data_total/
0.000892376 seconds time elapsed
```
Use the pmu_name_len_no_suffix to avoid this problem.
Committer testing:
After this patch:
root@x1:~# perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.69 MiB uncore_imc_free_running_0/data_total/
CPU0 1.68 MiB uncore_imc_free_running_1/data_total/
0.002141605 seconds time elapsed
root@x1:~#
Fixes: 7d45f402d3117e0b ("perf evlist: Make uniquifying counter names consistent")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If a user specifies an AUX buffer larger than 2 GiB, the returned size
may exceed 0x80000000. Since the err variable is defined as a signed
32-bit integer, such a value overflows and becomes negative.
As a result, the perf record command reports an error:
0x146e8 [0x30]: failed to process type: 71 [Unknown error 183711232]
Change the type of the err variable to a signed 64-bit integer to
accommodate large buffer sizes correctly.
Fixes: d5652d865ea734a1 ("perf session: Add ability to skip 4GiB or more")
Reported-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20250808-perf_fix_big_buffer_size-v1-1-45f45444a9a4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up the latest perf-tools batch sent by Namhyung Kim for
v6.17-rc7.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Static analyser cppcheck says:
linux-6.16/tools/perf/util/tool_pmu.c:242:15: warning:
Opposite inner 'if' condition leads to a dead code block. [oppositeInnerCondition]
Source code is:
for (thread = 0; thread < nthreads; thread++) {
if (thread >= nthreads)
break;
Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently __die_find_scope_cb() goes to check siblings when the DIE
doesn't include the given PC.
However namespaces don't have a PC and could contain children that have
that PC.
When we encounter a namespace, we should check both its children and
siblings.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Link: https://lore.kernel.org/r/20250825195817.226560-1-zecheng@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Utilizes the previous is_breg_access_indirect function to determine if
the register + offset stores the variable itself or the struct it points
to, save the information in die_var_type.is_reg_var_addr.
Since we are storing the real types in the stack state, we need to do a
type dereference when is_reg_var_addr is set to false for stack/frame
registers.
For other gp registers, skip the variable when the register is a pointer
to the type. If we want to accept these variables, we might also utilize
is_reg_var_addr in a different way, we need to mark that register as a
pointer to the type.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|