Age | Commit message (Collapse) | Author | Files | Lines |
|
Don't load sysfs aliases for a PMU when the PMU is first created, defer
until an alias needs to be found. For the pmu-scan benchmark, average
core PMU scanning is reduced by 30.8%, and average PMU scanning by
12.6%.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Event info is only needed when an event is parsed or when merging data
from an JSON and sysfs event. Be lazy in its loading to reduce file
accesses.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Scan sysfs PMU's type early so that format and aliases aren't
attempted to be loaded if the PMU name is invalid.
This is the case for event_pmu tokens in parse-events.y where a wildcard
name is first assumed to be a PMU name.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than scanning all JSON events and adding them when a PMU is
created, add the alias when the JSON event is needed.
Average core PMU scanning run time reduced by 60.2%. Average PMU
scanning run time reduced by 15%. Page faults with no events reduced by
74 page faults, 4% of total.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cache the JSON events table so that finding it isn't done per
event/alias.
Change the events table find so that when the PMU is given, if the PMU
has no JSON events return null.
Update usage to always use the PMU variable.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than load all sysfs events then parsing all JSON events and
merging with ones that already exist. When a sysfs event is loaded, look
for a corresponding JSON event and merge immediately.
To simplify the logic, early exit the perf_pmu__new_alias function if an
alias is attempted to be added twice - as merging has already been
explicitly handled.
Fix the copying of terms to a merged alias and some ENOMEM paths.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The aliases list is part of the PMU. Rather than pass the aliases
list, pass the full PMU simplifying some callbacks.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than read a sysfs events file into a 256 byte char buffer, pass
the FILE* directly to the lex/yacc parser.
This avoids there being a maximum events file size.
While changing the API, constify some arguments to remove unnecessary
casts.
Allocating the read buffer decreases the performance of pmu-scan by
around 3%.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
jevents stores events sorted by name. Add a find function that will
binary search event names avoiding the need to linearly search through
events.
Add a test in tests/pmu-events.c. If the PMU or event aren't found -1000
is returned. If the event is found but no callback function given, 0 is
returned.
This allows the find function also act as a test for existence.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pass the PMU to pmu_events_table__for_each_event so that entries that
don't match don't need to be processed by callback.
If a NULL PMU is passed then all PMUs are processed.
'perf bench internals pmu-scan's "Average PMU scanning" performance is
reduced by about 5% on an Intel tigerlake.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than scanning all PMUs for a counter name, scan the PMU
associated with the evsel of the sample. This is done to remove a
dependence on pmu-events.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Double setting information for an event would produce an error message
associated with the PMU rather than the term that was double setting.
Improve the error message to be on the term.
Before:
$ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
event syntax error: 'cpu/inst_retired.any,inst_retired.any/'
\___ Bad event or PMU
Unabled to find PMU or event on a PMU of 'cpu'
Run 'perf list' for a list of valid events
$
After:
$ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
event syntax error: '..etired.any,inst_retired.any/'
\___ Bad event or PMU
Unabled to find PMU or event on a PMU of 'cpu'
Initial error:
event syntax error: '..etired.any,inst_retired.any/'
\___ Attempt to set event's scale twice
Run 'perf list' for a list of valid events
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Prior to this change a cpuid would map to a list of events where the PMU
would be encoded alongside the event information. This change breaks
apart each group of events so that there is a group per PMU. A new table
is added with the PMU's name and the list of events, the original table
now holding an array of these per PMU tables.
These changes are to make it easier to get per PMU information about
events, rather than the current approach of scanning all events. The
perf binary size with BPF skeletons on x86 is reduced by about 1%. The
unidentified PMU is now always expanded to "cpu".
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add extra underscore before "for" of pmu_events_table_for_each_event
and pmu_metrics_table_for_each_metric.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In order to be able to lazily compute aliases/events for a PMU, move
the struct perf_pmu_alias into pmu.c.
Add perf_pmu__find_event and perf_pmu__for_each_event that take a
callback that is called for the found event or for each event.
The layout of struct pmu and the event/alias list is unchanged but the
API is altered so that aliases are no longer directly accessed, allowing
for later changes.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The sysfs format files are loaded eagerly in a PMU. Add a flag so that
we create the format but only load the contents when necessary.
Reduce the size of the value in struct perf_pmu_format and avoid holes
so there is no additional space requirement.
For "perf stat -e cycles true" this reduces the number of openat calls
from 648 to 573 (about 12%). The benchmark pmu scan speed is improved
by roughly 5%.
Before:
$ perf bench internals pmu-scan
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 1061.100 usec (+- 9.965 usec)
Average PMU scanning took: 4725.300 usec (+- 260.599 usec)
After:
$ perf bench internals pmu-scan
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 989.170 usec (+- 6.873 usec)
Average PMU scanning took: 4520.960 usec (+- 251.272 usec)
Committer testing:
On a AMD Ryzen 5950x:
Before:
$ perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 563.466 usec (+- 1.008 usec)
Average PMU scanning took: 1619.174 usec (+- 23.627 usec)
$ perf stat -r5 perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 583.401 usec (+- 2.098 usec)
Average PMU scanning took: 1677.352 usec (+- 24.636 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 553.254 usec (+- 0.825 usec)
Average PMU scanning took: 1635.655 usec (+- 24.312 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 557.733 usec (+- 0.980 usec)
Average PMU scanning took: 1600.659 usec (+- 23.344 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 554.906 usec (+- 0.774 usec)
Average PMU scanning took: 1595.338 usec (+- 23.288 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 551.798 usec (+- 0.967 usec)
Average PMU scanning took: 1623.213 usec (+- 23.998 usec)
Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):
3276.82 msec task-clock:u # 0.990 CPUs utilized ( +- 0.82% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
1008 page-faults:u # 307.615 /sec ( +- 0.04% )
12049614778 cycles:u # 3.677 GHz ( +- 0.07% ) (83.34%)
117507478 stalled-cycles-frontend:u # 0.98% frontend cycles idle ( +- 0.33% ) (83.32%)
27106761 stalled-cycles-backend:u # 0.22% backend cycles idle ( +- 9.55% ) (83.36%)
33294953848 instructions:u # 2.76 insn per cycle
# 0.00 stalled cycles per insn ( +- 0.03% ) (83.31%)
6849825049 branches:u # 2.090 G/sec ( +- 0.03% ) (83.37%)
71533903 branch-misses:u # 1.04% of all branches ( +- 0.20% ) (83.30%)
3.3088 +- 0.0302 seconds time elapsed ( +- 0.91% )
$
After:
$ perf stat -r5 perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 550.702 usec (+- 0.958 usec)
Average PMU scanning took: 1566.577 usec (+- 22.747 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 548.315 usec (+- 0.555 usec)
Average PMU scanning took: 1565.499 usec (+- 22.760 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 548.073 usec (+- 0.555 usec)
Average PMU scanning took: 1586.097 usec (+- 23.299 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 561.184 usec (+- 2.709 usec)
Average PMU scanning took: 1567.153 usec (+- 22.548 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 546.987 usec (+- 0.553 usec)
Average PMU scanning took: 1562.814 usec (+- 22.729 usec)
Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):
3170.86 msec task-clock:u # 0.992 CPUs utilized ( +- 0.22% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
1010 page-faults:u # 318.526 /sec ( +- 0.04% )
11890047674 cycles:u # 3.750 GHz ( +- 0.14% ) (83.27%)
119090499 stalled-cycles-frontend:u # 1.00% frontend cycles idle ( +- 0.46% ) (83.40%)
32502449 stalled-cycles-backend:u # 0.27% backend cycles idle ( +- 8.32% ) (83.30%)
33119141261 instructions:u # 2.79 insn per cycle
# 0.00 stalled cycles per insn ( +- 0.01% ) (83.37%)
6812816561 branches:u # 2.149 G/sec ( +- 0.01% ) (83.29%)
70157855 branch-misses:u # 1.03% of all branches ( +- 0.28% ) (83.38%)
3.19710 +- 0.00826 seconds time elapsed ( +- 0.26% )
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This also puts an unconditional -Werror under control of WERROR. The
clang includes added during the build can lead to a warning that may be
turned into an error. In addition, hardened clang produces a warning
about lack of support for -fstack-protector* options for the BPF target:
clang -g -O2 -target bpf -Wall -Werror -Ilinux/tools/perf/util/bpf_skel/.tmp/.. \
-I -idirafter /usr/lib/llvm/16/bin/../../../../lib/clang/16/include -idirafter /usr/local/include \
-idirafter /usr/include -Ilinux/tools/include/uapi -c util/bpf_skel/bperf_follower.bpf.c \
-o linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o && llvm-strip -g linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o
clang-16: error: /usr/lib/llvm/16/bin/../../../../lib/clang/16/include: 'linker' input unused [-Werror,-Wunused-command-line-argument]
clang-16: error: ignoring '-fstack-protector-strong' option as it is not currently supported for target 'bpf' [-Werror,-Woption-ignored]
make[1]: *** [Makefile.perf:1082: linux/tools/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o] Error 1
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZOZQ2LDA+3Wg8x2T@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pass the pmu so the aliases and format list can be better abstracted
and later lazily loaded.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pass the PMU so the format list can be better abstracted and later
lazily loaded.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-8-irogers@google.com
[ Did missing conversions in tools/perf/arch/arm*/util/cs-etm.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pass the pmu so the format list can be better abstracted and later
lazily loaded.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Abstract the format list better, hiding it in the PMU, by changing
perf_pmu__config_terms() the PMU rather than the format list in the PMU.
Change the PMU test to pass a dummy PMU for this purpose. Changing the
test allows perf_pmu__del_formats() to become static.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move declaration from header file to pmu.y and make static.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Avoid having the function in the C and header file, as it is only used
locally by pmu.y.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than read a base path and append into a 2nd path, read the base
path directly into output buffer and append to that.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Done to reduce dependencies on pmu-events.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Based on commit 7d54a4acd8c1de3e ("perf test: Skip watchpoint tests if
no watchpoints available"), hardware breakpoints are not available for
power9 platform and because of that 'perf bench breakpoint' run fails on
power9 platform.
Add code to check for the return value of perf_event_open() in the
breakpoint run and skip the 'perf bench breakpoint' run, if hardware
breakpoints are not available.
Result on power9 system before patch changes:
[command]# perf bench breakpoint thread
perf_event_open: No such device
Result on power9 system after patch changes:
[command]# ./perf bench breakpoint thread
Skipping perf bench breakpoint thread: No hardware support
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Naveen N Rao <naveen@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230823075103.190565-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I noticed some error with:
# perf list ex_ret_brn
lzma: fopen failed on /usr/lib/modules/5.15.14-100.fc34.x86_64/kernel/net/bluetooth/bnep/bnep.ko.xz: 'No such file or directory'
lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/gpu/drm/drm_kms_helper.ko.xz: 'No such file or directory'
lzma: fopen failed on /usr/lib/modules/5.18.16-200.fc36.x86_64/kernel/arch/x86/crypto/crct10dif-pclmul.ko.xz: 'No such file or directory'
lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/i2c/busses/i2c-piix4.ko.xz: 'No such file or directory'
<BIG SNIP>
Then using 'perf probe' + 'perf trace' to debug 'perf list', it seems
its some inconsistency in the ~/.debug/ cache where broken build id
symlinks that ends up making it try to uncompress some kernel modules
using the lzma routines:
395.309 perf/3594447 probe_perf:lzma_decompress_to_file(__probe_ip: 6118448, input_string: "/usr/lib/modules/5.18.17-200.fc36.x86_64/kernel/drivers/nvme/host/nvme.ko.xz")
lzma_decompress_to_file (/var/home/acme/bin/perf)
filename__decompress (/var/home/acme/bin/perf)
filename__read_build_id (/var/home/acme/bin/perf)
filename__sprintf_build_id (inlined)
build_id_cache__valid_id (inlined)
build_id_cache__list_all (/var/home/acme/bin/perf)
print_sdt_events (/var/home/acme/bin/perf)
cmd_list (/var/home/acme/bin/perf)
run_builtin (/var/home/acme/bin/perf)
handle_internal_command (inlined)
run_argv (inlined)
main (/var/home/acme/bin/perf)
__libc_start_call_main (/usr/lib64/libc.so.6)
__libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6)
_start (/var/home/acme/bin/perf)
But callers of filename__decompress() already check its return and use
pr_debug(), so be consistent and make functions it calls also use
pr_debug().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZOUD0+GkuCVkYF7n@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It is undefined behavior to pass NULL as snprintf()'s fmt argument.
Here is an example to trigger the problem:
$ perf stat --metric-only -x, -e instructions -- sleep 1
insn per cycle,
Segmentation fault (core dumped)
With this patch:
$ perf stat --metric-only -x, -e instructions -- sleep 1
insn per cycle,
,
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kaige Ye <ye@kaige.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/01CA7674B690CA24+20230804020907.144562-2-ye@kaige.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Similar to what was done in the previous cset for sizeof(saddr), we need
to make sure sizeof(augmented_arg->value) is a power of two to do bounds
checking using &=:
augmented_len &= sizeof(augmented_arg->value) - 1;
Suggested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZONrPo0NSqdbXiGx@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We're using the BPF verifier suggestion:
22: (85) call bpf_probe_read#4
R2 min value is negative, either use unsigned or 'var &= const'
That works only when const is a (power of two - 1) so add an assert to
make sure that that is the case.
Suggested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZONrFmJBNlQpSpZj@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some of the events included in the ampereone/core-imp-def are not
supported on AmpereOne, remove them.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230803211331.140553-5-ilkka@os.amperecomputing.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|