<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/tools/perf/Documentation, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/tools/perf/Documentation?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/tools/perf/Documentation?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-10-25T20:40:48Z</updated>
<entry>
<title>perf docs: Fix man page build wrt perf-arm-coresight.txt</title>
<updated>2022-10-25T20:40:48Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-10-17T09:35:49Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=231e61bc2e87486270686f7bc3c43c4fb3b0e0b9'/>
<id>urn:sha1:231e61bc2e87486270686f7bc3c43c4fb3b0e0b9</id>
<content type='text'>
perf build assumes documentation files starting with "perf-" are man
pages but perf-arm-coresight.txt is not a man page:

  asciidoc: ERROR: perf-arm-coresight.txt: line 2: malformed manpage title
  asciidoc: ERROR: perf-arm-coresight.txt: line 3: name section expected
  asciidoc: FAILED: perf-arm-coresight.txt: line 3: section title expected
  make[3]: *** [Makefile:266: perf-arm-coresight.xml] Error 1
  make[3]: *** Waiting for unfinished jobs....
  make[2]: *** [Makefile.perf:895: man] Error 2

Fix by renaming it.

Fixes: dc2e0fb00bb2b24f ("perf test coresight: Add relevant documentation about ARM64 CoreSight testing")
Reported-by: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Reported-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Carsten Haitzler &lt;carsten.haitzler@arm.com&gt;
Cc: coresight@lists.linaro.org
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Suzuki Poulouse &lt;suzuki.poulose@arm.com&gt;
Link: https://lore.kernel.org/r/a176a3e1-6ddc-bb63-e41c-15cda8c2d5d2@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf mem/c2c: Add load store event mappings for AMD</title>
<updated>2022-10-06T19:30:06Z</updated>
<author>
<name>Ravi Bangoria</name>
<email>ravi.bangoria@amd.com</email>
</author>
<published>2022-10-06T15:39:43Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=f7b58cbdb3ff36eba8622e67eee66c10dd1c9995'/>
<id>urn:sha1:f7b58cbdb3ff36eba8622e67eee66c10dd1c9995</id>
<content type='text'>
The 'perf mem' and 'perf c2c' tools are wrappers around 'perf record'
with mem load/ store events. IBS tagged load/store sample provides most
of the information needed for these tools. Wire in the "ibs_op//" event
as mem-ldst event for AMD.

There are some limitations though: Only load/store micro-ops provide
mem/c2c information. Whereas, IBS does not have a way to choose a
particular type of micro-op to tag. This results in many non-LS
micro-ops being tagged which appear as N/A in the perf report. IBS,
being an uncore pmu from kernel point of view[1], does not support per
process monitoring. Thus, perf mem/c2c on AMD are currently supported in
per-cpu mode only.

Example:

  $ sudo perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

  $ sudo perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
  Memory access                  Samples  Snoop
  N/A                             700620  N/A
  L1 hit                          126675  N/A
  L2 hit                             424  N/A
  L3 hit                             664  HitM
  L3 hit                              10  N/A
  Local RAM hit                        2  N/A
  Remote RAM (1 hop) hit            8558  N/A
  Remote Cache (1 hop) hit             3  N/A
  Remote Cache (1 hop) hit             2  HitM
  Remote Cache (2 hops) hit           10  HitM
  Remote Cache (2 hops) hit            6  N/A
  Uncached hit                         4  N/A
  $

[1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com

Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Ananth Narayan &lt;ananth.narayan@amd.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joe Mario &lt;jmario@redhat.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Kim Phillips &lt;kim.phillips@amd.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sandipan Das &lt;sandipan.das@amd.com&gt;
Cc: Santosh Shukla &lt;santosh.shukla@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-6-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events</title>
<updated>2022-10-06T19:29:32Z</updated>
<author>
<name>Ravi Bangoria</name>
<email>ravi.bangoria@amd.com</email>
</author>
<published>2022-10-06T15:39:42Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=4173cc055dc92f199a43775775e54dc7fafd37b6'/>
<id>urn:sha1:4173cc055dc92f199a43775775e54dc7fafd37b6</id>
<content type='text'>
Currently perf sets PERF_SAMPLE_WEIGHT flag only for mem load events.
Set it for combined load-store event as well which will enable recording
of load latency by default on arch that does not support independent
mem load event.

Also document missing -W in perf-record man page.

Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Ananth Narayan &lt;ananth.narayan@amd.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joe Mario &lt;jmario@redhat.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Kim Phillips &lt;kim.phillips@amd.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sandipan Das &lt;sandipan.das@amd.com&gt;
Cc: Santosh Shukla &lt;santosh.shukla@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf test coresight: Add relevant documentation about ARM64 CoreSight testing</title>
<updated>2022-10-06T17:50:55Z</updated>
<author>
<name>Carsten Haitzler</name>
<email>carsten.haitzler@arm.com</email>
</author>
<published>2022-09-09T15:28:03Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=dc2e0fb00bb2b24f0b6c4877c34bb1d288d31fb2'/>
<id>urn:sha1:dc2e0fb00bb2b24f0b6c4877c34bb1d288d31fb2</id>
<content type='text'>
Add/improve documentation helping people get started with CoreSight and
perf as well as describe the testing and how it works.

Reviewed-by: James Clark &lt;james.clark@arm.com&gt;
Signed-off-by: Carsten Haitzler &lt;carsten.haitzler@arm.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Suzuki Poulouse &lt;suzuki.poulose@arm.com&gt;
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/20220909152803.2317006-14-carsten.haitzler@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf lock: Add -q/--quiet option to suppress header and debug messages</title>
<updated>2022-10-04T11:55:23Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-24T00:42:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6bbc482017deeacf5c9953bafdeb90517e22dc90'/>
<id>urn:sha1:6bbc482017deeacf5c9953bafdeb90517e22dc90</id>
<content type='text'>
Like in 'perf report', this option is to suppress header and debug messages.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220924004221.841024-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf lock: Add -E/--entries option</title>
<updated>2022-10-04T11:55:23Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-24T00:42:19Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6282a1f4f846fda21b16065a2ef094c7b71b2771'/>
<id>urn:sha1:6282a1f4f846fda21b16065a2ef094c7b71b2771</id>
<content type='text'>
Like in 'perf top', the -E option can limit number of entries to print.

It can be useful when users want to see top N contended locks only.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220924004221.841024-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf tools: Add 'addr' sort key</title>
<updated>2022-10-04T11:55:22Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-23T17:31:41Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=762461f1a53b268e44fbd941d3734f4553a6e925'/>
<id>urn:sha1:762461f1a53b268e44fbd941d3734f4553a6e925</id>
<content type='text'>
Sometimes users want to see actual (virtual) address of sampled instructions.
Add a new 'addr' sort key to display the raw addresses.

  $ perf record -o- true | perf report -i- -s addr
  # To display the perf.data header info, please use --header/--header-only options.
  #
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  #
  # Total Lost Samples: 0
  #
  # Samples: 12  of event 'cycles:u'
  # Event count (approx.): 252512
  #
  # Overhead  Address
  # ........  ..................
  #
      42.96%  0x7f96f08443d7
      29.55%  0x7f96f0859b50
      14.76%  0x7f96f0852e02
       8.30%  0x7f96f0855028
       4.43%  0xffffffff8de01087

Note that it just compares and displays the sample ip.  Each process can
have a different memory layout and the ip will be different even if they run
the same binary.  So this sort key is mostly meaningful for per-process
profile data.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/r/20220923173142.805896-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf inject: Clarify build-id options a little bit</title>
<updated>2022-10-04T11:55:22Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-23T17:31:40Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=fd941521e81fd24e4ab164f88513612fb5f3af85'/>
<id>urn:sha1:fd941521e81fd24e4ab164f88513612fb5f3af85</id>
<content type='text'>
Update the documentation of --build-id and --buildid-all options to
clarify the difference between them.  The former requires full sample
processing to find which DSOs are actually used.  While the latter simply
injects every DSO's build-id from MMAP{,2} records, skipping SAMPLEs.

Reviewed-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/r/20220923173142.805896-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf lock contention: Allow to change stack depth and skip</title>
<updated>2022-10-04T11:55:22Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-12T05:53:13Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=96532a83ee8e30035e584b046c859adb001a3b8d'/>
<id>urn:sha1:96532a83ee8e30035e584b046c859adb001a3b8d</id>
<content type='text'>
It needs stack traces to find callers of locks.  To minimize the
performance overhead it only collects up to 8 entries for each stack
trace.  And it skips first 3 entries as they came from BPF, tracepoint
and lock functions which are not interested for most users.

But it turned out that those numbers are different in some
configuration.  Using fixed number can result in non meaningful caller
names.  Let's make them adjustable with --stack-depth and --skip-stack
options.

On my setup, the default output is like below:

  # /perf lock con -ab -F contended,wait_total sleep 3
   contended   total wait         type   caller

          28      4.55 ms     rwlock:W   __bpf_trace_contention_begin+0xb
          33      1.67 ms     rwlock:W   __bpf_trace_contention_begin+0xb
          12    580.28 us     spinlock   __bpf_trace_contention_begin+0xb
          60    240.54 us      rwsem:R   __bpf_trace_contention_begin+0xb
          27     64.45 us     spinlock   __bpf_trace_contention_begin+0xb

If I change the stack skip to 5, the result will be like:

  # perf lock con -ab -F contended,wait_total --stack-skip 5 sleep 3
   contended   total wait         type   caller

          32    715.45 us     spinlock   folio_lruvec_lock_irqsave+0x61
          26    550.22 us     spinlock   folio_lruvec_lock_irqsave+0x61
          15    486.93 us      rwsem:R   mmap_read_lock+0x13
          12    139.66 us      rwsem:W   vm_mmap_pgoff+0x93
           1      7.04 us     spinlock   tick_do_update_jiffies64+0x25

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220912055314.744552-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf intel-pt: Support itrace option flag d+e to log on error</title>
<updated>2022-10-04T11:55:21Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-09-05T07:34:23Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=65aee81afe7f6a54e2fb2de59e1d6cd47dcf8eb9'/>
<id>urn:sha1:65aee81afe7f6a54e2fb2de59e1d6cd47dcf8eb9</id>
<content type='text'>
Pass d+e option and log size via intel_pt_log_enable(). Allocate a buffer
for log messages and provide intel_pt_log_dump_buf() to dump and reset the
buffer upon decoder errors.

Example:

 $ sudo perf record -e intel_pt// sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.094 MB perf.data ]
 $ sudo perf config itrace.debug-log-buffer-size=300
 $ sudo perf script --itrace=ed+e+o | head -20
 Dumping debug log buffer (first line may be sliced)
                                         Other
           ffffffff96ca22f6:  48 89 e5                                        Other
           ffffffff96ca22f9:  65 48 8b 05 ff e0 38 69                         Other
           ffffffff96ca2301:  48 3d c0 a5 c1 98                               Other
           ffffffff96ca2307:  74 08                                           Jcc +8
           ffffffff96ca2311:  5d                                              Other
           ffffffff96ca2312:  c3                                              Ret
 ERROR: Bad RET compression (TNT=N) at 0xffffffff96ca2312
 End of debug log buffer dump
  instruction trace error type 1 time 15913.537143482 cpu 5 pid 36292 tid 36292 ip 0xffffffff96ca2312 code 6: Trace doesn't match instruction
 Dumping debug log buffer (first line may be sliced)
                                        Other
           ffffffff96ce7fe9:  f6 47 2e 20                                     Other
           ffffffff96ce7fed:  74 11                                           Jcc +17
           ffffffff96ce7fef:  48 8b 87 28 0a 00 00                            Other
           ffffffff96ce7ff6:  5d                                              Other
           ffffffff96ce7ff7:  48 8b 40 18                                     Other
           ffffffff96ce7ffb:  c3                                              Ret
 ERROR: Bad RET compression (TNT=N) at 0xffffffff96ce7ffb
 Warning:
 8 instruction trace errors

Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Reviewed-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: https://lore.kernel.org/r/20220905073424.3971-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
