<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/tools/perf/util/Build, branch linus/master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/tools/perf/util/Build?h=linus%2Fmaster</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/tools/perf/util/Build?h=linus%2Fmaster'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-05-26T15:36:57Z</updated>
<entry>
<title>perf record: Enable off-cpu analysis with BPF</title>
<updated>2022-05-26T15:36:57Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-05-18T22:47:21Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=edc41a1099c2d08ccfd4ed7d59688501e3749015'/>
<id>urn:sha1:edc41a1099c2d08ccfd4ed7d59688501e3749015</id>
<content type='text'>
Add --off-cpu option to enable the off-cpu profiling with BPF.  It'd
use a bpf_output event and rename it to "offcpu-time".  Samples will
be synthesized at the end of the record session using data from a BPF
map which contains the aggregated off-cpu time at context switches.
So it needs root privilege to get the off-cpu profiling.

Each sample will have a separate user stacktrace so it will skip
kernel threads.  The sample ip will be set from the stacktrace and
other sample data will be updated accordingly.  Currently it only
handles some basic sample types.

The sample timestamp is set to a dummy value just not to bother with
other events during the sorting.  So it has a very big initial value
and increase it on processing each samples.

Good thing is that it can be used together with regular profiling like
cpu cycles.  If you don't want to that, you can use a dummy event to
enable off-cpu profiling only.

Example output:
  $ sudo perf record --off-cpu perf bench sched messaging -l 1000

  $ sudo perf report --stdio --call-graph=no
  # Total Lost Samples: 0
  #
  # Samples: 41K of event 'cycles'
  # Event count (approx.): 42137343851
  ...

  # Samples: 1K of event 'offcpu-time'
  # Event count (approx.): 587990831640
  #
  # Children      Self  Command          Shared Object       Symbol
  # ........  ........  ...............  ..................  .........................
  #
      81.66%     0.00%  sched-messaging  libc-2.33.so        [.] __libc_start_main
      81.66%     0.00%  sched-messaging  perf                [.] cmd_bench
      81.66%     0.00%  sched-messaging  perf                [.] main
      81.66%     0.00%  sched-messaging  perf                [.] run_builtin
      81.43%     0.00%  sched-messaging  perf                [.] bench_sched_messaging
      40.86%    40.86%  sched-messaging  libpthread-2.33.so  [.] __read
      37.66%    37.66%  sched-messaging  libpthread-2.33.so  [.] __write
       2.91%     2.91%  sched-messaging  libc-2.33.so        [.] __poll
  ...

As you can see it spent most of off-cpu time in read and write in
bench_sched_messaging().  The --call-graph=no was added just to make
the output concise here.

It uses perf hooks facility to control BPF program during the record
session rather than adding new BPF/off-cpu specific calls.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Blake Jones &lt;blakejones@google.com&gt;
Cc: Hao Luo &lt;haoluo@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Milian Wolff &lt;milian.wolff@kdab.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf maps: Move maps code to own C file</title>
<updated>2022-02-14T19:56:32Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2022-02-11T10:34:01Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=9d31d18bbb6898c9640763fee20cf357ab51a7aa'/>
<id>urn:sha1:9d31d18bbb6898c9640763fee20cf357ab51a7aa</id>
<content type='text'>
The maps code has its own header, move the corresponding C function
definitions to their own C file. In the process tidy and minimize
includes.

Committer notes:

Add back the 'static' for maps__init() and maps__exit().

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Alexey Bayduraev &lt;alexey.v.bayduraev@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: André Almeida &lt;andrealmeid@collabora.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Darren Hart &lt;dvhart@infradead.org&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Dmitriy Vyukov &lt;dvyukov@google.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: Hao Luo &lt;haoluo@google.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Miaoqian Lin &lt;linmq006@gmail.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Riccardo Mancini &lt;rickyman7@gmail.com&gt;
Cc: Shunsuke Nakamura &lt;nakamura.shun@fujitsu.com&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Stephen Brennan &lt;stephen.s.brennan@oracle.com&gt;
Cc: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Cc: Yury Norov &lt;yury.norov@gmail.com&gt;
Link: http://lore.kernel.org/lkml/20220211103415.2737789-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf arm64: Inject missing frames when using 'perf record --call-graph=fp'</title>
<updated>2021-12-21T21:37:13Z</updated>
<author>
<name>Alexandre Truong</name>
<email>alexandre.truong@arm.com</email>
</author>
<published>2021-12-17T15:45:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=b9f6fbb3b2c29736970ae9fcc0e82b0bd459442b'/>
<id>urn:sha1:b9f6fbb3b2c29736970ae9fcc0e82b0bd459442b</id>
<content type='text'>
When unwinding using frame pointers on ARM64, the return address of the
current function may not have been pushed into the stack when a function
was interrupted, which makes perf show an incorrect call graph to the
user.

Consider the following example program:

  void leaf() {
      /* long computation */
  }

  void parent() {
      // (1)
      leaf();
      // (2)
  }

  ... could be compiled into (using gcc -fno-inline -fno-omit-frame-pointer):

  leaf:
      /* long computation */
      nop
      ret
  parent:
      // (1)
      stp     x29, x30, [sp, -16]!
      mov     x29, sp
      bl      parent
      nop
      ldp     x29, x30, [sp], 16
      // (2)
      ret

If the program is interrupted at (1), (2), or any point in "leaf:", the
call graph will skip the callers of the current function. We can unwind
using the dwarf info and check if the return addr is the same as the LR
register, and inject the missing frame into the call graph.

Before this patch, the above example shows the following call-graph when
recording using "--call-graph fp" mode in ARM64:

  # Children      Self  Command   Shared Object     Symbol
  # ........  ........  ........  ................  ......................
  #
      99.86%    99.86%  program3  program3          [.] leaf
  	    |
  	    ---_start
  	       __libc_start_main
  	       main
  	       leaf

As can be seen, the "parent" function is missing. This is specially
problematic in "leaf" because for leaf functions the compiler may always
omit pushing the return addr into the stack. After this patch, it shows
the correct graph:

  # Children      Self  Command   Shared Object     Symbol
  # ........  ........  ........  ................  ......................
  #
      99.86%    99.86%  program3  program3          [.] leaf
  	    |
  	    ---_start
  	       __libc_start_main
  	       main
  	       parent
  	       leaf

Reviewed-by: James Clark &lt;james.clark@arm.com&gt;
Signed-off-by: Alexandre Truong &lt;alexandre.truong@arm.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-7-german.gomez@arm.com
Signed-off-by: German Gomez &lt;german.gomez@arm.com&gt;
[ Rename machine__normalize_is() to machine__normalized_is(), as suggested by James Clark ]
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf ftrace: Add -b/--use-bpf option for latency subcommand</title>
<updated>2021-12-16T15:18:12Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2021-12-15T18:51:53Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=177f4eac7fb7fe5c70fef30dd6c4ef8f81cf7776'/>
<id>urn:sha1:177f4eac7fb7fe5c70fef30dd6c4ef8f81cf7776</id>
<content type='text'>
The -b/--use-bpf option is to use BPF to get latency info of kernel
functions.  It'd have better performance impact and I observed that
latency of same function is smaller than before when using BPF.

Committer testing:

  # strace -e bpf perf ftrace latency -b -T __handle_mm_fault -a sleep 1
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914e00, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 128) = -1 EINVAL (Invalid argument)
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\350\2\0\0\350\2\0\0\353\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1515, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7fff51914c30, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 7
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 8
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=30, btf_vmlinux_value_type_id=0}, 128) = 9
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=9, key=0x7fff51914c40, value=0x7f6e99be2000, flags=BPF_ANY}, 128) = 0
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x11e4160, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc50, func_info_cnt=1, line_info_rec_size=16, line_info=0x11e04c0, line_info_cnt=9, attach_btf_id=0, attach_prog_fd=0}, 128) = 10
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=99, insns=0x11ded70, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc70, func_info_cnt=1, line_info_rec_size=16, line_info=0x11f6e10, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 11
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 13
  bpf(BPF_LINK_CREATE, {link_create={prog_fd=13, target_fd=-1, attach_type=0x29 /* BPF_??? */, flags=0}}, 128) = -1 EINVAL (Invalid argument)
  --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1699992, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |         52 | ###################                            |
       1 - 2    us |         36 | #############                                  |
       2 - 4    us |         24 | #########                                      |
       4 - 8    us |          7 | ##                                             |
       8 - 16   us |          1 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - 2    ms |          0 |                                                |
       2 - 4    ms |          0 |                                                |
       4 - 8    ms |          0 |                                                |
       8 - 16   ms |          0 |                                                |
      16 - 32   ms |          0 |                                                |
      32 - 64   ms |          0 |                                                |
      64 - 128  ms |          0 |                                                |
     128 - 256  ms |          0 |                                                |
     256 - 512  ms |          0 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...   s |          0 |                                                |
  +++ exited with 0 +++
  #

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Athira Jajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Changbin Du &lt;changbin.du@gmail.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/r/20211215185154.360314-5-namhyung@kernel.org
[ Add missing util/cpumap.h include and removed unused 'fd' variable ]
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf bpf: Pull in bpf_program__get_prog_info_linear()</title>
<updated>2021-11-01T21:16:40Z</updated>
<author>
<name>Dave Marchevsky</name>
<email>davemarchevsky@fb.com</email>
</author>
<published>2021-10-11T08:20:30Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6ac22d036f86c4e2d38bea1108672f947f7facca'/>
<id>urn:sha1:6ac22d036f86c4e2d38bea1108672f947f7facca</id>
<content type='text'>
To prepare for impending deprecation of libbpf's bpf_program__get_prog_info_linear(),
pull in the function and associated helpers into the perf codebase and migrate
existing uses to the perf copy.

Since libbpf's deprecated definitions will still be visible to perf, it is necessary
to rename perf's definitions.

Signed-off-by: Dave Marchevsky &lt;davemarchevsky@fb.com&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20211011082031.4148337-4-davemarchevsky@fb.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>tools lib: Adopt list_sort() from the kernel sources</title>
<updated>2021-10-20T13:30:59Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2021-10-15T17:21:12Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=92ec3cc94c2cb60db21cc16b469c0a7366b86742'/>
<id>urn:sha1:92ec3cc94c2cb60db21cc16b469c0a7366b86742</id>
<content type='text'>
Add list_sort.[ch] from the main kernel tree. The linux/bug.h #include
is removed due to conflicting definitions. Add check-headers and modify
perf build accordingly.

MANIFEST and python-ext-sources fixes suggested by Arnaldo.

Suggested-by: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Acked-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Antonov &lt;alexander.antonov@linux.intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andrew Kilroy &lt;andrew.kilroy@arm.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Changbin Du &lt;changbin.du@intel.com&gt;
Cc: Denys Zagorui &lt;dzagorui@cisco.com&gt;
Cc: Fabian Hemmer &lt;copy@copy.sh&gt;
Cc: Felix Fietkau &lt;nbd@nbd.name&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Cc: Jiapeng Chong &lt;jiapeng.chong@linux.alibaba.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Joakim Zhang &lt;qiangqing.zhang@nxp.com&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Kees Kook &lt;keescook@chromium.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nicholas Fraser &lt;nfraser@codeweavers.com&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Paul Clarke &lt;pc@us.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Riccardo Mancini &lt;rickyman7@gmail.com&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: ShihCheng Tu &lt;mrtoastcheng@gmail.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Sumanth Korikkar &lt;sumanthk@linux.ibm.com&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Cc: Wan Jiabing &lt;wanjiabing@vivo.com&gt;
Cc: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/r/20211015172132.1162559-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf report: Add support to print a textual representation of IBS raw sample data</title>
<updated>2021-09-10T21:15:21Z</updated>
<author>
<name>Kim Phillips</name>
<email>kim.phillips@amd.com</email>
</author>
<published>2021-08-17T22:15:09Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=291dcb98d7ee5cd719f4c5991d977794b1829c16'/>
<id>urn:sha1:291dcb98d7ee5cd719f4c5991d977794b1829c16</id>
<content type='text'>
Perf records IBS (Instruction Based Sampling) extra sample data when
'perf record --raw-samples' is used with an IBS-compatible event, on a
machine that supports IBS.  IBS support is indicated in
CPUID_Fn80000001_ECX bit #10.

Up until now, users have been able to see the extra sample data solely
in raw hex format using 'perf report --dump-raw-trace'.  From there,
users could decode the data either manually, or by using an external
script.

Enable the built-in 'perf report --dump-raw-trace' to do the decoding of
the extra sample data bits, so manual or external script decoding isn't
necessary.

Example usage:

  $ sudo perf record -c 10000001 -a --raw-samples -e ibs_fetch/rand_en=1/,ibs_op/cnt_ctl=1/ -C 0,1 taskset -c 0,1 7za b -mmt2 | perf report --dump-raw-trace

Stdout contains IBS Fetch samples, e.g.:

  ibs_fetch_ctl:	02170007ffffffff MaxCnt 1048560 Cnt 1048560 Lat     7 En 1 Val 1 Comp 1 IcMiss 0 PhyAddrValid 1 L1TlbPgSz 4KB L1TlbMiss 0 L2TlbMiss 0 RandEn 1 L2Miss 0
  IbsFetchLinAd:	000056016b2ead40
  IbsFetchPhysAd:	000000115cedfd40
  c_ibs_ext_ctl:	0000000000000000 IbsItlbRefillLat   0

..and IBS Op samples, e.g.:

  ibs_op_ctl:	0000009e009e8968 MaxCnt  10000000 En 1 Val 1 CntCtl 1=uOps CurCnt       158
  IbsOpRip:	000056016b2ea73d
  ibs_op_data:	00000000000b0002 CompToRetCtr     2 TagToRetCtr    11 BrnRet 0  RipInvalid 0 BrnFuse 0 Microcode 0
  ibs_op_data2:	0000000000000002 CacheHitSt 0=M-state RmtNode 0 DataSrc 2=Local node cache
  ibs_op_data3:	0000000000c60002 LdOp 0 StOp 1 DcL1TlbMiss 0 DcL2TlbMiss 0 DcL1TlbHit2M 0 DcL1TlbHit1G 0 DcL2TlbHit2M 0 DcMiss 0 DcMisAcc 0 DcWcMemAcc 0 DcUcMemAcc 0 DcLockedOp 0 DcMissNoMabAlloc 0 DcLinAddrValid 1 DcPhyAddrValid 1 DcL2TlbHit1G 0 L2Miss 0 SwPf 0 OpMemWidth  4 bytes OpDcMissOpenMemReqs  0 DcMissLat     0 TlbRefillLat     0
  IbsDCLinAd:	00007f133c319ce0
  IbsDCPhysAd:	0000000270485ce0

Committer notes:

Fixed up this:

  util/amd-sample-raw.c: In function ‘evlist__amd_sample_raw’:
  util/amd-sample-raw.c:125:42: error: ‘ bytes’ directive output may be truncated writing 6 bytes into a region of size between 4 and 7 [-Werror=format-truncation=]
    125 |                          " OpMemWidth %2d bytes", 1 &lt;&lt; (reg.op_mem_width - 1));
        |                                          ^~~~~~
  In file included from /usr/include/stdio.h:866,
                   from util/amd-sample-raw.c:7:
  /usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 21 and 24 bytes into a destination of size 21
     71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     72 |                                    __glibc_objsize (__s), __fmt,
        |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     73 |                                    __va_arg_pack ());
        |                                    ~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

As that %2d won't limit the number of chars to 2, just state that 2 is
the minimal width:

  $ cat printf.c
  #include &lt;stdio.h&gt;
  #include &lt;stdlib.h&gt;

  int main(int argc, char *argv[])
  {
  	char bf[64];
  	int len = snprintf(bf, sizeof(bf), "%2d", atoi(argv[1]));

  	printf("strlen(%s): %u\n", bf, len);

  	return 0;
  }
  $ ./printf 1
  strlen( 1): 2
  $ ./printf 12
  strlen(12): 2
  $ ./printf 123
  strlen(123): 3
  $ ./printf 1234
  strlen(1234): 4
  $ ./printf 12345
  strlen(12345): 5
  $ ./printf 123456
  strlen(123456): 6
  $

And since we probably don't want that output to be truncated, just
assume the worst case, as the compiler did, and add a few more chars to
that buffer.

Also use sizeof(var) instead of sizeof(dup-of-wanted-format-string) to
avoid bugs when changing one but not the other.

I also had to change this:

  -#include &lt;asm/amd-ibs.h&gt;
  +#include "../../arch/x86/include/asm/amd-ibs.h"

To make it build on other architectures, just like intel-pt does.

Signed-off-by: Kim Phillips &lt;kim.phillips@amd.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Joao Martins &lt;joao.m.martins@oracle.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https //lore.kernel.org/r/20210817221509.88391-4-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Enable BPF counter with --for-each-cgroup</title>
<updated>2021-07-05T17:16:57Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2021-07-01T21:12:27Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=944138f048f7d7591ec7568c94b21de8df2724d4'/>
<id>urn:sha1:944138f048f7d7591ec7568c94b21de8df2724d4</id>
<content type='text'>
Recently bperf was added to use BPF to count perf events for various
purposes.  This is an extension for the approach and targetting to
cgroup usages.

Unlike the other bperf, it doesn't share the events with other
processes but it'd reduce unnecessary events (and the overhead of
multiplexing) for each monitored cgroup within the perf session.

When --for-each-cgroup is used with --bpf-counters, it will open
cgroup-switches event per cpu internally and attach the new BPF
program to read given perf_events and to aggregate the results for
cgroups.  It's only called when task is switched to a task in a
different cgroup.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf script: Add API for filtering via dynamically loaded shared object</title>
<updated>2021-07-01T19:14:37Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2021-06-27T13:18:09Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=291961fc3c28b4c1acfc3b03559aa14c286a6b0d'/>
<id>urn:sha1:291961fc3c28b4c1acfc3b03559aa14c286a6b0d</id>
<content type='text'>
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf parse-events: Add bison --file-prefix-map option</title>
<updated>2021-05-27T16:55:28Z</updated>
<author>
<name>Denys Zagorui</name>
<email>dzagorui@cisco.com</email>
</author>
<published>2021-05-24T11:15:14Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6793672accf380f895b2dc39eff90c7f0cc99fb6'/>
<id>urn:sha1:6793672accf380f895b2dc39eff90c7f0cc99fb6</id>
<content type='text'>
During a perf build with O= bison stores full paths in generated files
and those paths are stored in resulting perf binary.

Starting from bison v3.7.1 those paths can be remapped by using the
--file-prefix-map option.  Use this option if possible to make perf
binary more reproducible.

Signed-off-by: Denys Zagorui &lt;dzagorui@cisco.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20210524111514.65713-3-dzagorui@cisco.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
