aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/util/machine.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-05-15perf machine: Null-terminate version char array upon fgets(/proc/version) errorDonald Yandt1-1/+2
If fgets() fails due to any other error besides end-of-file, the version char array may not even be null-terminated. Signed-off-by: Donald Yandt <donald.yandt@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Avi Kivity <avi@scylladb.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Fixes: a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance from host") Link: http://lkml.kernel.org/r/20190514110100.22019-1-donald.yandt@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf machine: Update kernel map address and re-order properlyWei Li1-12/+20
Since commit 1fb87b8e9599 ("perf machine: Don't search for active kernel start in __machine__create_kernel_maps"), the __machine__create_kernel_maps() just create a map what start and end are both zero. Though the address will be updated later, the order of map in the rbtree may be incorrect. The commit ee05d21791db ("perf machine: Set main kernel end address properly") fixed the logic in machine__create_kernel_maps(), but it's still wrong in function machine__process_kernel_mmap_event(). To reproduce this issue, we need an environment which the module address is before the kernel text segment. I tested it on an aarch64 machine with kernel 4.19.25: [root@localhost hulk]# grep _stext /proc/kallsyms ffff000008081000 T _stext [root@localhost hulk]# grep _etext /proc/kallsyms ffff000009780000 R _etext [root@localhost hulk]# tail /proc/modules hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000 nvme_core 126976 7 nvme, Live 0xffff0000018b6000 mdio 20480 1 ixgbe, Live 0xffff0000018ab000 hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000 hns_mdio 20480 2 - Live 0xffff000001822000 hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000 dm_mirror 40960 0 - Live 0xffff000001804000 dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000 dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000 dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000 [root@localhost hulk]# Before fix: [root@localhost bin]# perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost bin]# perf buildid-list -i perf.data 4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso] [root@localhost bin]# perf buildid-list -i perf.data -H 0000000000000000000000000000000000000000 /proc/kcore [root@localhost bin]# After fix: [root@localhost tools]# ./perf/perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost tools]# ./perf/perf buildid-list -i perf.data 28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms] 106c14ce6e4acea3453e484dc604d66666f08a2f [vdso] [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H 28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06perf tools: Add missing include for symbols.hArnaldo Carvalho de Melo1-0/+1
Several places were using definitions found in symbols.h but not including it, getting it by sheer luck from some other headers that now are in the process of removing that include because they don't need it or because simply having struct forward declarations is enough, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-25perf machine: Use cached rbtreesDavidlohr Bueso1-22/+31
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something required for nearly every operation dealing with machine->guests and threads->entries. The conversion is straightforward, however, it's worth noticing that the rb_erase_init() calls have been replaced by rb_erase_cached() which has no _init() flavor, however, the node is explicitly cleared next anyway, which was redundant until now. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181206191819.30182-3-dave@stgolabs.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21perf tools: Handle PERF_RECORD_BPF_EVENTSong Liu1-0/+3
This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to turn it on. Committer notes: Add dummy machine__process_bpf_event() variant that returns zero for systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking the build in such systems. Remove the needless include <machine.h> from bpf->event.h, provide just forward declarations for the structs and unions in the parameters, to reduce compilation time and needless rebuilds when machine.h gets changed. Committer testing: When running with: # perf record --bpf-event On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL is not present, we fallback to removing those two bits from perf_event_attr, making the tool to continue to work on older kernels: perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off bpf_event ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off ksymbol ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ------------------------------------------------------------ And then proceeds to work without those two features. As passing --bpf-event is an explicit action performed by the user, perhaps we should emit a warning telling that the kernel has no such feature, but this can be done on top of this patch. Now with a kernel that supports these events, start the 'record --bpf-event -a' and then run 'perf trace sleep 10000' that will use the BPF augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus should generate PERF_RECORD_BPF_EVENT events: [root@quaco ~]# perf record -e dummy -a --bpf-event ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.713 MB perf.data ] [root@quaco ~]# bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 16: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 17: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 18: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 21: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 22: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 31: tracepoint name sys_enter tag 12504ba9402f952f gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 512B jited 374B memlock 4096B map_ids 30,29,28 32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 256B jited 191B memlock 4096B map_ids 30,29 # perf report -D | grep PERF_RECORD_BPF_EVENT | nl 1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13 2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14 3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15 4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16 5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17 6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18 7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21 8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22 9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29 10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29 11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30 12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30 13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31 14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32 # There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21perf tools: Handle PERF_RECORD_KSYMBOLSong Liu1-0/+55
This patch handles PERF_RECORD_KSYMBOL in perf record/report. Specifically, map and symbol are created for ksymbol register, and removed for ksymbol unregister. This patch also sets perf_event_attr.ksymbol properly. The flag is ON by default. Committer notes: Use proper inttypes.h for u64, fixing the build in some environments like in the android NDK r15c targetting ARM 32-bit. I.e. fixing this build error: util/event.c: In function 'perf_event__fprintf_ksymbol': util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] event->ksymbol_event.flags, event->ksymbol_event.name); ^ cc1: all warnings being treated as errors Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-6-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04perf report: Fix wrong iteration count in --branch-historyJin Yao1-1/+1
By calculating the removed loops, we can get the iteration count. But the iteration count could be reported incorrectly, reporting impossibly high counts. That's because previous code uses the number of removed LBR entries for the iteration count. That's not good. Fix this by increasing the iteration count when a loop is detected. When matching the chain, the iteration count would be added up, finally we need to compute the average value when printing out. For example, $ perf report --branch-history --stdio --no-children Before: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:2968 avg_cycles:3) | f1 +0 | main +22 (cycles:1 iter:2968 avg_cycles:3) | main +17 | main +38 (cycles:1 iter:2968 avg_cycles:3) 2968 is an impossible high iteration count and avg_cycles is too small. After: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:1 avg_cycles:23) | f1 +0 | main +22 (cycles:1 iter:1 avg_cycles:23) | main +17 | main +38 (cycles:1 iter:1 avg_cycles:23) avg_cycles:23 is the average cycles of this iteration. Fixes: c4ee06251d42 ("perf report: Calculate the average cycles of iterations") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17perf tools: Allow specifying proc-map-timeout in config fileMark Drayton1-3/+1
The default timeout of 500ms for parsing /proc/<pid>/maps files is too short for profiling many of our services. This can be overridden by passing --proc-map-timeout to the relevant command but it'd be nice to globally increase our default value. This patch permits setting a different default with the core.proc-map-timeout config file parameter. Signed-off-by: Mark Drayton <mbd@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17perf tools: Fix diverse comment typosIngo Molnar1-1/+1
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry-picking and/or backporting, split this into multiple patches. Just typos in comments, no need to backport, reducing the possibility of possible backporting artifacts. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17perf thread: Add fallback functions for cases where cpumode is insufficientAdrian Hunter1-0/+27
For branch stacks or branch samples, the sample cpumode might not be correct because it applies only to the sample 'ip' and not necessary to 'addr' or branch stack addresses. Add fallback functions that can be used to deal with those cases Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31perf tools: Don't clone maps from parent when synthesizing forksDavid Miller1-1/+18
When synthesizing FORK events, we are trying to create thread objects for the already running tasks on the machine. Normally, for a kernel FORK event, we want to clone the parent's maps because that is what the kernel just did. But when synthesizing, this should not be done. If we do, we end up with overlapping maps as we process the sythesized MMAP2 events that get delivered shortly thereafter. Use the FORK event misc flags in an internal way to signal this situation, so we can elide the map clone when appropriate. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Don Zickus <dzickus@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Mario <jmario@redhat.com> Link: http://lkml.kernel.org/r/20181030.222404.2085088822877051075.davem@davemloft.net [ Added comment about flag use in machine__process_fork_event(), use ternary op in thread__clone_map_groups() as suggested by Jiri ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}David S. Miller1-1/+34
When processing using 'perf report -g caller', which is the default, we ended up reverting the callchain entries received from the kernel, but simply reverting throws away the information that tells that from a point onwards the addresses are for userspace, kernel, guest kernel, guest user, hypervisor. The idea is that if we are walking backwards, for each cluster of non-cpumode entries we have to first scan backwards for the next one and use that for the cluster. This seems silly and more expensive than it needs to be but it is enough for a initial fix. The code here is really complicated because it is intimately intertwined with the lbr and branch handling, as well as this callchain order, further fixes will be needed to properly take into account the cpumode in those cases. Another problem with ORDER_CALLER is that the NULL "0" IP that is at the end of most callchains shows up at the top of the histogram because every callchain contains it and with ORDER_CALLER it is the first entry. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Souvik Banerjee <souvik1997@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org # 4.19 Link: https://lkml.kernel.org/n/tip-2wt3ayp6j2y2f2xowixa8y6y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-05perf record: Use unmapped IP for inline callchain cursorsMilian Wolff1-1/+2
Only use the mapped IP to find inline frames, but keep using the unmapped IP for the callchain cursor. This ensures we properly show the unmapped IP when displaying a frame we received via the dso__parse_addr_inlines API for a module which does not contain sufficient debug symbols to show the srcline. This is another follow-up to commit 19610184693c ("perf script: Show virtual addresses instead of offsets"). Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Fixes: 19610184693c ("perf script: Show virtual addresses instead of offsets") Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wolff@kdab.com Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wolff@kdab.com [ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-27perf report: Don't try to map ip to invalid mapMilian Wolff1-2/+3
Fixes a crash when the report encounters an address that could not be associated with an mmaped region: #0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329 #1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329 #2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586 #3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703 #4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725 #5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351 #6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378 #7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750, max_stack=<optimized out>) at util/callchain.c:1085 Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding") Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-20perf tools: Store compression id into struct dsoJiri Olsa1-1/+3
Add comp to 'struct dso' to hold the compression index. It will be used in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf machine: Use last_match threads cache only in single thread modeJiri Olsa1-3/+22
There's an issue with using threads::last_match in multithread mode which is enabled during the perf top synthesize. It might crash with following assertion: perf: ...include/linux/refcount.h:109: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. The gdb backtrace looks like this: 0x00007ffff50839fb in raise () from /lib64/libc.so.6 (gdb) #0 0x00007ffff50839fb in raise () from /lib64/libc.so.6 #1 0x00007ffff5085800 in abort () from /lib64/libc.so.6 #2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70) at ...include/linux/refcount.h:109 #5 0x0000000000536771 in thread__get (thread=0x7fffe8009a40) at util/thread.c:115 #6 0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38, threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432 #7 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38, pid=2, tid=2) at util/machine.c:489 #8 0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38, pid=2, tid=2) at util/machine.c:499 #9 0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38, ... The failing assertion is this one: REFCOUNT_WARN(!refcount_inc_not_zero(r), ... the problem is that we don't serialize access to threads::last_match. We serialize the access to the threads tree, but we don't care how's threads::last_match being accessed. Both locked/unlocked paths use that data and can set it. In multithreaded mode we can end up with invalid object in thread__get call, like in following paths race: thread 1 ... machine__findnew_thread down_write(&threads->lock); __machine__findnew_thread ____machine__findnew_thread th = threads->last_match; if (th->tid == tid) { thread__get thread 2 ... machine__find_thread down_read(&threads->lock); __machine__findnew_thread ____machine__findnew_thread th = threads->last_match; if (th->tid == tid) { thread__get thread 3 ... machine__process_fork_event machine__remove_thread __machine__remove_thread threads->last_match = NULL thread__put thread__put Thread 1 and 2 might got stale last_match, before thread 3 clears it. Thread 1 and 2 then race with thread 3's thread__put and they might trigger the refcnt == 0 assertion above. The patch is disabling the last_match cache for multiple thread mode. It was originally meant for single thread scenarios, where it's common to have multiple sequential searches of the same thread. In multithread mode this does not make sense, because top's threads processes different /proc entries and so the 'struct threads' object is queried for various threads. Moreover we'd need to add more locks to make it work. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf machine: Add threads__set_last_match functionJiri Olsa1-3/+9
Separating threads::last_match cache set into separate threads__set_last_match function. This will be useful in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf machine: Add threads__get_last_match functionJiri Olsa1-13/+26
Separating threads::last_match cache read/check into separate threads__get_last_match function. This will be useful in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf script: Show correct offsets for DWARF-based unwindingSandipan Das1-1/+8
When perf/data is recorded with the dwarf call-graph option, the callchain shown by 'perf script' still shows the binary offsets of the userspace symbols instead of their virtual addresses. Since the symbol offset calculation is based on using virtual address as the ip, we see incorrect offsets as well. The use of virtual addresses affects the ability to find out the line number in the corresponding source file to which an address maps to as described in commit 67540759151a ("perf unwind: Use addr_location::addr instead of ip for entries"). This has also been addressed by temporarily converting the virtual address to the correponding binary offset so that it can be mapped to the source line number correctly. This is a follow-up for commit 19610184693c ("perf script: Show virtual addresses instead of offsets"). This can be verified on a powerpc64le system running Fedora 27 as shown below: # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton # perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1 Before: # perf report --stdio --no-children -s sym,srcline -g address # Samples: 1 of event 'probe_libc:inet_pton' # Event count (approx.): 1 # # Overhead Symbol Source:Line # ........ .................... ........... # 100.00% [.] __GI___inet_pton inet_pton.c | ---gaih_inet getaddrinfo.c:537 (inlined) __GI_getaddrinfo getaddrinfo.c:2304 (inlined) main ping.c:519 generic_start_main libc-start.c:308 (inlined) __libc_start_main libc-start.c:102 ... # perf script -F comm,ip,sym,symoff,srcline,dso ping 15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so) libc-2.26.so[ffff80004ca0af28] 10fa53 gaih_inet+0xffff000099160f43 libc-2.26.so[ffff80004c9bfa53] (inlined) 1105b3 __GI_getaddrinfo+0xffff000099160163 libc-2.26.so[ffff80004c9c05b3] (inlined) 2d6f main+0xfffffffd9f1003df (/usr/bin/ping) ping[fffffffecf882d6f] 2369f generic_start_main+0xffff00009916013f libc-2.26.so[ffff80004c8d369f] (inlined) 23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so) libc-2.26.so[ffff80004c8d3897] After: # perf report --stdio --no-children -s sym,srcline -g address # Samples: 1 of event 'probe_libc:inet_pton' # Event count (approx.): 1 # # Overhead Symbol Source:Line # ........ .................... ........... # 100.00% [.] __GI___inet_pton inet_pton.c | ---gaih_inet.constprop.7 getaddrinfo.c:537 getaddrinfo getaddrinfo.c:2304 main ping.c:519 generic_start_main.isra.0 libc-start.c:308 __libc_start_main libc-start.c:102 ... # perf script -F comm,ip,sym,symoff,srcline,dso ping 7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) inet_pton.c:68 7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so) getaddrinfo.c:537 7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so) getaddrinfo.c:2304 130782d6f main+0x3df (/usr/bin/ping) ping.c:519 7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so) libc-start.c:308 7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so) libc-start.c:102 Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: 67540759151a ("perf unwind: Use addr_location::addr instead of ip for entries") Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Synthesize and process mmap events for x86 PTI entry trampolinesAdrian Hunter1-0/+28
Like the kernel text, the location of x86 PTI entry trampolines must be recorded in the perf.data file. Like the kernel, synthesize a mmap event for that, and add processing for it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Create maps for x86 PTI entry trampolinesAdrian Hunter1-19/+47
Create maps for x86 PTI entry trampolines, based on symbols found in kallsyms. It is also necessary to keep track of whether the trampolines have been mapped particularly when the kernel dso is kcore. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com [ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Allow for extra kernel mapsAdrian Hunter1-2/+6
Identify extra kernel maps by name so that they can be distinguished from the kernel map and module maps. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Workaround missing maps for x86 PTI entry trampolinesAdrian Hunter1-0/+96
On x86_64 the PTI entry trampolines are not in the kernel map created by perf tools. That results in the addresses having no symbols and prevents annotation. It also causes Intel PT to have decoding errors at the trampoline addresses. Workaround that by creating maps for the trampolines. At present the kernel does not export information revealing where the trampolines are. Until that happens, the addresses are hardcoded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Add nr_cpus_avail()Adrian Hunter1-0/+5
Add a function to return the number of the machine's available CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf tools: Fix kernel_start for PTI on x86Adrian Hunter1-1/+6
Opickn x86_64, PTI entry trampolines are less than the start of kernel text, but still above 2^63. So leave kernel_start = 1ULL << 63 for x86_64. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526548928-20790-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf machine: Add machine__is() to identify machine archAdrian Hunter1-0/+9
Add a function to identify the machine architecture. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526548928-20790-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf script: Show virtual addresses instead of offsetsSandipan Das1-1/+1
When perf data is recorded with the call-graph option enabled, the callchain shown by perf script shows the binary offsets of the symbols as the ip. This is incorrect for kernel symbols as the ip values are always off by a fixed offset depending on the architecture. If the offsets from the start of the symbols are printed, they are also incorrect for both kernel and userspace symbols. Without the call-graph option, the callchain shows the virtual addresses of the symbols rather than their binary offsets. The offsets printed in this case are also correct. This fixes the inconsistency in perf script's output. This can be verified on a powerpc64le system running Fedora 27 as follows: # cat /proc/kallsyms | grep sys_write ... c0000000004025a0 T sys_write c0000000004025a0 T __se_sys_write ... # perf probe -a sys_write Before applying this patch: # perf record -e probe:sys_write -g ~/test # perf script -F ip,sym,symoff 4125b0 sys_write+0x8000000000008010 1b9e0 system_call+0x8000000000008058 118234 __GI___libc_write+0xffff0000f52c0024 92c74 _IO_file_write@@GLIBC_2.17+0xffff0000f52c0044 5afbfd8a [unknown] 91a60 new_do_write+0xffff0000f52c0090 94638 _IO_do_write@@GLIBC_2.17+0xffff0000f52c0038 94bbc _IO_file_overflow@@GLIBC_2.17+0xffff0000f52c014c 95a24 __overflow+0xffff0000f52c0064 84548 _IO_puts+0xffff0000f52c0218 440 main+0xffffffffe0000020 236a0 generic_start_main.isra.0+0xffff0000f52c0140 23898 __libc_start_main+0xffff0000f52c00b8 0 [unknown] ... # perf record -e probe:sys_write ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 ... After applying this patch: # perf record -e probe:sys_write -g ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 c00000000000b9e0 system_call+0x58 7fffb70d8234 __GI___libc_write+0x24 7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44 5afc1818 [unknown] 7fffb7051a60 new_do_write+0x90 7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38 7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c 7fffb7055a24 __overflow+0x64 7fffb7044548 _IO_puts+0x218 10000440 main+0x20 7fffb6fe36a0 generic_start_main.isra.0+0x140 7fffb6fe3898 __libc_start_main+0xb8 0 [unknown] ... # perf record -e probe:sys_write ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 ... Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20180517063326.6319-1-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-30perf machine: Ditch find_kernel_function variantsArnaldo Carvalho de Melo1-1/+1
Since we do not have split symtabs anymore, no need to have explicit find_kernel_function variants, use the find_kernel_symbol ones. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hiw2ryflju000f6wl62128it@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27perf symbols: Unify symbol mapsArnaldo Carvalho de Melo1-80/+44
Remove the split of symbol tables for data (MAP__VARIABLE) and for functions (MAP__FUNCTION), its unneeded and there were various places doing two lookups to find a symbol, so simplify this. We still will consider only the symbols that matched the filters in place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in the patch, just so that we consider only the same symbols as before, to reduce the possibility of regressions. All the tests on 50-something build environments, in varios versions of lots of distros and cross build environments were performed without build regressions, as usual with all pull requests the other tests were also performed: 'perf test' and 'make -C tools/perf build-test'. Also this was done at a great granularity so that regressions can be bisected more easily. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf machine: Set PROT_EXEC for executable PERF_RECORD_MMAP recordsArnaldo Carvalho de Melo1-2/+6
The kernel doesn't fill the map 'prot' field for PERF_RECORD_MMAP records, and we will use that info to replace checking for MAP__VARIABLE, so store that when processing the PERF_RECORD_MISC_MMAP_DATA perf_event_attr.header.misc bit. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-es3zz9r0q2qlssg4wh1w1d8p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf thread: Ditch __thread__find_symbol()Arnaldo Carvalho de Melo1-9/+1
Simulate having all symbols in just one tree by searching the still existing two trees. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-uss70e8tvzzbzs326330t83q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf machine: Use machine__find_kernel_function() instead of open coded versionArnaldo Carvalho de Melo1-1/+1
We have that equivalent, shorter helper, use it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-1hcgu3k7vxdy4vknqf3kbtzt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf thread: Remove addr_type arg from thread__find_cpumode_addr_location()Arnaldo Carvalho de Melo1-3/+2
All callers are for MAP__FUNCTION, so just ditch it and use thread__find_symbol(), that already ditched MAP__FUNCTION, i.e. internally uses it till we ditch it for good. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-i0ocxs00b4a0tlrx31lyh2cs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf machine: Remove needless map_type from machine__load_vmlinux_path()Arnaldo Carvalho de Melo1-2/+2
Since it uses machine__kernel_map() and this function always returns the MAP__FUNCTION map, it doesn't make sense to call it with MAP__VARIABLE. And also this is a step in the direction of nuking the MAP__{FUNCTION,VARIABLE} split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0h3eof3kx3kq32ixg5fquf3p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf machine: Shorten machine__load_kallsyms() signatureArnaldo Carvalho de Melo1-5/+3
So far the only use is for MAP__FUNCTION, and since we're going to remove that split, remove the map_type argument in machine__load_kallsyms(). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-5dhgh7x8g9hx5hpxlp3k08jp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf map: Shorten map_groups__find_by_name() signatureArnaldo Carvalho de Melo1-4/+2
Another step in the road to elliminate the MAP_{FUNCTION,VARIABLE} separation, reducing the exposure to these details in the tools using the symbol APIs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-8a1hvrqe3r5i0kw865u3uxwt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26perf thread: Introduce thread__find_symbol()Arnaldo Carvalho de Melo1-4/+3
Out of thread__find_addr_location(..., MAP__FUNCTION, ...), idea here is to continue removing references to MAP__{FUNCTION,VARIABLE} ahead of getting both types of symbols in the same rbtree, as various places do two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE. So thread__find_symbol() will eventually do just that, and 'struct symbol' will have the symbol type, for code that cares about that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-n7528en9e08yd3flzmb26tth@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-23perf machine: Set main kernel end address properlyNamhyung Kim1-12/+18
map_groups__fixup_end() was called to set the end addresses of kernel and module maps. But now since machine__create_modules() sets the end address of modules properly, the only remaining piece is the kernel map. We can set it with adjacent module's address directly instead of calling map_groups__fixup_end(). If there's no module after the kernel map, the end address will be ~0ULL. Since it also changes the start address of the kernel map, it needs to re-insert the map to the kmaps in order to keep a correct ordering. Kim reported that it caused problems on ARM64. Reported-by: Kim Phillips <kim.phillips@arm.com> Tested-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20180419235915.GA19067@sejong Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-16perf machine: Fix mmap name setupJiri Olsa1-15/+13
Leo reported broken -k option behavior. The reason is that we used symbol_conf.vmlinux_name as a source for mmap event name, but in fact it's a vmlinux path. Moving the symbol_conf.vmlinux_name check for both host and guest to the proper place and out of the machine__set_mmap_name function. Reported-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: commit ("8c7f1bb37b29 perf machine: Move kernel mmap name into struct machine") Link: http://lkml.kernel.org/r/20180312152406.10141-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf tools: Add refcnt into struct mem_infoJiri Olsa1-1/+1
It's passed along several hists entries in --hierarchy mode, so it's better we keep track of it. The current fail I see is that it gets removed in hierarchy --mem-mode mode, where it's shared in the different hierarchies, but removed from the template hist entry, so the report crashes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-6-jolsa@kernel.org [ Rename mem_info__aloc() to mem_info__new(), to fix the typo and use the convention for constructors ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19perf machine: Fix paranoid check in machine__set_kernel_mmap()Namhyung Kim1-1/+1
The machine__set_kernel_mmap() is to setup addresses of the kernel map using external info. But it has a check when the address is given from an incorrect input which should have the start and end address of 0 (i.e. machine__process_kernel_mmap_event). But we also use the end address of 0 for a valid input so change it to check both start and end addresses. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20180219101936.GD1583@sejong Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf machine: Remove machine__load_kallsyms()Jiri Olsa1-10/+4
The current machine__load_kallsyms() function has no caller, so replace it directly with __machine__load_kallsyms(). Also remove the no_kcore argument as it was always called with a 'true' value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180215122635.24029-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf machine: Don't search for active kernel start in __machine__create_kernel_mapsJiri Olsa1-29/+26
We should not search for the kernel start address in __machine__create_kernel_maps(), because it's being used in the 'report' code path, where we are interested in kernel MMAP data address (the one recorded via 'perf record', possibly on another machine, or an older or newer kernel on the same machine where analysis is being performed) instead of in current kernel address. The __machine__create_kernel_maps() function serves purely for creating the machines kernel maps and setting up the kmap group. The report code path then sets the address based on the data from kernel MMAP event in the machine__set_kernel_mmap() function. The kallsyms search address logic is used for test code, that calls machine__create_kernel_maps() to get current maps and calls machine__get_running_kernel_start() to get kernel starting address. Use machine__set_kernel_mmap() to set the kernel maps start address and moving map_groups__fixup_end to be call when all maps are in place. Also make __machine__create_kernel_maps static, because there's no external user. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180215122635.24029-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf machine: Generalize machine__set_kernel_mmap()Jiri Olsa1-6/+7
So it could be called without event object, just with start and end values. It will be used in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180215122635.24029-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf machine: Move kernel mmap name into struct machineJiri Olsa1-34/+33
It simplifies and centralizes the code. The kernel mmap name is set for machine type, which we know from the beginning, so there's no reason to generate it every time we need it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180215122635.24029-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf machine: Free root_dir in machine__init() error pathJiri Olsa1-1/+7
Free root_dir in machine__init() error path. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180215122635.24029-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-08perf report: Fix a wrong offset issue when using /proc/kcoreJin Yao1-1/+1
When a valid vmlinux is not found, 'perf report' falls back to look at /proc/kcore. In this case, it will report the impossible large offset. For example: # perf record -b -e cycles:k find /etc/ > /dev/null # perf report --stdio --branch-history 22.77% _vm_normal_page+18446603336221188162 | ---page_remove_rmap +18446603336221188324 page_remove_rmap +18446603336221188487 (cycles:5) unlock_page_memcg +18446603336221188096 page_remove_rmap +18446603336221188327 (cycles:1) The issue is the value which is passed to parameter 'addr' in __get_srcline() is the objdump address. It's not correct if we calculate the offset by using 'addr - sym->start'. This patch creates a new parameter 'ip' in __get_srcline(). It is not converted to objdump address. With this patch, the perf report output is: 22.77% _vm_normal_page+66 | ---page_remove_rmap +228 page_remove_rmap +391 (cycles:5) unlock_page_memcg +0 page_remove_rmap +231 (cycles:1) page_remove_rmap +236 Committer testing: Make sure you get any valid vmlinux out of the way, using '-v' on the 'perf report' case and deleting it from places where perf searches them, like your kernel build dir and the build-id cache, in ~/.debug/. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf callchain: Reset cursor arg instead of callchain_cursorJiri Olsa1-1/+1
We already pass cursor into thread__resolve_callchain function, so there's no point in resetting the global instance. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-puk015qvuppao9m1xtdy9v7j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf machine: Guard against NULL in machine__exit()Arnaldo Carvalho de Melo1-0/+3
A recent fix for 'perf trace' introduced a bug where machine__exit(trace->host) could be called while trace->host was still NULL, so make this more robust by guarding against NULL, just like free() does. The problem happens, for instance, when !root users try to run 'perf trace': [acme@jouet linux]$ trace Error: No permissions to read /sys/kernel/debug/tracing/events/raw_syscalls/sys_(enter|exit) Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing' perf: Segmentation fault Obtained 7 stack frames. [0x4f1b2e] /lib64/libc.so.6(+0x3671f) [0x7f43a1dd971f] [0x4f3fec] [0x47468b] [0x42a2db] /lib64/libc.so.6(__libc_start_main+0xe9) [0x7f43a1dc3509] [0x42a6c9] Segmentation fault (core dumped) [acme@jouet linux]$ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrei Vagin <avagin@openvz.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 33974a414ce2 ("perf trace: Call machine__exit() at exit") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-07Merge branch 'linus' into perf/core, to fix conflictsIngo Molnar1-0/+1
Conflicts: tools/perf/arch/arm/annotate/instructions.c tools/perf/arch/arm64/annotate/instructions.c tools/perf/arch/powerpc/annotate/instructions.c tools/perf/arch/s390/annotate/instructions.c tools/perf/arch/x86/tests/intel-cqm.c tools/perf/ui/tui/progress.c tools/perf/util/zlib.c Signed-off-by: Ingo Molnar <mingo@kernel.org>