path: root/tools/perf
Age | Commit message | Author | Files | Lines
2020-01-28 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next | Linus Torvalds | 6 | -6/+6
Pull networking updates from David Miller: 1) Add WireGuard 2) Add HE and TWT support to ath11k driver, from John Crispin. 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca. 4) Add variable window congestion control to TIPC, from Jon Maloy. 5) Add BCM84881 PHY driver, from Russell King. 6) Start adding netlink support for ethtool operations, from Michal Kubecek. 7) Add XDP drop and TX action support to ena driver, from Sameeh Jubran. 8) Add new ipv4 route notifications so that mlxsw driver does not have to handle identical routes itself. From Ido Schimmel. 9) Add BPF dynamic program extensions, from Alexei Starovoitov. 10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes. 11) Add support for macsec HW offloading, from Antoine Tenart. 12) Add initial support for MPTCP protocol, from Christoph Paasch, Matthieu Baerts, Florian Westphal, Peter Krystad, and many others. 13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu Cherian, and others. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits) net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC udp: segment looped gso packets correctly netem: change mailing list qed: FW 8.42.2.0 debug features qed: rt init valid initialization changed qed: Debug feature: ilt and mdump qed: FW 8.42.2.0 Add fw overlay feature qed: FW 8.42.2.0 HSI changes qed: FW 8.42.2.0 iscsi/fcoe changes qed: Add abstraction for different hsi values per chip qed: FW 8.42.2.0 Additional ll2 type qed: Use dmae to write to widebus registers in fw_funcs qed: FW 8.42.2.0 Parser offsets modified qed: FW 8.42.2.0 Queue Manager changes qed: FW 8.42.2.0 Expose new registers and change windows qed: FW 8.42.2.0 Internal ram offsets modifications MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver Documentation: net: octeontx2: Add RVU HW and drivers overview octeontx2-pf: ethtool RSS config support octeontx2-pf: Add basic ethtool support ...
2020-01-28 | Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 68 | -3999/+388
Pull perf updates from Ingo Molnar: "Kernel side changes: - Ftrace is one of the last W^X violators (after this only KLP is left). These patches move it over to the generic text_poke() interface and thereby get rid of this oddity. This requires a surprising amount of surgery, by Peter Zijlstra. - x86/AMD PMUs: add support for 'Large Increment per Cycle Events' to count certain types of events that have a special, quirky hw ABI (by Kim Phillips) - kprobes fixes by Masami Hiramatsu Lots of tooling updates as well, the following subcommands were updated: annotate/report/top, c2c, clang, record, report/top TUI, sched timehist, tests; plus updates were done to the gtk ui, libperf, headers and the parser" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) perf/x86/amd: Add support for Large Increment per Cycle Events perf/x86/amd: Constrain Large Increment per Cycle events perf/x86/intel/rapl: Add Comet Lake support tracing: Initialize ret in syscall_enter_define_fields() perf header: Use last modification time for timestamp perf c2c: Fix return type for histogram sorting comparision functions perf beauty sockaddr: Fix augmented syscall format warning perf/ui/gtk: Fix gtk2 build perf ui gtk: Add missing zalloc object perf tools: Use %define api.pure full instead of %pure-parser libperf: Setup initial evlist::all_cpus value perf report: Fix no libunwind compiled warning break s390 issue perf tools: Support --prefix/--prefix-strip perf report: Clarify in help that --children is default tools build: Fix test-clang.cpp with Clang 8+ perf clang: Fix build with Clang 9 kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic tools lib: Fix builds when glibc contains strlcpy() perf report/top: Make 'e' visible in the help and make it toggle showing callchains perf report/top: Do not offer annotation for symbols without samples ...
2020-01-27 | Merge tag 'timers-core-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -2/+4
Pull timer updates from Thomas Gleixner: "The timekeeping and timers department provides: - Time namespace support: If a container migrates from one host to another then it expects that clocks based on MONOTONIC and BOOTTIME are not subject to disruption. Due to different boot time and non-suspended runtime these clocks can differ significantly on two hosts, in the worst case time goes backwards which is a violation of the POSIX requirements. The time namespace addresses this problem. It allows setting offsets for clock MONOTONIC and BOOTTIME once after creation and before tasks are associated with the namespace. These offsets are taken into account by timers and timekeeping including the VDSO. Offsets for wall clock based clocks (REALTIME/TAI) are not provided by this mechanism. While in theory possible, the overhead and code complexity would be immense and not justified by the esoteric potential use cases which were discussed at Plumbers '18. The overhead for tasks in the root namespace (i.e. where host time offsets = 0) is in the noise and great effort was made to ensure that especially in the VDSO. If time namespace is disabled in the kernel configuration the code is compiled out. Kudos to Andrei Vagin and Dmitry Safonov who implemented this feature and kept on for more than a year addressing review comments, finding better solutions. A pleasant experience. - Overhaul of the alarmtimer device dependency handling to ensure that the init/suspend/resume ordering is correct. - A new clocksource/event driver for Microchip PIT64 - Suspend/resume support for the Hyper-V clocksource - The usual pile of fixes, updates and improvements mostly in the driver code" * tag 'timers-core-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) alarmtimer: Make alarmtimer_get_rtcdev() a stub when CONFIG_RTC_CLASS=n alarmtimer: Use wakeup source from alarmtimer platform device alarmtimer: Make alarmtimer platform device child of RTC device alarmtimer: Update alarmtimer_get_rtcdev() docs to reflect reality hrtimer: Add missing sparse annotation for __run_timer() lib/vdso: Only read hrtimer_res when needed in __cvdso_clock_getres() MIPS: vdso: Define BUILD_VDSO32 when building a 32bit kernel clocksource/drivers/hyper-v: Set TSC clocksource as default w/ InvariantTSC clocksource/drivers/hyper-v: Untangle stimers and timesync from clocksources clocksource/drivers/timer-microchip-pit64b: Fix sparse warning clocksource/drivers/exynos_mct: Rename Exynos to lowercase clocksource/drivers/timer-ti-dm: Fix uninitialized pointer access clocksource/drivers/timer-ti-dm: Switch to platform_get_irq clocksource/drivers/timer-ti-dm: Convert to devm_platform_ioremap_resource clocksource/drivers/em_sti: Fix variable declaration in em_sti_probe clocksource/drivers/em_sti: Convert to devm_platform_ioremap_resource clocksource/drivers/bcm2835_timer: Fix memory leak of timer clocksource/drivers/cadence-ttc: Use ttc driver as platform driver clocksource/drivers/timer-microchip-pit64b: Add Microchip PIT64B support clocksource/drivers/hyper-v: Reserve PAGE_SIZE space for tsc page ...
2020-01-23 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | David S. Miller | 6 | -6/+6
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-01-22 The following pull-request contains BPF updates for your *net-next* tree. We've added 92 non-merge commits during the last 16 day(s) which contain a total of 320 files changed, 7532 insertions(+), 1448 deletions(-). The main changes are: 1) function by function verification and program extensions from Alexei. 2) massive cleanup of selftests/bpf from Toke and Andrii. 3) batched bpf map operations from Brian and Yonghong. 4) tcp congestion control in bpf from Martin. 5) bulking for non-map xdp_redirect from Toke. 6) bpf_send_signal_thread helper from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20 | perf: Use consistent include paths for libbpf | Toke Høiland-Jørgensen | 6 | -6/+6
Fix perf to include libbpf header files with the bpf/ prefix, to be consistent with external users of the library. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560797.1683545.7685921032671386301.stgit@toke.dk
2020-01-15 | perf header: Use last modification time for timestamp | Michael Petlan | 1 | -1/+1
Using .st_ctime clobbers the timestamp information in perf report header whenever any operation is done with the file. Even tar-ing and untar-ing the perf.data file (which preserves the file last modification timestamp) doesn't prevent that: [Michael@Diego tmp]$ ls -l perf.data -> -rw-------. 1 Michael Michael 169888 Dec 2 15:23 perf.data [Michael@Diego tmp]$ perf report --header-only # ======== -> # captured on : Mon Dec 2 15:23:42 2019 [...] [Michael@Diego tmp]$ tar c perf.data | xz > perf.data.tar.xz [Michael@Diego tmp]$ mkdir aaa [Michael@Diego tmp]$ cd aaa [Michael@Diego aaa]$ xzcat ../perf.data.tar.xz | tar x [Michael@Diego aaa]$ ls -l -a total 172 drwxrwxr-x. 2 Michael Michael 23 Jan 14 11:26 . drwxrwxr-x. 6 Michael Michael 4096 Jan 14 11:26 .. -> -rw-------. 1 Michael Michael 169888 Dec 2 15:23 perf.data [Michael@Diego aaa]$ perf report --header-only # ======== -> # captured on : Tue Jan 14 11:26:16 2020 [...] When using .st_mtime instead, correct information is printed: [Michael@Diego aaa]$ ~/acme/tools/perf/perf report --header-only # ======== -> # captured on : Mon Dec 2 15:23:42 2019 [...] Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20200114104236.31555-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf c2c: Fix return type for histogram sorting comparision functions | Andres Freund | 1 | -4/+6
Commit 722ddfde366f ("perf tools: Fix time sorting") changed - correctly so - hist_entry__sort to return int64. Unfortunately several of the builtin-c2c.c comparison routines only happened to work due to the cast caused by the wrong return type. This causes meaningless ordering of both the cacheline list, and the cacheline details page. E.g. a simple: perf c2c record -a sleep 3 perf c2c report will result in a cacheline table like ================================================= Shared Data Cache Line Table ================================================= # # ------- Cacheline ---------- Total Tot - LLC Load Hitm - - Store Reference - - Load Dram - LLC Total - Core Load Hit - - LLC Load Hit - # Index Address Node PA cnt records Hitm Total Lcl Rmt Total L1Hit L1Miss Lcl Rmt Ld Miss Loads FB L1 L2 Llc Rmt # ..... .............. .... ...... ....... ...... ..... ..... ... .... ..... ...... ...... .... ...... ..... ..... ..... ... .... ....... 0 0x7f0d27ffba00 N/A 0 52 0.12% 13 6 7 12 12 0 0 7 14 40 4 16 0 0 0 1 0x7f0d27ff61c0 N/A 0 6353 14.04% 1475 801 674 779 779 0 0 718 1392 5574 1299 1967 0 115 0 2 0x7f0d26d3ec80 N/A 0 71 0.15% 16 4 12 13 13 0 0 12 24 58 1 20 0 9 0 3 0x7f0d26d3ec00 N/A 0 98 0.22% 23 17 6 19 19 0 0 6 12 79 0 40 0 10 0 i.e. with the list not being ordered by Total Hitm. Fixes: 722ddfde366f ("perf tools: Fix time sorting") Signed-off-by: Andres Freund <andres@anarazel.de> Tested-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org # v3.16+ Link: http://lore.kernel.org/lkml/20200109043030.233746-1-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
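One way a comparison routine can come to depend on a narrowing cast is sketched below. This is an illustration under assumptions, not the builtin-c2c.c code: the function names and the choice of 32-bit unsigned counters are hypothetical. A subtraction of unsigned values never yields a negative result, so the comparison only looks correct once the caller truncates it to `int`.

```c
#include <stdint.h>

/*
 * Difference of two unsigned 32-bit counters returned as int64_t.
 * The subtraction is carried out in unsigned 32-bit arithmetic, so when
 * left < right the result wraps to a large positive value before it is
 * widened. The "less than" case is therefore never reported as negative.
 */
static int64_t cmp_unsigned_diff(uint32_t left, uint32_t right)
{
	return left - right;
}

/*
 * A caller that narrows the result to int. On common ABIs (two's
 * complement, 32-bit int) the wrapped value turns back into a negative
 * number, which hides the problem above.
 */
static int cmp_narrowed_to_int(uint32_t left, uint32_t right)
{
	return (int)cmp_unsigned_diff(left, right);
}

/* Sign-safe variant: widen both operands first, then subtract. */
static int64_t cmp_widened(uint32_t left, uint32_t right)
{
	return (int64_t)left - (int64_t)right;
}
```

With inputs 1 and 2, the first function reports "greater", the narrowed caller reports "less", and the widened variant reports "less" without relying on any truncation.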
2020-01-14 | perf beauty sockaddr: Fix augmented syscall format warning | Cengiz Can | 1 | -1/+1
The sockaddr related examples given in `tools/perf/examples/bpf/augmented_syscalls.c` almost always use `long`s to represent most of their fields. However, `size_t syscall_arg__scnprintf_sockaddr(..)` has a `scnprintf` call that uses `"%#x"` as format string. This throws a warning (whenever the syscall argument is `unsigned long`). Added `l` identifier to indicate that the `arg->value` is an unsigned long. Not sure about the complications of this with x86 though. Signed-off-by: Cengiz Can <cengiz@kernel.wtf> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200113174438.102975-1-cengiz@kernel.wtf Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
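The format mismatch can be shown with standard `snprintf()`, used here as a stand-in for the tools' `scnprintf()`. The helper name below is an assumption for illustration. Passing an `unsigned long` to `"%#x"` is what triggers the compiler's format warning; the `l` length modifier makes the conversion match the argument type.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: print an unsigned long in hex into `bf`.
 * "%#lx" matches unsigned long; "%#x" would expect unsigned int and
 * makes -Wformat complain when given an unsigned long.
 * Returns the length snprintf() reports, or 0 on an encoding error.
 */
static size_t format_arg_value(char *bf, size_t size, unsigned long value)
{
	int n = snprintf(bf, size, "%#lx", value);

	return n < 0 ? 0 : (size_t)n;
}
```

Unlike `scnprintf()`, plain `snprintf()` returns the length that would have been written, so a caller with a small buffer should compare the result against `size`.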
2020-01-14 | perf/ui/gtk: Fix gtk2 build | Jiri Olsa | 1 | -1/+1
Ravi Bangoria reported an issue when doing the gtk2 feature detection on Fedora 31, where some types got deprecated: /usr/include/gtk-2.0/gtk/gtktypeutils.h:236:1: error: ‘GTypeDebugFlags’ is deprecated [-Werror=deprecated-declarations] 236 | void gtk_type_init (GTypeDebugFlags debug_flags); Fix this for perf by allowing the compile to pass with deprecated symbols via the -Wno-deprecated-declarations compiler directive. Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jelle van der Waa <jelle@vdwaa.nl> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200113104358.123511-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf ui gtk: Add missing zalloc object | Jiri Olsa | 1 | -0/+5
When we moved zalloc.o to the library we missed gtk library which needs it compiled in, otherwise the missing __zfree symbol will cause the library to fail to load. Adding the zalloc object to the gtk library build. Fixes: 7f7c536f23e6 ("tools lib: Adopt zalloc()/zfree() from tools/perf") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jelle van der Waa <jelle@vdwaa.nl> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200113104358.123511-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf tools: Use %define api.pure full instead of %pure-parser | Jiri Olsa | 2 | -2/+3
bison deprecated the "%pure-parser" directive in favor of "%define api.pure full". The api.pure got introduced in bison 2.3 (Oct 2007), so it seems safe to use it without any version check. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Clark Williams <williams@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lore.kernel.org/lkml/20200112192259.GA35080@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf report: Fix no libunwind compiled warning break s390 issue | Jin Yao | 1 | -3/+3
Commit 800d3f561659 ("perf report: Add warning when libunwind not compiled in") breaks the s390 platform. S390 uses libdw-dwarf-unwind for call chain unwinding and has no support for libunwind. So the warning "Please install libunwind development packages during the perf build." caused confusion even when the call-graph is displayed correctly. This patch adds checking for HAVE_DWARF_SUPPORT, which is set when libdw-dwarf-unwind is compiled in. Fixes: 800d3f561659 ("perf report: Add warning when libunwind not compiled in") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200107191745.18415-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf tools: Support --prefix/--prefix-strip | Andi Kleen | 8 | -2/+61
The objdump utility has useful --prefix / --prefix-strip options to allow changing source code file names hardcoded into executables' debug info. Add options to 'perf report', 'perf top' and 'perf annotate', which are then passed to objdump. $ mkdir foo $ echo 'main() { for (;;); }' > foo/foo.c $ gcc -g foo/foo.c foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int] 1 | main() { for (;;); } | ^~~~ $ perf record ./a.out ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ] $ mv foo bar $ perf annotate <does not show source code> $ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5 <does show source code> Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf report: Clarify in help that --children is default | Andi Kleen | 1 | -1/+2
Refer to --no-children, which is what most people probably want. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20200103183643.149150-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | perf clang: Fix build with Clang 9 | Maciej S. Szmigiero | 1 | -0/+4
LLVM D59377 (included in Clang 9) refactored Clang VFS construction a bit, which broke perf clang build. Let's fix it. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Reviewed-by: Dennis Schridde <devurandom@gmx.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: clang-built-linux@googlegroups.com Cc: Denis Pronin <dannftk@yandex.ru> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naohiro Aota <naota@elisp.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191228171314.946469-2-mail@maciej.szmigiero.name Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-14 | hrtimers: Prepare hrtimer_nanosleep() for time namespaces | Andrei Vagin | 1 | -2/+4
clock_nanosleep() accepts absolute values of expiration time when TIMER_ABSTIME flag is set. This absolute value is inside the task's time namespace, and has to be converted to the host's time. There is timens_ktime_to_host() helper for converting time, but it accepts ktime argument. As a preparation, make hrtimer_nanosleep() accept a clock value in ktime instead of timespec64. Co-developed-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-17-dima@arista.com
2020-01-06 | perf report/top: Make 'e' visible in the help and make it toggle showing callchains | Arnaldo Carvalho de Melo | 1 | -1/+7
The 'e' and 'c' hotkeys were present for a long time, but not documented in the help window, change 'e' to be a toggle so that it gets consistent with other toggles like '+' and document it in the help window. Keep 'c' as is for people used to it but don't document, as it is easier to just use 'e' to show/hide all the callchains for a top level histogram entry. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-pmyi5x34stlqmyu81rci94x9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf report/top: Do not offer annotation for symbols without samples | Arnaldo Carvalho de Melo | 1 | -1/+10
This can happen in the --children mode, i.e. the default mode when callchains are present, where one of the main entries may be a callchain entry with no samples. So far we were not providing any information about why an annotation couldn't be provided, even while offering the Annotation option in the popup menu. Work is needed to allow for no-samples "annotation", i.e. to show the disassembly anyway and allow for navigation, etc. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0hhzj2de15o88cguy7h66zre@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf report/top: Allow pressing hotkeys in the options popup menu | Arnaldo Carvalho de Melo | 1 | -6/+10
When the user presses ENTER in the main 'perf report/top' screen a popup menu is presented, in it some hotkeys are suggested as alternatives to using the menu, or for additional features. At that point the user may try those hotkeys, so allow for that by recording the key used and exiting, the caller then can check for that possibility and process the hotkey. I.e. try pressing ENTER, and then 'k' to exit and zoom into the kernel map, using ESC then zooms out, etc. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ujfq3fw44kf6qrtfajl5dcsp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | tools ui popup: Allow returning hotkeys | Arnaldo Carvalho de Melo | 5 | -9/+13
With this patch if an optional pointer is passed to ui__popup_menu() then when any key that is not being handled (ENTER, ESC, etc) is typed, it'll record that key in the pointer and return, allowing for hotkey processing on the caller. If NULL is passed, no change in logic, unhandled keys continue to be ignored. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-6ojn19mqzgmrdm8kdoigic0m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
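The optional out-pointer pattern described here can be sketched in isolation. The code below is not `ui__popup_menu()`: the function name, the key constants and the scripted input array are all assumptions, chosen so the logic can run without a terminal. It shows only the contract: handled keys behave as before, and an unhandled key is either reported through the pointer or ignored when the pointer is NULL.

```c
#include <stddef.h>

/* Hypothetical key codes for the sketch. */
enum {
	SKETCH_KEY_ENTER = '\n',
	SKETCH_KEY_ESC	 = 27,
};

/*
 * Hypothetical menu loop driven by a scripted sequence of key presses.
 *
 * Returns `selected` when ENTER is pressed, -1 on ESC or when the input
 * runs out. If `keyp` is non-NULL and a key the menu does not handle
 * arrives, that key is stored in *keyp and -1 is returned so the caller
 * can process it as a hotkey. If `keyp` is NULL, unhandled keys are
 * ignored, as before.
 */
static int popup_menu_sketch(const int *input, size_t n, int selected, int *keyp)
{
	for (size_t i = 0; i < n; i++) {
		int key = input[i];

		switch (key) {
		case SKETCH_KEY_ENTER:
			return selected;
		case SKETCH_KEY_ESC:
			return -1;
		default:
			if (keyp) {
				*keyp = key;	/* hand the hotkey back to the caller */
				return -1;
			}
			break;		/* no pointer given: keep waiting for input */
		}
	}
	return -1;
}
```

A caller interested in hotkeys would initialize its key variable to 0, call the menu, and then check whether the variable was set before interpreting the return value as a menu choice.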
2020-01-06 | perf hists browser: Allow passing an initial hotkey | Arnaldo Carvalho de Melo | 3 | -77/+82
Sometimes we're in an outer code, like the main hists browser popup menu and the user follows a suggestion about using some hotkey, and that hotkey is really handled by hists_browser__run(), so allow for calling it with that hotkey, making it handle it instead of waiting for the user to press one. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xv2l7i6o4urn37nv1h40ryfs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf report/top: Add 'k' hotkey to zoom directly into the kernel map | Arnaldo Carvalho de Melo | 1 | -1/+8
As a convenience, equivalent to pressing Enter in a line with a kernel symbol and then selecting "Zoom" into the kernel DSO. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-vbnlnrpyfvz9deqoobtc3dz7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf hists browser: Generalize the do_zoom_dso() function | Arnaldo Carvalho de Melo | 1 | -4/+7
We'll use it to provide a top level hotkey to zoom into the kernel dso directly. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ae9cjel6v05wjnz9r6z77b6x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf report/top: Improve toggle callchain menu option | Arnaldo Carvalho de Melo | 3 | -5/+54
Taking into account the current status of the callchain, i.e. if folded, show "Expand", otherwise "Collapse", also show the name of the entry that will be affected and mention the hotkeys for expanding/collapsing all callchains below the main entry, the one that appears with/without callchains. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-03arm6poo8463k5tfcfp7gkk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf report/top: Add menu entry for toggling callchain expansion | Arnaldo Carvalho de Melo | 1 | -0/+21
Since previously pressing ENTER toggled expansion/collapse of callchain entries and now brings up the same menu used when callchains are not present, add an entry so that users can quickly figure out the change in behaviour. It's worth mentioning that we also always had 'e'/'c' to expand/collapse all entries in a hist entry and 'E'/'C' for all hist entries. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-f9o03jo29fypvd8ly3j49d36@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf report/top: Make ENTER consistently bring up menu | Arnaldo Carvalho de Melo | 1 | -1/+2
When callchains are present the ENTER key switches from bringing up the menu that offers Annotation, Zoom by DSO, etc to expanding/collapsing one callchain level, causing confusion, fix it by making it consistently bring up the menu and use '+' to expand/collapse one callchain level. Next patch will also add an entry to the menu to allow expanding/collapsing, so that people used to ENTER expanding one callchain level can quickly find it and use it instead. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-bjz35omktig8cwn6lbj1ifns@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf hists browser: Restore ESC as "Zoom out" of DSO/thread/etc | Arnaldo Carvalho de Melo | 1 | -0/+1
We need to set actions->ms.map since 599a2f38a989 ("perf hists browser: Check sort keys before hot key actions"), as in that patch we bail out if map is NULL. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 599a2f38a989 ("perf hists browser: Check sort keys before hot key actions") Link: https://lkml.kernel.org/n/tip-wp1ssoewy6zihwwexqpohv0j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | libperf: Move to tools/lib/perf | Jiri Olsa | 39 | -3868/+3
Move libperf from its current location under tools/perf to a separate directory under tools/lib/. Also change various paths (mainly includes) to reflect the libperf move to a separate directory and add a new directory under MANIFEST. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191206210612.8676-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf tests bp_signal: Show expected versus obtained values | Arnaldo Carvalho de Melo | 1 | -5/+5
To help understand failures. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c951j3gvrgnrsyg7ki7pwkiz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf sched timehist: Add support for filtering on CPU | David Ahern | 2 | -0/+17
Allow user to limit output to one or more CPUs. Really helpful on systems with a large number of cpus. Committer testing: # perf sched record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.765 MB perf.data (1412 samples) ] [root@quaco ~]# perf sched timehist | head Samples do not have callchains. time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- 66307.802686 [0000] perf[13086] 0.000 0.000 0.000 66307.802700 [0000] migration/0[12] 0.000 0.001 0.014 66307.802766 [0001] perf[13086] 0.000 0.000 0.000 66307.802774 [0001] migration/1[15] 0.000 0.001 0.007 66307.802841 [0002] perf[13086] 0.000 0.000 0.000 66307.802849 [0002] migration/2[20] 0.000 0.001 0.008 66307.802913 [0003] perf[13086] 0.000 0.000 0.000 # # perf sched timehist --cpu 2 | head Samples do not have callchains. time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- 66307.802841 [0002] perf[13086] 0.000 0.000 0.000 66307.802849 [0002] migration/2[20] 0.000 0.001 0.008 66307.964485 [0002] <idle> 0.000 0.000 161.635 66307.964811 [0002] CPU 0/KVM[3589/3561] 0.000 0.056 0.325 66307.965477 [0002] <idle> 0.325 0.000 0.666 66307.965553 [0002] CPU 0/KVM[3589/3561] 0.666 0.024 0.076 66307.966456 [0002] <idle> 0.076 0.000 0.903 # Signed-off-by: David Ahern <dsahern@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20191204173925.66976-1-dsahern@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 | perf record: Adapt affinity to machines with #CPUs > 1K | Alexey Budankov | 3 | -13/+45
Use struct mmap_cpu_mask type for the tool's thread and mmap data buffers to overcome current 1024 CPUs mask size limitation of cpu_set_t type. Currently glibc's cpu_set_t type has an internal mask size limit of 1024 CPUs. Moving to the 'struct mmap_cpu_mask' type allows overcoming that limit. The tools bitmap API is used to manipulate objects of 'struct mmap_cpu_mask' type. Committer notes: To print the 'nbits' struct member we must use %zd, since it is a size_t, this fixes the build in some toolchains/arches. Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/96d7e2ff-ce8b-c1e0-d52c-aa59ea96f0ea@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06perf mmap: Declare type for cpu mask of arbitrary lengthAlexey Budankov2-0/+23
Declare a dedicated struct mmap_cpu_mask type for cpu masks of arbitrary length. The mask is available through the bits pointer and the mask length is kept in the nbits field. The MMAP_CPU_MASK_BYTES() macro returns the mask storage size in bytes. The mmap_cpu_mask__scnprintf() function can be used to log a text representation of the mask. Committer notes: To print the 'nbits' struct member we must use %zd, since it is a size_t, this fixes the build in some toolchains/arches. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/0fd2454f-477f-d15a-f4ee-79bcbd2585ff@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
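A minimal sketch of the idea described above: a CPU mask stored as a bit array plus its length in bits, so it is not capped at a fixed size the way a 1024-bit cpu_set_t is. Every name here (struct cpu_mask_sketch, the SKETCH_* macros, the helper functions) is hypothetical and chosen for illustration; this is not the code added to tools/perf, which uses the tools bitmap API.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the described type: bit storage plus length in bits. */
struct cpu_mask_sketch {
	unsigned long *bits;
	size_t nbits;
};

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)
/* Storage size in bytes, rounded up to a whole number of longs. */
#define SKETCH_MASK_BYTES(m) \
	(SKETCH_BITS_TO_LONGS((m)->nbits) * sizeof(unsigned long))

/* Allocate a zeroed mask able to hold 'nbits' CPUs; returns 0 on success. */
static int cpu_mask_sketch__alloc(struct cpu_mask_sketch *m, size_t nbits)
{
	m->nbits = nbits;
	m->bits = calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
	if (!m->bits) {
		m->nbits = 0;
		return -1;
	}
	return 0;
}

/* Mark 'cpu' as set; out-of-range CPUs are ignored. */
static void cpu_mask_sketch__set(struct cpu_mask_sketch *m, size_t cpu)
{
	if (cpu < m->nbits)
		m->bits[cpu / SKETCH_BITS_PER_LONG] |=
			1UL << (cpu % SKETCH_BITS_PER_LONG);
}

/* Return 1 if 'cpu' is set, 0 otherwise (including out-of-range CPUs). */
static int cpu_mask_sketch__test(const struct cpu_mask_sketch *m, size_t cpu)
{
	if (cpu >= m->nbits)
		return 0;
	return (int)((m->bits[cpu / SKETCH_BITS_PER_LONG] >>
		      (cpu % SKETCH_BITS_PER_LONG)) & 1UL);
}
```

With a mask allocated for 2048 CPUs, setting and testing CPU 1500 works the same way as for CPU 0, and the caller releases the storage with free(m.bits).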
2019-12-20perf hists: Fix variable name's inconsistency in hists__for_each() macroYuya Fujita1-2/+2
Variable names are inconsistent in hists__for_each macro(). Due to this inconsistency, the macro replaces its second argument with "fmt" regardless of its original name. So far it works because only "fmt" is passed to the second argument. However, this behavior is not expected and should be fixed. Fixes: f0786af536bb ("perf hists: Introduce hists__for_each_format macro") Fixes: aa6f50af822a ("perf hists: Introduce hists__for_each_sort_list macro") Signed-off-by: Yuya Fujita <fujita.yuya@fujitsu.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/OSAPR01MB1588E1C47AC22043175DE1B2E8520@OSAPR01MB1588.jpnprd01.prod.outlook.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
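The class of bug described above can be reproduced in a few lines. This is an illustrative sketch only: struct fmt_node, for_each_node_buggy(), for_each_node() and sum_ids() are hypothetical names, not the perf macros. The buggy variant declares a parameter called 'format' but its body refers to 'fmt', so it only compiles when the caller's variable happens to be named fmt.

```c
#include <stddef.h>

/* Hypothetical singly linked list node used only for this illustration. */
struct fmt_node {
	int id;
	struct fmt_node *next;
};

/*
 * Buggy shape: the parameter is 'format' but the body uses 'fmt', so the
 * second argument is silently ignored. Left unexpanded here on purpose.
 */
#define for_each_node_buggy(head, format) \
	for (fmt = (head); fmt; fmt = fmt->next)

/* Fixed shape: the body uses the parameter it declares. */
#define for_each_node(head, format) \
	for ((format) = (head); (format); (format) = (format)->next)

/* Works with any iterator name because the fixed macro honours its argument. */
static int sum_ids(struct fmt_node *head)
{
	struct fmt_node *pos;
	int sum = 0;

	for_each_node(head, pos)
		sum += pos->id;
	return sum;
}
```

Replacing for_each_node() with for_each_node_buggy() in sum_ids() would fail to compile, because no variable named fmt exists in that scope.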
2019-12-20perf map: Set kmap->kmaps backpointer for main kernel map chunksArnaldo Carvalho de Melo1-0/+3
When a map is created to represent the main kernel area (vmlinux) with map__new2() we allocate an extra area to store a pointer to the 'struct maps' for the kernel maps, so that we can access that struct when loading ELF files or kallsyms, as we will need to split it in multiple maps, one per kernel module or ELF section (such as ".init.text"). So when map->dso->kernel is non-zero, map__kmap(map)->kmaps is expected to be set to the tree of kernel maps (modules, chunks of the main kernel, bpf progs put in place via PERF_RECORD_KSYMBOL, the main kernel). This was not the case when we were splitting the main kernel into chunks for its ELF sections, which ended up making 'perf report --children', when processing a perf.data file with callchains, trip on __map__is_kernel() when we press ENTER to see the popup menu for main histogram entries that start at a symbol in the ".init.text" ELF section, e.g.: - 8.83% 0.00% swapper [kernel.vmlinux].init.text [k] start_kernel start_kernel cpu_startup_entry do_idle cpuidle_enter cpuidle_enter_state intel_idle Fix it. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/20191218190120.GB13282@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-20perf report: Fix incorrectly added dimensions as switch perf data fileJin Yao1-1/+4
We observed an issue where some extra columns were displayed after switching the perf data file in the browser. The steps to reproduce: 1. perf record -a -e cycles,instructions -- sleep 3 2. perf report --group 3. In the browser, use hotkey 's' to switch to another perf.data 4. Now in the browser, the extra columns 'Self' and 'Children' are displayed. The issue is that setup_sorting() is executed again after the repeat path, so dimensions are added again. This patch checks the last key returned from __cmd_report(). If it's K_SWITCH_INPUT_DATA, it skips setup_sorting(). Fixes: ad0de0971b7f ("perf report: Enable the runtime switching of perf data file") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191220013722.20592-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-16perf vendor events s390: Remove name from L1D_RO_EXCL_WRITES descriptionEd Maste1-1/+1
In 7fcfa9a2d9 an unintended prefix "Counter:18 Name:" was removed from the description for L1D_RO_EXCL_WRITES, but the extra name remained in the description. Remove it too. Fixes: 7fcfa9a2d9a7 ("perf list: Fix s390 counter long description for L1D_RO_EXCL_WRITES") Signed-off-by: Ed Maste <emaste@freebsd.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vincent Chen <deanbo422@gmail.com> Link: http://lore.kernel.org/lkml/20191212145346.5026-1-emaste@freefall.freebsd.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-16perf vendor events s390: Fix counter long description for DTLB1_GPAGE_WRITESEd Maste1-1/+1
The cf_z13 counter DTLB1_GPAGE_WRITES included a prefix 'Counter:132\tName:'. This is incorrect; remove the prefix as with 7fcfa9a2d9 for cf_z14. Signed-off-by: Ed Maste <emaste@freebsd.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vincent Chen <deanbo422@gmail.com> Link: http://lore.kernel.org/lkml/20191212143446.88582-1-emaste@freefall.freebsd.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-11perf header: Fix false warning when there are no duplicate cache entriesMichael Petlan1-15/+6
Before this patch, perf expected that there might be at most NPROC*4 unique cache entries; however, it also expected that some of them would be shared and/or of the same size, so that the final number of entries would be reduced to less than NPROC*4. If the number of entries had not been reduced (was still NPROC*4), the warning was printed. However, some systems might have unusual cache topology, such as the following two-processor KVM guest: cpu level shared_cpu_list size 0 1 0 32K 0 1 0 64K 0 2 0 512K 0 3 0 8192K 1 1 1 32K 1 1 1 64K 1 2 1 512K 1 3 1 8192K This KVM guest has 8 (NPROC*4) unique cache entries, which used to make perf print the message, although there actually aren't "way too many cpu caches". v2: Removing unused argument. v3: Unifying the way we obtain number of cpus. v4: Removed '& UINT_MAX' construct which is redundant. Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> LPU-Reference: 20191208162056.20772-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-11perf metricgroup: Fix printing event names of metric group with multiple eventsKajol Jain1-2/+5
Commit f01642e4912b ("perf metricgroup: Support multiple events for metricgroup") introduced support for multiple events in a metric group. But with the current upstream, metric event names are not printed properly. In power9 platform: command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2 1.000208486 2.000368863 2.001400558 Similarly in skylake platform: command:./perf stat --metric-only -M Power -I 1000 1.000579994 2.002189493 With the current upstream version, the issue is with the event name comparison logic in find_evsel_group(). The current logic is to compare events belonging to a metric group to the events in perf_evlist. Since the break statement is missing in the loop used for comparison between metric group and perf_evlist events, the loop continues to execute even after getting a pattern match, and ends up discarding the matches. In case a single metric event belongs to the metric group, it works fine, because with a single event, once it has compared all events, it reaches the end of perf_evlist. Example for single metric event in power9 platform: command:# ./perf stat --metric-only -M branches_per_inst -I 1000 sleep 1 1.000094653 0.2 1.001337059 0.0 This patch fixes the issue by making sure that once all events belonging to that metric event have been matched in find_evsel_group(), we break out of that loop, by adding the corresponding condition.
With this patch: In power9 platform: command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2 result:# time derat_4k_miss_rate_percent derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent 1.000135672 0.0 0.3 1.0 0.0 0.2 0.0 0.0 2.000380617 0.0 0.0 0.0 0.0 0.0 0.0 0.0 command:# ./perf stat --metric-only -M Power -I 1000 Similarly in skylake platform: result:# time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency 1.000563580 0.3 0.0 2.6 44.2 21.9 0.0 0.0 0.0 2.002235027 0.4 0.0 2.7 43.0 20.7 0.0 0.0 0.0 Committer testing: Before: [root@seventh ~]# perf stat --metric-only -M Power -I 1000 # time 1.000383223 2.001168182 3.001968545 4.002741200 5.003442022 ^C 5.777687244 [root@seventh ~]# After the patch: [root@seventh ~]# perf stat --metric-only -M Power -I 1000 # time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency 1.000406577 0.4 0.1 1.4 97.0 0.0 0.0 0.0 0.0 2.001481572 0.3 0.0 0.6 97.9 0.0 0.0 0.0 0.0 3.002332585 0.2 0.0 1.0 97.5 0.0 0.0 0.0 0.0 4.003196624 0.2 0.0 0.3 98.6 0.0 0.0 0.0 0.0 5.004063851 0.3 0.0 0.7 97.7 0.0 0.0 0.0 0.0 ^C 5.471260276 0.2 0.0 0.5 49.3 0.0 0.0 0.0 0.0 [root@seventh ~]# [root@seventh ~]# dmesg | grep -i skylake [ 0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver. 
[root@seventh ~]# Fixes: f01642e4912b ("perf metricgroup: Support multiple events for metricgroup") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
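The early-exit behaviour described in this commit can be illustrated with a simplified scan. This is a sketch under stated assumptions, not the actual find_evsel_group(): sketch_find_group() is a hypothetical function, events are plain strings, and the names within one metric group are assumed to be distinct. The point is the return inside the loop, which plays the role of the added break condition: without it, the next non-matching event would reset the match counter and the completed match would be lost.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look for 'nr_ids' consecutive entries in 'evlist' that match 'ids' in
 * order. Returns the index of the first matched entry, or -1 if not found.
 * Assumes the names in 'ids' are distinct (illustration-only assumption).
 */
static int sketch_find_group(const char **evlist, size_t nr_events,
			     const char **ids, size_t nr_ids)
{
	size_t matched = 0, pos;

	if (nr_ids == 0)
		return -1;

	for (pos = 0; pos < nr_events; pos++) {
		if (strcmp(evlist[pos], ids[matched]) == 0)
			matched++;
		else	/* mismatch: restart, possibly at the current entry */
			matched = strcmp(evlist[pos], ids[0]) == 0 ? 1 : 0;

		/* Stop as soon as the whole group has matched. */
		if (matched == nr_ids)
			return (int)(pos + 1 - nr_ids);
	}
	return -1;
}
```

For an event list of cycles, a, b, c, instructions and a group of a, b, c, the scan stops at c and reports index 1 instead of continuing on to instructions.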
2019-12-11perf/x86/pmu-events: Fix Kernel_Utilization metricRavi Bangoria12-12/+12
Kernel Utilization should divide ref cycles spent in kernel with total ref cycles. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Haiyan Song <haiyanx.song@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20191204162121.29998-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-11perf top: Do not bail out when perf_env__read_cpuid() returns ENOSYSArnaldo Carvalho de Melo1-3/+7
'perf top' stopped working on hw architectures that do not provide a get_cpuid() implementation and thus fall back to the weak get_cpuid() default function. This is done because at annotation time we may need it in the arch specific annotation init routine, but that is only being used by arches that do provide a get_cpuid() implementation: $ find tools/ -name "*.[ch]" | xargs grep 'evlist->env' tools/perf/builtin-top.c: top.evlist->env = &perf_env; tools/perf/util/evsel.c: return evsel->evlist->env; tools/perf/util/s390-cpumsf.c: sf->machine_type = s390_cpumsf_get_type(session->evlist->env->cpuid); tools/perf/util/header.c: session->evlist->env = &header->env; tools/perf/util/sample-raw.c: const char *arch_pf = perf_env__arch(evlist->env); $ $ find tools/perf/arch -name "*.[ch]" | xargs grep -w get_cpuid tools/perf/arch/x86/util/auxtrace.c: ret = get_cpuid(buffer, sizeof(buffer)); tools/perf/arch/x86/util/header.c:get_cpuid(char *buffer, size_t sz) tools/perf/arch/powerpc/util/header.c:get_cpuid(char *buffer, size_t sz) tools/perf/arch/s390/util/header.c: * Implementation of get_cpuid(). tools/perf/arch/s390/util/header.c:int get_cpuid(char *buffer, size_t sz) tools/perf/arch/s390/util/header.c: if (buf && get_cpuid(buf, 128)) $ For 'report' or 'script', i.e. tools working on perf.data files, that is set up while reading the header; it's just top that needs to explicitly read it at tool start. 
Fixes: 608127f73779 ("perf top: Initialize perf_env->cpuid, needed by the per arch annotation init routine") Reported-by: John Garry <john.garry@huawei.com> Analysed-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Tested-by: John Garry <john.garry@huawei.com> # arm64 Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/n/tip-lxwjr0cd2eggzx04a780ffrv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-11perf arch: Make the default get_cpuid() return compatible errorArnaldo Carvalho de Melo1-1/+1
Some of the functions calling get_cpuid() propagate back the error it returns, and all are using errno (positive) values, make the weak default get_cpuid() function return ENOSYS to be consistent and to allow checking if this is an arch not providing this function or if a provided one is having trouble getting the cpuid, to decide if the warning should be provided to the user or just a debug message should be emitted. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Tested-by: John Garry <john.garry@huawei.com> # arm64 Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/n/tip-lxwjr0cd2eggzx04a780ffrv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
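A rough sketch of the pattern the two commits above describe: a weak default that reports ENOSYS, and a caller that treats ENOSYS differently from a real failure. The names sketch_get_cpuid(), enum cpuid_outcome and classify_cpuid_result() are hypothetical and only stand in for the perf functions; __attribute__((weak)) is a GCC/Clang extension, and an architecture-specific file could override the default by providing a strong definition with the same name.

```c
#include <errno.h>
#include <stddef.h>

/* Weak default used when no arch-specific implementation is linked in. */
__attribute__((weak)) int sketch_get_cpuid(char *buffer, size_t sz)
{
	(void)buffer;
	(void)sz;
	return ENOSYS; /* positive errno value, as the callers expect */
}

enum cpuid_outcome {
	CPUID_OK,          /* cpuid was read */
	CPUID_UNSUPPORTED, /* arch has no implementation: debug message, go on */
	CPUID_FAILED,      /* implementation exists but failed: warn the user */
};

/* Decide how a caller such as a 'top'-like tool should react to the result. */
static enum cpuid_outcome classify_cpuid_result(int err)
{
	if (err == 0)
		return CPUID_OK;
	if (err == ENOSYS)
		return CPUID_UNSUPPORTED;
	return CPUID_FAILED;
}
```

The design choice is that only CPUID_FAILED is worth a user-visible warning; CPUID_UNSUPPORTED lets the tool keep running on architectures without an implementation.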
2019-12-11Merge remote-tracking branch 'torvalds/master' into perf/urgentArnaldo Carvalho de Melo1-0/+1
To pick up BPF fixes to allow a clean 'make -C tools/perf build-test': 7c3977d1e804 libbpf: Fix sym->st_value print on 32-bit arches 1fd450f99272 libbpf: Fix up generation of bpf_helper_defs.h Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-04perf inject: Fix processing of ID index for injected instruction tracingAdrian Hunter1-12/+1
The ID index event is used when decoding, but can result in the following error: $ perf record --aux-sample -e '{intel_pt//,branch-misses}:u' ls $ perf inject -i perf.data -o perf.data.inj --itrace=be $ perf script -i perf.data.inj 0x1020 [0x410]: failed to process type: 69 [No such file or directory] Fix by having 'perf inject' drop the ID index event. Fixes: c0a6de06c446 ("perf record: Add support for AUX area sampling") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191204120800.8138-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-04perf report: Bail out --mem-mode if mem info is not availableRavi Bangoria1-0/+8
If perf.data is recorded without -d, don't allow user to use --mem-mode with 'perf report'. symbol_daddr and phys_daddr can be recorded separately and may be present in the perf.data but at the report time they are associated with mem-mode fields and thus this restriction applies to them as well. Before: $ perf record ls $ perf report --mem-mode --stdio # Overhead Local Weight Memory access Symbol # ........ ............ ............. ....................... 55.56% 0 N/A [k] 0xffffffff81a00ae7 After: $ perf report --mem-mode --stdio Error: Selected --mem-mode but no mem data. Did you call perf record without -d? Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20191114132213.5419-4-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-04perf report: Make -F more strict like -sRavi Bangoria1-0/+6
Currently -F allows branch-mode / mem-mode fields even when perf report is not running in that mode. Don't allow that. Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20191114132213.5419-3-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-04perf report/top TUI: Replace pr_err() with ui__error()Ravi Bangoria1-5/+5
pr_err() in TUI mode does not print anything on the screen and just quits. Replace such pr_err() with ui__error(). Before: $ perf report -s + $ After: $ perf report -s + ┌─Error:────────────────┐ │Invalid --sort key: `+'│ │ │ │Press any key... │ └───────────────────────┘ Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20191114132213.5419-2-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+1
Daniel Borkmann says: ==================== pull-request: bpf 2019-12-02 The following pull-request contains BPF updates for your *net* tree. We've added 10 non-merge commits during the last 6 day(s) which contain a total of 10 files changed, 60 insertions(+), 51 deletions(-). The main changes are: 1) Fix vmlinux BTF generation for binutils pre v2.25, from Stanislav Fomichev. 2) Fix libbpf global variable relocation to take symbol's st_value offset into account, from Andrii Nakryiko. 3) Fix libbpf build on powerpc where check_abi target fails due to different readelf output format, from Aurelien Jarno. 4) Don't set BPF insns RO for the case when they are JITed in order to avoid fragmenting the direct map, from Daniel Borkmann. 5) Fix static checker warning in btf_distill_func_proto() as well as a build error due to empty enum when BPF is compiled out, from Alexei Starovoitov. 6) Fix up generation of bpf_helper_defs.h for perf, from Arnaldo Carvalho de Melo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-02perf kvm: Clarify the 'perf kvm' -i and -o command line optionsArnaldo Carvalho de Melo1-2/+3
The 'perf kvm' subcommand has options that it in turn passes to other perf subcommands such as 'report' and 'record', particularly -i and -o end up setting the same variable that will then be used for 'record's -o and report '-i', which ends up being confusing, leading some to think that both -i and -o can be used with 'report'. Improve the man page to state that -i is used with the post-processing subcommands while -o is used just with 'record' and that to save the output of 'report' one should simply redirect its output to a file. Noticed while reading the https://www.linux-kvm.org/page/Perf_events page. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steve Dickson <steved@redhat.com> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-tclbttvmgtm525fvmh85f7d9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02perf beauty: Add CLEAR_SIGHAND support for clone's flags argArnaldo Carvalho de Melo1-0/+1
Add support for the recently added CLONE_CLEAR_SIGHAND flag. This takes advantage of the copy of uapi/linux/sched.h we have in tools/include, which allows us to build tools/perf on older systems and have the binary support printing that flag whenever that system gets its kernel updated to one where this feature is present. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Reber <areber@redhat.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1vnz497ubtu5oz16ygdcul0e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
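As a sketch of what printing such a flag involves, the table-driven decoder below turns a clone flags value into a '|'-separated string. The table type, its three entries and sketch_clone_flags_scnprintf() are hypothetical and far smaller than a real beautifier; the numeric values are those of CLONE_VM, CLONE_FS and CLONE_CLEAR_SIGHAND from uapi/linux/sched.h, written out here so the example builds on systems whose headers predate the new flag. The short names and the '|' separator are assumptions about the output style.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag table entry: bit value plus the name to print. */
struct sketch_flag_name {
	uint64_t bit;
	const char *name;
};

/* Small illustrative subset; CLEAR_SIGHAND lives above bit 31. */
static const struct sketch_flag_name sketch_clone_flags[] = {
	{ 0x00000100ULL,  "VM" },
	{ 0x00000200ULL,  "FS" },
	{ 0x100000000ULL, "CLEAR_SIGHAND" },
};

/* Write the names of the set flags into 'buf'; returns characters written. */
static size_t sketch_clone_flags_scnprintf(uint64_t flags, char *buf, size_t size)
{
	size_t printed = 0, i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < sizeof(sketch_clone_flags) / sizeof(sketch_clone_flags[0]); i++) {
		int n;

		if (!(flags & sketch_clone_flags[i].bit))
			continue;
		n = snprintf(buf + printed, size - printed, "%s%s",
			     printed ? "|" : "", sketch_clone_flags[i].name);
		if (n < 0 || (size_t)n >= size - printed) {
			buf[printed] = '\0'; /* drop the partially written name */
			break;
		}
		printed += (size_t)n;
	}
	return printed;
}
```

Because the flag value does not fit in 32 bits, the sketch carries the flags in a uint64_t; a decoder that truncated them to int would never see the new bit.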