aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-07-03Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds79-600/+2350
Pull perf updates from Ingo Molnar: "Most of the changes are for tooling, the main changes in this cycle were: - Improve Intel-PT hardware tracing support, both on the kernel and on the tooling side: PTWRITE instruction support, power events for C-state tracing, etc. (Adrian Hunter) - Add support to measure SMI cost to the x86 architecture, with tooling support in 'perf stat' (Kan Liang) - Support function filtering in 'perf ftrace', plus related improvements (Namhyung Kim) - Allow adding and removing fields to the default 'perf script' columns, using + or - as field prefixes to do so (Andi Kleen) - Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso' (Mark Santaniello) - Add perf tooling unwind support for PowerPC (Paolo Bonzini) - ... and various other improvements as well" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits) perf auxtrace: Add CPU filter support perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSC perf intel-pt: Update documentation to include new ptwrite and power events perf intel-pt: Add example script for power events and PTWRITE perf intel-pt: Synthesize new power and "ptwrite" events perf intel-pt: Move code in intel_pt_synth_events() to simplify attr setting perf intel-pt: Factor out intel_pt_set_event_name() perf intel-pt: Tidy messages into called function intel_pt_synth_event() perf intel-pt: Tidy Intel PT evsel lookup into separate function perf intel-pt: Join needlessly wrapped lines perf intel-pt: Remove unused instructions_sample_period perf intel-pt: Factor out common code synthesizing event samples perf script: Add synthesized Intel PT power and ptwrite events perf/x86/intel: Constify the 'lbr_desc[]' array and make a function static perf script: Add 'synth' field for synthesized event payloads perf auxtrace: Add itrace option to output power events perf auxtrace: Add itrace option to output ptwrite events tools include: Add byte-swapping macros to kernel.h perf script: Add 'synth' event type for synthesized events x86/insn: perf tools: Add new ptwrite instruction ...
2017-06-30perf auxtrace: Add CPU filter supportAdrian Hunter4-0/+14
Decoding auxtrace data can take a long time. To avoid decoding unnecessarily, filter auxtrace data that is collected per-cpu before it is decoded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-38-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSCAdrian Hunter1-0/+14
CBR (core-to-bus ratio) packets provide an indication of CPU frequency. A more accurate measure can be made by counting the cycles (given by CYC packets) in between other timing packets (either MTC or TSC). Using TSC packets has at least 2 issues: 1) timing might have stopped (e.g. mwait) or 2) TSC packets within PSB+ might slip past CYC packets. For now, simply do not use TSC packets for calculating CPU cycles to TSC. That leaves the case where 2 MTC packets are used, otherwise falling back to the CBR value. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-37-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Update documentation to include new ptwrite and power eventsAdrian Hunter1-2/+40
Update documentation to include new ptwrite and power events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-36-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Add example script for power events and PTWRITEAdrian Hunter3-0/+144
Add script intel-pt-events.py that provides an example of how to unpack the raw data for power events and PTWRITE. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-35-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Synthesize new power and "ptwrite" eventsAdrian Hunter1-0/+283
Synthesize new power and ptwrite events. Power events report changes to C-state but I have also added support for the existing CBR (core-to-bus ratio) packet and included that when outputting power events. The PTWRITE packet is associated with the new "ptwrite" instruction, which is essentially just a way to stuff a 32 or 64 bit value into the PT trace. More details can be found in the patches that add documentation and in the Intel SDM. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1498811805-2335-1-git-send-email-adrian.hunter@intel.com [ Copy the description of such packet from the patchkit cover message ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Move code in intel_pt_synth_events() to simplify attr settingAdrian Hunter1-23/+22
intel_pt_synth_events() uses the same attr structure to create each event. Move the code around a bit to simplify that. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-33-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Factor out intel_pt_set_event_name()Adrian Hunter1-8/+16
Factor out intel_pt_set_event_name() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-32-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Tidy messages into called function intel_pt_synth_event()Adrian Hunter1-24/+18
Tidy print messages into called function intel_pt_synth_event(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-31-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Tidy Intel PT evsel lookup into separate functionAdrian Hunter1-10/+15
Tidy the lookup of the Intel PT selected event (perf_evsel) into a separate function. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-30-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Join needlessly wrapped linesAdrian Hunter1-4/+2
Join needlessly wrapped lines. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-29-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Remove unused instructions_sample_periodAdrian Hunter1-2/+0
Remove unused struct intel_pt member instructions_sample_period. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-28-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf intel-pt: Factor out common code synthesizing event samplesAdrian Hunter1-122/+100
Factor out common code in functions synthesizing event samples i.e. intel_pt_synth_branch_sample(), intel_pt_synth_instruction_sample() and intel_pt_synth_transaction_sample(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-27-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-30perf script: Add synthesized Intel PT power and ptwrite eventsAdrian Hunter2-1/+231
Add definitions for synthesized Intel PT events for power and ptwrite. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1498811802-2301-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf script: Add 'synth' field for synthesized event payloadsAdrian Hunter2-3/+23
Add a field to display the content the raw_data of a synthesized event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-22-git-send-email-adrian.hunter@intel.com [ Resolved conflict with 106dacd86f04 ("perf script: Support -F brstackoff,dso") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf auxtrace: Add itrace option to output power eventsAdrian Hunter3-2/+9
Add itrace option to output power events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-25-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf auxtrace: Add itrace option to output ptwrite eventsAdrian Hunter3-3/+10
Add itrace option to output ptwrite events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-24-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf script: Add 'synth' event type for synthesized eventsAdrian Hunter2-16/+61
Instruction trace decoders such as Intel PT may have additional information recorded in the trace. For example, Intel PT has power information and a there is a new instruction 'ptwrite' that can write a value into a PTWRITE trace packet. Such information may be associated with an IP and so can be treated as a sample (PERF_RECORD_SAMPLE). Custom data can be incorporated in the sample as raw_data (PERF_SAMPLE_RAW). However a means of identifying the raw data format is needed. That will be done by synthesizing an attribute for it. So add an attribute type for custom synthesized events. Different synthesized events will be identified by the attribute 'config'. Committer notes: Start those PERF_TYPE_ after the PMU range, i.e. after (INT_MAX + 1U), i.e. after perf_pmu_register() -> idr_alloc(end=0). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1498040239-32418-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27x86/insn: perf tools: Add new ptwrite instructionAdrian Hunter4-1/+73
Add ptwrite to the op code map and the perf tools new instructions test. To run the test: $ tools/perf/perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ tools/perf/perf test -v "x86 ins" 2>&1 | grep ptwrite For information about ptwrite, refer the Intel SDM. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: http://lkml.kernel.org/r/1495180230-19367-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf jit: fix typo: "incalid" -> "invalid"Colin Ian King1-1/+1
Trivial fix to typo in jvmti_close() warnx warning message. Signed-off-by: Colin King <colin.king@canonical.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20170627124917.19151-1-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf tools: Kill die()Arnaldo Carvalho de Melo2-27/+5
Finally can nuke this function, no more users. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-eivvvzn8ie6w42gy3batxoy7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf config: Do not die when parsing u64 or int config valuesArnaldo Carvalho de Melo6-24/+34
Just warn the user and ignore those values. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-tbf60nj3ierm6hrkhpothymx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf tools: Replace error() with pr_err()Arnaldo Carvalho de Melo11-42/+28
To consolidate the error reporting facility. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b41iot1094katoffdf19w9zk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf tools: Remove warning()Arnaldo Carvalho de Melo3-36/+0
Now everything uses pr_warning(), so ditch it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-hv8r0mgdhk73wtfq3zrhavgx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf event-parse: Use pr_warning()Arnaldo Carvalho de Melo1-2/+2
Convert sole user of warning() in this file to pr_warning(), consolidating error reporting facilities. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3y7yf6v673ujl2rcs34tzv8n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf config: Use pr_warning()Arnaldo Carvalho de Melo1-4/+2
warning() is going away, consolidating error reporting. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5r3636cwl4z1varo90mervai@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf help: Use pr_warning()Arnaldo Carvalho de Melo1-2/+2
Complete the switch to using te pr_{warning,error,etc} error reporting facilities. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3l9gr6237b4aqyo0rsspixe2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf help: Elliminate dup code for reportingArnaldo Carvalho de Melo1-6/+8
And switch from warning() to pr_warning(), to elliminate another duplication: too many error reporting facilities. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pkzcjrhek3uuqc4i5i9ealwd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf help: Introduce exec_failed() to avoid code duplicationArnaldo Carvalho de Melo1-13/+9
The warning(str_error_r(errno)) pattern can be replaced with a function, do it. And while at it use pr_warning(), we have way too many error reporting facilities, time to drop some, starting with the one we got from the git sources. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lbak5npj1ri1uuvf1en3c0p0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-26perf tests: Add platform dependency to test 15Thomas Richter1-0/+48
This patch adds platform dependency into the test case 15 (perf_event_attr). It is based on a suggestion from Jiri Olsa. Add a new optional attribute named 'arch' in the [config] section of the test case file. It is a comma separated list of architecture names this test can be executed on. For example: arch = x86_64,alpha,ppc If this attribute is missing the test is executed on any platform. This does not break existing behavior. The values listed for this attribute should be identical to uname -m output. If the list starts with an exclamation mark (!) the comparison is inverted, for example for arch = !s390x,ppc the test is not executed on s390x or ppc platforms. The exclamation mark must be at the beginnning of the list. Here is an example debug output: [root@s35lp76]# fgrep arch tests/attr/test-stat-C2 arch = x86_64,alpha,ppc [root@s35lp76]# PERF_TEST_ATTR=/tmp /usr/bin/python2 ./tests/attr.py \ -d ./tests/attr/ -p ./perf -vvvvv -t test-stat-C1 provides the following output: running './tests/attr//test-stat-C1' test limitation 'x86_64,alpha,ppc' <--- new loading expected events Event event:base-stat fd = 1 group_fd = -1 ..... Here is the output when a test is skipped: [root@s35lp76]# fgrep arch tests/attr/test-stat-C1 arch = !s390x [root@s35lp76]# PERF_TEST_ATTR=/tmp /usr/bin/python2 ./tests/attr.py \ -d ./tests/attr/ -p ./perf -vvvvv -t test-stat-C1 provides the following output: test limitation '!s390x' <--- new skipped [s390x] './tests/attr//test-stat-C1' <--- new The test is skipped with return code 0. Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20170622073625.86762-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-26perf machine: Fix segfault for kernel.kptr_restrict=2Jiri Olsa1-4/+6
Michael reported the segfault when kernel.kptr_restrict=2 is set. $ perf record ls ... perf: Segmentation fault Obtained 16 stack frames. ./perf(dump_stack+0x2d) [0x5068df] ./perf(sighandler_dump_stack+0x2d) [0x5069bf] ./perf() [0x43e47b] /lib64/libc.so.6(+0x3594f) [0x7f762004794f] /lib64/libc.so.6(strlen+0x26) [0x7f762009ef86] /lib64/libc.so.6(__strdup+0xd) [0x7f762009ecbd] ./perf(maps__set_kallsyms_ref_reloc_sym+0x4d) [0x51590f] ./perf(machine__create_kernel_maps+0x136) [0x50a7de] ./perf(perf_session__create_kernel_maps+0x2c) [0x510a81] ./perf(perf_session__new+0x13d) [0x510e23] ./perf() [0x43fd61] ./perf(cmd_record+0x704) [0x441823] ./perf() [0x4bc1a0] ./perf() [0x4bc40d] ./perf() [0x4bc55f] ./perf(main+0x2d5) [0x4bc939] Segmentation fault (core dumped) The reason is that with kernel.kptr_restrict=2, we don't get the symbol from machine__get_running_kernel_start, which we want to use in maps__set_kallsyms_ref_reloc_sym and we crash. Check the symbol name value before calling maps__set_kallsyms_ref_reloc_sym() and succeed without ref_reloc_sym being set. It's safe because we check its existence before we use it. Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170626095153.553-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-22perf probe: Fix probe definition for inlined functionsBjörn Töpel1-1/+1
In commit 613f050d68a8 ("perf probe: Fix to probe on gcc generated functions in modules"), the offset from symbol is, incorrectly, added to the trace point address. This leads to incorrect probe trace points for inlined functions and when using relative line number on symbols. Prior this patch: $ perf probe -m nf_nat -D in_range p:probe/in_range nf_nat:in_range.isra.9+0 $ perf probe -m i40e -D i40e_clean_rx_irq p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212 $ perf probe -m i40e -D i40e_clean_rx_irq:16 p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626 After: $ perf probe -m nf_nat -D in_range p:probe/in_range nf_nat:in_range.isra.9+0 $ perf probe -m i40e -D i40e_clean_rx_irq p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106 $ perf probe -m i40e -D i40e_clean_rx_irq:16 p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665 Committer testing: Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are the functions that while not being explicitely marked as inline, were inlined by the compiler: # pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head __ew32 e1000_regdump e1000e_dump_ps_pages e1000_desc_unused e1000e_systim_to_hwtstamp e1000e_rx_hwtstamp e1000e_update_rdt_wa e1000e_update_tdt_wa e1000_put_txbuf e1000_consume_page Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of them: # perf probe -m e1000e -D e1000e_rx_hwtstamp p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74 # perf probe -m e1000e -D e1000_consume_page p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876 p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506 p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074 Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for that function: $ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko <SNIP> <1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram) <13e27c> DW_AT_name : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq <13e287> DW_AT_low_pc : 0x17a30 <3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13e6f0> DW_AT_abstract_origin: <0x13ed2c> <13e6f4> DW_AT_low_pc : 0x17be6 <SNIP> <1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram) <13ed2e> DW_AT_name : (indirect string, offset: 0xa54c3): e1000_consume_page So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s address, gives us the offset we should use in the probe definition: 0x17be6 - 0x17a30 = 438 but above we have 876, which is twice as much. Lets see the second inline expansion of e1000_consume_page() in e1000_clean_jumbo_rx_irq(): <3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13e86f> DW_AT_abstract_origin: <0x13ed2c> <13e873> DW_AT_low_pc : 0x17d21 0x17d21 - 0x17a30 = 753 So we where adding it at twice the offset from the containing function as we should. And then after this patch: # perf probe -m e1000e -D e1000e_rx_hwtstamp p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37 # perf probe -m e1000e -D e1000_consume_page p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438 p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753 p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353 # Which matches the two first expansions and shows that because we were doubling the offset it would spill over the next function: readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 673: 0000000000017a30 1626 FUNC LOCAL DEFAULT 2 e1000_clean_jumbo_rx_irq 674: 0000000000018090 2013 FUNC LOCAL DEFAULT 2 e1000_clean_rx_irq_ps This is the 3rd inline expansion of e1000_consume_page() in e1000_clean_jumbo_rx_irq(): <3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13ec78> DW_AT_abstract_origin: <0x13ed2c> <13ec7c> DW_AT_low_pc : 0x17f79 0x17f79 - 0x17a30 = 1353 So: 0x17a30 + 2 * 1353 = 0x184c2 And: 0x184c2 - 0x18090 = 1074 Which explains the bogus third expansion for e1000_consume_page() to end up at: p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074 All fixed now :-) [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Fixes: 613f050d68a8 ("perf probe: Fix to probe on gcc generated functions in modules") Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf script: Fix message because field list option is -F not -fAdrian Hunter1-1/+1
Fix message because field list option is -F not -f. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-20-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf tools: Fix message because cpu list option is -C not -cAdrian Hunter1-1/+1
Fix message because cpu list option is -C not -c Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-19-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Fix transactions_sample_typeAdrian Hunter1-0/+1
'transactions_sample_type' is needed to correctly inject transactions samples but it was not being set. Set it from the event sample type. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-18-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Remove redundant initial_skip checksAdrian Hunter1-6/+2
'initial_skip' is checked inside the sample synthesis functions which means it is actually being done twice for 'instructions' and 'transactions' samples. Remove the redundant checks. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-17-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add decoder support for CBR eventsAdrian Hunter2-0/+21
Add decoder support for informing the tools of changes to the core-to-bus ratio (CBR). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-16-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add reserved byte to CBR packet payloadAdrian Hunter2-2/+2
Future proof CBR packet decoding by passing through also the undefined 'reserved' byte in the packet payload. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add decoder support for ptwrite and power event packetsAdrian Hunter4-8/+293
Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This patch only affects the decoder, so the tools still do not select or consume the new information. That is added in subsequent patches. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add documentation for new config termsAdrian Hunter1-0/+36
Add documentation for new config terms. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-13-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add default config for pass-through branch enableAdrian Hunter1-0/+5
Branch tracing is enabled by default, so a fake config bit called 'pt' (pass-through) was added to allow the 'branch enable' bit to have affect. Add default config 'pt,branch' which will allow users to disable branch tracing using 'branch=0' instead of having to specify 'pt,branch=0'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-12-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Allow decoding with branch tracing disabledAdrian Hunter3-0/+28
The kernel now supports the disabling of branch tracing, however the decoder assumes branch tracing is always enabled. Pass through a parameter to indicate whether branch tracing is enabled and use it to avoid cases when the decoder is expecting branch packets. There are 2 such cases. First, FUP packets which can bind to an IP even when there is no branch tracing. Secondly, the decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add missing __fallthroughAdrian Hunter1-1/+1
perf tools uses __fallthrough. Add missing __fallthrough to a switch statement. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Clear FUP flag on errorAdrian Hunter1-0/+2
Sometimes a FUP packet is associated with a TSX transaction and a flag is set to indicate that. Ensure that flag is cleared on any error condition because at that point the decoder can no longer assume it is correct. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Use FUP always when scanning for an IPAdrian Hunter1-8/+4
The decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Currently the FUP packet is used only in the case of an overflow, however there is no reason for that to be a special case. So just use FUP always when scanning for an IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zeroAdrian Hunter1-3/+5
Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is not updated when a branch target has been suppressed, which is indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so ensure never to set 'last_ip' when packet 'count' is zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Fix last_ip usageAdrian Hunter1-2/+11
Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is considered to be reset to zero whenever there is a synchronization packet (PSB). The decoder wasn't doing that, and was treating the zero value to mean that there was no last IP, whereas compression can be done against the zero value. Fix by setting last_ip to zero when a PSB is received and keep track of have_last_ip. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IPAdrian Hunter1-0/+1
A value of zero is used to indicate that there is no IP. Ensure the value is zero when the state is INTEL_PT_STATE_NO_IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Fix missing stack clearAdrian Hunter1-0/+1
The return compression stack must be cleared whenever there is a PSB. Fix one case where that was not happening. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Improve sample timestampAdrian Hunter1-3/+31
The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. Improve that situation by using the pkt_state to determine when to use the current or previous timestamp. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>