diff options
Diffstat (limited to 'tools/perf/Documentation/perf-record.txt')
-rw-r--r-- | tools/perf/Documentation/perf-record.txt | 226 |
1 files changed, 212 insertions, 14 deletions
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index b23a4012a606..e41ae950fdc3 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -9,7 +9,7 @@ SYNOPSIS -------- [verse] 'perf record' [-e <EVENT> | --event=EVENT] [-a] <command> -'perf record' [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>] +'perf record' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>] DESCRIPTION ----------- @@ -30,8 +30,14 @@ OPTIONS - a symbolic event name (use 'perf list' to list all events) - - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a - hexadecimal event descriptor. + - a raw PMU event in the form of rN where N is a hexadecimal value + that represents the raw register encoding with the layout of the + event control registers as described by entries in + /sys/bus/event_source/devices/cpu/format/*. + + - a symbolic or raw PMU event followed by an optional colon + and a list of event modifiers, e.g., cpu-cycles:p. See the + linkperf:perf-list[1] man page for details on event modifiers. - a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where 'param1', 'param2', etc are defined as formats for the PMU in @@ -237,16 +243,22 @@ OPTIONS option and remains only for backward compatibility. See --event. -g:: - Enables call-graph (stack chain/backtrace) recording. + Enables call-graph (stack chain/backtrace) recording for both + kernel space and user space. --call-graph:: Setup and enable call-graph (stack chain/backtrace) recording, - implies -g. Default is "fp". + implies -g. Default is "fp" (for user space). + + The unwinding method used for kernel space is dependent on the + unwinder used by the active kernel configuration, i.e + CONFIG_UNWINDER_FRAME_POINTER (fp) or CONFIG_UNWINDER_ORC (orc) + + Any option specified here controls the method used for user space. - Allows specifying "fp" (frame pointer) or "dwarf" - (DWARF's CFI - Call Frame Information) or "lbr" - (Hardware Last Branch Record facility) as the method to collect - the information used to show the call graphs. + Valid options are "fp" (frame pointer), "dwarf" (DWARF's CFI - + Call Frame Information) or "lbr" (Hardware Last Branch Record + facility). In some systems, where binaries are build with gcc --fomit-frame-pointer, using the "fp" method will produce bogus @@ -263,6 +275,11 @@ OPTIONS User can change the size by passing the size after comma like "--call-graph dwarf,4096". + When "fp" recording is used, perf tries to save stack enties + up to the number specified in sysctl.kernel.perf_event_max_stack + by default. User can change the number by passing it after comma + like "--call-graph fp,32". + -q:: --quiet:: Don't print any message, useful for scripting. @@ -283,6 +300,12 @@ OPTIONS --phys-data:: Record the sample physical addresses. +--data-page-size:: + Record the sampled data address data page size. + +--code-page-size:: + Record the sampled code address (ip) page size + -T:: --timestamp:: Record the sample timestamps. Use it with 'perf report -D' to see the @@ -295,6 +318,11 @@ OPTIONS --sample-cpu:: Record the sample cpu. +--sample-identifier:: + Record the sample identifier i.e. PERF_SAMPLE_IDENTIFIER bit set in + the sample_type member of the struct perf_event_attr argument to the + perf_event_open system call. + -n:: --no-samples:: Don't sample. @@ -369,6 +397,10 @@ following filters are defined: - abort_tx: only when the target is a hardware transaction abort - cond: conditional branches - save_type: save branch type during sampling in case binary is not available later + For the platforms with Intel Arch LBR support (12th-Gen+ client or + 4th-Gen Xeon+ server), the save branch type is unconditionally enabled + when the taken branch stack sampling is enabled. + - priv: save privilege state during sampling in case binary is not available later + The option requires at least one branch type among any, any_call, any_ret, ind_call, cond. @@ -379,13 +411,17 @@ is enabled for all the sampling events. The sampled branch type is the same for The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k Note that this feature may not be available on all processors. +-W:: --weight:: Enable weightened sampling. An additional weight is recorded per sample and can be displayed with the weight and local_weight sort keys. This currently works for TSX abort events and some memory events in precise mode on modern Intel CPUs. --namespaces:: -Record events of type PERF_RECORD_NAMESPACES. +Record events of type PERF_RECORD_NAMESPACES. This enables 'cgroup_id' sort key. + +--all-cgroups:: +Record events of type PERF_RECORD_CGROUP. This enables 'cgroup' sort key. --transaction:: Record transaction flags for transaction related events. @@ -398,8 +434,11 @@ if combined with -a or -C options. -D:: --delay=:: -After starting the program, wait msecs before measuring. This is useful to -filter out the startup phase of the program, which is often very different. +After starting the program, wait msecs before measuring (-1: start with events +disabled), or enable events only for specified ranges of msecs (e.g. +-D 10-20,30-40 means wait 10 msecs, enable for 10 msecs, wait 10 msecs, enable +for 10 msecs, then stop). Note, delaying enabling of events is useful to filter +out the startup phase of the program, which is often very different. -I:: --intr-regs:: @@ -449,7 +488,9 @@ This option sets the time out limit. The default value is 500 ms. --switch-events:: Record context switch events i.e. events of type PERF_RECORD_SWITCH or -PERF_RECORD_SWITCH_CPU_WIDE. +PERF_RECORD_SWITCH_CPU_WIDE. In some cases (e.g. Intel PT, CoreSight or Arm SPE) +switch events will be enabled automatically, which can be suppressed by +by the option --no-switch-events. --clang-path=PATH:: Path to clang binary to use for compiling BPF scriptlets. @@ -466,6 +507,9 @@ Specify vmlinux path which has debuginfo. --buildid-all:: Record build-id of all DSOs regardless whether it's actually hit or not. +--buildid-mmap:: +Record build ids in mmap2 events, disables build id cache (implies --no-buildid). + --aio[=n]:: Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4). Asynchronous mode is supported only when linking Perf tool with libc library @@ -547,6 +591,19 @@ overhead. You can still switch them on with: --switch-output --no-no-buildid --no-no-buildid-cache +--switch-output-event:: +Events that will cause the switch of the perf.data file, auto-selecting +--switch-output=signal, the results are similar as internally the side band +thread will also send a SIGUSR2 to the main one. + +Uses the same syntax as --event, it will just not be recorded, serving only to +switch the perf.data file as soon as the --switch-output event is processed by +a separate sideband thread. + +This sideband thread is also used to other purposes, like processing the +PERF_RECORD_BPF_EVENT records as they happen, asking the kernel for extra BPF +information, etc. + --switch-max-files=N:: When rotating perf.data with --switch-output, only keep N files. @@ -558,6 +615,22 @@ options. 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj in config file is set to true. +--synth=TYPE:: +Collect and synthesize given type of events (comma separated). Note that +this option controls the synthesis from the /proc filesystem which represent +task status for pre-existing threads. + +Kernel (and some other) events are recorded regardless of the +choice in this option. For example, --synth=no would have MMAP events for +kernel and modules. + +Available types are: + 'task' - synthesize FORK and COMM events for each task + 'mmap' - synthesize MMAP events for each process (implies 'task') + 'cgroup' - synthesize CGROUP events for each cgroup + 'all' - synthesize all events (default) + 'no' - do not synthesize any of the above events + --tail-synthesize:: Instead of collecting non-sample events (for example, fork, comm, mmap) at the beginning of record, collect them during finalizing an output file. @@ -587,6 +660,131 @@ Make a copy of /proc/kcore and place it into a directory with the perf data file Limit the sample data max size, <size> is expected to be a number with appended unit character - B/K/M/G +--num-thread-synthesize:: + The number of threads to run when synthesizing events for existing processes. + By default, the number of threads equals 1. + +ifdef::HAVE_LIBPFM[] +--pfm-events events:: +Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net) +including support for event filters. For example '--pfm-events +inst_retired:any_p:u:c=1:i'. More than one event can be passed to the +option using the comma separator. Hardware events and generic hardware +events cannot be mixed together. The latter must be used with the -e +option. The -e option and this one can be mixed and matched. Events +can be grouped using the {} notation. +endif::HAVE_LIBPFM[] + +--control=fifo:ctl-fifo[,ack-fifo]:: +--control=fd:ctl-fd[,ack-fd]:: +ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows. +Listen on ctl-fd descriptor for command to control measurement. + +Available commands: + 'enable' : enable events + 'disable' : disable events + 'enable name' : enable event 'name' + 'disable name' : disable event 'name' + 'snapshot' : AUX area tracing snapshot). + 'stop' : stop perf record + 'ping' : ping + + 'evlist [-v|-g|-F] : display all events + -F Show just the sample frequency used for each event. + -v Show all fields. + -g Show event group information. + +Measurements can be started with events disabled using --delay=-1 option. Optionally +send control command completion ('ack\n') to ack-fd descriptor to synchronize with the +controlling process. Example of bash shell script to enable and disable events during +measurements: + + #!/bin/bash + + ctl_dir=/tmp/ + + ctl_fifo=${ctl_dir}perf_ctl.fifo + test -p ${ctl_fifo} && unlink ${ctl_fifo} + mkfifo ${ctl_fifo} + exec {ctl_fd}<>${ctl_fifo} + + ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo + test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo} + mkfifo ${ctl_ack_fifo} + exec {ctl_fd_ack}<>${ctl_ack_fifo} + + perf record -D -1 -e cpu-cycles -a \ + --control fd:${ctl_fd},${ctl_fd_ack} \ + -- sleep 30 & + perf_pid=$! + + sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})" + sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})" + + exec {ctl_fd_ack}>&- + unlink ${ctl_ack_fifo} + + exec {ctl_fd}>&- + unlink ${ctl_fifo} + + wait -n ${perf_pid} + exit $? + +--threads=<spec>:: +Write collected trace data into several data files using parallel threads. +<spec> value can be user defined list of masks. Masks separated by colon +define CPUs to be monitored by a thread and affinity mask of that thread +is separated by slash: + + <cpus mask 1>/<affinity mask 1>:<cpus mask 2>/<affinity mask 2>:... + +CPUs or affinity masks must not overlap with other corresponding masks. +Invalid CPUs are ignored, but masks containing only invalid CPUs are not +allowed. + +For example user specification like the following: + + 0,2-4/2-4:1,5-7/5-7 + +specifies parallel threads layout that consists of two threads, +the first thread monitors CPUs 0 and 2-4 with the affinity mask 2-4, +the second monitors CPUs 1 and 5-7 with the affinity mask 5-7. + +<spec> value can also be a string meaning predefined parallel threads +layout: + + cpu - create new data streaming thread for every monitored cpu + core - create new thread to monitor CPUs grouped by a core + package - create new thread to monitor CPUs grouped by a package + numa - create new threed to monitor CPUs grouped by a NUMA domain + +Predefined layouts can be used on systems with large number of CPUs in +order not to spawn multiple per-cpu streaming threads but still avoid LOST +events in data directory files. Option specified with no or empty value +defaults to CPU layout. Masks defined or provided by the option value are +filtered through the mask provided by -C option. + +--debuginfod[=URLs]:: + Specify debuginfod URL to be used when cacheing perf.data binaries, + it follows the same syntax as the DEBUGINFOD_URLS variable, like: + + http://192.168.122.174:8002 + + If the URLs is not specified, the value of DEBUGINFOD_URLS + system environment variable is used. + +--off-cpu:: + Enable off-cpu profiling with BPF. The BPF program will collect + task scheduling information with (user) stacktrace and save them + as sample data of a software event named "offcpu-time". The + sample period will have the time the task slept in nanoseconds. + + Note that BPF can collect stack traces using frame pointer ("fp") + only, as of now. So the applications built without the frame + pointer might see bogus addresses. + +include::intel-hybrid.txt[] + SEE ALSO -------- -linkperf:perf-stat[1], linkperf:perf-list[1] +linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1] |