aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/Documentation/perf-record.txt
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation/perf-record.txt')
-rw-r--r--tools/perf/Documentation/perf-record.txt226
1 files changed, 212 insertions, 14 deletions
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index b23a4012a606..e41ae950fdc3 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -9,7 +9,7 @@ SYNOPSIS
--------
[verse]
'perf record' [-e <EVENT> | --event=EVENT] [-a] <command>
-'perf record' [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
+'perf record' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]
DESCRIPTION
-----------
@@ -30,8 +30,14 @@ OPTIONS
- a symbolic event name (use 'perf list' to list all events)
- - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
+ - a raw PMU event in the form of rN where N is a hexadecimal value
+ that represents the raw register encoding with the layout of the
+ event control registers as described by entries in
+ /sys/bus/event_source/devices/cpu/format/*.
+
+ - a symbolic or raw PMU event followed by an optional colon
+ and a list of event modifiers, e.g., cpu-cycles:p. See the
+ linkperf:perf-list[1] man page for details on event modifiers.
- a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where
'param1', 'param2', etc are defined as formats for the PMU in
@@ -237,16 +243,22 @@ OPTIONS
option and remains only for backward compatibility. See --event.
-g::
- Enables call-graph (stack chain/backtrace) recording.
+ Enables call-graph (stack chain/backtrace) recording for both
+ kernel space and user space.
--call-graph::
Setup and enable call-graph (stack chain/backtrace) recording,
- implies -g. Default is "fp".
+ implies -g. Default is "fp" (for user space).
+
+ The unwinding method used for kernel space is dependent on the
+ unwinder used by the active kernel configuration, i.e
+ CONFIG_UNWINDER_FRAME_POINTER (fp) or CONFIG_UNWINDER_ORC (orc)
+
+ Any option specified here controls the method used for user space.
- Allows specifying "fp" (frame pointer) or "dwarf"
- (DWARF's CFI - Call Frame Information) or "lbr"
- (Hardware Last Branch Record facility) as the method to collect
- the information used to show the call graphs.
+ Valid options are "fp" (frame pointer), "dwarf" (DWARF's CFI -
+ Call Frame Information) or "lbr" (Hardware Last Branch Record
+ facility).
In some systems, where binaries are build with gcc
--fomit-frame-pointer, using the "fp" method will produce bogus
@@ -263,6 +275,11 @@ OPTIONS
User can change the size by passing the size after comma like
"--call-graph dwarf,4096".
+ When "fp" recording is used, perf tries to save stack enties
+ up to the number specified in sysctl.kernel.perf_event_max_stack
+ by default. User can change the number by passing it after comma
+ like "--call-graph fp,32".
+
-q::
--quiet::
Don't print any message, useful for scripting.
@@ -283,6 +300,12 @@ OPTIONS
--phys-data::
Record the sample physical addresses.
+--data-page-size::
+ Record the sampled data address data page size.
+
+--code-page-size::
+ Record the sampled code address (ip) page size
+
-T::
--timestamp::
Record the sample timestamps. Use it with 'perf report -D' to see the
@@ -295,6 +318,11 @@ OPTIONS
--sample-cpu::
Record the sample cpu.
+--sample-identifier::
+ Record the sample identifier i.e. PERF_SAMPLE_IDENTIFIER bit set in
+ the sample_type member of the struct perf_event_attr argument to the
+ perf_event_open system call.
+
-n::
--no-samples::
Don't sample.
@@ -369,6 +397,10 @@ following filters are defined:
- abort_tx: only when the target is a hardware transaction abort
- cond: conditional branches
- save_type: save branch type during sampling in case binary is not available later
+ For the platforms with Intel Arch LBR support (12th-Gen+ client or
+ 4th-Gen Xeon+ server), the save branch type is unconditionally enabled
+ when the taken branch stack sampling is enabled.
+ - priv: save privilege state during sampling in case binary is not available later
+
The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
@@ -379,13 +411,17 @@ is enabled for all the sampling events. The sampled branch type is the same for
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.
+-W::
--weight::
Enable weightened sampling. An additional weight is recorded per sample and can be
displayed with the weight and local_weight sort keys. This currently works for TSX
abort events and some memory events in precise mode on modern Intel CPUs.
--namespaces::
-Record events of type PERF_RECORD_NAMESPACES.
+Record events of type PERF_RECORD_NAMESPACES. This enables 'cgroup_id' sort key.
+
+--all-cgroups::
+Record events of type PERF_RECORD_CGROUP. This enables 'cgroup' sort key.
--transaction::
Record transaction flags for transaction related events.
@@ -398,8 +434,11 @@ if combined with -a or -C options.
-D::
--delay=::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled), or enable events only for specified ranges of msecs (e.g.
+-D 10-20,30-40 means wait 10 msecs, enable for 10 msecs, wait 10 msecs, enable
+for 10 msecs, then stop). Note, delaying enabling of events is useful to filter
+out the startup phase of the program, which is often very different.
-I::
--intr-regs::
@@ -449,7 +488,9 @@ This option sets the time out limit. The default value is 500 ms.
--switch-events::
Record context switch events i.e. events of type PERF_RECORD_SWITCH or
-PERF_RECORD_SWITCH_CPU_WIDE.
+PERF_RECORD_SWITCH_CPU_WIDE. In some cases (e.g. Intel PT, CoreSight or Arm SPE)
+switch events will be enabled automatically, which can be suppressed by
+by the option --no-switch-events.
--clang-path=PATH::
Path to clang binary to use for compiling BPF scriptlets.
@@ -466,6 +507,9 @@ Specify vmlinux path which has debuginfo.
--buildid-all::
Record build-id of all DSOs regardless whether it's actually hit or not.
+--buildid-mmap::
+Record build ids in mmap2 events, disables build id cache (implies --no-buildid).
+
--aio[=n]::
Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4).
Asynchronous mode is supported only when linking Perf tool with libc library
@@ -547,6 +591,19 @@ overhead. You can still switch them on with:
--switch-output --no-no-buildid --no-no-buildid-cache
+--switch-output-event::
+Events that will cause the switch of the perf.data file, auto-selecting
+--switch-output=signal, the results are similar as internally the side band
+thread will also send a SIGUSR2 to the main one.
+
+Uses the same syntax as --event, it will just not be recorded, serving only to
+switch the perf.data file as soon as the --switch-output event is processed by
+a separate sideband thread.
+
+This sideband thread is also used to other purposes, like processing the
+PERF_RECORD_BPF_EVENT records as they happen, asking the kernel for extra BPF
+information, etc.
+
--switch-max-files=N::
When rotating perf.data with --switch-output, only keep N files.
@@ -558,6 +615,22 @@ options.
'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
in config file is set to true.
+--synth=TYPE::
+Collect and synthesize given type of events (comma separated). Note that
+this option controls the synthesis from the /proc filesystem which represent
+task status for pre-existing threads.
+
+Kernel (and some other) events are recorded regardless of the
+choice in this option. For example, --synth=no would have MMAP events for
+kernel and modules.
+
+Available types are:
+ 'task' - synthesize FORK and COMM events for each task
+ 'mmap' - synthesize MMAP events for each process (implies 'task')
+ 'cgroup' - synthesize CGROUP events for each cgroup
+ 'all' - synthesize all events (default)
+ 'no' - do not synthesize any of the above events
+
--tail-synthesize::
Instead of collecting non-sample events (for example, fork, comm, mmap) at
the beginning of record, collect them during finalizing an output file.
@@ -587,6 +660,131 @@ Make a copy of /proc/kcore and place it into a directory with the perf data file
Limit the sample data max size, <size> is expected to be a number with
appended unit character - B/K/M/G
+--num-thread-synthesize::
+ The number of threads to run when synthesizing events for existing processes.
+ By default, the number of threads equals 1.
+
+ifdef::HAVE_LIBPFM[]
+--pfm-events events::
+Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
+including support for event filters. For example '--pfm-events
+inst_retired:any_p:u:c=1:i'. More than one event can be passed to the
+option using the comma separator. Hardware events and generic hardware
+events cannot be mixed together. The latter must be used with the -e
+option. The -e option and this one can be mixed and matched. Events
+can be grouped using the {} notation.
+endif::HAVE_LIBPFM[]
+
+--control=fifo:ctl-fifo[,ack-fifo]::
+--control=fd:ctl-fd[,ack-fd]::
+ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
+Listen on ctl-fd descriptor for command to control measurement.
+
+Available commands:
+ 'enable' : enable events
+ 'disable' : disable events
+ 'enable name' : enable event 'name'
+ 'disable name' : disable event 'name'
+ 'snapshot' : AUX area tracing snapshot).
+ 'stop' : stop perf record
+ 'ping' : ping
+
+ 'evlist [-v|-g|-F] : display all events
+ -F Show just the sample frequency used for each event.
+ -v Show all fields.
+ -g Show event group information.
+
+Measurements can be started with events disabled using --delay=-1 option. Optionally
+send control command completion ('ack\n') to ack-fd descriptor to synchronize with the
+controlling process. Example of bash shell script to enable and disable events during
+measurements:
+
+ #!/bin/bash
+
+ ctl_dir=/tmp/
+
+ ctl_fifo=${ctl_dir}perf_ctl.fifo
+ test -p ${ctl_fifo} && unlink ${ctl_fifo}
+ mkfifo ${ctl_fifo}
+ exec {ctl_fd}<>${ctl_fifo}
+
+ ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+ test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+ mkfifo ${ctl_ack_fifo}
+ exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+ perf record -D -1 -e cpu-cycles -a \
+ --control fd:${ctl_fd},${ctl_fd_ack} \
+ -- sleep 30 &
+ perf_pid=$!
+
+ sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+ sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+ exec {ctl_fd_ack}>&-
+ unlink ${ctl_ack_fifo}
+
+ exec {ctl_fd}>&-
+ unlink ${ctl_fifo}
+
+ wait -n ${perf_pid}
+ exit $?
+
+--threads=<spec>::
+Write collected trace data into several data files using parallel threads.
+<spec> value can be user defined list of masks. Masks separated by colon
+define CPUs to be monitored by a thread and affinity mask of that thread
+is separated by slash:
+
+ <cpus mask 1>/<affinity mask 1>:<cpus mask 2>/<affinity mask 2>:...
+
+CPUs or affinity masks must not overlap with other corresponding masks.
+Invalid CPUs are ignored, but masks containing only invalid CPUs are not
+allowed.
+
+For example user specification like the following:
+
+ 0,2-4/2-4:1,5-7/5-7
+
+specifies parallel threads layout that consists of two threads,
+the first thread monitors CPUs 0 and 2-4 with the affinity mask 2-4,
+the second monitors CPUs 1 and 5-7 with the affinity mask 5-7.
+
+<spec> value can also be a string meaning predefined parallel threads
+layout:
+
+ cpu - create new data streaming thread for every monitored cpu
+ core - create new thread to monitor CPUs grouped by a core
+ package - create new thread to monitor CPUs grouped by a package
+ numa - create new threed to monitor CPUs grouped by a NUMA domain
+
+Predefined layouts can be used on systems with large number of CPUs in
+order not to spawn multiple per-cpu streaming threads but still avoid LOST
+events in data directory files. Option specified with no or empty value
+defaults to CPU layout. Masks defined or provided by the option value are
+filtered through the mask provided by -C option.
+
+--debuginfod[=URLs]::
+ Specify debuginfod URL to be used when cacheing perf.data binaries,
+ it follows the same syntax as the DEBUGINFOD_URLS variable, like:
+
+ http://192.168.122.174:8002
+
+ If the URLs is not specified, the value of DEBUGINFOD_URLS
+ system environment variable is used.
+
+--off-cpu::
+ Enable off-cpu profiling with BPF. The BPF program will collect
+ task scheduling information with (user) stacktrace and save them
+ as sample data of a software event named "offcpu-time". The
+ sample period will have the time the task slept in nanoseconds.
+
+ Note that BPF can collect stack traces using frame pointer ("fp")
+ only, as of now. So the applications built without the frame
+ pointer might see bogus addresses.
+
+include::intel-hybrid.txt[]
+
SEE ALSO
--------
-linkperf:perf-stat[1], linkperf:perf-list[1]
+linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]