author: patrick <patrick@openbsd.org> 2020-08-03 15:06:44 +0000
committer: patrick <patrick@openbsd.org> 2020-08-03 15:06:44 +0000
commit: b64793999546ed8adebaeebd9d8345d18db8927d (patch)
tree: 4357c27b561d73b0e089727c6ed659f2ceff5f47 /gnu/llvm/docs/CommandGuide/llvm-exegesis.rst
parent: Add support for UTF-8 DISPLAY-HINTs with octet length. For now only (diff)
Remove LLVM 8.0.1 files.
Diffstat (limited to 'gnu/llvm/docs/CommandGuide/llvm-exegesis.rst')
-rw-r--r-- gnu/llvm/docs/CommandGuide/llvm-exegesis.rst 236
1 files changed, 0 insertions, 236 deletions
diff --git a/gnu/llvm/docs/CommandGuide/llvm-exegesis.rst b/gnu/llvm/docs/CommandGuide/llvm-exegesis.rst
deleted file mode 100644
index f27db9e57ed..00000000000
--- a/gnu/llvm/docs/CommandGuide/llvm-exegesis.rst
+++ /dev/null
@@ -1,236 +0,0 @@
-llvm-exegesis - LLVM Machine Instruction Benchmark
-==================================================
-
-SYNOPSIS
---------
-
-:program:`llvm-exegesis` [*options*]
-
-DESCRIPTION
------------
-
-:program:`llvm-exegesis` is a benchmarking tool that uses information available
-in LLVM to measure host machine instruction characteristics like latency or port
-decomposition.
-
-Given an LLVM opcode name and a benchmarking mode, :program:`llvm-exegesis`
-generates a code snippet that makes execution as serial as possible (to measure
-the latency of the instruction) or as parallel as possible (to measure its uop
-decomposition).
-The code snippet is JIT-compiled and executed on the host subtarget. The time
-taken (for latency) or the resource usage (for uops) is measured using hardware
-performance counters. The result is printed as YAML to standard output.
-
-The main goal of this tool is to automatically (in)validate LLVM's TableGen
-scheduling models. To that end, we also provide analysis of the results.
-
-:program:`llvm-exegesis` can also benchmark arbitrary user-provided code
-snippets.
-
-EXAMPLE 1: benchmarking instructions
-------------------------------------
-
-Assume you have an X86-64 machine. To measure the latency of a single
-instruction, run:
-
-.. code-block:: bash
-
- $ llvm-exegesis -mode=latency -opcode-name=ADD64rr
-
-Measuring the uop decomposition of an instruction works similarly:
-
-.. code-block:: bash
-
- $ llvm-exegesis -mode=uops -opcode-name=ADD64rr
-
-The output is a YAML document (the default is to write to stdout, but you can
-redirect the output to a file using `-benchmarks-file`):
-
-.. code-block:: none
-
- ---
- key:
- opcode_name: ADD64rr
- mode: latency
- config: ''
- cpu_name: haswell
- llvm_triple: x86_64-unknown-linux-gnu
- num_repetitions: 10000
- measurements:
- - { key: latency, value: 1.0058, debug_string: '' }
- error: ''
- info: 'explicit self cycles, selecting one aliasing configuration.
- Snippet:
- ADD64rr R8, R8, R10
- '
- ...
-
-To measure the latency of all instructions for the host architecture, run:
-
-.. code-block:: bash
-
- #!/bin/bash
- readonly INSTRUCTIONS=$(($(grep INSTRUCTION_LIST_END build/lib/Target/X86/X86GenInstrInfo.inc | cut -f2 -d=) - 1))
- for INSTRUCTION in $(seq 1 ${INSTRUCTIONS});
- do
- ./build/bin/llvm-exegesis -mode=latency -opcode-index=${INSTRUCTION} | sed -n '/---/,$p'
- done
-
-FIXME: Provide an :program:`llvm-exegesis` option to test all instructions.
-
-
-EXAMPLE 2: benchmarking a custom code snippet
----------------------------------------------
-
-To measure the latency/uops of a custom piece of code, you can specify the
-`snippets-file` option (`-` reads from standard input).
-
-.. code-block:: bash
-
- $ echo "vzeroupper" | llvm-exegesis -mode=uops -snippets-file=-
-
-Real-life code snippets typically depend on registers or memory.
-:program:`llvm-exegesis` checks the liveness of registers (i.e. any register
-use has a corresponding def or is a "live in"). If your code depends on the
-value of some registers, you have two options:
-
-- Mark the register as requiring a definition. :program:`llvm-exegesis` will
- automatically assign a value to the register. This can be done using the
- directive `LLVM-EXEGESIS-DEFREG <reg_name> <hex_value>`, where `<hex_value>`
- is a bit pattern used to fill `<reg_name>`. If `<hex_value>` is smaller than
- the register width, it will be sign-extended.
-- Mark the register as a "live in". :program:`llvm-exegesis` will benchmark
- using whatever value was in this register on entry. This can be done using
- the directive `LLVM-EXEGESIS-LIVEIN <reg_name>`.
-
-For example, the following code snippet depends on the values of XMM1 (which
-will be set by the tool) and the memory buffer passed in RDI (live in).
-
-.. code-block:: none
-
- # LLVM-EXEGESIS-LIVEIN RDI
- # LLVM-EXEGESIS-DEFREG XMM1 42
- vmulps (%rdi), %xmm1, %xmm2
- vhaddps %xmm2, %xmm2, %xmm3
- addq $0x10, %rdi
-
-
-EXAMPLE 3: analysis
--------------------
-
-Assuming you have a set of benchmarked instructions (either latency or uops) as
-YAML in file `/tmp/benchmarks.yaml`, you can analyze the results using the
-following command:
-
-.. code-block:: bash
-
- $ llvm-exegesis -mode=analysis \
- -benchmarks-file=/tmp/benchmarks.yaml \
- -analysis-clusters-output-file=/tmp/clusters.csv \
- -analysis-inconsistencies-output-file=/tmp/inconsistencies.html
-
-This will group the instructions into clusters with the same performance
-characteristics. The clusters will be written out to `/tmp/clusters.csv` in the
-following format:
-
-.. code-block:: none
-
- cluster_id,opcode_name,config,sched_class
- ...
- 2,ADD32ri8_DB,,WriteALU,1.00
- 2,ADD32ri_DB,,WriteALU,1.01
- 2,ADD32rr,,WriteALU,1.01
- 2,ADD32rr_DB,,WriteALU,1.00
- 2,ADD32rr_REV,,WriteALU,1.00
- 2,ADD64i32,,WriteALU,1.01
- 2,ADD64ri32,,WriteALU,1.01
- 2,MOVSX64rr32,,BSWAP32r_BSWAP64r_MOVSX64rr32,1.00
- 2,VPADDQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.02
- 2,VPSUBQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.01
- 2,ADD64ri8,,WriteALU,1.00
- 2,SETBr,,WriteSETCC,1.01
- ...
-
-:program:`llvm-exegesis` will also analyze the clusters to point out
-inconsistencies in the scheduling information. The output is an HTML file. For
-example, `/tmp/inconsistencies.html` will contain messages like the following:
-
-.. image:: llvm-exegesis-analysis.png
- :align: center
-
-Note that the scheduling class names will be resolved only when
-:program:`llvm-exegesis` is compiled in debug mode; otherwise only the class ID
-will be shown. This does not invalidate any of the analysis results.
-
-
-OPTIONS
--------
-
-.. option:: -help
-
- Print a summary of command line options.
-
-.. option:: -opcode-index=<LLVM opcode index>
-
- Specify the opcode to measure, by index. See example 1 for details.
- Either `opcode-index`, `opcode-name` or `snippets-file` must be set.
-
-.. option:: -opcode-name=<opcode name 1>,<opcode name 2>,...
-
- Specify the opcode to measure, by name. Several opcodes can be specified as
- a comma-separated list. See example 1 for details.
- Either `opcode-index`, `opcode-name` or `snippets-file` must be set.
-
-.. option:: -snippets-file=<filename>
-
- Specify the custom code snippet to measure. See example 2 for details.
- Either `opcode-index`, `opcode-name` or `snippets-file` must be set.
-
-.. option:: -mode=[latency|uops|analysis]
-
- Specify the run mode.
-
-.. option:: -num-repetitions=<Number of repetitions>
-
- Specify the number of repetitions of the asm snippet.
- Higher values lead to more accurate measurements but lengthen the benchmark.
-
-.. option:: -benchmarks-file=</path/to/file>
-
- File to read (`analysis` mode) or write (`latency`/`uops` modes) benchmark
- results. "-" uses stdin/stdout.
-
-.. option:: -analysis-clusters-output-file=</path/to/file>
-
- If provided, write the analysis clusters as CSV to this file. "-" prints to
- stdout.
-
-.. option:: -analysis-inconsistencies-output-file=</path/to/file>
-
- If non-empty, write inconsistencies found during analysis to this file. `-`
- prints to stdout.
-
-.. option:: -analysis-numpoints=<dbscan numPoints parameter>
-
- Specify the numPoints parameter to be used for DBSCAN clustering
- (`analysis` mode).
-
-.. option:: -analysis-epsilon=<dbscan epsilon parameter>
-
- Specify the epsilon parameter to be used for DBSCAN clustering
- (`analysis` mode).
-
-.. option:: -ignore-invalid-sched-class=false
-
- If set, ignore instructions that do not have a sched class (class idx = 0).
-
-.. option:: -mcpu=<cpu name>
-
- If set, measure the CPU characteristics using the counters for this CPU. This
- is useful when creating new sched models (when the host CPU is unknown to LLVM).
-
-EXIT STATUS
------------
-
-:program:`llvm-exegesis` returns 0 on success. Otherwise, an error message is
-printed to standard error, and the tool returns a non-zero value.