Diffstat (limited to 'Documentation/bpf/btf.rst')
-rw-r--r-- Documentation/bpf/btf.rst 300
1 files changed, 239 insertions, 61 deletions
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst
index 4d565d202ce3..257a7e1cdf5d 100644
--- a/Documentation/bpf/btf.rst
+++ b/Documentation/bpf/btf.rst
@@ -3,7 +3,7 @@ BPF Type Format (BTF)
=====================
1. Introduction
-***************
+===============
BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF program/map. The name BTF was used initially to describe data
@@ -30,7 +30,7 @@ sections are discussed in details in :ref:`BTF_Type_String`.
.. _BTF_Type_String:
2. BTF Type and String Encoding
-*******************************
+===============================
The file ``include/uapi/linux/btf.h`` provides high-level definition of how
types/strings are encoded.
@@ -57,13 +57,13 @@ little-endian target. The ``btf_header`` is designed to be extensible with
generated.
2.1 String Encoding
-===================
+-------------------
The first string in the string section must be a null string. The rest of the
string table is a concatenation of other null-terminated strings.
2.2 Type Encoding
-=================
+-----------------
The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and type id is assigned to each recognized type starting from id
@@ -74,7 +74,7 @@ sequentially and type id is assigned to each recognized type starting from id
#define BTF_KIND_ARRAY 3 /* Array */
#define BTF_KIND_STRUCT 4 /* Struct */
#define BTF_KIND_UNION 5 /* Union */
- #define BTF_KIND_ENUM 6 /* Enumeration */
+ #define BTF_KIND_ENUM 6 /* Enumeration up to 32-bit values */
#define BTF_KIND_FWD 7 /* Forward */
#define BTF_KIND_TYPEDEF 8 /* Typedef */
#define BTF_KIND_VOLATILE 9 /* Volatile */
@@ -84,6 +84,10 @@ sequentially and type id is assigned to each recognized type starting from id
#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */
#define BTF_KIND_VAR 14 /* Variable */
#define BTF_KIND_DATASEC 15 /* Section */
+ #define BTF_KIND_FLOAT 16 /* Floating point */
+ #define BTF_KIND_DECL_TAG 17 /* Decl Tag */
+ #define BTF_KIND_TYPE_TAG 18 /* Type Tag */
+ #define BTF_KIND_ENUM64 19 /* Enumeration up to 64-bit values */
Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.
@@ -95,17 +99,17 @@ Each type contains the following common data::
/* "info" bits arrangement
* bits 0-15: vlen (e.g. # of struct's members)
* bits 16-23: unused
- * bits 24-27: kind (e.g. int, ptr, array...etc)
- * bits 28-30: unused
+ * bits 24-28: kind (e.g. int, ptr, array...etc)
+ * bits 29-30: unused
* bit 31: kind_flag, currently used by
- * struct, union and fwd
+ * struct, union, fwd, enum and enum64.
*/
__u32 info;
- /* "size" is used by INT, ENUM, STRUCT and UNION.
+ /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64.
* "size" tells the size of the type it is describing.
*
* "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
- * FUNC and FUNC_PROTO.
+ * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG.
* "type" is a type_id referring to another type.
*/
union {
@@ -268,20 +272,18 @@ In this case, if the base type is an int type, it must be a regular int type:
* ``BTF_INT_OFFSET()`` must be 0.
* ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
-The following kernel patch introduced ``kind_flag`` and explained why both
-modes exist:
-
- https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3
+Commit 9d5f9f701b18 introduced ``kind_flag`` and explains why both modes
+exist.
2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~
``struct btf_type`` encoding requirement:
* ``name_off``: 0 or offset to a valid C identifier
- * ``info.kind_flag``: 0
+ * ``info.kind_flag``: 0 for unsigned, 1 for signed
* ``info.kind``: BTF_KIND_ENUM
* ``info.vlen``: number of enum values
- * ``size``: 4
+ * ``size``: 1/2/4/8
``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::
@@ -294,6 +296,10 @@ The ``btf_enum`` encoding:
* ``name_off``: offset to a valid C identifier
* ``val``: any value
+If the original enum value is signed and the size is less than 4,
+that value will be sign extended into 4 bytes. If the size is 8,
+the value will be truncated into 4 bytes.
+
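The 4-byte storage rule above can be sketched in C. This is only an illustration under stated assumptions: ``enum_val_to_btf32`` is a hypothetical helper, not a kernel or libbpf function, and it returns the 32-bit pattern that ``btf_enum.val`` would hold for a signed original value of the given size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the 32-bit pattern stored in btf_enum.val
 * for a signed enumerator whose original type is `size` bytes wide.
 */
static uint32_t enum_val_to_btf32(int64_t value, unsigned int size)
{
	/* Keep the low 32 bits; for size == 8 this is the truncation step. */
	uint32_t low = (uint32_t)(uint64_t)value;

	assert(size == 1 || size == 2 || size == 4 || size == 8);

	if (size < 4) {
		unsigned int bits = size * 8;
		uint32_t mask = (1u << bits) - 1;
		uint32_t sign = 1u << (bits - 1);

		/* Sign-extend the narrow value into 4 bytes. */
		low &= mask;
		low = (low ^ sign) - sign;
	}
	return low;
}
```

For example, a 1-byte enumerator with value -1 is stored as ``0xffffffff``, while an 8-byte enumerator with value ``0x100000002`` keeps only its low word, ``2``.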
2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~
@@ -361,7 +367,8 @@ No additional type data follow ``btf_type``.
* ``name_off``: offset to a valid C identifier
* ``info.kind_flag``: 0
* ``info.kind``: BTF_KIND_FUNC
- * ``info.vlen``: 0
+ * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
+ or BTF_FUNC_EXTERN)
* ``type``: a BTF_KIND_FUNC_PROTO type
No additional type data follow ``btf_type``.
@@ -372,6 +379,9 @@ type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).
+Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
+supported in the kernel.
+
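As a rough sketch of how the ``info`` word and the linkage stored in ``info.vlen`` fit together, the following C fragment decodes the three fields and checks the linkage. The names are local stand-ins chosen for this example; the numeric values are assumed to match ``BTF_KIND_FUNC`` and the ``BTF_FUNC_*`` constants in ``include/uapi/linux/btf.h``.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins (assumed to mirror include/uapi/linux/btf.h). */
#define KIND_FUNC 12		/* BTF_KIND_FUNC */
enum func_linkage {
	FUNC_STATIC = 0,	/* BTF_FUNC_STATIC */
	FUNC_GLOBAL = 1,	/* BTF_FUNC_GLOBAL */
	FUNC_EXTERN = 2,	/* BTF_FUNC_EXTERN */
};

/* bits 24-28: kind */
static unsigned int info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

/* bits 0-15: vlen */
static unsigned int info_vlen(uint32_t info)
{
	return info & 0xffff;
}

/* bit 31: kind_flag */
static unsigned int info_kind_flag(uint32_t info)
{
	return info >> 31;
}

/* Returns 1 if a BTF_KIND_FUNC info word carries static or global linkage. */
static int func_linkage_supported(uint32_t info)
{
	unsigned int linkage;

	assert(info_kind(info) == KIND_FUNC);
	linkage = info_vlen(info);
	return linkage == FUNC_STATIC || linkage == FUNC_GLOBAL;
}
```

With this layout, an ``info`` value of ``(12 << 24) | 1`` describes a global function, and ``(12 << 24) | 2`` describes an extern one, which the check above reports as unsupported.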
2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -452,8 +462,95 @@ map definition.
* ``offset``: the in-section offset of the variable
* ``size``: the size of the variable in bytes
+2.2.16 BTF_KIND_FLOAT
+~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: any valid offset
+ * ``info.kind_flag``: 0
+ * ``info.kind``: BTF_KIND_FLOAT
+ * ``info.vlen``: 0
+ * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.
+
+No additional type data follow ``btf_type``.
+
+2.2.17 BTF_KIND_DECL_TAG
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: offset to a non-empty string
+ * ``info.kind_flag``: 0
+ * ``info.kind``: BTF_KIND_DECL_TAG
+ * ``info.vlen``: 0
+ * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``
+
+``btf_type`` is followed by ``struct btf_decl_tag``.::
+
+ struct btf_decl_tag {
+ __u32 component_idx;
+ };
+
+The ``name_off`` encodes the ``btf_decl_tag`` attribute string.
+The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
+For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
+For the other three types, if the btf_decl_tag attribute is
+applied to the ``struct``, ``union`` or ``func`` itself,
+``btf_decl_tag.component_idx`` must be ``-1``. Otherwise,
+the attribute is applied to a ``struct``/``union`` member or
+a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
+valid index (starting from 0) pointing to a member or an argument.
+
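The ``component_idx`` rules above can be restated as a small check. This is a sketch, not the kernel's validation code: ``enum tag_target`` and ``decl_tag_idx_ok`` are hypothetical names, and ``n_components`` stands for the number of members of the tagged ``struct``/``union`` or arguments of the tagged ``func``.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of the type a decl tag points at. */
enum tag_target {
	TARGET_STRUCT,
	TARGET_UNION,
	TARGET_FUNC,
	TARGET_VAR,
	TARGET_TYPEDEF,
};

/* Returns 1 if component_idx is acceptable for the given target. */
static int decl_tag_idx_ok(enum tag_target target, uint32_t component_idx,
			   uint32_t n_components)
{
	/* component_idx is a __u32, so "-1" is stored as all ones. */
	const uint32_t whole_decl = UINT32_MAX;

	assert((unsigned int)target <= TARGET_TYPEDEF);

	/* -1 tags the declaration itself and is valid for every target. */
	if (component_idx == whole_decl)
		return 1;

	/* var and typedef have no components to index. */
	if (target == TARGET_VAR || target == TARGET_TYPEDEF)
		return 0;

	/* struct/union member or func argument, counted from 0. */
	return component_idx < n_components;
}
```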
+2.2.18 BTF_KIND_TYPE_TAG
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: offset to a non-empty string
+ * ``info.kind_flag``: 0
+ * ``info.kind``: BTF_KIND_TYPE_TAG
+ * ``info.vlen``: 0
+ * ``type``: the type with ``btf_type_tag`` attribute
+
+Currently, ``BTF_KIND_TYPE_TAG`` is only emitted for pointer types.
+It has the following btf type chain:
+::
+
+ ptr -> [type_tag]*
+ -> [const | volatile | restrict | typedef]*
+ -> base_type
+
+In other words, a pointer type points to zero or more
+type_tags, then zero or more const/volatile/restrict/typedef
+modifiers, and finally the base type. The base type is one of the
+int, ptr, array, struct, union, enum, func_proto and float types.
+
+2.2.19 BTF_KIND_ENUM64
+~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: 0 or offset to a valid C identifier
+ * ``info.kind_flag``: 0 for unsigned, 1 for signed
+ * ``info.kind``: BTF_KIND_ENUM64
+ * ``info.vlen``: number of enum values
+ * ``size``: 1/2/4/8
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum64``.::
+
+ struct btf_enum64 {
+ __u32 name_off;
+ __u32 val_lo32;
+ __u32 val_hi32;
+ };
+
+The ``btf_enum64`` encoding:
+ * ``name_off``: offset to a valid C identifier
+  * ``val_lo32``: lower 32 bits of the 64-bit value
+  * ``val_hi32``: upper 32 bits of the 64-bit value
+
+If the original enum value is signed and the size is less than 8,
+that value will be sign extended into 8 bytes.
+
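A consumer has to recombine the two halves before using the value. The sketch below assumes a local mirror of ``struct btf_enum64`` built from fixed-width types; ``enum64_rec`` and ``enum64_value`` are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct btf_enum64, using fixed-width types. */
struct enum64_rec {
	uint32_t name_off;
	uint32_t val_lo32;
	uint32_t val_hi32;
};

/* Rebuild the 64-bit enumerator value from its two 32-bit halves. */
static uint64_t enum64_value(const struct enum64_rec *e)
{
	assert(e != NULL);
	return ((uint64_t)e->val_hi32 << 32) | e->val_lo32;
}
```

Whether the result is read as signed or unsigned is decided by ``info.kind_flag`` of the enclosing type, not by the record itself.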
3. BTF Kernel API
-*****************
+=================
The following bpf syscall command involves BTF:
* BPF_BTF_LOAD: load a blob of BTF data into kernel
@@ -496,14 +593,14 @@ The workflow typically looks like:
3.1 BPF_BTF_LOAD
-================
+----------------
Load a blob of BTF data into the kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to userspace.
3.2 BPF_MAP_CREATE
-==================
+------------------
A map can be created with ``btf_fd`` and specified key/value type id.::
@@ -514,23 +611,20 @@ A map can be created with ``btf_fd`` and specified key/value type id.::
In libbpf, the map can be defined with extra annotation like below:
::
- struct bpf_map_def SEC("maps") btf_map = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(int),
- .value_size = sizeof(struct ipv_counts),
- .max_entries = 4,
- };
- BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
+ struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct ipv_counts);
+ __uint(max_entries, 4);
+ } btf_map SEC(".maps");
-Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
-value types for the map. During ELF parsing, libbpf is able to extract
-key/value type_id's and assign them to BPF_MAP_CREATE attributes
-automatically.
+During ELF parsing, libbpf is able to extract key/value type_id's and assign
+them to BPF_MAP_CREATE attributes automatically.
.. _BPF_Prog_Load:
3.3 BPF_PROG_LOAD
-=================
+-----------------
During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
@@ -580,7 +674,7 @@ For line_info, the line number and column number are defined as below:
#define BPF_LINE_INFO_LINE_COL(line_col) ((line_col) & 0x3ff)
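The packing can be illustrated with a short C sketch. The helper names are hypothetical; the layout assumed here is the one implied by the macro above, with the column in the low 10 bits and the line number in the remaining upper 22 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Line number: upper 22 bits of line_col. */
static uint32_t line_info_line_num(uint32_t line_col)
{
	return line_col >> 10;
}

/* Column number: lower 10 bits of line_col. */
static uint32_t line_info_line_col(uint32_t line_col)
{
	return line_col & 0x3ff;
}

/* Pack a line/column pair into a single line_col word. */
static uint32_t line_info_pack(uint32_t line, uint32_t col)
{
	assert(line <= 0x3fffff);
	assert(col <= 0x3ff);
	return (line << 10) | col;
}
```

For instance, the value ``8206`` decodes to line 8, column 14, since ``8 * 1024 + 14 = 8206``.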
3.4 BPF_{PROG,MAP}_GET_NEXT_ID
-==============================
+------------------------------
In kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.
@@ -590,13 +684,13 @@ each command, to user space, for bpf program or maps, respectively, so an
inspection tool can inspect all programs and maps.
3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
-===============================
+-------------------------------
An introspection tool cannot use an id directly to get details about programs
or maps. A file descriptor must be obtained first for reference-counting
purposes.
3.6 BPF_OBJ_GET_INFO_BY_FD
-==========================
+--------------------------
Once a program/map fd is acquired, an introspection tool can get the detailed
information from kernel about this fd, some of which are BTF-related. For
@@ -605,7 +699,7 @@ example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
bpf byte codes, and jited_line_info.
3.7 BPF_BTF_GET_FD_BY_ID
-========================
+------------------------
With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
@@ -617,10 +711,10 @@ tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.
4. ELF File Format Interface
-****************************
+============================
4.1 .BTF section
-================
+----------------
The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.
@@ -628,10 +722,10 @@ same as the one describe in :ref:`BTF_Type_String`.
.. _BTF_Ext_Section:
4.2 .BTF.ext section
-====================
+--------------------
-The .BTF.ext section encodes func_info and line_info which needs loader
-manipulation before loading into the kernel.
+The .BTF.ext section encodes func_info, line_info and CO-RE relocations,
+which need loader manipulation before loading into the kernel.
The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.
@@ -649,15 +743,20 @@ The current header of .BTF.ext section::
__u32 func_info_len;
__u32 line_info_off;
__u32 line_info_len;
+
+ /* optional part of .BTF.ext header */
+ __u32 core_relo_off;
+ __u32 core_relo_len;
};
It is very similar to .BTF section. Instead of type/string section, it
-contains func_info and line_info section. See :ref:`BPF_Prog_Load` for details
-about func_info and line_info record format.
+contains func_info, line_info and core_relo sub-sections.
+See :ref:`BPF_Prog_Load` for details about func_info and line_info
+record format.
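Because the ``core_relo`` fields are optional, a reader has to decide from the header length whether they are present. The following is a sketch under assumptions: ``ext_header`` is a local mirror of the header, its leading fields (``magic``, ``version``, ``flags``, ``hdr_len``) are reproduced from the libbpf definition as best recalled rather than from the excerpt above, and ``ext_has_core_relo`` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the .BTF.ext header (leading fields are an assumption). */
struct ext_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;

	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;

	/* optional part of the header */
	uint32_t core_relo_off;
	uint32_t core_relo_len;
};

/* Returns 1 if hdr_len is large enough to cover the core_relo fields. */
static int ext_has_core_relo(const struct ext_header *hdr)
{
	const size_t needed = offsetof(struct ext_header, core_relo_len) +
			      sizeof(hdr->core_relo_len);

	assert(hdr != NULL);
	return hdr->hdr_len >= needed;
}
```

Under this layout a header without the optional part is 24 bytes long and one with it is 32 bytes long, so older objects keep parsing correctly.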
The func_info is organized as below.::
- func_info_rec_size
+ func_info_rec_size /* __u32 value */
btf_ext_info_sec for section #1 /* func_info for section #1 */
btf_ext_info_sec for section #2 /* func_info for section #2 */
...
@@ -677,7 +776,7 @@ Here, num_info must be greater than 0.
The line_info is organized as below.::
- line_info_rec_size
+ line_info_rec_size /* __u32 value */
btf_ext_info_sec for section #1 /* line_info for section #1 */
btf_ext_info_sec for section #2 /* line_info for section #2 */
...
@@ -691,11 +790,86 @@ kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of section (``btf_ext_info_sec->sec_name_off``).
+The core_relo is organized as below.::
+
+ core_relo_rec_size /* __u32 value */
+ btf_ext_info_sec for section #1 /* core_relo for section #1 */
+ btf_ext_info_sec for section #2 /* core_relo for section #2 */
+
+``core_relo_rec_size`` specifies the size of ``bpf_core_relo``
+structure when .BTF.ext is generated. All ``bpf_core_relo`` structures
+within a single ``btf_ext_info_sec`` describe relocations applied to
+section named by ``btf_ext_info_sec->sec_name_off``.
+
+See :ref:`Documentation/bpf/llvm_reloc.rst <btf-co-re-relocations>`
+for more information on CO-RE relocations.
+
+4.3 .BTF_ids section
+--------------------
+
+The .BTF_ids section encodes BTF ID values that are used within the kernel.
+
+This section is created during the kernel compilation with the help of
+macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can
+use them to create lists and sets (sorted lists) of BTF ID values.
+
+The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
+values, with the following syntax::
+
+ BTF_ID_LIST(list)
+ BTF_ID(type1, name1)
+ BTF_ID(type2, name2)
+
+resulting in the following layout in the .BTF_ids section::
+
+ __BTF_ID__type1__name1__1:
+ .zero 4
+ __BTF_ID__type2__name2__2:
+ .zero 4
+
+The ``u32 list[];`` variable is defined to access the list.
+
+The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It is used to
+define an unused entry in a BTF_ID_LIST, like::
+
+ BTF_ID_LIST(bpf_skb_output_btf_ids)
+ BTF_ID(struct, sk_buff)
+ BTF_ID_UNUSED
+ BTF_ID(struct, task_struct)
+
+The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
+and their count, with the following syntax::
+
+ BTF_SET_START(set)
+ BTF_ID(type1, name1)
+ BTF_ID(type2, name2)
+ BTF_SET_END(set)
+
+resulting in the following layout in the .BTF_ids section::
+
+ __BTF_ID__set__set:
+ .zero 4
+ __BTF_ID__type1__name1__3:
+ .zero 4
+ __BTF_ID__type2__name2__4:
+ .zero 4
+
+The ``struct btf_id_set set;`` variable is defined to access the list.
+
+The ``typeX`` name can be one of the following::
+
+ struct, union, typedef, func
+
+and is used as a filter when resolving the BTF ID value.
+
+All the BTF ID lists and sets are compiled in the .BTF_ids section and
+resolved during the linking phase of the kernel build by the
+``resolve_btfids`` tool.
+
5. Using BTF
-************
+============
5.1 bpftool map pretty print
-============================
+----------------------------
With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structure or if your data
@@ -712,13 +886,12 @@ structure has bitfields. For example, for the following map,::
___A b1:4;
enum A b2:4;
};
- struct bpf_map_def SEC("maps") tmpmap = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(__u32),
- .value_size = sizeof(struct tmp_t),
- .max_entries = 1,
- };
- BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);
+ struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct tmp_t);
+ __uint(max_entries, 1);
+ } tmpmap SEC(".maps");
bpftool is able to pretty print like below:
::
@@ -737,7 +910,7 @@ bpftool is able to pretty print like below:
]
5.2 bpftool prog dump
-=====================
+---------------------
The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
@@ -771,7 +944,7 @@ information.::
[...]
5.3 Verifier Log
-================
+----------------
The following is an example of how line_info can help debugging verification
failure.::
@@ -797,7 +970,7 @@ failure.::
R2 offset is outside of the packet
6. BTF Generation
-*****************
+=================
You need latest pahole
@@ -834,7 +1007,7 @@ format.::
} g2;
int main() { return 0; }
int test() { return 0; }
- -bash-4.4$ clang -c -g -O2 -target bpf t2.c
+ -bash-4.4$ clang -c -g -O2 --target=bpf t2.c
-bash-4.4$ readelf -S t2.o
......
[ 8] .BTF PROGBITS 0000000000000000 00000247
@@ -844,7 +1017,7 @@ format.::
[10] .rel.BTF.ext REL 0000000000000000 000007e0
0000000000000040 0000000000000010 16 9 8
......
- -bash-4.4$ clang -S -g -O2 -target bpf t2.c
+ -bash-4.4$ clang -S -g -O2 --target=bpf t2.c
-bash-4.4$ cat t2.s
......
.section .BTF,"",@progbits
@@ -904,6 +1077,11 @@ format.::
.long 8206 # Line 8 Col 14
7. Testing
-**********
+==========
+
+The kernel BPF selftest `tools/testing/selftests/bpf/prog_tests/btf.c`_
+provides an extensive set of BTF-related tests.
-Kernel bpf selftest `test_btf.c` provides extensive set of BTF-related tests.
+.. Links
+.. _tools/testing/selftests/bpf/prog_tests/btf.c:
+ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/btf.c