diff options
Diffstat (limited to 'Documentation/bpf/btf.rst')
-rw-r--r-- | Documentation/bpf/btf.rst | 300 |
1 files changed, 239 insertions, 61 deletions
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst index 4d565d202ce3..257a7e1cdf5d 100644 --- a/Documentation/bpf/btf.rst +++ b/Documentation/bpf/btf.rst @@ -3,7 +3,7 @@ BPF Type Format (BTF) ===================== 1. Introduction -*************** +=============== BTF (BPF Type Format) is the metadata format which encodes the debug info related to BPF program/map. The name BTF was used initially to describe data @@ -30,7 +30,7 @@ sections are discussed in details in :ref:`BTF_Type_String`. .. _BTF_Type_String: 2. BTF Type and String Encoding -******************************* +=============================== The file ``include/uapi/linux/btf.h`` provides high-level definition of how types/strings are encoded. @@ -57,13 +57,13 @@ little-endian target. The ``btf_header`` is designed to be extensible with generated. 2.1 String Encoding -=================== +------------------- The first string in the string section must be a null string. The rest of string table is a concatenation of other null-terminated strings. 2.2 Type Encoding -================= +----------------- The type id ``0`` is reserved for ``void`` type. The type section is parsed sequentially and type id is assigned to each recognized type starting from id @@ -74,7 +74,7 @@ sequentially and type id is assigned to each recognized type starting from id #define BTF_KIND_ARRAY 3 /* Array */ #define BTF_KIND_STRUCT 4 /* Struct */ #define BTF_KIND_UNION 5 /* Union */ - #define BTF_KIND_ENUM 6 /* Enumeration */ + #define BTF_KIND_ENUM 6 /* Enumeration up to 32-bit values */ #define BTF_KIND_FWD 7 /* Forward */ #define BTF_KIND_TYPEDEF 8 /* Typedef */ #define BTF_KIND_VOLATILE 9 /* Volatile */ @@ -84,6 +84,10 @@ sequentially and type id is assigned to each recognized type starting from id #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ #define BTF_KIND_VAR 14 /* Variable */ #define BTF_KIND_DATASEC 15 /* Section */ + #define BTF_KIND_FLOAT 16 /* Floating point */ + #define BTF_KIND_DECL_TAG 17 /* Decl Tag */ + #define BTF_KIND_TYPE_TAG 18 /* Type Tag */ + #define BTF_KIND_ENUM64 19 /* Enumeration up to 64-bit values */ Note that the type section encodes debug info, not just pure types. ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram. @@ -95,17 +99,17 @@ Each type contains the following common data:: /* "info" bits arrangement * bits 0-15: vlen (e.g. # of struct's members) * bits 16-23: unused - * bits 24-27: kind (e.g. int, ptr, array...etc) - * bits 28-30: unused + * bits 24-28: kind (e.g. int, ptr, array...etc) + * bits 29-30: unused * bit 31: kind_flag, currently used by - * struct, union and fwd + * struct, union, fwd, enum and enum64. */ __u32 info; - /* "size" is used by INT, ENUM, STRUCT and UNION. + /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64. * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC and FUNC_PROTO. + * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG. * "type" is a type_id referring to another type. */ union { @@ -268,20 +272,18 @@ In this case, if the base type is an int type, it must be a regular int type: * ``BTF_INT_OFFSET()`` must be 0. * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``. -The following kernel patch introduced ``kind_flag`` and explained why both -modes exist: - - https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3 +Commit 9d5f9f701b18 introduced ``kind_flag`` and explains why both modes +exist. 2.2.6 BTF_KIND_ENUM ~~~~~~~~~~~~~~~~~~~ ``struct btf_type`` encoding requirement: * ``name_off``: 0 or offset to a valid C identifier - * ``info.kind_flag``: 0 + * ``info.kind_flag``: 0 for unsigned, 1 for signed * ``info.kind``: BTF_KIND_ENUM * ``info.vlen``: number of enum values - * ``size``: 4 + * ``size``: 1/2/4/8 ``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.:: @@ -294,6 +296,10 @@ The ``btf_enum`` encoding: * ``name_off``: offset to a valid C identifier * ``val``: any value +If the original enum value is signed and the size is less than 4, +that value will be sign extended into 4 bytes. If the size is 8, +the value will be truncated into 4 bytes. + 2.2.7 BTF_KIND_FWD ~~~~~~~~~~~~~~~~~~ @@ -361,7 +367,8 @@ No additional type data follow ``btf_type``. * ``name_off``: offset to a valid C identifier * ``info.kind_flag``: 0 * ``info.kind``: BTF_KIND_FUNC - * ``info.vlen``: 0 + * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL + or BTF_FUNC_EXTERN) * ``type``: a BTF_KIND_FUNC_PROTO type No additional type data follow ``btf_type``. @@ -372,6 +379,9 @@ type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the :ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load` (ABI). +Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are +supported in the kernel. + 2.2.13 BTF_KIND_FUNC_PROTO ~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -452,8 +462,95 @@ map definition. * ``offset``: the in-section offset of the variable * ``size``: the size of the variable in bytes +2.2.16 BTF_KIND_FLOAT +~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: any valid offset + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_FLOAT + * ``info.vlen``: 0 + * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16. + +No additional type data follow ``btf_type``. + +2.2.17 BTF_KIND_DECL_TAG +~~~~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: offset to a non-empty string + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_DECL_TAG + * ``info.vlen``: 0 + * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef`` + +``btf_type`` is followed by ``struct btf_decl_tag``.:: + + struct btf_decl_tag { + __u32 component_idx; + }; + +The ``name_off`` encodes btf_decl_tag attribute string. +The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``. +For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``. +For the other three types, if the btf_decl_tag attribute is +applied to the ``struct``, ``union`` or ``func`` itself, +``btf_decl_tag.component_idx`` must be ``-1``. Otherwise, +the attribute is applied to a ``struct``/``union`` member or +a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a +valid index (starting from 0) pointing to a member or an argument. + +2.2.18 BTF_KIND_TYPE_TAG +~~~~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: offset to a non-empty string + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_TYPE_TAG + * ``info.vlen``: 0 + * ``type``: the type with ``btf_type_tag`` attribute + +Currently, ``BTF_KIND_TYPE_TAG`` is only emitted for pointer types. +It has the following btf type chain: +:: + + ptr -> [type_tag]* + -> [const | volatile | restrict | typedef]* + -> base_type + +Basically, a pointer type points to zero or more +type_tag, then zero or more const/volatile/restrict/typedef +and finally the base type. The base type is one of +int, ptr, array, struct, union, enum, func_proto and float types. + +2.2.19 BTF_KIND_ENUM64 +~~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: 0 or offset to a valid C identifier + * ``info.kind_flag``: 0 for unsigned, 1 for signed + * ``info.kind``: BTF_KIND_ENUM64 + * ``info.vlen``: number of enum values + * ``size``: 1/2/4/8 + +``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum64``.:: + + struct btf_enum64 { + __u32 name_off; + __u32 val_lo32; + __u32 val_hi32; + }; + +The ``btf_enum64`` encoding: + * ``name_off``: offset to a valid C identifier + * ``val_lo32``: lower 32-bit value for a 64-bit value + * ``val_hi32``: high 32-bit value for a 64-bit value + +If the original enum value is signed and the size is less than 8, +that value will be sign extended into 8 bytes. + 3. BTF Kernel API -***************** +================= The following bpf syscall command involves BTF: * BPF_BTF_LOAD: load a blob of BTF data into kernel @@ -496,14 +593,14 @@ The workflow typically looks like: 3.1 BPF_BTF_LOAD -================ +---------------- Load a blob of BTF data into kernel. A blob of data, described in :ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd`` is returned to a userspace. 3.2 BPF_MAP_CREATE -================== +------------------ A map can be created with ``btf_fd`` and specified key/value type id.:: @@ -514,23 +611,20 @@ A map can be created with ``btf_fd`` and specified key/value type id.:: In libbpf, the map can be defined with extra annotation like below: :: - struct bpf_map_def SEC("maps") btf_map = { - .type = BPF_MAP_TYPE_ARRAY, - .key_size = sizeof(int), - .value_size = sizeof(struct ipv_counts), - .max_entries = 4, - }; - BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts); + struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, int); + __type(value, struct ipv_counts); + __uint(max_entries, 4); + } btf_map SEC(".maps"); -Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and -value types for the map. During ELF parsing, libbpf is able to extract -key/value type_id's and assign them to BPF_MAP_CREATE attributes -automatically. +During ELF parsing, libbpf is able to extract key/value type_id's and assign +them to BPF_MAP_CREATE attributes automatically. .. _BPF_Prog_Load: 3.3 BPF_PROG_LOAD -================= +----------------- During prog_load, func_info and line_info can be passed to kernel with proper values for the following attributes: @@ -580,7 +674,7 @@ For line_info, the line number and column number are defined as below: #define BPF_LINE_INFO_LINE_COL(line_col) ((line_col) & 0x3ff) 3.4 BPF_{PROG,MAP}_GET_NEXT_ID -============================== +------------------------------ In kernel, every loaded program, map or btf has a unique id. The id won't change during the lifetime of a program, map, or btf. @@ -590,13 +684,13 @@ each command, to user space, for bpf program or maps, respectively, so an inspection tool can inspect all programs and maps. 3.5 BPF_{PROG,MAP}_GET_FD_BY_ID -=============================== +------------------------------- An introspection tool cannot use id to get details about program or maps. A file descriptor needs to be obtained first for reference-counting purpose. 3.6 BPF_OBJ_GET_INFO_BY_FD -========================== +-------------------------- Once a program/map fd is acquired, an introspection tool can get the detailed information from kernel about this fd, some of which are BTF-related. For @@ -605,7 +699,7 @@ example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids. bpf byte codes, and jited_line_info. 3.7 BPF_BTF_GET_FD_BY_ID -======================== +------------------------ With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with @@ -617,10 +711,10 @@ tool has full btf knowledge and is able to pretty print map key/values, dump func signatures and line info, along with byte/jit codes. 4. ELF File Format Interface -**************************** +============================ 4.1 .BTF section -================ +---------------- The .BTF section contains type and string data. The format of this section is same as the one describe in :ref:`BTF_Type_String`. @@ -628,10 +722,10 @@ same as the one describe in :ref:`BTF_Type_String`. .. _BTF_Ext_Section: 4.2 .BTF.ext section -==================== +-------------------- -The .BTF.ext section encodes func_info and line_info which needs loader -manipulation before loading into the kernel. +The .BTF.ext section encodes func_info, line_info and CO-RE relocations +which needs loader manipulation before loading into the kernel. The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h`` and ``tools/lib/bpf/btf.c``. @@ -649,15 +743,20 @@ The current header of .BTF.ext section:: __u32 func_info_len; __u32 line_info_off; __u32 line_info_len; + + /* optional part of .BTF.ext header */ + __u32 core_relo_off; + __u32 core_relo_len; }; It is very similar to .BTF section. Instead of type/string section, it -contains func_info and line_info section. See :ref:`BPF_Prog_Load` for details -about func_info and line_info record format. +contains func_info, line_info and core_relo sub-sections. +See :ref:`BPF_Prog_Load` for details about func_info and line_info +record format. The func_info is organized as below.:: - func_info_rec_size + func_info_rec_size /* __u32 value */ btf_ext_info_sec for section #1 /* func_info for section #1 */ btf_ext_info_sec for section #2 /* func_info for section #2 */ ... @@ -677,7 +776,7 @@ Here, num_info must be greater than 0. The line_info is organized as below.:: - line_info_rec_size + line_info_rec_size /* __u32 value */ btf_ext_info_sec for section #1 /* line_info for section #1 */ btf_ext_info_sec for section #2 /* line_info for section #2 */ ... @@ -691,11 +790,86 @@ kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the beginning of section (``btf_ext_info_sec->sec_name_off``). +The core_relo is organized as below.:: + + core_relo_rec_size /* __u32 value */ + btf_ext_info_sec for section #1 /* core_relo for section #1 */ + btf_ext_info_sec for section #2 /* core_relo for section #2 */ + +``core_relo_rec_size`` specifies the size of ``bpf_core_relo`` +structure when .BTF.ext is generated. All ``bpf_core_relo`` structures +within a single ``btf_ext_info_sec`` describe relocations applied to +section named by ``btf_ext_info_sec->sec_name_off``. + +See :ref:`Documentation/bpf/llvm_reloc.rst <btf-co-re-relocations>` +for more information on CO-RE relocations. + +4.2 .BTF_ids section +-------------------- + +The .BTF_ids section encodes BTF ID values that are used within the kernel. + +This section is created during the kernel compilation with the help of +macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can +use them to create lists and sets (sorted lists) of BTF ID values. + +The ``BTF_ID_LIST`` and ``BTF_ID`` macros define unsorted list of BTF ID values, +with following syntax:: + + BTF_ID_LIST(list) + BTF_ID(type1, name1) + BTF_ID(type2, name2) + +resulting in following layout in .BTF_ids section:: + + __BTF_ID__type1__name1__1: + .zero 4 + __BTF_ID__type2__name2__2: + .zero 4 + +The ``u32 list[];`` variable is defined to access the list. + +The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we +want to define unused entry in BTF_ID_LIST, like:: + + BTF_ID_LIST(bpf_skb_output_btf_ids) + BTF_ID(struct, sk_buff) + BTF_ID_UNUSED + BTF_ID(struct, task_struct) + +The ``BTF_SET_START/END`` macros pair defines sorted list of BTF ID values +and their count, with following syntax:: + + BTF_SET_START(set) + BTF_ID(type1, name1) + BTF_ID(type2, name2) + BTF_SET_END(set) + +resulting in following layout in .BTF_ids section:: + + __BTF_ID__set__set: + .zero 4 + __BTF_ID__type1__name1__3: + .zero 4 + __BTF_ID__type2__name2__4: + .zero 4 + +The ``struct btf_id_set set;`` variable is defined to access the list. + +The ``typeX`` name can be one of following:: + + struct, union, typedef, func + +and is used as a filter when resolving the BTF ID value. + +All the BTF ID lists and sets are compiled in the .BTF_ids section and +resolved during the linking phase of kernel build by ``resolve_btfids`` tool. + 5. Using BTF -************ +============ 5.1 bpftool map pretty print -============================ +---------------------------- With BTF, the map key/value can be printed based on fields rather than simply raw bytes. This is especially valuable for large structure or if your data @@ -712,13 +886,12 @@ structure has bitfields. For example, for the following map,:: ___A b1:4; enum A b2:4; }; - struct bpf_map_def SEC("maps") tmpmap = { - .type = BPF_MAP_TYPE_ARRAY, - .key_size = sizeof(__u32), - .value_size = sizeof(struct tmp_t), - .max_entries = 1, - }; - BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t); + struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, int); + __type(value, struct tmp_t); + __uint(max_entries, 1); + } tmpmap SEC(".maps"); bpftool is able to pretty print like below: :: @@ -737,7 +910,7 @@ bpftool is able to pretty print like below: ] 5.2 bpftool prog dump -===================== +--------------------- The following is an example showing how func_info and line_info can help prog dump with better kernel symbol names, function prototypes and line @@ -771,7 +944,7 @@ information.:: [...] 5.3 Verifier Log -================ +---------------- The following is an example of how line_info can help debugging verification failure.:: @@ -797,7 +970,7 @@ failure.:: R2 offset is outside of the packet 6. BTF Generation -***************** +================= You need latest pahole @@ -834,7 +1007,7 @@ format.:: } g2; int main() { return 0; } int test() { return 0; } - -bash-4.4$ clang -c -g -O2 -target bpf t2.c + -bash-4.4$ clang -c -g -O2 --target=bpf t2.c -bash-4.4$ readelf -S t2.o ...... [ 8] .BTF PROGBITS 0000000000000000 00000247 @@ -844,7 +1017,7 @@ format.:: [10] .rel.BTF.ext REL 0000000000000000 000007e0 0000000000000040 0000000000000010 16 9 8 ...... - -bash-4.4$ clang -S -g -O2 -target bpf t2.c + -bash-4.4$ clang -S -g -O2 --target=bpf t2.c -bash-4.4$ cat t2.s ...... .section .BTF,"",@progbits @@ -904,6 +1077,11 @@ format.:: .long 8206 # Line 8 Col 14 7. Testing -********** +========== + +The kernel BPF selftest `tools/testing/selftests/bpf/prog_tests/btf.c`_ +provides an extensive set of BTF-related tests. -Kernel bpf selftest `test_btf.c` provides extensive set of BTF-related tests. +.. Links +.. _tools/testing/selftests/bpf/prog_tests/btf.c: + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/btf.c |