linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2018-11-21	bpf: adding tests for map_in_map helper in libbpf	Nikita V. Shirokov	3	-1/+141
	adding test/example of bpf_map__set_inner_map_fd usage Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: adding support for map in map in libbpf	Nikita V. Shirokov	2	-6/+36
	The idea is simple: for a specified map (pointed to by struct bpf_map), provide the descriptor of an already loaded map, which is used as the prototype for the inner map. Proposed workflow: 1) open the BPF object (bpf_object__open); 2) create the BPF map that will serve as the prototype; 3) find (by name) the map-in-map you want to load and update it with the descriptor of the inner map, using the new helper from this patch; 4) load the BPF program with bpf_object__load. Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: libbpf: don't specify prog name if kernel doesn't support it	Stanislav Fomichev	1	-1/+2
	Use recently added capability check. See commit 23499442c319 ("bpf: libbpf: retry map creation without the name") for rationale. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: libbpf: remove map name retry from bpf_create_map_xattr	Stanislav Fomichev	2	-11/+3
	Instead, check for a newly created caps.name bpf_object capability. If kernel doesn't support names, don't specify the attribute. See commit 23499442c319 ("bpf: libbpf: retry map creation without the name") for rationale. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf, libbpf: introduce bpf_object__probe_caps to test BPF capabilities	Stanislav Fomichev	1	-0/+58
	It currently only checks whether kernel supports map/prog names. This capability check will be used in the next two commits to skip setting prog/map names. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	libbpf: make sure bpf headers are c++ include-able	Stanislav Fomichev	5	-3/+56
	Wrap headers in extern "C", to turn off C++ mangling. This simplifies including libbpf in c++ and linking against it. v2 changes: * do the same for btf.h v3 changes: * test_libbpf.cpp to test for possible future c++ breakages Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: fix a libbpf loader issue	Yonghong Song	1	-1/+1
	Commit 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") added support to read .BTF.ext sections from an object file, create and pass prog_btf_fd and func_info to the kernel. The program btf_fd (prog->btf_fd) is initialized to -1 to please zclose, so that no special handling is needed during prog close. Passing -1 to the kernel, however, causes a loading error. Passing btf_fd 0 to the kernel when prog->btf_fd is invalid fixes the problem. Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Reported-by: Andrey Ignatov <rdna@fb.com> Reported-by: Emre Cantimur <haydum@fb.com> Tested-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: fix a compilation error when CONFIG_BPF_SYSCALL is not defined	Yonghong Song	1	-0/+14
	Kernel test robot (lkp@intel.com) reports a compilation error at https://www.spinics.net/lists/netdev/msg534913.html introduced by commit 838e96904ff3 ("bpf: Introduce bpf_func_info"). If CONFIG_BPF is defined and CONFIG_BPF_SYSCALL is not defined, the following error will appear: kernel/bpf/core.c:414: undefined reference to `btf_type_by_id' kernel/bpf/core.c:415: undefined reference to `btf_name_by_offset' When CONFIG_BPF_SYSCALL is not defined, let us define stub inline functions for btf_type_by_id() and btf_name_by_offset() in include/linux/btf.h. This way, the compilation failure can be avoided. Fixes: 838e96904ff3 ("bpf: Introduce bpf_func_info") Reported-by: kbuild test robot <lkp@intel.com> Cc: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
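The stub approach described in this commit can be sketched as follows. This is a minimal illustration, not the actual contents of include/linux/btf.h: DEMO_BPF_SYSCALL is a hypothetical stand-in for CONFIG_BPF_SYSCALL, the parameter types are simplified, and returning NULL from the stubs is an assumption.

```c
#include <stddef.h>

struct btf;      /* opaque in this sketch */
struct btf_type; /* opaque in this sketch */

/* DEMO_BPF_SYSCALL is a hypothetical stand-in for CONFIG_BPF_SYSCALL. */
#ifdef DEMO_BPF_SYSCALL
/* Real implementations live in another translation unit. */
const struct btf_type *btf_type_by_id(const struct btf *btf,
				      unsigned int type_id);
const char *btf_name_by_offset(const struct btf *btf, unsigned int offset);
#else
/* Stubs: callers still compile and link, they just find no type info.
 * Returning NULL is an assumption made for this sketch. */
static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
						    unsigned int type_id)
{
	(void)btf;
	(void)type_id;
	return NULL;
}

static inline const char *btf_name_by_offset(const struct btf *btf,
					     unsigned int offset)
{
	(void)btf;
	(void)offset;
	return NULL;
}
#endif
```

Because the stubs are static inline and live in the header, code such as kernel/bpf/core.c can keep calling both functions without an #ifdef at every call site.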
2018-11-20	Merge branch 'btf-func-info'	Alexei Starovoitov	27	-173/+2317
	Martin KaFai Lau says: ==================== The BTF support was added to kernel by Commit 69b693f0aefa ("bpf: btf: Introduce BPF Type Format (BTF)"), which introduced .BTF section into ELF file and is primarily used for map pretty print. pahole is used to convert dwarf to BTF for ELF files. This patch added func info support to the kernel so we can get better ksym's for bpf function calls. Basically, function call types are passed to kernel and the kernel extracts function names from these types in order to construct ksym for these functions. The llvm patch at https://reviews.llvm.org/D53736 will generate .BTF section and one more section .BTF.ext. The .BTF.ext section encodes function type information. The following is a sample output for selftests test_btf with file test_btf_haskv.o for translated insns and jited insns respectively. $ bpftool prog dump xlated id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): 0: (85) call pc+2#bpf_prog_2dcecc18072623fc_test_long_fname_1 1: (b7) r0 = 0 2: (95) exit int test_long_fname_1(struct dummy_tracepoint_args * arg): 3: (85) call pc+1#bpf_prog_89d64e4abf0f0126_test_long_fname_2 4: (95) exit int test_long_fname_2(struct dummy_tracepoint_args * arg): 5: (b7) r2 = 0 6: (63) *(u32 *)(r10 -4) = r2 7: (79) r1 = *(u64 *)(r1 +8) ... 22: (07) r1 += 1 23: (63) *(u32 *)(r0 +4) = r1 24: (95) exit $ bpftool prog dump jited id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): bpf_prog_b07ccb89267cf242__dummy_tracepoint: 0: push %rbp 1: mov %rsp,%rbp ...... 3c: add $0x28,%rbp 40: leaveq 41: retq int test_long_fname_1(struct dummy_tracepoint_args * arg): bpf_prog_2dcecc18072623fc_test_long_fname_1: 0: push %rbp 1: mov %rsp,%rbp ...... 3a: add $0x28,%rbp 3e: leaveq 3f: retq int test_long_fname_2(struct dummy_tracepoint_args * arg): bpf_prog_89d64e4abf0f0126_test_long_fname_2: 0: push %rbp 1: mov %rsp,%rbp ...... 80: add $0x28,%rbp 84: leaveq 85: retq Changelogs: v4 -> v5: . Add back BTF_KIND_FUNC_PROTO as v1 did. 
	The difference is BTF_KIND_FUNC_PROTO cannot have t->name_off now. All param metadata is defined in BTF_KIND_FUNC_PROTO. BTF_KIND_FUNC must have t->name_off != 0 and t->type refers to a BTF_KIND_FUNC_PROTO. The above is the conclusion after the discussion between Edward Cree, Alexei, Daniel, Yonghong and Martin. v3 -> v4: . Remove BTF_KIND_FUNC_PROTO. BTF_KIND_FUNC is used for both function pointer and subprogram. The name_off field is used to distinguish both. . The record size is added to the func_info subsection in .BTF.ext to enable future extension. . The bpf_prog_info interface is changed to make it similar to bpf_prog_load. . Related kernel and libbpf changes to accommodate the new .BTF.ext and kernel interface changes. v2 -> v3: . Removed kernel btf extern functions btf_type_id_func() and btf_get_name_by_id(). Instead, exposing existing functions btf_type_by_id() and btf_name_by_offset(). . Added comments about ELF section .BTF.ext layout. . Better code in bpftool as suggested by Edward Cree. v1 -> v2: . Added missing sign-off. . Limited the func_name/struct_member_name length for validity test. . Removed/changed several verifier messages. . Modified several commit messages to remove line_off reference. ==================== Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: bpftool: add support for func types	Yonghong Song	5	-0/+230
	This patch added support to print function signature if btf func_info is available. Note that ksym now uses function name instead of prog_name as prog_name has a limit of 16 bytes including ending '\0'. The following is a sample output for selftests test_btf with file test_btf_haskv.o for translated insns and jited insns respectively. $ bpftool prog dump xlated id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): 0: (85) call pc+2#bpf_prog_2dcecc18072623fc_test_long_fname_1 1: (b7) r0 = 0 2: (95) exit int test_long_fname_1(struct dummy_tracepoint_args * arg): 3: (85) call pc+1#bpf_prog_89d64e4abf0f0126_test_long_fname_2 4: (95) exit int test_long_fname_2(struct dummy_tracepoint_args * arg): 5: (b7) r2 = 0 6: (63) *(u32 *)(r10 -4) = r2 7: (79) r1 = *(u64 *)(r1 +8) ... 22: (07) r1 += 1 23: (63) *(u32 *)(r0 +4) = r1 24: (95) exit $ bpftool prog dump jited id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): bpf_prog_b07ccb89267cf242__dummy_tracepoint: 0: push %rbp 1: mov %rsp,%rbp ...... 3c: add $0x28,%rbp 40: leaveq 41: retq int test_long_fname_1(struct dummy_tracepoint_args * arg): bpf_prog_2dcecc18072623fc_test_long_fname_1: 0: push %rbp 1: mov %rsp,%rbp ...... 3a: add $0x28,%rbp 3e: leaveq 3f: retq int test_long_fname_2(struct dummy_tracepoint_args * arg): bpf_prog_89d64e4abf0f0126_test_long_fname_2: 0: push %rbp 1: mov %rsp,%rbp ...... 80: add $0x28,%rbp 84: leaveq 85: retq Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: enhance test_btf file testing to test func info	Yonghong Song	3	-13/+136
	Change the bpf programs test_btf_haskv.c and test_btf_nokv.c to have two sections, and enhance test_btf.c test_file feature to test btf func_info returned by the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: refactor to implement btf_get_from_id() in lib/bpf	Yonghong Song	3	-66/+72
	The function get_btf() is implemented in tools/bpf/bpftool/map.c to get a btf structure given a map_info. This patch refactored this function to be function btf_get_from_id() in tools/lib/bpf so that it can be used later. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: do not use pahole if clang/llvm can generate BTF sections	Yonghong Song	2	-0/+16
	Add additional checks in tools/testing/selftests/bpf and samples/bpf such that if clang/llvm compiler can generate BTF sections, do not use pahole. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: add support to read .BTF.ext sections	Yonghong Song	4	-15/+442
	The .BTF section is already available to encode types. These types can be used for map pretty print. The whole .BTF will be passed to the kernel as well for which kernel can verify and return to the user space for pretty print etc. The llvm patch at https://reviews.llvm.org/D53736 will generate .BTF section and one more section .BTF.ext. The .BTF.ext section encodes function type information and line information. Note that this patch set only supports function type info. The functionality is implemented in libbpf. The .BTF section can be directly loaded into the kernel, and the .BTF.ext section cannot. The loader may need to do some relocation and merging, similar to merging multiple code sections, before loading into the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: extends test_btf to test load/retrieve func_type info	Yonghong Song	1	-3/+329
	A two-function bpf program is loaded with btf and func_info. After successful prog load, the bpf_get_info syscall is called to retrieve prog info to ensure the types returned from the kernel match the types passed to the kernel from the user space. Several negative tests are also added to test loading/retrieving of func_type info. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: add new fields for program load in lib/bpf	Yonghong Song	2	-0/+8
	The new fields are added for program load in lib/bpf so that an application using the bpf_load_program_xattr() API is able to load a program with btf and func_info data. This functionality will be used in the next patch by the bpf selftest test_btf. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: sync kernel uapi bpf.h header to tools directory	Yonghong Song	1	-0/+13
	The kernel uapi bpf.h is synced to tools directory. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	bpf: Introduce bpf_func_info	Yonghong Song	8	-8/+209
	This patch added interface to load a program with the following additional information: . prog_btf_fd . func_info, func_info_rec_size and func_info_cnt where func_info will provide function range and type_id corresponding to each function. The func_info_rec_size is introduced in the UAPI to specify struct bpf_func_info size passed from user space. This intends to make bpf_func_info structure growable in the future. If the kernel gets a different bpf_func_info size from userspace, it will try to handle user request with part of bpf_func_info it can understand. In this patch, kernel can understand struct bpf_func_info { __u32 insn_offset; __u32 type_id; }; If user passed a bpf func_info record size of 16 bytes, the kernel can still handle part of records with the above definition. If verifier agrees with function range provided by the user, the bpf_prog ksym for each function will use the func name provided in the type_id, which is supposed to provide better encoding as it is not limited by 16 bytes program name limitation and this is better for bpf program which contains multiple subprograms. The bpf_prog_info interface is also extended to return btf_id, func_info, func_info_rec_size and func_info_cnt to userspace, so userspace can print out the function prototype for each xlated function. The insn_offset in the returned func_info corresponds to the insn offset for xlated functions. With other jit related fields in bpf_prog_info, userspace can also print out function prototypes for each jited function. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
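The record-size handling described above can be sketched like this. It is a simplified userspace illustration, not the kernel code: demo_read_func_info() and demo_second_type_id() are hypothetical names, and rejecting records smaller than the known struct is an assumption of this sketch. Only the struct layout comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout taken from the commit message (with stdint types). */
struct bpf_func_info {
	uint32_t insn_offset;
	uint32_t type_id;
};

/* Read record `idx` out of `cnt` records that are `rec_size` bytes apart.
 * Only the leading part the receiver understands is copied; any extra
 * trailing bytes of a larger record are ignored in this sketch. */
static int demo_read_func_info(const void *recs, uint32_t rec_size,
			       uint32_t cnt, uint32_t idx,
			       struct bpf_func_info *out)
{
	const unsigned char *base = recs;

	if (rec_size < sizeof(*out) || idx >= cnt)
		return -1;
	memcpy(out, base + (size_t)idx * rec_size, sizeof(*out));
	return 0;
}

/* Usage: two 16-byte records, as a hypothetical newer userspace might
 * pass them; the last 8 bytes of each record are unknown to the reader. */
static uint32_t demo_second_type_id(void)
{
	const uint32_t raw[8] = { 0, 7, 0xAAAA, 0xBBBB, 12, 9, 0xCCCC, 0xDDDD };
	struct bpf_func_info info;

	if (demo_read_func_info(raw, 16, 2, 1, &info) != 0)
		return 0;
	return info.type_id;
}
```

Stepping by the caller-supplied record size rather than by sizeof(struct bpf_func_info) is what lets the structure grow later without breaking older readers.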
2018-11-20	tools/bpf: Add tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC	Martin KaFai Lau	2	-2/+476
	This patch adds unit tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC to test_btf. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: Sync kernel btf.h header	Martin KaFai Lau	1	-3/+15
	The kernel uapi btf.h is synced to the tools directory. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO	Martin KaFai Lau	2	-53/+354
	This patch adds BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO to support the function debug info. BTF_KIND_FUNC_PROTO must not have a name (i.e. !t->name_off) and it is followed by >= 0 'struct btf_param' objects to describe the function arguments. The BTF_KIND_FUNC must have a valid name and it must refer back to a BTF_KIND_FUNC_PROTO. The above is the conclusion after the discussion between Edward Cree, Alexei, Daniel, Yonghong and Martin. By combining BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO, a complete function signature can be obtained. It will be used in the later patches to learn the function signature of a running bpf program. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
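The two naming rules above can be sketched as a small checker. This is not the kernel's BTF verifier: struct demo_type, enum demo_kind and demo_func_type_ok() are hypothetical simplifications, and parameter records as well as the special void type id are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for struct btf_type and BTF kinds. */
enum demo_kind { DEMO_KIND_INT, DEMO_KIND_FUNC, DEMO_KIND_FUNC_PROTO };

struct demo_type {
	uint32_t name_off;   /* 0 means "no name" */
	enum demo_kind kind;
	uint32_t type;       /* index of the referenced type */
};

/* Returns 1 if types[id] obeys the rules from the commit message:
 * FUNC_PROTO has no name; FUNC has a name and refers to a FUNC_PROTO. */
static int demo_func_type_ok(const struct demo_type *types, size_t n,
			     size_t id)
{
	const struct demo_type *t;

	if (id >= n)
		return 0;
	t = &types[id];
	switch (t->kind) {
	case DEMO_KIND_FUNC_PROTO:
		return t->name_off == 0;
	case DEMO_KIND_FUNC:
		return t->name_off != 0 && t->type < n &&
		       types[t->type].kind == DEMO_KIND_FUNC_PROTO;
	default:
		return 1; /* other kinds are out of scope here */
	}
}

/* Sample table: entries 0-2 are valid, entries 3-5 each break one rule. */
static const struct demo_type demo_types[] = {
	{ 1, DEMO_KIND_INT, 0 },        /* 0: int */
	{ 0, DEMO_KIND_FUNC_PROTO, 0 }, /* 1: anonymous prototype */
	{ 5, DEMO_KIND_FUNC, 1 },       /* 2: named func -> prototype */
	{ 0, DEMO_KIND_FUNC, 1 },       /* 3: func without a name */
	{ 9, DEMO_KIND_FUNC, 0 },       /* 4: func -> int, not a prototype */
	{ 3, DEMO_KIND_FUNC_PROTO, 0 }, /* 5: prototype with a name */
};
```

Splitting the name (FUNC) from the parameter list (FUNC_PROTO) lets an anonymous prototype also describe function pointers, which have a signature but no name.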
2018-11-20	bpf: btf: Break up btf_type_is_void()	Martin KaFai Lau	1	-15/+22
	This patch breaks up btf_type_is_void() into btf_type_is_void() and btf_type_is_fwd(). It also adds btf_type_nosize() to better describe it is testing a type has nosize info. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	Merge branch 'bpf-zero-hash-seed'	Daniel Borkmann	4	-17/+82
	Lorenz Bauer says: ==================== Allow forcing the seed of a hash table to zero, for deterministic execution during benchmarking and testing. Changes from v2: * Change ordering of BPF_F_ZERO_SEED in linux/bpf.h Comments addressed from v1: * Add comment to discourage production use to linux/bpf.h * Require CAP_SYS_ADMIN ==================== Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	tools: add selftest for BPF_F_ZERO_SEED	Lorenz Bauer	1	-9/+55
	Check that iterating two separate hash maps produces the same order of keys if BPF_F_ZERO_SEED is used. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	tools: sync linux/bpf.h	Lorenz Bauer	1	-3/+10
	Synchronize changes to linux/bpf.h from * "bpf: allow zero-initializing hash map seed" * "bpf: move BPF_F_QUERY_EFFECTIVE after map flags" Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: move BPF_F_QUERY_EFFECTIVE after map flags	Lorenz Bauer	1	-3/+3
	BPF_F_QUERY_EFFECTIVE is in the middle of the flags valid for BPF_MAP_CREATE. Move it to its own section to reduce confusion. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: allow zero-initializing hash map seed	Lorenz Bauer	2	-2/+14
	Add a new flag BPF_F_ZERO_SEED, which forces a hash map to initialize the seed to zero. This is useful when doing performance analysis both on individual BPF programs, as well as the kernel's hash table implementation. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: libbpf: retry map creation without the name	Stanislav Fomichev	1	-1/+10
	Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name for maps. Pre v4.14 kernels don't know about map names and return an error about unexpected non-zero data. Retry sys_bpf without a map name to cover older kernels. v2 changes: * check for errno == EINVAL as suggested by Daniel Borkmann Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
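The retry logic can be sketched with a mock in place of sys_bpf. Everything named demo_* below is hypothetical, including the attribute struct and the fake "old kernel"; only the idea of retrying without the name when errno is EINVAL comes from the commit message.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical, trimmed-down map attributes for this sketch. */
struct demo_map_attr {
	char name[16];
	unsigned int key_size, value_size, max_entries;
};

typedef int (*demo_create_fn)(const struct demo_map_attr *attr);

/* Try with the name first; on EINVAL clear the name and try once more. */
static int demo_create_map(demo_create_fn create, struct demo_map_attr *attr)
{
	int fd = create(attr);

	if (fd < 0 && errno == EINVAL && attr->name[0] != '\0') {
		memset(attr->name, 0, sizeof(attr->name));
		fd = create(attr);
	}
	return fd;
}

/* Mock of a kernel that rejects non-zero name bytes. */
static int demo_old_kernel_create(const struct demo_map_attr *attr)
{
	if (attr->name[0] != '\0') {
		errno = EINVAL;
		return -1;
	}
	return 3; /* pretend file descriptor */
}

/* Usage: the first attempt fails, the retry without a name succeeds. */
static int demo_retry_on_old_kernel(void)
{
	struct demo_map_attr attr = {
		.name = "my_map", .key_size = 4, .value_size = 8,
		.max_entries = 1,
	};

	return demo_create_map(demo_old_kernel_create, &attr);
}
```

A retry costs an extra failed syscall per map on older kernels, which is one reason a one-time capability probe is the alternative design.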
2018-11-16	bpf: fix null pointer dereference on pointer offload	Colin Ian King	1	-2/+3
	Pointer offload is being null checked however the following statement dereferences the potentially null pointer offload when assigning offload->dev_state. Fix this by only assigning it if offload is not null. Detected by CoverityScan, CID#1475437 ("Dereference after null check") Fixes: 00db12c3d141 ("bpf: call verifier_prep from its callback in struct bpf_offload_dev") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
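The shape of the fix can be sketched as follows. The struct and function names are hypothetical, and the -ENODEV default is an assumption of this sketch; the point is only that the assignment moves inside the null check.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the offload state touched by the fix. */
struct demo_offload {
	int dev_state;
};

/* `prep_result` stands in for the return value of the driver's prepare
 * callback. dev_state is written only when `offload` is non-NULL. */
static int demo_verifier_prep(struct demo_offload *offload, int prep_result)
{
	int ret = -ENODEV; /* assumed default when there is no offload */

	if (offload) {
		ret = prep_result;
		offload->dev_state = !ret;
	}
	return ret;
}
```

Before the fix, the equivalent of the dev_state assignment sat after the if block, so a NULL offload was dereferenced even though it had just been tested.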
2018-11-16	bpftool: make libbfd optional	Stanislav Fomichev	5	-6/+35
	Make it possible to build bpftool without libbfd. libbfd and libopcodes are typically provided in dev/dbg packages (binutils-dev in debian) which we usually don't have installed on the fleet machines and we'd like a way to have bpftool version that works without installing any additional packages. This excludes support for disassembling jit-ted code and prints an error if the user tries to use these features. Tested by: cat > FEATURES_DUMP.bpftool <<EOF feature-libbfd=0 feature-disassembler-four-args=1 feature-reallocarray=0 feature-libelf=1 feature-libelf-mmap=1 feature-bpf=1 EOF FEATURES_DUMP=$PWD/FEATURES_DUMP.bpftool make ldd bpftool | grep libbfd Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16	Merge branch 'socket-lookup-cg_sock'	Alexei Starovoitov	3	-24/+125
	Andrey Ignatov says: ==================== This patch set makes bpf_sk_lookup_tcp, bpf_sk_lookup_udp and bpf_sk_release helpers available in programs of type BPF_PROG_TYPE_CGROUP_SOCK_ADDR. Patch 1 is a fix for bpf_sk_lookup_udp that was already merged to bpf (stable) tree. Here it's prerequisite for patch 3. Patch 2 is the main patch in the set, it makes the helpers available for BPF_PROG_TYPE_CGROUP_SOCK_ADDR and provides more details about use-case. Patch 3 adds selftest for new functionality. v1->v2: - remove "Split bpf_sk_lookup" patch since it was already split by: commit c8123ead13a5 ("bpf: Extend the sk_lookup() helper to XDP hookpoint."); - avoid unnecessary bpf_sock_addr_sk_lookup function. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16	selftest/bpf: Use bpf_sk_lookup_{tcp, udp} in test_sock_addr	Andrey Ignatov	2	-21/+78
	Use bpf_sk_lookup_tcp, bpf_sk_lookup_udp and bpf_sk_release helpers from test_sock_addr programs to make sure they're available and can lookup and release socket properly for IPv4/IPv6, TCP/UDP. Reading from a few fields of returned struct bpf_sock is also tested. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16	bpf: Support socket lookup in CGROUP_SOCK_ADDR progs	Andrey Ignatov	1	-0/+45
	Make bpf_sk_lookup_tcp, bpf_sk_lookup_udp and bpf_sk_release helpers available in programs of type BPF_PROG_TYPE_CGROUP_SOCK_ADDR. Such programs operate on sockets and have access to socket and struct sockaddr passed by user to system calls such as sys_bind, sys_connect, sys_sendmsg. It's useful to be able to lookup other sockets from these programs. E.g. sys_connect may lookup IP:port endpoint and if there is a server socket bound to that endpoint ("server" can be defined by saddr & sport being zero), redirect client connection to it by rewriting IP:port in sockaddr passed to sys_connect. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16	bpf: Fix IPv6 dport byte order in bpf_sk_lookup_udp	Andrey Ignatov	1	-3/+2
	Lookup functions in sk_lookup have different expectations about byte order of provided arguments. Specifically __inet_lookup, __udp4_lib_lookup and __udp6_lib_lookup expect dport to be in network byte order and do ntohs(dport) internally. At the same time __inet6_lookup expects dport to be in host byte order and correspondingly name the argument hnum. sk_lookup works correctly with __inet_lookup, __udp4_lib_lookup and __inet6_lookup with regard to dport. But in __udp6_lib_lookup case it uses host instead of expected network byte order. It makes result returned by bpf_sk_lookup_udp for IPv6 incorrect. The patch fixes byte order of dport passed to __udp6_lib_lookup. Originally sk_lookup properly handled UDPv6, but not TCPv6. 5ef0ae84f02a fixes TCPv6 but breaks UDPv6. Fixes: 5ef0ae84f02a ("bpf: Fix IPv6 dport byte-order in bpf_sk_lookup") Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Joe Stringer <joe@wand.net.nz> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
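The byte-order mismatch can be sketched with two mock lookups. The demo_* functions are hypothetical stand-ins that only model the argument conventions described above (one converts internally, the other takes a host-order hnum); they are not the kernel lookup functions.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Models __udp6_lib_lookup-style callees: dport arrives in network byte
 * order and is converted internally. Returns the port matched on. */
static uint16_t demo_udp6_lookup(uint16_t dport_net)
{
	return ntohs(dport_net);
}

/* Models __inet6_lookup-style callees: the argument (hnum) is already
 * in host byte order. */
static uint16_t demo_tcp6_lookup(uint16_t hnum)
{
	return hnum;
}

/* Caller holding a network-order port: convert only for the callee
 * that expects host order. */
static uint16_t demo_sk_lookup_port(int is_tcp, uint16_t dport_net)
{
	if (is_tcp)
		return demo_tcp6_lookup(ntohs(dport_net));
	return demo_udp6_lookup(dport_net);
}
```

Passing an already converted port to the UDP-style callee would convert it twice, which on little-endian hosts matches a different port; that is the class of bug this commit fixes.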
2018-11-16	bpf: Remove unused variable in nsim_bpf	Nathan Chancellor	1	-1/+0
	Clang warns: drivers/net/netdevsim/bpf.c:557:30: error: unused variable 'state' [-Werror,-Wunused-variable] struct nsim_bpf_bound_prog *state; ^ 1 error generated. The declaration should have been removed in commit b07ade27e933 ("bpf: pass translate() as a callback and remove its ndo_bpf subcommand"). Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16	bpf: libbpf: Fix bpf_program__next() API	Martin KaFai Lau	1	-14/+11
	This patch restores the behavior in commit eac7d84519a3 ("tools: libbpf: don't return '.text' as a program for multi-function programs") such that bpf_program__next() does not return pseudo programs in ".text". Fixes: 0c19a9fbc9cd ("libbpf: cleanup after partial failure in bpf_object__pin") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16	selftests/bpf: Fix uninitialized duration warning	Joe Stringer	1	-1/+1
	Daniel Borkmann reports: test_progs.c: In function ‘main’: test_progs.c:81:3: warning: ‘duration’ may be used uninitialized in this function [-Wmaybe-uninitialized] printf("%s:PASS:%s %d nsec\n", __func__, tag, duration);\ ^~~~~~ test_progs.c:1706:8: note: ‘duration’ was declared here __u32 duration; ^~~~~~~~ Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	Merge branch 'narrow-loads'	Alexei Starovoitov	4	-34/+79
	Andrey Ignatov says: ==================== This patch set adds support for narrow loads with offset > 0 to BPF verifier. Patch 1 provides more details and is the main patch in the set. Patches 2 and 3 add new test cases to test_verifier and test_sock_addr selftests. v1->v2: - fix -Wdeclaration-after-statement warning. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	selftests/bpf: Test narrow loads with off > 0 for bpf_sock_addr	Andrey Ignatov	1	-4/+24
	Add more test cases for context bpf_sock_addr to test narrow loads with offset > 0 for ctx->user_ip4 field (__u32): * off=1, size=1; * off=2, size=1; * off=3, size=1; * off=2, size=2. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	selftests/bpf: Test narrow loads with off > 0 in test_verifier	Andrey Ignatov	1	-10/+38
	Test the following narrow loads in test_verifier for context __sk_buff: * off=1, size=1 - ok; * off=2, size=1 - ok; * off=3, size=1 - ok; * off=0, size=2 - ok; * off=1, size=2 - fail; * off=2, size=2 - ok; * off=3, size=2 - fail. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	bpf: Allow narrow loads with offset > 0	Andrey Ignatov	2	-20/+17
	Currently BPF verifier allows narrow loads for a context field only with offset zero. E.g. if there is a __u32 field then only the following loads are permitted: * off=0, size=1 (narrow); * off=0, size=2 (narrow); * off=0, size=4 (full). On the other hand LLVM can generate a load with offset different than zero that makes sense from program logic point of view, but verifier doesn't accept it. E.g. tools/testing/selftests/bpf/sendmsg4_prog.c has code: #define DST_IP4 0xC0A801FEU /* 192.168.1.254 */ ... if ((ctx->user_ip4 >> 24) == (bpf_htonl(DST_IP4) >> 24) && where ctx is struct bpf_sock_addr. Some versions of LLVM can produce the following byte code for it: 8: 71 12 07 00 00 00 00 00 r2 = *(u8 *)(r1 + 7) 9: 67 02 00 00 18 00 00 00 r2 <<= 24 10: 18 03 00 00 00 00 00 fe 00 00 00 00 00 00 00 00 r3 = 4261412864 ll 12: 5d 32 07 00 00 00 00 00 if r2 != r3 goto +7 <LBB0_6> where `*(u8 *)(r1 + 7)` means narrow load for ctx->user_ip4 with size=1 and offset=3 (7 - sizeof(ctx->user_family) = 3). This load is currently rejected by verifier. Verifier code that rejects such loads is in bpf_ctx_narrow_access_ok(), which means any is_valid_access implementation that uses the function works this way, e.g. bpf_skb_is_valid_access() for __sk_buff or sock_addr_is_valid_access() for bpf_sock_addr. The patch makes such loads supported. Offset can be in [0; size_default) but has to be multiple of load size. E.g. for __u32 field the following loads are supported now: * off=0, size=1 (narrow); * off=1, size=1 (narrow); * off=2, size=1 (narrow); * off=3, size=1 (narrow); * off=0, size=2 (narrow); * off=2, size=2 (narrow); * off=0, size=4 (full). Reported-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
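The acceptance rule stated above (offset within the field and a multiple of the load size) can be sketched as a standalone predicate. demo_narrow_load_ok() is a hypothetical helper, not the verifier's bpf_ctx_narrow_access_ok(); `off` is relative to the start of the field, and the power-of-two size check is an assumption added for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* off:          byte offset of the load within the context field
 * size:         width of the load in bytes
 * size_default: full width of the field (4 for a __u32 field) */
static bool demo_narrow_load_ok(uint32_t off, uint32_t size,
				uint32_t size_default)
{
	if (size == 0 || size > size_default)
		return false;
	if ((size & (size - 1)) != 0) /* assumption: power-of-two loads only */
		return false;
	if (off >= size_default)      /* offset must be in [0; size_default) */
		return false;
	return off % size == 0;       /* offset must be a multiple of size */
}
```

For a __u32 field this accepts exactly the seven loads listed in the commit message and rejects the misaligned two-byte loads at offsets 1 and 3.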
2018-11-10	Merge branch 'bpftool-flow-dissector'	Alexei Starovoitov	9	-122/+537
	Stanislav Fomichev says: ==================== v5 changes: * FILE -> PATH for load/loadall (can be either file or directory now) * simpler implementation for __bpf_program__pin_name * removed p_err for REQ_ARGS checks * parse_atach_detach_args -> parse_attach_detach_args * for -> while in bpf_object__pin_{programs,maps} recovery v4 changes: * addressed another round of comments/style issues from Jakub Kicinski & Quentin Monnet (thanks!) * implemented bpf_object__pin_maps and bpf_object__pin_programs helpers and used them in bpf_program__pin * added new pin_name to bpf_program so bpf_program__pin works with sections that contain '/' * moved loadall command implementation into a separate patch * added patch that implements pinmaps to pin maps when doing load/loadall v3 changes: * (maybe) better cleanup for partial failure in bpf_object__pin * added special case in bpf_program__pin for programs with single instances v2 changes: * addressed comments/style issues from Jakub Kicinski & Quentin Monnet * removed logic that populates jump table * added cleanup for partial failure in bpf_object__pin This patch series adds support for loading and attaching flow dissector programs from the bpftool: * first patch fixes flow dissector section name in the selftests (so libbpf auto-detection works) * second patch adds proper cleanup to bpf_object__pin, parts of which are now being used to attach all flow dissector progs/maps * third patch adds special case in bpf_program__pin for programs with single instances (we don't create <prog>/0 pin anymore, just <prog>) * fourth patch adds pin_name to the bpf_program struct which is now used as a pin name in bpf_program__pin et al * fifth patch adds loadall command that pins all programs, not just the first one * sixth patch adds pinmaps argument to load/loadall to let users pin all maps of the obj file * seventh patch adds actual flow_dissector support to the bpftool and an example ==================== Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	bpftool: support loading flow dissector	Stanislav Fomichev	3	-51/+74
	This commit adds support for loading/attaching/detaching flow dissector program. When `bpftool loadall` is called with a flow_dissector prog (i.e. when the 'type flow_dissector' argument is passed), we load and pin all programs. User is responsible to construct the jump table for the tail calls. The last argument of `bpftool attach` is made optional for this use case. Example: bpftool prog load tools/testing/selftests/bpf/bpf_flow.o \ /sys/fs/bpf/flow type flow_dissector \ pinmaps /sys/fs/bpf/flow bpftool map update pinned /sys/fs/bpf/flow/jmp_table \ key 0 0 0 0 \ value pinned /sys/fs/bpf/flow/IP bpftool map update pinned /sys/fs/bpf/flow/jmp_table \ key 1 0 0 0 \ value pinned /sys/fs/bpf/flow/IPV6 bpftool map update pinned /sys/fs/bpf/flow/jmp_table \ key 2 0 0 0 \ value pinned /sys/fs/bpf/flow/IPV6OP bpftool map update pinned /sys/fs/bpf/flow/jmp_table \ key 3 0 0 0 \ value pinned /sys/fs/bpf/flow/IPV6FR bpftool map update pinned /sys/fs/bpf/flow/jmp_table \ key 4 0 0 0 \ value pinned /sys/fs/bpf/flow/MPLS bpftool map update pinned /sys/fs/bpf/flow/jmp_table \ key 5 0 0 0 \ value pinned /sys/fs/bpf/flow/VLAN bpftool prog attach pinned /sys/fs/bpf/flow/flow_dissector flow_dissector Tested by using the above lines to load the prog in the test_flow_dissector.sh selftest. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	bpftool: add pinmaps argument to the load/loadall	Stanislav Fomichev	3	-3/+28
	This new argument lets users pin all maps from the object at the specified path. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	bpftool: add loadall command	Stanislav Fomichev	5	-43/+81
	This patch adds a new loadall command which differs slightly from the existing load. The load command loads all programs from the obj file, but pins only the first program. loadall pins all programs from the obj file under the specified directory. The intended use case is flow_dissector, where we want to load a bunch of progs, pin them all and after that construct a jump table. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	libbpf: add internal pin_name	Stanislav Fomichev	1	-3/+26
	pin_name is the same as section_name where '/' is replaced by '_'. bpf_object__pin_programs is converted to use pin_name to avoid the situation where section_name would require creating another subdirectory for a pin (as, for example, when calling bpf_object__pin_programs for programs in sections like "cgroup/connect6"). Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
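	The '/' to '_' mapping described above can be sketched as follows. This is an illustration only: make_pin_name is a hypothetical name, and the actual libbpf implementation may differ in allocation and error handling.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a heap-allocated copy of section_name in
 * which every '/' is replaced by '_', or NULL if allocation fails.
 * The caller frees the result.
 */
static char *make_pin_name(const char *section_name)
{
	size_t len = strlen(section_name);
	char *name = malloc(len + 1);
	char *p;

	if (!name)
		return NULL;

	/* Copy including the terminating NUL, then rewrite separators. */
	memcpy(name, section_name, len + 1);
	for (p = name; *p; p++)
		if (*p == '/')
			*p = '_';

	return name;
}
```

	With this sketch, the section name "cgroup/connect6" yields the pin name "cgroup_connect6", so pinning it needs no extra subdirectory.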
2018-11-10	libbpf: bpf_program__pin: add special case for instances.nr == 1	Stanislav Fomichev	1	-0/+10
	When bpf_program has only one instance, don't create a subdirectory with per-instance pin files (<prog>/0). Instead, just create a single pin file for that single instance. This simplifies object pinning by not creating unnecessary subdirectories. This can potentially break existing users that depend on the case where '/0' is always created. However, I couldn't find any serious usage of bpf_program__pin inside the kernel tree and I suppose there should be none outside. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
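	A rough sketch of the path selection described above, assuming the per-instance pin is named after the instance index. instance_pin_path and its parameters are hypothetical and only illustrate the rule; they are not the libbpf code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the pin path for instance `idx` of a program
 * pinned at `base`. With a single instance the path is `base` itself;
 * otherwise it is `base/<idx>`. Returns the path length, or -1 if the
 * buffer is too small or formatting fails.
 */
static int instance_pin_path(char *buf, size_t buf_len, const char *base,
			     int nr_instances, int idx)
{
	int len;

	if (nr_instances == 1)
		len = snprintf(buf, buf_len, "%s", base);
	else
		len = snprintf(buf, buf_len, "%s/%d", base, idx);

	if (len < 0 || (size_t)len >= buf_len)
		return -1;
	return len;
}
```

	Callers that previously relied on `<prog>/0` always existing would see only `<prog>` in the single-instance case, which is the compatibility risk the message mentions.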
2018-11-10	libbpf: cleanup after partial failure in bpf_object__pin	Stanislav Fomichev	2	-23/+319
	bpftool will use bpf_object__pin in the next commits to pin all programs and maps from the file; in case of a partial failure, we need to get back to the clean state (undo previous program/map pins). As part of a cleanup, I've added and exported separate routines to pin all maps (bpf_object__pin_maps) and progs (bpf_object__pin_programs) of an object. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
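	The "undo previous pins" behaviour follows a common all-or-nothing pattern, sketched below. The callback types and pin_all_or_nothing are hypothetical stand-ins for the real pin and unpin operations, not the libbpf routines; the demo callbacks exist only to make the sketch self-contained.

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for the real pin/unpin operations. */
typedef int (*pin_fn)(void *item);
typedef void (*unpin_fn)(void *item);

/*
 * Pin items[0..n-1] in order. If pinning item i fails, unpin items
 * i-1 down to 0 so that nothing stays pinned, then return the error.
 */
static int pin_all_or_nothing(void **items, size_t n, pin_fn pin,
			      unpin_fn unpin)
{
	size_t i;
	int err;

	for (i = 0; i < n; i++) {
		err = pin(items[i]);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	/* i is the index of the item that failed; undo everything before it. */
	while (i > 0)
		unpin(items[--i]);
	return err;
}

/* Demo callbacks: each item is an int flag; a negative value simulates a failure. */
static int demo_pin(void *item)
{
	int *flag = item;

	if (*flag < 0)
		return -1;
	*flag = 1;
	return 0;
}

static void demo_unpin(void *item)
{
	*(int *)item = 0;
}
```

	If the third of three items fails to pin, the first two are unpinned again and the caller sees the original error, leaving no partial state behind.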
2018-11-10	selftests/bpf: rename flow dissector section to flow_dissector	Stanislav Fomichev	2	-2/+2
	Makes it compatible with the logic that derives program type from section name in libbpf_prog_type_by_name. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
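	To illustrate why the section name matters, here is a simplified sketch of deriving a program type from a section name by prefix matching. The enum, table and function names are hypothetical, and the real libbpf_prog_type_by_name table is larger and may match differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical program types for this sketch only. */
enum demo_prog_type {
	DEMO_UNSPEC,
	DEMO_SOCKET_FILTER,
	DEMO_XDP,
	DEMO_FLOW_DISSECTOR,
};

/* Hypothetical table mapping section name prefixes to program types. */
static const struct {
	const char *prefix;
	enum demo_prog_type type;
} demo_sections[] = {
	{ "socket", DEMO_SOCKET_FILTER },
	{ "xdp", DEMO_XDP },
	{ "flow_dissector", DEMO_FLOW_DISSECTOR },
};

/* Return the type of the first table entry whose prefix matches. */
static enum demo_prog_type demo_type_by_name(const char *section_name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_sections) / sizeof(demo_sections[0]); i++) {
		size_t len = strlen(demo_sections[i].prefix);

		if (strncmp(section_name, demo_sections[i].prefix, len) == 0)
			return demo_sections[i].type;
	}
	return DEMO_UNSPEC;
}
```

	A section named "flow_dissector" resolves to the flow dissector type in this sketch, while a name missing from the table falls through to DEMO_UNSPEC, so auto-detection would fail for it.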
2018-11-10	Merge branch 'device-ops-as-cb'	Alexei Starovoitov	10	-118/+85
	Quentin Monnet says: ==================== For passing device functions for offloaded eBPF programs, there used to be no place to store the pointer without making the non-offloaded programs pay a memory price. As a consequence, three functions were called with ndo_bpf() through specific commands. Now that we have struct bpf_offload_dev, and since none of those operations rely on RTNL, we can turn these three commands into hooks inside the struct bpf_prog_offload_ops, and pass them as part of bpf_offload_dev_create(). This patch set changes the offload architecture to do so, and brings the relevant changes to the nfp and netdevsim drivers. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
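	The general shape of "hooks in an ops struct, passed at device creation" can be sketched in plain C as below. Every name here (demo_offload_ops, demo_dev_init, the hook names) is hypothetical and chosen for illustration; the kernel structures and signatures are different.

```c
#include <stddef.h>

/* Hypothetical program state for this sketch. */
struct demo_prog {
	int prepared;
};

/* Hypothetical ops table: three driver hooks instead of three commands. */
struct demo_offload_ops {
	int (*prepare)(struct demo_prog *prog);
	int (*translate)(struct demo_prog *prog);
	void (*destroy)(struct demo_prog *prog);
};

/* The device object stores the ops pointer, so programs need no extra field. */
struct demo_offload_dev {
	const struct demo_offload_ops *ops;
};

/* Hooks are supplied once, when the device object is set up. */
static void demo_dev_init(struct demo_offload_dev *dev,
			  const struct demo_offload_ops *ops)
{
	dev->ops = ops;
}

/* Core code dispatches through the ops table rather than a command switch. */
static int demo_prepare(struct demo_offload_dev *dev, struct demo_prog *prog)
{
	if (!dev->ops || !dev->ops->prepare)
		return -1;
	return dev->ops->prepare(prog);
}

/* Example driver-side hook and ops table. */
static int drv_prepare(struct demo_prog *prog)
{
	prog->prepared = 1;
	return 0;
}

static const struct demo_offload_ops drv_ops = {
	.prepare = drv_prepare,
};
```

	In this arrangement a driver fills in one ops table and hands it over once, and only offload devices carry the pointer.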