linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-11-30	tools/bpf: make libbpf _GNU_SOURCE friendly	Yonghong Song	2	-0/+3
	During porting libbpf to bcc, I got some warnings like below: ... [ 2%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf.c.o /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf.c:12:0: warning: "_GNU_SOURCE" redefined [enabled by default] #define _GNU_SOURCE ... [ 3%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf_errno.c.o /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c: In function ‘libbpf_strerror’: /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c:45:7: warning: assignment makes integer from pointer without a cast [enabled by default] ret = strerror_r(err, buf, size); ... bcc is built with _GNU_SOURCE defined and this caused the above warning. This patch intends to make libpf _GNU_SOURCE friendly by . define _GNU_SOURCE in libbpf.c unless it is not defined . undefine _GNU_SOURCE as non-gnu version of strerror_r is expected. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-28	bpf: Fix various lib and testsuite build failures on 32-bit.	David Miller	1	-1/+1
	Cannot cast a u64 to a pointer on 32-bit without an intervening (long) cast otherwise GCC warns. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26	libbpf: Document API and ABI conventions	Andrey Ignatov	1	-0/+139
	Document API and ABI for libbpf: naming convention, symbol visibility, ABI versioning. This is just a starting point. Documentation can be significantly extended in the future to cover more topics. ABI versioning section touches only a few basic points with a link to more comprehensive documentation from Ulrich Drepper. This section can be extended in the future when there is better understanding what works well and what not so well in libbpf development process and production usage. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26	libbpf: Verify versioned symbols	Andrey Ignatov	1	-1/+18
	Since ABI versioning info is kept separately from the code it's easy to forget to update it while adding a new API. Add simple verification that all global symbols exported with LIBBPF_API are versioned in libbpf.map version script. The idea is to check that number of global symbols in libbpf-in.o, that is the input to the linker, matches with number of unique versioned symbols in libbpf.so, that is the output of the linker. If these numbers don't match, it may mean some symbol was not versioned and make will fail. "Unique" means that if a symbol is present in more than one version of ABI due to ABI changes, it'll be counted once. Another option to calculate number of global symbols in the "input" could be to count number of LIBBPF_ABI entries in C headers but it seems to be fragile. Example of output when a symbol is missing in version script: ... LD libbpf-in.o LINK libbpf.a LINK libbpf.so Warning: Num of global symbols in libbpf-in.o (115) does NOT match with num of versioned symbols in libbpf.so (114). Please make sure all LIBBPF_API symbols are versioned in libbpf.map. make: *** [check_abi] Error 1 Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26	libbpf: Add version script for DSO	Andrey Ignatov	2	-1/+124
	More and more projects use libbpf and one day it'll likely be packaged and distributed as DSO and that requires ABI versioning so that both compatible and incompatible changes to ABI can be introduced in a safe way in the future without breaking executables dynamically linked with a previous version of the library. Usual way to do ABI versioning is version script for the linker. Add such a script for libbpf. All global symbols currently exported via LIBBPF_API macro are added to the version script libbpf.map. The version name LIBBPF_0.0.1 is constructed from the name of the library + version specified by $(LIBBPF_VERSION) in Makefile. Version script does not duplicate the work done by LIBBPF_API macro, it rather complements it. The macro is used at compile time and can be used by compiler to do optimization that can't be done at link time, it is purely about global symbol visibility. The version script, in turn, is used at link time and takes care of ABI versioning. Both techniques are described in details in [1]. Whenever ABI is changed in the future, version script should be changed appropriately. [1] https://www.akkadia.org/drepper/dsohowto.pdf Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26	libbpf: Name changing for btf_get_from_id	Martin KaFai Lau	2	-2/+2
	s/btf_get_from_id/btf__get_from_id/ to restore the API naming convention. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-23	libbpf: make bpf_object__open default to UNSPEC	Nikita V. Shirokov	1	-4/+4
	currently by default libbpf's bpf_object__open requires bpf's program to specify version in a code because of two things: 1) default prog type is set to KPROBE 2) KPROBE requires (in kernel/bpf/syscall.c) version to be specified in this patch i'm changing default prog type to UNSPEC and also changing requirments for version's section to be present in object file. now it would reflect what we have today in kernel (only KPROBE prog type requires for version to be explicitly set). v1 -> v2: - RFC tag has been dropped Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: adding support for map in map in libbpf	Nikita V. Shirokov	2	-6/+36
	idea is pretty simple. for specified map (pointed by struct bpf_map) we would provide descriptor of already loaded map, which is going to be used as a prototype for inner map. proposed workflow: 1) open bpf's object (bpf_object__open) 2) create bpf's map which is going to be used as a prototype 3) find (by name) map-in-map which you want to load and update w/ descriptor of inner map w/ a new helper from this patch 4) load bpf program w/ bpf_object__load Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: libbpf: don't specify prog name if kernel doesn't support it	Stanislav Fomichev	1	-1/+2
	Use recently added capability check. See commit 23499442c319 ("bpf: libbpf: retry map creation without the name") for rationale. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: libbpf: remove map name retry from bpf_create_map_xattr	Stanislav Fomichev	2	-11/+3
	Instead, check for a newly created caps.name bpf_object capability. If kernel doesn't support names, don't specify the attribute. See commit 23499442c319 ("bpf: libbpf: retry map creation without the name") for rationale. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf, libbpf: introduce bpf_object__probe_caps to test BPF capabilities	Stanislav Fomichev	1	-0/+58
	It currently only checks whether kernel supports map/prog names. This capability check will be used in the next two commits to skip setting prog/map names. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	libbpf: make sure bpf headers are c++ include-able	Stanislav Fomichev	5	-3/+56
	Wrap headers in extern "C", to turn off C++ mangling. This simplifies including libbpf in c++ and linking against it. v2 changes: * do the same for btf.h v3 changes: * test_libbpf.cpp to test for possible future c++ breakages Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21	bpf: fix a libbpf loader issue	Yonghong Song	1	-1/+1
	Commit 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") added support to read .BTF.ext sections from an object file, create and pass prog_btf_fd and func_info to the kernel. The program btf_fd (prog->btf_fd) is initialized to be -1 to please zclose so we do not need special handling dur prog close. Passing -1 to the kernel, however, will cause loading error. Passing btf_fd 0 to the kernel if prog->btf_fd is invalid fixed the problem. Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Reported-by: Andrey Ignatov <rdna@fb.com> Reported-by: Emre Cantimur <haydum@fb.com> Tested-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	tools/bpf: refactor to implement btf_get_from_id() in lib/bpf	Yonghong Song	2	-0/+70
	The function get_btf() is implemented in tools/bpf/bpftool/map.c to get a btf structure given a map_info. This patch refactored this function to be function btf_get_from_id() in tools/lib/bpf so that it can be used later. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: add support to read .BTF.ext sections	Yonghong Song	4	-15/+442
	The .BTF section is already available to encode types. These types can be used for map pretty print. The whole .BTF will be passed to the kernel as well for which kernel can verify and return to the user space for pretty print etc. The llvm patch at https://reviews.llvm.org/D53736 will generate .BTF section and one more section .BTF.ext. The .BTF.ext section encodes function type information and line information. Note that this patch set only supports function type info. The functionality is implemented in libbpf. The .BTF section can be directly loaded into the kernel, and the .BTF.ext section cannot. The loader may need to do some relocation and merging, similar to merging multiple code sections, before loading into the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: add new fields for program load in lib/bpf	Yonghong Song	2	-0/+8
	The new fields are added for program load in lib/bpf so application uses api bpf_load_program_xattr() is able to load program with btf and func_info data. This functionality will be used in next patch by bpf selftest test_btf. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	tools/bpf: Add tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC	Martin KaFai Lau	1	-0/+4
	This patch adds unit tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC to test_btf. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20	bpf: libbpf: retry map creation without the name	Stanislav Fomichev	1	-1/+10
	Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name for maps. Pre v4.14 kernels don't know about map names and return an error about unexpected non-zero data. Retry sys_bpf without a map name to cover older kernels. v2 changes: * check for errno == EINVAL as suggested by Daniel Borkmann Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-16	bpf: libbpf: Fix bpf_program__next() API	Martin KaFai Lau	1	-14/+11
	This patch restores the behavior in commit eac7d84519a3 ("tools: libbpf: don't return '.text' as a program for multi-function programs") such that bpf_program__next() does not return pseudo programs in ".text". Fixes: 0c19a9fbc9cd ("libbpf: cleanup after partial failure in bpf_object__pin") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	libbpf: add internal pin_name	Stanislav Fomichev	1	-3/+26
	pin_name is the same as section_name where '/' is replaced by '_'. bpf_object__pin_programs is converted to use pin_name to avoid the situation where section_name would require creating another subdirectory for a pin (as, for example, when calling bpf_object__pin_programs for programs in sections like "cgroup/connect6"). Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	libbpf: bpf_program__pin: add special case for instances.nr == 1	Stanislav Fomichev	1	-0/+10
	When bpf_program has only one instance, don't create a subdirectory with per-instance pin files (<prog>/0). Instead, just create a single pin file for that single instance. This simplifies object pinning by not creating unnecessary subdirectories. This can potentially break existing users that depend on the case where '/0' is always created. However, I couldn't find any serious usage of bpf_program__pin inside the kernel tree and I suppose there should be none outside. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10	libbpf: cleanup after partial failure in bpf_object__pin	Stanislav Fomichev	2	-23/+319
	bpftool will use bpf_object__pin in the next commits to pin all programs and maps from the file; in case of a partial failure, we need to get back to the clean state (undo previous program/map pins). As part of a cleanup, I've added and exported separate routines to pin all maps (bpf_object__pin_maps) and progs (bpf_object__pin_programs) of an object. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-31	libbpf: Fix compile error in libbpf_attach_type_by_name	Andrey Ignatov	1	-6/+7
	Arnaldo Carvalho de Melo reported build error in libbpf when clang version 3.8.1-24 (tags/RELEASE_381/final) is used: libbpf.c:2201:36: error: comparison of constant -22 with expression of type 'const enum bpf_attach_type' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (section_names[i].attach_type == -EINVAL) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~ 1 error generated. Fix the error by keeping "is_attachable" property of a program in a separate struct field instead of trying to use attach_type itself. Fixes: 956b620fcf0b ("libbpf: Introduce libbpf_attach_type_by_name") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Andrey Ignatov <rdna@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-20	bpf, libbpf: simplify and cleanup perf ring buffer walk	Daniel Borkmann	2	-45/+37
	Simplify bpf_perf_event_read_simple() a bit and fix up some minor things along the way: the return code in the header is not of type int but enum bpf_perf_event_ret instead. Once callback indicated to break the loop walking event data, it also needs to be consumed in data_tail since it has been processed already. Moreover, bpf_perf_event_print_t callback should avoid void * as we actually get a pointer to struct perf_event_header and thus applications can make use of container_of() to have type checks. The walk also doesn't have to use modulo op since the ring size is required to be power of two. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-19	bpf, libbpf: use correct barriers in perf ring buffer walk	Daniel Borkmann	1	-6/+4
	Given libbpf is a generic library and not restricted to x86-64 only, the compiler barrier in bpf_perf_event_read_simple() after fetching the head needs to be replaced with smp_rmb() at minimum. Also, writing out the tail we should use WRITE_ONCE() to avoid store tearing. Now that we have the logic in place in ring_buffer_read_head() and ring_buffer_write_tail() helper also used by perf tool which would select the correct and best variant for a given architecture (e.g. x86-64 can avoid CPU barriers entirely), make use of these in order to fix bpf_perf_event_read_simple(). Fixes: d0cabbb021be ("tools: bpf: move the event reading loop to libbpf") Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-19	selftests/bpf: add test cases for queue and stack maps	Mauricio Vasquez B	2	-0/+14
	test_maps: Tests that queue/stack maps are behaving correctly even in corner cases test_progs: Tests new ebpf helpers Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-16	libbpf: Per-symbol visibility for DSO	Andrey Ignatov	4	-148/+179
	Make global symbols in libbpf DSO hidden by default with -fvisibility=hidden and export symbols that are part of ABI explicitly with __attribute__((visibility("default"))). This is common practice that should prevent from accidentally exporting a symbol, that is not supposed to be a part of ABI what, in turn, improves both libbpf developer- and user-experiences. See [1] for more details. Export control becomes more important since more and more projects use libbpf. The patch doesn't export a bunch of netlink related functions since as agreed in [2] they'll be reworked. That doesn't break bpftool since bpftool links libbpf statically. [1] https://www.akkadia.org/drepper/dsohowto.pdf (2.2 Export Control) [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg251434.html Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-15	bpf: bpftool, add flag to allow non-compat map definitions	John Fastabend	3	-9/+23
	Multiple map definition structures exist and user may have non-zero fields in their definition that are not recognized by bpftool and libbpf. The normal behavior is to then fail loading the map. Although this is a good default behavior users may still want to load the map for debugging or other reasons. This patch adds a --mapcompat flag that can be used to override the default behavior and allow loading the map even when it has additional non-zero fields. For now the only user is 'bpftool prog' we can switch over other subcommands as needed. The library exposes an API that consumes a flags field now but I kept the original API around also in case users of the API don't want to expose this. The flags field is an int in case we need more control over how the API call handles errors/features/etc in the future. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-09	tools/bpf: use proper type and uapi perf_event.h header for libbpf	Yonghong Song	2	-5/+5
	Use __u32 instead u32 in libbpf.c and also use uapi perf_event.h instead of tools/perf/perf-sys.h. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-08	libbpf: relicense libbpf as LGPL-2.1 OR BSD-2-Clause	Alexei Starovoitov	13	-62/+13
	libbpf is maturing as a library and gaining features that no other bpf libraries support (BPF Type Format, bpf to bpf calls, etc) Many Apache2 licensed projects (like bcc, bpftrace, gobpf, cilium, etc) would like to use libbpf, but cannot do this yet, since Apache Foundation explicitly states that LGPL is incompatible with Apache2. Hence let's relicense libbpf as dual license LGPL-2.1 or BSD-2-Clause, since BSD-2 is compatible with Apache2. Dual LGPL or Apache2 is invalid combination. Fix license mistake in Makefile as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrey Ignatov <rdna@fb.com> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Beckett <david.beckett@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Wang Nan <wangnan0@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04	libbpf: Use __u32 instead of u32 in bpf_program__load	Andrey Ignatov	2	-2/+2
	Make bpf_program__load consistent with other interfaces: use __u32 instead of u32. That in turn fixes build of samples: In file included from ./samples/bpf/trace_output_user.c:21:0: ./tools/lib/bpf/libbpf.h:132:9: error: unknown type name ‘u32’ u32 kern_version); ^ Fixes: commit 29cd77f41620d ("libbpf: Support loading individual progs") Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04	libbpf: Make include guards consistent	Andrey Ignatov	5	-15/+15
	Rename include guards to have consistent names "__LIBBPF_<header_name>". Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04	libbpf: Consistent prefixes for interfaces in str_error.h.	Andrey Ignatov	3	-11/+13
	libbpf is used more and more outside kernel tree. That means the library should follow good practices in library design and implementation to play well with third party code that uses it. One of such practices is to have a common prefix (or a few) for every interface, function or data structure, library provides. I helps to avoid name conflicts with other libraries and keeps API consistent. Inconsistent names in libbpf already cause problems in real life. E.g. an application can't use both libbpf and libnl due to conflicting symbols. Having common prefix will help to fix current and avoid future problems. libbpf already uses the following prefixes for its interfaces: * bpf_ for bpf system call wrappers, program/map/elf-object abstractions and a few other things; * btf_ for BTF related API; * libbpf_ for everything else. The patch renames function in str_error.h to have libbpf_ prefix since it misses one and doesn't fit well into the first two categories. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04	libbpf: Consistent prefixes for interfaces in nlattr.h.	Andrey Ignatov	3	-64/+69
	libbpf is used more and more outside kernel tree. That means the library should follow good practices in library design and implementation to play well with third party code that uses it. One of such practices is to have a common prefix (or a few) for every interface, function or data structure, library provides. I helps to avoid name conflicts with other libraries and keeps API consistent. Inconsistent names in libbpf already cause problems in real life. E.g. an application can't use both libbpf and libnl due to conflicting symbols. Having common prefix will help to fix current and avoid future problems. libbpf already uses the following prefixes for its interfaces: * bpf_ for bpf system call wrappers, program/map/elf-object abstractions and a few other things; * btf_ for BTF related API; * libbpf_ for everything else. The patch adds libbpf_ prefix to interfaces in nlattr.h that use none of mentioned above prefixes and doesn't fit well into the first two categories. Since affected part of API is used in bpftool, the patch applies corresponding change to bpftool as well. Having it in a separate patch will cause a state of tree where bpftool is broken what may not be a good idea. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04	libbpf: Consistent prefixes for interfaces in libbpf.h.	Andrey Ignatov	2	-27/+30
	libbpf is used more and more outside kernel tree. That means the library should follow good practices in library design and implementation to play well with third party code that uses it. One of such practices is to have a common prefix (or a few) for every interface, function or data structure, library provides. I helps to avoid name conflicts with other libraries and keeps API consistent. Inconsistent names in libbpf already cause problems in real life. E.g. an application can't use both libbpf and libnl due to conflicting symbols. Having common prefix will help to fix current and avoid future problems. libbpf already uses the following prefixes for its interfaces: * bpf_ for bpf system call wrappers, program/map/elf-object abstractions and a few other things; * btf_ for BTF related API; * libbpf_ for everything else. The patch adds libbpf_ prefix to functions and typedef in libbpf.h that use none of mentioned above prefixes and doesn't fit well into the first two categories. Since affected part of API is used in bpftool, the patch applies corresponding change to bpftool as well. Having it in a separate patch will cause a state of tree where bpftool is broken what may not be a good idea. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04	libbpf: Move __dump_nlmsg_t from API to implementation	Andrey Ignatov	2	-3/+3
	This typedef is used only by implementation in netlink.c. Nothing uses it in public API. Move it to netlink.c. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	libbpf: Support loading individual progs	Joe Stringer	2	-2/+5
	Allow the individual program load to be invoked. This will help with testing, where a single ELF may contain several sections, some of which denote subprograms that are expected to fail verification, along with some which are expected to pass verification. By allowing programs to be iterated and individually loaded, each program can be independently checked against its expected verification result. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	libbpf: Support sk_skb/stream_{parser, verdict} section names	Andrey Ignatov	1	-0/+4
	Add section names for BPF_SK_SKB_STREAM_PARSER and BPF_SK_SKB_STREAM_VERDICT attach types to be able to identify them in libbpf_attach_type_by_name. "stream_parser" and "stream_verdict" are used instead of simple "parser" and "verdict" just to avoid possible confusion in a place where attach type is used alone (e.g. in bpftool's show sub-commands) since there is another attach point that can be named as "verdict": BPF_SK_MSG_VERDICT. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	libbpf: Support cgroup_skb/{e,in}gress section names	Andrey Ignatov	1	-0/+4
	Add section names for BPF_CGROUP_INET_INGRESS and BPF_CGROUP_INET_EGRESS attach types to be able to identify them in libbpf_attach_type_by_name. "cgroup_skb" is used instead of "cgroup/skb" mostly to easy possible unifying of how libbpf and bpftool works with section names: * bpftool uses "cgroup_skb" to in "prog list" sub-command; * bpftool uses "ingress" and "egress" in "cgroup list" sub-command; * having two parts instead of three in a string like "cgroup_skb/ingress" can be leveraged to split it to prog_type part and attach_type part, or vise versa: use two parts to make a section name. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	libbpf: Introduce libbpf_attach_type_by_name	Andrey Ignatov	2	-39/+84
	There is a common use-case when ELF object contains multiple BPF programs and every program has its own section name. If it's cgroup-bpf then programs have to be 1) loaded and 2) attached to a cgroup. It's convenient to have information necessary to load BPF program together with program itself. This is where section name works fine in conjunction with libbpf_prog_type_by_name that identifies prog_type and expected_attach_type and these can be used with BPF_PROG_LOAD. But there is currently no way to identify attach_type by section name and it leads to messy code in user space that reinvents guessing logic every time it has to identify attach type to use with BPF_PROG_ATTACH. The patch introduces libbpf_attach_type_by_name that guesses attach type by section name if a program can be attached. The difference between expected_attach_type provided by libbpf_prog_type_by_name and attach_type provided by libbpf_attach_type_by_name is the former is used at BPF_PROG_LOAD time and can be zero if a program of prog_type X has only one corresponding attach type Y whether the latter provides specific attach type to use with BPF_PROG_ATTACH. No new section names were added to section_names array. Only existing ones were reorganized and attach_type was added where appropriate. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller	8	-143/+411
	Daniel Borkmann says: ==================== pull-request: bpf-next 2018-09-25 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Allow for RX stack hardening by implementing the kernel's flow dissector in BPF. Idea was originally presented at netconf 2017 [0]. Quote from merge commit: [...] Because of the rigorous checks of the BPF verifier, this provides significant security guarantees. In particular, the BPF flow dissector cannot get inside of an infinite loop, as with CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot read outside of packet bounds, because all memory accesses are checked. Also, with BPF the administrator can decide which protocols to support, reducing potential attack surface. Rarely encountered protocols can be excluded from dissection and the program can be updated without kernel recompile or reboot if a bug is discovered. [...] Also, a sample flow dissector has been implemented in BPF as part of this work, from Petar and Willem. [0] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf 2) Add support for bpftool to list currently active attachment points of BPF networking programs providing a quick overview similar to bpftool's perf subcommand, from Yonghong. 3) Fix a verifier pruning instability bug where a union member from the register state was not cleared properly leading to branches not being pruned despite them being valid candidates, from Alexei. 4) Various smaller fast-path optimizations in XDP's map redirect code, from Jesper. 5) Enable to recognize BPF_MAP_TYPE_REUSEPORT_SOCKARRAY maps in bpftool, from Roman. 6) Remove a duplicate check in libbpf that probes for function storage, from Taeung. 7) Fix an issue in test_progs by avoid checking for errno since on success its value should not be checked, from Mauricio. 8) Fix unused variable warning in bpf_getsockopt() helper when CONFIG_INET is not configured, from Anders. 9) Fix a compilation failure in the BPF sample code's use of bpf_flow_keys, from Prashant. 10) Minor cleanups in BPF code, from Yue and Zhong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18	tools lib bpf: Provide wrapper for strerror_r to build in !_GNU_SOURCE systems	Arnaldo Carvalho de Melo	4	-11/+35
	Same problem that got fixed in a similar fashion in tools/perf/ in c8b5f2c96d1b ("tools: Introduce str_error_r()"), fix it in the same way, licensing needs to be sorted out to libbpf to use libapi, so, for this simple case, just get the same wrapper in tools/lib/bpf. This makes libbpf and its users (bpftool, selftests, perf) to build again in Alpine Linux 3.[45678] and edge. Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yonghong Song <yhs@fb.com> Fixes: 1ce6a9fc1549 ("bpf: fix build error in libbpf with EXTRA_CFLAGS="-Wp, -D_FORTIFY_SOURCE=2 -O2"") Link: https://lkml.kernel.org/r/20180917151636.GA21790@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-14	bpf: support flow dissector in libbpf and bpftool	Petar Penkov	1	-0/+2
	This patch extends libbpf and bpftool to work with programs of type BPF_PROG_TYPE_FLOW_DISSECTOR. Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-11	tools/bpf: fix a netlink recv issue	Yonghong Song	1	-1/+8
	Commit f7010770fbac ("tools/bpf: move bpf/lib netlink related functions into a new file") introduced a while loop for the netlink recv path. This while loop is needed since the buffer in recv syscall may not be enough to hold all the information and in such cases multiple recv calls are needed. There is a bug introduced by the above commit as the while loop may block on recv syscall if there is no more messages are expected. The netlink message header flag NLM_F_MULTI is used to indicate that more messages are expected and this patch fixed the bug by doing further recv syscall only if multipart message is expected. The patch added another fix regarding to message length of 0. When netlink recv returns message length of 0, there will be no more messages for returning data so the while loop can end. Fixes: f7010770fbac ("tools/bpf: move bpf/lib netlink related functions into a new file") Reported-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-06	tools/bpf: add more netlink functionalities in lib/bpf	Yonghong Song	5	-15/+238
	This patch added a few netlink attribute parsing functions and the netlink API functions to query networking links, tc classes, tc qdiscs and tc filters. For example, the following API is to get networking links: int nl_get_link(int sock, unsigned int nl_pid, dump_nlmsg_t dump_link_nlmsg, void cookie); Note that when the API is called, the user also provided a callback function with the following signature: int (dump_nlmsg_t)(void cookie, void msg, struct nlattr **tb); The "cookie" is the parameter the user passed to the API and will be available for the callback function. The "msg" is the information about the result, e.g., ifinfomsg or tcmsg. The "tb" is the parsed netlink attributes. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-06	tools/bpf: move bpf/lib netlink related functions into a new file	Yonghong Song	3	-130/+166
	There are no functionality change for this patch. In the subsequent patches, more netlink related library functions will be added and a separate file is better than cluttering bpf.c. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-05	libbpf: Remove the duplicate checking of function storage	Taeung Song	1	-1/+1
	After the commit eac7d84519a3 ("tools: libbpf: don't return '.text' as a program for multi-function programs"), bpf_program__next() in bpf_object__for_each_program skips the function storage such as .text, so eliminate the duplicate checking. Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller	3	-0/+3
	Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-13 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add driver XDP support for veth. This can be used in conjunction with redirect of another XDP program e.g. sitting on NIC so the xdp_frame can be forwarded to the peer veth directly without modification, from Toshiaki. 2) Add a new BPF map type REUSEPORT_SOCKARRAY and prog type SK_REUSEPORT in order to provide more control and visibility on where a SO_REUSEPORT sk should be located, and the latter enables to directly select a sk from the bpf map. This also enables map-in-map for application migration use cases, from Martin. 3) Add a new BPF helper bpf_skb_ancestor_cgroup_id() that returns the id of cgroup v2 that is the ancestor of the cgroup associated with the skb at the ancestor_level, from Andrey. 4) Implement BPF fs map pretty-print support based on BTF data for regular hash table and LRU map, from Yonghong. 5) Decouple the ability to attach BTF for a map from the key and value pretty-printer in BPF fs, and enable further support of BTF for maps for percpu and LPM trie, from Daniel. 6) Implement a better BPF sample of using XDP's CPU redirect feature for load balancing SKB processing to remote CPU. The sample implements the same XDP load balancing as Suricata does which is symmetric hash based on IP and L4 protocol, from Jesper. 7) Revert adding NULL pointer check with WARN_ON_ONCE() in __xdp_return()'s critical path as it is ensured that the allocator is present, from Björn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-11	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller	2	-2/+2

2018-08-11	bpf: Test BPF_PROG_TYPE_SK_REUSEPORT	Martin KaFai Lau	2	-0/+2
	This patch add tests for the new BPF_PROG_TYPE_SK_REUSEPORT. The tests cover: - IPv4/IPv6 + TCP/UDP - TCP syncookie - TCP fastopen - Cases when the bpf_sk_select_reuseport() returning errors - Cases when the bpf prog returns SK_DROP - Values from sk_reuseport_md - outer_map => reuseport_array The test depends on commit 3eee1f75f2b9 ("bpf: fix bpf_skb_load_bytes_relative pkt length check") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>