linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-12-19	xsk: Make xskmap flush_list common for all map instances	Björn Töpel	4	-35/+20
	The xskmap flush list is used to track entries that need to flushed from via the xdp_do_flush_map() function. This list used to be per-map, but there is really no reason for that. Instead make the flush list global for all xskmaps, which simplifies __xsk_map_flush() and xsk_map_alloc(). Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-5-bjorn.topel@gmail.com
2019-12-19	xdp: Fix graze->grace type-o in cpumap comments	Björn Töpel	1	-3/+3
	Simple spelling fix. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-4-bjorn.topel@gmail.com
2019-12-19	xdp: Simplify cpumap cleanup	Björn Töpel	1	-29/+5
	After the RCU flavor consolidation [1], call_rcu() and synchronize_rcu() waits for preempt-disable regions (NAPI) in addition to the read-side critical sections. As a result of this, the cleanup code in cpumap can be simplified * There is no longer a need to flush in __cpu_map_entry_free, since we know that this has been done when the call_rcu() callback is triggered. * When freeing the map, there is no need to explicitly wait for a flush. It's guaranteed to be done after the synchronize_rcu() call in cpu_map_free(). [1] https://lwn.net/Articles/777036/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-3-bjorn.topel@gmail.com
2019-12-19	xdp: Simplify devmap cleanup	Björn Töpel	1	-38/+5
	After the RCU flavor consolidation [1], call_rcu() and synchronize_rcu() waits for preempt-disable regions (NAPI) in addition to the read-side critical sections. As a result of this, the cleanup code in devmap can be simplified * There is no longer a need to flush in __dev_map_entry_free, since we know that this has been done when the call_rcu() callback is triggered. * When freeing the map, there is no need to explicitly wait for a flush. It's guaranteed to be done after the synchronize_rcu() call in dev_map_free(). The rcu_barrier() is still needed, so that the map is not freed prior the elements. [1] https://lwn.net/Articles/777036/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-2-bjorn.topel@gmail.com
2019-12-19	bpf: Remove unnecessary assertion on fp_old	Aditya Pakki	1	-2/+0
	The two callers of bpf_prog_realloc - bpf_patch_insn_single and bpf_migrate_filter dereference the struct fp_old, before passing it to the function. Thus assertion to check fp_old is unnecessary and can be removed. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219175735.19231-1-pakki001@umn.edu
2019-12-19	libbpf: Fix another __u64 printf warning	Andrii Nakryiko	1	-2/+2
	Fix yet another printf warning for %llu specifier on ppc64le. This time size_t casting won't work, so cast to verbose `unsigned long long`. Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219052103.3515-1-andriin@fb.com
2019-12-19	libbpf: Fix printing of ulimit value	Toke Høiland-Jørgensen	1	-1/+1
	Naresh pointed out that libbpf builds fail on 32-bit architectures because rlimit.rlim_cur is defined as 'unsigned long long' on those architectures. Fix this by using %zu in printf and casting to size_t. Fixes: dc3a2d254782 ("libbpf: Print hint about ulimit when getting permission denied error") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219090236.905059-1-toke@redhat.com
2019-12-19	selftests/bpf: Fix test_attach_probe	Alexei Starovoitov	1	-3/+4
	Fix two issues in test_attach_probe: 1. it was not able to parse /proc/self/maps beyond the first line, since %s means parse string until white space. 2. offset has to be accounted for otherwise uprobed address is incorrect. Fixes: 1e8611bbdfc9 ("selftests/bpf: add kprobe/uprobe selftests") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191219020442.1922617-1-ast@kernel.org
2019-12-19	libbpf: Add missing newline in opts validation macro	Toke Høiland-Jørgensen	1	-1/+1
	The error log output in the opts validation macro was missing a newline. Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219120714.928380-1-toke@redhat.com
2019-12-19	Merge branch 'bpf-riscv-jit-improvements'	Daniel Borkmann	5	-238/+312
	Björn Töpel says: ==================== This series contain one non-critical fix, support for far jumps, and some optimizations for the BPF JIT. Previously, the JIT only supported 12b branch targets for conditional branches, and 21b for unconditional branches. Starting with this series, 32b branching is supported. As part of supporting far jumps, branch relaxation was introduced. The idea is to start with a pessimistic jump (e.g. auipc/jalr) and for each pass the JIT will have an opportunity to pick a better instruction (e.g. jal) and shrink the image. Instead of two passes, the JIT requires more passes. It typically converges after 3 passes. The optimizations mentioned in the subject are for calls and tail calls. In the tail call generation we can save one instruction by using the offset in jalr. Calls are optimized by doing (auipc)/jal(r) relative jumps instead of loading the entire absolute address and doing jalr. This required that the JIT image allocator was made RISC-V specific, so we can ensure that the JIT image and the kernel text are in range (32b). The last two patches of the series is not critical to the series, but are two UAPI build issues for BPF events. A closer look from the RV-folks would be much appreciated. The test_bpf.ko module, selftests/bpf/test_verifier and selftests/seccomp/seccomp_bpf pass all tests. RISC-V is still missing proper kprobe and tracepoint support, so a lot of BPF selftests cannot be run. v1->v2: [1] * Removed unused function parameter from emit_branch() * Added patch to support far branch in tail call emit [1] https://lore.kernel.org/bpf/20191209173136.29615-1-bjorn.topel@gmail.com/ ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-12-19	riscv, perf: Add arch specific perf_arch_bpf_user_pt_regs	Björn Töpel	1	-0/+4
	RISC-V was missing a proper perf_arch_bpf_user_pt_regs macro for CONFIG_PERF_EVENT builds. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-10-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Add missing uapi header for BPF_PROG_TYPE_PERF_EVENT programs	Björn Töpel	2	-0/+11
	Add missing uapi header the BPF_PROG_TYPE_PERF_EVENT programs by exporting struct user_regs_struct instead of struct pt_regs which is in-kernel only. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-9-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Optimize calls	Björn Töpel	1	-37/+64
	Instead of using emit_imm() and emit_jalr() which can expand to six instructions, start using jal or auipc+jalr. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-8-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Provide RISC-V specific JIT image alloc/free	Björn Töpel	2	-0/+17
	This commit makes sure that the JIT images is kept close to the kernel text, so BPF calls can use relative calling with auipc/jalr or jal instead of loading the full 64-bit address and jalr. The BPF JIT image region is 128 MB before the kernel text. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-7-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Optimize BPF tail calls	Björn Töpel	1	-6/+7
	Remove one addi, and instead use the offset part of jalr. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-6-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Add support for far jumps and exits	Björn Töpel	1	-20/+17
	This commit add support for far (offset > 21b) jumps and exits. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Luke Nelson <lukenels@cs.washington.edu> Link: https://lore.kernel.org/bpf/20191216091343.23260-5-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Add support for far branching when emitting tail call	Björn Töpel	1	-19/+3
	Start use the emit_branch() function in the tail call emitter in order to support far branching. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-4-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Add support for far branching	Björn Töpel	1	-164/+188
	This commit adds branch relaxation to the BPF JIT, and with that support for far (offset greater than 12b) branching. The branch relaxation requires more than two passes to converge. For most programs it is three passes, but for larger programs it can be more. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Luke Nelson <lukenels@cs.washington.edu> Link: https://lore.kernel.org/bpf/20191216091343.23260-3-bjorn.topel@gmail.com
2019-12-19	riscv, bpf: Fix broken BPF tail calls	Björn Töpel	1	-2/+11
	The BPF JIT incorrectly clobbered the a0 register, and did not flag usage of s5 register when BPF stack was being used. Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G") Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216091343.23260-2-bjorn.topel@gmail.com
2019-12-18	Merge branch 'libbpf-extern-followups'	Alexei Starovoitov	8	-161/+194
	Andrii Nakryiko says: ==================== Based on latest feedback and discussions, this patch set implements the following changes: - Kconfig-provided externs have to be in .kconfig section, for which bpf_helpers.h provides convenient __kconfig macro (Daniel); - instead of allowing to override Kconfig file path, switch this to ability to extend and override system Kconfig with user-provided custom values (Alexei); - BTF is required when externs are used. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-18	libbpf: BTF is required when externs are present	Andrii Nakryiko	1	-1/+2
	BTF is required to get type information about extern variables. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191219002837.3074619-4-andriin@fb.com
2019-12-18	libbpf: Allow to augment system Kconfig through extra optional config	Andrii Nakryiko	3	-112/+132
	Instead of all or nothing approach of overriding Kconfig file location, allow to extend it with extra values and override chosen subset of values though optional user-provided extra config, passed as a string through open options' .kconfig option. If same config key is present in both user-supplied config and Kconfig, user-supplied one wins. This allows applications to more easily test various conditions despite host kernel's real configuration. If all of BPF object's __kconfig externs are satisfied from user-supplied config, system Kconfig won't be read at all. Simplify selftests by not needing to create temporary Kconfig files. Suggested-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191219002837.3074619-3-andriin@fb.com
2019-12-18	libbpf: Put Kconfig externs into .kconfig section	Andrii Nakryiko	6	-48/+60
	Move Kconfig-provided externs into custom .kconfig section. Add __kconfig into bpf_helpers.h for user convenience. Update selftests accordingly. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191219002837.3074619-2-andriin@fb.com
2019-12-18	libbpf: Add bpf_link__disconnect() API to preserve underlying BPF resource	Andrii Nakryiko	3	-10/+32
	There are cases in which BPF resource (program, map, etc) has to outlive userspace program that "installed" it in the system in the first place. When BPF program is attached, libbpf returns bpf_link object, which is supposed to be destroyed after no longer necessary through bpf_link__destroy() API. Currently, bpf_link destruction causes both automatic detachment and frees up any resources allocated to for bpf_link in-memory representation. This is inconvenient for the case described above because of coupling of detachment and resource freeing. This patch introduces bpf_link__disconnect() API call, which marks bpf_link as disconnected from its underlying BPF resouces. This means that when bpf_link is destroyed later, all its memory resources will be freed, but BPF resource itself won't be detached. This design allows to follow strict and resource-leak-free design by default, while giving easy and straightforward way for user code to opt for keeping BPF resource attached beyond lifetime of a bpf_link. For some BPF programs (i.e., FS-based tracepoints, kprobes, raw tracepoint, etc), user has to make sure to pin BPF program to prevent kernel to automatically detach it on process exit. This should typically be achived by pinning BPF program (or map in some cases) in BPF FS. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191218225039.2668205-1-andriin@fb.com
2019-12-18	bpf: Allow to change skb mark in test_run	Nikita V. Shirokov	3	-1/+15
	allow to pass skb's mark field into bpf_prog_test_run ctx for BPF_PROG_TYPE_SCHED_CLS prog type. that would allow to test bpf programs which are doing decision based on this field Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-18	bpftool: Work-around rst2man conversion bug	Andrii Nakryiko	1	-7/+8
	Work-around what appears to be a bug in rst2man convertion tool, used to create man pages out of reStructureText-formatted documents. If text line starts with dot, rst2man will put it in resulting man file verbatim. This seems to cause man tool to interpret it as a directive/command (e.g., `.bs`), and subsequently not render entire line because it's unrecognized one. Enclose '.xxx' words in extra formatting to work around. Fixes: cb21ac588546 ("bpftool: Add gen subcommand manpage") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com Link: https://lore.kernel.org/bpf/20191218221707.2552199-1-andriin@fb.com
2019-12-18	bpftool: Simplify format string to not use positional args	Andrii Nakryiko	1	-2/+2
	Change format string referring to just single argument out of two available. Some versions of libc can reject such format string. Reported-by: Nikita Shirokov <tehnerd@tehnerd.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191218214314.2403729-1-andriin@fb.com
2019-12-17	Merge branch 'skel-fixes'	Alexei Starovoitov	12	-155/+460
	Andrii Nakryiko says: ==================== Simplify skeleton usage by embedding source BPF object file inside skeleton itself. This allows to keep skeleton and object file in sync at all times with no chance of confusion. Also, add bpftool-gen.rst manpage, explaining concepts and ideas behind skeleton. In examples section it also includes a complete small BPF application utilizing skeleton, as a demonstration of API. Patch #2 also removes BPF_EMBED_OBJ, as there is currently no use of it. v2->v3: - (void) in no-args function (Alexei); - bpftool-gen.rst code block formatting fix (Alexei); - simplified xxx__create_skeleton to fill in obj and return error code; v1->v2: - remove whitespace from empty lines in code blocks (Yonghong). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-17	bpftool: Add gen subcommand manpage	Andrii Nakryiko	2	-1/+306
	Add bpftool-gen.rst describing skeleton on the high level. Also include a small, but complete, example BPF app (BPF side, userspace side, generated skeleton) in example section to demonstrate skeleton API and its usage. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191218052552.2915188-4-andriin@fb.com
2019-12-17	libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h	Andrii Nakryiko	1	-35/+0
	Drop BPF_EMBED_OBJ and struct bpf_embed_data now that skeleton automatically embeds contents of its source object file. While BPF_EMBED_OBJ is useful independently of skeleton, we are currently don't have any use cases utilizing it, so let's remove them until/if we need it. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191218052552.2915188-3-andriin@fb.com
2019-12-17	bpftool, selftests/bpf: Embed object file inside skeleton	Andrii Nakryiko	9	-119/+154
	Embed contents of BPF object file used for BPF skeleton generation inside skeleton itself. This allows to keep BPF object file and its skeleton in sync at all times, and simpifies skeleton instantiation. Also switch existing selftests to not require BPF_EMBED_OBJ anymore. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191218052552.2915188-2-andriin@fb.com
2019-12-17	libbpf: Reduce log level for custom section names	Andrii Nakryiko	1	-3/+3
	Libbpf is trying to recognize BPF program type based on its section name during bpf_object__open() phase. This is not strictly enforced and user code has ability to specify/override correct BPF program type after open. But if BPF program is using custom section name, libbpf will still emit warnings, which can be quite annoying to users. This patch reduces log level of information messages emitted by libbpf if section name is not canonical. User can still get a list of all supported section names as debug-level message. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191217234228.1739308-1-andriin@fb.com
2019-12-18	libbpf: Fix libbpf_common.h when installing libbpf through 'make install'	Toke Høiland-Jørgensen	2	-0/+3
	This fixes two issues with the newly introduced libbpf_common.h file: - The header failed to include <string.h> for the definition of memset() - The new file was not included in the install_headers rule in the Makefile Both of these issues cause breakage when installing libbpf with 'make install' and trying to use it in applications. Fixes: 544402d4b493 ("libbpf: Extract common user-facing helpers") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191217112810.768078-1-toke@redhat.com
2019-12-18	selftests/bpf: More succinct Makefile output	Andrii Nakryiko	1	-0/+36
	Similarly to bpftool/libbpf output, make selftests/bpf output succinct per-item output line. Output is roughly as follows: $ make ... CLANG-LLC [test_maps] pyperf600.o CLANG-LLC [test_maps] strobemeta.o CLANG-LLC [test_maps] pyperf100.o EXTRA-OBJ [test_progs] cgroup_helpers.o EXTRA-OBJ [test_progs] trace_helpers.o BINARY test_align BINARY test_verifier_log GEN-SKEL [test_progs] fexit_bpf2bpf.skel.h GEN-SKEL [test_progs] test_global_data.skel.h GEN-SKEL [test_progs] sendmsg6_prog.skel.h ... To see the actual command invocation, verbose mode can be turned on with V=1 argument: $ make V=1 ... very verbose output ... Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191217061425.2346359-1-andriin@fb.com
2019-12-16	libbpf: Add zlib as a dependency in pkg-config template	Andrii Nakryiko	1	-1/+1
	List zlib as another dependency of libbpf in pkg-config template. Verified it is correctly resolved to proper -lz flag: $ make DESTDIR=/tmp/libbpf-install install $ pkg-config --libs /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc -L/usr/local/lib64 -lbpf $ pkg-config --libs --static /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc -L/usr/local/lib64 -lbpf -lelf -lz Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Cc: Luca Boccassi <bluca@debian.org> Link: https://lore.kernel.org/bpf/20191216183830.3972964-1-andriin@fb.com
2019-12-16	libbpf: Print hint about ulimit when getting permission denied error	Toke Høiland-Jørgensen	1	-0/+29
	Probably the single most common error newcomers to XDP are stumped by is the 'permission denied' error they get when trying to load their program and 'ulimit -l' is set too low. For examples, see [0], [1]. Since the error code is UAPI, we can't change that. Instead, this patch adds a few heuristics in libbpf and outputs an additional hint if they are met: If an EPERM is returned on map create or program load, and geteuid() shows we are root, and the current RLIMIT_MEMLOCK is not infinity, we output a hint about raising 'ulimit -l' as an additional log line. [0] https://marc.info/?l=xdp-newbies&m=157043612505624&w=2 [1] https://github.com/xdp-project/xdp-tutorial/issues/86 Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191216181204.724953-1-toke@redhat.com
2019-12-16	samples/bpf: Attach XDP programs in driver mode by default	Toke Høiland-Jørgensen	11	-12/+58
	When attaching XDP programs, userspace can set flags to request the attach mode (generic/SKB mode, driver mode or hw offloaded mode). If no such flags are requested, the kernel will attempt to attach in driver mode, and then silently fall back to SKB mode if this fails. The silent fallback is a major source of user confusion, as users will try to load a program on a device without XDP support, and instead of an error they will get the silent fallback behaviour, not notice, and then wonder why performance is not what they were expecting. In an attempt to combat this, let's switch all the samples to default to explicitly requesting driver-mode attach. As part of this, ensure that all the userspace utilities have a switch to enable SKB mode. For those that have a switch to request driver mode, keep it but turn it into a no-op. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Link: https://lore.kernel.org/bpf/20191216110742.364456-1-toke@redhat.com
2019-12-16	samples/bpf: Set -fno-stack-protector when building BPF programs	Toke Høiland-Jørgensen	1	-0/+1
	It seems Clang can in some cases turn on stack protection by default, which doesn't work with BPF. This was reported once before[0], but it seems the flag to explicitly turn off the stack protector wasn't added to the Makefile, so do that now. The symptom of this is compile errors like the following: error: <unknown>:0:0: in function bpf_prog1 i32 (%struct.__sk_buff*): A call to built-in function '__stack_chk_fail' is not supported. [0] https://www.spinics.net/lists/netdev/msg556400.html Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191216103819.359535-1-toke@redhat.com
2019-12-16	samples/bpf: Add missing -lz to TPROGS_LDLIBS	Toke Høiland-Jørgensen	1	-1/+1
	Since libbpf now links against zlib, this needs to be included in the linker invocation for the userspace programs in samples/bpf that link statically against libbpf. Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20191216102405.353834-1-toke@redhat.com
2019-12-16	samples/bpf: Reintroduce missed build targets	Prashant Bhole	1	-0/+2
	Add xdp_redirect and per_socket_stats_example in build targets. They got removed from build targets in Makefile reorganization. Fixes: 1d97c6c2511f ("samples/bpf: Base target programs rules on Makefile.target") Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191216071619.25479-1-prashantbhole.linux@gmail.com
2019-12-16	bpftool: Fix compilation warning on shadowed variable	Paul Chaignon	1	-1/+1
	The ident variable has already been declared at the top of the function and doesn't need to be re-declared. Fixes: 985ead416df39 ("bpftool: Add skeleton codegen command") Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191216112733.GA28366@Omicron
2019-12-16	libbpf: Fix build by renaming variables	Prashant Bhole	1	-6/+6
	In btf__align_of() variable name 't' is shadowed by inner block declaration of another variable with same name. Patch renames variables in order to fix it. CC sharedobjs/btf.o btf.c: In function ‘btf__align_of’: btf.c:303:21: error: declaration of ‘t’ shadows a previous local [-Werror=shadow] 303 \| int i, align = 1, t; \| ^ btf.c:283:25: note: shadowed declaration is here 283 \| const struct btf_type *t = btf__type_by_id(btf, id); \| Fixes: 3d208f4ca111 ("libbpf: Expose btf__align_of() API") Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191216082738.28421-1-prashantbhole.linux@gmail.com
2019-12-15	Merge branch 'support-flex-arrays'	Alexei Starovoitov	7	-9/+85
	Andrii Nakryiko says: ==================== Add support for flexible array accesses in a relocatable manner in BPF CO-RE. It's a typical pattern in C, and kernel in particular, to provide a fixed-length struct with zero-sized or dimensionless array at the end. In such cases variable-sized array contents follows immediately after the end of a struct. This patch set adds support for such access pattern by allowing accesses to such arrays. Patch #1 adds libbpf support. Patch #2 adds few test cases for validation. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-15	selftests/bpf: Add flexible array relocation tests	Andrii Nakryiko	6	-4/+56
	Add few tests validation CO-RE relocation handling of flexible array accesses. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191215070844.1014385-3-andriin@fb.com
2019-12-15	libbpf: Support flexible arrays in CO-RE	Andrii Nakryiko	1	-5/+29
	Some data stuctures in kernel are defined with either zero-sized array or flexible (dimensionless) array at the end of a struct. Actual data of such array follows in memory immediately after the end of that struct, forming its variable-sized "body" of elements. Support such access pattern in CO-RE relocation handling. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com
2019-12-15	Merge branch 'extern-var-support'	Alexei Starovoitov	13	-81/+1035
	Andrii Nakryiko says: ==================== It's often important for BPF program to know kernel version or some specific config values (e.g., CONFIG_HZ to convert jiffies to seconds) and change or adjust program logic based on their values. As of today, any such need has to be resolved by recompiling BPF program for specific kernel and kernel configuration. In practice this is usually achieved by using BCC and its embedded LLVM/Clang. With such set up #ifdef CONFIG_XXX and similar compile-time constructs allow to deal with kernel varieties. With CO-RE (Compile Once – Run Everywhere) approach, this is not an option, unfortunately. All such logic variations have to be done as a normal C language constructs (i.e., if/else, variables, etc), not a preprocessor directives. This patch series add support for such advanced scenarios through C extern variables. These extern variables will be recognized by libbpf and supplied through extra .extern internal map, similarly to global data. This .extern map is read-only, which allows BPF verifier to track its content precisely as constants. That gives an opportunity to have pre-compiled BPF program, which can potentially use BPF functionality (e.g., BPF helpers) or kernel features (types, fields, etc), that are available only on a subset of targeted kernels, while effectively eleminating (through verifier's dead code detection) such unsupported functionality for other kernels (typically, older versions). Patch #3 explicitly tests a scenario of using unsupported BPF helper, to validate the approach. This patch set heavily relies on BTF type information emitted by compiler for each extern variable declaration. Based on specific types, libbpf does strict checks of config data values correctness. See patch #1 for details. Outline of the patch set: - patch #1 does a small clean up of internal map names contants; - patch #2 adds all of the libbpf internal machinery for externs support, including setting up BTF information for .extern data section; - patch #3 adds support for .extern into BPF skeleton; - patch #4 adds externs selftests, as well as enhances test_skeleton.c test to validate mmap()-ed .extern datasection functionality. v3->v4: - clean up copyrights and rebase onto latest skeleton patches (Alexei); v2->v3: - truncate too long strings (Alexei); - clean ups, adding comments (Alexei); v1->v2: - use BTF type information for externs (Alexei); - add strings support; - add BPF skeleton support for .extern. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-15	selftests/bpf: Add tests for libbpf-provided externs	Andrii Nakryiko	4	-1/+283
	Add a set of tests validating libbpf-provided extern variables. One crucial feature that's tested is dead code elimination together with using invalid BPF helper. CONFIG_MISSING is not supposed to exist and should always be specified by libbpf as zero, which allows BPF verifier to correctly do branch pruning and not fail validation, when invalid BPF helper is called from dead if branch. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-5-andriin@fb.com
2019-12-15	bpftool: Generate externs datasec in BPF skeleton	Andrii Nakryiko	2	-5/+9
	Add support for generation of mmap()-ed read-only view of libbpf-provided extern variables. As externs are not supposed to be provided by user code (that's what .data, .bss, and .rodata is for), don't mmap() it initially. Only after skeleton load is performed, map .extern contents as read-only memory. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com
2019-12-15	libbpf: Support libbpf-provided extern variables	Andrii Nakryiko	8	-66/+729
	Add support for extern variables, provided to BPF program by libbpf. Currently the following extern variables are supported: - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte long; - CONFIG_xxx values; a set of values of actual kernel config. Tristate, boolean, strings, and integer values are supported. Set of possible values is determined by declared type of extern variable. Supported types of variables are: - Tristate values. Are represented as `enum libbpf_tristate`. Accepted values are strictly 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO, or TRI_MODULE, respectively. - Boolean values. Are represented as bool (_Bool) types. Accepted values are 'y' and 'n' only, turning into true/false values, respectively. - Single-character values. Can be used both as a substritute for bool/tristate, or as a small-range integer: - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm'; - integers in a range [-128, 127] or [0, 255] (depending on signedness of char in target architecture) are recognized and represented with respective values of char type. - Strings. String values are declared as fixed-length char arrays. String of up to that length will be accepted and put in first N bytes of char array, with the rest of bytes zeroed out. If config string value is longer than space alloted, it will be truncated and warning message emitted. Char array is always zero terminated. String literals in config have to be enclosed in double quotes, just like C-style string literals. - Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and unsigned variants. Libbpf enforces parsed config value to be in the supported range of corresponding integer type. Integers values in config can be: - decimal integers, with optional + and - signs; - hexadecimal integers, prefixed with 0x or 0X; - octal integers, starting with 0. Config file itself is searched in /boot/config-$(uname -r) location with fallback to /proc/config.gz, unless config path is specified explicitly through bpf_object_open_opts' kernel_config_path option. Both gzipped and plain text formats are supported. Libbpf adds explicit dependency on zlib because of this, but this shouldn't be a problem, given libelf already depends on zlib. All detected extern variables, are put into a separate .extern internal map. It, similarly to .rodata map, is marked as read-only from BPF program side, as well as is frozen on load. This allows BPF verifier to track extern values as constants and perform enhanced branch prediction and dead code elimination. This can be relied upon for doing kernel version/feature detection and using potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF program, while still having a single version of BPF program running on old and new kernels. Selftests are validating this explicitly for unexisting BPF helper. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
2019-12-15	libbpf: Extract internal map names into constants	Andrii Nakryiko	1	-9/+14
	Instead of duplicating string literals, keep them in one place and consistent. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com