linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-09-05	xsk: use state member for socket synchronization	Björn Töpel	1	-15/+39
	Prior the state variable was introduced by Ilya, the dev member was used to determine whether the socket was bound or not. However, when dev was read, proper SMP barriers and READ_ONCE were missing. In order to address the missing barriers and READ_ONCE, we start using the state variable as a point of synchronization. The state member read/write is paired with proper SMP barriers, and from this follows that the members described above does not need READ_ONCE if used in conjunction with state check. In all syscalls and the xsk_rcv path we check if state is XSK_BOUND. If that is the case we do a SMP read barrier, and this implies that the dev, umem and all rings are correctly setup. Note that no READ_ONCE are needed for these variable if used when state is XSK_BOUND (plus the read barrier). To summarize: The members struct xdp_sock members dev, queue_id, umem, fq, cq, tx, rx, and state were read lock-less, with incorrect barriers and missing {READ, WRITE}_ONCE. Now, umem, fq, cq, tx, rx, and state are read lock-less. When these members are updated, WRITE_ONCE is used. When read, READ_ONCE are only used when read outside the control mutex (e.g. mmap) or, not synchronized with the state member (XSK_BOUND plus smp_rmb()) Note that dev and queue_id do not need a WRITE_ONCE or READ_ONCE, due to the introduce state synchronization (XSK_BOUND plus smp_rmb()). Introducing the state check also fixes a race, found by syzcaller, in xsk_poll() where umem could be accessed when stale. Suggested-by: Hillf Danton <hdanton@sina.com> Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05	xsk: avoid store-tearing when assigning umem	Björn Töpel	1	-2/+2
	The umem member of struct xdp_sock is read outside of the control mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid potential store-tearing. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05	xsk: avoid store-tearing when assigning queues	Björn Töpel	1	-1/+1
	Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid potential store-tearing. These members are read outside of the control mutex in the mmap implementation. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: 37b076933a8e ("xsk: add missing write- and data-dependency barrier") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05	selftests/bpf: precision tracking tests	Alexei Starovoitov	1	-0/+52
	Add two tests to check that stack slot marking during backtracking doesn't trigger 'spi > allocated_stack' warning. One test is using BPF_ST insn. Another is using BPF_STX. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05	ixgbe: fix xdp handle calculations	Kevin Laatz	1	-3/+4
	Currently, we don't add headroom to the handle in ixgbe_zca_free, ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc. The addition of the headroom to the handle was removed in commit d8c3061e5edd ("ixgbe: modify driver for handling offsets"), which will break things when headroom isvnon-zero. This patch fixes this and uses xsk_umem_adjust_offset to add it appropritely based on the mode being run. Fixes: d8c3061e5edd ("ixgbe: modify driver for handling offsets") Reported-by: Bjorn Topel <bjorn.topel@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05	i40e: fix xdp handle calculations	Kevin Laatz	1	-3/+4
	Currently, we don't add headroom to the handle in i40e_zca_free, i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc. The addition of the headroom to the handle was removed in commit 2f86c806a8a8 ("i40e: modify driver for handling offsets"), which will break things when headroom is non-zero. This patch fixes this and uses xsk_umem_adjust_offset to add it appropritely based on the mode being run. Fixes: 2f86c806a8a8 ("i40e: modify driver for handling offsets") Reported-by: Bjorn Topel <bjorn.topel@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	selftests/bpf: fix endianness issues in test_sysctl	Ilya Leoshkevich	1	-43/+82
	A lot of test_sysctl sub-tests fail due to handling strings as a bunch of immediate values in a little-endian-specific manner. Fix by wrapping all immediates in bpf_ntohl and the new bpf_be64_to_cpu. fixup_sysctl_value() dynamically writes an immediate, and thus should be endianness-aware. Implement this by simply memcpy()ing the raw user-provided value, since testcase endianness and bpf program endianness match. Fixes: 1f5fa9ab6e2e ("selftests/bpf: Test BPF_CGROUP_SYSCTL") Fixes: 9a1027e52535 ("selftests/bpf: Test file_pos field in bpf_sysctl ctx") Fixes: 6041c67f28d8 ("selftests/bpf: Test bpf_sysctl_get_name helper") Fixes: 11ff34f74e32 ("selftests/bpf: Test sysctl_get_current_value helper") Fixes: 786047dd08de ("selftests/bpf: Test bpf_sysctl_{get,set}_new_value helpers") Fixes: 8549ddc832d6 ("selftests/bpf: Test bpf_strtol and bpf_strtoul helpers") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	selftests/bpf: improve unexpected success reporting in test_syctl	Ilya Leoshkevich	1	-1/+2
	When tests fail because sysctl() unexpectedly succeeds, they print an inappropriate "Unexpected failure" message and a random errno. Zero out errno before calling sysctl() and replace the message with "Unexpected success". Fixes: 1f5fa9ab6e2e ("selftests/bpf: Test BPF_CGROUP_SYSCTL") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	selftests/bpf: fix "ctx:write sysctl:write read ok" on s390	Ilya Leoshkevich	1	-1/+1
	"ctx:write sysctl:write read ok" fails on s390 because it reads the first byte of an int assuming it's the least-significant one, which is not the case on big-endian arches. Since we are not testing narrow accesses here (there is e.g. "ctx:file_pos sysctl:read read ok narrow" for that), simply read the whole int. Fixes: 1f5fa9ab6e2e ("selftests/bpf: Test BPF_CGROUP_SYSCTL") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	selftests/bpf: introduce bpf_cpu_to_be64 and bpf_be64_to_cpu	Ilya Leoshkevich	3	-16/+22
	test_lwt_seg6local and test_seg6_loop use custom 64-bit endianness conversion macros. Centralize their definitions in bpf_endian.h in order to reduce code duplication. This will also be useful when bpf_endian.h is promoted to an offical libbpf header. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	arm64: bpf: optimize modulo operation	Jerin Jacob	2	-4/+5
	Optimize modulo operation instruction generation by using single MSUB instruction vs MUL followed by SUB instruction scheme. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	bpf: s390: add JIT support for bpf line info	Yauheni Kaliuta	1	-0/+1
	This adds support for generating bpf line info for JITed programs like commit 6f20c71d8505 ("bpf: powerpc64: add JIT support for bpf line info") does for powerpc, but it should pass the array starting from 1. This fixes test_btf. Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	selftests/bpf: test_progs: add missing \n to CHECK_FAIL	Stanislav Fomichev	1	-1/+1
	Copy-paste error from CHECK. Fixes: d38835b75f67 ("selftests/bpf: test_progs: remove global fail/success counts") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-03	selftests/bpf: test_progs: fix verbose mode garbage	Stanislav Fomichev	1	-0/+1
	fseeko(.., 0, SEEK_SET) on a memstream just puts the buffer pointer to the beginning so when we call fflush on it we get some garbage log data from the previous test. Let's manually set terminating byte to zero at the reported buffer size. To show the issue consider the following snippet: stream = open_memstream (&buf, &len); fprintf(stream, "aaa"); fflush(stream); printf("buf=%s, len=%zu\n", buf, len); fseeko(stream, 0, SEEK_SET); fprintf(stream, "b"); fflush(stream); printf("buf=%s, len=%zu\n", buf, len); Output: buf=aaa, len=3 buf=baa, len=1 Fixes: 946152b3c5d6 ("selftests/bpf: test_progs: switch to open_memstream") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	doc/af_xdp: include unaligned chunk case	Kevin Laatz	1	-4/+6
	The addition of unaligned chunks mode, the documentation needs to be updated to indicate that the incoming addr to the fill ring will only be masked if the user application is run in the aligned chunk mode. This patch also adds a line to explicitly indicate that the incoming addr will not be masked if running the user application in the unaligned chunk mode. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	samples/bpf: use hugepages in xdpsock app	Kevin Laatz	1	-5/+10
	This patch modifies xdpsock to use mmap instead of posix_memalign. With this change, we can use hugepages when running the application in unaligned chunks mode. Using hugepages makes it more likely that we have physically contiguous memory, which supports the unaligned chunk mode better. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	samples/bpf: add buffer recycling for unaligned chunks to xdpsock	Kevin Laatz	1	-10/+16
	This patch adds buffer recycling support for unaligned buffers. Since we don't mask the addr to 2k at umem_reg in unaligned mode, we need to make sure we give back the correct (original) addr to the fill queue. We achieve this using the new descriptor format and associated masks. The new format uses the upper 16-bits for the offset and the lower 48-bits for the addr. Since we have a field for the offset, we no longer need to modify the actual address. As such, all we have to do to get back the original address is mask for the lower 48 bits (i.e. strip the offset and we get the address on it's own). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	samples/bpf: add unaligned chunks mode support to xdpsock	Kevin Laatz	1	-3/+17
	This patch adds support for the unaligned chunks mode. The addition of the unaligned chunks option will allow users to run the application with more relaxed chunk placement in the XDP umem. Unaligned chunks mode can be used with the '-u' or '--unaligned' command line options. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	libbpf: add flags to umem config	Kevin Laatz	5	-4/+71
	This patch adds a 'flags' field to the umem_config and umem_reg structs. This will allow for more options to be added for configuring umems. The first use for the flags field is to add a flag for unaligned chunks mode. These flags can either be user-provided or filled with a default. Since we change the size of the xsk_umem_config struct, we need to version the ABI. This patch includes the ABI versioning for xsk_umem__create. The Makefile was also updated to handle multiple function versions in check-abi. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	net/mlx5e: Allow XSK frames smaller than a page	Maxim Mikityanskiy	4	-10/+32
	Relax the requirements to the XSK frame size to allow it to be smaller than a page and even not a power of two. The current implementation can work in this mode, both with Striding RQ and without it. The code that checks `mtu + headroom <= XSK frame size` is modified accordingly. Any frame size between 2048 and PAGE_SIZE is accepted. Functions that worked with pages only now work with XSK frames, even if their size is different from PAGE_SIZE. With XSK queues, regardless of the frame size, Striding RQ uses the stride size of PAGE_SIZE, and UMR MTTs are posted using starting addresses of frames, but PAGE_SIZE as page size. MTU guarantees that no packet data will overlap with other frames. UMR MTT size is made equal to the stride size of the RQ, because UMEM frames may come in random order, and we need to handle them one by one. PAGE_SIZE is just a power of two that is bigger than any allowed XSK frame size, and also it doesn't require making additional changes to the code. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	mlx5e: modify driver for handling offsets	Kevin Laatz	2	-3/+8
	With the addition of the unaligned chunks option, we need to make sure we handle the offsets accordingly based on the mode we are currently running in. This patch modifies the driver to appropriately mask the address for each case. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	ixgbe: modify driver for handling offsets	Kevin Laatz	1	-4/+9
	With the addition of the unaligned chunks option, we need to make sure we handle the offsets accordingly based on the mode we are currently running in. This patch modifies the driver to appropriately mask the address for each case. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	i40e: modify driver for handling offsets	Kevin Laatz	1	-4/+9
	With the addition of the unaligned chunks option, we need to make sure we handle the offsets accordingly based on the mode we are currently running in. This patch modifies the driver to appropriately mask the address for each case. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	xsk: add support to allow unaligned chunk placement	Kevin Laatz	6	-36/+233
	Currently, addresses are chunk size aligned. This means, we are very restricted in terms of where we can place chunk within the umem. For example, if we have a chunk size of 2k, then our chunks can only be placed at 0,2k,4k,6k,8k... and so on (ie. every 2k starting from 0). This patch introduces the ability to use unaligned chunks. With these changes, we are no longer bound to having to place chunks at a 2k (or whatever your chunk size is) interval. Since we are no longer dealing with aligned chunks, they can now cross page boundaries. Checks for page contiguity have been added in order to keep track of which pages are followed by a physically contiguous page. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	ixgbe: simplify Rx buffer recycle	Kevin Laatz	1	-10/+3
	Currently, the dma, addr and handle are modified when we reuse Rx buffers in zero-copy mode. However, this is not required as the inputs to the function are copies, not the original values themselves. As we use the copies within the function, we can use the original 'obi' values directly without having to mask and add the headroom. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	i40e: simplify Rx buffer recycle	Kevin Laatz	1	-10/+3
	Currently, the dma, addr and handle are modified when we reuse Rx buffers in zero-copy mode. However, this is not required as the inputs to the function are copies, not the original values themselves. As we use the copies within the function, we can use the original 'old_bi' values directly without having to mask and add the headroom. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	selftests/bpf: Fix a typo in test_offload.py	Masanari Iida	1	-1/+1
	This patch fix a spelling typo in test_offload.py Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	bpf: fix error check in bpf_tcp_gen_syncookie	Petar Penkov	1	-1/+1
	If a SYN cookie is not issued by tcp_v#_gen_syncookie, then the return value will be exactly 0, rather than <= 0. Let's change the check to reflect that, especially since mss is an unsigned value and cannot be negative. Fixes: 70d66244317e ("bpf: add bpf_tcp_gen_syncookie helper") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Petar Penkov <ppenkov@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	nfp: bpf: add simple map op cache	Jakub Kicinski	5	-9/+215
	Each get_next and lookup call requires a round trip to the device. However, the device is capable of giving us a few entries back, instead of just one. In this patch we ask for a small yet reasonable number of entries (4) on every get_next call, and on subsequent get_next/lookup calls check this little cache for a hit. The cache is only kept for 250us, and is invalidated on every operation which may modify the map (e.g. delete or update call). Note that operations may be performed simultaneously, so we have to keep track of operations in flight. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	nfp: bpf: rework MTU checking	Jakub Kicinski	5	-12/+25
	If control channel MTU is too low to support map operations a warning will be printed. This is not enough, we want to make sure probe fails in such scenario, as this would clearly be a faulty configuration. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	tools: bpftool: do not link twice against libbpf.a in Makefile	Quentin Monnet	1	-2/+2
	In bpftool's Makefile, $(LIBS) includes $(LIBBPF), therefore the library is used twice in the linking command. No need to have $(LIBBPF) (from $^) on that command, let's do with "$(OBJS) $(LIBS)" (but move $(LIBBPF) _before_ the -l flags in $(LIBS)). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	tools: bpf: account for generated feature/ and libbpf/ directories	Quentin Monnet	4	-6/+12
	When building "tools/bpf" from the top of the Linux repository, the build system passes a value for the $(OUTPUT) Makefile variable to tools/bpf/Makefile and tools/bpf/bpftool/Makefile, which results in generating "libbpf/" (for bpftool) and "feature/" (bpf and bpftool) directories inside the tree. This commit adds such directories to the relevant .gitignore files, and edits the Makefiles to ensure they are removed on "make clean". The use of "rm" is also made consistent throughout those Makefiles (relies on the $(RM) variable, use "--" to prevent interpreting $(OUTPUT)/$(DESTDIR) as options. v2: - New patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	tools: bpftool: improve and check builds for different make invocations	Quentin Monnet	3	-6/+152
	There are a number of alternative "make" invocations that can be used to compile bpftool. The following invocations are expected to work: - through the kbuild system, from the top of the repository (make tools/bpf) - by telling make to change to the bpftool directory (make -C tools/bpf/bpftool) - by building the BPF tools from tools/ (cd tools && make bpf) - by running make from bpftool directory (cd tools/bpf/bpftool && make) Additionally, setting the O or OUTPUT variables should tell the build system to use a custom output path, for each of these alternatives. The following patch fixes the following invocations: $ make tools/bpf $ make tools/bpf O=<dir> $ make -C tools/bpf/bpftool OUTPUT=<dir> $ make -C tools/bpf/bpftool O=<dir> $ cd tools/ && make bpf O=<dir> $ cd tools/bpf/bpftool && make OUTPUT=<dir> $ cd tools/bpf/bpftool && make O=<dir> After this commit, the build still fails for two variants when passing the OUTPUT variable: $ make tools/bpf OUTPUT=<dir> $ cd tools/ && make bpf OUTPUT=<dir> In order to remember and check what make invocations are supposed to work, and to document the ones which do not, a new script is added to the BPF selftests. Note that some invocations require the kernel to be configured, so the script skips them if no .config file is found. v2: - In make_and_clean(), set $ERROR to 1 when "make" returns non-zero, even if the binary was produced. - Run "make clean" from the correct directory (bpf/ instead of bpftool/, when relevant). Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	tools: bpftool: ignore make built-in rules for getting kernel version	Quentin Monnet	1	-1/+1
	Bpftool calls the toplevel Makefile to get the kernel version for the sources it is built from. But when the utility is built from the top of the kernel repository, it may dump the following error message for certain architectures (including x86): $ make tools/bpf [...] make[3]: *** [checkbin] Error 1 [...] This does not prevent bpftool compilation, but may feel disconcerting. The "checkbin" arch-dependent target is not supposed to be called for target "kernelversion", which is a simple "echo" of the version number. It turns out this is caused by the make invocation in tools/bpf/bpftool, which attempts to find implicit rules to apply. Extract from debug output: Reading makefiles... Reading makefile 'Makefile'... Reading makefile 'scripts/Kbuild.include' (search path) (no ~ expansion)... Reading makefile 'scripts/subarch.include' (search path) (no ~ expansion)... Reading makefile 'arch/x86/Makefile' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.kcov' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.gcc-plugins' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.kasan' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.extrawarn' (search path) (no ~ expansion)... Reading makefile 'scripts/Makefile.ubsan' (search path) (no ~ expansion)... Updating makefiles.... Considering target file 'scripts/Makefile.ubsan'. Looking for an implicit rule for 'scripts/Makefile.ubsan'. Trying pattern rule with stem 'Makefile.ubsan'. [...] Trying pattern rule with stem 'Makefile.ubsan'. Trying implicit prerequisite 'scripts/Makefile.ubsan.o'. Looking for a rule with intermediate file 'scripts/Makefile.ubsan.o'. Avoiding implicit rule recursion. Trying pattern rule with stem 'Makefile.ubsan'. Trying rule prerequisite 'prepare'. Trying rule prerequisite 'FORCE'. Found an implicit rule for 'scripts/Makefile.ubsan'. Considering target file 'prepare'. File 'prepare' does not exist. Considering target file 'prepare0'. File 'prepare0' does not exist. Considering target file 'archprepare'. File 'archprepare' does not exist. Considering target file 'archheaders'. File 'archheaders' does not exist. Finished prerequisites of target file 'archheaders'. Must remake target 'archheaders'. Putting child 0x55976f4f6980 (archheaders) PID 31743 on the chain. To avoid that, pass the -r and -R flags to eliminate the use of make built-in rules (and while at it, built-in variables) when running command "make kernelversion" from bpftool's Makefile. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31	bpf: s390: add JIT support for multi-function programs	Yauheni Kaliuta	1	-11/+55
	This adds support for bpf-to-bpf function calls in the s390 JIT compiler. The JIT compiler converts the bpf call instructions to native branch instructions. After a round of the usual passes, the start addresses of the JITed images for the callee functions are known. Finally, to fixup the branch target addresses, we need to perform an extra pass. Because of the address range in which JITed images are allocated on s390, the offsets of the start addresses of these images from __bpf_call_base are as large as 64 bits. So, for a function call, the imm field of the instruction cannot be used to determine the callee's address. Use bpf_jit_get_func_addr() helper instead. The patch borrows a lot from: commit 8c11ea5ce13d ("bpf, arm64: fix getting subprog addr from aux for calls") commit e2c95a61656d ("bpf, ppc64: generalize fetching subprog into bpf_jit_get_func_addr") commit 8484ce8306f9 ("bpf: powerpc64: add JIT support for multi-function programs") (including the commit message). test_verifier (5.3-rc6 with CONFIG_BPF_JIT_ALWAYS_ON=y): without patch: Summary: 1501 PASSED, 0 SKIPPED, 47 FAILED with patch: Summary: 1540 PASSED, 0 SKIPPED, 8 FAILED Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: remove wrong nhoff in flow dissector test	Stanislav Fomichev	1	-1/+0
	.nhoff = 0 is (correctly) reset to ETH_HLEN on the next line so let's drop it. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: test_progs: remove unused ret	Stanislav Fomichev	1	-23/+19
	send_signal test returns static codes from the subtests which nobody looks at, let's rely on the CHECK macros instead. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: test_progs: remove asserts from subtests	Stanislav Fomichev	5	-26/+40
	Otherwise they can bring the whole process down. Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: test_progs: remove global fail/success counts	Stanislav Fomichev	22	-123/+60
	Now that we have a global per-test/per-environment state, there is no longer need to have global fail/success counters (and there is no need to save/get the diff before/after the test). Introduce CHECK_FAIL macro (suggested by Andrii) and covert existing tests to it. CHECK_FAIL uses new test__fail() to record the failure. Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: test_progs: test__skip	Stanislav Fomichev	3	-2/+21
	Export test__skip() to indicate skipped tests and use it in test_send_signal_nmi(). Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: add precision tracking test	Alexei Starovoitov	1	-0/+25
	Copy-paste of existing test "calls: cross frame pruning - liveness propagation" but ran with different parentage chain heuristic which stresses different path in precision tracking logic. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	selftests/bpf: verifier precise tests	Alexei Starovoitov	2	-11/+174
	Use BPF_F_TEST_STATE_FREQ flag to check that precision tracking works as expected by comparing every step it takes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	tools/bpf: sync bpf.h	Alexei Starovoitov	1	-0/+3
	sync bpf.h from kernel/ to tools/ Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-28	bpf: introduce verifier internal test flag	Alexei Starovoitov	4	-1/+9
	Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain and state pruning. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21	tools: bpftool: add "bpftool map freeze" subcommand	Quentin Monnet	3	-3/+44
	Add a new subcommand to freeze maps from user space. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21	tools: bpftool: show frozen status for maps	Quentin Monnet	1	-3/+27
	When listing maps, read their "frozen" status from procfs, and tell if maps are frozen. As commit log for map freezing command mentions that the feature might be extended with flags (e.g. for write-only instead of read-only) in the future, use an integer and not a boolean for JSON output. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21	bpf: sync bpf.h to tools/	Peter Wu	1	-3/+5
	Fix a 'struct pt_reg' typo and clarify when bpf_trace_printk discards lines. Affects documentation only. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21	bpf: clarify when bpf_trace_printk discards lines	Peter Wu	1	-0/+2
	I opened /sys/kernel/tracing/trace once and kept reading from it. bpf_trace_printk somehow did not seem to work, no entries were appended to that trace file. It turns out that tracing is disabled when that file is open. Save the next person some time and document this. The trace file is described in Documentation/trace/ftrace.rst, however the implication "tracing is disabled" did not immediate translate to "bpf_trace_printk silently discards entries". Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21	bpf: fix 'struct pt_reg' typo in documentation	Peter Wu	1	-3/+3
	There is no 'struct pt_reg'. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-21	bpf: clarify description for CONFIG_BPF_EVENTS	Peter Wu	1	-1/+2
	PERF_EVENT_IOC_SET_BPF supports uprobes since v4.3, and tracepoints since v4.7 via commit 04a22fae4cbc ("tracing, perf: Implement BPF programs attached to uprobes"), and commit 98b5c2c65c29 ("perf, bpf: allow bpf programs attach to tracepoints") respectively. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>