aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/call-graph-from-sql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-10-08bpf: allow offload of programs with BPF-to-BPF function callsQuentin Monnet1-7/+3
Now that there is at least one driver supporting BPF-to-BPF function calls, lift the restriction, in the verifier, on hardware offload of eBPF programs containing such calls. But prevent jit_subprogs(), still in the verifier, from being run for offloaded programs. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: support pointers to other stack frames for BPF-to-BPF callsQuentin Monnet3-1/+6
Mark instructions that use pointers to areas in the stack outside of the current stack frame, and process them accordingly in mem_op_stack(). This way, we also support BPF-to-BPF calls where the caller passes a pointer to data in its own stack frame to the callee (typically, when the caller passes an address to one of its local variables located in the stack, as an argument). Thanks to Jakub and Jiong for figuring out how to deal with this case, I just had to turn their email discussion into this patch. Suggested-by: Jiong Wang <jiong.wang@netronome.com> Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: optimise save/restore for R6~R9 based on register usageQuentin Monnet3-23/+78
When pre-processing the instructions, it is trivial to detect what subprograms are using R6, R7, R8 or R9 as destination registers. If a subprogram uses none of those, then we do not need to jump to the subroutines dedicated to saving and restoring callee-saved registers in its prologue and epilogue. This patch introduces detection of callee-saved registers in subprograms and prevents the JIT from adding calls to those subroutines whenever we can: we save some instructions in the translated program, and some time at runtime on BPF-to-BPF calls and returns. If no subprogram needs to save those registers, we can avoid appending the subroutines at the end of the program. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: fix return address from register-saving subroutine to calleeQuentin Monnet1-1/+27
On performing a BPF-to-BPF call, we first jump to a subroutine that pushes callee-saved registers (R6~R9) to the stack, and from there we goes to the start of the callee next. In order to do so, the caller must pass to the subroutine the address of the NFP instruction to jump to at the end of that subroutine. This cannot be reliably implemented when translated the caller, as we do not always know the start offset of the callee yet. This patch implement the required fixup step for passing the start offset in the callee via the register used by the subroutine to hold its return address. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: update fixup function for BPF-to-BPF calls supportQuentin Monnet2-3/+24
Relocation for targets of BPF-to-BPF calls are required at the end of translation. Update the nfp_fixup_branches() function in that regard. When checking that the last instruction of each bloc is a branch, we must account for the length of the instructions required to pop the return address from the stack. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: account for additional stack usage when checking stack limitQuentin Monnet2-8/+68
Offloaded programs using BPF-to-BPF calls use the stack to store the return address when calling into a subprogram. Callees also need some space to save eBPF registers R6 to R9. And contrarily to kernel verifier, we align stack frames on 64 bytes (and not 32). Account for all this when checking the stack size limit before JIT-ing the program. This means we have to recompute maximum stack usage for the program, we cannot get the value from the kernel. In addition to adapting the checks on stack usage, move them to the finalize() callback, now that we have it and because such checks are part of the verification step rather than translation. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: add main logics for BPF-to-BPF calls support in nfp driverQuentin Monnet5-4/+295
This is the main patch for the logics of BPF-to-BPF calls in the nfp driver. The functions called on BPF_JUMP | BPF_CALL and BPF_JUMP | BPF_EXIT were used to call helpers and exit from the program, respectively; make them usable for calling into, or returning from, a BPF subprogram as well. For all calls, push the return address as well as the callee-saved registers (R6 to R9) to the stack, and pop them upon returning from the calls. In order to limit the overhead in terms of instruction number, this is done through dedicated subroutines. Jumping to the callee actually consists in jumping to the subroutine, that "returns" to the callee: this will require some fixup for passing the address in a later patch. Similarly, returning consists in jumping to the subroutine, which pops registers and then return directly to the caller (but no fixup is needed here). Return to the caller is performed with the RTN instruction newly added to the JIT. For the few steps where we need to know what subprogram an instruction belongs to, the struct nfp_insn_meta is extended with a new subprog_idx field. Note that checks on the available stack size, to take into account the additional requirements associated to BPF-to-BPF calls (storing R6-R9 and return addresses), are added in a later patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: account for BPF-to-BPF calls when preparing nfp JITQuentin Monnet2-11/+27
Similarly to "exit" or "helper call" instructions, BPF-to-BPF calls will require additional processing before translation starts, in order to record and mark jump destinations. We also mark the instructions where each subprogram begins. This will be used in a following commit to determine where to add prologues for subprograms. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: ignore helper-related checks for BPF calls in nfp verifierQuentin Monnet2-4/+13
The checks related to eBPF helper calls are performed each time the nfp driver meets a BPF_JUMP | BPF_CALL instruction. However, these checks are not relevant for BPF-to-BPF call (same instruction code, different value in source register), so just skip the checks for such calls. While at it, rename the function that runs those checks to make it clear they apply to _helper_ calls only. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: copy eBPF subprograms information from kernel verifierQuentin Monnet3-0/+29
In order to support BPF-to-BPF calls in offloaded programs, the nfp driver must collect information about the distinct subprograms: namely, the number of subprograms composing the complete program and the stack depth of those subprograms. The latter in particular is non-trivial to collect, so we copy those elements from the kernel verifier via the newly added post-verification hook. The struct nfp_prog is extended to store this information. Stack depths are stored in an array of dedicated structs. Subprogram start indexes are not collected. Instead, meta instructions associated to the start of a subprogram will be marked with a flag in a later patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: rename nfp_prog->stack_depth as nfp_prog->stack_frame_depthQuentin Monnet3-8/+8
In preparation for support for BPF to BPF calls in offloaded programs, rename the "stack_depth" field of the struct nfp_prog as "stack_frame_depth". This is to make it clear that the field refers to the maximum size of the current stack frame (as opposed to the maximum size of the whole stack memory). Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08bpf: add verifier callback to get stack usage info for offloaded progsQuentin Monnet6-2/+37
In preparation for BPF-to-BPF calls in offloaded programs, add a new function attribute to the struct bpf_prog_offload_ops so that drivers supporting eBPF offload can hook at the end of program verification, and potentially extract information collected by the verifier. Implement a minimal callback (returning 0) in the drivers providing the structs, namely netdevsim and nfp. This will be useful in the nfp driver, in later commits, to extract the number of subprograms as well as the stack depth for those subprograms. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08bpf, doc: Document Jump X addressing modeArthur Fabre1-14/+16
bpf_asm and the other classic BPF tools support jump conditions comparing register A to register X, in addition to comparing register A with constant K. Only the latter was documented in filter.txt, add two new addressing modes that describe the former. Signed-off-by: Arthur Fabre <arthur@arthurfabre.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08libbpf: relicense libbpf as LGPL-2.1 OR BSD-2-ClauseAlexei Starovoitov13-62/+13
libbpf is maturing as a library and gaining features that no other bpf libraries support (BPF Type Format, bpf to bpf calls, etc) Many Apache2 licensed projects (like bcc, bpftrace, gobpf, cilium, etc) would like to use libbpf, but cannot do this yet, since Apache Foundation explicitly states that LGPL is incompatible with Apache2. Hence let's relicense libbpf as dual license LGPL-2.1 or BSD-2-Clause, since BSD-2 is compatible with Apache2. Dual LGPL or Apache2 is invalid combination. Fix license mistake in Makefile as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrey Ignatov <rdna@fb.com> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Beckett <david.beckett@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Wang Nan <wangnan0@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08xsk: proper AF_XDP socket teardown orderingBjörn Töpel2-13/+11
The AF_XDP socket struct can exist in three different, implicit states: setup, bound and released. Setup is prior the socket has been bound to a device. Bound is when the socket is active for receive and send. Released is when the process/userspace side of the socket is released, but the sock object is still lingering, e.g. when there is a reference to the socket in an XSKMAP after process termination. The Rx fast-path code uses the "dev" member of struct xdp_sock to check whether a socket is bound or relased, and the Tx code uses the struct xdp_umem "xsk_list" member in conjunction with "dev" to determine the state of a socket. However, the transition from bound to released did not tear the socket down in correct order. On the Rx side "dev" was cleared after synchronize_net() making the synchronization useless. On the Tx side, the internal queues were destroyed prior removing them from the "xsk_list". This commit corrects the cleanup order, and by doing so xdp_del_sk_umem() can be simplified and one synchronize_net() can be removed. Fixes: 965a99098443 ("xsk: add support for bind for Rx") Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-05xsk: simplify xdp_clear_umem_at_qid implementationMagnus Karlsson1-5/+2
As we now do not allow ethtool to deactivate the queue id we are running an AF_XDP socket on, we can simplify the implementation of xdp_clear_umem_at_qid(). Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-05ethtool: don't allow disabling queues with umem installedJakub Kicinski3-2/+20
We already check the RSS indirection table does not use queues which would be disabled by channel reconfiguration. Make sure user does not try to disable queues which have a UMEM and zero-copy AF_XDP socket installed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-05ethtool: rename local variable max -> currJakub Kicinski1-6/+6
ethtool_set_channels() validates the config against driver's max settings. It retrieves the current config and stores it in a variable called max. This was okay when only max settings were accessed but we will soon want to access current settings as well, so calling the entire structure max makes the code less readable. While at it drop unnecessary parenthesis. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-05xsk: fix bug when trying to use both copy and zero-copy on one queue idMagnus Karlsson3-35/+64
Previously, the xsk code did not record which umem was bound to a specific queue id. This was not required if all drivers were zero-copy enabled as this had to be recorded in the driver anyway. So if a user tried to bind two umems to the same queue, the driver would say no. But if copy-mode was first enabled and then zero-copy mode (or the reverse order), we mistakenly enabled both of them on the same umem leading to buggy behavior. The main culprit for this is that we did not store the association of umem to queue id in the copy case and only relied on the driver reporting this. As this relation was not stored in the driver for copy mode (it does not rely on the AF_XDP NDOs), this obviously could not work. This patch fixes the problem by always recording the umem to queue id relationship in the netdev_queue and netdev_rx_queue structs. This way we always know what kind of umem has been bound to a queue id and can act appropriately at bind time. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-05net: add umem reference in netdev{_rx}_queueMagnus Karlsson1-0/+6
These references to the umem will be used to store information on what kind of AF_XDP umem that is bound to a queue id, if any. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-05bpf: typo fix in Documentation/networking/af_xdp.rstKonrad Djimeli1-2/+2
Fix a simple typo: Completetion -> Completion Signed-off-by: Konrad Djimeli <kdjimeli@igalia.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04bpf, tracex3_user: erase "ARRAY_SIZE" redefinedBo YU1-2/+0
There is a warning when compiling bpf sample programs in sample/bpf: make -C /home/foo/bpf/samples/bpf/../../tools/lib/bpf/ RM='rm -rf' LDFLAGS= srctree=/home/foo/bpf/samples/bpf/../../ O= HOSTCC /home/foo/bpf/samples/bpf/tracex3_user.o /home/foo/bpf/samples/bpf/tracex3_user.c:20:0: warning: "ARRAY_SIZE" redefined #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) In file included from /home/foo/bpf/samples/bpf/tracex3_user.c:18:0: ./tools/testing/selftests/bpf/bpf_util.h:48:0: note: this is the location of the previous definition # define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) Signed-off-by: Bo YU <tsu.yubo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04libbpf: Use __u32 instead of u32 in bpf_program__loadAndrey Ignatov2-2/+2
Make bpf_program__load consistent with other interfaces: use __u32 instead of u32. That in turn fixes build of samples: In file included from ./samples/bpf/trace_output_user.c:21:0: ./tools/lib/bpf/libbpf.h:132:9: error: unknown type name ‘u32’ u32 kern_version); ^ Fixes: commit 29cd77f41620d ("libbpf: Support loading individual progs") Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04libbpf: Make include guards consistentAndrey Ignatov5-15/+15
Rename include guards to have consistent names "__LIBBPF_<header_name>". Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04libbpf: Consistent prefixes for interfaces in str_error.h.Andrey Ignatov3-11/+13
libbpf is used more and more outside kernel tree. That means the library should follow good practices in library design and implementation to play well with third party code that uses it. One of such practices is to have a common prefix (or a few) for every interface, function or data structure, library provides. I helps to avoid name conflicts with other libraries and keeps API consistent. Inconsistent names in libbpf already cause problems in real life. E.g. an application can't use both libbpf and libnl due to conflicting symbols. Having common prefix will help to fix current and avoid future problems. libbpf already uses the following prefixes for its interfaces: * bpf_ for bpf system call wrappers, program/map/elf-object abstractions and a few other things; * btf_ for BTF related API; * libbpf_ for everything else. The patch renames function in str_error.h to have libbpf_ prefix since it misses one and doesn't fit well into the first two categories. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04libbpf: Consistent prefixes for interfaces in nlattr.h.Andrey Ignatov5-81/+94
libbpf is used more and more outside kernel tree. That means the library should follow good practices in library design and implementation to play well with third party code that uses it. One of such practices is to have a common prefix (or a few) for every interface, function or data structure, library provides. I helps to avoid name conflicts with other libraries and keeps API consistent. Inconsistent names in libbpf already cause problems in real life. E.g. an application can't use both libbpf and libnl due to conflicting symbols. Having common prefix will help to fix current and avoid future problems. libbpf already uses the following prefixes for its interfaces: * bpf_ for bpf system call wrappers, program/map/elf-object abstractions and a few other things; * btf_ for BTF related API; * libbpf_ for everything else. The patch adds libbpf_ prefix to interfaces in nlattr.h that use none of mentioned above prefixes and doesn't fit well into the first two categories. Since affected part of API is used in bpftool, the patch applies corresponding change to bpftool as well. Having it in a separate patch will cause a state of tree where bpftool is broken what may not be a good idea. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04libbpf: Consistent prefixes for interfaces in libbpf.h.Andrey Ignatov3-43/+45
libbpf is used more and more outside kernel tree. That means the library should follow good practices in library design and implementation to play well with third party code that uses it. One of such practices is to have a common prefix (or a few) for every interface, function or data structure, library provides. I helps to avoid name conflicts with other libraries and keeps API consistent. Inconsistent names in libbpf already cause problems in real life. E.g. an application can't use both libbpf and libnl due to conflicting symbols. Having common prefix will help to fix current and avoid future problems. libbpf already uses the following prefixes for its interfaces: * bpf_ for bpf system call wrappers, program/map/elf-object abstractions and a few other things; * btf_ for BTF related API; * libbpf_ for everything else. The patch adds libbpf_ prefix to functions and typedef in libbpf.h that use none of mentioned above prefixes and doesn't fit well into the first two categories. Since affected part of API is used in bpftool, the patch applies corresponding change to bpftool as well. Having it in a separate patch will cause a state of tree where bpftool is broken what may not be a good idea. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04libbpf: Move __dump_nlmsg_t from API to implementationAndrey Ignatov2-3/+3
This typedef is used only by implementation in netlink.c. Nothing uses it in public API. Move it to netlink.c. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-04net: core: Fix build with CONFIG_IPV6=mJoe Stringer1-1/+1
Stephen Rothwell reports the following link failure with IPv6 as module: x86_64-linux-gnu-ld: net/core/filter.o: in function `sk_lookup': (.text+0x19219): undefined reference to `__udp6_lib_lookup' Fix the build by only enabling the IPv6 socket lookup if IPv6 support is compiled into the kernel. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03Documentation: Describe bpf reference trackingJoe Stringer1-0/+64
Document the new pointer types in the verifier and how the pointer ID tracking works to ensure that references which are taken are later released. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03selftests/bpf: Add C tests for reference trackingJoe Stringer3-1/+219
Add some tests that demonstrate and test the balanced lookup/free nature of socket lookup. Section names that start with "fail" represent programs that are expected to fail verification; all others should succeed. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03libbpf: Support loading individual progsJoe Stringer2-2/+5
Allow the individual program load to be invoked. This will help with testing, where a single ELF may contain several sections, some of which denote subprograms that are expected to fail verification, along with some which are expected to pass verification. By allowing programs to be iterated and individually loaded, each program can be independently checked against its expected verification result. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03selftests/bpf: Add tests for reference trackingJoe Stringer1-0/+759
reference tracking: leak potential reference reference tracking: leak potential reference on stack reference tracking: leak potential reference on stack 2 reference tracking: zero potential reference reference tracking: copy and zero potential references reference tracking: release reference without check reference tracking: release reference reference tracking: release reference twice reference tracking: release reference twice inside branch reference tracking: alloc, check, free in one subbranch reference tracking: alloc, check, free in both subbranches reference tracking in call: free reference in subprog reference tracking in call: free reference in subprog and outside reference tracking in call: alloc & leak reference in subprog reference tracking in call: alloc in subprog, release outside reference tracking in call: sk_ptr leak into caller stack reference tracking in call: sk_ptr spill into caller stack reference tracking: allow LD_ABS reference tracking: forbid LD_ABS while holding reference reference tracking: allow LD_IND reference tracking: forbid LD_IND while holding reference reference tracking: check reference or tail call reference tracking: release reference then tail call reference tracking: leak possible reference over tail call reference tracking: leak checked reference over tail call reference tracking: mangle and release sock_or_null reference tracking: mangle and release sock reference tracking: access member reference tracking: write to member reference tracking: invalid 64-bit access of member reference tracking: access after release reference tracking: direct access for lookup unpriv: spill/fill of different pointers stx - ctx and sock unpriv: spill/fill of different pointers stx - leak sock unpriv: spill/fill of different pointers stx - sock and ctx (read) unpriv: spill/fill of different pointers stx - sock and ctx (write) Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03selftests/bpf: Generalize dummy program typesJoe Stringer1-14/+17
Don't hardcode the dummy program types to SOCKET_FILTER type, as this prevents testing bpf_tail_call in conjunction with other program types. Instead, use the program type specified in the test case. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add helper to retrieve socket in BPFJoe Stringer5-3/+354
This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and bpf_sk_lookup_udp() which allows BPF programs to find out if there is a socket listening on this host, and returns a socket pointer which the BPF program can then access to determine, for instance, whether to forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the socket, so when a BPF program makes use of this function, it must subsequently pass the returned pointer into the newly added sk_release() to return the reference. By way of example, the following pseudocode would filter inbound connections at XDP if there is no corresponding service listening for the traffic: struct bpf_sock_tuple tuple; struct bpf_sock_ops *sk; populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0); if (!sk) { // Couldn't find a socket listening for this traffic. Drop. return TC_ACT_SHOT; } bpf_sk_release(sk, 0); return TC_ACT_OK; Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add reference tracking to verifierJoe Stringer2-22/+308
Allow helper functions to acquire a reference and return it into a register. Specific pointer types such as the PTR_TO_SOCKET will implicitly represent such a reference. The verifier must ensure that these references are released exactly once in each path through the program. To achieve this, this commit assigns an id to the pointer and tracks it in the 'bpf_func_state', then when the function or program exits, verifies that all of the acquired references have been freed. When the pointer is passed to a function that frees the reference, it is removed from the 'bpf_func_state` and all existing copies of the pointer in registers are marked invalid. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Macrofy stack state copyJoe Stringer1-46/+60
An upcoming commit will need very similar copy/realloc boilerplate, so refactor the existing stack copy/realloc functions into macros to simplify it. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add PTR_TO_SOCKET verifier typeJoe Stringer4-26/+160
Teach the verifier a little bit about a new type of pointer, a PTR_TO_SOCKET. This pointer type is accessed from BPF through the 'struct bpf_sock' structure. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Generalize ptr_or_null regs checkJoe Stringer1-18/+25
This check will be reused by an upcoming commit for conditional jump checks for sockets. Refactor it a bit to simplify the later commit. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Reuse canonical string formatter for ctx errsJoe Stringer2-9/+8
The array "reg_type_str" provides canonical formatting of register types, however a couple of places would previously check whether a register represented the context and write the name "context" directly. An upcoming commit will add another pointer type to these statements, so to provide more accurate error messages in the verifier, update these error messages to use "reg_type_str" instead. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Simplify ptr_min_max_vals adjustmentJoe Stringer2-19/+17
An upcoming commit will add another two pointer types that need very similar behaviour, so generalise this function now. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03bpf: Add iterator for spilled registersJoe Stringer2-9/+18
Add this iterator for spilled registers, it concentrates the details of how to get the current frame's spilled registers into a single macro while clarifying the intention of the code which is calling the macro. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02nfp: bpf: allow control message sizing for map opsJakub Kicinski5-17/+83
In current ABI the size of the messages carrying map elements was statically defined to at most 16 words of key and 16 words of value (NFP word is 4 bytes). We should not make this assumption and use the max key and value sizes from the BPF capability instead. To make sure old kernels don't get surprised with larger (or smaller) messages bump the FW ABI version to 3 when key/value size is different than 16 words. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02nfp: allow apps to request larger MTU on control vNICJakub Kicinski2-2/+16
Some apps may want to have higher MTU on the control vNIC/queue. Allow them to set the requested MTU at init time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02nfp: bpf: parse global BPF ABI version capabilityJakub Kicinski3-4/+44
Up until now we only had per-vNIC BPF ABI version capabilities, which are slightly awkward to use because bulk of the resources and configuration does not relate to any particular vNIC. Add a new capability for global ABI version and check the per-vNIC version are equal to it. Assume the ABI version 2 if no explicit version capability is present. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01selftests/bpf: cgroup local storage-based network countersRoman Gushchin4-2/+257
This commit adds a bpf kselftest, which demonstrates how percpu and shared cgroup local storage can be used for efficient lookup-free network accounting. Cgroup local storage provides generic memory area with a very efficient lookup free access. To avoid expensive atomic operations for each packet, per-cpu cgroup local storage is used. Each packet is initially charged to a per-cpu counter, and only if the counter reaches certain value (32 in this case), the charge is moved into the global atomic counter. This allows to amortize atomic operations, keeping reasonable accuracy. The test also implements a naive network traffic throttling, mostly to demonstrate the possibility of bpf cgroup--based network bandwidth control. Expected output: ./test_netcnt test_netcnt:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01samples/bpf: extend test_cgrp2_attach2 test to use per-cpu cgroup storageRoman Gushchin1-1/+18
This commit extends the test_cgrp2_attach2 test to cover per-cpu cgroup storage. Bpf program will use shared and per-cpu cgroup storages simultaneously, so a better coverage of corresponding core code will be achieved. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01selftests/bpf: extend the storage test to test per-cpu cgroup storageRoman Gushchin1-3/+57
This test extends the cgroup storage test to use per-cpu flavor of the cgroup storage as well. The test initializes a per-cpu cgroup storage to some non-zero initial value (1000), and then simple bumps a per-cpu counter each time the shared counter is atomically incremented. Then it reads all per-cpu areas from the userspace side, and checks that the sum of values adds to the expected sum. Expected output: $ ./test_cgroup_storage test_cgroup_storage:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01selftests/bpf: add verifier per-cpu cgroup storage testsRoman Gushchin1-6/+133
This commits adds verifier tests covering per-cpu cgroup storage functionality. There are 6 new tests, which are exactly the same as for shared cgroup storage, but do use per-cpu cgroup storage map. Expected output: $ ./test_verifier #0/u add+sub+mul OK #0/p add+sub+mul OK ... #286/p invalid cgroup storage access 6 OK #287/p valid per-cpu cgroup storage access OK #288/p invalid per-cpu cgroup storage access 1 OK #289/p invalid per-cpu cgroup storage access 2 OK #290/p invalid per-cpu cgroup storage access 3 OK #291/p invalid per-cpu cgroup storage access 4 OK #292/p invalid per-cpu cgroup storage access 5 OK #293/p invalid per-cpu cgroup storage access 6 OK #294/p multiple registers share map_lookup_elem result OK ... #662/p mov64 src == dst OK #663/p mov64 src != dst OK Summary: 914 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01bpftool: add support for PERCPU_CGROUP_STORAGE mapsRoman Gushchin1-1/+3
This commit adds support for BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE map type. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>