linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-01-24	selftests, xsk: Fix rx_full stats test	Magnus Karlsson	1	-1/+4
	Fix the rx_full stats test so that it correctly reports pass even when the fill ring is not full of buffers. Fixes: 872a1184dbf2 ("selftests: xsk: Put the same buffer only once in the fill ring") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20220121123508.12759-1-magnus.karlsson@gmail.com
2022-01-24	bpf: Fix flexible_array.cocci warnings	kernel test robot	1	-1/+1
	Zero-length and one-element arrays are deprecated, see: Documentation/process/deprecated.rst Flexible-array members should be used instead. Generated by: scripts/coccinelle/misc/flexible_array.cocci Fixes: c1ff181ffabc ("selftests/bpf: Extend kfunc selftests") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/alpine.DEB.2.22.394.2201221206320.12220@hadrien
2022-01-21	xdp: disable XDP_REDIRECT for xdp frags	Lorenzo Bianconi	1	-0/+8
	XDP_REDIRECT is not fully supported yet for xdp frags since not all XDP capable drivers can map non-linear xdp_frame in ndo_xdp_xmit so disable it for the moment. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/0da25e117d0e2673f5d0ce6503393c55c6fb1be9.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags	Lorenzo Bianconi	6	-1/+185
	Verify compatibility checks attaching a XDP frags program to a CPUMAP/DEVMAP Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/d94b4d35adc1e42c9ca5004e6b2cdfd75992304d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest	Lorenzo Bianconi	2	-0/+146
	Introduce kernel selftest for new bpf_xdp_{load,store}_bytes helpers. and bpf_xdp_pointer/bpf_xdp_copy_buf utility routines. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/2c99ae663a5dcfbd9240b1d0489ad55dea4f4601.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: xdp: introduce bpf_xdp_pointer utility routine	Lorenzo Bianconi	3	-38/+174
	Similar to skb_header_pointer, introduce bpf_xdp_pointer utility routine to return a pointer to a given position in the xdp_buff if the requested area (offset + len) is contained in a contiguous memory area otherwise it will be copied in a bounce buffer provided by the caller. Similar to the tc counterpart, introduce the two following xdp helpers: - bpf_xdp_load_bytes - bpf_xdp_store_bytes Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: generalise tail call map compatibility check	Toke Hoiland-Jorgensen	6	-40/+48
	The check for tail call map compatibility ensures that tail calls only happen between maps of the same type. To ensure backwards compatibility for XDP frags we need a similar type of check for cpumap and devmap programs, so move the state from bpf_array_aux into bpf_map, add xdp_has_frags to the check, and apply the same check to cpumap and devmap. Acked-by: John Fastabend <john.fastabend@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Toke Hoiland-Jorgensen <toke@redhat.com> Link: https://lore.kernel.org/r/f19fd97c0328a39927f3ad03e1ca6b43fd53cdfd.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	libbpf: Add SEC name for xdp frags programs	Lorenzo Bianconi	1	-0/+8
	Introduce support for the following SEC entries for XDP frags property: - SEC("xdp.frags") - SEC("xdp.frags/devmap") - SEC("xdp.frags/cpumap") Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/af23b6e4841c171ad1af01917839b77847a4bc27.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: selftests: update xdp_adjust_tail selftest to include xdp frags	Eelco Chaudron	3	-7/+160
	This change adds test cases for the xdp frags scenarios when shrinking and growing. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/d2e6a0ebc52db6f89e62b9befe045032e5e0a5fe.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature	Lorenzo Bianconi	1	-9/+39
	introduce xdp_shared_info pointer in bpf_test_finish signature in order to copy back paged data from a xdp frags frame to userspace buffer Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c803673798c786f915bcdd6c9338edaa9740d3d6.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: introduce frags support to bpf_prog_test_run_xdp()	Lorenzo Bianconi	1	-13/+45
	Introduce the capability to allocate a xdp frags in bpf_prog_test_run_xdp routine. This is a preliminary patch to introduce the selftests for new xdp frags ebpf helpers Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/b7c0e425a9287f00f601c4fc0de54738ec6ceeea.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: move user_size out of bpf_test_init	Lorenzo Bianconi	1	-6/+7
	Rely on data_size_in in bpf_test_init routine signature. This is a preliminary patch to introduce xdp frags selftest Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/6b48d38ed3d60240d7d6bb15e6fa7fabfac8dfb2.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: add frags support to xdp copy helpers	Eelco Chaudron	4	-36/+137
	This patch adds support for frags for the following helpers: - bpf_xdp_output() - bpf_perf_event_output() Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/340b4a99cdc24337b40eaf8bb597f9f9e7b0373e.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: add frags support to the bpf_xdp_adjust_tail() API	Eelco Chaudron	4	-8/+88
	This change adds support for tail growing and shrinking for XDP frags. When called on a non-linear packet with a grow request, it will work on the last fragment of the packet. So the maximum grow size is the last fragments tailroom, i.e. no new buffer will be allocated. A XDP frags capable driver is expected to set frag_size in xdp_rxq_info data structure to notify the XDP core the fragment size. frag_size set to 0 is interpreted by the XDP core as tail growing is not allowed. Introduce __xdp_rxq_info_reg utility routine to initialize frag_size field. When shrinking, it will work from the last fragment, all the way down to the base buffer depending on the shrinking size. It's important to mention that once you shrink down the fragment(s) are freed, so you can not grow again to the original size. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/eabda3485dda4f2f158b477729337327e609461d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: introduce bpf_xdp_get_buff_len helper	Lorenzo Bianconi	4	-0/+43
	Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer total size (linear and paged area) Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: mvneta: enable jumbo frames if the loaded XDP program support frags	Lorenzo Bianconi	1	-4/+9
	Enable the capability to receive jumbo frames even if the interface is running in XDP mode if the loaded program declare to properly support xdp frags. At same time reject a xdp program not supporting xdp frags if the driver is running in xdp frags mode. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/6909f81a3cbb8fb6b88e914752c26395771b882a.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program	Lorenzo Bianconi	4	-1/+14
	Introduce BPF_F_XDP_HAS_FRAGS and the related field in bpf_prog_aux in order to notify the driver the loaded program support xdp frags. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/db2e8075b7032a356003f407d1b0deb99adaa0ed.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: mvneta: add frags support to XDP_TX	Lorenzo Bianconi	1	-36/+76
	Introduce the capability to map non-linear xdp buffer running mvneta_xdp_submit_frame() for XDP_TX and XDP_REDIRECT Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/5d46ab63870ffe96fb95e6075a7ff0c81ef6424d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	xdp: add frags support to xdp_return_{buff/frame}	Lorenzo Bianconi	2	-3/+69
	Take into account if the received xdp_buff/xdp_frame is non-linear recycling/returning the frame memory to the allocator or into xdp_frame_bulk. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/a961069febc868508ce1bdf5e53a343eb4e57cb2.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: marvell: rely on xdp_update_skb_shared_info utility routine	Lorenzo Bianconi	1	-13/+10
	Rely on xdp_update_skb_shared_info routine in order to avoid resetting frags array in skb_shared_info structure building the skb in mvneta_swbm_build_skb(). Frags array is expected to be initialized by the receiving driver building the xdp_buff and here we just need to update memory metadata. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/e0dad97f5d02b13f189f99f1e5bc8e61bef73412.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: xdp: add xdp_update_skb_shared_info utility routine	Lorenzo Bianconi	2	-1/+44
	Introduce xdp_update_skb_shared_info routine to update frags array metadata in skb_shared_info data structure converting to a skb from a xdp_buff or xdp_frame. According to the current skb_shared_info architecture in xdp_frame/xdp_buff and to the xdp frags support, there is no need to run skb_add_rx_frag() and reset frags array converting the buffer to a skb since the frag array will be in the same position for xdp_buff/xdp_frame and for the skb, we just need to update memory metadata. Introduce XDP_FLAGS_PF_MEMALLOC flag in xdp_buff_flags in order to mark the xdp_buff or xdp_frame as under memory-pressure if pages of the frags array are under memory pressure. Doing so we can avoid looping over all fragments in xdp_update_skb_shared_info routine. The driver is expected to set the flag constructing the xdp_buffer using xdp_buff_set_frag_pfmemalloc utility routine. Rely on xdp_update_skb_shared_info in __xdp_build_skb_from_frame routine converting the non-linear xdp_frame to a skb after performing a XDP_REDIRECT. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/bfd23fb8a8d7438724f7819c567cdf99ffd6226f.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: mvneta: simplify mvneta_swbm_add_rx_fragment management	Lorenzo Bianconi	1	-27/+15
	Relying on xdp frags bit, remove skb_shared_info structure allocated on the stack in mvneta_rx_swbm routine and simplify mvneta_swbm_add_rx_fragment accessing skb_shared_info in the xdp_buff structure directly. There is no performance penalty in this approach since mvneta_swbm_add_rx_fragment is run just for xdp frags use-case. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/45f050c094ccffce49d6bc5112939ed35250ba90.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: mvneta: update frags bit before passing the xdp buffer to eBPF layer	Lorenzo Bianconi	1	-5/+18
	Update frags bit (XDP_FLAGS_HAS_FRAGS) in xdp_buff to notify XDP/eBPF layer and XDP remote drivers if this is a "non-linear" XDP buffer. Access skb_shared_info only if XDP_FLAGS_HAS_FRAGS flag is set in order to avoid possible cache-misses. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c00a73097f8a35860d50dae4a36e6cc9ef7e172f.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	xdp: introduce flags field in xdp_buff/xdp_frame	Lorenzo Bianconi	1	-0/+29
	Introduce flags field in xdp_frame and xdp_buffer data structures to define additional buffer features. At the moment the only supported buffer feature is frags bit (XDP_FLAGS_HAS_FRAGS). frags bit is used to specify if this is a linear buffer (XDP_FLAGS_HAS_FRAGS not set) or a frags frame (XDP_FLAGS_HAS_FRAGS set). In the latter case the driver is expected to initialize the skb_shared_info structure at the end of the first buffer to link together subsequent buffers belonging to the same frame. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/e389f14f3a162c0a5bc6a2e1aa8dd01a90be117d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: skbuff: add size metadata to skb_shared_info for xdp	Lorenzo Bianconi	1	-0/+1
	Introduce xdp_frags_size field in skb_shared_info data structure to store xdp_buff/xdp_frame frame paged size (xdp_frags_size will be used in xdp frags support). In order to not increase skb_shared_info size we will use a hole due to skb_shared_info alignment. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/8a849819a3e0a143d540f78a3a5add76e17e980d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	selftests: bpf: test BPF_PROG_QUERY for progs attached to sockmap	Di Zhu	2	-0/+90
	Add test for querying progs attached to sockmap. we use an existing libbpf query interface to query prog cnt before and after progs attaching to sockmap and check whether the queried prog id is right. Signed-off-by: Di Zhu <zhudi2@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220119014005.1209-2-zhudi2@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	bpf: support BPF_PROG_QUERY for progs attached to sockmap	Di Zhu	3	-7/+84
	Right now there is no way to query whether BPF programs are attached to a sockmap or not. we can use the standard interface in libbpf to query, such as: bpf_prog_query(mapFd, BPF_SK_SKB_STREAM_PARSER, 0, NULL, ...); the mapFd is the fd of sockmap. Signed-off-by: Di Zhu <zhudi2@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220119014005.1209-1-zhudi2@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	samples/bpf: adapt samples/bpf to bpf_xdp_xxx() APIs	Andrii Nakryiko	11	-41/+40
	Use new bpf_xdp_*() APIs across all XDP-related BPF samples. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120061422.2710637-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	selftests/bpf: switch to new libbpf XDP APIs	Andrii Nakryiko	7	-50/+47
	Switch to using new bpf_xdp_*() APIs across all selftests. Take advantage of a more straightforward and user-friendly semantics of old_prog_fd (0 means "don't care") in few places. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120061422.2710637-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	bpftool: use new API for attaching XDP program	Andrii Nakryiko	1	-1/+1
	Switch to new bpf_xdp_attach() API to avoid deprecation warnings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120061422.2710637-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	libbpf: streamline low-level XDP APIs	Andrii Nakryiko	3	-33/+117
	Introduce 4 new netlink-based XDP APIs for attaching, detaching, and querying XDP programs: - bpf_xdp_attach; - bpf_xdp_detach; - bpf_xdp_query; - bpf_xdp_query_id. These APIs replace bpf_set_link_xdp_fd, bpf_set_link_xdp_fd_opts, bpf_get_link_xdp_id, and bpf_get_link_xdp_info APIs ([0]). The latter don't follow a consistent naming pattern and some of them use non-extensible approaches (e.g., struct xdp_link_info which can't be modified without breaking libbpf ABI). The approach I took with these low-level XDP APIs is similar to what we did with low-level TC APIs. There is a nice duality of bpf_tc_attach vs bpf_xdp_attach, and so on. I left bpf_xdp_attach() to support detaching when -1 is specified for prog_fd for generality and convenience, but bpf_xdp_detach() is preferred due to clearer naming and associated semantics. Both bpf_xdp_attach() and bpf_xdp_detach() accept the same opts struct allowing to specify expected old_prog_fd. While doing the refactoring, I noticed that old APIs require users to specify opts with old_fd == -1 to declare "don't care about already attached XDP prog fd" condition. Otherwise, FD 0 is assumed, which is essentially never an intended behavior. So I made this behavior consistent with other kernel and libbpf APIs, in which zero FD means "no FD". This seems to be more in line with the latest thinking in BPF land and should cause less user confusion, hopefully. For querying, I left two APIs, both more generic bpf_xdp_query() allowing to query multiple IDs and attach mode, but also a specialization of it, bpf_xdp_query_id(), which returns only requested prog_id. Uses of prog_id returning bpf_get_link_xdp_id() were so prevalent across selftests and samples, that it seemed a very common use case and using bpf_xdp_query() for doing it felt very cumbersome with a highly branches if/else chain based on flags and attach mode. Old APIs are scheduled for deprecation in libbpf 0.8 release. [0] Closes: https://github.com/libbpf/libbpf/issues/309 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20220120061422.2710637-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	docs/bpf: update BPF map definition example	Andrii Nakryiko	1	-18/+14
	Use BTF-defined map definition in the documentation example. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	libbpf: deprecate legacy BPF map definitions	Andrii Nakryiko	5	-2/+26
	Enact deprecation of legacy BPF map definition in SEC("maps") ([0]). For the definitions themselves introduce LIBBPF_STRICT_MAP_DEFINITIONS flag for libbpf strict mode. If it is set, error out on any struct bpf_map_def-based map definition. If not set, libbpf will print out a warning for each legacy BPF map to raise awareness that it goes away. For any use of BPF_ANNOTATE_KV_PAIR() macro providing a legacy way to associate BTF key/value type information with legacy BPF map definition, warn through libbpf's pr_warn() error message (but don't fail BPF object open). BPF-side struct bpf_map_def is marked as deprecated. User-space struct bpf_map_def has to be used internally in libbpf, so it is left untouched. It should be enough for bpf_map__def() to be marked deprecated to raise awareness that it goes away. bpftool is an interesting case that utilizes libbpf to open BPF ELF object to generate skeleton. As such, even though bpftool itself uses full on strict libbpf mode (LIBBPF_STRICT_ALL), it has to relax it a bit for BPF map definition handling to minimize unnecessary disruptions. So opt-out of LIBBPF_STRICT_MAP_DEFINITIONS for bpftool. User's code that will later use generated skeleton will make its own decision whether to enforce LIBBPF_STRICT_MAP_DEFINITIONS or not. There are few tests in selftests/bpf that are consciously using legacy BPF map definitions to test libbpf functionality. For those, temporary opt out of LIBBPF_STRICT_MAP_DEFINITIONS mode for the duration of those tests. [0] Closes: https://github.com/libbpf/libbpf/issues/272 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	selftests/bpf: convert remaining legacy map definitions	Andrii Nakryiko	8	-42/+48
	Converted few remaining legacy BPF map definition to BTF-defined ones. For the remaining two bpf_map_def-based legacy definitions that we want to keep for testing purposes until libbpf 1.0 release, guard them in pragma to suppres deprecation warnings which will be added in libbpf in the next commit. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	selftests/bpf: fail build on compilation warning	Andrii Nakryiko	1	-2/+2
	It's very easy to miss compilation warnings without -Werror, which is not set for selftests. libbpf and bpftool are already strict about this, so make selftests/bpf also treat compilation warnings as errors to catch such regressions early. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-20	selftests/bpf: Do not fail build if CONFIG_NF_CONNTRACK=m/n	Kumar Kartikeya Dwivedi	1	-4/+13
	Some users have complained that selftests fail to build when CONFIG_NF_CONNTRACK=m. It would be useful to allow building as long as it is set to module or built-in, even though in case of building as module, user would need to load it before running the selftest. Note that this also allows building selftest when CONFIG_NF_CONNTRACK is disabled. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220120164932.2798544-1-memxor@gmail.com
2022-01-20	selftests: bpf: Fix bind on used port	Felix Maurer	1	-3/+17
	The bind_perm BPF selftest failed when port 111/tcp was already in use during the test. To fix this, the test now runs in its own network name space. To use unshare, it is necessary to reorder the includes. The style of the includes is adapted to be consistent with the other prog_tests. v2: Replace deprecated CHECK macro with ASSERT_OK Fixes: 8259fdeb30326 ("selftests/bpf: Verify that rebinding to port < 1024 from BPF works") Signed-off-by: Felix Maurer <fmaurer@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/551ee65533bb987a43f93d88eaf2368b416ccd32.1642518457.git.fmaurer@redhat.com
2022-01-20	bpf: selftests: Get rid of CHECK macro in xdp_bpf2bpf.c	Lorenzo Bianconi	1	-40/+20
	Rely on ASSERT* macros and get rid of deprecated CHECK ones in xdp_bpf2bpf bpf selftest. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/df7e5098465016e27d91f2c69a376a35d63a7621.1642679130.git.lorenzo@kernel.org
2022-01-20	bpf: selftests: Get rid of CHECK macro in xdp_adjust_tail.c	Lorenzo Bianconi	1	-40/+28
	Rely on ASSERT* macros and get rid of deprecated CHECK ones in xdp_adjust_tail bpf selftest. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/c0ab002ffa647a20ec9e584214bf0d4373142b54.1642679130.git.lorenzo@kernel.org
2022-01-19	selftests/bpf: Update sockopt_sk test to the use bpf_set_retval	YiFei Zhu	2	-18/+18
	The tests would break without this patch, because at one point it calls getsockopt(fd, SOL_TCP, TCP_ZEROCOPY_RECEIVE, &buf, &optlen) This getsockopt receives the kernel-set -EINVAL. Prior to this patch series, the eBPF getsockopt hook's -EPERM would override kernel's -EINVAL, however, after this patch series, return 0's automatic -EPERM will not; the eBPF prog has to explicitly bpf_set_retval(-EPERM) if that is wanted. I also removed the explicit mentions of EPERM in the comments in the prog. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/4f20b77cb46812dbc2bdcd7e3fa87c7573bde55e.1639619851.git.zhuyifei@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19	selftests/bpf: Test bpf_{get,set}_retval behavior with cgroup/sockopt	YiFei Zhu	3	-0/+578
	The tests checks how different ways of interacting with the helpers (getting retval, setting EUNATCH, EISCONN, and legacy reject returning 0 without setting retval), produce different results in both the setsockopt syscall and the retval returned by the helper. A few more tests verify the interaction between the retval of the helper and the retval in getsockopt context. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/43ec60d679ae3f4f6fd2460559c28b63cb93cd12.1639619851.git.zhuyifei@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19	bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall return value	YiFei Zhu	4	-5/+79
	The helpers continue to use int for retval because all the hooks are int-returning rather than long-returning. The return value of bpf_set_retval is int for future-proofing, in case in the future there may be errors trying to set the retval. After the previous patch, if a program rejects a syscall by returning 0, an -EPERM will be generated no matter if the retval is already set to -err. This patch change it being forced only if retval is not -err. This is because we want to support, for example, invoking bpf_set_retval(-EINVAL) and return 0, and have the syscall return value be -EINVAL not -EPERM. For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is that, if the return value is NET_XMIT_DROP, the packet is silently dropped. We preserve this behavior for backward compatibility reasons, so even if an errno is set, the errno does not return to caller. However, setting a non-err to retval cannot propagate so this is not allowed and we return a -EFAULT in that case. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/b4013fd5d16bed0b01977c1fafdeae12e1de61fb.1639619851.git.zhuyifei@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19	bpf: Move getsockopt retval to struct bpf_cg_run_ctx	YiFei Zhu	3	-44/+63
	The retval value is moved to struct bpf_cg_run_ctx for ease of access in different prog types with different context structs layouts. The helper implementation (to be added in a later patch in the series) can simply perform a container_of from current->bpf_ctx to retrieve bpf_cg_run_ctx. Unfortunately, there is no easy way to access the current task_struct via the verifier BPF bytecode rewrite, aside from possibly calling a helper, so a pointer to current task is added to struct bpf_sockopt_kern so that the rewritten BPF bytecode can access struct bpf_cg_run_ctx with an indirection. For backward compatibility, if a getsockopt program rejects a syscall by returning 0, an -EPERM will be generated, by having the BPF_PROG_RUN_ARRAY_CG family macros automatically set the retval to -EPERM. Unlike prior to this patch, this -EPERM will be visible to ctx->retval for any other hooks down the line in the prog array. Additionally, the restriction that getsockopt filters can only set the retval to 0 is removed, considering that certain getsockopt implementations may return optlen. Filters are now able to set the value arbitrarily. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/73b0325f5c29912ccea7ea57ec1ed4d388fc1d37.1639619851.git.zhuyifei@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19	bpf: Make BPF_PROG_RUN_ARRAY return -err instead of allow boolean	YiFei Zhu	3	-34/+25
	Right now BPF_PROG_RUN_ARRAY and related macros return 1 or 0 for whether the prog array allows or rejects whatever is being hooked. The caller of these macros then return -EPERM or continue processing based on thw macro's return value. Unforunately this is inflexible, since -EPERM is the only err that can be returned. This patch should be a no-op; it prepares for the next patch. The returning of the -EPERM is moved to inside the macros, so the outer functions are directly returning what the macros returned if they are non-zero. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/788abcdca55886d1f43274c918eaa9f792a9f33b.1639619851.git.zhuyifei@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-19	libbpf: Improve btf__add_btf() with an additional hashmap for strings.	Kui-Feng Lee	1	-1/+30
	Add a hashmap to map the string offsets from a source btf to the string offsets from a target btf to reduce overheads. btf__add_btf() calls btf__add_str() to add strings from a source to a target btf. It causes many string comparisons, and it is a major hotspot when adding a big btf. btf__add_str() uses strcmp() to check if a hash entry is the right one. The extra hashmap here compares offsets of strings, that are much cheaper. It remembers the results of btf__add_str() for later uses to reduce the cost. We are parallelizing BTF encoding for pahole by creating separated btf instances for worker threads. These per-thread btf instances will be added to the btf instance of the main thread by calling btf__add_str() to deduplicate and write out. With this patch and -j4, the running time of pahole drops to about 6.0s from 6.6s. The following lines are the summary of 'perf stat' w/o the change. 6.668126396 seconds time elapsed 13.451054000 seconds user 0.715520000 seconds sys The following lines are the summary w/ the change. 5.986973919 seconds time elapsed 12.939903000 seconds user 0.724152000 seconds sys V4 fixes a bug of error checking against the pointer returned by hashmap__new(). [v3] https://lore.kernel.org/bpf/20220118232053.2113139-1-kuifeng@fb.com/ [v2] https://lore.kernel.org/bpf/20220114193713.461349-1-kuifeng@fb.com/ Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220119180214.255634-1-kuifeng@fb.com
2022-01-19	bpf/scripts: Raise an exception if the correct number of sycalls are not generated	Usama Arif	1	-27/+59
	Currently the syscalls rst and subsequently man page are auto-generated using function documentation present in bpf.h. If the documentation for the syscall is missing or doesn't follow a specific format, then that syscall is not dumped in the auto-generated rst. This patch checks the number of syscalls documented within the header file with those present as part of the enum bpf_cmd and raises an Exception if they don't match. It is not needed with the currently documented upstream syscalls, but can help in debugging when developing new syscalls when there might be missing or misformatted documentation. The function helper_number_check is moved to the Printer parent class and renamed to elem_number_check as all the most derived children classes are using this function now. Signed-off-by: Usama Arif <usama.arif@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220119114442.1452088-3-usama.arif@bytedance.com
2022-01-19	bpf/scripts: Make description and returns section for helpers/syscalls mandatory	Usama Arif	1	-12/+18
	This enforce a minimal formatting consistency for the documentation. The description and returns missing for a few helpers have also been added. Signed-off-by: Usama Arif <usama.arif@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220119114442.1452088-2-usama.arif@bytedance.com
2022-01-19	uapi/bpf: Add missing description and returns for helper documentation	Usama Arif	2	-0/+30
	Both description and returns section will become mandatory for helpers and syscalls in a later commit to generate man pages. This commit also adds in the documentation that BPF_PROG_RUN is an alias for BPF_PROG_TEST_RUN for anyone searching for the syscall in the generated man pages. Signed-off-by: Usama Arif <usama.arif@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220119114442.1452088-1-usama.arif@bytedance.com
2022-01-19	bpftool: Adding support for BTF program names	Raman Shukhau	4	-12/+70
	`bpftool prog list` and other bpftool subcommands that show BPF program names currently get them from bpf_prog_info.name. That field is limited to 16 (BPF_OBJ_NAME_LEN) chars which leads to truncated names since many progs have much longer names. The idea of this change is to improve all bpftool commands that output prog name so that bpftool uses info from BTF to print program names if available. It tries bpf_prog_info.name first and fall back to btf only if the name is suspected to be truncated (has 15 chars length). Right now `bpftool p show id <id>` returns capped prog name <id>: kprobe name example_cap_cap tag 712e... ... With this change it would return <id>: kprobe name example_cap_capable tag 712e... ... Note, other commands that print prog names (e.g. "bpftool cgroup tree") are also addressed in this change. Signed-off-by: Raman Shukhau <ramasha@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220119100255.1068997-1-ramasha@fb.com
2022-01-18	libbpf: Define BTF_KIND_* constants in btf.h to avoid compilation errors	Toke Høiland-Jørgensen	1	-1/+21
	The btf.h header included with libbpf contains inline helper functions to check for various BTF kinds. These helpers directly reference the BTF_KIND_* constants defined in the kernel header, and because the header file is included in user applications, this happens in the user application compile units. This presents a problem if a user application is compiled on a system with older kernel headers because the constants are not available. To avoid this, add #defines of the constants directly in btf.h before using them. Since the kernel header moved to an enum for BTF_KIND_*, the #defines can shadow the enum values without any errors, so we only need #ifndef guards for the constants that predates the conversion to enum. We group these so there's only one guard for groups of values that were added together. [0] Closes: https://github.com/libbpf/libbpf/issues/436 Fixes: 223f903e9c83 ("bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG") Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/bpf/20220118141327.34231-1-toke@redhat.com