wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2023-06-13	bpf: Verify scalar ids mapping in regsafe() using check_ids()	Eduard Zingerman	2	-29/+79
	Make sure that the following unsafe example is rejected by verifier: 1: r9 = ... some pointer with range X ... 2: r6 = ... unbound scalar ID=a ... 3: r7 = ... unbound scalar ID=b ... 4: if (r6 > r7) goto +1 5: r6 = r7 6: if (r6 > X) goto ... --- checkpoint --- 7: r9 += r7 8: (u64 )r9 = Y This example is unsafe because not all execution paths verify r7 range. Because of the jump at (4) the verifier would arrive at (6) in two states: I. r6{.id=b}, r7{.id=b} via path 1-6; II. r6{.id=a}, r7{.id=b} via path 1-4, 6. Currently regsafe() does not call check_ids() for scalar registers, thus from POV of regsafe() states (I) and (II) are identical. If the path 1-6 is taken by verifier first, and checkpoint is created at (6) the path [1-4, 6] would be considered safe. Changes in this commit: - check_ids() is modified to disallow mapping multiple old_id to the same cur_id. - check_scalar_ids() is added, unlike check_ids() it treats ID zero as a unique scalar ID. - check_scalar_ids() needs to generate temporary unique IDs, field 'tmp_id_gen' is added to bpf_verifier_env::idmap_scratch to facilitate this. - regsafe() is updated to: - use check_scalar_ids() for precise scalar registers. - compare scalar registers using memcmp only for explore_alu_limits branch. This simplifies control flow for scalar case, and has no measurable performance impact. - check_alu_op() is updated to avoid generating bpf_reg_state::id for constant scalar values when processing BPF_MOV. ID is needed to propagate range information for identical values, but there is nothing to propagate for constants. Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230613153824.3324830-4-eddyz87@gmail.com
2023-06-13	selftests/bpf: Check if mark_chain_precision() follows scalar ids	Eduard Zingerman	2	-0/+346
	Check __mark_chain_precision() log to verify that scalars with same IDs are marked as precise. Use several scenarios to test that precision marks are propagated through: - registers of scalar type with the same ID within one state; - registers of scalar type with the same ID cross several states; - registers of scalar type with the same ID cross several stack frames; - stack slot of scalar type with the same ID; - multiple scalar IDs are tracked independently. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230613153824.3324830-3-eddyz87@gmail.com
2023-06-13	bpf: Use scalar ids in mark_chain_precision()	Eduard Zingerman	3	-5/+128
	Change mark_chain_precision() to track precision in situations like below: r2 = unknown value ... --- state #0 --- ... r1 = r2 // r1 and r2 now share the same ID ... --- state #1 {r1.id = A, r2.id = A} --- ... if (r2 > 10) goto exit; // find_equal_scalars() assigns range to r1 ... --- state #2 {r1.id = A, r2.id = A} --- r3 = r10 r3 += r1 // need to mark both r1 and r2 At the beginning of the processing of each state, ensure that if a register with a scalar ID is marked as precise, all registers sharing this ID are also marked as precise. This property would be used by a follow-up change in regsafe(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230613153824.3324830-2-eddyz87@gmail.com
2023-06-12	bpf/docs: Update documentation for new cpumask kfuncs	David Vernet	1	-2/+3
	We recently added the bpf_cpumask_first_and() kfunc, and changed bpf_cpumask_any() / bpf_cpumask_any_and() to bpf_cpumask_any_distribute() and bpf_cpumask_any_distribute_and() respectively. This patch adds an entry for the bpf_cpumask_first_and() kfunc, and updates the documentation for the any kfuncs to the new names. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-5-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12	selftests/bpf: Update bpf_cpumask_any* tests to use bpf_cpumask_any_distribute*	David Vernet	2	-6/+6
	In a prior patch, we removed the bpf_cpumask_any() and bpf_cpumask_any_and() kfuncs, and replaced them with bpf_cpumask_any_distribute() and bpf_cpumask_any_distribute_and(). The advertised semantics between the two kfuncs were identical, with the former always returning the first CPU, and the latter actually returning any CPU. This patch updates the selftests for these kfuncs to use the new names. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-4-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12	bpf: Replace bpf_cpumask_any* with bpf_cpumask_any_distribute*	David Vernet	1	-10/+12
	We currently export the bpf_cpumask_any() and bpf_cpumask_any_and() kfuncs. Intuitively, one would expect these to choose any CPU in the cpumask, but what they actually do is alias to cpumask_first() and cpmkas_first_and(). This is useless given that we already export bpf_cpumask_first() and bpf_cpumask_first_and(), so this patch replaces them with kfuncs that call cpumask_any_distribute() and cpumask_any_and_distribute(), which actually choose any CPU from the cpumask (or the AND of two cpumasks for the latter). Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12	selftests/bpf: Add test for new bpf_cpumask_first_and() kfunc	David Vernet	3	-0/+35
	A prior patch added a new kfunc called bpf_cpumask_first_and() which wraps cpumask_first_and(). This patch adds a selftest to validate its behavior. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12	bpf: Add bpf_cpumask_first_and() kfunc	David Vernet	1	-0/+16
	We currently provide bpf_cpumask_first(), bpf_cpumask_any(), and bpf_cpumask_any_and() kfuncs. bpf_cpumask_any() and bpf_cpumask_any_and() are confusing misnomers in that they actually just call cpumask_first() and cpumask_first_and() respectively. We'll replace them with bpf_cpumask_any_distribute() and bpf_cpumask_any_distribute_and() kfuncs in a subsequent patch, so let's ensure feature parity by adding a bpf_cpumask_first_and() kfunc to account for bpf_cpumask_any_and() being removed. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230610035053.117605-1-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-12	bpf: Hide unused bpf_patch_call_args	Arnd Bergmann	1	-3/+5
	This function is only used when CONFIG_BPF_JIT_ALWAYS_ON is disabled, but CONFIG_BPF_SYSCALL is enabled. When both are turned off, the prototype is missing but the unused function is still compiled, as seen from this W=1 warning: [...] kernel/bpf/core.c:2075:6: error: no previous prototype for 'bpf_patch_call_args' [-Werror=missing-prototypes] [...] Add a matching #ifdef for the definition to leave it out. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20230602135128.1498362-1-arnd@kernel.org
2023-06-12	selftests/bpf: Fix invalid pointer check in get_xlated_program()	Eduard Zingerman	1	-11/+13
	Dan Carpenter reported invalid check for calloc() result in test_verifier.c:get_xlated_program(): ./tools/testing/selftests/bpf/test_verifier.c:1365 get_xlated_program() warn: variable dereferenced before check 'buf' (see line 1364) ./tools/testing/selftests/bpf/test_verifier.c 1363 cnt = xlated_prog_len / buf_element_size; 1364 buf = calloc(cnt, buf_element_size); 1365 if (!buf) { This should be if (!buf) { 1366 perror("can't allocate xlated program buffer"); 1367 return -ENOMEM; This commit refactors the get_xlated_program() to avoid using double pointer type. Fixes: 933ff53191eb ("selftests/bpf: specify expected instructions in test_verifier tests") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/bpf/ZH7u0hEGVB4MjGZq@moroto/ Link: https://lore.kernel.org/bpf/20230609221637.2631800-1-eddyz87@gmail.com
2023-06-08	selftests/bpf: Add missing prototypes for several test kfuncs	Jiri Olsa	2	-8/+15
	Adding missing prototypes for several kfuncs that are used by test_verifier tests. We don't really need kfunc prototypes for these tests, but adding them to silence 'make W=1' build and to have all test kfuncs declarations in bpf_testmod_kfunc.h. Also moving __diag_pop for -Wmissing-prototypes to cover also bpf_testmod_test_write and bpf_testmod_test_read and adding bpf_fentry_shadow_test in there as well. All of them need to be exported, but there's no need for declarations. Fixes: 65eb006d85a2 ("bpf: Move kernel test kfuncs to bpf_testmod") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/oe-kbuild-all/202306051319.EihCQZPs-lkp@intel.com Link: https://lore.kernel.org/bpf/20230607224046.236510-1-jolsa@kernel.org
2023-06-06	bpf: Factor out a common helper free_all()	Hou Tao	1	-15/+16
	Factor out a common helper free_all() to free all normal elements or per-cpu elements on a lock-less list. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230606035310.4026145-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-06	selftests/bpf: Fix check_mtu using wrong variable type	Jesper Dangaard Brouer	1	-1/+1
	Dan Carpenter found via Smatch static checker, that unsigned 'mtu_lo' is never less than zero. Variable mtu_lo should have been an 'int', because read_mtu_device_lo() uses minus as error indications. Fixes: b62eba563229 ("selftests/bpf: Tests using bpf_check_mtu BPF-helper") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/bpf/168605104733.3636467.17945947801753092590.stgit@firesoul
2023-06-06	bpf: Cleanup unused function declaration	Ruiqi Gong	1	-1/+0
	All usage and the definition of `bpf_prog_free_linfo()` has been removed in commit e16301fbe183 ("bpf: Simplify freeing logic in linfo and jited_linfo"). Clean up its declaration in the header file. Signed-off-by: Ruiqi Gong <gongruiqi@huaweicloud.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/all/20230602030842.279262-1-gongruiqi@huaweicloud.com/ Link: https://lore.kernel.org/bpf/20230606021047.170667-1-gongruiqi@huaweicloud.com
2023-06-05	selftests/bpf: Add missing selftests kconfig options	David Vernet	1	-0/+4
	Our selftests of course rely on the kernel being built with CONFIG_DEBUG_INFO_BTF=y, though this (nor its dependencies of CONFIG_DEBUG_INFO=y and CONFIG_DEBUG_INFO_DWARF4=y) are not specified. This causes the wrong kernel to be built, and selftests to similarly fail to build. Additionally, in the BPF selftests kconfig file, CONFIG_NF_CONNTRACK_MARK=y is specified, so that the 'u_int32_t mark' field will be present in the definition of struct nf_conn. While a dependency of CONFIG_NF_CONNTRACK_MARK=y, CONFIG_NETFILTER_ADVANCED=y, should be enabled by default, I've run into instances of CONFIG_NF_CONNTRACK_MARK not being set because CONFIG_NETFILTER_ADVANCED isn't set, and have to manually enable them with make menuconfig. Let's add these missing kconfig options to the file so that the necessary dependencies are in place to build vmlinux. Otherwise, we'll get errors like this when we try to compile selftests and generate vmlinux.h: $ cd /path/to/bpf-next $ make mrproper; make defconfig $ cat tools/testing/selftests/config >> .config $ make -j ... $ cd tools/testing/selftests/bpf $ make clean $ make -j ... LD [M] tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.ko tools/testing/selftests/bpf/tools/build/bpftool/bootstrap/bpftool btf dump file vmlinux format c > tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h libbpf: failed to find '.BTF' ELF section in vmlinux Error: failed to load BTF from bpf-next/vmlinux: No data available make[1]: * [Makefile:208: tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h] Error 195 make[1]: * Deleting file 'tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h' make: *** [Makefile:261: tools/testing/selftests/bpf/tools/sbin/bpftool] Error 2 Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230602140108.1177900-1-void@manifault.com
2023-06-05	tools/resolve_btfids: Fix setting HOSTCFLAGS	Viktor Malik	1	-2/+2
	Building BPF selftests with custom HOSTCFLAGS yields an error: # make HOSTCFLAGS="-O2" [...] HOSTCC ./tools/testing/selftests/bpf/tools/build/resolve_btfids/main.o main.c:73:10: fatal error: linux/rbtree.h: No such file or directory 73 \| #include <linux/rbtree.h> \| ^~~~~~~~~~~~~~~~ The reason is that tools/bpf/resolve_btfids/Makefile passes header include paths by extending HOSTCFLAGS which is overridden by setting HOSTCFLAGS in the make command (because of Makefile rules [1]). This patch fixes the above problem by passing the include paths via `HOSTCFLAGS_resolve_btfids` which is used by tools/build/Build.include and can be combined with overridding HOSTCFLAGS. [1] https://www.gnu.org/software/make/manual/html_node/Overriding.html Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program") Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230530123352.1308488-1-vmalik@redhat.com
2023-06-05	selftests/bpf: Add test for non-NULLable PTR_TO_BTF_IDs	David Vernet	2	-0/+25
	In a recent patch, we taught the verifier that trusted PTR_TO_BTF_ID can never be NULL. This prevents the verifier from incorrectly failing to load certain programs where it gets confused and thinks a reference isn't dropped because it incorrectly assumes that a branch exists in which a NULL PTR_TO_BTF_ID pointer is never released. This patch adds a testcase that verifies this cannot happen. Signed-off-by: David Vernet <void@manifault.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230602150112.1494194-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-05	bpf: Teach verifier that trusted PTR_TO_BTF_ID pointers are non-NULL	David Vernet	1	-2/+7
	In reg_type_not_null(), we currently assume that a pointer may be NULL if it has the PTR_MAYBE_NULL modifier, or if it doesn't belong to one of several base type of pointers that are never NULL-able. For example, PTR_TO_CTX, PTR_TO_MAP_VALUE, etc. It turns out that in some cases, PTR_TO_BTF_ID can never be NULL as well, though we currently don't specify it. For example, if you had the following program: SEC("tc") long example_refcnt_fail(void ctx) { struct bpf_cpumask mask1, mask2; mask1 = bpf_cpumask_create(); mask2 = bpf_cpumask_create(); if (!mask1 \|\| !mask2) goto error_release; bpf_cpumask_test_cpu(0, (const struct cpumask )mask1); bpf_cpumask_test_cpu(0, (const struct cpumask *)mask2); error_release: if (mask1) bpf_cpumask_release(mask1); if (mask2) bpf_cpumask_release(mask2); return ret; } The verifier will incorrectly fail to load the program, thinking (unintuitively) that we have a possibly-unreleased reference if the mask is NULL, because we (correctly) don't issue a bpf_cpumask_release() on the NULL path. The reason the verifier gets confused is due to the fact that we don't explicitly tell the verifier that trusted PTR_TO_BTF_ID pointers can never be NULL. Basically, if we successfully get past the if check (meaning both pointers go from ptr_or_null_bpf_cpumask to ptr_bpf_cpumask), the verifier will correctly assume that the references need to be dropped on any possible branch that leads to program exit. However, it will _incorrectly_ think that the ptr == NULL branch is possible, and will erroneously detect it as a branch on which we failed to drop the reference. The solution is of course to teach the verifier that trusted PTR_TO_BTF_ID pointers can never be NULL, so that it doesn't incorrectly think it's possible for the reference to be present on the ptr == NULL branch. A follow-on patch will add a selftest that verifies this behavior. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230602150112.1494194-1-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-05	bpf: Replace open code with for allocated object check	Daniel T. Lee	1	-2/+2
	>From commit 282de143ead9 ("bpf: Introduce allocated objects support"), With this allocated object with BPF program, (PTR_TO_BTF_ID \| MEM_ALLOC) has been a way of indicating to check the type is the allocated object. commit d8939cb0a03c ("bpf: Loosen alloc obj test in verifier's reg_btf_record") >From the commit, there has been helper function for checking this, named type_is_ptr_alloc_obj(). But still, some of the code use open code to retrieve this info. This commit replaces the open code with the type_is_alloc(), and the type_is_ptr_alloc_obj() function. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230527122706.59315-1-danieltimlee@gmail.com
2023-06-05	bpf/xdp: optimize bpf_xdp_pointer to avoid reading sinfo	Jesper Dangaard Brouer	1	-3/+4
	Currently we observed a significant performance degradation in samples/bpf xdp1 and xdp2, due XDP multibuffer "xdp.frags" handling, added in commit 772251742262 ("samples/bpf: fixup some tools to be able to support xdp multibuffer"). This patch reduce the overhead by avoiding to read/load shared_info (sinfo) memory area, when XDP packet don't have any frags. This improves performance because sinfo is located in another cacheline. Function bpf_xdp_pointer() is used by BPF helpers bpf_xdp_load_bytes() and bpf_xdp_store_bytes(). As a help to reviewers, xdp_get_buff_len() can potentially access sinfo, but it uses xdp_buff_has_frags() flags bit check to avoid accessing sinfo in no-frags case. The likely/unlikely instrumentation lays out asm code such that sinfo access isn't interleaved with no-frags case (checked on GCC 12.2.1-4). The generated asm code is more compact towards the no-frags case. The BPF kfunc bpf_dynptr_slice() also use bpf_xdp_pointer(). Thus, it should also take effect for that. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/168563651438.3436004.17735707525651776648.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-05	bpf: Make bpf_refcount_acquire fallible for non-owning refs	Dave Marchevsky	4	-11/+29
	This patch fixes an incorrect assumption made in the original bpf_refcount series [0], specifically that the BPF program calling bpf_refcount_acquire on some node can always guarantee that the node is alive. In that series, the patch adding failure behavior to rbtree_add and list_push_{front, back} breaks this assumption for non-owning references. Consider the following program: n = bpf_kptr_xchg(&mapval, NULL); /* skip error checking / bpf_spin_lock(&l); if(bpf_rbtree_add(&t, &n->rb, less)) { bpf_refcount_acquire(n); / Failed to add, do something else with the node */ } bpf_spin_unlock(&l); It's incorrect to assume that bpf_refcount_acquire will always succeed in this scenario. bpf_refcount_acquire is being called in a critical section here, but the lock being held is associated with rbtree t, which isn't necessarily the lock associated with the tree that the node is already in. So after bpf_rbtree_add fails to add the node and calls bpf_obj_drop in it, the program has no ownership of the node's lifetime. Therefore the node's refcount can be decr'd to 0 at any time after the failing rbtree_add. If this happens before the refcount_acquire above, the node might be free'd, and regardless refcount_acquire will be incrementing a 0 refcount. Later patches in the series exercise this scenario, resulting in the expected complaint from the kernel (without this patch's changes): refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 207 at lib/refcount.c:25 refcount_warn_saturate+0xbc/0x110 Modules linked in: bpf_testmod(O) CPU: 1 PID: 207 Comm: test_progs Tainted: G O 6.3.0-rc7-02231-g723de1a718a2-dirty #371 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0xbc/0x110 Code: 6f 64 f6 02 01 e8 84 a3 5c ff 0f 0b eb 9d 80 3d 5e 64 f6 02 00 75 94 48 c7 c7 e0 13 d2 82 c6 05 4e 64 f6 02 01 e8 64 a3 5c ff <0f> 0b e9 7a ff ff ff 80 3d 38 64 f6 02 00 0f 85 6d ff ff ff 48 c7 RSP: 0018:ffff88810b9179b0 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: 0000000000000202 RSI: 0000000000000008 RDI: ffffffff857c3680 RBP: ffff88810027d3c0 R08: ffffffff8125f2a4 R09: ffff88810b9176e7 R10: ffffed1021722edc R11: 746e756f63666572 R12: ffff88810027d388 R13: ffff88810027d3c0 R14: ffffc900005fe030 R15: ffffc900005fe048 FS: 00007fee0584a700(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005634a96f6c58 CR3: 0000000108ce9002 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> bpf_refcount_acquire_impl+0xb5/0xc0 (rest of output snipped) The patch addresses this by changing bpf_refcount_acquire_impl to use refcount_inc_not_zero instead of refcount_inc and marking bpf_refcount_acquire KF_RET_NULL. For owning references, though, we know the above scenario is not possible and thus that bpf_refcount_acquire will always succeed. Some verifier bookkeeping is added to track "is input owning ref?" for bpf_refcount_acquire calls and return false from is_kfunc_ret_null for bpf_refcount_acquire on owning refs despite it being marked KF_RET_NULL. Existing selftests using bpf_refcount_acquire are modified where necessary to NULL-check its return value. [0]: https://lore.kernel.org/bpf/20230415201811.343116-1-davemarchevsky@fb.com/ Fixes: d2dcc67df910 ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail") Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230602022647.1571784-5-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-05	bpf: Fix __bpf_{list,rbtree}_add's beginning-of-node calculation	Dave Marchevsky	1	-2/+2
	Given the pointer to struct bpf_{rb,list}_node within a local kptr and the byte offset of that field within the kptr struct, the calculation changed by this patch is meant to find the beginning of the kptr so that it can be passed to bpf_obj_drop. Unfortunately instead of doing ptr_to_kptr = ptr_to_node_field - offset_bytes the calculation is erroneously doing ptr_to_ktpr = ptr_to_node_field - (offset_bytes * sizeof(struct bpf_rb_node)) or the bpf_list_node equivalent. This patch fixes the calculation. Fixes: d2dcc67df910 ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail") Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230602022647.1571784-4-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-05	bpf: Set kptr_struct_meta for node param to list and rbtree insert funcs	Dave Marchevsky	1	-0/+3
	In verifier.c, fixup_kfunc_call uses struct bpf_insn_aux_data's kptr_struct_meta field to pass information about local kptr types to various helpers and kfuncs at runtime. The recent bpf_refcount series added a few functions to the set that need this information: * bpf_refcount_acquire * Needs to know where the refcount field is in order to increment * Graph collection insert kfuncs: bpf_rbtree_add, bpf_list_push_{front,back} * Were migrated to possibly fail by the bpf_refcount series. If insert fails, the input node is bpf_obj_drop'd. bpf_obj_drop needs the kptr_struct_meta in order to decr refcount and properly free special fields. Unfortunately the verifier handling of collection insert kfuncs was not modified to actually populate kptr_struct_meta. Accordingly, when the node input to those kfuncs is passed to bpf_obj_drop, it is done so without the information necessary to decr refcount. This patch fixes the issue by populating kptr_struct_meta for those kfuncs. Fixes: d2dcc67df910 ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail") Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230602022647.1571784-3-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-06-01	selftests/bpf: Test table ID fib lookup BPF helper	Louis DeLosSantos	1	-8/+53
	Add additional test cases to `fib_lookup.c` prog_test. These test cases add a new /24 network to the previously unused veth2 device, removes the directly connected route from the main routing table and moves it to table 100. The first test case then confirms a fib lookup for a remote address in this directly connected network, using the main routing table fails. The second test case ensures the same fib lookup using table 100 succeeds. An additional pair of tests which function in the same manner are added for IPv6. Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-2-0a31c22c748c@gmail.com
2023-06-01	bpf: Add table ID to bpf_fib_lookup BPF helper	Louis DeLosSantos	3	-7/+49
	Add ability to specify routing table ID to the `bpf_fib_lookup` BPF helper. A new field `tbid` is added to `struct bpf_fib_lookup` used as parameters to the `bpf_fib_lookup` BPF helper. When the helper is called with the `BPF_FIB_LOOKUP_DIRECT` and `BPF_FIB_LOOKUP_TBID` flags the `tbid` field in `struct bpf_fib_lookup` will be used as the table ID for the fib lookup. If the `tbid` does not exist the fib lookup will fail with `BPF_FIB_LKUP_RET_NOT_FWDED`. The `tbid` field becomes a union over the vlan related output fields in `struct bpf_fib_lookup` and will be zeroed immediately after usage. This functionality is useful in containerized environments. For instance, if a CNI wants to dictate the next-hop for traffic leaving a container it can create a container-specific routing table and perform a fib lookup against this table in a "host-net-namespace-side" TC program. This functionality also allows `ip rule` like functionality at the TC layer, allowing an eBPF program to pick a routing table based on some aspect of the sk_buff. As a concrete use case, this feature will be used in Cilium's SRv6 L3VPN datapath. When egress traffic leaves a Pod an eBPF program attached by Cilium will determine which VRF the egress traffic should target, and then perform a FIB lookup in a specific table representing this VRF's FIB. Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-1-0a31c22c748c@gmail.com
2023-05-31	samples/bpf: xdp1 and xdp2 reduce XDPBUFSIZE to 60	Jesper Dangaard Brouer	2	-2/+2
	Default samples/pktgen scripts send 60 byte packets as hardware adds 4-bytes FCS checksum, which fulfils minimum Ethernet 64 bytes frame size. XDP layer will not necessary have access to the 4-bytes FCS checksum. This leads to bpf_xdp_load_bytes() failing as it tries to copy 64-bytes from an XDP packet that only have 60-bytes available. Fixes: 772251742262 ("samples/bpf: fixup some tools to be able to support xdp multibuffer") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/bpf/168545704139.2996228.2516528552939485216.stgit@firesoul
2023-05-31	net: Use umd_cleanup_helper()	Jarkko Sakkinen	3	-12/+2
	bpfilter_umh_cleanup() is the same function as umd_cleanup_helper(). Drop the redundant function. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230526112104.1044686-1-jarkko@kernel.org
2023-05-31	bpf: Replace all non-returning strlcpy with strscpy	Azeem Shaikh	1	-2/+2
	strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. This is not the case here, however, in an effort to remove strlcpy() completely [2], lets replace strlcpy() here with strscpy(). No return values were used, so a direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/bpf/20230530155659.309657-1-azeemshaikh38@gmail.com
2023-05-31	bpf/tests: Use struct_size()	Su Hui	1	-2/+1
	Use struct_size() instead of hand writing it. This is less verbose and more informative. Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20230531043251.989312-1-suhui@nfschina.com
2023-05-30	selftests/bpf: Add a test where map key_type_id with decl_tag type	Yonghong Song	1	-0/+40
	Add two selftests where map creation key/value type_id's are decl_tags. Without previous patch, kernel warnings will appear similar to the one in the previous patch. With the previous patch, both kernel warnings are silenced. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230530205034.266643-1-yhs@fb.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-05-30	bpf: Silence a warning in btf_type_id_size()	Yonghong Song	1	-9/+10
	syzbot reported a warning in [1] with the following stacktrace: WARNING: CPU: 0 PID: 5005 at kernel/bpf/btf.c:1988 btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988 ... RIP: 0010:btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988 ... Call Trace: <TASK> map_check_btf kernel/bpf/syscall.c:1024 [inline] map_create+0x1157/0x1860 kernel/bpf/syscall.c:1198 __sys_bpf+0x127f/0x5420 kernel/bpf/syscall.c:5040 __do_sys_bpf kernel/bpf/syscall.c:5162 [inline] __se_sys_bpf kernel/bpf/syscall.c:5160 [inline] __x64_sys_bpf+0x79/0xc0 kernel/bpf/syscall.c:5160 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd With the following btf [1] DECL_TAG 'a' type_id=4 component_idx=-1 [2] PTR '(anon)' type_id=0 [3] TYPE_TAG 'a' type_id=2 [4] VAR 'a' type_id=3, linkage=static and when the bpf_attr.btf_key_type_id = 1 (DECL_TAG), the following WARN_ON_ONCE in btf_type_id_size() is triggered: if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) && !btf_type_is_var(size_type))) return NULL; Note that 'return NULL' is the correct behavior as we don't want a DECL_TAG type to be used as a btf_{key,value}_type_id even for the case like 'DECL_TAG -> STRUCT'. So there is no correctness issue here, we just want to silence warning. To silence the warning, I added DECL_TAG as one of kinds in btf_type_nosize() which will cause btf_type_id_size() returning NULL earlier without the warning. [1] https://lore.kernel.org/bpf/000000000000e0df8d05fc75ba86@google.com/ Reported-by: syzbot+958967f249155967d42a@syzkaller.appspotmail.com Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230530205029.264910-1-yhs@fb.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-05-30	r8169: check for PCI read error in probe	Heiner Kallweit	1	-1/+8
	Check whether first PCI read returns 0xffffffff. Currently, if this is the case, the user sees the following misleading message: unknown chip XID fcf, contact r8169 maintainers (see MAINTAINERS file) Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/75b54d23-fefe-2bf4-7e80-c9d3bc91af11@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	dsa: lan9303: Remove stray gpiod_unexport() call	Andy Shevchenko	1	-1/+0
	There is no gpiod_export() and gpiod_unexport() looks pretty much stray. The gpiod_export() and gpiod_unexport() shouldn't be used in the code, GPIO sysfs is deprecated. That said, simply drop the stray call. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230528142531.38602-1-andriy.shevchenko@linux.intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	liquidio: Use vzalloc()	Christophe JAILLET	2	-6/+2
	Use vzalloc() instead of hand writing it with vmalloc()+memset(). This is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/93b010824d9d92376e8d49b9eb396a0fa0c0ac80.1685216322.git.christophe.jaillet@wanadoo.fr Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: phy: microchip_t1s: add support for Microchip LAN865x Rev.B0 PHYs	Parthiban Veerasooran	2	-4/+179
	Add support for the Microchip LAN865x Rev.B0 10BASE-T1S Internal PHYs (LAN8650/1). The LAN865x combines a Media Access Controller (MAC) and an internal 10BASE-T1S Ethernet PHY to access 10BASE‑T1S networks. As LAN867X and LAN865X are using the same function for the read_status, rename the function as lan86xx_read_status. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: phy: microchip_t1s: remove unnecessary interrupts disabling code	Parthiban Veerasooran	1	-13/+1
	By default, except Reset Complete interrupt in the Interrupt Mask 2 Register all other interrupts are disabled/masked. As Reset Complete status is already handled, it doesn't make sense to disable it. Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Tested-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: phy: microchip_t1s: fix reset complete status handling	Parthiban Veerasooran	1	-0/+21
	As per the datasheet DS-LAN8670-1-2-60001573C.pdf, the Reset Complete status bit in the STS2 register has to be checked before proceeding to the initial configuration. Reading STS2 register will also clear the Reset Complete interrupt which is non-maskable. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Tested-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: phy: microchip_t1s: update LAN867x PHY supported revision number	Parthiban Veerasooran	2	-15/+15
	As per AN1699, the initial configuration in the driver applies to LAN867x Rev.B1 hardware revision. 0x0007C160 (Rev.A0) and 0x0007C161 (Rev.B0) never released to production and hence they don't need to be supported. Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: phy: microchip_t1s: replace read-modify-write code with phy_modify_mmd	Parthiban Veerasooran	1	-28/+13
	Replace read-modify-write code in the lan867x_config_init function to avoid handling data type mismatch and to simplify the code. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: phy: microchip_t1s: modify driver description to be more generic	Parthiban Veerasooran	2	-5/+5
	Remove LAN867X from the driver description as this driver is common for all the Microchip 10BASE-T1S PHYs. Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: dsa: microchip: Add register access control for KSZ8873 chip	Oleksij Rempel	1	-0/+41
	This update introduces specific register access boundaries for the KSZ8873 and KSZ8863 chips within the DSA Microchip driver. The outlined ranges target global control registers, port registers, and advanced control registers. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: dsa: microchip: ksz8: Prepare ksz8863_smi for regmap register access validation	Oleksij Rempel	1	-0/+11
	This patch prepares the ksz8863_smi part of ksz8 driver to utilize the regmap register access validation feature. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: dsa: microchip: remove ksz_port:on variable	Oleksij Rempel	2	-20/+1
	The only place where this variable would be set to false is the ksz8_config_cpu_port() function. But it is done in a bogus way: for (i = 0; i < dev->phy_port_cnt; i++) { if (i == dev->phy_port_cnt) <--- will be never executed. break; p->on = 1; So, we never have a situation where p->on = 0. In this case, we can just remove it. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: dsa: microchip: add an enum for regmap widths	Vladimir Oltean	8	-43/+63
	It is not immediately obvious that this driver allocates, via the KSZ_REGMAP_TABLE() macro, 3 regmaps for register access: dev->regmap[0] for 8-bit access, dev->regmap[1] for 16-bit and dev->regmap[2] for 32-bit access. In future changes that add support for reg_fields, each field will have to specify through which of the 3 regmaps it's going to go. Add an enum now, to denote one of the 3 register access widths, and make the code go through some wrapper functions for easier review and further modification. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30	net: dsa: microchip: improving error handling for 8-bit register RMW operations	Oleksij Rempel	1	-6/+22
	This patch refines the error handling mechanism for 8-bit register read-modify-write operations. In case of a failure, it now logs an error message detailing the problematic offset. This enhancement aids in debugging by providing more precise information when these operations encounter issues. Furthermore, the ksz_prmw8() function has been updated to return error values rather than void, enabling calling functions to appropriately respond to errors. Additionally, in case of an error that affects both the current and future accesses, the PHY driver will log the errors consistently, akin to the existing behavior in all ksz_read/ksz_write helpers. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>