aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-03-15bpf: fix missing kdoc string fields in cpumask.cEmil Tsalapatis1-0/+20
Some bpf_cpumask-related kfuncs have kdoc strings that are missing return values. Add a the missing descriptions for the return values. Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-4-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests: bpf: add bpf_cpumask_populate selftestsEmil Tsalapatis4-0/+161
Add selftests for the bpf_cpumask_populate helper that sets a bpf_cpumask to a bit pattern provided by a BPF program. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: add kfunc for populating cpumask bitsEmil Tsalapatis1-0/+33
Add a helper kfunc that sets the bitmap of a bpf_cpumask from BPF memory. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250309230427.26603-2-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Fix cap_enable_effective() return codeFeng Yang4-9/+10
The caller of cap_enable_effective() expects negative error code. Fix it. Before: failed to restore CAP_SYS_ADMIN: -1, Unknown error -1 After: failed to restore CAP_SYS_ADMIN: -3, No such process failed to restore CAP_SYS_ADMIN: -22, Invalid argument Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305022234.44932-1-yangfeng59949@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: lwt_seg6local: Move test to test_progsBastien Curutchet (eBPF Foundation)3-152/+176
test_lwt_seg6local.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_seg6local.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_seg6local.c. Use the network helpers instead of `nc` to exchange the final packet. Remove test_lwt_seg6local.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-2-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Fix dangling stdout seen by traffic monitor threadAmery Hung1-17/+22
Traffic monitor thread may see dangling stdout as the main thread closes and reassigns stdout without protection. This happens when the main thread finishes one subtest and moves to another one in the same netns_new() scope. The issue can be reproduced by running test_progs repeatedly with traffic monitor enabled: for ((i=1;i<=100;i++)); do ./test_progs -a flow_dissector_skb* -m '*' done For restoring stdout in crash_handler(), since it does not really care about closing stdout, simlpy flush stdout and restore it to the original one. Then, Fix the issue by consolidating stdio_restore_cleanup() and stdio_restore(), and protecting the use/close/assignment of stdout with a lock. The locking in the main thread is always performed regradless of whether traffic monitor is running or not for simplicity. It won't have any side-effect. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250305182057.2802606-3-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: lwt_seg6local: Remove unused routesBastien Curutchet (eBPF Foundation)1-5/+0
Some routes in fb00:: are initialized during setup, even though they aren't needed by the test as the UDP packets will travel through the lightweight tunnels. Remove these unnecessary routes. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-1-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Allow assigning traffic monitor print functionAmery Hung2-19/+59
Allow users to change traffic monitor's print function. If not provided, traffic monitor will print to stdout by default. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305182057.2802606-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Clean up call sites of stdio_restore()Amery Hung1-11/+6
reset_affinity() and save_ns() are only called in run_one_test(). There is no need to call stdio_restore() in reset_affinity() and save_ns() if stdio_restore() is moved right after a test finishes in run_one_test(). Also remove an unnecessary check of env.stdout_saved in crash_handler() by moving env.stdout_saved assignment to the beginning of main(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250305182057.2802606-1-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Move test_lwt_ip_encap to test_progsBastien Curutchet (eBPF Foundation)3-478/+541
test_lwt_ip_encap.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_ip_encap.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_ip_encap.c. Rework the GSO part to avoid using nc and dd. Remove test_lwt_ip_encap.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250304-lwt_ip-v1-1-8fdeb9e79a56@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf, docs: Fix broken link to renamed bpf_iter_task_vmas.cT.J. Mercier1-1/+1
This file was renamed from bpf_iter_task_vma.c. Fixes: 45b38941c81f ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c") Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250304204520.201115-1-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Add tests for arena spin lockKumar Kartikeya Dwivedi2-0/+159
Add some basic selftests for qspinlock built over BPF arena using cond_break_label macro. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Introduce arena spin lockKumar Kartikeya Dwivedi2-0/+652
Implement queued spin lock algorithm as BPF program for lock words living in BPF arena. The algorithm is copied from kernel/locking/qspinlock.c and adapted for BPF use. We first implement abstract helpers for portable atomics and acquire/release load instructions, by relying on X86_64 presence to elide expensive barriers and rely on implementation details of the JIT, and fall back to slow but correct implementations elsewhere. When support for acquire/release load/stores lands, we can improve this state. Then, the qspinlock algorithm is adapted to remove dependence on multi-word atomics due to lack of support in BPF ISA. For instance, xchg_tail cannot use 16-bit xchg, and needs to be a implemented as a 32-bit try_cmpxchg loop. Loops which are seemingly infinite from verifier PoV are annotated with cond_break_label macro to return an error. Only 1024 NR_CPUs are supported. Note that the slow path is a global function, hence the verifier doesn't know the return value's precision. The recommended way of usage is to always test against zero for success, and not ret < 0 for error, as the verifier would assume ret > 0 has not been accounted for. Add comments in the function documentation about this quirk. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Introduce cond_break_labelKumar Kartikeya Dwivedi1-6/+9
Add a new cond_break_label macro that jumps to the specified label when the cond_break termination check fires, and allows us to better handle the uncontrolled termination of the loop. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: correct use/def for may_goto instructionEduard Zingerman2-3/+4
may_goto instruction does not use any registers, but in compute_insn_live_regs() it was treated as a regular conditional jump of kind BPF_K with r0 as source register. Thus unnecessarily marking r0 as used. Fixes: 14c8552db644 ("bpf: simple DFA-based live registers analysis") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305085436.2731464-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test cases for compute_live_registers()Eduard Zingerman6-14/+455
Cover instructions from each kind: - assignment - arithmetic - store/load - endian conversion - atomics - branches, conditional branches, may_goto, calls - LD_ABS/LD_IND - address_space_cast Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-6-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: use register liveness information for func_states_equalEduard Zingerman1-4/+10
Liveness analysis DFA computes a set of registers live before each instruction. Leverage this information to skip comparison of dead registers in func_states_equal(). This helps with convergance of iterator processing loops, as bpf_reg_state->live marks can't be used when loops are processed. This has certain performance impact for selftests, here is a veristat listing using `-f "insns_pct>5" -f "!insns<200"` selftests: File Program States (A) States (B) States (DIFF) -------------------- ----------------------------- ---------- ---------- -------------- arena_htab.bpf.o arena_htab_llvm 37 35 -2 (-5.41%) arena_htab_asm.bpf.o arena_htab_asm 37 33 -4 (-10.81%) arena_list.bpf.o arena_list_add 37 22 -15 (-40.54%) dynptr_success.bpf.o test_dynptr_copy 22 16 -6 (-27.27%) dynptr_success.bpf.o test_dynptr_copy_xdp 68 58 -10 (-14.71%) iters.bpf.o checkpoint_states_deletion 918 40 -878 (-95.64%) iters.bpf.o clean_live_states 136 66 -70 (-51.47%) iters.bpf.o iter_nested_deeply_iters 43 37 -6 (-13.95%) iters.bpf.o iter_nested_iters 72 62 -10 (-13.89%) iters.bpf.o iter_pass_iter_ptr_to_subprog 30 26 -4 (-13.33%) iters.bpf.o iter_subprog_iters 68 59 -9 (-13.24%) iters.bpf.o loop_state_deps2 35 32 -3 (-8.57%) iters_css.bpf.o iter_css_for_each 32 29 -3 (-9.38%) pyperf600_iter.bpf.o on_event 286 192 -94 (-32.87%) Total progs: 3578 Old success: 2061 New success: 2061 States diff min: -95.64% States diff max: 0.00% -100 .. -90 %: 1 -55 .. -45 %: 3 -45 .. -35 %: 2 -35 .. -25 %: 5 -20 .. -10 %: 12 -10 .. 0 %: 6 sched_ext: File Program States (A) States (B) States (DIFF) ----------------- ---------------------- ---------- ---------- --------------- bpf.bpf.o lavd_dispatch 8950 7065 -1885 (-21.06%) bpf.bpf.o lavd_init 516 480 -36 (-6.98%) bpf.bpf.o layered_dispatch 662 501 -161 (-24.32%) bpf.bpf.o layered_dump 298 237 -61 (-20.47%) bpf.bpf.o layered_init 523 423 -100 (-19.12%) bpf.bpf.o layered_init_task 24 22 -2 (-8.33%) bpf.bpf.o layered_runnable 151 125 -26 (-17.22%) bpf.bpf.o p2dq_dispatch 66 53 -13 (-19.70%) bpf.bpf.o p2dq_init 170 142 -28 (-16.47%) bpf.bpf.o refresh_layer_cpumasks 120 78 -42 (-35.00%) bpf.bpf.o rustland_init 37 34 -3 (-8.11%) bpf.bpf.o rustland_init 37 34 -3 (-8.11%) bpf.bpf.o rusty_select_cpu 125 108 -17 (-13.60%) scx_central.bpf.o central_dispatch 59 43 -16 (-27.12%) scx_central.bpf.o central_init 39 28 -11 (-28.21%) scx_nest.bpf.o nest_init 58 51 -7 (-12.07%) scx_pair.bpf.o pair_dispatch 142 111 -31 (-21.83%) scx_qmap.bpf.o qmap_dispatch 174 141 -33 (-18.97%) scx_qmap.bpf.o qmap_init 768 654 -114 (-14.84%) Total progs: 216 Old success: 186 New success: 186 States diff min: -35.00% States diff max: 0.00% -35 .. -25 %: 3 -25 .. -20 %: 6 -20 .. -15 %: 6 -15 .. -5 %: 7 -5 .. 0 %: 6 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Add selftests for load-acquire and store-release instructionsPeilin Ye6-3/+698
Add several ./test_progs tests: - arena_atomics/load_acquire - arena_atomics/store_release - verifier_load_acquire/* - verifier_store_release/* - verifier_precision/bpf_load_acquire - verifier_precision/bpf_store_release The last two tests are added to check if backtrack_insn() handles the new instructions correctly. Additionally, the last test also makes sure that the verifier "remembers" the value (in src_reg) we store-release into e.g. a stack slot. For example, if we take a look at the test program: #0: r1 = 8; /* store_release((u64 *)(r10 - 8), r1); */ #1: .8byte %[store_release]; #2: r1 = *(u64 *)(r10 - 8); #3: r2 = r10; #4: r2 += r1; #5: r0 = 0; #6: exit; At #1, if the verifier doesn't remember that we wrote 8 to the stack, then later at #4 we would be adding an unbounded scalar value to the stack pointer, which would cause the program to be rejected: VERIFIER LOG: ============= ... math between fp pointer and register with unbounded min value is not allowed For easier CI integration, instead of using built-ins like __atomic_{load,store}_n() which depend on the new __BPF_FEATURE_LOAD_ACQ_STORE_REL pre-defined macro, manually craft load-acquire/store-release instructions using __imm_insn(), as suggested by Eduard. All new tests depend on: (1) Clang major version >= 18, and (2) ENABLE_ATOMICS_TESTS is defined (currently implies -mcpu=v3 or v4), and (3) JIT supports load-acquire/store-release (currently arm64 and x86-64) In .../progs/arena_atomics.c: /* 8-byte-aligned */ __u8 __arena_global load_acquire8_value = 0x12; /* 1-byte hole */ __u16 __arena_global load_acquire16_value = 0x1234; That 1-byte hole in the .addr_space.1 ELF section caused clang-17 to crash: fatal error: error in backend: unable to write nop sequence of 1 bytes To work around such llvm-17 CI job failures, conditionally define __arena_global variables as 64-bit if __clang_major__ < 18, to make sure .addr_space.1 has no holes. Ideally we should avoid compiling this file using clang-17 at all (arena tests depend on __BPF_FEATURE_ADDR_SPACE_CAST, and are skipped for llvm-17 anyway), but that is a separate topic. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/1b46c6feaf0f1b6984d9ec80e500cc7383e9da1a.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: simple DFA-based live registers analysisEduard Zingerman3-7/+330
Compute may-live registers before each instruction in the program. The register is live before the instruction I if it is read by I or some instruction S following I during program execution and is not overwritten between I and S. This information would be used in the next patch as a hint in func_states_equal(). Use a simple algorithm described in [1] to compute this information: - define the following: - I.use : a set of all registers read by instruction I; - I.def : a set of all registers written by instruction I; - I.in : a set of all registers that may be alive before I execution; - I.out : a set of all registers that may be alive after I execution; - I.successors : a set of instructions S that might immediately follow I for some program execution; - associate separate empty sets 'I.in' and 'I.out' with each instruction; - visit each instruction in a postorder and update corresponding 'I.in' and 'I.out' sets as follows: I.out = U [S.in for S in I.successors] I.in = (I.out / I.def) U I.use (where U stands for set union, / stands for set difference) - repeat the computation while I.{in,out} changes for any instruction. On implementation side keep things as simple, as possible: - check_cfg() already marks instructions EXPLORED in post-order, modify it to save the index of each EXPLORED instruction in a vector; - represent I.{in,out,use,def} as bitmasks; - don't split the program into basic blocks and don't maintain the work queue, instead: - do fixed-point computation by visiting each instruction; - maintain a simple 'changed' flag if I.{in,out} for any instruction change; Measurements show that even such simplistic implementation does not add measurable verification time overhead (for selftests, at-least). Note on check_cfg() ex_insn_beg/ex_done change: To avoid out of bounds access to env->cfg.insn_postorder array, it should be guaranteed that instruction transitions to EXPLORED state only once. Previously this was not the fact for incorrect programs with direct calls to exception callbacks. The 'align' selftest needs adjustment to skip computed insn/live registers printout. Otherwise it matches lines from the live registers printout. [1] https://en.wikipedia.org/wiki/Live-variable_analysis Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf, x86: Support load-acquire and store-release instructionsPeilin Ye1-17/+82
Recently we introduced BPF load-acquire (BPF_LOAD_ACQ) and store-release (BPF_STORE_REL) instructions. For x86-64, simply implement them as regular BPF_LDX/BPF_STX loads and stores. The verifier always rejects misaligned load-acquires/store-releases (even if BPF_F_ANY_ALIGNMENT is set), so emitted MOV* instructions are guaranteed to be atomic. Arena accesses are supported. 8- and 16-bit load-acquires are zero-extending (i.e., MOVZBQ, MOVZWQ). Rename emit_atomic{,_index}() to emit_atomic_rmw{,_index}() to make it clear that they only handle read-modify-write atomics, and extend their @atomic_op parameter from u8 to u32, since we are starting to use more than the lowest 8 bits of the 'imm' field. Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/d22bb3c69f126af1d962b7314f3489eff606a3b7.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: get_call_summary() utility functionEduard Zingerman1-64/+57
Refactor mark_fastcall_pattern_for_call() to extract a utility function get_call_summary(). For a helper or kfunc call this function fills the following information: {num_params, is_void, fastcall}. This function would be used in the next patch in order to get number of parameters of a helper or kfunc call. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf, arm64: Support load-acquire and store-release instructionsPeilin Ye2-6/+104
Support BPF load-acquire (BPF_LOAD_ACQ) and store-release (BPF_STORE_REL) instructions in the arm64 JIT compiler. For example (assuming little-endian): db 10 00 00 00 01 00 00 r0 = load_acquire((u64 *)(r1 + 0x0)) 95 00 00 00 00 00 00 00 exit opcode (0xdb): BPF_ATOMIC | BPF_DW | BPF_STX imm (0x00000100): BPF_LOAD_ACQ The JIT compiler would emit an LDAR instruction for the above, e.g.: ldar x7, [x0] Similarly, consider the following 16-bit store-release: cb 21 00 00 10 01 00 00 store_release((u16 *)(r1 + 0x0), w2) 95 00 00 00 00 00 00 00 exit opcode (0xcb): BPF_ATOMIC | BPF_H | BPF_STX imm (0x00000110): BPF_STORE_REL An STLRH instruction would be emitted, e.g.: stlrh w1, [x0] For a complete mapping: load-acquire 8-bit LDARB (BPF_LOAD_ACQ) 16-bit LDARH 32-bit LDAR (32-bit) 64-bit LDAR (64-bit) store-release 8-bit STLRB (BPF_STORE_REL) 16-bit STLRH 32-bit STLR (32-bit) 64-bit STLR (64-bit) Arena accesses are supported. bpf_jit_supports_insn(..., /*in_arena=*/true) always returns true for BPF_LOAD_ACQ and BPF_STORE_REL instructions, as they don't depend on ARM64_HAS_LSE_ATOMICS. Acked-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/51664a1300710238ba2d4d95142b57a52c4f0cae.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: jmp_offset() and verbose_insn() utility functionsEduard Zingerman1-17/+23
Extract two utility functions: - One BPF jump instruction uses .imm field to encode jump offset, while the rest use .off. Encapsulate this detail as jmp_offset() function. - Avoid duplicating instruction printing callback definitions by defining a verbose_insn() function, which disassembles an instruction into the verifier log while hiding this detail. These functions will be used in the next patch. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15arm64: insn: Add load-acquire and store-release instructionsPeilin Ye2-0/+37
Add load-acquire ("load_acq", LDAR{,B,H}) and store-release ("store_rel", STLR{,B,H}) instructions. Breakdown of encoding: size L (Rs) o0 (Rt2) Rn Rt mask (0x3fdffc00): 00 111111 1 1 0 11111 1 11111 00000 00000 value, load_acq (0x08dffc00): 00 001000 1 1 0 11111 1 11111 00000 00000 value, store_rel (0x089ffc00): 00 001000 1 0 0 11111 1 11111 00000 00000 As suggested by Xu [1], include all Should-Be-One (SBO) bits ("Rs" and "Rt2" fields) in the "mask" and "value" numbers. It is worth noting that we are adding the "no offset" variant of STLR instead of the "pre-index" variant, which has a different encoding. Reference: Arm Architecture Reference Manual (ARM DDI 0487K.a, ID032224), * C6.2.161 LDAR * C6.2.353 STLR [1] https://lore.kernel.org/bpf/4e6641ce-3f1e-4251-8daf-4dd4b77d08c4@huaweicloud.com/ Acked-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/ba92057b7502ce4c9c9b03b7d637abe5e178134e.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15arm64: insn: Add BIT(23) to {load,store}_ex's maskPeilin Ye1-2/+2
We are planning to add load-acquire (LDAR{,B,H}) and store-release (STLR{,B,H}) instructions to insn.{c,h}; add BIT(23) to mask of load_ex and store_ex to prevent aarch64_insn_is_{load,store}_ex() from returning false-positives for load-acquire and store-release instructions. Reference: Arm Architecture Reference Manual (ARM DDI 0487K.a, ID032224), * C6.2.228 LDXR * C6.2.165 LDAXR * C6.2.161 LDAR * C6.2.393 STXR * C6.2.360 STLXR * C6.2.353 STLR Acked-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/5a4d2a52b2cc022bf86d0b572789f0b3bc3d5162.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: Introduce load-acquire and store-release instructionsPeilin Ye10-13/+166
Introduce BPF instructions with load-acquire and store-release semantics, as discussed in [1]. Define 2 new flags: #define BPF_LOAD_ACQ 0x100 #define BPF_STORE_REL 0x110 A "load-acquire" is a BPF_STX | BPF_ATOMIC instruction with the 'imm' field set to BPF_LOAD_ACQ (0x100). Similarly, a "store-release" is a BPF_STX | BPF_ATOMIC instruction with the 'imm' field set to BPF_STORE_REL (0x110). Unlike existing atomic read-modify-write operations that only support BPF_W (32-bit) and BPF_DW (64-bit) size modifiers, load-acquires and store-releases also support BPF_B (8-bit) and BPF_H (16-bit). As an exception, however, 64-bit load-acquires/store-releases are not supported on 32-bit architectures (to fix a build error reported by the kernel test robot). An 8- or 16-bit load-acquire zero-extends the value before writing it to a 32-bit register, just like ARM64 instruction LDARH and friends. Similar to existing atomic read-modify-write operations, misaligned load-acquires/store-releases are not allowed (even if BPF_F_ANY_ALIGNMENT is set). As an example, consider the following 64-bit load-acquire BPF instruction (assuming little-endian): db 10 00 00 00 01 00 00 r0 = load_acquire((u64 *)(r1 + 0x0)) opcode (0xdb): BPF_ATOMIC | BPF_DW | BPF_STX imm (0x00000100): BPF_LOAD_ACQ Similarly, a 16-bit BPF store-release: cb 21 00 00 10 01 00 00 store_release((u16 *)(r1 + 0x0), w2) opcode (0xcb): BPF_ATOMIC | BPF_H | BPF_STX imm (0x00000110): BPF_STORE_REL In arch/{arm64,s390,x86}/net/bpf_jit_comp.c, have bpf_jit_supports_insn(..., /*in_arena=*/true) return false for the new instructions, until the corresponding JIT compiler supports them in arena. [1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/ Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/a217f46f0e445fbd573a1a024be5c6bf1d5fe716.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf, x86: Add x86 JIT support for timed may_gotoKumar Kartikeya Dwivedi5-13/+130
Implement the arch_bpf_timed_may_goto function using inline assembly to have control over which registers are spilled, and use our special protocol of using BPF_REG_AX as an argument into the function, and as the return value when going back. Emit call depth accounting for the call made from this stub, and ensure we don't have naked returns (when rethunk mitigations are enabled) by falling back to the RET macro (instead of retq). After popping all saved registers, the return address into the BPF program should be on top of the stack. Since the JIT support is now enabled, ensure selftests which are checking the produced may_goto sequences do not break by adjusting them. Make sure we still test the old may_goto sequence on other architectures, while testing the new sequence on x86_64. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250304003239.2390751-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: Add tests for bpf_object__prepareMykyta Yatsenko2-0/+127
Add selftests, checking that running bpf_object__prepare successfully creates maps before load step. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-5-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: Add verifier support for timed may_gotoKumar Kartikeya Dwivedi4-8/+96
Implement support in the verifier for replacing may_goto implementation from a counter-based approach to one which samples time on the local CPU to have a bigger loop bound. We implement it by maintaining 16-bytes per-stack frame, and using 8 bytes for maintaining the count for amortizing time sampling, and 8 bytes for the starting timestamp. To minimize overhead, we need to avoid spilling and filling of registers around this sequence, so we push this cost into the time sampling function 'arch_bpf_timed_may_goto'. This is a JIT-specific wrapper around bpf_check_timed_may_goto which returns us the count to store into the stack through BPF_REG_AX. All caller-saved registers (r0-r5) are guaranteed to remain untouched. The loop can be broken by returning count as 0, otherwise we dispatch into the function when the count drops to 0, and the runtime chooses to refresh it (by returning count as BPF_MAX_TIMED_LOOPS) or returning 0 and aborting the loop on next iteration. Since the check for 0 is done right after loading the count from the stack, all subsequent cond_break sequences should immediately break as well, of the same loop or subsequent loops in the program. We pass in the stack_depth of the count (and thus the timestamp, by adding 8 to it) to the arch_bpf_timed_may_goto call so that it can be passed in to bpf_check_timed_may_goto as an argument after r1 is saved, by adding the offset to r10/fp. This adjustment will be arch specific, and the next patch will introduce support for x86. Note that depending on loop complexity, time spent in the loop can be more than the current limit (250 ms), but imposing an upper bound on program runtime is an orthogonal problem which will be addressed when program cancellations are supported. The current time afforded by cond_break may not be enough for cases where BPF programs want to implement locking algorithms inline, and use cond_break as a promise to the verifier that they will eventually terminate. Below are some benchmarking numbers on the time taken per-iteration for an empty loop that counts the number of iterations until cond_break fires. For comparison, we compare it against bpf_for/bpf_repeat which is another way to achieve the same number of spins (BPF_MAX_LOOPS). The hardware used for benchmarking was a Sapphire Rapids Intel server with performance governor enabled, mitigations were enabled. +-----------------------------+--------------+--------------+------------------+ | Loop type | Iterations | Time (ms) | Time/iter (ns) | +-----------------------------|--------------+--------------+------------------+ | may_goto | 8388608 | 3 | 0.36 | | timed_may_goto (count=65535)| 589674932 | 250 | 0.42 | | bpf_for | 8388608 | 10 | 1.19 | +-----------------------------+--------------+--------------+------------------+ This gives a good approximation at low overhead while staying close to the current implementation. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250304003239.2390751-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15libbpf: Split bpf object load into prepare/loadMykyta Yatsenko3-43/+117
Introduce bpf_object__prepare API: additional intermediate preparation step that performs ELF processing, relocations, prepares final state of BPF program instructions (accessible with bpf_program__insns()), creates and (potentially) pins maps, and stops short of loading BPF programs. We anticipate few use cases for this API, such as: * Use prepare to initialize bpf_token, without loading freplace programs, unlocking possibility to lookup BTF of other programs. * Execute prepare to obtain finalized BPF program instructions without loading programs, enabling tools like veristat to process one program at a time, without incurring cost of ELF parsing and processing. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-4-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15libbpf: Introduce more granular state for bpf_objectMykyta Yatsenko1-17/+22
We are going to split bpf_object loading into 2 stages: preparation and loading. This will increase flexibility when working with bpf_object and unlock some optimizations and use cases. This patch substitutes a boolean flag (loaded) by more finely-grained state for bpf_object. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-3-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15net: filter: Avoid shadowing variable in bpf_convert_ctx_access()Breno Leitao1-2/+2
Rename the local variable 'off' to 'offset' to avoid shadowing the existing 'off' variable that is declared as an `int` in the outer scope of bpf_convert_ctx_access(). This fixes a compiler warning: net/core/filter.c:9679:8: warning: declaration shadows a local variable [-Wshadow] Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://patch.msgid.link/20250228-fix_filter-v1-1-ce13eae66fe9@debian.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15libbpf: Use map_is_created helper in map settersMykyta Yatsenko1-9/+9
Refactoring: use map_is_created helper in map setters that need to check the state of the map. This helps to reduce the number of the places that depend explicitly on the loaded flag, simplifying refactoring in the next patch of this set. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-2-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Remove test_tunnel.shBastien Curutchet (eBPF Foundation)2-180/+0
All tests from test_tunnel.sh have been migrated into test test_progs. The last test remaining in the script is the test_ipip() that is already covered in the test_prog framework by the NONE case of test_ipip_tunnel(). Remove the test_tunnel.sh script and its Makefile entry Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-10-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move ip6tnl tunnel tests to test_progsBastien Curutchet (eBPF Foundation)2-88/+59
ip6tnl tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6tnl tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ipip6() and test_ip6ip6() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-9-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move ip6geneve tunnel test to test_progsBastien Curutchet (eBPF Foundation)2-49/+48
ip6geneve tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6geneve tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ip6geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-8-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move geneve tunnel test to test_progsBastien Curutchet (eBPF Foundation)2-45/+45
geneve tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test geneve tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-7-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move ip6erspan tunnel test to test_progsBastien Curutchet (eBPF Foundation)2-58/+41
ip6erspan tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6erspan tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ip6erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-6-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move erspan tunnel tests to test_progsBastien Curutchet (eBPF Foundation)2-52/+46
erspan tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test erspan tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-5-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move ip6gre tunnel test to test_progsBastien Curutchet (eBPF Foundation)2-95/+104
ip6gre tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6gre tunnels. It uses the same network topology and the same BPF programs than the script. Disable the IPv6 DAD feature because it can take lot of time and cause some tests to fail depending on the environment they're run on. Remove test_ip6gre() and test_ip6gretap() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-4-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Move gre tunnel test to test_progsBastien Curutchet (eBPF Foundation)2-79/+97
gre tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test gre tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_gre() and test_gre_no_tunnel_key() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-3-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Add ping helpersBastien Curutchet (eBPF Foundation)1-29/+24
All tests use more or less the same ping commands as final validation. Also test_ping()'s return value is checked with ASSERT_OK() while this check is already done by the SYS() macro inside test_ping(). Create helpers around test_ping() and use them in the tests to avoid code duplication. Remove the unnecessary ASSERT_OK() from the tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-2-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: Factor out check_load_mem() and check_store_reg()Peilin Ye1-43/+67
Extract BPF_LDX and most non-ATOMIC BPF_STX instruction handling logic in do_check() into helper functions to be used later. While we are here, make that comment about "reserved fields" more specific. Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/8b39c94eac2bb7389ff12392ca666f939124ec4f.1740978603.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15veristat: Report program type guess results to sdterrEduard Zingerman1-2/+2
In order not to pollute CSV output, e.g.: $ ./veristat -o csv exceptions_ext.bpf.o > test.csv Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/extension... Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/throwing_extension... Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15selftests/bpf: test_tunnel: Add generic_attach* helpersBastien Curutchet (eBPF Foundation)1-74/+66
A fair amount of code duplication is present among tests to attach BPF programs. Create generic_attach* helpers that attach BPF programs to a given interface. Use ASSERT_OK_FD() instead of ASSERT_GE() to check fd's validity. Use these helpers in all the available tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-1-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: Factor out check_atomic_rmw()Peilin Ye1-24/+29
Currently, check_atomic() only handles atomic read-modify-write (RMW) instructions. Since we are planning to introduce other types of atomic instructions (i.e., atomic load/store), extract the existing RMW handling logic into its own function named check_atomic_rmw(). Remove the @insn_idx parameter as it is not really necessary. Use 'env->insn_idx' instead, as in other places in verifier.c. Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/6323ac8e73a10a1c8ee547c77ed68cf8eb6b90e1.1740978603.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15veristat: Strerror expects positive number (errno)Eduard Zingerman1-2/+2
Before: ./veristat -G @foobar iters.bpf.o Failed to open presets in 'foobar': Unknown error -2 ... After: ./veristat -G @foobar iters.bpf.o Failed to open presets in 'foobar': No such file or directory ... Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: Factor out atomic_ptr_type_ok()Peilin Ye1-5/+21
Factor out atomic_ptr_type_ok() as a helper function to be used later. Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/e5ef8b3116f3fffce78117a14060ddce05eba52a.1740978603.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15veristat: @files-list.txt notation for object files listEduard Zingerman1-9/+53
Allow reading object file list from file. E.g. the following command: ./veristat @list.txt Is equivalent to the following invocation: ./veristat line-1 line-2 ... line-N Where line-i corresponds to lines from list.txt. Lines starting with '#' are ignored. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf: no longer acquire map_idr_lock in bpf_map_inc_not_zero()Eric Dumazet1-5/+2
bpf_sk_storage_clone() is the only caller of bpf_map_inc_not_zero() and is holding rcu_read_lock(). map_idr_lock does not add any protection, just remove the cost for passive TCP flows. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kui-Feng Lee <kuifeng@meta.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20250301191315.1532629-1-edumazet@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>