aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/testing
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2022-08-03 16:29:08 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2022-08-03 16:29:08 -0700
commitf86d1fbbe7858884d6754534a0afbb74fc30bc26 (patch)
treef61796870edefbe77d495e9d719c68af1d14275b /tools/testing
parentMerge tag 'ata-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata (diff)
parentMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (diff)
downloadwireguard-linux-f86d1fbbe7858884d6754534a0afbb74fc30bc26.tar.xz
wireguard-linux-f86d1fbbe7858884d6754534a0afbb74fc30bc26.zip
Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking changes from Paolo Abeni: "Core: - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router. - Network-side support for IO uring zero-copy send. - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic. - Rename reference tracking helpers to a more consistent netdev_* schema. - Adapt u64_stats_t type to address load/store tearing issues. - Refine debug helper usage to reduce the log noise caused by bots. BPF: - Improve socket map performance, avoiding skb cloning on read operation. - Add support for 64 bits enum, to match types exposed by kernel. - Introduce support for sleepable uprobes program. - Introduce support for enum textual representation in libbpf. - New helpers to implement synproxy with eBPF/XDP. - Improve loop performances, inlining indirect calls when possible. - Removed all the deprecated libbpf APIs. - Implement new eBPF-based LSM flavor. - Add type match support, which allow accurate queries to the eBPF used types. - A few TCP congetsion control framework usability improvements. - Add new infrastructure to manipulate CT entries via eBPF programs. - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function. Protocols: - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention. - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support. - Add support to forciby close TIME_WAIT TCP sockets via user-space tools. - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy. - Support for changing the initial MTPCP subflow priority/backup status - Introduce virtually contingus buffers for sockets over RDMA, to cope better with memory pressure. - Extend CAN ethtool support with timestamping capabilities - Refactor CAN build infrastructure to allow building only the needed features. Driver API: - Remove devlink mutex to allow parallel commands on multiple links. - Add support for pause stats in distributed switch. - Implement devlink helpers to query and flash line cards. - New helper for phy mode to register conversion. New hardware / drivers: - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro. - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch. - Ethernet DSA driver for the Microchip LAN937x switch. - Ethernet PHY driver for the Aquantia AQR113C EPHY. - CAN driver for the OBD-II ELM327 interface. - CAN driver for RZ/N1 SJA1000 CAN controller. - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device. Drivers: - Intel Ethernet NICs: - i40e: add support for vlan pruning - i40e: add support for XDP framented packets - ice: improved vlan offload support - ice: add support for PPPoE offload - Mellanox Ethernet (mlx5) - refactor packet steering offload for performance and scalability - extend support for TC offload - refactor devlink code to clean-up the locking schema - support stacked vlans for bridge offloads - use TLS objects pool to improve connection rate - Netronome Ethernet NICs (nfp): - extend support for IPv6 fields mangling offload - add support for vepa mode in HW bridge - better support for virtio data path acceleration (VDPA) - enable TSO by default - Microsoft vNIC driver (mana) - add support for XDP redirect - Others Ethernet drivers: - bonding: add per-port priority support - microchip lan743x: extend phy support - Fungible funeth: support UDP segmentation offload and XDP xmit - Solarflare EF100: add support for virtual function representors - MediaTek SoC: add XDP support - Mellanox Ethernet/IB switch (mlxsw): - dropped support for unreleased H/W (XM router). - improved stats accuracy - unified bridge model coversion improving scalability (parts 1-6) - support for PTP in Spectrum-2 asics - Broadcom PHYs - add PTP support for BCM54210E - add support for the BCM53128 internal PHY - Marvell Ethernet switches (prestera): - implement support for multicast forwarding offload - Embedded Ethernet switches: - refactor OcteonTx MAC filter for better scalability - improve TC H/W offload for the Felix driver - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration - Other WiFi: - Microchip wilc1000: diable WEP support and enable WPA3 - Atheros ath10k: encapsulation offload support Old code removal: - Neterion vxge ethernet driver: this is untouched since more than 10 years" * tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits) doc: sfp-phylink: Fix a broken reference wireguard: selftests: support UML wireguard: allowedips: don't corrupt stack when detecting overflow wireguard: selftests: update config fragments wireguard: ratelimiter: use hrtimer in selftest net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ net: usb: ax88179_178a: Bind only to vendor-specific interface selftests: net: fix IOAM test skip return code net: usb: make USB_RTL8153_ECM non user configurable net: marvell: prestera: remove reduntant code octeontx2-pf: Reduce minimum mtu size to 60 net: devlink: Fix missing mutex_unlock() call net/tls: Remove redundant workqueue flush before destroy net: txgbe: Fix an error handling path in txgbe_probe() net: dsa: Fix spelling mistakes and cleanup code Documentation: devlink: add add devlink-selftests to the table of contents dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock net: ionic: fix error check for vlan flags in ionic_set_nic_features() net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr() nfp: flower: add support for tunnel offload without key ID ...
Diffstat (limited to 'tools/testing')
-rw-r--r--tools/testing/selftests/bpf/.gitignore3
-rw-r--r--tools/testing/selftests/bpf/DENYLIST6
-rw-r--r--tools/testing/selftests/bpf/DENYLIST.s390x67
-rw-r--r--tools/testing/selftests/bpf/Makefile34
-rw-r--r--tools/testing/selftests/bpf/bench.c99
-rw-r--r--tools/testing/selftests/bpf/bench.h16
-rw-r--r--tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c96
-rw-r--r--tools/testing/selftests/bpf/benchs/bench_local_storage.c287
-rw-r--r--tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c281
-rwxr-xr-xtools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh11
-rwxr-xr-xtools/testing/selftests/bpf/benchs/run_bench_local_storage.sh24
-rwxr-xr-xtools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh11
-rw-r--r--tools/testing/selftests/bpf/benchs/run_common.sh17
-rw-r--r--tools/testing/selftests/bpf/bpf_legacy.h9
-rw-r--r--tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c10
-rw-r--r--tools/testing/selftests/bpf/btf_helpers.c25
-rw-r--r--tools/testing/selftests/bpf/config93
-rw-r--r--tools/testing/selftests/bpf/config.s390x147
-rw-r--r--tools/testing/selftests/bpf/config.x86_64251
-rw-r--r--tools/testing/selftests/bpf/network_helpers.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/attach_probe.c49
-rw-r--r--tools/testing/selftests/bpf/prog_tests/bpf_iter.c16
-rw-r--r--tools/testing/selftests/bpf/prog_tests/bpf_loop.c62
-rw-r--r--tools/testing/selftests/bpf/prog_tests/bpf_nf.c64
-rw-r--r--tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c61
-rw-r--r--tools/testing/selftests/bpf/prog_tests/btf.c157
-rw-r--r--tools/testing/selftests/bpf/prog_tests/btf_write.c126
-rw-r--r--tools/testing/selftests/bpf/prog_tests/core_extern.c17
-rw-r--r--tools/testing/selftests/bpf/prog_tests/core_reloc.c140
-rw-r--r--tools/testing/selftests/bpf/prog_tests/fexit_stress.c32
-rw-r--r--tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c6
-rw-r--r--tools/testing/selftests/bpf/prog_tests/libbpf_str.c207
-rw-r--r--tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c313
-rw-r--r--tools/testing/selftests/bpf/prog_tests/probe_user.c35
-rw-r--r--tools/testing/selftests/bpf/prog_tests/resolve_btfids.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c11
-rw-r--r--tools/testing/selftests/bpf/prog_tests/send_signal.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/skeleton.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/sock_fields.c1
-rw-r--r--tools/testing/selftests/bpf/prog_tests/tc_redirect.c8
-rw-r--r--tools/testing/selftests/bpf/prog_tests/test_tunnel.c17
-rw-r--r--tools/testing/selftests/bpf/prog_tests/usdt.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c183
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_hashmap_full_update_bench.c40
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_iter.h7
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_iter_ksym.c74
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_loop.c114
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_syscall_macro.c6
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_tracing_net.h1
-rw-r--r--tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val.c3
-rw-r--r--tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___diff.c3
-rw-r--r--tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___err_missing.c3
-rw-r--r--tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___val3_missing.c3
-rw-r--r--tools/testing/selftests/bpf/progs/btf__core_reloc_type_based___diff.c3
-rw-r--r--tools/testing/selftests/bpf/progs/core_reloc_types.h190
-rw-r--r--tools/testing/selftests/bpf/progs/local_storage_bench.c104
-rw-r--r--tools/testing/selftests/bpf/progs/local_storage_rcu_tasks_trace_bench.c67
-rw-r--r--tools/testing/selftests/bpf/progs/lsm_cgroup.c180
-rw-r--r--tools/testing/selftests/bpf/progs/lsm_cgroup_nonvoid.c14
-rw-r--r--tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c35
-rw-r--r--tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c21
-rw-r--r--tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c60
-rw-r--r--tools/testing/selftests/bpf/progs/test_attach_probe.c73
-rw-r--r--tools/testing/selftests/bpf/progs/test_bpf_nf.c85
-rw-r--r--tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c134
-rw-r--r--tools/testing/selftests/bpf/progs/test_btf_haskv.c51
-rw-r--r--tools/testing/selftests/bpf/progs/test_btf_newkv.c18
-rw-r--r--tools/testing/selftests/bpf/progs/test_core_extern.c3
-rw-r--r--tools/testing/selftests/bpf/progs/test_core_reloc_enum64val.c70
-rw-r--r--tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c19
-rw-r--r--tools/testing/selftests/bpf/progs/test_core_reloc_type_based.c49
-rw-r--r--tools/testing/selftests/bpf/progs/test_probe_user.c50
-rw-r--r--tools/testing/selftests/bpf/progs/test_skeleton.c4
-rw-r--r--tools/testing/selftests/bpf/progs/test_tc_dtime.c53
-rw-r--r--tools/testing/selftests/bpf/progs/test_tunnel_kern.c80
-rw-r--r--tools/testing/selftests/bpf/progs/test_varlen.c8
-rw-r--r--tools/testing/selftests/bpf/progs/test_xdp_noinline.c30
-rw-r--r--tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c843
-rwxr-xr-xtools/testing/selftests/bpf/test_bpftool_synctypes.py182
-rw-r--r--tools/testing/selftests/bpf/test_btf.h3
-rw-r--r--tools/testing/selftests/bpf/test_progs.c7
-rw-r--r--tools/testing/selftests/bpf/test_verifier.c367
-rwxr-xr-xtools/testing/selftests/bpf/test_xdp_veth.sh6
-rwxr-xr-xtools/testing/selftests/bpf/test_xdping.sh4
-rwxr-xr-xtools/testing/selftests/bpf/test_xsk.sh6
-rw-r--r--tools/testing/selftests/bpf/verifier/bpf_loop_inline.c264
-rw-r--r--tools/testing/selftests/bpf/verifier/calls.c53
-rwxr-xr-xtools/testing/selftests/bpf/vmtest.sh53
-rw-r--r--tools/testing/selftests/bpf/xdp_synproxy.c466
-rw-r--r--tools/testing/selftests/bpf/xsk.c1268
-rw-r--r--tools/testing/selftests/bpf/xsk.h316
-rwxr-xr-xtools/testing/selftests/bpf/xsk_prereqs.sh4
-rw-r--r--tools/testing/selftests/bpf/xskxceiver.c (renamed from tools/testing/selftests/bpf/xdpxceiver.c)25
-rw-r--r--tools/testing/selftests/bpf/xskxceiver.h (renamed from tools/testing/selftests/bpf/xdpxceiver.h)6
-rw-r--r--tools/testing/selftests/drivers/net/dsa/Makefile17
-rwxr-xr-xtools/testing/selftests/drivers/net/mlxsw/devlink_linecard.sh54
-rw-r--r--tools/testing/selftests/drivers/net/mlxsw/rif_counter_scale.sh107
-rwxr-xr-xtools/testing/selftests/drivers/net/mlxsw/spectrum-2/resource_scale.sh31
l---------tools/testing/selftests/drivers/net/mlxsw/spectrum-2/rif_counter_scale.sh1
-rw-r--r--tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower_scale.sh15
-rwxr-xr-xtools/testing/selftests/drivers/net/mlxsw/spectrum/resource_scale.sh29
-rw-r--r--tools/testing/selftests/drivers/net/mlxsw/spectrum/rif_counter_scale.sh34
-rw-r--r--tools/testing/selftests/drivers/net/mlxsw/tc_flower_scale.sh17
-rwxr-xr-xtools/testing/selftests/drivers/net/netdevsim/fib.sh45
-rw-r--r--tools/testing/selftests/net/.gitignore1
-rw-r--r--tools/testing/selftests/net/Makefile3
-rw-r--r--tools/testing/selftests/net/af_unix/Makefile3
-rw-r--r--tools/testing/selftests/net/af_unix/unix_connect.c148
-rwxr-xr-xtools/testing/selftests/net/arp_ndisc_untracked_subnets.sh308
-rw-r--r--tools/testing/selftests/net/cmsg_sender.c2
-rwxr-xr-xtools/testing/selftests/net/fib_rule_tests.sh23
-rw-r--r--tools/testing/selftests/net/forwarding/Makefile1
-rwxr-xr-xtools/testing/selftests/net/forwarding/bridge_mdb_port_down.sh118
-rwxr-xr-xtools/testing/selftests/net/forwarding/ethtool_extended_state.sh43
-rwxr-xr-xtools/testing/selftests/net/forwarding/mirror_gre_bridge_1q_lag.sh7
-rwxr-xr-xtools/testing/selftests/net/forwarding/vxlan_asymmetric.sh2
-rwxr-xr-xtools/testing/selftests/net/ioam6.sh12
-rw-r--r--tools/testing/selftests/net/ipv6_flowlabel.c75
-rwxr-xr-xtools/testing/selftests/net/ipv6_flowlabel.sh16
-rwxr-xr-xtools/testing/selftests/net/mptcp/mptcp_join.sh116
-rw-r--r--tools/testing/selftests/net/mptcp/pm_nl_ctl.c2
-rwxr-xr-xtools/testing/selftests/net/mptcp/simult_flows.sh14
-rwxr-xr-xtools/testing/selftests/net/mptcp/userspace_pm.sh40
-rwxr-xr-xtools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh879
-rwxr-xr-xtools/testing/selftests/net/srv6_hl2encap_red_l2vpn_test.sh821
-rw-r--r--tools/testing/selftests/net/tls.c124
-rw-r--r--tools/testing/selftests/tc-testing/.gitignore1
-rw-r--r--tools/testing/selftests/wireguard/qemu/Makefile17
-rw-r--r--tools/testing/selftests/wireguard/qemu/arch/um.config3
-rw-r--r--tools/testing/selftests/wireguard/qemu/debug.config5
-rw-r--r--tools/testing/selftests/wireguard/qemu/kernel.config1
131 files changed, 11061 insertions, 604 deletions
diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index 595565eb68c0..3a8cb2404ea6 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -41,5 +41,6 @@ test_cpp
/bench
*.ko
*.tmp
-xdpxceiver
+xskxceiver
xdp_redirect_multi
+xdp_synproxy
diff --git a/tools/testing/selftests/bpf/DENYLIST b/tools/testing/selftests/bpf/DENYLIST
new file mode 100644
index 000000000000..939de574fc7f
--- /dev/null
+++ b/tools/testing/selftests/bpf/DENYLIST
@@ -0,0 +1,6 @@
+# TEMPORARY
+get_stack_raw_tp # spams with kernel warnings until next bpf -> bpf-next merge
+stacktrace_build_id_nmi
+stacktrace_build_id
+task_fd_query_rawtp
+varlen
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
new file mode 100644
index 000000000000..e33cab34d22f
--- /dev/null
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -0,0 +1,67 @@
+# TEMPORARY
+atomics # attach(add): actual -524 <= expected 0 (trampoline)
+bpf_iter_setsockopt # JIT does not support calling kernel function (kfunc)
+bloom_filter_map # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?)
+bpf_tcp_ca # JIT does not support calling kernel function (kfunc)
+bpf_loop # attaches to __x64_sys_nanosleep
+bpf_mod_race # BPF trampoline
+bpf_nf # JIT does not support calling kernel function
+core_read_macros # unknown func bpf_probe_read#4 (overlapping)
+d_path # failed to auto-attach program 'prog_stat': -524 (trampoline)
+dummy_st_ops # test_run unexpected error: -524 (errno 524) (trampoline)
+fentry_fexit # fentry attach failed: -524 (trampoline)
+fentry_test # fentry_first_attach unexpected error: -524 (trampoline)
+fexit_bpf2bpf # freplace_attach_trace unexpected error: -524 (trampoline)
+fexit_sleep # fexit_skel_load fexit skeleton failed (trampoline)
+fexit_stress # fexit attach failed prog 0 failed: -524 (trampoline)
+fexit_test # fexit_first_attach unexpected error: -524 (trampoline)
+get_func_args_test # trampoline
+get_func_ip_test # get_func_ip_test__attach unexpected error: -524 (trampoline)
+get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
+kfree_skb # attach fentry unexpected error: -524 (trampoline)
+kfunc_call # 'bpf_prog_active': not found in kernel BTF (?)
+ksyms_module # test_ksyms_module__open_and_load unexpected error: -9 (?)
+ksyms_module_libbpf # JIT does not support calling kernel function (kfunc)
+ksyms_module_lskel # test_ksyms_module_lskel__open_and_load unexpected error: -9 (?)
+modify_return # modify_return attach failed: -524 (trampoline)
+module_attach # skel_attach skeleton attach failed: -524 (trampoline)
+mptcp
+kprobe_multi_test # relies on fentry
+netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?)
+probe_user # check_kprobe_res wrong kprobe res from probe read (?)
+recursion # skel_attach unexpected error: -524 (trampoline)
+ringbuf # skel_load skeleton load failed (?)
+sk_assign # Can't read on server: Invalid argument (?)
+sk_lookup # endianness problem
+sk_storage_tracing # test_sk_storage_tracing__attach unexpected error: -524 (trampoline)
+skc_to_unix_sock # could not attach BPF object unexpected error: -524 (trampoline)
+socket_cookie # prog_attach unexpected error: -524 (trampoline)
+stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
+tailcalls # tail_calls are not allowed in non-JITed programs with bpf-to-bpf calls (?)
+task_local_storage # failed to auto-attach program 'trace_exit_creds': -524 (trampoline)
+test_bpffs # bpffs test failed 255 (iterator)
+test_bprm_opts # failed to auto-attach program 'secure_exec': -524 (trampoline)
+test_ima # failed to auto-attach program 'ima': -524 (trampoline)
+test_local_storage # failed to auto-attach program 'unlink_hook': -524 (trampoline)
+test_lsm # failed to find kernel BTF type ID of '__x64_sys_setdomainname': -3 (?)
+test_overhead # attach_fentry unexpected error: -524 (trampoline)
+test_profiler # unknown func bpf_probe_read_str#45 (overlapping)
+timer # failed to auto-attach program 'test1': -524 (trampoline)
+timer_crash # trampoline
+timer_mim # failed to auto-attach program 'test1': -524 (trampoline)
+trace_ext # failed to auto-attach program 'test_pkt_md_access_new': -524 (trampoline)
+trace_printk # trace_printk__load unexpected error: -2 (errno 2) (?)
+trace_vprintk # trace_vprintk__open_and_load unexpected error: -9 (?)
+trampoline_count # prog 'prog1': failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
+verif_stats # trace_vprintk__open_and_load unexpected error: -9 (?)
+vmlinux # failed to auto-attach program 'handle__fentry': -524 (trampoline)
+xdp_adjust_tail # case-128 err 0 errno 28 retval 1 size 128 expect-size 3520 (?)
+xdp_bonding # failed to auto-attach program 'trace_on_entry': -524 (trampoline)
+xdp_bpf2bpf # failed to auto-attach program 'trace_on_entry': -524 (trampoline)
+map_kptr # failed to open_and_load program: -524 (trampoline)
+bpf_cookie # failed to open_and_load program: -524 (trampoline)
+xdp_do_redirect # prog_run_max_size unexpected error: -22 (errno 22)
+send_signal # intermittently fails to receive signal
+select_reuseport # intermittently fails on new s390x setup
+xdp_synproxy # JIT does not support calling kernel function (kfunc)
+unpriv_bpf_disabled # fentry
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2d3c8c8f558a..8d59ec7f4c2d 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -82,7 +82,7 @@ TEST_PROGS_EXTENDED := with_addr.sh \
TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \
flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \
test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \
- xdpxceiver xdp_redirect_multi
+ xskxceiver xdp_redirect_multi xdp_synproxy
TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
@@ -168,17 +168,26 @@ $(OUTPUT)/%:%.c
$(call msg,BINARY,,$@)
$(Q)$(LINK.c) $^ $(LDLIBS) -o $@
+# LLVM's ld.lld doesn't support all the architectures, so use it only on x86
+ifeq ($(SRCARCH),x86)
+LLD := lld
+else
+LLD := ld
+endif
+
# Filter out -static for liburandom_read.so and its dependent targets so that static builds
# do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
$(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
$(call msg,LIB,,$@)
- $(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) -fPIC -shared -o $@
+ $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) \
+ -fuse-ld=$(LLD) -Wl,-znoseparate-code -fPIC -shared -o $@
$(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
$(call msg,BINARY,,$@)
- $(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
- liburandom_read.so $(LDLIBS) \
- -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
+ $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
+ liburandom_read.so $(LDLIBS) \
+ -fuse-ld=$(LLD) -Wl,-znoseparate-code \
+ -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
$(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch])
$(call msg,MOD,,$@)
@@ -221,6 +230,8 @@ $(OUTPUT)/xdping: $(TESTING_HELPERS)
$(OUTPUT)/flow_dissector_load: $(TESTING_HELPERS)
$(OUTPUT)/test_maps: $(TESTING_HELPERS)
$(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS)
+$(OUTPUT)/xsk.o: $(BPFOBJ)
+$(OUTPUT)/xskxceiver: $(OUTPUT)/xsk.o
BPFTOOL ?= $(DEFAULT_BPFTOOL)
$(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
@@ -502,6 +513,7 @@ TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \
cap_helpers.c
TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \
$(OUTPUT)/liburandom_read.so \
+ $(OUTPUT)/xdp_synproxy \
ima_setup.sh \
$(wildcard progs/btf_dump_test_case_*.c)
TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE
@@ -560,6 +572,9 @@ $(OUTPUT)/bench_ringbufs.o: $(OUTPUT)/ringbuf_bench.skel.h \
$(OUTPUT)/bench_bloom_filter_map.o: $(OUTPUT)/bloom_filter_bench.skel.h
$(OUTPUT)/bench_bpf_loop.o: $(OUTPUT)/bpf_loop_bench.skel.h
$(OUTPUT)/bench_strncmp.o: $(OUTPUT)/strncmp_bench.skel.h
+$(OUTPUT)/bench_bpf_hashmap_full_update.o: $(OUTPUT)/bpf_hashmap_full_update_bench.skel.h
+$(OUTPUT)/bench_local_storage.o: $(OUTPUT)/local_storage_bench.skel.h
+$(OUTPUT)/bench_local_storage_rcu_tasks_trace.o: $(OUTPUT)/local_storage_rcu_tasks_trace_bench.skel.h
$(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ)
$(OUTPUT)/bench: LDLIBS += -lm
$(OUTPUT)/bench: $(OUTPUT)/bench.o \
@@ -571,13 +586,18 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
$(OUTPUT)/bench_ringbufs.o \
$(OUTPUT)/bench_bloom_filter_map.o \
$(OUTPUT)/bench_bpf_loop.o \
- $(OUTPUT)/bench_strncmp.o
+ $(OUTPUT)/bench_strncmp.o \
+ $(OUTPUT)/bench_bpf_hashmap_full_update.o \
+ $(OUTPUT)/bench_local_storage.o \
+ $(OUTPUT)/bench_local_storage_rcu_tasks_trace.o
$(call msg,BINARY,,$@)
$(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@
EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \
prog_tests/tests.h map_tests/tests.h verifier/tests.h \
feature bpftool \
- $(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h no_alu32 bpf_gcc bpf_testmod.ko)
+ $(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h \
+ no_alu32 bpf_gcc bpf_testmod.ko \
+ liburandom_read.so)
.PHONY: docs docs-clean
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index f061cc20e776..c1f20a147462 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -79,6 +79,43 @@ void hits_drops_report_progress(int iter, struct bench_res *res, long delta_ns)
hits_per_sec, hits_per_prod, drops_per_sec, hits_per_sec + drops_per_sec);
}
+void
+grace_period_latency_basic_stats(struct bench_res res[], int res_cnt, struct basic_stats *gp_stat)
+{
+ int i;
+
+ memset(gp_stat, 0, sizeof(struct basic_stats));
+
+ for (i = 0; i < res_cnt; i++)
+ gp_stat->mean += res[i].gp_ns / 1000.0 / (double)res[i].gp_ct / (0.0 + res_cnt);
+
+#define IT_MEAN_DIFF (res[i].gp_ns / 1000.0 / (double)res[i].gp_ct - gp_stat->mean)
+ if (res_cnt > 1) {
+ for (i = 0; i < res_cnt; i++)
+ gp_stat->stddev += (IT_MEAN_DIFF * IT_MEAN_DIFF) / (res_cnt - 1.0);
+ }
+ gp_stat->stddev = sqrt(gp_stat->stddev);
+#undef IT_MEAN_DIFF
+}
+
+void
+grace_period_ticks_basic_stats(struct bench_res res[], int res_cnt, struct basic_stats *gp_stat)
+{
+ int i;
+
+ memset(gp_stat, 0, sizeof(struct basic_stats));
+ for (i = 0; i < res_cnt; i++)
+ gp_stat->mean += res[i].stime / (double)res[i].gp_ct / (0.0 + res_cnt);
+
+#define IT_MEAN_DIFF (res[i].stime / (double)res[i].gp_ct - gp_stat->mean)
+ if (res_cnt > 1) {
+ for (i = 0; i < res_cnt; i++)
+ gp_stat->stddev += (IT_MEAN_DIFF * IT_MEAN_DIFF) / (res_cnt - 1.0);
+ }
+ gp_stat->stddev = sqrt(gp_stat->stddev);
+#undef IT_MEAN_DIFF
+}
+
void hits_drops_report_final(struct bench_res res[], int res_cnt)
{
int i;
@@ -150,6 +187,53 @@ void ops_report_final(struct bench_res res[], int res_cnt)
printf("latency %8.3lf ns/op\n", 1000.0 / hits_mean * env.producer_cnt);
}
+void local_storage_report_progress(int iter, struct bench_res *res,
+ long delta_ns)
+{
+ double important_hits_per_sec, hits_per_sec;
+ double delta_sec = delta_ns / 1000000000.0;
+
+ hits_per_sec = res->hits / 1000000.0 / delta_sec;
+ important_hits_per_sec = res->important_hits / 1000000.0 / delta_sec;
+
+ printf("Iter %3d (%7.3lfus): ", iter, (delta_ns - 1000000000) / 1000.0);
+
+ printf("hits %8.3lfM/s ", hits_per_sec);
+ printf("important_hits %8.3lfM/s\n", important_hits_per_sec);
+}
+
+void local_storage_report_final(struct bench_res res[], int res_cnt)
+{
+ double important_hits_mean = 0.0, important_hits_stddev = 0.0;
+ double hits_mean = 0.0, hits_stddev = 0.0;
+ int i;
+
+ for (i = 0; i < res_cnt; i++) {
+ hits_mean += res[i].hits / 1000000.0 / (0.0 + res_cnt);
+ important_hits_mean += res[i].important_hits / 1000000.0 / (0.0 + res_cnt);
+ }
+
+ if (res_cnt > 1) {
+ for (i = 0; i < res_cnt; i++) {
+ hits_stddev += (hits_mean - res[i].hits / 1000000.0) *
+ (hits_mean - res[i].hits / 1000000.0) /
+ (res_cnt - 1.0);
+ important_hits_stddev +=
+ (important_hits_mean - res[i].important_hits / 1000000.0) *
+ (important_hits_mean - res[i].important_hits / 1000000.0) /
+ (res_cnt - 1.0);
+ }
+
+ hits_stddev = sqrt(hits_stddev);
+ important_hits_stddev = sqrt(important_hits_stddev);
+ }
+ printf("Summary: hits throughput %8.3lf \u00B1 %5.3lf M ops/s, ",
+ hits_mean, hits_stddev);
+ printf("hits latency %8.3lf ns/op, ", 1000.0 / hits_mean);
+ printf("important_hits throughput %8.3lf \u00B1 %5.3lf M ops/s\n",
+ important_hits_mean, important_hits_stddev);
+}
+
const char *argp_program_version = "benchmark";
const char *argp_program_bug_address = "<bpf@vger.kernel.org>";
const char argp_program_doc[] =
@@ -188,13 +272,18 @@ static const struct argp_option opts[] = {
extern struct argp bench_ringbufs_argp;
extern struct argp bench_bloom_map_argp;
extern struct argp bench_bpf_loop_argp;
+extern struct argp bench_local_storage_argp;
+extern struct argp bench_local_storage_rcu_tasks_trace_argp;
extern struct argp bench_strncmp_argp;
static const struct argp_child bench_parsers[] = {
{ &bench_ringbufs_argp, 0, "Ring buffers benchmark", 0 },
{ &bench_bloom_map_argp, 0, "Bloom filter map benchmark", 0 },
{ &bench_bpf_loop_argp, 0, "bpf_loop helper benchmark", 0 },
+ { &bench_local_storage_argp, 0, "local_storage benchmark", 0 },
{ &bench_strncmp_argp, 0, "bpf_strncmp helper benchmark", 0 },
+ { &bench_local_storage_rcu_tasks_trace_argp, 0,
+ "local_storage RCU Tasks Trace slowdown benchmark", 0 },
{},
};
@@ -396,6 +485,11 @@ extern const struct bench bench_hashmap_with_bloom;
extern const struct bench bench_bpf_loop;
extern const struct bench bench_strncmp_no_helper;
extern const struct bench bench_strncmp_helper;
+extern const struct bench bench_bpf_hashmap_full_update;
+extern const struct bench bench_local_storage_cache_seq_get;
+extern const struct bench bench_local_storage_cache_interleaved_get;
+extern const struct bench bench_local_storage_cache_hashmap_control;
+extern const struct bench bench_local_storage_tasks_trace;
static const struct bench *benchs[] = {
&bench_count_global,
@@ -430,6 +524,11 @@ static const struct bench *benchs[] = {
&bench_bpf_loop,
&bench_strncmp_no_helper,
&bench_strncmp_helper,
+ &bench_bpf_hashmap_full_update,
+ &bench_local_storage_cache_seq_get,
+ &bench_local_storage_cache_interleaved_get,
+ &bench_local_storage_cache_hashmap_control,
+ &bench_local_storage_tasks_trace,
};
static void setup_benchmark()
diff --git a/tools/testing/selftests/bpf/bench.h b/tools/testing/selftests/bpf/bench.h
index fb3e213df3dc..d748255877e2 100644
--- a/tools/testing/selftests/bpf/bench.h
+++ b/tools/testing/selftests/bpf/bench.h
@@ -30,10 +30,19 @@ struct env {
struct cpu_set cons_cpus;
};
+struct basic_stats {
+ double mean;
+ double stddev;
+};
+
struct bench_res {
long hits;
long drops;
long false_hits;
+ long important_hits;
+ unsigned long gp_ns;
+ unsigned long gp_ct;
+ unsigned int stime;
};
struct bench {
@@ -61,6 +70,13 @@ void false_hits_report_progress(int iter, struct bench_res *res, long delta_ns);
void false_hits_report_final(struct bench_res res[], int res_cnt);
void ops_report_progress(int iter, struct bench_res *res, long delta_ns);
void ops_report_final(struct bench_res res[], int res_cnt);
+void local_storage_report_progress(int iter, struct bench_res *res,
+ long delta_ns);
+void local_storage_report_final(struct bench_res res[], int res_cnt);
+void grace_period_latency_basic_stats(struct bench_res res[], int res_cnt,
+ struct basic_stats *gp_stat);
+void grace_period_ticks_basic_stats(struct bench_res res[], int res_cnt,
+ struct basic_stats *gp_stat);
static inline __u64 get_time_ns(void)
{
diff --git a/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c b/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c
new file mode 100644
index 000000000000..cec51e0ff4b8
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Bytedance */
+
+#include <argp.h>
+#include "bench.h"
+#include "bpf_hashmap_full_update_bench.skel.h"
+#include "bpf_util.h"
+
+/* BPF triggering benchmarks */
+static struct ctx {
+ struct bpf_hashmap_full_update_bench *skel;
+} ctx;
+
+#define MAX_LOOP_NUM 10000
+
+static void validate(void)
+{
+ if (env.consumer_cnt != 1) {
+ fprintf(stderr, "benchmark doesn't support multi-consumer!\n");
+ exit(1);
+ }
+}
+
+static void *producer(void *input)
+{
+ while (true) {
+ /* trigger the bpf program */
+ syscall(__NR_getpgid);
+ }
+
+ return NULL;
+}
+
+static void *consumer(void *input)
+{
+ return NULL;
+}
+
+static void measure(struct bench_res *res)
+{
+}
+
+static void setup(void)
+{
+ struct bpf_link *link;
+ int map_fd, i, max_entries;
+
+ setup_libbpf();
+
+ ctx.skel = bpf_hashmap_full_update_bench__open_and_load();
+ if (!ctx.skel) {
+ fprintf(stderr, "failed to open skeleton\n");
+ exit(1);
+ }
+
+ ctx.skel->bss->nr_loops = MAX_LOOP_NUM;
+
+ link = bpf_program__attach(ctx.skel->progs.benchmark);
+ if (!link) {
+ fprintf(stderr, "failed to attach program!\n");
+ exit(1);
+ }
+
+ /* fill hash_map */
+ map_fd = bpf_map__fd(ctx.skel->maps.hash_map_bench);
+ max_entries = bpf_map__max_entries(ctx.skel->maps.hash_map_bench);
+ for (i = 0; i < max_entries; i++)
+ bpf_map_update_elem(map_fd, &i, &i, BPF_ANY);
+}
+
+void hashmap_report_final(struct bench_res res[], int res_cnt)
+{
+ unsigned int nr_cpus = bpf_num_possible_cpus();
+ int i;
+
+ for (i = 0; i < nr_cpus; i++) {
+ u64 time = ctx.skel->bss->percpu_time[i];
+
+ if (!time)
+ continue;
+
+ printf("%d:hash_map_full_perf %lld events per sec\n",
+ i, ctx.skel->bss->nr_loops * 1000000000ll / time);
+ }
+}
+
+const struct bench bench_bpf_hashmap_full_update = {
+ .name = "bpf-hashmap-ful-update",
+ .validate = validate,
+ .setup = setup,
+ .producer_thread = producer,
+ .consumer_thread = consumer,
+ .measure = measure,
+ .report_progress = NULL,
+ .report_final = hashmap_report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/bench_local_storage.c b/tools/testing/selftests/bpf/benchs/bench_local_storage.c
new file mode 100644
index 000000000000..5a378c84e81f
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_local_storage.c
@@ -0,0 +1,287 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <argp.h>
+#include <linux/btf.h>
+
+#include "local_storage_bench.skel.h"
+#include "bench.h"
+
+#include <test_btf.h>
+
+static struct {
+ __u32 nr_maps;
+ __u32 hashmap_nr_keys_used;
+} args = {
+ .nr_maps = 1000,
+ .hashmap_nr_keys_used = 1000,
+};
+
+enum {
+ ARG_NR_MAPS = 6000,
+ ARG_HASHMAP_NR_KEYS_USED = 6001,
+};
+
+static const struct argp_option opts[] = {
+ { "nr_maps", ARG_NR_MAPS, "NR_MAPS", 0,
+ "Set number of local_storage maps"},
+ { "hashmap_nr_keys_used", ARG_HASHMAP_NR_KEYS_USED, "NR_KEYS",
+ 0, "When doing hashmap test, set number of hashmap keys test uses"},
+ {},
+};
+
+static error_t parse_arg(int key, char *arg, struct argp_state *state)
+{
+ long ret;
+
+ switch (key) {
+ case ARG_NR_MAPS:
+ ret = strtol(arg, NULL, 10);
+ if (ret < 1 || ret > UINT_MAX) {
+ fprintf(stderr, "invalid nr_maps");
+ argp_usage(state);
+ }
+ args.nr_maps = ret;
+ break;
+ case ARG_HASHMAP_NR_KEYS_USED:
+ ret = strtol(arg, NULL, 10);
+ if (ret < 1 || ret > UINT_MAX) {
+ fprintf(stderr, "invalid hashmap_nr_keys_used");
+ argp_usage(state);
+ }
+ args.hashmap_nr_keys_used = ret;
+ break;
+ default:
+ return ARGP_ERR_UNKNOWN;
+ }
+
+ return 0;
+}
+
+const struct argp bench_local_storage_argp = {
+ .options = opts,
+ .parser = parse_arg,
+};
+
+/* Keep in sync w/ array of maps in bpf */
+#define MAX_NR_MAPS 1000
+/* keep in sync w/ same define in bpf */
+#define HASHMAP_SZ 4194304
+
+static void validate(void)
+{
+ if (env.producer_cnt != 1) {
+ fprintf(stderr, "benchmark doesn't support multi-producer!\n");
+ exit(1);
+ }
+ if (env.consumer_cnt != 1) {
+ fprintf(stderr, "benchmark doesn't support multi-consumer!\n");
+ exit(1);
+ }
+
+ if (args.nr_maps > MAX_NR_MAPS) {
+ fprintf(stderr, "nr_maps must be <= 1000\n");
+ exit(1);
+ }
+
+ if (args.hashmap_nr_keys_used > HASHMAP_SZ) {
+ fprintf(stderr, "hashmap_nr_keys_used must be <= %u\n", HASHMAP_SZ);
+ exit(1);
+ }
+}
+
+static struct {
+ struct local_storage_bench *skel;
+ void *bpf_obj;
+ struct bpf_map *array_of_maps;
+} ctx;
+
+static void prepopulate_hashmap(int fd)
+{
+ int i, key, val;
+
+ /* local_storage gets will have BPF_LOCAL_STORAGE_GET_F_CREATE flag set, so
+ * populate the hashmap for a similar comparison
+ */
+ for (i = 0; i < HASHMAP_SZ; i++) {
+ key = val = i;
+ if (bpf_map_update_elem(fd, &key, &val, 0)) {
+ fprintf(stderr, "Error prepopulating hashmap (key %d)\n", key);
+ exit(1);
+ }
+ }
+}
+
+static void __setup(struct bpf_program *prog, bool hashmap)
+{
+ struct bpf_map *inner_map;
+ int i, fd, mim_fd, err;
+
+ LIBBPF_OPTS(bpf_map_create_opts, create_opts);
+
+ if (!hashmap)
+ create_opts.map_flags = BPF_F_NO_PREALLOC;
+
+ ctx.skel->rodata->num_maps = args.nr_maps;
+ ctx.skel->rodata->hashmap_num_keys = args.hashmap_nr_keys_used;
+ inner_map = bpf_map__inner_map(ctx.array_of_maps);
+ create_opts.btf_key_type_id = bpf_map__btf_key_type_id(inner_map);
+ create_opts.btf_value_type_id = bpf_map__btf_value_type_id(inner_map);
+
+ err = local_storage_bench__load(ctx.skel);
+ if (err) {
+ fprintf(stderr, "Error loading skeleton\n");
+ goto err_out;
+ }
+
+ create_opts.btf_fd = bpf_object__btf_fd(ctx.skel->obj);
+
+ mim_fd = bpf_map__fd(ctx.array_of_maps);
+ if (mim_fd < 0) {
+ fprintf(stderr, "Error getting map_in_map fd\n");
+ goto err_out;
+ }
+
+ for (i = 0; i < args.nr_maps; i++) {
+ if (hashmap)
+ fd = bpf_map_create(BPF_MAP_TYPE_HASH, NULL, sizeof(int),
+ sizeof(int), HASHMAP_SZ, &create_opts);
+ else
+ fd = bpf_map_create(BPF_MAP_TYPE_TASK_STORAGE, NULL, sizeof(int),
+ sizeof(int), 0, &create_opts);
+ if (fd < 0) {
+ fprintf(stderr, "Error creating map %d: %d\n", i, fd);
+ goto err_out;
+ }
+
+ if (hashmap)
+ prepopulate_hashmap(fd);
+
+ err = bpf_map_update_elem(mim_fd, &i, &fd, 0);
+ if (err) {
+ fprintf(stderr, "Error updating array-of-maps w/ map %d\n", i);
+ goto err_out;
+ }
+ }
+
+ if (!bpf_program__attach(prog)) {
+ fprintf(stderr, "Error attaching bpf program\n");
+ goto err_out;
+ }
+
+ return;
+err_out:
+ exit(1);
+}
+
+static void hashmap_setup(void)
+{
+ struct local_storage_bench *skel;
+
+ setup_libbpf();
+
+ skel = local_storage_bench__open();
+ ctx.skel = skel;
+ ctx.array_of_maps = skel->maps.array_of_hash_maps;
+ skel->rodata->use_hashmap = 1;
+ skel->rodata->interleave = 0;
+
+ __setup(skel->progs.get_local, true);
+}
+
+static void local_storage_cache_get_setup(void)
+{
+ struct local_storage_bench *skel;
+
+ setup_libbpf();
+
+ skel = local_storage_bench__open();
+ ctx.skel = skel;
+ ctx.array_of_maps = skel->maps.array_of_local_storage_maps;
+ skel->rodata->use_hashmap = 0;
+ skel->rodata->interleave = 0;
+
+ __setup(skel->progs.get_local, false);
+}
+
+static void local_storage_cache_get_interleaved_setup(void)
+{
+ struct local_storage_bench *skel;
+
+ setup_libbpf();
+
+ skel = local_storage_bench__open();
+ ctx.skel = skel;
+ ctx.array_of_maps = skel->maps.array_of_local_storage_maps;
+ skel->rodata->use_hashmap = 0;
+ skel->rodata->interleave = 1;
+
+ __setup(skel->progs.get_local, false);
+}
+
+static void measure(struct bench_res *res)
+{
+ res->hits = atomic_swap(&ctx.skel->bss->hits, 0);
+ res->important_hits = atomic_swap(&ctx.skel->bss->important_hits, 0);
+}
+
+static inline void trigger_bpf_program(void)
+{
+ syscall(__NR_getpgid);
+}
+
+static void *consumer(void *input)
+{
+ return NULL;
+}
+
+static void *producer(void *input)
+{
+ while (true)
+ trigger_bpf_program();
+
+ return NULL;
+}
+
+/* cache sequential and interleaved get benchs test local_storage get
+ * performance, specifically they demonstrate performance cliff of
+ * current list-plus-cache local_storage model.
+ *
+ * cache sequential get: call bpf_task_storage_get on n maps in order
+ * cache interleaved get: like "sequential get", but interleave 4 calls to the
+ * 'important' map (idx 0 in array_of_maps) for every 10 calls. Goal
+ * is to mimic environment where many progs are accessing their local_storage
+ * maps, with 'our' prog needing to access its map more often than others
+ */
+const struct bench bench_local_storage_cache_seq_get = {
+ .name = "local-storage-cache-seq-get",
+ .validate = validate,
+ .setup = local_storage_cache_get_setup,
+ .producer_thread = producer,
+ .consumer_thread = consumer,
+ .measure = measure,
+ .report_progress = local_storage_report_progress,
+ .report_final = local_storage_report_final,
+};
+
+const struct bench bench_local_storage_cache_interleaved_get = {
+ .name = "local-storage-cache-int-get",
+ .validate = validate,
+ .setup = local_storage_cache_get_interleaved_setup,
+ .producer_thread = producer,
+ .consumer_thread = consumer,
+ .measure = measure,
+ .report_progress = local_storage_report_progress,
+ .report_final = local_storage_report_final,
+};
+
+const struct bench bench_local_storage_cache_hashmap_control = {
+ .name = "local-storage-cache-hashmap-control",
+ .validate = validate,
+ .setup = hashmap_setup,
+ .producer_thread = producer,
+ .consumer_thread = consumer,
+ .measure = measure,
+ .report_progress = local_storage_report_progress,
+ .report_final = local_storage_report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c b/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c
new file mode 100644
index 000000000000..43f109d93130
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c
@@ -0,0 +1,281 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <argp.h>
+
+#include <sys/prctl.h>
+#include "local_storage_rcu_tasks_trace_bench.skel.h"
+#include "bench.h"
+
+#include <signal.h>
+
+static struct {
+ __u32 nr_procs;
+ __u32 kthread_pid;
+ bool quiet;
+} args = {
+ .nr_procs = 1000,
+ .kthread_pid = 0,
+ .quiet = false,
+};
+
+enum {
+ ARG_NR_PROCS = 7000,
+ ARG_KTHREAD_PID = 7001,
+ ARG_QUIET = 7002,
+};
+
+static const struct argp_option opts[] = {
+ { "nr_procs", ARG_NR_PROCS, "NR_PROCS", 0,
+ "Set number of user processes to spin up"},
+ { "kthread_pid", ARG_KTHREAD_PID, "PID", 0,
+ "Pid of rcu_tasks_trace kthread for ticks tracking"},
+ { "quiet", ARG_QUIET, "{0,1}", 0,
+ "If true, don't report progress"},
+ {},
+};
+
+static error_t parse_arg(int key, char *arg, struct argp_state *state)
+{
+ long ret;
+
+ switch (key) {
+ case ARG_NR_PROCS:
+ ret = strtol(arg, NULL, 10);
+ if (ret < 1 || ret > UINT_MAX) {
+ fprintf(stderr, "invalid nr_procs\n");
+ argp_usage(state);
+ }
+ args.nr_procs = ret;
+ break;
+ case ARG_KTHREAD_PID:
+ ret = strtol(arg, NULL, 10);
+ if (ret < 1) {
+ fprintf(stderr, "invalid kthread_pid\n");
+ argp_usage(state);
+ }
+ args.kthread_pid = ret;
+ break;
+ case ARG_QUIET:
+ ret = strtol(arg, NULL, 10);
+ if (ret < 0 || ret > 1) {
+ fprintf(stderr, "invalid quiet %ld\n", ret);
+ argp_usage(state);
+ }
+ args.quiet = ret;
+ break;
+break;
+ default:
+ return ARGP_ERR_UNKNOWN;
+ }
+
+ return 0;
+}
+
+const struct argp bench_local_storage_rcu_tasks_trace_argp = {
+ .options = opts,
+ .parser = parse_arg,
+};
+
+#define MAX_SLEEP_PROCS 150000
+
+static void validate(void)
+{
+ if (env.producer_cnt != 1) {
+ fprintf(stderr, "benchmark doesn't support multi-producer!\n");
+ exit(1);
+ }
+ if (env.consumer_cnt != 1) {
+ fprintf(stderr, "benchmark doesn't support multi-consumer!\n");
+ exit(1);
+ }
+
+ if (args.nr_procs > MAX_SLEEP_PROCS) {
+ fprintf(stderr, "benchmark supports up to %u sleeper procs!\n",
+ MAX_SLEEP_PROCS);
+ exit(1);
+ }
+}
+
+static long kthread_pid_ticks(void)
+{
+ char procfs_path[100];
+ long stime;
+ FILE *f;
+
+ if (!args.kthread_pid)
+ return -1;
+
+ sprintf(procfs_path, "/proc/%u/stat", args.kthread_pid);
+ f = fopen(procfs_path, "r");
+ if (!f) {
+ fprintf(stderr, "couldn't open %s, exiting\n", procfs_path);
+ goto err_out;
+ }
+ if (fscanf(f, "%*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %ld", &stime) != 1) {
+ fprintf(stderr, "fscanf of %s failed, exiting\n", procfs_path);
+ goto err_out;
+ }
+ fclose(f);
+ return stime;
+
+err_out:
+ if (f)
+ fclose(f);
+ exit(1);
+ return 0;
+}
+
+static struct {
+ struct local_storage_rcu_tasks_trace_bench *skel;
+ long prev_kthread_stime;
+} ctx;
+
+static void sleep_and_loop(void)
+{
+ while (true) {
+ sleep(rand() % 4);
+ syscall(__NR_getpgid);
+ }
+}
+
+static void local_storage_tasks_trace_setup(void)
+{
+ int i, err, forkret, runner_pid;
+
+ runner_pid = getpid();
+
+ for (i = 0; i < args.nr_procs; i++) {
+ forkret = fork();
+ if (forkret < 0) {
+ fprintf(stderr, "Error forking sleeper proc %u of %u, exiting\n", i,
+ args.nr_procs);
+ goto err_out;
+ }
+
+ if (!forkret) {
+ err = prctl(PR_SET_PDEATHSIG, SIGKILL);
+ if (err < 0) {
+ fprintf(stderr, "prctl failed with err %d, exiting\n", errno);
+ goto err_out;
+ }
+
+ if (getppid() != runner_pid) {
+ fprintf(stderr, "Runner died while spinning up procs, exiting\n");
+ goto err_out;
+ }
+ sleep_and_loop();
+ }
+ }
+ printf("Spun up %u procs (our pid %d)\n", args.nr_procs, runner_pid);
+
+ setup_libbpf();
+
+ ctx.skel = local_storage_rcu_tasks_trace_bench__open_and_load();
+ if (!ctx.skel) {
+ fprintf(stderr, "Error doing open_and_load, exiting\n");
+ goto err_out;
+ }
+
+ ctx.prev_kthread_stime = kthread_pid_ticks();
+
+ if (!bpf_program__attach(ctx.skel->progs.get_local)) {
+ fprintf(stderr, "Error attaching bpf program\n");
+ goto err_out;
+ }
+
+ if (!bpf_program__attach(ctx.skel->progs.pregp_step)) {
+ fprintf(stderr, "Error attaching bpf program\n");
+ goto err_out;
+ }
+
+ if (!bpf_program__attach(ctx.skel->progs.postgp)) {
+ fprintf(stderr, "Error attaching bpf program\n");
+ goto err_out;
+ }
+
+ return;
+err_out:
+ exit(1);
+}
+
+static void measure(struct bench_res *res)
+{
+ long ticks;
+
+ res->gp_ct = atomic_swap(&ctx.skel->bss->gp_hits, 0);
+ res->gp_ns = atomic_swap(&ctx.skel->bss->gp_times, 0);
+ ticks = kthread_pid_ticks();
+ res->stime = ticks - ctx.prev_kthread_stime;
+ ctx.prev_kthread_stime = ticks;
+}
+
+static void *consumer(void *input)
+{
+ return NULL;
+}
+
+static void *producer(void *input)
+{
+ while (true)
+ syscall(__NR_getpgid);
+ return NULL;
+}
+
+static void report_progress(int iter, struct bench_res *res, long delta_ns)
+{
+ if (ctx.skel->bss->unexpected) {
+ fprintf(stderr, "Error: Unexpected order of bpf prog calls (postgp after pregp).");
+ fprintf(stderr, "Data can't be trusted, exiting\n");
+ exit(1);
+ }
+
+ if (args.quiet)
+ return;
+
+ printf("Iter %d\t avg tasks_trace grace period latency\t%lf ns\n",
+ iter, res->gp_ns / (double)res->gp_ct);
+ printf("Iter %d\t avg ticks per tasks_trace grace period\t%lf\n",
+ iter, res->stime / (double)res->gp_ct);
+}
+
+static void report_final(struct bench_res res[], int res_cnt)
+{
+ struct basic_stats gp_stat;
+
+ grace_period_latency_basic_stats(res, res_cnt, &gp_stat);
+ printf("SUMMARY tasks_trace grace period latency");
+ printf("\tavg %.3lf us\tstddev %.3lf us\n", gp_stat.mean, gp_stat.stddev);
+ grace_period_ticks_basic_stats(res, res_cnt, &gp_stat);
+ printf("SUMMARY ticks per tasks_trace grace period");
+ printf("\tavg %.3lf\tstddev %.3lf\n", gp_stat.mean, gp_stat.stddev);
+}
+
+/* local-storage-tasks-trace: Benchmark performance of BPF local_storage's use
+ * of RCU Tasks-Trace.
+ *
+ * Stress RCU Tasks Trace by forking many tasks, all of which do no work aside
+ * from sleep() loop, and creating/destroying BPF task-local storage on wakeup.
+ * The number of forked tasks is configurable.
+ *
+ * exercising code paths which call call_rcu_tasks_trace while there are many
+ * thousands of tasks on the system should result in RCU Tasks-Trace having to
+ * do a noticeable amount of work.
+ *
+ * This should be observable by measuring rcu_tasks_trace_kthread CPU usage
+ * after the grace period has ended, or by measuring grace period latency.
+ *
+ * This benchmark uses both approaches, attaching to rcu_tasks_trace_pregp_step
+ * and rcu_tasks_trace_postgp functions to measure grace period latency and
+ * using /proc/PID/stat to measure rcu_tasks_trace_kthread kernel ticks
+ */
+const struct bench bench_local_storage_tasks_trace = {
+ .name = "local-storage-tasks-trace",
+ .validate = validate,
+ .setup = local_storage_tasks_trace_setup,
+ .producer_thread = producer,
+ .consumer_thread = consumer,
+ .measure = measure,
+ .report_progress = report_progress,
+ .report_final = report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh b/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh
new file mode 100755
index 000000000000..1e2de838f9fa
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source ./benchs/run_common.sh
+
+set -eufo pipefail
+
+nr_threads=`expr $(cat /proc/cpuinfo | grep "processor"| wc -l) - 1`
+summary=$($RUN_BENCH -p $nr_threads bpf-hashmap-ful-update)
+printf "$summary"
+printf "\n"
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_local_storage.sh b/tools/testing/selftests/bpf/benchs/run_bench_local_storage.sh
new file mode 100755
index 000000000000..2eb2b513a173
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/run_bench_local_storage.sh
@@ -0,0 +1,24 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source ./benchs/run_common.sh
+
+set -eufo pipefail
+
+header "Hashmap Control"
+for i in 10 1000 10000 100000 4194304; do
+subtitle "num keys: $i"
+ summarize_local_storage "hashmap (control) sequential get: "\
+ "$(./bench --nr_maps 1 --hashmap_nr_keys_used=$i local-storage-cache-hashmap-control)"
+ printf "\n"
+done
+
+header "Local Storage"
+for i in 1 10 16 17 24 32 100 1000; do
+subtitle "num_maps: $i"
+ summarize_local_storage "local_storage cache sequential get: "\
+ "$(./bench --nr_maps $i local-storage-cache-seq-get)"
+ summarize_local_storage "local_storage cache interleaved get: "\
+ "$(./bench --nr_maps $i local-storage-cache-int-get)"
+ printf "\n"
+done
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh b/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh
new file mode 100755
index 000000000000..5dac1f02892c
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+kthread_pid=`pgrep rcu_tasks_trace_kthread`
+
+if [ -z $kthread_pid ]; then
+ echo "error: Couldn't find rcu_tasks_trace_kthread"
+ exit 1
+fi
+
+./bench --nr_procs 15000 --kthread_pid $kthread_pid -d 600 --quiet 1 local-storage-tasks-trace
diff --git a/tools/testing/selftests/bpf/benchs/run_common.sh b/tools/testing/selftests/bpf/benchs/run_common.sh
index 6c5e6023a69f..d9f40af82006 100644
--- a/tools/testing/selftests/bpf/benchs/run_common.sh
+++ b/tools/testing/selftests/bpf/benchs/run_common.sh
@@ -41,6 +41,16 @@ function ops()
echo "$*" | sed -E "s/.*latency\s+([0-9]+\.[0-9]+\sns\/op).*/\1/"
}
+function local_storage()
+{
+ echo -n "hits throughput: "
+ echo -n "$*" | sed -E "s/.* hits throughput\s+([0-9]+\.[0-9]+ ± [0-9]+\.[0-9]+\sM\sops\/s).*/\1/"
+ echo -n -e ", hits latency: "
+ echo -n "$*" | sed -E "s/.* hits latency\s+([0-9]+\.[0-9]+\sns\/op).*/\1/"
+ echo -n ", important_hits throughput: "
+ echo "$*" | sed -E "s/.*important_hits throughput\s+([0-9]+\.[0-9]+ ± [0-9]+\.[0-9]+\sM\sops\/s).*/\1/"
+}
+
function total()
{
echo "$*" | sed -E "s/.*total operations\s+([0-9]+\.[0-9]+ ± [0-9]+\.[0-9]+M\/s).*/\1/"
@@ -67,6 +77,13 @@ function summarize_ops()
printf "%-20s %s\n" "$bench" "$(ops $summary)"
}
+function summarize_local_storage()
+{
+ bench="$1"
+ summary=$(echo $2 | tail -n1)
+ printf "%-20s %s\n" "$bench" "$(local_storage $summary)"
+}
+
function summarize_total()
{
bench="$1"
diff --git a/tools/testing/selftests/bpf/bpf_legacy.h b/tools/testing/selftests/bpf/bpf_legacy.h
index 719ab56cdb5d..845209581440 100644
--- a/tools/testing/selftests/bpf/bpf_legacy.h
+++ b/tools/testing/selftests/bpf/bpf_legacy.h
@@ -2,15 +2,6 @@
#ifndef __BPF_LEGACY__
#define __BPF_LEGACY__
-#define BPF_ANNOTATE_KV_PAIR(name, type_key, type_val) \
- struct ____btf_map_##name { \
- type_key key; \
- type_val value; \
- }; \
- struct ____btf_map_##name \
- __attribute__ ((section(".maps." #name), used)) \
- ____btf_map_##name = { }
-
/* llvm builtin functions that eBPF C program may use to
* emit BPF_LD_ABS and BPF_LD_IND instructions
*/
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index e585e1cefc77..792cb15bac40 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -148,13 +148,13 @@ static struct bin_attribute bin_attr_bpf_testmod_file __ro_after_init = {
.write = bpf_testmod_test_write,
};
-BTF_SET_START(bpf_testmod_check_kfunc_ids)
-BTF_ID(func, bpf_testmod_test_mod_kfunc)
-BTF_SET_END(bpf_testmod_check_kfunc_ids)
+BTF_SET8_START(bpf_testmod_check_kfunc_ids)
+BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc)
+BTF_SET8_END(bpf_testmod_check_kfunc_ids)
static const struct btf_kfunc_id_set bpf_testmod_kfunc_set = {
- .owner = THIS_MODULE,
- .check_set = &bpf_testmod_check_kfunc_ids,
+ .owner = THIS_MODULE,
+ .set = &bpf_testmod_check_kfunc_ids,
};
extern int bpf_fentry_test1(int a);
diff --git a/tools/testing/selftests/bpf/btf_helpers.c b/tools/testing/selftests/bpf/btf_helpers.c
index b5941d514e17..1c1c2c26690a 100644
--- a/tools/testing/selftests/bpf/btf_helpers.c
+++ b/tools/testing/selftests/bpf/btf_helpers.c
@@ -26,11 +26,12 @@ static const char * const btf_kind_str_mapping[] = {
[BTF_KIND_FLOAT] = "FLOAT",
[BTF_KIND_DECL_TAG] = "DECL_TAG",
[BTF_KIND_TYPE_TAG] = "TYPE_TAG",
+ [BTF_KIND_ENUM64] = "ENUM64",
};
static const char *btf_kind_str(__u16 kind)
{
- if (kind > BTF_KIND_TYPE_TAG)
+ if (kind > BTF_KIND_ENUM64)
return "UNKNOWN";
return btf_kind_str_mapping[kind];
}
@@ -139,14 +140,32 @@ int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id)
}
case BTF_KIND_ENUM: {
const struct btf_enum *v = btf_enum(t);
+ const char *fmt_str;
- fprintf(out, " size=%u vlen=%u", t->size, vlen);
+ fmt_str = btf_kflag(t) ? "\n\t'%s' val=%d" : "\n\t'%s' val=%u";
+ fprintf(out, " encoding=%s size=%u vlen=%u",
+ btf_kflag(t) ? "SIGNED" : "UNSIGNED", t->size, vlen);
for (i = 0; i < vlen; i++, v++) {
- fprintf(out, "\n\t'%s' val=%u",
+ fprintf(out, fmt_str,
btf_str(btf, v->name_off), v->val);
}
break;
}
+ case BTF_KIND_ENUM64: {
+ const struct btf_enum64 *v = btf_enum64(t);
+ const char *fmt_str;
+
+ fmt_str = btf_kflag(t) ? "\n\t'%s' val=%lld" : "\n\t'%s' val=%llu";
+
+ fprintf(out, " encoding=%s size=%u vlen=%u",
+ btf_kflag(t) ? "SIGNED" : "UNSIGNED", t->size, vlen);
+ for (i = 0; i < vlen; i++, v++) {
+ fprintf(out, fmt_str,
+ btf_str(btf, v->name_off),
+ ((__u64)v->val_hi32 << 32) | v->val_lo32);
+ }
+ break;
+ }
case BTF_KIND_FWD:
fprintf(out, " fwd_kind=%s", btf_kflag(t) ? "union" : "struct");
break;
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 3b3edc0fc8a6..fabf0c014349 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -1,59 +1,64 @@
+CONFIG_BLK_DEV_LOOP=y
CONFIG_BPF=y
-CONFIG_BPF_SYSCALL=y
-CONFIG_NET_CLS_BPF=m
CONFIG_BPF_EVENTS=y
-CONFIG_TEST_BPF=m
+CONFIG_BPF_JIT=y
+CONFIG_BPF_LIRC_MODE2=y
+CONFIG_BPF_LSM=y
+CONFIG_BPF_STREAM_PARSER=y
+CONFIG_BPF_SYSCALL=y
CONFIG_CGROUP_BPF=y
-CONFIG_NETDEVSIM=m
-CONFIG_NET_CLS_ACT=y
-CONFIG_NET_SCHED=y
-CONFIG_NET_SCH_INGRESS=y
-CONFIG_NET_IPIP=y
-CONFIG_IPV6=y
-CONFIG_NET_IPGRE_DEMUX=y
-CONFIG_NET_IPGRE=y
-CONFIG_IPV6_GRE=y
-CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_HMAC=m
CONFIG_CRYPTO_SHA256=m
-CONFIG_VXLAN=y
-CONFIG_GENEVE=y
-CONFIG_NET_CLS_FLOWER=m
-CONFIG_LWTUNNEL=y
-CONFIG_BPF_STREAM_PARSER=y
-CONFIG_XDP_SOCKETS=y
+CONFIG_CRYPTO_USER_API_HASH=m
+CONFIG_DYNAMIC_FTRACE=y
+CONFIG_FPROBE=y
CONFIG_FTRACE_SYSCALLS=y
-CONFIG_IPV6_TUNNEL=y
+CONFIG_FUNCTION_TRACER=y
+CONFIG_GENEVE=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_IMA=y
+CONFIG_IMA_READ_POLICY=y
+CONFIG_IMA_WRITE_POLICY=y
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_RAW=y
+CONFIG_IP_NF_TARGET_SYNPROXY=y
+CONFIG_IPV6=y
+CONFIG_IPV6_FOU=m
+CONFIG_IPV6_FOU_TUNNEL=m
CONFIG_IPV6_GRE=y
CONFIG_IPV6_SEG6_BPF=y
+CONFIG_IPV6_SIT=m
+CONFIG_IPV6_TUNNEL=y
+CONFIG_LIRC=y
+CONFIG_LWTUNNEL=y
+CONFIG_MPLS=y
+CONFIG_MPLS_IPTUNNEL=m
+CONFIG_MPLS_ROUTING=m
+CONFIG_MPTCP=y
+CONFIG_NET_CLS_ACT=y
+CONFIG_NET_CLS_BPF=y
+CONFIG_NET_CLS_FLOWER=m
CONFIG_NET_FOU=m
CONFIG_NET_FOU_IP_TUNNELS=y
-CONFIG_IPV6_FOU=m
-CONFIG_IPV6_FOU_TUNNEL=m
-CONFIG_MPLS=y
+CONFIG_NET_IPGRE=y
+CONFIG_NET_IPGRE_DEMUX=y
+CONFIG_NET_IPIP=y
CONFIG_NET_MPLS_GSO=m
-CONFIG_MPLS_ROUTING=m
-CONFIG_MPLS_IPTUNNEL=m
-CONFIG_IPV6_SIT=m
-CONFIG_BPF_JIT=y
-CONFIG_BPF_LSM=y
-CONFIG_SECURITY=y
-CONFIG_RC_CORE=y
-CONFIG_LIRC=y
-CONFIG_BPF_LIRC_MODE2=y
-CONFIG_IMA=y
-CONFIG_SECURITYFS=y
-CONFIG_IMA_WRITE_POLICY=y
-CONFIG_IMA_READ_POLICY=y
-CONFIG_BLK_DEV_LOOP=y
-CONFIG_FUNCTION_TRACER=y
-CONFIG_DYNAMIC_FTRACE=y
+CONFIG_NET_SCH_INGRESS=y
+CONFIG_NET_SCHED=y
+CONFIG_NETDEVSIM=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_SYNPROXY=y
+CONFIG_NETFILTER_XT_MATCH_STATE=y
+CONFIG_NETFILTER_XT_TARGET_CT=y
+CONFIG_NF_CONNTRACK=y
CONFIG_NF_DEFRAG_IPV4=y
CONFIG_NF_DEFRAG_IPV6=y
-CONFIG_NF_CONNTRACK=y
+CONFIG_RC_CORE=y
+CONFIG_SECURITY=y
+CONFIG_SECURITYFS=y
+CONFIG_TEST_BPF=m
CONFIG_USERFAULTFD=y
-CONFIG_FPROBE=y
-CONFIG_IKCONFIG=y
-CONFIG_IKCONFIG_PROC=y
-CONFIG_MPTCP=y
+CONFIG_VXLAN=y
+CONFIG_XDP_SOCKETS=y
diff --git a/tools/testing/selftests/bpf/config.s390x b/tools/testing/selftests/bpf/config.s390x
new file mode 100644
index 000000000000..f8a7a258a718
--- /dev/null
+++ b/tools/testing/selftests/bpf/config.s390x
@@ -0,0 +1,147 @@
+CONFIG_9P_FS=y
+CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
+CONFIG_AUDIT=y
+CONFIG_BLK_CGROUP=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BONDING=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_JIT_DEFAULT_ON=y
+CONFIG_BPF_PRELOAD=y
+CONFIG_BPF_PRELOAD_UMD=y
+CONFIG_BPFILTER=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CGROUP_NET_CLASSID=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_CGROUPS=y
+CONFIG_CHECKPOINT_RESTORE=y
+CONFIG_CPUSETS=y
+CONFIG_CRASH_DUMP=y
+CONFIG_CRYPTO_USER_API_RNG=y
+CONFIG_CRYPTO_USER_API_SKCIPHER=y
+CONFIG_DEBUG_ATOMIC_SLEEP=y
+CONFIG_DEBUG_INFO_BTF=y
+CONFIG_DEBUG_INFO_DWARF4=y
+CONFIG_DEBUG_LIST=y
+CONFIG_DEBUG_LOCKDEP=y
+CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_DEBUG_PAGEALLOC=y
+CONFIG_DEBUG_SECTION_MISMATCH=y
+CONFIG_DEBUG_SG=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEVTMPFS=y
+CONFIG_EXPERT=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+CONFIG_FANOTIFY=y
+CONFIG_FUNCTION_PROFILER=y
+CONFIG_GDB_SCRIPTS=y
+CONFIG_HAVE_EBPF_JIT=y
+CONFIG_HAVE_KPROBES=y
+CONFIG_HAVE_KPROBES_ON_FTRACE=y
+CONFIG_HAVE_KRETPROBES=y
+CONFIG_HAVE_MARCH_Z10_FEATURES=y
+CONFIG_HAVE_MARCH_Z196_FEATURES=y
+CONFIG_HEADERS_INSTALL=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_HUGETLBFS=y
+CONFIG_HW_RANDOM=y
+CONFIG_HZ_100=y
+CONFIG_IDLE_PAGE_TRACKING=y
+CONFIG_IKHEADERS=y
+CONFIG_INET6_ESP=y
+CONFIG_INET=y
+CONFIG_INET_ESP=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IPV6_SEG6_LWTUNNEL=y
+CONFIG_IPVLAN=y
+CONFIG_JUMP_LABEL=y
+CONFIG_KERNEL_UNCOMPRESSED=y
+CONFIG_KPROBES=y
+CONFIG_KPROBES_ON_FTRACE=y
+CONFIG_KRETPROBES=y
+CONFIG_KSM=y
+CONFIG_LATENCYTOP=y
+CONFIG_LIVEPATCH=y
+CONFIG_LOCK_STAT=y
+CONFIG_MACVLAN=y
+CONFIG_MACVTAP=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MARCH_Z196=y
+CONFIG_MARCH_Z196_TUNE=y
+CONFIG_MEMCG=y
+CONFIG_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTREMOVE=y
+CONFIG_MODULE_SIG=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULES=y
+CONFIG_NAMESPACES=y
+CONFIG_NET=y
+CONFIG_NET_9P=y
+CONFIG_NET_9P_VIRTIO=y
+CONFIG_NET_ACT_BPF=y
+CONFIG_NET_ACT_GACT=y
+CONFIG_NET_KEY=y
+CONFIG_NET_SCH_FQ=y
+CONFIG_NET_VRF=y
+CONFIG_NETDEVICES=y
+CONFIG_NETFILTER_XT_MATCH_BPF=y
+CONFIG_NETFILTER_XT_TARGET_MARK=y
+CONFIG_NF_TABLES=y
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NR_CPUS=256
+CONFIG_NUMA=y
+CONFIG_PACKET=y
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_PCI=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROFILING=y
+CONFIG_PROVE_LOCKING=y
+CONFIG_PTDUMP_DEBUGFS=y
+CONFIG_RC_DEVICES=y
+CONFIG_RC_LOOPBACK=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_SAMPLE_SECCOMP=y
+CONFIG_SAMPLES=y
+CONFIG_SCHED_TRACER=y
+CONFIG_SCSI=y
+CONFIG_SCSI_VIRTIO=y
+CONFIG_SECURITY_NETWORK=y
+CONFIG_STACK_TRACER=y
+CONFIG_STATIC_KEYS_SELFTEST=y
+CONFIG_SYSVIPC=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASKSTATS=y
+CONFIG_TCP_CONG_ADVANCED=y
+CONFIG_TCP_CONG_DCTCP=y
+CONFIG_TLS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
+CONFIG_TRANSPARENT_HUGEPAGE=y
+CONFIG_TUN=y
+CONFIG_UNIX=y
+CONFIG_UPROBES=y
+CONFIG_USELIB=y
+CONFIG_USER_NS=y
+CONFIG_VETH=y
+CONFIG_VIRTIO_BALLOON=y
+CONFIG_VIRTIO_BLK=y
+CONFIG_VIRTIO_NET=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VLAN_8021Q=y
+CONFIG_XFRM_USER=y
diff --git a/tools/testing/selftests/bpf/config.x86_64 b/tools/testing/selftests/bpf/config.x86_64
new file mode 100644
index 000000000000..f0859a1d37ab
--- /dev/null
+++ b/tools/testing/selftests/bpf/config.x86_64
@@ -0,0 +1,251 @@
+CONFIG_9P_FS=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
+CONFIG_AGP=y
+CONFIG_AGP_AMD64=y
+CONFIG_AGP_INTEL=y
+CONFIG_AGP_SIS=y
+CONFIG_AGP_VIA=y
+CONFIG_AMIGA_PARTITION=y
+CONFIG_AUDIT=y
+CONFIG_BACKLIGHT_CLASS_DEVICE=y
+CONFIG_BINFMT_MISC=y
+CONFIG_BLK_CGROUP=y
+CONFIG_BLK_CGROUP_IOLATENCY=y
+CONFIG_BLK_DEV_BSGLIB=y
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BLK_DEV_RAM_SIZE=16384
+CONFIG_BLK_DEV_THROTTLING=y
+CONFIG_BONDING=y
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
+CONFIG_BOOTTIME_TRACING=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_KPROBE_OVERRIDE=y
+CONFIG_BPF_PRELOAD=y
+CONFIG_BPF_PRELOAD_UMD=y
+CONFIG_BPFILTER=y
+CONFIG_BSD_DISKLABEL=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_CGROUPS=y
+CONFIG_CMA=y
+CONFIG_CMA_AREAS=7
+CONFIG_COMPAT_32BIT_TIME=y
+CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
+CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
+CONFIG_CPU_FREQ_GOV_ONDEMAND=y
+CONFIG_CPU_FREQ_GOV_USERSPACE=y
+CONFIG_CPU_FREQ_STAT=y
+CONFIG_CPU_IDLE_GOV_LADDER=y
+CONFIG_CPUSETS=y
+CONFIG_CRC_T10DIF=y
+CONFIG_CRYPTO_BLAKE2B=y
+CONFIG_CRYPTO_DEV_VIRTIO=m
+CONFIG_CRYPTO_SEQIV=y
+CONFIG_CRYPTO_XXHASH=y
+CONFIG_DCB=y
+CONFIG_DEBUG_ATOMIC_SLEEP=y
+CONFIG_DEBUG_CREDENTIALS=y
+CONFIG_DEBUG_INFO_BTF=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+CONFIG_DEBUG_MEMORY_INIT=y
+CONFIG_DEFAULT_FQ_CODEL=y
+CONFIG_DEFAULT_RENO=y
+CONFIG_DEFAULT_SECURITY_DAC=y
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DMA_CMA=y
+CONFIG_DNS_RESOLVER=y
+CONFIG_EFI=y
+CONFIG_EFI_STUB=y
+CONFIG_EXPERT=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+CONFIG_FAIL_FUNCTION=y
+CONFIG_FAULT_INJECTION=y
+CONFIG_FAULT_INJECTION_DEBUG_FS=y
+CONFIG_FB=y
+CONFIG_FB_MODE_HELPERS=y
+CONFIG_FB_TILEBLITTING=y
+CONFIG_FB_VESA=y
+CONFIG_FONT_8x16=y
+CONFIG_FONT_MINI_4x6=y
+CONFIG_FONTS=y
+CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
+CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
+CONFIG_FW_LOADER_USER_HELPER=y
+CONFIG_GART_IOMMU=y
+CONFIG_GENERIC_PHY=y
+CONFIG_HARDLOCKUP_DETECTOR=y
+CONFIG_HID_A4TECH=y
+CONFIG_HID_BELKIN=y
+CONFIG_HID_CHERRY=y
+CONFIG_HID_CYPRESS=y
+CONFIG_HID_DRAGONRISE=y
+CONFIG_HID_EZKEY=y
+CONFIG_HID_GREENASIA=y
+CONFIG_HID_GYRATION=y
+CONFIG_HID_KENSINGTON=y
+CONFIG_HID_KYE=y
+CONFIG_HID_MICROSOFT=y
+CONFIG_HID_MONTEREY=y
+CONFIG_HID_PANTHERLORD=y
+CONFIG_HID_PETALYNX=y
+CONFIG_HID_SMARTJOYPLUS=y
+CONFIG_HID_SUNPLUS=y
+CONFIG_HID_TOPSEED=y
+CONFIG_HID_TWINHAN=y
+CONFIG_HID_ZEROPLUS=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_HPET=y
+CONFIG_HUGETLBFS=y
+CONFIG_HWPOISON_INJECT=y
+CONFIG_HZ_1000=y
+CONFIG_INET=y
+CONFIG_INPUT_EVDEV=y
+CONFIG_INTEL_POWERCLAMP=y
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_MROUTE=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IPV6_MIP6=y
+CONFIG_IPV6_ROUTE_INFO=y
+CONFIG_IPV6_ROUTER_PREF=y
+CONFIG_IPV6_SEG6_LWTUNNEL=y
+CONFIG_IPV6_SUBTREES=y
+CONFIG_IRQ_POLL=y
+CONFIG_JUMP_LABEL=y
+CONFIG_KARMA_PARTITION=y
+CONFIG_KEXEC=y
+CONFIG_KPROBES=y
+CONFIG_KSM=y
+CONFIG_LEGACY_VSYSCALL_NONE=y
+CONFIG_LOG_BUF_SHIFT=21
+CONFIG_LOG_CPU_MAX_BUF_SHIFT=0
+CONFIG_LOGO=y
+CONFIG_LSM="selinux,bpf,integrity"
+CONFIG_MAC_PARTITION=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MCORE2=y
+CONFIG_MEMCG=y
+CONFIG_MEMORY_FAILURE=y
+CONFIG_MINIX_SUBPARTITION=y
+CONFIG_MODULE_SIG=y
+CONFIG_MODULE_SRCVERSION_ALL=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULES=y
+CONFIG_MODVERSIONS=y
+CONFIG_NAMESPACES=y
+CONFIG_NET=y
+CONFIG_NET_9P=y
+CONFIG_NET_9P_VIRTIO=y
+CONFIG_NET_ACT_BPF=y
+CONFIG_NET_CLS_CGROUP=y
+CONFIG_NET_EMATCH=y
+CONFIG_NET_IPGRE_BROADCAST=y
+CONFIG_NET_L3_MASTER_DEV=y
+CONFIG_NET_SCH_DEFAULT=y
+CONFIG_NET_SCH_FQ_CODEL=y
+CONFIG_NET_TC_SKB_EXT=y
+CONFIG_NET_VRF=y
+CONFIG_NETDEVICES=y
+CONFIG_NETFILTER_NETLINK_LOG=y
+CONFIG_NETFILTER_NETLINK_QUEUE=y
+CONFIG_NETFILTER_XT_MATCH_BPF=y
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
+CONFIG_NETLABEL=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NO_HZ=y
+CONFIG_NR_CPUS=128
+CONFIG_NUMA=y
+CONFIG_NUMA_BALANCING=y
+CONFIG_NVMEM=y
+CONFIG_OSF_PARTITION=y
+CONFIG_PACKET=y
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_PCI=y
+CONFIG_PCI_IOV=y
+CONFIG_PCI_MSI=y
+CONFIG_PCIEPORTBUS=y
+CONFIG_PHYSICAL_ALIGN=0x1000000
+CONFIG_POSIX_MQUEUE=y
+CONFIG_POWER_SUPPLY=y
+CONFIG_PREEMPT=y
+CONFIG_PRINTK_TIME=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROFILING=y
+CONFIG_PROVE_LOCKING=y
+CONFIG_PTP_1588_CLOCK=y
+CONFIG_RC_DEVICES=y
+CONFIG_RC_LOOPBACK=y
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+CONFIG_SCHED_STACK_END_CHECK=y
+CONFIG_SCHEDSTATS=y
+CONFIG_SECURITY_NETWORK=y
+CONFIG_SECURITY_SELINUX=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_DETECT_IRQ=y
+CONFIG_SERIAL_8250_EXTENDED=y
+CONFIG_SERIAL_8250_MANY_PORTS=y
+CONFIG_SERIAL_8250_NR_UARTS=32
+CONFIG_SERIAL_8250_RSA=y
+CONFIG_SERIAL_8250_SHARE_IRQ=y
+CONFIG_SERIAL_NONSTANDARD=y
+CONFIG_SERIO_LIBPS2=y
+CONFIG_SGI_PARTITION=y
+CONFIG_SMP=y
+CONFIG_SOLARIS_X86_PARTITION=y
+CONFIG_SUN_PARTITION=y
+CONFIG_SYNC_FILE=y
+CONFIG_SYSVIPC=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASKSTATS=y
+CONFIG_TCP_CONG_ADVANCED=y
+CONFIG_TCP_MD5SIG=y
+CONFIG_TLS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TRANSPARENT_HUGEPAGE=y
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
+CONFIG_TUN=y
+CONFIG_UNIX=y
+CONFIG_UNIXWARE_DISKLABEL=y
+CONFIG_USER_NS=y
+CONFIG_VALIDATE_FS_PARSER=y
+CONFIG_VETH=y
+CONFIG_VIRT_DRIVERS=y
+CONFIG_VIRTIO_BALLOON=y
+CONFIG_VIRTIO_BLK=y
+CONFIG_VIRTIO_CONSOLE=y
+CONFIG_VIRTIO_NET=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VLAN_8021Q=y
+CONFIG_X86_ACPI_CPUFREQ=y
+CONFIG_X86_CPUID=y
+CONFIG_X86_MSR=y
+CONFIG_X86_POWERNOW_K8=y
+CONFIG_XDP_SOCKETS_DIAG=y
+CONFIG_XFRM_SUB_POLICY=y
+CONFIG_XFRM_USER=y
+CONFIG_ZEROPLUS_FF=y
diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index 59cf81ec55af..bec15558fd93 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -436,7 +436,7 @@ struct nstoken *open_netns(const char *name)
int err;
struct nstoken *token;
- token = malloc(sizeof(struct nstoken));
+ token = calloc(1, sizeof(struct nstoken));
if (!ASSERT_OK_PTR(token, "malloc token"))
return NULL;
diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
index 08c0601b3e84..0b899d2d8ea7 100644
--- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c
+++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
@@ -17,6 +17,14 @@ static void trigger_func2(void)
asm volatile ("");
}
+/* attach point for byname sleepable uprobe */
+static void trigger_func3(void)
+{
+ asm volatile ("");
+}
+
+static char test_data[] = "test_data";
+
void test_attach_probe(void)
{
DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
@@ -49,9 +57,17 @@ void test_attach_probe(void)
if (!ASSERT_GE(ref_ctr_offset, 0, "ref_ctr_offset"))
return;
- skel = test_attach_probe__open_and_load();
+ skel = test_attach_probe__open();
if (!ASSERT_OK_PTR(skel, "skel_open"))
return;
+
+ /* sleepable kprobe test case needs flags set before loading */
+ if (!ASSERT_OK(bpf_program__set_flags(skel->progs.handle_kprobe_sleepable,
+ BPF_F_SLEEPABLE), "kprobe_sleepable_flags"))
+ goto cleanup;
+
+ if (!ASSERT_OK(test_attach_probe__load(skel), "skel_load"))
+ goto cleanup;
if (!ASSERT_OK_PTR(skel->bss, "check_bss"))
goto cleanup;
@@ -151,6 +167,30 @@ void test_attach_probe(void)
if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname2, "attach_uretprobe_byname2"))
goto cleanup;
+ /* sleepable kprobes should not attach successfully */
+ skel->links.handle_kprobe_sleepable = bpf_program__attach(skel->progs.handle_kprobe_sleepable);
+ if (!ASSERT_ERR_PTR(skel->links.handle_kprobe_sleepable, "attach_kprobe_sleepable"))
+ goto cleanup;
+
+ /* test sleepable uprobe and uretprobe variants */
+ skel->links.handle_uprobe_byname3_sleepable = bpf_program__attach(skel->progs.handle_uprobe_byname3_sleepable);
+ if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname3_sleepable, "attach_uprobe_byname3_sleepable"))
+ goto cleanup;
+
+ skel->links.handle_uprobe_byname3 = bpf_program__attach(skel->progs.handle_uprobe_byname3);
+ if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname3, "attach_uprobe_byname3"))
+ goto cleanup;
+
+ skel->links.handle_uretprobe_byname3_sleepable = bpf_program__attach(skel->progs.handle_uretprobe_byname3_sleepable);
+ if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname3_sleepable, "attach_uretprobe_byname3_sleepable"))
+ goto cleanup;
+
+ skel->links.handle_uretprobe_byname3 = bpf_program__attach(skel->progs.handle_uretprobe_byname3);
+ if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname3, "attach_uretprobe_byname3"))
+ goto cleanup;
+
+ skel->bss->user_ptr = test_data;
+
/* trigger & validate kprobe && kretprobe */
usleep(1);
@@ -164,6 +204,9 @@ void test_attach_probe(void)
/* trigger & validate uprobe attached by name */
trigger_func2();
+ /* trigger & validate sleepable uprobe attached by name */
+ trigger_func3();
+
ASSERT_EQ(skel->bss->kprobe_res, 1, "check_kprobe_res");
ASSERT_EQ(skel->bss->kprobe2_res, 11, "check_kprobe_auto_res");
ASSERT_EQ(skel->bss->kretprobe_res, 2, "check_kretprobe_res");
@@ -174,6 +217,10 @@ void test_attach_probe(void)
ASSERT_EQ(skel->bss->uretprobe_byname_res, 6, "check_uretprobe_byname_res");
ASSERT_EQ(skel->bss->uprobe_byname2_res, 7, "check_uprobe_byname2_res");
ASSERT_EQ(skel->bss->uretprobe_byname2_res, 8, "check_uretprobe_byname2_res");
+ ASSERT_EQ(skel->bss->uprobe_byname3_sleepable_res, 9, "check_uprobe_byname3_sleepable_res");
+ ASSERT_EQ(skel->bss->uprobe_byname3_res, 10, "check_uprobe_byname3_res");
+ ASSERT_EQ(skel->bss->uretprobe_byname3_sleepable_res, 11, "check_uretprobe_byname3_sleepable_res");
+ ASSERT_EQ(skel->bss->uretprobe_byname3_res, 12, "check_uretprobe_byname3_res");
cleanup:
test_attach_probe__destroy(skel);
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c
index 7ff5fa93d056..a33874b081b6 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c
@@ -27,6 +27,7 @@
#include "bpf_iter_test_kern5.skel.h"
#include "bpf_iter_test_kern6.skel.h"
#include "bpf_iter_bpf_link.skel.h"
+#include "bpf_iter_ksym.skel.h"
static int duration;
@@ -1120,6 +1121,19 @@ static void test_link_iter(void)
bpf_iter_bpf_link__destroy(skel);
}
+static void test_ksym_iter(void)
+{
+ struct bpf_iter_ksym *skel;
+
+ skel = bpf_iter_ksym__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "bpf_iter_ksym__open_and_load"))
+ return;
+
+ do_dummy_read(skel->progs.dump_ksym);
+
+ bpf_iter_ksym__destroy(skel);
+}
+
#define CMP_BUFFER_SIZE 1024
static char task_vma_output[CMP_BUFFER_SIZE];
static char proc_maps_output[CMP_BUFFER_SIZE];
@@ -1267,4 +1281,6 @@ void test_bpf_iter(void)
test_buf_neg_offset();
if (test__start_subtest("link-iter"))
test_link_iter();
+ if (test__start_subtest("ksym"))
+ test_ksym_iter();
}
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
index 380d7a2072e3..4cd8a25afe68 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_loop.c
@@ -120,6 +120,64 @@ static void check_nested_calls(struct bpf_loop *skel)
bpf_link__destroy(link);
}
+static void check_non_constant_callback(struct bpf_loop *skel)
+{
+ struct bpf_link *link =
+ bpf_program__attach(skel->progs.prog_non_constant_callback);
+
+ if (!ASSERT_OK_PTR(link, "link"))
+ return;
+
+ skel->bss->callback_selector = 0x0F;
+ usleep(1);
+ ASSERT_EQ(skel->bss->g_output, 0x0F, "g_output #1");
+
+ skel->bss->callback_selector = 0xF0;
+ usleep(1);
+ ASSERT_EQ(skel->bss->g_output, 0xF0, "g_output #2");
+
+ bpf_link__destroy(link);
+}
+
+static void check_stack(struct bpf_loop *skel)
+{
+ struct bpf_link *link = bpf_program__attach(skel->progs.stack_check);
+ const int max_key = 12;
+ int key;
+ int map_fd;
+
+ if (!ASSERT_OK_PTR(link, "link"))
+ return;
+
+ map_fd = bpf_map__fd(skel->maps.map1);
+
+ if (!ASSERT_GE(map_fd, 0, "bpf_map__fd"))
+ goto out;
+
+ for (key = 1; key <= max_key; ++key) {
+ int val = key;
+ int err = bpf_map_update_elem(map_fd, &key, &val, BPF_NOEXIST);
+
+ if (!ASSERT_OK(err, "bpf_map_update_elem"))
+ goto out;
+ }
+
+ usleep(1);
+
+ for (key = 1; key <= max_key; ++key) {
+ int val;
+ int err = bpf_map_lookup_elem(map_fd, &key, &val);
+
+ if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+ goto out;
+ if (!ASSERT_EQ(val, key + 1, "bad value in the map"))
+ goto out;
+ }
+
+out:
+ bpf_link__destroy(link);
+}
+
void test_bpf_loop(void)
{
struct bpf_loop *skel;
@@ -140,6 +198,10 @@ void test_bpf_loop(void)
check_invalid_flags(skel);
if (test__start_subtest("check_nested_calls"))
check_nested_calls(skel);
+ if (test__start_subtest("check_non_constant_callback"))
+ check_non_constant_callback(skel);
+ if (test__start_subtest("check_stack"))
+ check_stack(skel);
bpf_loop__destroy(skel);
}
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
index dd30b1e3a67c..7a74a1579076 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
@@ -2,13 +2,29 @@
#include <test_progs.h>
#include <network_helpers.h>
#include "test_bpf_nf.skel.h"
+#include "test_bpf_nf_fail.skel.h"
+
+static char log_buf[1024 * 1024];
+
+struct {
+ const char *prog_name;
+ const char *err_msg;
+} test_bpf_nf_fail_tests[] = {
+ { "alloc_release", "kernel function bpf_ct_release args#0 expected pointer to STRUCT nf_conn but" },
+ { "insert_insert", "kernel function bpf_ct_insert_entry args#0 expected pointer to STRUCT nf_conn___init but" },
+ { "lookup_insert", "kernel function bpf_ct_insert_entry args#0 expected pointer to STRUCT nf_conn___init but" },
+ { "set_timeout_after_insert", "kernel function bpf_ct_set_timeout args#0 expected pointer to STRUCT nf_conn___init but" },
+ { "set_status_after_insert", "kernel function bpf_ct_set_status args#0 expected pointer to STRUCT nf_conn___init but" },
+ { "change_timeout_after_alloc", "kernel function bpf_ct_change_timeout args#0 expected pointer to STRUCT nf_conn but" },
+ { "change_status_after_alloc", "kernel function bpf_ct_change_status args#0 expected pointer to STRUCT nf_conn but" },
+};
enum {
TEST_XDP,
TEST_TC_BPF,
};
-void test_bpf_nf_ct(int mode)
+static void test_bpf_nf_ct(int mode)
{
struct test_bpf_nf *skel;
int prog_fd, err;
@@ -39,14 +55,60 @@ void test_bpf_nf_ct(int mode)
ASSERT_EQ(skel->bss->test_enonet_netns_id, -ENONET, "Test ENONET for bad but valid netns_id");
ASSERT_EQ(skel->bss->test_enoent_lookup, -ENOENT, "Test ENOENT for failed lookup");
ASSERT_EQ(skel->bss->test_eafnosupport, -EAFNOSUPPORT, "Test EAFNOSUPPORT for invalid len__tuple");
+ ASSERT_EQ(skel->data->test_alloc_entry, 0, "Test for alloc new entry");
+ ASSERT_EQ(skel->data->test_insert_entry, 0, "Test for insert new entry");
+ ASSERT_EQ(skel->data->test_succ_lookup, 0, "Test for successful lookup");
+ /* allow some tolerance for test_delta_timeout value to avoid races. */
+ ASSERT_GT(skel->bss->test_delta_timeout, 8, "Test for min ct timeout update");
+ ASSERT_LE(skel->bss->test_delta_timeout, 10, "Test for max ct timeout update");
+ /* expected status is IPS_SEEN_REPLY */
+ ASSERT_EQ(skel->bss->test_status, 2, "Test for ct status update ");
end:
test_bpf_nf__destroy(skel);
}
+static void test_bpf_nf_ct_fail(const char *prog_name, const char *err_msg)
+{
+ LIBBPF_OPTS(bpf_object_open_opts, opts, .kernel_log_buf = log_buf,
+ .kernel_log_size = sizeof(log_buf),
+ .kernel_log_level = 1);
+ struct test_bpf_nf_fail *skel;
+ struct bpf_program *prog;
+ int ret;
+
+ skel = test_bpf_nf_fail__open_opts(&opts);
+ if (!ASSERT_OK_PTR(skel, "test_bpf_nf_fail__open"))
+ return;
+
+ prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+ if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+ goto end;
+
+ bpf_program__set_autoload(prog, true);
+
+ ret = test_bpf_nf_fail__load(skel);
+ if (!ASSERT_ERR(ret, "test_bpf_nf_fail__load must fail"))
+ goto end;
+
+ if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) {
+ fprintf(stderr, "Expected: %s\n", err_msg);
+ fprintf(stderr, "Verifier: %s\n", log_buf);
+ }
+
+end:
+ test_bpf_nf_fail__destroy(skel);
+}
+
void test_bpf_nf(void)
{
+ int i;
if (test__start_subtest("xdp-ct"))
test_bpf_nf_ct(TEST_XDP);
if (test__start_subtest("tc-bpf-ct"))
test_bpf_nf_ct(TEST_TC_BPF);
+ for (i = 0; i < ARRAY_SIZE(test_bpf_nf_fail_tests); i++) {
+ if (test__start_subtest(test_bpf_nf_fail_tests[i].prog_name))
+ test_bpf_nf_ct_fail(test_bpf_nf_fail_tests[i].prog_name,
+ test_bpf_nf_fail_tests[i].err_msg);
+ }
}
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index e9a9a31b2ffe..2959a52ced06 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -9,6 +9,9 @@
#include "bpf_cubic.skel.h"
#include "bpf_tcp_nogpl.skel.h"
#include "bpf_dctcp_release.skel.h"
+#include "tcp_ca_write_sk_pacing.skel.h"
+#include "tcp_ca_incompl_cong_ops.skel.h"
+#include "tcp_ca_unsupp_cong_op.skel.h"
#ifndef ENOTSUPP
#define ENOTSUPP 524
@@ -322,6 +325,58 @@ static void test_rel_setsockopt(void)
bpf_dctcp_release__destroy(rel_skel);
}
+static void test_write_sk_pacing(void)
+{
+ struct tcp_ca_write_sk_pacing *skel;
+ struct bpf_link *link;
+
+ skel = tcp_ca_write_sk_pacing__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ return;
+
+ link = bpf_map__attach_struct_ops(skel->maps.write_sk_pacing);
+ ASSERT_OK_PTR(link, "attach_struct_ops");
+
+ bpf_link__destroy(link);
+ tcp_ca_write_sk_pacing__destroy(skel);
+}
+
+static void test_incompl_cong_ops(void)
+{
+ struct tcp_ca_incompl_cong_ops *skel;
+ struct bpf_link *link;
+
+ skel = tcp_ca_incompl_cong_ops__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ return;
+
+ /* That cong_avoid() and cong_control() are missing is only reported at
+ * this point:
+ */
+ link = bpf_map__attach_struct_ops(skel->maps.incompl_cong_ops);
+ ASSERT_ERR_PTR(link, "attach_struct_ops");
+
+ bpf_link__destroy(link);
+ tcp_ca_incompl_cong_ops__destroy(skel);
+}
+
+static void test_unsupp_cong_op(void)
+{
+ libbpf_print_fn_t old_print_fn;
+ struct tcp_ca_unsupp_cong_op *skel;
+
+ err_str = "attach to unsupported member get_info";
+ found = false;
+ old_print_fn = libbpf_set_print(libbpf_debug_print);
+
+ skel = tcp_ca_unsupp_cong_op__open_and_load();
+ ASSERT_NULL(skel, "open_and_load");
+ ASSERT_EQ(found, true, "expected_err_msg");
+
+ tcp_ca_unsupp_cong_op__destroy(skel);
+ libbpf_set_print(old_print_fn);
+}
+
void test_bpf_tcp_ca(void)
{
if (test__start_subtest("dctcp"))
@@ -334,4 +389,10 @@ void test_bpf_tcp_ca(void)
test_dctcp_fallback();
if (test__start_subtest("rel_setsockopt"))
test_rel_setsockopt();
+ if (test__start_subtest("write_sk_pacing"))
+ test_write_sk_pacing();
+ if (test__start_subtest("incompl_cong_ops"))
+ test_incompl_cong_ops();
+ if (test__start_subtest("unsupp_cong_op"))
+ test_unsupp_cong_op();
}
diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
index ba5bde53d418..ef6528b8084c 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf.c
@@ -34,7 +34,6 @@ static bool always_log;
#undef CHECK
#define CHECK(condition, format...) _CHECK(condition, "check", duration, format)
-#define BTF_END_RAW 0xdeadbeef
#define NAME_TBD 0xdeadb33f
#define NAME_NTH(N) (0xfffe0000 | N)
@@ -2897,26 +2896,6 @@ static struct btf_raw_test raw_tests[] = {
},
{
- .descr = "invalid enum kind_flag",
- .raw_types = {
- BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
- BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_ENUM, 1, 1), 4), /* [2] */
- BTF_ENUM_ENC(NAME_TBD, 0),
- BTF_END_RAW,
- },
- BTF_STR_SEC("\0A"),
- .map_type = BPF_MAP_TYPE_ARRAY,
- .map_name = "enum_type_check_btf",
- .key_size = sizeof(int),
- .value_size = sizeof(int),
- .key_type_id = 1,
- .value_type_id = 1,
- .max_entries = 4,
- .btf_load_err = true,
- .err_str = "Invalid btf_info kind_flag",
-},
-
-{
.descr = "valid fwd kind_flag",
.raw_types = {
BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
@@ -4072,6 +4051,42 @@ static struct btf_raw_test raw_tests[] = {
.btf_load_err = true,
.err_str = "Type tags don't precede modifiers",
},
+{
+ .descr = "enum64 test #1, unsigned, size 8",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 2), 8), /* [2] */
+ BTF_ENUM64_ENC(NAME_TBD, 0, 0),
+ BTF_ENUM64_ENC(NAME_TBD, 1, 1),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0a\0b\0c"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "tag_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = 8,
+ .key_type_id = 1,
+ .value_type_id = 2,
+ .max_entries = 1,
+},
+{
+ .descr = "enum64 test #2, signed, size 4",
+ .raw_types = {
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+ BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_ENUM64, 1, 2), 4), /* [2] */
+ BTF_ENUM64_ENC(NAME_TBD, -1, 0),
+ BTF_ENUM64_ENC(NAME_TBD, 1, 0),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0a\0b\0c"),
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .map_name = "tag_type_check_btf",
+ .key_size = sizeof(int),
+ .value_size = 4,
+ .key_type_id = 1,
+ .value_type_id = 2,
+ .max_entries = 1,
+},
}; /* struct btf_raw_test raw_tests[] */
@@ -4636,7 +4651,6 @@ struct btf_file_test {
};
static struct btf_file_test file_tests[] = {
- { .file = "test_btf_haskv.o", },
{ .file = "test_btf_newkv.o", },
{ .file = "test_btf_nokv.o", .btf_kv_notfound = true, },
};
@@ -5324,7 +5338,7 @@ static void do_test_pprint(int test_num)
ret = snprintf(pin_path, sizeof(pin_path), "%s/%s",
"/sys/fs/bpf", test->map_name);
- if (CHECK(ret == sizeof(pin_path), "pin_path %s/%s is too long",
+ if (CHECK(ret >= sizeof(pin_path), "pin_path %s/%s is too long",
"/sys/fs/bpf", test->map_name)) {
err = -1;
goto done;
@@ -7000,9 +7014,12 @@ static struct btf_dedup_test dedup_tests[] = {
BTF_DECL_TAG_ENC(NAME_TBD, 13, 1), /* [16] decl_tag */
BTF_DECL_TAG_ENC(NAME_TBD, 7, -1), /* [17] decl_tag */
BTF_TYPE_TAG_ENC(NAME_TBD, 8), /* [18] type_tag */
+ BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 2), 8), /* [19] enum64 */
+ BTF_ENUM64_ENC(NAME_TBD, 0, 0),
+ BTF_ENUM64_ENC(NAME_TBD, 1, 1),
BTF_END_RAW,
},
- BTF_STR_SEC("\0A\0B\0C\0D\0E\0F\0G\0H\0I\0J\0K\0L\0M\0N\0O\0P\0Q\0R"),
+ BTF_STR_SEC("\0A\0B\0C\0D\0E\0F\0G\0H\0I\0J\0K\0L\0M\0N\0O\0P\0Q\0R\0S\0T\0U"),
},
.expect = {
.raw_types = {
@@ -7030,9 +7047,12 @@ static struct btf_dedup_test dedup_tests[] = {
BTF_DECL_TAG_ENC(NAME_TBD, 13, 1), /* [16] decl_tag */
BTF_DECL_TAG_ENC(NAME_TBD, 7, -1), /* [17] decl_tag */
BTF_TYPE_TAG_ENC(NAME_TBD, 8), /* [18] type_tag */
+ BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 2), 8), /* [19] enum64 */
+ BTF_ENUM64_ENC(NAME_TBD, 0, 0),
+ BTF_ENUM64_ENC(NAME_TBD, 1, 1),
BTF_END_RAW,
},
- BTF_STR_SEC("\0A\0B\0C\0D\0E\0F\0G\0H\0I\0J\0K\0L\0M\0N\0O\0P\0Q\0R"),
+ BTF_STR_SEC("\0A\0B\0C\0D\0E\0F\0G\0H\0I\0J\0K\0L\0M\0N\0O\0P\0Q\0R\0S\0T\0U"),
},
},
{
@@ -7493,6 +7513,91 @@ static struct btf_dedup_test dedup_tests[] = {
BTF_STR_SEC("\0tag1\0t\0m"),
},
},
+{
+ .descr = "dedup: enum64, standalone",
+ .input = {
+ .raw_types = {
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 123),
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 123),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0e1\0e1_val"),
+ },
+ .expect = {
+ .raw_types = {
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 123),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0e1\0e1_val"),
+ },
+},
+{
+ .descr = "dedup: enum64, fwd resolution",
+ .input = {
+ .raw_types = {
+ /* [1] fwd enum64 'e1' before full enum */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 0), 8),
+ /* [2] full enum64 'e1' after fwd */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 123),
+ /* [3] full enum64 'e2' before fwd */
+ BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(4), 0, 456),
+ /* [4] fwd enum64 'e2' after full enum */
+ BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 0), 8),
+ /* [5] incompatible full enum64 with different value */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 0, 321),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0e1\0e1_val\0e2\0e2_val"),
+ },
+ .expect = {
+ .raw_types = {
+ /* [1] full enum64 'e1' */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 123),
+ /* [2] full enum64 'e2' */
+ BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(4), 0, 456),
+ /* [3] incompatible full enum64 with different value */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8),
+ BTF_ENUM64_ENC(NAME_NTH(2), 0, 321),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0e1\0e1_val\0e2\0e2_val"),
+ },
+},
+{
+ .descr = "dedup: enum and enum64, no dedup",
+ .input = {
+ .raw_types = {
+ /* [1] enum 'e1' */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+ BTF_ENUM_ENC(NAME_NTH(2), 1),
+ /* [2] enum64 'e1' */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 4),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 0),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0e1\0e1_val"),
+ },
+ .expect = {
+ .raw_types = {
+ /* [1] enum 'e1' */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+ BTF_ENUM_ENC(NAME_NTH(2), 1),
+ /* [2] enum64 'e1' */
+ BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 4),
+ BTF_ENUM64_ENC(NAME_NTH(2), 1, 0),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0e1\0e1_val"),
+ },
+},
};
@@ -7517,6 +7622,8 @@ static int btf_type_size(const struct btf_type *t)
return base_size + sizeof(__u32);
case BTF_KIND_ENUM:
return base_size + vlen * sizeof(struct btf_enum);
+ case BTF_KIND_ENUM64:
+ return base_size + vlen * sizeof(struct btf_enum64);
case BTF_KIND_ARRAY:
return base_size + sizeof(struct btf_array);
case BTF_KIND_STRUCT:
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c
index addf99c05896..6e36de1302fc 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_write.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c
@@ -9,6 +9,7 @@ static void gen_btf(struct btf *btf)
const struct btf_var_secinfo *vi;
const struct btf_type *t;
const struct btf_member *m;
+ const struct btf_enum64 *v64;
const struct btf_enum *v;
const struct btf_param *p;
int id, err, str_off;
@@ -171,7 +172,7 @@ static void gen_btf(struct btf *btf)
ASSERT_STREQ(btf__str_by_offset(btf, v->name_off), "v2", "v2_name");
ASSERT_EQ(v->val, 2, "v2_val");
ASSERT_STREQ(btf_type_raw_dump(btf, 9),
- "[9] ENUM 'e1' size=4 vlen=2\n"
+ "[9] ENUM 'e1' encoding=UNSIGNED size=4 vlen=2\n"
"\t'v1' val=1\n"
"\t'v2' val=2", "raw_dump");
@@ -202,7 +203,7 @@ static void gen_btf(struct btf *btf)
ASSERT_EQ(btf_vlen(t), 0, "enum_fwd_kind");
ASSERT_EQ(t->size, 4, "enum_fwd_sz");
ASSERT_STREQ(btf_type_raw_dump(btf, 12),
- "[12] ENUM 'enum_fwd' size=4 vlen=0", "raw_dump");
+ "[12] ENUM 'enum_fwd' encoding=UNSIGNED size=4 vlen=0", "raw_dump");
/* TYPEDEF */
id = btf__add_typedef(btf, "typedef1", 1);
@@ -307,6 +308,48 @@ static void gen_btf(struct btf *btf)
ASSERT_EQ(t->type, 1, "tag_type");
ASSERT_STREQ(btf_type_raw_dump(btf, 20),
"[20] TYPE_TAG 'tag1' type_id=1", "raw_dump");
+
+ /* ENUM64 */
+ id = btf__add_enum64(btf, "e1", 8, true);
+ ASSERT_EQ(id, 21, "enum64_id");
+ err = btf__add_enum64_value(btf, "v1", -1);
+ ASSERT_OK(err, "v1_res");
+ err = btf__add_enum64_value(btf, "v2", 0x123456789); /* 4886718345 */
+ ASSERT_OK(err, "v2_res");
+ t = btf__type_by_id(btf, 21);
+ ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "e1", "enum64_name");
+ ASSERT_EQ(btf_kind(t), BTF_KIND_ENUM64, "enum64_kind");
+ ASSERT_EQ(btf_vlen(t), 2, "enum64_vlen");
+ ASSERT_EQ(t->size, 8, "enum64_sz");
+ v64 = btf_enum64(t) + 0;
+ ASSERT_STREQ(btf__str_by_offset(btf, v64->name_off), "v1", "v1_name");
+ ASSERT_EQ(v64->val_hi32, 0xffffffff, "v1_val");
+ ASSERT_EQ(v64->val_lo32, 0xffffffff, "v1_val");
+ v64 = btf_enum64(t) + 1;
+ ASSERT_STREQ(btf__str_by_offset(btf, v64->name_off), "v2", "v2_name");
+ ASSERT_EQ(v64->val_hi32, 0x1, "v2_val");
+ ASSERT_EQ(v64->val_lo32, 0x23456789, "v2_val");
+ ASSERT_STREQ(btf_type_raw_dump(btf, 21),
+ "[21] ENUM64 'e1' encoding=SIGNED size=8 vlen=2\n"
+ "\t'v1' val=-1\n"
+ "\t'v2' val=4886718345", "raw_dump");
+
+ id = btf__add_enum64(btf, "e1", 8, false);
+ ASSERT_EQ(id, 22, "enum64_id");
+ err = btf__add_enum64_value(btf, "v1", 0xffffffffFFFFFFFF); /* 18446744073709551615 */
+ ASSERT_OK(err, "v1_res");
+ t = btf__type_by_id(btf, 22);
+ ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "e1", "enum64_name");
+ ASSERT_EQ(btf_kind(t), BTF_KIND_ENUM64, "enum64_kind");
+ ASSERT_EQ(btf_vlen(t), 1, "enum64_vlen");
+ ASSERT_EQ(t->size, 8, "enum64_sz");
+ v64 = btf_enum64(t) + 0;
+ ASSERT_STREQ(btf__str_by_offset(btf, v64->name_off), "v1", "v1_name");
+ ASSERT_EQ(v64->val_hi32, 0xffffffff, "v1_val");
+ ASSERT_EQ(v64->val_lo32, 0xffffffff, "v1_val");
+ ASSERT_STREQ(btf_type_raw_dump(btf, 22),
+ "[22] ENUM64 'e1' encoding=UNSIGNED size=8 vlen=1\n"
+ "\t'v1' val=18446744073709551615", "raw_dump");
}
static void test_btf_add()
@@ -332,12 +375,12 @@ static void test_btf_add()
"\t'f2' type_id=1 bits_offset=32 bitfield_size=16",
"[8] UNION 'u1' size=8 vlen=1\n"
"\t'f1' type_id=1 bits_offset=0 bitfield_size=16",
- "[9] ENUM 'e1' size=4 vlen=2\n"
+ "[9] ENUM 'e1' encoding=UNSIGNED size=4 vlen=2\n"
"\t'v1' val=1\n"
"\t'v2' val=2",
"[10] FWD 'struct_fwd' fwd_kind=struct",
"[11] FWD 'union_fwd' fwd_kind=union",
- "[12] ENUM 'enum_fwd' size=4 vlen=0",
+ "[12] ENUM 'enum_fwd' encoding=UNSIGNED size=4 vlen=0",
"[13] TYPEDEF 'typedef1' type_id=1",
"[14] FUNC 'func1' type_id=15 linkage=global",
"[15] FUNC_PROTO '(anon)' ret_type_id=1 vlen=2\n"
@@ -348,7 +391,12 @@ static void test_btf_add()
"\ttype_id=1 offset=4 size=8",
"[18] DECL_TAG 'tag1' type_id=16 component_idx=-1",
"[19] DECL_TAG 'tag2' type_id=14 component_idx=1",
- "[20] TYPE_TAG 'tag1' type_id=1");
+ "[20] TYPE_TAG 'tag1' type_id=1",
+ "[21] ENUM64 'e1' encoding=SIGNED size=8 vlen=2\n"
+ "\t'v1' val=-1\n"
+ "\t'v2' val=4886718345",
+ "[22] ENUM64 'e1' encoding=UNSIGNED size=8 vlen=1\n"
+ "\t'v1' val=18446744073709551615");
btf__free(btf);
}
@@ -370,7 +418,7 @@ static void test_btf_add_btf()
gen_btf(btf2);
id = btf__add_btf(btf1, btf2);
- if (!ASSERT_EQ(id, 21, "id"))
+ if (!ASSERT_EQ(id, 23, "id"))
goto cleanup;
VALIDATE_RAW_BTF(
@@ -386,12 +434,12 @@ static void test_btf_add_btf()
"\t'f2' type_id=1 bits_offset=32 bitfield_size=16",
"[8] UNION 'u1' size=8 vlen=1\n"
"\t'f1' type_id=1 bits_offset=0 bitfield_size=16",
- "[9] ENUM 'e1' size=4 vlen=2\n"
+ "[9] ENUM 'e1' encoding=UNSIGNED size=4 vlen=2\n"
"\t'v1' val=1\n"
"\t'v2' val=2",
"[10] FWD 'struct_fwd' fwd_kind=struct",
"[11] FWD 'union_fwd' fwd_kind=union",
- "[12] ENUM 'enum_fwd' size=4 vlen=0",
+ "[12] ENUM 'enum_fwd' encoding=UNSIGNED size=4 vlen=0",
"[13] TYPEDEF 'typedef1' type_id=1",
"[14] FUNC 'func1' type_id=15 linkage=global",
"[15] FUNC_PROTO '(anon)' ret_type_id=1 vlen=2\n"
@@ -403,36 +451,46 @@ static void test_btf_add_btf()
"[18] DECL_TAG 'tag1' type_id=16 component_idx=-1",
"[19] DECL_TAG 'tag2' type_id=14 component_idx=1",
"[20] TYPE_TAG 'tag1' type_id=1",
+ "[21] ENUM64 'e1' encoding=SIGNED size=8 vlen=2\n"
+ "\t'v1' val=-1\n"
+ "\t'v2' val=4886718345",
+ "[22] ENUM64 'e1' encoding=UNSIGNED size=8 vlen=1\n"
+ "\t'v1' val=18446744073709551615",
/* types appended from the second BTF */
- "[21] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
- "[22] PTR '(anon)' type_id=21",
- "[23] CONST '(anon)' type_id=25",
- "[24] VOLATILE '(anon)' type_id=23",
- "[25] RESTRICT '(anon)' type_id=24",
- "[26] ARRAY '(anon)' type_id=22 index_type_id=21 nr_elems=10",
- "[27] STRUCT 's1' size=8 vlen=2\n"
- "\t'f1' type_id=21 bits_offset=0\n"
- "\t'f2' type_id=21 bits_offset=32 bitfield_size=16",
- "[28] UNION 'u1' size=8 vlen=1\n"
- "\t'f1' type_id=21 bits_offset=0 bitfield_size=16",
- "[29] ENUM 'e1' size=4 vlen=2\n"
+ "[23] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+ "[24] PTR '(anon)' type_id=23",
+ "[25] CONST '(anon)' type_id=27",
+ "[26] VOLATILE '(anon)' type_id=25",
+ "[27] RESTRICT '(anon)' type_id=26",
+ "[28] ARRAY '(anon)' type_id=24 index_type_id=23 nr_elems=10",
+ "[29] STRUCT 's1' size=8 vlen=2\n"
+ "\t'f1' type_id=23 bits_offset=0\n"
+ "\t'f2' type_id=23 bits_offset=32 bitfield_size=16",
+ "[30] UNION 'u1' size=8 vlen=1\n"
+ "\t'f1' type_id=23 bits_offset=0 bitfield_size=16",
+ "[31] ENUM 'e1' encoding=UNSIGNED size=4 vlen=2\n"
"\t'v1' val=1\n"
"\t'v2' val=2",
- "[30] FWD 'struct_fwd' fwd_kind=struct",
- "[31] FWD 'union_fwd' fwd_kind=union",
- "[32] ENUM 'enum_fwd' size=4 vlen=0",
- "[33] TYPEDEF 'typedef1' type_id=21",
- "[34] FUNC 'func1' type_id=35 linkage=global",
- "[35] FUNC_PROTO '(anon)' ret_type_id=21 vlen=2\n"
- "\t'p1' type_id=21\n"
- "\t'p2' type_id=22",
- "[36] VAR 'var1' type_id=21, linkage=global-alloc",
- "[37] DATASEC 'datasec1' size=12 vlen=1\n"
- "\ttype_id=21 offset=4 size=8",
- "[38] DECL_TAG 'tag1' type_id=36 component_idx=-1",
- "[39] DECL_TAG 'tag2' type_id=34 component_idx=1",
- "[40] TYPE_TAG 'tag1' type_id=21");
+ "[32] FWD 'struct_fwd' fwd_kind=struct",
+ "[33] FWD 'union_fwd' fwd_kind=union",
+ "[34] ENUM 'enum_fwd' encoding=UNSIGNED size=4 vlen=0",
+ "[35] TYPEDEF 'typedef1' type_id=23",
+ "[36] FUNC 'func1' type_id=37 linkage=global",
+ "[37] FUNC_PROTO '(anon)' ret_type_id=23 vlen=2\n"
+ "\t'p1' type_id=23\n"
+ "\t'p2' type_id=24",
+ "[38] VAR 'var1' type_id=23, linkage=global-alloc",
+ "[39] DATASEC 'datasec1' size=12 vlen=1\n"
+ "\ttype_id=23 offset=4 size=8",
+ "[40] DECL_TAG 'tag1' type_id=38 component_idx=-1",
+ "[41] DECL_TAG 'tag2' type_id=36 component_idx=1",
+ "[42] TYPE_TAG 'tag1' type_id=23",
+ "[43] ENUM64 'e1' encoding=SIGNED size=8 vlen=2\n"
+ "\t'v1' val=-1\n"
+ "\t'v2' val=4886718345",
+ "[44] ENUM64 'e1' encoding=UNSIGNED size=8 vlen=1\n"
+ "\t'v1' val=18446744073709551615");
cleanup:
btf__free(btf1);
diff --git a/tools/testing/selftests/bpf/prog_tests/core_extern.c b/tools/testing/selftests/bpf/prog_tests/core_extern.c
index 1931a158510e..63a51e9f3630 100644
--- a/tools/testing/selftests/bpf/prog_tests/core_extern.c
+++ b/tools/testing/selftests/bpf/prog_tests/core_extern.c
@@ -39,6 +39,7 @@ static struct test_case {
"CONFIG_STR=\"abracad\"\n"
"CONFIG_MISSING=0",
.data = {
+ .unkn_virt_val = 0,
.bpf_syscall = false,
.tristate_val = TRI_MODULE,
.bool_val = true,
@@ -121,7 +122,7 @@ static struct test_case {
void test_core_extern(void)
{
const uint32_t kern_ver = get_kernel_version();
- int err, duration = 0, i, j;
+ int err, i, j;
struct test_core_extern *skel = NULL;
uint64_t *got, *exp;
int n = sizeof(*skel->data) / sizeof(uint64_t);
@@ -136,19 +137,17 @@ void test_core_extern(void)
continue;
skel = test_core_extern__open_opts(&opts);
- if (CHECK(!skel, "skel_open", "skeleton open failed\n"))
+ if (!ASSERT_OK_PTR(skel, "skel_open"))
goto cleanup;
err = test_core_extern__load(skel);
if (t->fails) {
- CHECK(!err, "skel_load",
- "shouldn't succeed open/load of skeleton\n");
+ ASSERT_ERR(err, "skel_load_should_fail");
goto cleanup;
- } else if (CHECK(err, "skel_load",
- "failed to open/load skeleton\n")) {
+ } else if (!ASSERT_OK(err, "skel_load")) {
goto cleanup;
}
err = test_core_extern__attach(skel);
- if (CHECK(err, "attach_raw_tp", "failed attach: %d\n", err))
+ if (!ASSERT_OK(err, "attach_raw_tp"))
goto cleanup;
usleep(1);
@@ -158,9 +157,7 @@ void test_core_extern(void)
got = (uint64_t *)skel->data;
exp = (uint64_t *)&t->data;
for (j = 0; j < n; j++) {
- CHECK(got[j] != exp[j], "check_res",
- "result #%d: expected %llx, but got %llx\n",
- j, (__u64)exp[j], (__u64)got[j]);
+ ASSERT_EQ(got[j], exp[j], "result");
}
cleanup:
test_core_extern__destroy(skel);
diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c
index 3712dfe1be59..c8655ba9a88f 100644
--- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c
+++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c
@@ -84,6 +84,7 @@ static int duration = 0;
#define NESTING_ERR_CASE(name) { \
NESTING_CASE_COMMON(name), \
.fails = true, \
+ .run_btfgen_fails = true, \
}
#define ARRAYS_DATA(struct_name) STRUCT_TO_CHAR_PTR(struct_name) { \
@@ -258,12 +259,14 @@ static int duration = 0;
BITFIELDS_CASE_COMMON("test_core_reloc_bitfields_probed.o", \
"probed:", name), \
.fails = true, \
+ .run_btfgen_fails = true, \
.raw_tp_name = "sys_enter", \
.prog_name = "test_core_bitfields", \
}, { \
BITFIELDS_CASE_COMMON("test_core_reloc_bitfields_direct.o", \
"direct:", name), \
.fails = true, \
+ .run_btfgen_fails = true, \
.prog_name = "test_core_bitfields_direct", \
}
@@ -304,6 +307,7 @@ static int duration = 0;
#define SIZE_ERR_CASE(name) { \
SIZE_CASE_COMMON(name), \
.fails = true, \
+ .run_btfgen_fails = true, \
}
#define TYPE_BASED_CASE_COMMON(name) \
@@ -363,6 +367,25 @@ static int duration = 0;
.fails = true, \
}
+#define ENUM64VAL_CASE_COMMON(name) \
+ .case_name = #name, \
+ .bpf_obj_file = "test_core_reloc_enum64val.o", \
+ .btf_src_file = "btf__core_reloc_" #name ".o", \
+ .raw_tp_name = "sys_enter", \
+ .prog_name = "test_core_enum64val"
+
+#define ENUM64VAL_CASE(name, ...) { \
+ ENUM64VAL_CASE_COMMON(name), \
+ .output = STRUCT_TO_CHAR_PTR(core_reloc_enum64val_output) \
+ __VA_ARGS__, \
+ .output_len = sizeof(struct core_reloc_enum64val_output), \
+}
+
+#define ENUM64VAL_ERR_CASE(name) { \
+ ENUM64VAL_CASE_COMMON(name), \
+ .fails = true, \
+}
+
struct core_reloc_test_case;
typedef int (*setup_test_fn)(struct core_reloc_test_case *test);
@@ -377,6 +400,7 @@ struct core_reloc_test_case {
const char *output;
int output_len;
bool fails;
+ bool run_btfgen_fails;
bool needs_testmod;
bool relaxed_core_relocs;
const char *prog_name;
@@ -519,7 +543,6 @@ static int __trigger_module_test_read(const struct core_reloc_test_case *test)
return 0;
}
-
static const struct core_reloc_test_case test_cases[] = {
/* validate we can find kernel image and use its BTF for relocs */
{
@@ -532,6 +555,7 @@ static const struct core_reloc_test_case test_cases[] = {
.valid = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, },
.comm = "test_progs",
.comm_len = sizeof("test_progs"),
+ .local_task_struct_matches = true,
},
.output_len = sizeof(struct core_reloc_kernel_output),
.raw_tp_name = "sys_enter",
@@ -728,9 +752,10 @@ static const struct core_reloc_test_case test_cases[] = {
SIZE_CASE(size___diff_offs),
SIZE_ERR_CASE(size___err_ambiguous),
- /* validate type existence and size relocations */
+ /* validate type existence, match, and size relocations */
TYPE_BASED_CASE(type_based, {
.struct_exists = 1,
+ .complex_struct_exists = 1,
.union_exists = 1,
.enum_exists = 1,
.typedef_named_struct_exists = 1,
@@ -739,8 +764,24 @@ static const struct core_reloc_test_case test_cases[] = {
.typedef_int_exists = 1,
.typedef_enum_exists = 1,
.typedef_void_ptr_exists = 1,
+ .typedef_restrict_ptr_exists = 1,
.typedef_func_proto_exists = 1,
.typedef_arr_exists = 1,
+
+ .struct_matches = 1,
+ .complex_struct_matches = 1,
+ .union_matches = 1,
+ .enum_matches = 1,
+ .typedef_named_struct_matches = 1,
+ .typedef_anon_struct_matches = 1,
+ .typedef_struct_ptr_matches = 1,
+ .typedef_int_matches = 1,
+ .typedef_enum_matches = 1,
+ .typedef_void_ptr_matches = 1,
+ .typedef_restrict_ptr_matches = 1,
+ .typedef_func_proto_matches = 1,
+ .typedef_arr_matches = 1,
+
.struct_sz = sizeof(struct a_struct),
.union_sz = sizeof(union a_union),
.enum_sz = sizeof(enum an_enum),
@@ -756,6 +797,45 @@ static const struct core_reloc_test_case test_cases[] = {
TYPE_BASED_CASE(type_based___all_missing, {
/* all zeros */
}),
+ TYPE_BASED_CASE(type_based___diff, {
+ .struct_exists = 1,
+ .complex_struct_exists = 1,
+ .union_exists = 1,
+ .enum_exists = 1,
+ .typedef_named_struct_exists = 1,
+ .typedef_anon_struct_exists = 1,
+ .typedef_struct_ptr_exists = 1,
+ .typedef_int_exists = 1,
+ .typedef_enum_exists = 1,
+ .typedef_void_ptr_exists = 1,
+ .typedef_func_proto_exists = 1,
+ .typedef_arr_exists = 1,
+
+ .struct_matches = 1,
+ .complex_struct_matches = 1,
+ .union_matches = 1,
+ .enum_matches = 1,
+ .typedef_named_struct_matches = 1,
+ .typedef_anon_struct_matches = 1,
+ .typedef_struct_ptr_matches = 1,
+ .typedef_int_matches = 0,
+ .typedef_enum_matches = 1,
+ .typedef_void_ptr_matches = 1,
+ .typedef_func_proto_matches = 0,
+ .typedef_arr_matches = 0,
+
+ .struct_sz = sizeof(struct a_struct___diff),
+ .union_sz = sizeof(union a_union___diff),
+ .enum_sz = sizeof(enum an_enum___diff),
+ .typedef_named_struct_sz = sizeof(named_struct_typedef___diff),
+ .typedef_anon_struct_sz = sizeof(anon_struct_typedef___diff),
+ .typedef_struct_ptr_sz = sizeof(struct_ptr_typedef___diff),
+ .typedef_int_sz = sizeof(int_typedef___diff),
+ .typedef_enum_sz = sizeof(enum_typedef___diff),
+ .typedef_void_ptr_sz = sizeof(void_ptr_typedef___diff),
+ .typedef_func_proto_sz = sizeof(func_proto_typedef___diff),
+ .typedef_arr_sz = sizeof(arr_typedef___diff),
+ }),
TYPE_BASED_CASE(type_based___diff_sz, {
.struct_exists = 1,
.union_exists = 1,
@@ -768,6 +848,19 @@ static const struct core_reloc_test_case test_cases[] = {
.typedef_void_ptr_exists = 1,
.typedef_func_proto_exists = 1,
.typedef_arr_exists = 1,
+
+ .struct_matches = 0,
+ .union_matches = 0,
+ .enum_matches = 0,
+ .typedef_named_struct_matches = 0,
+ .typedef_anon_struct_matches = 0,
+ .typedef_struct_ptr_matches = 1,
+ .typedef_int_matches = 0,
+ .typedef_enum_matches = 0,
+ .typedef_void_ptr_matches = 1,
+ .typedef_func_proto_matches = 0,
+ .typedef_arr_matches = 0,
+
.struct_sz = sizeof(struct a_struct___diff_sz),
.union_sz = sizeof(union a_union___diff_sz),
.enum_sz = sizeof(enum an_enum___diff_sz),
@@ -782,10 +875,12 @@ static const struct core_reloc_test_case test_cases[] = {
}),
TYPE_BASED_CASE(type_based___incompat, {
.enum_exists = 1,
+ .enum_matches = 1,
.enum_sz = sizeof(enum an_enum),
}),
TYPE_BASED_CASE(type_based___fn_wrong_args, {
.struct_exists = 1,
+ .struct_matches = 1,
.struct_sz = sizeof(struct a_struct),
}),
@@ -831,6 +926,45 @@ static const struct core_reloc_test_case test_cases[] = {
.anon_val2 = 0x222,
}),
ENUMVAL_ERR_CASE(enumval___err_missing),
+
+ /* 64bit enumerator value existence and value relocations */
+ ENUM64VAL_CASE(enum64val, {
+ .unsigned_val1_exists = true,
+ .unsigned_val2_exists = true,
+ .unsigned_val3_exists = true,
+ .signed_val1_exists = true,
+ .signed_val2_exists = true,
+ .signed_val3_exists = true,
+ .unsigned_val1 = 0x1ffffffffULL,
+ .unsigned_val2 = 0x2,
+ .signed_val1 = 0x1ffffffffLL,
+ .signed_val2 = -2,
+ }),
+ ENUM64VAL_CASE(enum64val___diff, {
+ .unsigned_val1_exists = true,
+ .unsigned_val2_exists = true,
+ .unsigned_val3_exists = true,
+ .signed_val1_exists = true,
+ .signed_val2_exists = true,
+ .signed_val3_exists = true,
+ .unsigned_val1 = 0x101ffffffffULL,
+ .unsigned_val2 = 0x202ffffffffULL,
+ .signed_val1 = -101,
+ .signed_val2 = -202,
+ }),
+ ENUM64VAL_CASE(enum64val___val3_missing, {
+ .unsigned_val1_exists = true,
+ .unsigned_val2_exists = true,
+ .unsigned_val3_exists = false,
+ .signed_val1_exists = true,
+ .signed_val2_exists = true,
+ .signed_val3_exists = false,
+ .unsigned_val1 = 0x111ffffffffULL,
+ .unsigned_val2 = 0x222,
+ .signed_val1 = 0x111ffffffffLL,
+ .signed_val2 = -222,
+ }),
+ ENUM64VAL_ERR_CASE(enum64val___err_missing),
};
struct data {
@@ -894,7 +1028,7 @@ static void run_core_reloc_tests(bool use_btfgen)
/* generate a "minimal" BTF file and use it as source */
if (use_btfgen) {
- if (!test_case->btf_src_file || test_case->fails) {
+ if (!test_case->btf_src_file || test_case->run_btfgen_fails) {
test__skip();
continue;
}
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
index a7e74297f15f..5a7e6011f6bf 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
@@ -7,11 +7,9 @@
void serial_test_fexit_stress(void)
{
- char test_skb[128] = {};
int fexit_fd[CNT] = {};
int link_fd[CNT] = {};
- char error[4096];
- int err, i, filter_fd;
+ int err, i;
const struct bpf_insn trace_program[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
@@ -20,25 +18,9 @@ void serial_test_fexit_stress(void)
LIBBPF_OPTS(bpf_prog_load_opts, trace_opts,
.expected_attach_type = BPF_TRACE_FEXIT,
- .log_buf = error,
- .log_size = sizeof(error),
);
- const struct bpf_insn skb_program[] = {
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- };
-
- LIBBPF_OPTS(bpf_prog_load_opts, skb_opts,
- .log_buf = error,
- .log_size = sizeof(error),
- );
-
- LIBBPF_OPTS(bpf_test_run_opts, topts,
- .data_in = test_skb,
- .data_size_in = sizeof(test_skb),
- .repeat = 1,
- );
+ LIBBPF_OPTS(bpf_test_run_opts, topts);
err = libbpf_find_vmlinux_btf_id("bpf_fentry_test1",
trace_opts.expected_attach_type);
@@ -58,15 +40,9 @@ void serial_test_fexit_stress(void)
goto out;
}
- filter_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL",
- skb_program, sizeof(skb_program) / sizeof(struct bpf_insn),
- &skb_opts);
- if (!ASSERT_GE(filter_fd, 0, "test_program_loaded"))
- goto out;
+ err = bpf_prog_test_run_opts(fexit_fd[0], &topts);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
- err = bpf_prog_test_run_opts(filter_fd, &topts);
- close(filter_fd);
- CHECK_FAIL(err);
out:
for (i = 0; i < CNT; i++) {
if (link_fd[i])
diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index 5b93d5d0bd93..d457a55ff408 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -329,7 +329,7 @@ static int get_syms(char ***symsp, size_t *cntp)
struct hashmap *map;
char buf[256];
FILE *f;
- int err;
+ int err = 0;
/*
* The available_filter_functions contains many duplicates,
@@ -364,6 +364,8 @@ static int get_syms(char ***symsp, size_t *cntp)
continue;
if (!strncmp(name, "rcu_", 4))
continue;
+ if (!strcmp(name, "bpf_dispatcher_xdp_func"))
+ continue;
if (!strncmp(name, "__ftrace_invalid_address__",
sizeof("__ftrace_invalid_address__") - 1))
continue;
@@ -407,7 +409,7 @@ static void test_bench_attach(void)
double attach_delta, detach_delta;
struct bpf_link *link = NULL;
char **syms = NULL;
- size_t cnt, i;
+ size_t cnt = 0, i;
if (!ASSERT_OK(get_syms(&syms, &cnt), "get_syms"))
return;
diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
new file mode 100644
index 000000000000..93e9cddaadcf
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
@@ -0,0 +1,207 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <ctype.h>
+#include <test_progs.h>
+#include <bpf/btf.h>
+
+/*
+ * Utility function uppercasing an entire string.
+ */
+static void uppercase(char *s)
+{
+ for (; *s != '\0'; s++)
+ *s = toupper(*s);
+}
+
+/*
+ * Test case to check that all bpf_attach_type variants are covered by
+ * libbpf_bpf_attach_type_str.
+ */
+static void test_libbpf_bpf_attach_type_str(void)
+{
+ struct btf *btf;
+ const struct btf_type *t;
+ const struct btf_enum *e;
+ int i, n, id;
+
+ btf = btf__parse("/sys/kernel/btf/vmlinux", NULL);
+ if (!ASSERT_OK_PTR(btf, "btf_parse"))
+ return;
+
+ /* find enum bpf_attach_type and enumerate each value */
+ id = btf__find_by_name_kind(btf, "bpf_attach_type", BTF_KIND_ENUM);
+ if (!ASSERT_GT(id, 0, "bpf_attach_type_id"))
+ goto cleanup;
+ t = btf__type_by_id(btf, id);
+ e = btf_enum(t);
+ n = btf_vlen(t);
+ for (i = 0; i < n; e++, i++) {
+ enum bpf_attach_type attach_type = (enum bpf_attach_type)e->val;
+ const char *attach_type_name;
+ const char *attach_type_str;
+ char buf[256];
+
+ if (attach_type == __MAX_BPF_ATTACH_TYPE)
+ continue;
+
+ attach_type_name = btf__str_by_offset(btf, e->name_off);
+ attach_type_str = libbpf_bpf_attach_type_str(attach_type);
+ ASSERT_OK_PTR(attach_type_str, attach_type_name);
+
+ snprintf(buf, sizeof(buf), "BPF_%s", attach_type_str);
+ uppercase(buf);
+
+ ASSERT_STREQ(buf, attach_type_name, "exp_str_value");
+ }
+
+cleanup:
+ btf__free(btf);
+}
+
+/*
+ * Test case to check that all bpf_link_type variants are covered by
+ * libbpf_bpf_link_type_str.
+ */
+static void test_libbpf_bpf_link_type_str(void)
+{
+ struct btf *btf;
+ const struct btf_type *t;
+ const struct btf_enum *e;
+ int i, n, id;
+
+ btf = btf__parse("/sys/kernel/btf/vmlinux", NULL);
+ if (!ASSERT_OK_PTR(btf, "btf_parse"))
+ return;
+
+ /* find enum bpf_link_type and enumerate each value */
+ id = btf__find_by_name_kind(btf, "bpf_link_type", BTF_KIND_ENUM);
+ if (!ASSERT_GT(id, 0, "bpf_link_type_id"))
+ goto cleanup;
+ t = btf__type_by_id(btf, id);
+ e = btf_enum(t);
+ n = btf_vlen(t);
+ for (i = 0; i < n; e++, i++) {
+ enum bpf_link_type link_type = (enum bpf_link_type)e->val;
+ const char *link_type_name;
+ const char *link_type_str;
+ char buf[256];
+
+ if (link_type == MAX_BPF_LINK_TYPE)
+ continue;
+
+ link_type_name = btf__str_by_offset(btf, e->name_off);
+ link_type_str = libbpf_bpf_link_type_str(link_type);
+ ASSERT_OK_PTR(link_type_str, link_type_name);
+
+ snprintf(buf, sizeof(buf), "BPF_LINK_TYPE_%s", link_type_str);
+ uppercase(buf);
+
+ ASSERT_STREQ(buf, link_type_name, "exp_str_value");
+ }
+
+cleanup:
+ btf__free(btf);
+}
+
+/*
+ * Test case to check that all bpf_map_type variants are covered by
+ * libbpf_bpf_map_type_str.
+ */
+static void test_libbpf_bpf_map_type_str(void)
+{
+ struct btf *btf;
+ const struct btf_type *t;
+ const struct btf_enum *e;
+ int i, n, id;
+
+ btf = btf__parse("/sys/kernel/btf/vmlinux", NULL);
+ if (!ASSERT_OK_PTR(btf, "btf_parse"))
+ return;
+
+ /* find enum bpf_map_type and enumerate each value */
+ id = btf__find_by_name_kind(btf, "bpf_map_type", BTF_KIND_ENUM);
+ if (!ASSERT_GT(id, 0, "bpf_map_type_id"))
+ goto cleanup;
+ t = btf__type_by_id(btf, id);
+ e = btf_enum(t);
+ n = btf_vlen(t);
+ for (i = 0; i < n; e++, i++) {
+ enum bpf_map_type map_type = (enum bpf_map_type)e->val;
+ const char *map_type_name;
+ const char *map_type_str;
+ char buf[256];
+
+ map_type_name = btf__str_by_offset(btf, e->name_off);
+ map_type_str = libbpf_bpf_map_type_str(map_type);
+ ASSERT_OK_PTR(map_type_str, map_type_name);
+
+ snprintf(buf, sizeof(buf), "BPF_MAP_TYPE_%s", map_type_str);
+ uppercase(buf);
+
+ ASSERT_STREQ(buf, map_type_name, "exp_str_value");
+ }
+
+cleanup:
+ btf__free(btf);
+}
+
+/*
+ * Test case to check that all bpf_prog_type variants are covered by
+ * libbpf_bpf_prog_type_str.
+ */
+static void test_libbpf_bpf_prog_type_str(void)
+{
+ struct btf *btf;
+ const struct btf_type *t;
+ const struct btf_enum *e;
+ int i, n, id;
+
+ btf = btf__parse("/sys/kernel/btf/vmlinux", NULL);
+ if (!ASSERT_OK_PTR(btf, "btf_parse"))
+ return;
+
+ /* find enum bpf_prog_type and enumerate each value */
+ id = btf__find_by_name_kind(btf, "bpf_prog_type", BTF_KIND_ENUM);
+ if (!ASSERT_GT(id, 0, "bpf_prog_type_id"))
+ goto cleanup;
+ t = btf__type_by_id(btf, id);
+ e = btf_enum(t);
+ n = btf_vlen(t);
+ for (i = 0; i < n; e++, i++) {
+ enum bpf_prog_type prog_type = (enum bpf_prog_type)e->val;
+ const char *prog_type_name;
+ const char *prog_type_str;
+ char buf[256];
+
+ prog_type_name = btf__str_by_offset(btf, e->name_off);
+ prog_type_str = libbpf_bpf_prog_type_str(prog_type);
+ ASSERT_OK_PTR(prog_type_str, prog_type_name);
+
+ snprintf(buf, sizeof(buf), "BPF_PROG_TYPE_%s", prog_type_str);
+ uppercase(buf);
+
+ ASSERT_STREQ(buf, prog_type_name, "exp_str_value");
+ }
+
+cleanup:
+ btf__free(btf);
+}
+
+/*
+ * Run all libbpf str conversion tests.
+ */
+void test_libbpf_str(void)
+{
+ if (test__start_subtest("bpf_attach_type_str"))
+ test_libbpf_bpf_attach_type_str();
+
+ if (test__start_subtest("bpf_link_type_str"))
+ test_libbpf_bpf_link_type_str();
+
+ if (test__start_subtest("bpf_map_type_str"))
+ test_libbpf_bpf_map_type_str();
+
+ if (test__start_subtest("bpf_prog_type_str"))
+ test_libbpf_bpf_prog_type_str();
+}
diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
new file mode 100644
index 000000000000..1102e4f42d2d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
@@ -0,0 +1,313 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <test_progs.h>
+#include <bpf/btf.h>
+
+#include "lsm_cgroup.skel.h"
+#include "lsm_cgroup_nonvoid.skel.h"
+#include "cgroup_helpers.h"
+#include "network_helpers.h"
+
+#ifndef ENOTSUPP
+#define ENOTSUPP 524
+#endif
+
+static struct btf *btf;
+
+static __u32 query_prog_cnt(int cgroup_fd, const char *attach_func)
+{
+ LIBBPF_OPTS(bpf_prog_query_opts, p);
+ int cnt = 0;
+ int i;
+
+ ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+
+ if (!attach_func)
+ return p.prog_cnt;
+
+ /* When attach_func is provided, count the number of progs that
+ * attach to the given symbol.
+ */
+
+ if (!btf)
+ btf = btf__load_vmlinux_btf();
+ if (!ASSERT_OK(libbpf_get_error(btf), "btf_vmlinux"))
+ return -1;
+
+ p.prog_ids = malloc(sizeof(u32) * p.prog_cnt);
+ p.prog_attach_flags = malloc(sizeof(u32) * p.prog_cnt);
+ ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+
+ for (i = 0; i < p.prog_cnt; i++) {
+ struct bpf_prog_info info = {};
+ __u32 info_len = sizeof(info);
+ int fd;
+
+ fd = bpf_prog_get_fd_by_id(p.prog_ids[i]);
+ ASSERT_GE(fd, 0, "prog_get_fd_by_id");
+ ASSERT_OK(bpf_obj_get_info_by_fd(fd, &info, &info_len), "prog_info_by_fd");
+ close(fd);
+
+ if (info.attach_btf_id ==
+ btf__find_by_name_kind(btf, attach_func, BTF_KIND_FUNC))
+ cnt++;
+ }
+
+ free(p.prog_ids);
+ free(p.prog_attach_flags);
+
+ return cnt;
+}
+
+static void test_lsm_cgroup_functional(void)
+{
+ DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, attach_opts);
+ DECLARE_LIBBPF_OPTS(bpf_link_update_opts, update_opts);
+ int cgroup_fd = -1, cgroup_fd2 = -1, cgroup_fd3 = -1;
+ int listen_fd, client_fd, accepted_fd;
+ struct lsm_cgroup *skel = NULL;
+ int post_create_prog_fd2 = -1;
+ int post_create_prog_fd = -1;
+ int bind_link_fd2 = -1;
+ int bind_prog_fd2 = -1;
+ int alloc_prog_fd = -1;
+ int bind_prog_fd = -1;
+ int bind_link_fd = -1;
+ int clone_prog_fd = -1;
+ int err, fd, prio;
+ socklen_t socklen;
+
+ cgroup_fd3 = test__join_cgroup("/sock_policy_empty");
+ if (!ASSERT_GE(cgroup_fd3, 0, "create empty cgroup"))
+ goto close_cgroup;
+
+ cgroup_fd2 = test__join_cgroup("/sock_policy_reuse");
+ if (!ASSERT_GE(cgroup_fd2, 0, "create cgroup for reuse"))
+ goto close_cgroup;
+
+ cgroup_fd = test__join_cgroup("/sock_policy");
+ if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup"))
+ goto close_cgroup;
+
+ skel = lsm_cgroup__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ goto close_cgroup;
+
+ post_create_prog_fd = bpf_program__fd(skel->progs.socket_post_create);
+ post_create_prog_fd2 = bpf_program__fd(skel->progs.socket_post_create2);
+ bind_prog_fd = bpf_program__fd(skel->progs.socket_bind);
+ bind_prog_fd2 = bpf_program__fd(skel->progs.socket_bind2);
+ alloc_prog_fd = bpf_program__fd(skel->progs.socket_alloc);
+ clone_prog_fd = bpf_program__fd(skel->progs.socket_clone);
+
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 0, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 0, "total prog count");
+ err = bpf_prog_attach(alloc_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+ if (err == -ENOTSUPP) {
+ test__skip();
+ goto close_cgroup;
+ }
+ if (!ASSERT_OK(err, "attach alloc_prog_fd"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 1, "total prog count");
+
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 0, "prog count");
+ err = bpf_prog_attach(clone_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+ if (!ASSERT_OK(err, "attach clone_prog_fd"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 2, "total prog count");
+
+ /* Make sure replacing works. */
+
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 0, "prog count");
+ err = bpf_prog_attach(post_create_prog_fd, cgroup_fd,
+ BPF_LSM_CGROUP, 0);
+ if (!ASSERT_OK(err, "attach post_create_prog_fd"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count");
+
+ attach_opts.replace_prog_fd = post_create_prog_fd;
+ err = bpf_prog_attach_opts(post_create_prog_fd2, cgroup_fd,
+ BPF_LSM_CGROUP, &attach_opts);
+ if (!ASSERT_OK(err, "prog replace post_create_prog_fd"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count");
+
+ /* Try the same attach/replace via link API. */
+
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 0, "prog count");
+ bind_link_fd = bpf_link_create(bind_prog_fd, cgroup_fd,
+ BPF_LSM_CGROUP, NULL);
+ if (!ASSERT_GE(bind_link_fd, 0, "link create bind_prog_fd"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+
+ update_opts.old_prog_fd = bind_prog_fd;
+ update_opts.flags = BPF_F_REPLACE;
+
+ err = bpf_link_update(bind_link_fd, bind_prog_fd2, &update_opts);
+ if (!ASSERT_OK(err, "link update bind_prog_fd"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+
+ /* Attach another instance of bind program to another cgroup.
+ * This should trigger the reuse of the trampoline shim (two
+ * programs attaching to the same btf_id).
+ */
+
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 0, "prog count");
+ bind_link_fd2 = bpf_link_create(bind_prog_fd2, cgroup_fd2,
+ BPF_LSM_CGROUP, NULL);
+ if (!ASSERT_GE(bind_link_fd2, 0, "link create bind_prog_fd2"))
+ goto detach_cgroup;
+ ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 1, "prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+ ASSERT_EQ(query_prog_cnt(cgroup_fd2, NULL), 1, "total prog count");
+
+ /* AF_UNIX is prohibited. */
+
+ fd = socket(AF_UNIX, SOCK_STREAM, 0);
+ ASSERT_LT(fd, 0, "socket(AF_UNIX)");
+ close(fd);
+
+ /* AF_INET6 gets default policy (sk_priority). */
+
+ fd = socket(AF_INET6, SOCK_STREAM, 0);
+ if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+ goto detach_cgroup;
+
+ prio = 0;
+ socklen = sizeof(prio);
+ ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+ "getsockopt");
+ ASSERT_EQ(prio, 123, "sk_priority");
+
+ close(fd);
+
+ /* TX-only AF_PACKET is allowed. */
+
+ ASSERT_LT(socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)), 0,
+ "socket(AF_PACKET, ..., ETH_P_ALL)");
+
+ fd = socket(AF_PACKET, SOCK_RAW, 0);
+ ASSERT_GE(fd, 0, "socket(AF_PACKET, ..., 0)");
+
+ /* TX-only AF_PACKET can not be rebound. */
+
+ struct sockaddr_ll sa = {
+ .sll_family = AF_PACKET,
+ .sll_protocol = htons(ETH_P_ALL),
+ };
+ ASSERT_LT(bind(fd, (struct sockaddr *)&sa, sizeof(sa)), 0,
+ "bind(ETH_P_ALL)");
+
+ close(fd);
+
+ /* Trigger passive open. */
+
+ listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+ ASSERT_GE(listen_fd, 0, "start_server");
+ client_fd = connect_to_fd(listen_fd, 0);
+ ASSERT_GE(client_fd, 0, "connect_to_fd");
+ accepted_fd = accept(listen_fd, NULL, NULL);
+ ASSERT_GE(accepted_fd, 0, "accept");
+
+ prio = 0;
+ socklen = sizeof(prio);
+ ASSERT_GE(getsockopt(accepted_fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+ "getsockopt");
+ ASSERT_EQ(prio, 234, "sk_priority");
+
+ /* These are replaced and never called. */
+ ASSERT_EQ(skel->bss->called_socket_post_create, 0, "called_create");
+ ASSERT_EQ(skel->bss->called_socket_bind, 0, "called_bind");
+
+ /* AF_INET6+SOCK_STREAM
+ * AF_PACKET+SOCK_RAW
+ * listen_fd
+ * client_fd
+ * accepted_fd
+ */
+ ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2");
+
+ /* start_server
+ * bind(ETH_P_ALL)
+ */
+ ASSERT_EQ(skel->bss->called_socket_bind2, 2, "called_bind2");
+ /* Single accept(). */
+ ASSERT_EQ(skel->bss->called_socket_clone, 1, "called_clone");
+
+ /* AF_UNIX+SOCK_STREAM (failed)
+ * AF_INET6+SOCK_STREAM
+ * AF_PACKET+SOCK_RAW (failed)
+ * AF_PACKET+SOCK_RAW
+ * listen_fd
+ * client_fd
+ * accepted_fd
+ */
+ ASSERT_EQ(skel->bss->called_socket_alloc, 7, "called_alloc");
+
+ close(listen_fd);
+ close(client_fd);
+ close(accepted_fd);
+
+ /* Make sure other cgroup doesn't trigger the programs. */
+
+ if (!ASSERT_OK(join_cgroup("/sock_policy_empty"), "join root cgroup"))
+ goto detach_cgroup;
+
+ fd = socket(AF_INET6, SOCK_STREAM, 0);
+ if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+ goto detach_cgroup;
+
+ prio = 0;
+ socklen = sizeof(prio);
+ ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+ "getsockopt");
+ ASSERT_EQ(prio, 0, "sk_priority");
+
+ close(fd);
+
+detach_cgroup:
+ ASSERT_GE(bpf_prog_detach2(post_create_prog_fd2, cgroup_fd,
+ BPF_LSM_CGROUP), 0, "detach_create");
+ close(bind_link_fd);
+ /* Don't close bind_link_fd2, exercise cgroup release cleanup. */
+ ASSERT_GE(bpf_prog_detach2(alloc_prog_fd, cgroup_fd,
+ BPF_LSM_CGROUP), 0, "detach_alloc");
+ ASSERT_GE(bpf_prog_detach2(clone_prog_fd, cgroup_fd,
+ BPF_LSM_CGROUP), 0, "detach_clone");
+
+close_cgroup:
+ close(cgroup_fd);
+ close(cgroup_fd2);
+ close(cgroup_fd3);
+ lsm_cgroup__destroy(skel);
+}
+
+static void test_lsm_cgroup_nonvoid(void)
+{
+ struct lsm_cgroup_nonvoid *skel = NULL;
+
+ skel = lsm_cgroup_nonvoid__open_and_load();
+ ASSERT_NULL(skel, "open succeeds");
+ lsm_cgroup_nonvoid__destroy(skel);
+}
+
+void test_lsm_cgroup(void)
+{
+ if (test__start_subtest("functional"))
+ test_lsm_cgroup_functional();
+ if (test__start_subtest("nonvoid"))
+ test_lsm_cgroup_nonvoid();
+ btf__free(btf);
+}
diff --git a/tools/testing/selftests/bpf/prog_tests/probe_user.c b/tools/testing/selftests/bpf/prog_tests/probe_user.c
index abf890d066eb..34dbd2adc157 100644
--- a/tools/testing/selftests/bpf/prog_tests/probe_user.c
+++ b/tools/testing/selftests/bpf/prog_tests/probe_user.c
@@ -4,25 +4,35 @@
/* TODO: corrupts other tests uses connect() */
void serial_test_probe_user(void)
{
- const char *prog_name = "handle_sys_connect";
+ static const char *const prog_names[] = {
+ "handle_sys_connect",
+#if defined(__s390x__)
+ "handle_sys_socketcall",
+#endif
+ };
+ enum { prog_count = ARRAY_SIZE(prog_names) };
const char *obj_file = "./test_probe_user.o";
DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts, );
int err, results_map_fd, sock_fd, duration = 0;
struct sockaddr curr, orig, tmp;
struct sockaddr_in *in = (struct sockaddr_in *)&curr;
- struct bpf_link *kprobe_link = NULL;
- struct bpf_program *kprobe_prog;
+ struct bpf_link *kprobe_links[prog_count] = {};
+ struct bpf_program *kprobe_progs[prog_count];
struct bpf_object *obj;
static const int zero = 0;
+ size_t i;
obj = bpf_object__open_file(obj_file, &opts);
if (!ASSERT_OK_PTR(obj, "obj_open_file"))
return;
- kprobe_prog = bpf_object__find_program_by_name(obj, prog_name);
- if (CHECK(!kprobe_prog, "find_probe",
- "prog '%s' not found\n", prog_name))
- goto cleanup;
+ for (i = 0; i < prog_count; i++) {
+ kprobe_progs[i] =
+ bpf_object__find_program_by_name(obj, prog_names[i]);
+ if (CHECK(!kprobe_progs[i], "find_probe",
+ "prog '%s' not found\n", prog_names[i]))
+ goto cleanup;
+ }
err = bpf_object__load(obj);
if (CHECK(err, "obj_load", "err %d\n", err))
@@ -33,9 +43,11 @@ void serial_test_probe_user(void)
"err %d\n", results_map_fd))
goto cleanup;
- kprobe_link = bpf_program__attach(kprobe_prog);
- if (!ASSERT_OK_PTR(kprobe_link, "attach_kprobe"))
- goto cleanup;
+ for (i = 0; i < prog_count; i++) {
+ kprobe_links[i] = bpf_program__attach(kprobe_progs[i]);
+ if (!ASSERT_OK_PTR(kprobe_links[i], "attach_kprobe"))
+ goto cleanup;
+ }
memset(&curr, 0, sizeof(curr));
in->sin_family = AF_INET;
@@ -69,6 +81,7 @@ void serial_test_probe_user(void)
inet_ntoa(in->sin_addr), ntohs(in->sin_port)))
goto cleanup;
cleanup:
- bpf_link__destroy(kprobe_link);
+ for (i = 0; i < prog_count; i++)
+ bpf_link__destroy(kprobe_links[i]);
bpf_object__close(obj);
}
diff --git a/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c b/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c
index f4a13d9dd5c8..c197261d02e2 100644
--- a/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c
+++ b/tools/testing/selftests/bpf/prog_tests/resolve_btfids.c
@@ -44,7 +44,7 @@ BTF_ID(union, U)
BTF_ID(func, func)
extern __u32 test_list_global[];
-BTF_ID_LIST_GLOBAL(test_list_global)
+BTF_ID_LIST_GLOBAL(test_list_global, 1)
BTF_ID_UNUSED
BTF_ID(typedef, S)
BTF_ID(typedef, T)
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c b/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c
index eb5f7f5aa81a..1455911d9fcb 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c
@@ -50,6 +50,13 @@ void test_ringbuf_multi(void)
if (CHECK(!skel, "skel_open", "skeleton open failed\n"))
return;
+ /* validate ringbuf size adjustment logic */
+ ASSERT_EQ(bpf_map__max_entries(skel->maps.ringbuf1), page_size, "rb1_size_before");
+ ASSERT_OK(bpf_map__set_max_entries(skel->maps.ringbuf1, page_size + 1), "rb1_resize");
+ ASSERT_EQ(bpf_map__max_entries(skel->maps.ringbuf1), 2 * page_size, "rb1_size_after");
+ ASSERT_OK(bpf_map__set_max_entries(skel->maps.ringbuf1, page_size), "rb1_reset");
+ ASSERT_EQ(bpf_map__max_entries(skel->maps.ringbuf1), page_size, "rb1_size_final");
+
proto_fd = bpf_map_create(BPF_MAP_TYPE_RINGBUF, NULL, 0, 0, page_size, NULL);
if (CHECK(proto_fd < 0, "bpf_map_create", "bpf_map_create failed\n"))
goto cleanup;
@@ -65,6 +72,10 @@ void test_ringbuf_multi(void)
close(proto_fd);
proto_fd = -1;
+ /* make sure we can't resize ringbuf after object load */
+ if (!ASSERT_ERR(bpf_map__set_max_entries(skel->maps.ringbuf1, 3 * page_size), "rb1_resize_after_load"))
+ goto cleanup;
+
/* only trigger BPF program for current process */
skel->bss->pid = getpid();
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index d71226e34c34..d63a20fbed33 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -64,7 +64,7 @@ static void test_send_signal_common(struct perf_event_attr *attr,
ASSERT_EQ(read(pipe_p2c[0], buf, 1), 1, "pipe_read");
/* wait a little for signal handler */
- for (int i = 0; i < 100000000 && !sigusr1_received; i++)
+ for (int i = 0; i < 1000000000 && !sigusr1_received; i++)
j /= i + j + 1;
buf[0] = sigusr1_received ? '2' : '0';
diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c
index 180afd632f4c..99dac5292b41 100644
--- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
+++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
@@ -122,6 +122,8 @@ void test_skeleton(void)
ASSERT_EQ(skel->bss->out_mostly_var, 123, "out_mostly_var");
+ ASSERT_EQ(bss->huge_arr[ARRAY_SIZE(bss->huge_arr) - 1], 123, "huge_arr");
+
elf_bytes = test_skeleton__elf_bytes(&elf_bytes_sz);
ASSERT_OK_PTR(elf_bytes, "elf_bytes");
ASSERT_GE(elf_bytes_sz, 0, "elf_bytes_sz");
diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index 9d211b5c22c4..7d23166c77af 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -394,7 +394,6 @@ void serial_test_sock_fields(void)
test();
done:
- test_sock_fields__detach(skel);
test_sock_fields__destroy(skel);
if (child_cg_fd >= 0)
close(child_cg_fd);
diff --git a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
index 958dae769c52..cb6a53b3e023 100644
--- a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
@@ -646,7 +646,7 @@ static void test_tcp_clear_dtime(struct test_tc_dtime *skel)
__u32 *errs = skel->bss->errs[t];
skel->bss->test = t;
- test_inet_dtime(AF_INET6, SOCK_STREAM, IP6_DST, 0);
+ test_inet_dtime(AF_INET6, SOCK_STREAM, IP6_DST, 50000 + t);
ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
dtime_cnt_str(t, INGRESS_FWDNS_P100));
@@ -683,7 +683,7 @@ static void test_tcp_dtime(struct test_tc_dtime *skel, int family, bool bpf_fwd)
errs = skel->bss->errs[t];
skel->bss->test = t;
- test_inet_dtime(family, SOCK_STREAM, addr, 0);
+ test_inet_dtime(family, SOCK_STREAM, addr, 50000 + t);
/* fwdns_prio100 prog does not read delivery_time_type, so
* kernel puts the (rcv) timetamp in __sk_buff->tstamp
@@ -715,13 +715,13 @@ static void test_udp_dtime(struct test_tc_dtime *skel, int family, bool bpf_fwd)
errs = skel->bss->errs[t];
skel->bss->test = t;
- test_inet_dtime(family, SOCK_DGRAM, addr, 0);
+ test_inet_dtime(family, SOCK_DGRAM, addr, 50000 + t);
ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
dtime_cnt_str(t, INGRESS_FWDNS_P100));
/* non mono delivery time is not forwarded */
ASSERT_EQ(dtimes[INGRESS_FWDNS_P101], 0,
- dtime_cnt_str(t, INGRESS_FWDNS_P100));
+ dtime_cnt_str(t, INGRESS_FWDNS_P101));
for (i = EGRESS_FWDNS_P100; i < SET_DTIME; i++)
ASSERT_GT(dtimes[i], 0, dtime_cnt_str(t, i));
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c
index 3bba4a2a0530..eea274110267 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c
@@ -82,6 +82,7 @@
#define MAC_TUNL_DEV0 "52:54:00:d9:01:00"
#define MAC_TUNL_DEV1 "52:54:00:d9:02:00"
+#define MAC_VETH1 "52:54:00:d9:03:00"
#define VXLAN_TUNL_DEV0 "vxlan00"
#define VXLAN_TUNL_DEV1 "vxlan11"
@@ -108,10 +109,9 @@
static int config_device(void)
{
SYS("ip netns add at_ns0");
- SYS("ip link add veth0 type veth peer name veth1");
+ SYS("ip link add veth0 address " MAC_VETH1 " type veth peer name veth1");
SYS("ip link set veth0 netns at_ns0");
SYS("ip addr add " IP4_ADDR1_VETH1 "/24 dev veth1");
- SYS("ip addr add " IP4_ADDR2_VETH1 "/24 dev veth1");
SYS("ip link set dev veth1 up mtu 1500");
SYS("ip netns exec at_ns0 ip addr add " IP4_ADDR_VETH0 "/24 dev veth0");
SYS("ip netns exec at_ns0 ip link set dev veth0 up mtu 1500");
@@ -140,6 +140,8 @@ static int add_vxlan_tunnel(void)
VXLAN_TUNL_DEV0, IP4_ADDR_TUNL_DEV0);
SYS("ip netns exec at_ns0 ip neigh add %s lladdr %s dev %s",
IP4_ADDR_TUNL_DEV1, MAC_TUNL_DEV1, VXLAN_TUNL_DEV0);
+ SYS("ip netns exec at_ns0 ip neigh add %s lladdr %s dev veth0",
+ IP4_ADDR2_VETH1, MAC_VETH1);
/* root namespace */
SYS("ip link add dev %s type vxlan external gbp dstport 4789",
@@ -277,6 +279,17 @@ static void test_vxlan_tunnel(void)
if (attach_tc_prog(&tc_hook, get_src_prog_fd, set_src_prog_fd))
goto done;
+ /* load and attach bpf prog to veth dev tc hook point */
+ ifindex = if_nametoindex("veth1");
+ if (!ASSERT_NEQ(ifindex, 0, "veth1 ifindex"))
+ goto done;
+ tc_hook.ifindex = ifindex;
+ set_dst_prog_fd = bpf_program__fd(skel->progs.veth_set_outer_dst);
+ if (!ASSERT_GE(set_dst_prog_fd, 0, "bpf_program__fd"))
+ goto done;
+ if (attach_tc_prog(&tc_hook, set_dst_prog_fd, -1))
+ goto done;
+
/* load and attach prog set_md to tunnel dev tc hook point at_ns0 */
nstoken = open_netns("at_ns0");
if (!ASSERT_OK_PTR(nstoken, "setns src"))
diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c
index 5f733d50b0d7..9ad9da0f215e 100644
--- a/tools/testing/selftests/bpf/prog_tests/usdt.c
+++ b/tools/testing/selftests/bpf/prog_tests/usdt.c
@@ -12,7 +12,7 @@ int lets_test_this(int);
static volatile int idx = 2;
static volatile __u64 bla = 0xFEDCBA9876543210ULL;
-static volatile short nums[] = {-1, -2, -3, };
+static volatile short nums[] = {-1, -2, -3, -4};
static volatile struct {
int x;
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c b/tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
new file mode 100644
index 000000000000..874a846e298c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
@@ -0,0 +1,183 @@
+// SPDX-License-Identifier: LGPL-2.1 OR BSD-2-Clause
+/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <network_helpers.h>
+#include <ctype.h>
+
+#define CMD_OUT_BUF_SIZE 1023
+
+#define SYS(cmd) ({ \
+ if (!ASSERT_OK(system(cmd), (cmd))) \
+ goto out; \
+})
+
+#define SYS_OUT(cmd, ...) ({ \
+ char buf[1024]; \
+ snprintf(buf, sizeof(buf), (cmd), ##__VA_ARGS__); \
+ FILE *f = popen(buf, "r"); \
+ if (!ASSERT_OK_PTR(f, buf)) \
+ goto out; \
+ f; \
+})
+
+/* out must be at least `size * 4 + 1` bytes long */
+static void escape_str(char *out, const char *in, size_t size)
+{
+ static const char *hex = "0123456789ABCDEF";
+ size_t i;
+
+ for (i = 0; i < size; i++) {
+ if (isprint(in[i]) && in[i] != '\\' && in[i] != '\'') {
+ *out++ = in[i];
+ } else {
+ *out++ = '\\';
+ *out++ = 'x';
+ *out++ = hex[(in[i] >> 4) & 0xf];
+ *out++ = hex[in[i] & 0xf];
+ }
+ }
+ *out++ = '\0';
+}
+
+static bool expect_str(char *buf, size_t size, const char *str, const char *name)
+{
+ static char escbuf_expected[CMD_OUT_BUF_SIZE * 4];
+ static char escbuf_actual[CMD_OUT_BUF_SIZE * 4];
+ static int duration = 0;
+ bool ok;
+
+ ok = size == strlen(str) && !memcmp(buf, str, size);
+
+ if (!ok) {
+ escape_str(escbuf_expected, str, strlen(str));
+ escape_str(escbuf_actual, buf, size);
+ }
+ CHECK(!ok, name, "unexpected %s: actual '%s' != expected '%s'\n",
+ name, escbuf_actual, escbuf_expected);
+
+ return ok;
+}
+
+static void test_synproxy(bool xdp)
+{
+ int server_fd = -1, client_fd = -1, accept_fd = -1;
+ char *prog_id = NULL, *prog_id_end;
+ struct nstoken *ns = NULL;
+ FILE *ctrl_file = NULL;
+ char buf[CMD_OUT_BUF_SIZE];
+ size_t size;
+
+ SYS("ip netns add synproxy");
+
+ SYS("ip link add tmp0 type veth peer name tmp1");
+ SYS("ip link set tmp1 netns synproxy");
+ SYS("ip link set tmp0 up");
+ SYS("ip addr replace 198.18.0.1/24 dev tmp0");
+
+ /* When checksum offload is enabled, the XDP program sees wrong
+ * checksums and drops packets.
+ */
+ SYS("ethtool -K tmp0 tx off");
+ if (xdp)
+ /* Workaround required for veth. */
+ SYS("ip link set tmp0 xdp object xdp_dummy.o section xdp 2> /dev/null");
+
+ ns = open_netns("synproxy");
+ if (!ASSERT_OK_PTR(ns, "setns"))
+ goto out;
+
+ SYS("ip link set lo up");
+ SYS("ip link set tmp1 up");
+ SYS("ip addr replace 198.18.0.2/24 dev tmp1");
+ SYS("sysctl -w net.ipv4.tcp_syncookies=2");
+ SYS("sysctl -w net.ipv4.tcp_timestamps=1");
+ SYS("sysctl -w net.netfilter.nf_conntrack_tcp_loose=0");
+ SYS("iptables -t raw -I PREROUTING \
+ -i tmp1 -p tcp -m tcp --syn --dport 8080 -j CT --notrack");
+ SYS("iptables -t filter -A INPUT \
+ -i tmp1 -p tcp -m tcp --dport 8080 -m state --state INVALID,UNTRACKED \
+ -j SYNPROXY --sack-perm --timestamp --wscale 7 --mss 1460");
+ SYS("iptables -t filter -A INPUT \
+ -i tmp1 -m state --state INVALID -j DROP");
+
+ ctrl_file = SYS_OUT("./xdp_synproxy --iface tmp1 --ports 8080 \
+ --single --mss4 1460 --mss6 1440 \
+ --wscale 7 --ttl 64%s", xdp ? "" : " --tc");
+ size = fread(buf, 1, sizeof(buf), ctrl_file);
+ pclose(ctrl_file);
+ if (!expect_str(buf, size, "Total SYNACKs generated: 0\n",
+ "initial SYNACKs"))
+ goto out;
+
+ if (!xdp) {
+ ctrl_file = SYS_OUT("tc filter show dev tmp1 ingress");
+ size = fread(buf, 1, sizeof(buf), ctrl_file);
+ pclose(ctrl_file);
+ prog_id = memmem(buf, size, " id ", 4);
+ if (!ASSERT_OK_PTR(prog_id, "find prog id"))
+ goto out;
+ prog_id += 4;
+ if (!ASSERT_LT(prog_id, buf + size, "find prog id begin"))
+ goto out;
+ prog_id_end = prog_id;
+ while (prog_id_end < buf + size && *prog_id_end >= '0' &&
+ *prog_id_end <= '9')
+ prog_id_end++;
+ if (!ASSERT_LT(prog_id_end, buf + size, "find prog id end"))
+ goto out;
+ *prog_id_end = '\0';
+ }
+
+ server_fd = start_server(AF_INET, SOCK_STREAM, "198.18.0.2", 8080, 0);
+ if (!ASSERT_GE(server_fd, 0, "start_server"))
+ goto out;
+
+ close_netns(ns);
+ ns = NULL;
+
+ client_fd = connect_to_fd(server_fd, 10000);
+ if (!ASSERT_GE(client_fd, 0, "connect_to_fd"))
+ goto out;
+
+ accept_fd = accept(server_fd, NULL, NULL);
+ if (!ASSERT_GE(accept_fd, 0, "accept"))
+ goto out;
+
+ ns = open_netns("synproxy");
+ if (!ASSERT_OK_PTR(ns, "setns"))
+ goto out;
+
+ if (xdp)
+ ctrl_file = SYS_OUT("./xdp_synproxy --iface tmp1 --single");
+ else
+ ctrl_file = SYS_OUT("./xdp_synproxy --prog %s --single",
+ prog_id);
+ size = fread(buf, 1, sizeof(buf), ctrl_file);
+ pclose(ctrl_file);
+ if (!expect_str(buf, size, "Total SYNACKs generated: 1\n",
+ "SYNACKs after connection"))
+ goto out;
+
+out:
+ if (accept_fd >= 0)
+ close(accept_fd);
+ if (client_fd >= 0)
+ close(client_fd);
+ if (server_fd >= 0)
+ close(server_fd);
+ if (ns)
+ close_netns(ns);
+
+ system("ip link del tmp0");
+ system("ip netns del synproxy");
+}
+
+void test_xdp_synproxy(void)
+{
+ if (test__start_subtest("xdp"))
+ test_synproxy(true);
+ if (test__start_subtest("tc"))
+ test_synproxy(false);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_hashmap_full_update_bench.c b/tools/testing/selftests/bpf/progs/bpf_hashmap_full_update_bench.c
new file mode 100644
index 000000000000..56957557e3e1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_hashmap_full_update_bench.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Bytedance */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define MAX_ENTRIES 1000
+
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __type(key, u32);
+ __type(value, u64);
+ __uint(max_entries, MAX_ENTRIES);
+} hash_map_bench SEC(".maps");
+
+u64 __attribute__((__aligned__(256))) percpu_time[256];
+u64 nr_loops;
+
+static int loop_update_callback(__u32 index, u32 *key)
+{
+ u64 init_val = 1;
+
+ bpf_map_update_elem(&hash_map_bench, key, &init_val, BPF_ANY);
+ return 0;
+}
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int benchmark(void *ctx)
+{
+ u32 cpu = bpf_get_smp_processor_id();
+ u32 key = cpu + MAX_ENTRIES;
+ u64 start_time = bpf_ktime_get_ns();
+
+ bpf_loop(nr_loops, loop_update_callback, &key, 0);
+ percpu_time[cpu & 255] = bpf_ktime_get_ns() - start_time;
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h
index 97ec8bc76ae6..e9846606690d 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter.h
+++ b/tools/testing/selftests/bpf/progs/bpf_iter.h
@@ -22,6 +22,7 @@
#define BTF_F_NONAME BTF_F_NONAME___not_used
#define BTF_F_PTR_RAW BTF_F_PTR_RAW___not_used
#define BTF_F_ZERO BTF_F_ZERO___not_used
+#define bpf_iter__ksym bpf_iter__ksym___not_used
#include "vmlinux.h"
#undef bpf_iter_meta
#undef bpf_iter__bpf_map
@@ -44,6 +45,7 @@
#undef BTF_F_NONAME
#undef BTF_F_PTR_RAW
#undef BTF_F_ZERO
+#undef bpf_iter__ksym
struct bpf_iter_meta {
struct seq_file *seq;
@@ -151,3 +153,8 @@ enum {
BTF_F_PTR_RAW = (1ULL << 2),
BTF_F_ZERO = (1ULL << 3),
};
+
+struct bpf_iter__ksym {
+ struct bpf_iter_meta *meta;
+ struct kallsym_iter *ksym;
+};
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_ksym.c b/tools/testing/selftests/bpf/progs/bpf_iter_ksym.c
new file mode 100644
index 000000000000..285c008cbf9c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_ksym.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022, Oracle and/or its affiliates. */
+#include "bpf_iter.h"
+#include <bpf/bpf_helpers.h>
+
+char _license[] SEC("license") = "GPL";
+
+unsigned long last_sym_value = 0;
+
+static inline char tolower(char c)
+{
+ if (c >= 'A' && c <= 'Z')
+ c += ('a' - 'A');
+ return c;
+}
+
+static inline char toupper(char c)
+{
+ if (c >= 'a' && c <= 'z')
+ c -= ('a' - 'A');
+ return c;
+}
+
+/* Dump symbols with max size; the latter is calculated by caching symbol N value
+ * and when iterating on symbol N+1, we can print max size of symbol N via
+ * address of N+1 - address of N.
+ */
+SEC("iter/ksym")
+int dump_ksym(struct bpf_iter__ksym *ctx)
+{
+ struct seq_file *seq = ctx->meta->seq;
+ struct kallsym_iter *iter = ctx->ksym;
+ __u32 seq_num = ctx->meta->seq_num;
+ unsigned long value;
+ char type;
+ int ret;
+
+ if (!iter)
+ return 0;
+
+ if (seq_num == 0) {
+ BPF_SEQ_PRINTF(seq, "ADDR TYPE NAME MODULE_NAME KIND MAX_SIZE\n");
+ return 0;
+ }
+ if (last_sym_value)
+ BPF_SEQ_PRINTF(seq, "0x%x\n", iter->value - last_sym_value);
+ else
+ BPF_SEQ_PRINTF(seq, "\n");
+
+ value = iter->show_value ? iter->value : 0;
+
+ last_sym_value = value;
+
+ type = iter->type;
+
+ if (iter->module_name[0]) {
+ type = iter->exported ? toupper(type) : tolower(type);
+ BPF_SEQ_PRINTF(seq, "0x%llx %c %s [ %s ] ",
+ value, type, iter->name, iter->module_name);
+ } else {
+ BPF_SEQ_PRINTF(seq, "0x%llx %c %s ", value, type, iter->name);
+ }
+ if (!iter->pos_arch_end || iter->pos_arch_end > iter->pos)
+ BPF_SEQ_PRINTF(seq, "CORE ");
+ else if (!iter->pos_mod_end || iter->pos_mod_end > iter->pos)
+ BPF_SEQ_PRINTF(seq, "MOD ");
+ else if (!iter->pos_ftrace_mod_end || iter->pos_ftrace_mod_end > iter->pos)
+ BPF_SEQ_PRINTF(seq, "FTRACE_MOD ");
+ else if (!iter->pos_bpf_end || iter->pos_bpf_end > iter->pos)
+ BPF_SEQ_PRINTF(seq, "BPF ");
+ else
+ BPF_SEQ_PRINTF(seq, "KPROBE ");
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_loop.c b/tools/testing/selftests/bpf/progs/bpf_loop.c
index e08565282759..de1fc82d2710 100644
--- a/tools/testing/selftests/bpf/progs/bpf_loop.c
+++ b/tools/testing/selftests/bpf/progs/bpf_loop.c
@@ -11,11 +11,19 @@ struct callback_ctx {
int output;
};
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(max_entries, 32);
+ __type(key, int);
+ __type(value, int);
+} map1 SEC(".maps");
+
/* These should be set by the user program */
u32 nested_callback_nr_loops;
u32 stop_index = -1;
u32 nr_loops;
int pid;
+int callback_selector;
/* Making these global variables so that the userspace program
* can verify the output through the skeleton
@@ -111,3 +119,109 @@ int prog_nested_calls(void *ctx)
return 0;
}
+
+static int callback_set_f0(int i, void *ctx)
+{
+ g_output = 0xF0;
+ return 0;
+}
+
+static int callback_set_0f(int i, void *ctx)
+{
+ g_output = 0x0F;
+ return 0;
+}
+
+/*
+ * non-constant callback is a corner case for bpf_loop inline logic
+ */
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int prog_non_constant_callback(void *ctx)
+{
+ struct callback_ctx data = {};
+
+ if (bpf_get_current_pid_tgid() >> 32 != pid)
+ return 0;
+
+ int (*callback)(int i, void *ctx);
+
+ g_output = 0;
+
+ if (callback_selector == 0x0F)
+ callback = callback_set_0f;
+ else
+ callback = callback_set_f0;
+
+ bpf_loop(1, callback, NULL, 0);
+
+ return 0;
+}
+
+static int stack_check_inner_callback(void *ctx)
+{
+ return 0;
+}
+
+static int map1_lookup_elem(int key)
+{
+ int *val = bpf_map_lookup_elem(&map1, &key);
+
+ return val ? *val : -1;
+}
+
+static void map1_update_elem(int key, int val)
+{
+ bpf_map_update_elem(&map1, &key, &val, BPF_ANY);
+}
+
+static int stack_check_outer_callback(void *ctx)
+{
+ int a = map1_lookup_elem(1);
+ int b = map1_lookup_elem(2);
+ int c = map1_lookup_elem(3);
+ int d = map1_lookup_elem(4);
+ int e = map1_lookup_elem(5);
+ int f = map1_lookup_elem(6);
+
+ bpf_loop(1, stack_check_inner_callback, NULL, 0);
+
+ map1_update_elem(1, a + 1);
+ map1_update_elem(2, b + 1);
+ map1_update_elem(3, c + 1);
+ map1_update_elem(4, d + 1);
+ map1_update_elem(5, e + 1);
+ map1_update_elem(6, f + 1);
+
+ return 0;
+}
+
+/* Some of the local variables in stack_check and
+ * stack_check_outer_callback would be allocated on stack by
+ * compiler. This test should verify that stack content for these
+ * variables is preserved between calls to bpf_loop (might be an issue
+ * if loop inlining allocates stack slots incorrectly).
+ */
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int stack_check(void *ctx)
+{
+ if (bpf_get_current_pid_tgid() >> 32 != pid)
+ return 0;
+
+ int a = map1_lookup_elem(7);
+ int b = map1_lookup_elem(8);
+ int c = map1_lookup_elem(9);
+ int d = map1_lookup_elem(10);
+ int e = map1_lookup_elem(11);
+ int f = map1_lookup_elem(12);
+
+ bpf_loop(1, stack_check_outer_callback, NULL, 0);
+
+ map1_update_elem(7, a + 1);
+ map1_update_elem(8, b + 1);
+ map1_update_elem(9, c + 1);
+ map1_update_elem(10, d + 1);
+ map1_update_elem(11, e + 1);
+ map1_update_elem(12, f + 1);
+
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
index 05838ed9b89c..e1e11897e99b 100644
--- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
+++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
@@ -64,9 +64,9 @@ int BPF_KPROBE(handle_sys_prctl)
return 0;
}
-SEC("kprobe/" SYS_PREFIX "sys_prctl")
-int BPF_KPROBE_SYSCALL(prctl_enter, int option, unsigned long arg2,
- unsigned long arg3, unsigned long arg4, unsigned long arg5)
+SEC("ksyscall/prctl")
+int BPF_KSYSCALL(prctl_enter, int option, unsigned long arg2,
+ unsigned long arg3, unsigned long arg4, unsigned long arg5)
{
pid_t pid = bpf_get_current_pid_tgid() >> 32;
diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
index 1c1289ba5fc5..98dd2c4815f0 100644
--- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
+++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
@@ -8,6 +8,7 @@
#define SOL_SOCKET 1
#define SO_SNDBUF 7
#define __SO_ACCEPTCON (1 << 16)
+#define SO_PRIORITY 12
#define SOL_TCP 6
#define TCP_CONGESTION 13
diff --git a/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val.c b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val.c
new file mode 100644
index 000000000000..888e79db6a77
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val.c
@@ -0,0 +1,3 @@
+#include "core_reloc_types.h"
+
+void f(struct core_reloc_enum64val x) {}
diff --git a/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___diff.c b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___diff.c
new file mode 100644
index 000000000000..194749130d87
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___diff.c
@@ -0,0 +1,3 @@
+#include "core_reloc_types.h"
+
+void f(struct core_reloc_enum64val___diff x) {}
diff --git a/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___err_missing.c b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___err_missing.c
new file mode 100644
index 000000000000..3d732d4193e4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___err_missing.c
@@ -0,0 +1,3 @@
+#include "core_reloc_types.h"
+
+void f(struct core_reloc_enum64val___err_missing x) {}
diff --git a/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___val3_missing.c b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___val3_missing.c
new file mode 100644
index 000000000000..17cf5d6a848d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__core_reloc_enum64val___val3_missing.c
@@ -0,0 +1,3 @@
+#include "core_reloc_types.h"
+
+void f(struct core_reloc_enum64val___val3_missing x) {}
diff --git a/tools/testing/selftests/bpf/progs/btf__core_reloc_type_based___diff.c b/tools/testing/selftests/bpf/progs/btf__core_reloc_type_based___diff.c
new file mode 100644
index 000000000000..57ae2c258928
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__core_reloc_type_based___diff.c
@@ -0,0 +1,3 @@
+#include "core_reloc_types.h"
+
+void f(struct core_reloc_type_based___diff x) {}
diff --git a/tools/testing/selftests/bpf/progs/core_reloc_types.h b/tools/testing/selftests/bpf/progs/core_reloc_types.h
index f9dc9766546e..fd8e1b4c6762 100644
--- a/tools/testing/selftests/bpf/progs/core_reloc_types.h
+++ b/tools/testing/selftests/bpf/progs/core_reloc_types.h
@@ -13,6 +13,7 @@ struct core_reloc_kernel_output {
int valid[10];
char comm[sizeof("test_progs")];
int comm_len;
+ bool local_task_struct_matches;
};
/*
@@ -860,10 +861,11 @@ struct core_reloc_size___err_ambiguous2 {
};
/*
- * TYPE EXISTENCE & SIZE
+ * TYPE EXISTENCE, MATCH & SIZE
*/
struct core_reloc_type_based_output {
bool struct_exists;
+ bool complex_struct_exists;
bool union_exists;
bool enum_exists;
bool typedef_named_struct_exists;
@@ -872,9 +874,24 @@ struct core_reloc_type_based_output {
bool typedef_int_exists;
bool typedef_enum_exists;
bool typedef_void_ptr_exists;
+ bool typedef_restrict_ptr_exists;
bool typedef_func_proto_exists;
bool typedef_arr_exists;
+ bool struct_matches;
+ bool complex_struct_matches;
+ bool union_matches;
+ bool enum_matches;
+ bool typedef_named_struct_matches;
+ bool typedef_anon_struct_matches;
+ bool typedef_struct_ptr_matches;
+ bool typedef_int_matches;
+ bool typedef_enum_matches;
+ bool typedef_void_ptr_matches;
+ bool typedef_restrict_ptr_matches;
+ bool typedef_func_proto_matches;
+ bool typedef_arr_matches;
+
int struct_sz;
int union_sz;
int enum_sz;
@@ -892,6 +909,14 @@ struct a_struct {
int x;
};
+struct a_complex_struct {
+ union {
+ struct a_struct * restrict a;
+ void *b;
+ } x;
+ volatile long y;
+};
+
union a_union {
int y;
int z;
@@ -916,6 +941,7 @@ typedef int int_typedef;
typedef enum { TYPEDEF_ENUM_VAL1, TYPEDEF_ENUM_VAL2 } enum_typedef;
typedef void *void_ptr_typedef;
+typedef int *restrict restrict_ptr_typedef;
typedef int (*func_proto_typedef)(long);
@@ -923,22 +949,86 @@ typedef char arr_typedef[20];
struct core_reloc_type_based {
struct a_struct f1;
- union a_union f2;
- enum an_enum f3;
- named_struct_typedef f4;
- anon_struct_typedef f5;
- struct_ptr_typedef f6;
- int_typedef f7;
- enum_typedef f8;
- void_ptr_typedef f9;
- func_proto_typedef f10;
- arr_typedef f11;
+ struct a_complex_struct f2;
+ union a_union f3;
+ enum an_enum f4;
+ named_struct_typedef f5;
+ anon_struct_typedef f6;
+ struct_ptr_typedef f7;
+ int_typedef f8;
+ enum_typedef f9;
+ void_ptr_typedef f10;
+ restrict_ptr_typedef f11;
+ func_proto_typedef f12;
+ arr_typedef f13;
};
/* no types in target */
struct core_reloc_type_based___all_missing {
};
+/* different member orders, enum variant values, signedness, etc */
+struct a_struct___diff {
+ int x;
+ int a;
+};
+
+struct a_struct___forward;
+
+struct a_complex_struct___diff {
+ union {
+ struct a_struct___forward *a;
+ void *b;
+ } x;
+ volatile long y;
+};
+
+union a_union___diff {
+ int z;
+ int y;
+};
+
+typedef struct a_struct___diff named_struct_typedef___diff;
+
+typedef struct { int z, x, y; } anon_struct_typedef___diff;
+
+typedef struct {
+ int c;
+ int b;
+ int a;
+} *struct_ptr_typedef___diff;
+
+enum an_enum___diff {
+ AN_ENUM_VAL2___diff = 0,
+ AN_ENUM_VAL1___diff = 42,
+ AN_ENUM_VAL3___diff = 1,
+};
+
+typedef unsigned int int_typedef___diff;
+
+typedef enum { TYPEDEF_ENUM_VAL2___diff, TYPEDEF_ENUM_VAL1___diff = 50 } enum_typedef___diff;
+
+typedef const void *void_ptr_typedef___diff;
+
+typedef int_typedef___diff (*func_proto_typedef___diff)(long);
+
+typedef char arr_typedef___diff[3];
+
+struct core_reloc_type_based___diff {
+ struct a_struct___diff f1;
+ struct a_complex_struct___diff f2;
+ union a_union___diff f3;
+ enum an_enum___diff f4;
+ named_struct_typedef___diff f5;
+ anon_struct_typedef___diff f6;
+ struct_ptr_typedef___diff f7;
+ int_typedef___diff f8;
+ enum_typedef___diff f9;
+ void_ptr_typedef___diff f10;
+ func_proto_typedef___diff f11;
+ arr_typedef___diff f12;
+};
+
/* different type sizes, extra modifiers, anon vs named enums, etc */
struct a_struct___diff_sz {
long x;
@@ -1117,6 +1207,20 @@ struct core_reloc_enumval_output {
int anon_val2;
};
+struct core_reloc_enum64val_output {
+ bool unsigned_val1_exists;
+ bool unsigned_val2_exists;
+ bool unsigned_val3_exists;
+ bool signed_val1_exists;
+ bool signed_val2_exists;
+ bool signed_val3_exists;
+
+ long unsigned_val1;
+ long unsigned_val2;
+ long signed_val1;
+ long signed_val2;
+};
+
enum named_enum {
NAMED_ENUM_VAL1 = 1,
NAMED_ENUM_VAL2 = 2,
@@ -1134,6 +1238,23 @@ struct core_reloc_enumval {
anon_enum f2;
};
+enum named_unsigned_enum64 {
+ UNSIGNED_ENUM64_VAL1 = 0x1ffffffffULL,
+ UNSIGNED_ENUM64_VAL2 = 0x2,
+ UNSIGNED_ENUM64_VAL3 = 0x3ffffffffULL,
+};
+
+enum named_signed_enum64 {
+ SIGNED_ENUM64_VAL1 = 0x1ffffffffLL,
+ SIGNED_ENUM64_VAL2 = -2,
+ SIGNED_ENUM64_VAL3 = 0x3ffffffffLL,
+};
+
+struct core_reloc_enum64val {
+ enum named_unsigned_enum64 f1;
+ enum named_signed_enum64 f2;
+};
+
/* differing enumerator values */
enum named_enum___diff {
NAMED_ENUM_VAL1___diff = 101,
@@ -1152,6 +1273,23 @@ struct core_reloc_enumval___diff {
anon_enum___diff f2;
};
+enum named_unsigned_enum64___diff {
+ UNSIGNED_ENUM64_VAL1___diff = 0x101ffffffffULL,
+ UNSIGNED_ENUM64_VAL2___diff = 0x202ffffffffULL,
+ UNSIGNED_ENUM64_VAL3___diff = 0x303ffffffffULL,
+};
+
+enum named_signed_enum64___diff {
+ SIGNED_ENUM64_VAL1___diff = -101,
+ SIGNED_ENUM64_VAL2___diff = -202,
+ SIGNED_ENUM64_VAL3___diff = -303,
+};
+
+struct core_reloc_enum64val___diff {
+ enum named_unsigned_enum64___diff f1;
+ enum named_signed_enum64___diff f2;
+};
+
/* missing (optional) third enum value */
enum named_enum___val3_missing {
NAMED_ENUM_VAL1___val3_missing = 111,
@@ -1168,6 +1306,21 @@ struct core_reloc_enumval___val3_missing {
anon_enum___val3_missing f2;
};
+enum named_unsigned_enum64___val3_missing {
+ UNSIGNED_ENUM64_VAL1___val3_missing = 0x111ffffffffULL,
+ UNSIGNED_ENUM64_VAL2___val3_missing = 0x222,
+};
+
+enum named_signed_enum64___val3_missing {
+ SIGNED_ENUM64_VAL1___val3_missing = 0x111ffffffffLL,
+ SIGNED_ENUM64_VAL2___val3_missing = -222,
+};
+
+struct core_reloc_enum64val___val3_missing {
+ enum named_unsigned_enum64___val3_missing f1;
+ enum named_signed_enum64___val3_missing f2;
+};
+
/* missing (mandatory) second enum value, should fail */
enum named_enum___err_missing {
NAMED_ENUM_VAL1___err_missing = 1,
@@ -1183,3 +1336,18 @@ struct core_reloc_enumval___err_missing {
enum named_enum___err_missing f1;
anon_enum___err_missing f2;
};
+
+enum named_unsigned_enum64___err_missing {
+ UNSIGNED_ENUM64_VAL1___err_missing = 0x1ffffffffULL,
+ UNSIGNED_ENUM64_VAL3___err_missing = 0x3ffffffffULL,
+};
+
+enum named_signed_enum64___err_missing {
+ SIGNED_ENUM64_VAL1___err_missing = 0x1ffffffffLL,
+ SIGNED_ENUM64_VAL3___err_missing = -3,
+};
+
+struct core_reloc_enum64val___err_missing {
+ enum named_unsigned_enum64___err_missing f1;
+ enum named_signed_enum64___err_missing f2;
+};
diff --git a/tools/testing/selftests/bpf/progs/local_storage_bench.c b/tools/testing/selftests/bpf/progs/local_storage_bench.c
new file mode 100644
index 000000000000..2c3234c5b73a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/local_storage_bench.c
@@ -0,0 +1,104 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#define HASHMAP_SZ 4194304
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+ __uint(max_entries, 1000);
+ __type(key, int);
+ __type(value, int);
+ __array(values, struct {
+ __uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+ __uint(map_flags, BPF_F_NO_PREALLOC);
+ __type(key, int);
+ __type(value, int);
+ });
+} array_of_local_storage_maps SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+ __uint(max_entries, 1000);
+ __type(key, int);
+ __type(value, int);
+ __array(values, struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(max_entries, HASHMAP_SZ);
+ __type(key, int);
+ __type(value, int);
+ });
+} array_of_hash_maps SEC(".maps");
+
+long important_hits;
+long hits;
+
+/* set from user-space */
+const volatile unsigned int use_hashmap;
+const volatile unsigned int hashmap_num_keys;
+const volatile unsigned int num_maps;
+const volatile unsigned int interleave;
+
+struct loop_ctx {
+ struct task_struct *task;
+ long loop_hits;
+ long loop_important_hits;
+};
+
+static int do_lookup(unsigned int elem, struct loop_ctx *lctx)
+{
+ void *map, *inner_map;
+ int idx = 0;
+
+ if (use_hashmap)
+ map = &array_of_hash_maps;
+ else
+ map = &array_of_local_storage_maps;
+
+ inner_map = bpf_map_lookup_elem(map, &elem);
+ if (!inner_map)
+ return -1;
+
+ if (use_hashmap) {
+ idx = bpf_get_prandom_u32() % hashmap_num_keys;
+ bpf_map_lookup_elem(inner_map, &idx);
+ } else {
+ bpf_task_storage_get(inner_map, lctx->task, &idx,
+ BPF_LOCAL_STORAGE_GET_F_CREATE);
+ }
+
+ lctx->loop_hits++;
+ if (!elem)
+ lctx->loop_important_hits++;
+ return 0;
+}
+
+static long loop(u32 index, void *ctx)
+{
+ struct loop_ctx *lctx = (struct loop_ctx *)ctx;
+ unsigned int map_idx = index % num_maps;
+
+ do_lookup(map_idx, lctx);
+ if (interleave && map_idx % 3 == 0)
+ do_lookup(0, lctx);
+ return 0;
+}
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int get_local(void *ctx)
+{
+ struct loop_ctx lctx;
+
+ lctx.task = bpf_get_current_task_btf();
+ lctx.loop_hits = 0;
+ lctx.loop_important_hits = 0;
+ bpf_loop(10000, &loop, &lctx, 0);
+ __sync_add_and_fetch(&hits, lctx.loop_hits);
+ __sync_add_and_fetch(&important_hits, lctx.loop_important_hits);
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/local_storage_rcu_tasks_trace_bench.c b/tools/testing/selftests/bpf/progs/local_storage_rcu_tasks_trace_bench.c
new file mode 100644
index 000000000000..03bf69f49075
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/local_storage_rcu_tasks_trace_bench.c
@@ -0,0 +1,67 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+ __uint(map_flags, BPF_F_NO_PREALLOC);
+ __type(key, int);
+ __type(value, int);
+} task_storage SEC(".maps");
+
+long hits;
+long gp_hits;
+long gp_times;
+long current_gp_start;
+long unexpected;
+bool postgp_seen;
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int get_local(void *ctx)
+{
+ struct task_struct *task;
+ int idx;
+ int *s;
+
+ idx = 0;
+ task = bpf_get_current_task_btf();
+ s = bpf_task_storage_get(&task_storage, task, &idx,
+ BPF_LOCAL_STORAGE_GET_F_CREATE);
+ if (!s)
+ return 0;
+
+ *s = 3;
+ bpf_task_storage_delete(&task_storage, task);
+ __sync_add_and_fetch(&hits, 1);
+ return 0;
+}
+
+SEC("fentry/rcu_tasks_trace_pregp_step")
+int pregp_step(struct pt_regs *ctx)
+{
+ current_gp_start = bpf_ktime_get_ns();
+ return 0;
+}
+
+SEC("fentry/rcu_tasks_trace_postgp")
+int postgp(struct pt_regs *ctx)
+{
+ if (!current_gp_start && postgp_seen) {
+ /* Will only happen if prog tracing rcu_tasks_trace_pregp_step doesn't
+ * execute before this prog
+ */
+ __sync_add_and_fetch(&unexpected, 1);
+ return 0;
+ }
+
+ __sync_add_and_fetch(&gp_times, bpf_ktime_get_ns() - current_gp_start);
+ __sync_add_and_fetch(&gp_hits, 1);
+ current_gp_start = 0;
+ postgp_seen = true;
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/lsm_cgroup.c b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
new file mode 100644
index 000000000000..4f2d60b87b75
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include "bpf_tracing_net.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#ifndef AF_PACKET
+#define AF_PACKET 17
+#endif
+
+#ifndef AF_UNIX
+#define AF_UNIX 1
+#endif
+
+#ifndef EPERM
+#define EPERM 1
+#endif
+
+struct {
+ __uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
+ __type(key, __u64);
+ __type(value, __u64);
+} cgroup_storage SEC(".maps");
+
+int called_socket_post_create;
+int called_socket_post_create2;
+int called_socket_bind;
+int called_socket_bind2;
+int called_socket_alloc;
+int called_socket_clone;
+
+static __always_inline int test_local_storage(void)
+{
+ __u64 *val;
+
+ val = bpf_get_local_storage(&cgroup_storage, 0);
+ if (!val)
+ return 0;
+ *val += 1;
+
+ return 1;
+}
+
+static __always_inline int real_create(struct socket *sock, int family,
+ int protocol)
+{
+ struct sock *sk;
+ int prio = 123;
+
+ /* Reject non-tx-only AF_PACKET. */
+ if (family == AF_PACKET && protocol != 0)
+ return 0; /* EPERM */
+
+ sk = sock->sk;
+ if (!sk)
+ return 1;
+
+ /* The rest of the sockets get default policy. */
+ if (bpf_setsockopt(sk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)))
+ return 0; /* EPERM */
+
+ /* Make sure bpf_getsockopt is allowed and works. */
+ prio = 0;
+ if (bpf_getsockopt(sk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)))
+ return 0; /* EPERM */
+ if (prio != 123)
+ return 0; /* EPERM */
+
+ /* Can access cgroup local storage. */
+ if (!test_local_storage())
+ return 0; /* EPERM */
+
+ return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family,
+ int type, int protocol, int kern)
+{
+ called_socket_post_create++;
+ return real_create(sock, family, protocol);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create2, struct socket *sock, int family,
+ int type, int protocol, int kern)
+{
+ called_socket_post_create2++;
+ return real_create(sock, family, protocol);
+}
+
+static __always_inline int real_bind(struct socket *sock,
+ struct sockaddr *address,
+ int addrlen)
+{
+ struct sockaddr_ll sa = {};
+
+ if (sock->sk->__sk_common.skc_family != AF_PACKET)
+ return 1;
+
+ if (sock->sk->sk_kern_sock)
+ return 1;
+
+ bpf_probe_read_kernel(&sa, sizeof(sa), address);
+ if (sa.sll_protocol)
+ return 0; /* EPERM */
+
+ /* Can access cgroup local storage. */
+ if (!test_local_storage())
+ return 0; /* EPERM */
+
+ return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address,
+ int addrlen)
+{
+ called_socket_bind++;
+ return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind2, struct socket *sock, struct sockaddr *address,
+ int addrlen)
+{
+ called_socket_bind2++;
+ return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_current (via bpf_lsm_current_hooks) */
+SEC("lsm_cgroup/sk_alloc_security")
+int BPF_PROG(socket_alloc, struct sock *sk, int family, gfp_t priority)
+{
+ called_socket_alloc++;
+ if (family == AF_UNIX)
+ return 0; /* EPERM */
+
+ /* Can access cgroup local storage. */
+ if (!test_local_storage())
+ return 0; /* EPERM */
+
+ return 1;
+}
+
+/* __cgroup_bpf_run_lsm_sock */
+SEC("lsm_cgroup/inet_csk_clone")
+int BPF_PROG(socket_clone, struct sock *newsk, const struct request_sock *req)
+{
+ int prio = 234;
+
+ if (!newsk)
+ return 1;
+
+ /* Accepted request sockets get a different priority. */
+ if (bpf_setsockopt(newsk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)))
+ return 1;
+
+ /* Make sure bpf_getsockopt is allowed and works. */
+ prio = 0;
+ if (bpf_getsockopt(newsk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)))
+ return 1;
+ if (prio != 234)
+ return 1;
+
+ /* Can access cgroup local storage. */
+ if (!test_local_storage())
+ return 1;
+
+ called_socket_clone++;
+
+ return 1;
+}
diff --git a/tools/testing/selftests/bpf/progs/lsm_cgroup_nonvoid.c b/tools/testing/selftests/bpf/progs/lsm_cgroup_nonvoid.c
new file mode 100644
index 000000000000..6cb0f161f417
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lsm_cgroup_nonvoid.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("lsm_cgroup/inet_csk_clone")
+int BPF_PROG(nonvoid_socket_clone, struct sock *newsk, const struct request_sock *req)
+{
+ /* Can not return any errors from void LSM hooks. */
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c b/tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
new file mode 100644
index 000000000000..7bb872fb22dd
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+static inline struct tcp_sock *tcp_sk(const struct sock *sk)
+{
+ return (struct tcp_sock *)sk;
+}
+
+SEC("struct_ops/incompl_cong_ops_ssthresh")
+__u32 BPF_PROG(incompl_cong_ops_ssthresh, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_ssthresh;
+}
+
+SEC("struct_ops/incompl_cong_ops_undo_cwnd")
+__u32 BPF_PROG(incompl_cong_ops_undo_cwnd, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_cwnd;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops incompl_cong_ops = {
+ /* Intentionally leaving out any of the required cong_avoid() and
+ * cong_control() here.
+ */
+ .ssthresh = (void *)incompl_cong_ops_ssthresh,
+ .undo_cwnd = (void *)incompl_cong_ops_undo_cwnd,
+ .name = "bpf_incompl_ops",
+};
diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c b/tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
new file mode 100644
index 000000000000..c06f4a41c21a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("struct_ops/unsupp_cong_op_get_info")
+size_t BPF_PROG(unsupp_cong_op_get_info, struct sock *sk, u32 ext, int *attr,
+ union tcp_cc_info *info)
+{
+ return 0;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops unsupp_cong_op = {
+ .get_info = (void *)unsupp_cong_op_get_info,
+ .name = "bpf_unsupp_op",
+};
diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
new file mode 100644
index 000000000000..43447704cf0e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#define USEC_PER_SEC 1000000UL
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+static inline struct tcp_sock *tcp_sk(const struct sock *sk)
+{
+ return (struct tcp_sock *)sk;
+}
+
+SEC("struct_ops/write_sk_pacing_init")
+void BPF_PROG(write_sk_pacing_init, struct sock *sk)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ __sync_bool_compare_and_swap(&sk->sk_pacing_status, SK_PACING_NONE,
+ SK_PACING_NEEDED);
+#else
+ sk->sk_pacing_status = SK_PACING_NEEDED;
+#endif
+}
+
+SEC("struct_ops/write_sk_pacing_cong_control")
+void BPF_PROG(write_sk_pacing_cong_control, struct sock *sk,
+ const struct rate_sample *rs)
+{
+ const struct tcp_sock *tp = tcp_sk(sk);
+ unsigned long rate =
+ ((tp->snd_cwnd * tp->mss_cache * USEC_PER_SEC) << 3) /
+ (tp->srtt_us ?: 1U << 3);
+ sk->sk_pacing_rate = min(rate, sk->sk_max_pacing_rate);
+}
+
+SEC("struct_ops/write_sk_pacing_ssthresh")
+__u32 BPF_PROG(write_sk_pacing_ssthresh, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_ssthresh;
+}
+
+SEC("struct_ops/write_sk_pacing_undo_cwnd")
+__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_cwnd;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops write_sk_pacing = {
+ .init = (void *)write_sk_pacing_init,
+ .cong_control = (void *)write_sk_pacing_cong_control,
+ .ssthresh = (void *)write_sk_pacing_ssthresh,
+ .undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
+ .name = "bpf_w_sk_pacing",
+};
diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c
index ce9acf4db8d2..a1e45fec8938 100644
--- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
+++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
@@ -1,10 +1,10 @@
// SPDX-License-Identifier: GPL-2.0
// Copyright (c) 2017 Facebook
-#include <linux/ptrace.h>
-#include <linux/bpf.h>
+#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
#include "bpf_misc.h"
int kprobe_res = 0;
@@ -17,6 +17,11 @@ int uprobe_byname_res = 0;
int uretprobe_byname_res = 0;
int uprobe_byname2_res = 0;
int uretprobe_byname2_res = 0;
+int uprobe_byname3_sleepable_res = 0;
+int uprobe_byname3_res = 0;
+int uretprobe_byname3_sleepable_res = 0;
+int uretprobe_byname3_res = 0;
+void *user_ptr = 0;
SEC("kprobe")
int handle_kprobe(struct pt_regs *ctx)
@@ -25,13 +30,24 @@ int handle_kprobe(struct pt_regs *ctx)
return 0;
}
-SEC("kprobe/" SYS_PREFIX "sys_nanosleep")
-int BPF_KPROBE(handle_kprobe_auto)
+SEC("ksyscall/nanosleep")
+int BPF_KSYSCALL(handle_kprobe_auto, struct __kernel_timespec *req, struct __kernel_timespec *rem)
{
kprobe2_res = 11;
return 0;
}
+/**
+ * This program will be manually made sleepable on the userspace side
+ * and should thus be unattachable.
+ */
+SEC("kprobe/" SYS_PREFIX "sys_nanosleep")
+int handle_kprobe_sleepable(struct pt_regs *ctx)
+{
+ kprobe_res = 2;
+ return 0;
+}
+
SEC("kretprobe")
int handle_kretprobe(struct pt_regs *ctx)
{
@@ -39,11 +55,11 @@ int handle_kretprobe(struct pt_regs *ctx)
return 0;
}
-SEC("kretprobe/" SYS_PREFIX "sys_nanosleep")
-int BPF_KRETPROBE(handle_kretprobe_auto)
+SEC("kretsyscall/nanosleep")
+int BPF_KRETPROBE(handle_kretprobe_auto, int ret)
{
kretprobe2_res = 22;
- return 0;
+ return ret;
}
SEC("uprobe")
@@ -93,4 +109,47 @@ int handle_uretprobe_byname2(struct pt_regs *ctx)
return 0;
}
+static __always_inline bool verify_sleepable_user_copy(void)
+{
+ char data[9];
+
+ bpf_copy_from_user(data, sizeof(data), user_ptr);
+ return bpf_strncmp(data, sizeof(data), "test_data") == 0;
+}
+
+SEC("uprobe.s//proc/self/exe:trigger_func3")
+int handle_uprobe_byname3_sleepable(struct pt_regs *ctx)
+{
+ if (verify_sleepable_user_copy())
+ uprobe_byname3_sleepable_res = 9;
+ return 0;
+}
+
+/**
+ * same target as the uprobe.s above to force sleepable and non-sleepable
+ * programs in the same bpf_prog_array
+ */
+SEC("uprobe//proc/self/exe:trigger_func3")
+int handle_uprobe_byname3(struct pt_regs *ctx)
+{
+ uprobe_byname3_res = 10;
+ return 0;
+}
+
+SEC("uretprobe.s//proc/self/exe:trigger_func3")
+int handle_uretprobe_byname3_sleepable(struct pt_regs *ctx)
+{
+ if (verify_sleepable_user_copy())
+ uretprobe_byname3_sleepable_res = 11;
+ return 0;
+}
+
+SEC("uretprobe//proc/self/exe:trigger_func3")
+int handle_uretprobe_byname3(struct pt_regs *ctx)
+{
+ uretprobe_byname3_res = 12;
+ return 0;
+}
+
+
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf.c b/tools/testing/selftests/bpf/progs/test_bpf_nf.c
index f00a9731930e..196cd8dfe42a 100644
--- a/tools/testing/selftests/bpf/progs/test_bpf_nf.c
+++ b/tools/testing/selftests/bpf/progs/test_bpf_nf.c
@@ -8,6 +8,8 @@
#define EINVAL 22
#define ENOENT 2
+extern unsigned long CONFIG_HZ __kconfig;
+
int test_einval_bpf_tuple = 0;
int test_einval_reserved = 0;
int test_einval_netns_id = 0;
@@ -16,6 +18,11 @@ int test_eproto_l4proto = 0;
int test_enonet_netns_id = 0;
int test_enoent_lookup = 0;
int test_eafnosupport = 0;
+int test_alloc_entry = -EINVAL;
+int test_insert_entry = -EAFNOSUPPORT;
+int test_succ_lookup = -ENOENT;
+u32 test_delta_timeout = 0;
+u32 test_status = 0;
struct nf_conn;
@@ -26,31 +33,44 @@ struct bpf_ct_opts___local {
u8 reserved[3];
} __attribute__((preserve_access_index));
+struct nf_conn *bpf_xdp_ct_alloc(struct xdp_md *, struct bpf_sock_tuple *, u32,
+ struct bpf_ct_opts___local *, u32) __ksym;
struct nf_conn *bpf_xdp_ct_lookup(struct xdp_md *, struct bpf_sock_tuple *, u32,
struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_skb_ct_alloc(struct __sk_buff *, struct bpf_sock_tuple *, u32,
+ struct bpf_ct_opts___local *, u32) __ksym;
struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *, struct bpf_sock_tuple *, u32,
struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_ct_insert_entry(struct nf_conn *) __ksym;
void bpf_ct_release(struct nf_conn *) __ksym;
+void bpf_ct_set_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_set_status(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_status(struct nf_conn *, u32) __ksym;
static __always_inline void
-nf_ct_test(struct nf_conn *(*func)(void *, struct bpf_sock_tuple *, u32,
- struct bpf_ct_opts___local *, u32),
+nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32,
+ struct bpf_ct_opts___local *, u32),
+ struct nf_conn *(*alloc_fn)(void *, struct bpf_sock_tuple *, u32,
+ struct bpf_ct_opts___local *, u32),
void *ctx)
{
struct bpf_ct_opts___local opts_def = { .l4proto = IPPROTO_TCP, .netns_id = -1 };
struct bpf_sock_tuple bpf_tuple;
struct nf_conn *ct;
+ int err;
__builtin_memset(&bpf_tuple, 0, sizeof(bpf_tuple.ipv4));
- ct = func(ctx, NULL, 0, &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, NULL, 0, &opts_def, sizeof(opts_def));
if (ct)
bpf_ct_release(ct);
else
test_einval_bpf_tuple = opts_def.error;
opts_def.reserved[0] = 1;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def));
opts_def.reserved[0] = 0;
opts_def.l4proto = IPPROTO_TCP;
if (ct)
@@ -59,21 +79,24 @@ nf_ct_test(struct nf_conn *(*func)(void *, struct bpf_sock_tuple *, u32,
test_einval_reserved = opts_def.error;
opts_def.netns_id = -2;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def));
opts_def.netns_id = -1;
if (ct)
bpf_ct_release(ct);
else
test_einval_netns_id = opts_def.error;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def) - 1);
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def) - 1);
if (ct)
bpf_ct_release(ct);
else
test_einval_len_opts = opts_def.error;
opts_def.l4proto = IPPROTO_ICMP;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def));
opts_def.l4proto = IPPROTO_TCP;
if (ct)
bpf_ct_release(ct);
@@ -81,37 +104,75 @@ nf_ct_test(struct nf_conn *(*func)(void *, struct bpf_sock_tuple *, u32,
test_eproto_l4proto = opts_def.error;
opts_def.netns_id = 0xf00f;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def));
opts_def.netns_id = -1;
if (ct)
bpf_ct_release(ct);
else
test_enonet_netns_id = opts_def.error;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def));
if (ct)
bpf_ct_release(ct);
else
test_enoent_lookup = opts_def.error;
- ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4) - 1, &opts_def, sizeof(opts_def));
+ ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4) - 1, &opts_def,
+ sizeof(opts_def));
if (ct)
bpf_ct_release(ct);
else
test_eafnosupport = opts_def.error;
+
+ bpf_tuple.ipv4.saddr = bpf_get_prandom_u32(); /* src IP */
+ bpf_tuple.ipv4.daddr = bpf_get_prandom_u32(); /* dst IP */
+ bpf_tuple.ipv4.sport = bpf_get_prandom_u32(); /* src port */
+ bpf_tuple.ipv4.dport = bpf_get_prandom_u32(); /* dst port */
+
+ ct = alloc_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+ sizeof(opts_def));
+ if (ct) {
+ struct nf_conn *ct_ins;
+
+ bpf_ct_set_timeout(ct, 10000);
+ bpf_ct_set_status(ct, IPS_CONFIRMED);
+
+ ct_ins = bpf_ct_insert_entry(ct);
+ if (ct_ins) {
+ struct nf_conn *ct_lk;
+
+ ct_lk = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4),
+ &opts_def, sizeof(opts_def));
+ if (ct_lk) {
+ /* update ct entry timeout */
+ bpf_ct_change_timeout(ct_lk, 10000);
+ test_delta_timeout = ct_lk->timeout - bpf_jiffies64();
+ test_delta_timeout /= CONFIG_HZ;
+ test_status = IPS_SEEN_REPLY;
+ bpf_ct_change_status(ct_lk, IPS_SEEN_REPLY);
+ bpf_ct_release(ct_lk);
+ test_succ_lookup = 0;
+ }
+ bpf_ct_release(ct_ins);
+ test_insert_entry = 0;
+ }
+ test_alloc_entry = 0;
+ }
}
SEC("xdp")
int nf_xdp_ct_test(struct xdp_md *ctx)
{
- nf_ct_test((void *)bpf_xdp_ct_lookup, ctx);
+ nf_ct_test((void *)bpf_xdp_ct_lookup, (void *)bpf_xdp_ct_alloc, ctx);
return 0;
}
SEC("tc")
int nf_skb_ct_test(struct __sk_buff *ctx)
{
- nf_ct_test((void *)bpf_skb_ct_lookup, ctx);
+ nf_ct_test((void *)bpf_skb_ct_lookup, (void *)bpf_skb_ct_alloc, ctx);
return 0;
}
diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c b/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
new file mode 100644
index 000000000000..bf79af15c808
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+
+struct nf_conn;
+
+struct bpf_ct_opts___local {
+ s32 netns_id;
+ s32 error;
+ u8 l4proto;
+ u8 reserved[3];
+} __attribute__((preserve_access_index));
+
+struct nf_conn *bpf_skb_ct_alloc(struct __sk_buff *, struct bpf_sock_tuple *, u32,
+ struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *, struct bpf_sock_tuple *, u32,
+ struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_ct_insert_entry(struct nf_conn *) __ksym;
+void bpf_ct_release(struct nf_conn *) __ksym;
+void bpf_ct_set_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_set_status(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_status(struct nf_conn *, u32) __ksym;
+
+SEC("?tc")
+int alloc_release(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ bpf_ct_release(ct);
+ return 0;
+}
+
+SEC("?tc")
+int insert_insert(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ ct = bpf_ct_insert_entry(ct);
+ if (!ct)
+ return 0;
+ ct = bpf_ct_insert_entry(ct);
+ return 0;
+}
+
+SEC("?tc")
+int lookup_insert(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_lookup(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ bpf_ct_insert_entry(ct);
+ return 0;
+}
+
+SEC("?tc")
+int set_timeout_after_insert(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ ct = bpf_ct_insert_entry(ct);
+ if (!ct)
+ return 0;
+ bpf_ct_set_timeout(ct, 0);
+ return 0;
+}
+
+SEC("?tc")
+int set_status_after_insert(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ ct = bpf_ct_insert_entry(ct);
+ if (!ct)
+ return 0;
+ bpf_ct_set_status(ct, 0);
+ return 0;
+}
+
+SEC("?tc")
+int change_timeout_after_alloc(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ bpf_ct_change_timeout(ct, 0);
+ return 0;
+}
+
+SEC("?tc")
+int change_status_after_alloc(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ bpf_ct_change_status(ct, 0);
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_btf_haskv.c b/tools/testing/selftests/bpf/progs/test_btf_haskv.c
deleted file mode 100644
index 07c94df13660..000000000000
--- a/tools/testing/selftests/bpf/progs/test_btf_haskv.c
+++ /dev/null
@@ -1,51 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (c) 2018 Facebook */
-#include <linux/bpf.h>
-#include <bpf/bpf_helpers.h>
-#include "bpf_legacy.h"
-
-struct ipv_counts {
- unsigned int v4;
- unsigned int v6;
-};
-
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
-struct bpf_map_def SEC("maps") btf_map = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(int),
- .value_size = sizeof(struct ipv_counts),
- .max_entries = 4,
-};
-#pragma GCC diagnostic pop
-
-BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
-
-__attribute__((noinline))
-int test_long_fname_2(void)
-{
- struct ipv_counts *counts;
- int key = 0;
-
- counts = bpf_map_lookup_elem(&btf_map, &key);
- if (!counts)
- return 0;
-
- counts->v6++;
-
- return 0;
-}
-
-__attribute__((noinline))
-int test_long_fname_1(void)
-{
- return test_long_fname_2();
-}
-
-SEC("dummy_tracepoint")
-int _dummy_tracepoint(void *arg)
-{
- return test_long_fname_1();
-}
-
-char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_btf_newkv.c b/tools/testing/selftests/bpf/progs/test_btf_newkv.c
index 762671a2e90c..251854a041b5 100644
--- a/tools/testing/selftests/bpf/progs/test_btf_newkv.c
+++ b/tools/testing/selftests/bpf/progs/test_btf_newkv.c
@@ -9,19 +9,6 @@ struct ipv_counts {
unsigned int v6;
};
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
-/* just to validate we can handle maps in multiple sections */
-struct bpf_map_def SEC("maps") btf_map_legacy = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(int),
- .value_size = sizeof(long long),
- .max_entries = 4,
-};
-#pragma GCC diagnostic pop
-
-BPF_ANNOTATE_KV_PAIR(btf_map_legacy, int, struct ipv_counts);
-
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(max_entries, 4);
@@ -41,11 +28,6 @@ int test_long_fname_2(void)
counts->v6++;
- /* just verify we can reference both maps */
- counts = bpf_map_lookup_elem(&btf_map_legacy, &key);
- if (!counts)
- return 0;
-
return 0;
}
diff --git a/tools/testing/selftests/bpf/progs/test_core_extern.c b/tools/testing/selftests/bpf/progs/test_core_extern.c
index 3ac3603ad53d..a3c7c1042f35 100644
--- a/tools/testing/selftests/bpf/progs/test_core_extern.c
+++ b/tools/testing/selftests/bpf/progs/test_core_extern.c
@@ -11,6 +11,7 @@
static int (*bpf_missing_helper)(const void *arg1, int arg2) = (void *) 999;
extern int LINUX_KERNEL_VERSION __kconfig;
+extern int LINUX_UNKNOWN_VIRTUAL_EXTERN __kconfig __weak;
extern bool CONFIG_BPF_SYSCALL __kconfig; /* strong */
extern enum libbpf_tristate CONFIG_TRISTATE __kconfig __weak;
extern bool CONFIG_BOOL __kconfig __weak;
@@ -22,6 +23,7 @@ extern const char CONFIG_STR[8] __kconfig __weak;
extern uint64_t CONFIG_MISSING __kconfig __weak;
uint64_t kern_ver = -1;
+uint64_t unkn_virt_val = -1;
uint64_t bpf_syscall = -1;
uint64_t tristate_val = -1;
uint64_t bool_val = -1;
@@ -38,6 +40,7 @@ int handle_sys_enter(struct pt_regs *ctx)
int i;
kern_ver = LINUX_KERNEL_VERSION;
+ unkn_virt_val = LINUX_UNKNOWN_VIRTUAL_EXTERN;
bpf_syscall = CONFIG_BPF_SYSCALL;
tristate_val = CONFIG_TRISTATE;
bool_val = CONFIG_BOOL;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_enum64val.c b/tools/testing/selftests/bpf/progs/test_core_reloc_enum64val.c
new file mode 100644
index 000000000000..63147fbfae6e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_enum64val.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+ char in[256];
+ char out[256];
+ bool skip;
+} data = {};
+
+enum named_unsigned_enum64 {
+ UNSIGNED_ENUM64_VAL1 = 0x1ffffffffULL,
+ UNSIGNED_ENUM64_VAL2 = 0x2ffffffffULL,
+ UNSIGNED_ENUM64_VAL3 = 0x3ffffffffULL,
+};
+
+enum named_signed_enum64 {
+ SIGNED_ENUM64_VAL1 = 0x1ffffffffLL,
+ SIGNED_ENUM64_VAL2 = -2,
+ SIGNED_ENUM64_VAL3 = 0x3ffffffffLL,
+};
+
+struct core_reloc_enum64val_output {
+ bool unsigned_val1_exists;
+ bool unsigned_val2_exists;
+ bool unsigned_val3_exists;
+ bool signed_val1_exists;
+ bool signed_val2_exists;
+ bool signed_val3_exists;
+
+ long unsigned_val1;
+ long unsigned_val2;
+ long signed_val1;
+ long signed_val2;
+};
+
+SEC("raw_tracepoint/sys_enter")
+int test_core_enum64val(void *ctx)
+{
+#if __clang_major__ >= 15
+ struct core_reloc_enum64val_output *out = (void *)&data.out;
+ enum named_unsigned_enum64 named_unsigned = 0;
+ enum named_signed_enum64 named_signed = 0;
+
+ out->unsigned_val1_exists = bpf_core_enum_value_exists(named_unsigned, UNSIGNED_ENUM64_VAL1);
+ out->unsigned_val2_exists = bpf_core_enum_value_exists(enum named_unsigned_enum64, UNSIGNED_ENUM64_VAL2);
+ out->unsigned_val3_exists = bpf_core_enum_value_exists(enum named_unsigned_enum64, UNSIGNED_ENUM64_VAL3);
+ out->signed_val1_exists = bpf_core_enum_value_exists(named_signed, SIGNED_ENUM64_VAL1);
+ out->signed_val2_exists = bpf_core_enum_value_exists(enum named_signed_enum64, SIGNED_ENUM64_VAL2);
+ out->signed_val3_exists = bpf_core_enum_value_exists(enum named_signed_enum64, SIGNED_ENUM64_VAL3);
+
+ out->unsigned_val1 = bpf_core_enum_value(named_unsigned, UNSIGNED_ENUM64_VAL1);
+ out->unsigned_val2 = bpf_core_enum_value(named_unsigned, UNSIGNED_ENUM64_VAL2);
+ out->signed_val1 = bpf_core_enum_value(named_signed, SIGNED_ENUM64_VAL1);
+ out->signed_val2 = bpf_core_enum_value(named_signed, SIGNED_ENUM64_VAL2);
+ /* NAMED_ENUM64_VAL3 value is optional */
+
+#else
+ data.skip = true;
+#endif
+
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c b/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
index 145028b52ad8..a17dd83eae67 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
@@ -21,6 +21,7 @@ struct core_reloc_kernel_output {
/* we have test_progs[-flavor], so cut flavor part */
char comm[sizeof("test_progs")];
int comm_len;
+ bool local_task_struct_matches;
};
struct task_struct {
@@ -30,11 +31,25 @@ struct task_struct {
struct task_struct *group_leader;
};
+struct mm_struct___wrong {
+ int abc_whatever_should_not_exist;
+};
+
+struct task_struct___local {
+ int pid;
+ struct mm_struct___wrong *mm;
+};
+
#define CORE_READ(dst, src) bpf_core_read(dst, sizeof(*(dst)), src)
SEC("raw_tracepoint/sys_enter")
int test_core_kernel(void *ctx)
{
+ /* Support for the BPF_TYPE_MATCHES argument to the
+ * __builtin_preserve_type_info builtin was added at some point during
+ * development of clang 15 and it's what we require for this test.
+ */
+#if __has_builtin(__builtin_preserve_type_info) && __clang_major__ >= 15
struct task_struct *task = (void *)bpf_get_current_task();
struct core_reloc_kernel_output *out = (void *)&data.out;
uint64_t pid_tgid = bpf_get_current_pid_tgid();
@@ -93,6 +108,10 @@ int test_core_kernel(void *ctx)
group_leader, group_leader, group_leader, group_leader,
comm);
+ out->local_task_struct_matches = bpf_core_type_matches(struct task_struct___local);
+#else
+ data.skip = true;
+#endif
return 0;
}
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_type_based.c b/tools/testing/selftests/bpf/progs/test_core_reloc_type_based.c
index fb60f8195c53..2edb4df35e6e 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_type_based.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_type_based.c
@@ -19,6 +19,14 @@ struct a_struct {
int x;
};
+struct a_complex_struct {
+ union {
+ struct a_struct *a;
+ void *b;
+ } x;
+ volatile long y;
+};
+
union a_union {
int y;
int z;
@@ -43,6 +51,7 @@ typedef int int_typedef;
typedef enum { TYPEDEF_ENUM_VAL1, TYPEDEF_ENUM_VAL2 } enum_typedef;
typedef void *void_ptr_typedef;
+typedef int *restrict restrict_ptr_typedef;
typedef int (*func_proto_typedef)(long);
@@ -50,6 +59,7 @@ typedef char arr_typedef[20];
struct core_reloc_type_based_output {
bool struct_exists;
+ bool complex_struct_exists;
bool union_exists;
bool enum_exists;
bool typedef_named_struct_exists;
@@ -58,9 +68,24 @@ struct core_reloc_type_based_output {
bool typedef_int_exists;
bool typedef_enum_exists;
bool typedef_void_ptr_exists;
+ bool typedef_restrict_ptr_exists;
bool typedef_func_proto_exists;
bool typedef_arr_exists;
+ bool struct_matches;
+ bool complex_struct_matches;
+ bool union_matches;
+ bool enum_matches;
+ bool typedef_named_struct_matches;
+ bool typedef_anon_struct_matches;
+ bool typedef_struct_ptr_matches;
+ bool typedef_int_matches;
+ bool typedef_enum_matches;
+ bool typedef_void_ptr_matches;
+ bool typedef_restrict_ptr_matches;
+ bool typedef_func_proto_matches;
+ bool typedef_arr_matches;
+
int struct_sz;
int union_sz;
int enum_sz;
@@ -77,10 +102,17 @@ struct core_reloc_type_based_output {
SEC("raw_tracepoint/sys_enter")
int test_core_type_based(void *ctx)
{
-#if __has_builtin(__builtin_preserve_type_info)
+ /* Support for the BPF_TYPE_MATCHES argument to the
+ * __builtin_preserve_type_info builtin was added at some point during
+ * development of clang 15 and it's what we require for this test. Part of it
+ * could run with merely __builtin_preserve_type_info (which could be checked
+ * separately), but we have to find an upper bound.
+ */
+#if __has_builtin(__builtin_preserve_type_info) && __clang_major__ >= 15
struct core_reloc_type_based_output *out = (void *)&data.out;
out->struct_exists = bpf_core_type_exists(struct a_struct);
+ out->complex_struct_exists = bpf_core_type_exists(struct a_complex_struct);
out->union_exists = bpf_core_type_exists(union a_union);
out->enum_exists = bpf_core_type_exists(enum an_enum);
out->typedef_named_struct_exists = bpf_core_type_exists(named_struct_typedef);
@@ -89,9 +121,24 @@ int test_core_type_based(void *ctx)
out->typedef_int_exists = bpf_core_type_exists(int_typedef);
out->typedef_enum_exists = bpf_core_type_exists(enum_typedef);
out->typedef_void_ptr_exists = bpf_core_type_exists(void_ptr_typedef);
+ out->typedef_restrict_ptr_exists = bpf_core_type_exists(restrict_ptr_typedef);
out->typedef_func_proto_exists = bpf_core_type_exists(func_proto_typedef);
out->typedef_arr_exists = bpf_core_type_exists(arr_typedef);
+ out->struct_matches = bpf_core_type_matches(struct a_struct);
+ out->complex_struct_matches = bpf_core_type_matches(struct a_complex_struct);
+ out->union_matches = bpf_core_type_matches(union a_union);
+ out->enum_matches = bpf_core_type_matches(enum an_enum);
+ out->typedef_named_struct_matches = bpf_core_type_matches(named_struct_typedef);
+ out->typedef_anon_struct_matches = bpf_core_type_matches(anon_struct_typedef);
+ out->typedef_struct_ptr_matches = bpf_core_type_matches(struct_ptr_typedef);
+ out->typedef_int_matches = bpf_core_type_matches(int_typedef);
+ out->typedef_enum_matches = bpf_core_type_matches(enum_typedef);
+ out->typedef_void_ptr_matches = bpf_core_type_matches(void_ptr_typedef);
+ out->typedef_restrict_ptr_matches = bpf_core_type_matches(restrict_ptr_typedef);
+ out->typedef_func_proto_matches = bpf_core_type_matches(func_proto_typedef);
+ out->typedef_arr_matches = bpf_core_type_matches(arr_typedef);
+
out->struct_sz = bpf_core_type_size(struct a_struct);
out->union_sz = bpf_core_type_size(union a_union);
out->enum_sz = bpf_core_type_size(enum an_enum);
diff --git a/tools/testing/selftests/bpf/progs/test_probe_user.c b/tools/testing/selftests/bpf/progs/test_probe_user.c
index 702578a5e496..a8e501af9604 100644
--- a/tools/testing/selftests/bpf/progs/test_probe_user.c
+++ b/tools/testing/selftests/bpf/progs/test_probe_user.c
@@ -1,37 +1,47 @@
// SPDX-License-Identifier: GPL-2.0
-
-#include <linux/ptrace.h>
-#include <linux/bpf.h>
-
-#include <netinet/in.h>
-
+#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
#include "bpf_misc.h"
static struct sockaddr_in old;
-SEC("kprobe/" SYS_PREFIX "sys_connect")
-int BPF_KPROBE(handle_sys_connect)
+static int handle_sys_connect_common(struct sockaddr_in *uservaddr)
{
-#if SYSCALL_WRAPPER == 1
- struct pt_regs *real_regs;
-#endif
struct sockaddr_in new;
- void *ptr;
-#if SYSCALL_WRAPPER == 0
- ptr = (void *)PT_REGS_PARM2(ctx);
-#else
- real_regs = (struct pt_regs *)PT_REGS_PARM1(ctx);
- bpf_probe_read_kernel(&ptr, sizeof(ptr), &PT_REGS_PARM2(real_regs));
+ bpf_probe_read_user(&old, sizeof(old), uservaddr);
+ __builtin_memset(&new, 0xab, sizeof(new));
+ bpf_probe_write_user(uservaddr, &new, sizeof(new));
+
+ return 0;
+}
+
+SEC("ksyscall/connect")
+int BPF_KSYSCALL(handle_sys_connect, int fd, struct sockaddr_in *uservaddr,
+ int addrlen)
+{
+ return handle_sys_connect_common(uservaddr);
+}
+
+#if defined(bpf_target_s390)
+#ifndef SYS_CONNECT
+#define SYS_CONNECT 3
#endif
- bpf_probe_read_user(&old, sizeof(old), ptr);
- __builtin_memset(&new, 0xab, sizeof(new));
- bpf_probe_write_user(ptr, &new, sizeof(new));
+SEC("ksyscall/socketcall")
+int BPF_KSYSCALL(handle_sys_socketcall, int call, unsigned long *args)
+{
+ if (call == SYS_CONNECT) {
+ struct sockaddr_in *uservaddr;
+
+ bpf_probe_read_user(&uservaddr, sizeof(uservaddr), &args[1]);
+ return handle_sys_connect_common(uservaddr);
+ }
return 0;
}
+#endif
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c
index 1b1187d2967b..1a4e93f6d9df 100644
--- a/tools/testing/selftests/bpf/progs/test_skeleton.c
+++ b/tools/testing/selftests/bpf/progs/test_skeleton.c
@@ -51,6 +51,8 @@ int out_dynarr[4] SEC(".data.dyn") = { 1, 2, 3, 4 };
int read_mostly_var __read_mostly;
int out_mostly_var;
+char huge_arr[16 * 1024 * 1024];
+
SEC("raw_tp/sys_enter")
int handler(const void *ctx)
{
@@ -71,6 +73,8 @@ int handler(const void *ctx)
out_mostly_var = read_mostly_var;
+ huge_arr[sizeof(huge_arr) - 1] = 123;
+
return 0;
}
diff --git a/tools/testing/selftests/bpf/progs/test_tc_dtime.c b/tools/testing/selftests/bpf/progs/test_tc_dtime.c
index 06f300d06dbd..b596479a9ebe 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_dtime.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_dtime.c
@@ -11,6 +11,8 @@
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <sys/socket.h>
@@ -115,6 +117,19 @@ static bool bpf_fwd(void)
return test < TCP_IP4_RT_FWD;
}
+static __u8 get_proto(void)
+{
+ switch (test) {
+ case UDP_IP4:
+ case UDP_IP6:
+ case UDP_IP4_RT_FWD:
+ case UDP_IP6_RT_FWD:
+ return IPPROTO_UDP;
+ default:
+ return IPPROTO_TCP;
+ }
+}
+
/* -1: parse error: TC_ACT_SHOT
* 0: not testing traffic: TC_ACT_OK
* >0: first byte is the inet_proto, second byte has the netns
@@ -122,11 +137,16 @@ static bool bpf_fwd(void)
*/
static int skb_get_type(struct __sk_buff *skb)
{
+ __u16 dst_ns_port = __bpf_htons(50000 + test);
void *data_end = ctx_ptr(skb->data_end);
void *data = ctx_ptr(skb->data);
__u8 inet_proto = 0, ns = 0;
struct ipv6hdr *ip6h;
+ __u16 sport, dport;
struct iphdr *iph;
+ struct tcphdr *th;
+ struct udphdr *uh;
+ void *trans;
switch (skb->protocol) {
case __bpf_htons(ETH_P_IP):
@@ -138,6 +158,7 @@ static int skb_get_type(struct __sk_buff *skb)
else if (iph->saddr == ip4_dst)
ns = DST_NS;
inet_proto = iph->protocol;
+ trans = iph + 1;
break;
case __bpf_htons(ETH_P_IPV6):
ip6h = data + sizeof(struct ethhdr);
@@ -148,15 +169,43 @@ static int skb_get_type(struct __sk_buff *skb)
else if (v6_equal(ip6h->saddr, (struct in6_addr)ip6_dst))
ns = DST_NS;
inet_proto = ip6h->nexthdr;
+ trans = ip6h + 1;
break;
default:
return 0;
}
- if ((inet_proto != IPPROTO_TCP && inet_proto != IPPROTO_UDP) || !ns)
+ /* skb is not from src_ns or dst_ns.
+ * skb is not the testing IPPROTO.
+ */
+ if (!ns || inet_proto != get_proto())
return 0;
- return (ns << 8 | inet_proto);
+ switch (inet_proto) {
+ case IPPROTO_TCP:
+ th = trans;
+ if (th + 1 > data_end)
+ return -1;
+ sport = th->source;
+ dport = th->dest;
+ break;
+ case IPPROTO_UDP:
+ uh = trans;
+ if (uh + 1 > data_end)
+ return -1;
+ sport = uh->source;
+ dport = uh->dest;
+ break;
+ default:
+ return 0;
+ }
+
+ /* The skb is the testing traffic */
+ if ((ns == SRC_NS && dport == dst_ns_port) ||
+ (ns == DST_NS && sport == dst_ns_port))
+ return (ns << 8 | inet_proto);
+
+ return 0;
}
/* format: direction@iface@netns
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
index 17f2f325b3f3..df0673c4ecbe 100644
--- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
+++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
@@ -14,15 +14,24 @@
#include <linux/if_packet.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
+#include <linux/icmp.h>
#include <linux/types.h>
#include <linux/socket.h>
#include <linux/pkt_cls.h>
#include <linux/erspan.h>
+#include <linux/udp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#define log_err(__ret) bpf_printk("ERROR line:%d ret:%d\n", __LINE__, __ret)
+#define VXLAN_UDP_PORT 4789
+
+/* Only IPv4 address assigned to veth1.
+ * 172.16.1.200
+ */
+#define ASSIGNED_ADDR_VETH1 0xac1001c8
+
struct geneve_opt {
__be16 opt_class;
__u8 type;
@@ -33,6 +42,11 @@ struct geneve_opt {
__u8 opt_data[8]; /* hard-coded to 8 byte */
};
+struct vxlanhdr {
+ __be32 vx_flags;
+ __be32 vx_vni;
+} __attribute__((packed));
+
struct vxlan_metadata {
__u32 gbp;
};
@@ -369,14 +383,8 @@ int vxlan_get_tunnel_src(struct __sk_buff *skb)
int ret;
struct bpf_tunnel_key key;
struct vxlan_metadata md;
+ __u32 orig_daddr;
__u32 index = 0;
- __u32 *local_ip = NULL;
-
- local_ip = bpf_map_lookup_elem(&local_ip_map, &index);
- if (!local_ip) {
- log_err(ret);
- return TC_ACT_SHOT;
- }
ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
if (ret < 0) {
@@ -390,11 +398,10 @@ int vxlan_get_tunnel_src(struct __sk_buff *skb)
return TC_ACT_SHOT;
}
- if (key.local_ipv4 != *local_ip || md.gbp != 0x800FF) {
+ if (key.local_ipv4 != ASSIGNED_ADDR_VETH1 || md.gbp != 0x800FF) {
bpf_printk("vxlan key %d local ip 0x%x remote ip 0x%x gbp 0x%x\n",
key.tunnel_id, key.local_ipv4,
key.remote_ipv4, md.gbp);
- bpf_printk("local_ip 0x%x\n", *local_ip);
log_err(ret);
return TC_ACT_SHOT;
}
@@ -403,6 +410,61 @@ int vxlan_get_tunnel_src(struct __sk_buff *skb)
}
SEC("tc")
+int veth_set_outer_dst(struct __sk_buff *skb)
+{
+ struct ethhdr *eth = (struct ethhdr *)(long)skb->data;
+ __u32 assigned_ip = bpf_htonl(ASSIGNED_ADDR_VETH1);
+ void *data_end = (void *)(long)skb->data_end;
+ struct udphdr *udph;
+ struct iphdr *iph;
+ __u32 index = 0;
+ int ret = 0;
+ int shrink;
+ __s64 csum;
+
+ if ((void *)eth + sizeof(*eth) > data_end) {
+ log_err(ret);
+ return TC_ACT_SHOT;
+ }
+
+ if (eth->h_proto != bpf_htons(ETH_P_IP))
+ return TC_ACT_OK;
+
+ iph = (struct iphdr *)(eth + 1);
+ if ((void *)iph + sizeof(*iph) > data_end) {
+ log_err(ret);
+ return TC_ACT_SHOT;
+ }
+ if (iph->protocol != IPPROTO_UDP)
+ return TC_ACT_OK;
+
+ udph = (struct udphdr *)(iph + 1);
+ if ((void *)udph + sizeof(*udph) > data_end) {
+ log_err(ret);
+ return TC_ACT_SHOT;
+ }
+ if (udph->dest != bpf_htons(VXLAN_UDP_PORT))
+ return TC_ACT_OK;
+
+ if (iph->daddr != assigned_ip) {
+ csum = bpf_csum_diff(&iph->daddr, sizeof(__u32), &assigned_ip,
+ sizeof(__u32), 0);
+ if (bpf_skb_store_bytes(skb, ETH_HLEN + offsetof(struct iphdr, daddr),
+ &assigned_ip, sizeof(__u32), 0) < 0) {
+ log_err(ret);
+ return TC_ACT_SHOT;
+ }
+ if (bpf_l3_csum_replace(skb, ETH_HLEN + offsetof(struct iphdr, check),
+ 0, csum, 0) < 0) {
+ log_err(ret);
+ return TC_ACT_SHOT;
+ }
+ bpf_skb_change_type(skb, PACKET_HOST);
+ }
+ return TC_ACT_OK;
+}
+
+SEC("tc")
int ip6vxlan_set_tunnel_dst(struct __sk_buff *skb)
{
struct bpf_tunnel_key key;
diff --git a/tools/testing/selftests/bpf/progs/test_varlen.c b/tools/testing/selftests/bpf/progs/test_varlen.c
index 913acdffd90f..3987ff174f1f 100644
--- a/tools/testing/selftests/bpf/progs/test_varlen.c
+++ b/tools/testing/selftests/bpf/progs/test_varlen.c
@@ -41,20 +41,20 @@ int handler64_unsigned(void *regs)
{
int pid = bpf_get_current_pid_tgid() >> 32;
void *payload = payload1;
- u64 len;
+ long len;
/* ignore irrelevant invocations */
if (test_pid != pid || !capture)
return 0;
len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
- if (len <= MAX_LEN) {
+ if (len >= 0) {
payload += len;
payload1_len1 = len;
}
len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
- if (len <= MAX_LEN) {
+ if (len >= 0) {
payload += len;
payload1_len2 = len;
}
@@ -123,7 +123,7 @@ int handler32_signed(void *regs)
{
int pid = bpf_get_current_pid_tgid() >> 32;
void *payload = payload4;
- int len;
+ long len;
/* ignore irrelevant invocations */
if (test_pid != pid || !capture)
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_noinline.c b/tools/testing/selftests/bpf/progs/test_xdp_noinline.c
index 125d872d7981..ba48fcb98ab2 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_noinline.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_noinline.c
@@ -239,7 +239,7 @@ bool parse_udp(void *data, void *data_end,
udp = data + off;
if (udp + 1 > data_end)
- return 0;
+ return false;
if (!is_icmp) {
pckt->flow.port16[0] = udp->source;
pckt->flow.port16[1] = udp->dest;
@@ -247,7 +247,7 @@ bool parse_udp(void *data, void *data_end,
pckt->flow.port16[0] = udp->dest;
pckt->flow.port16[1] = udp->source;
}
- return 1;
+ return true;
}
static __attribute__ ((noinline))
@@ -261,7 +261,7 @@ bool parse_tcp(void *data, void *data_end,
tcp = data + off;
if (tcp + 1 > data_end)
- return 0;
+ return false;
if (tcp->syn)
pckt->flags |= (1 << 1);
if (!is_icmp) {
@@ -271,7 +271,7 @@ bool parse_tcp(void *data, void *data_end,
pckt->flow.port16[0] = tcp->dest;
pckt->flow.port16[1] = tcp->source;
}
- return 1;
+ return true;
}
static __attribute__ ((noinline))
@@ -287,7 +287,7 @@ bool encap_v6(struct xdp_md *xdp, struct ctl_value *cval,
void *data;
if (bpf_xdp_adjust_head(xdp, 0 - (int)sizeof(struct ipv6hdr)))
- return 0;
+ return false;
data = (void *)(long)xdp->data;
data_end = (void *)(long)xdp->data_end;
new_eth = data;
@@ -295,7 +295,7 @@ bool encap_v6(struct xdp_md *xdp, struct ctl_value *cval,
old_eth = data + sizeof(struct ipv6hdr);
if (new_eth + 1 > data_end ||
old_eth + 1 > data_end || ip6h + 1 > data_end)
- return 0;
+ return false;
memcpy(new_eth->eth_dest, cval->mac, 6);
memcpy(new_eth->eth_source, old_eth->eth_dest, 6);
new_eth->eth_proto = 56710;
@@ -314,7 +314,7 @@ bool encap_v6(struct xdp_md *xdp, struct ctl_value *cval,
ip6h->saddr.in6_u.u6_addr32[2] = 3;
ip6h->saddr.in6_u.u6_addr32[3] = ip_suffix;
memcpy(ip6h->daddr.in6_u.u6_addr32, dst->dstv6, 16);
- return 1;
+ return true;
}
static __attribute__ ((noinline))
@@ -335,7 +335,7 @@ bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval,
ip_suffix <<= 15;
ip_suffix ^= pckt->flow.src;
if (bpf_xdp_adjust_head(xdp, 0 - (int)sizeof(struct iphdr)))
- return 0;
+ return false;
data = (void *)(long)xdp->data;
data_end = (void *)(long)xdp->data_end;
new_eth = data;
@@ -343,7 +343,7 @@ bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval,
old_eth = data + sizeof(struct iphdr);
if (new_eth + 1 > data_end ||
old_eth + 1 > data_end || iph + 1 > data_end)
- return 0;
+ return false;
memcpy(new_eth->eth_dest, cval->mac, 6);
memcpy(new_eth->eth_source, old_eth->eth_dest, 6);
new_eth->eth_proto = 8;
@@ -367,8 +367,8 @@ bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval,
csum += *next_iph_u16++;
iph->check = ~((csum & 0xffff) + (csum >> 16));
if (bpf_xdp_adjust_head(xdp, (int)sizeof(struct iphdr)))
- return 0;
- return 1;
+ return false;
+ return true;
}
static __attribute__ ((noinline))
@@ -386,10 +386,10 @@ bool decap_v6(struct xdp_md *xdp, void **data, void **data_end, bool inner_v4)
else
new_eth->eth_proto = 56710;
if (bpf_xdp_adjust_head(xdp, (int)sizeof(struct ipv6hdr)))
- return 0;
+ return false;
*data = (void *)(long)xdp->data;
*data_end = (void *)(long)xdp->data_end;
- return 1;
+ return true;
}
static __attribute__ ((noinline))
@@ -404,10 +404,10 @@ bool decap_v4(struct xdp_md *xdp, void **data, void **data_end)
memcpy(new_eth->eth_dest, old_eth->eth_dest, 6);
new_eth->eth_proto = 8;
if (bpf_xdp_adjust_head(xdp, (int)sizeof(struct iphdr)))
- return 0;
+ return false;
*data = (void *)(long)xdp->data;
*data_end = (void *)(long)xdp->data_end;
- return 1;
+ return true;
}
static __attribute__ ((noinline))
diff --git a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
new file mode 100644
index 000000000000..736686e903f6
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
@@ -0,0 +1,843 @@
+// SPDX-License-Identifier: LGPL-2.1 OR BSD-2-Clause
+/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_endian.h>
+#include <asm/errno.h>
+
+#define TC_ACT_OK 0
+#define TC_ACT_SHOT 2
+
+#define NSEC_PER_SEC 1000000000L
+
+#define ETH_ALEN 6
+#define ETH_P_IP 0x0800
+#define ETH_P_IPV6 0x86DD
+
+#define tcp_flag_word(tp) (((union tcp_word_hdr *)(tp))->words[3])
+
+#define IP_DF 0x4000
+#define IP_MF 0x2000
+#define IP_OFFSET 0x1fff
+
+#define NEXTHDR_TCP 6
+
+#define TCPOPT_NOP 1
+#define TCPOPT_EOL 0
+#define TCPOPT_MSS 2
+#define TCPOPT_WINDOW 3
+#define TCPOPT_SACK_PERM 4
+#define TCPOPT_TIMESTAMP 8
+
+#define TCPOLEN_MSS 4
+#define TCPOLEN_WINDOW 3
+#define TCPOLEN_SACK_PERM 2
+#define TCPOLEN_TIMESTAMP 10
+
+#define TCP_TS_HZ 1000
+#define TS_OPT_WSCALE_MASK 0xf
+#define TS_OPT_SACK (1 << 4)
+#define TS_OPT_ECN (1 << 5)
+#define TSBITS 6
+#define TSMASK (((__u32)1 << TSBITS) - 1)
+#define TCP_MAX_WSCALE 14U
+
+#define IPV4_MAXLEN 60
+#define TCP_MAXLEN 60
+
+#define DEFAULT_MSS4 1460
+#define DEFAULT_MSS6 1440
+#define DEFAULT_WSCALE 7
+#define DEFAULT_TTL 64
+#define MAX_ALLOWED_PORTS 8
+
+#define swap(a, b) \
+ do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
+
+#define __get_unaligned_t(type, ptr) ({ \
+ const struct { type x; } __attribute__((__packed__)) *__pptr = (typeof(__pptr))(ptr); \
+ __pptr->x; \
+})
+
+#define get_unaligned(ptr) __get_unaligned_t(typeof(*(ptr)), (ptr))
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, __u32);
+ __type(value, __u64);
+ __uint(max_entries, 2);
+} values SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, __u32);
+ __type(value, __u16);
+ __uint(max_entries, MAX_ALLOWED_PORTS);
+} allowed_ports SEC(".maps");
+
+/* Some symbols defined in net/netfilter/nf_conntrack_bpf.c are unavailable in
+ * vmlinux.h if CONFIG_NF_CONNTRACK=m, so they are redefined locally.
+ */
+
+struct bpf_ct_opts___local {
+ s32 netns_id;
+ s32 error;
+ u8 l4proto;
+ u8 dir;
+ u8 reserved[2];
+} __attribute__((preserve_access_index));
+
+#define BPF_F_CURRENT_NETNS (-1)
+
+extern struct nf_conn *bpf_xdp_ct_lookup(struct xdp_md *xdp_ctx,
+ struct bpf_sock_tuple *bpf_tuple,
+ __u32 len_tuple,
+ struct bpf_ct_opts___local *opts,
+ __u32 len_opts) __ksym;
+
+extern struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *skb_ctx,
+ struct bpf_sock_tuple *bpf_tuple,
+ u32 len_tuple,
+ struct bpf_ct_opts___local *opts,
+ u32 len_opts) __ksym;
+
+extern void bpf_ct_release(struct nf_conn *ct) __ksym;
+
+static __always_inline void swap_eth_addr(__u8 *a, __u8 *b)
+{
+ __u8 tmp[ETH_ALEN];
+
+ __builtin_memcpy(tmp, a, ETH_ALEN);
+ __builtin_memcpy(a, b, ETH_ALEN);
+ __builtin_memcpy(b, tmp, ETH_ALEN);
+}
+
+static __always_inline __u16 csum_fold(__u32 csum)
+{
+ csum = (csum & 0xffff) + (csum >> 16);
+ csum = (csum & 0xffff) + (csum >> 16);
+ return (__u16)~csum;
+}
+
+static __always_inline __u16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
+ __u32 len, __u8 proto,
+ __u32 csum)
+{
+ __u64 s = csum;
+
+ s += (__u32)saddr;
+ s += (__u32)daddr;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+ s += proto + len;
+#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+ s += (proto + len) << 8;
+#else
+#error Unknown endian
+#endif
+ s = (s & 0xffffffff) + (s >> 32);
+ s = (s & 0xffffffff) + (s >> 32);
+
+ return csum_fold((__u32)s);
+}
+
+static __always_inline __u16 csum_ipv6_magic(const struct in6_addr *saddr,
+ const struct in6_addr *daddr,
+ __u32 len, __u8 proto, __u32 csum)
+{
+ __u64 sum = csum;
+ int i;
+
+#pragma unroll
+ for (i = 0; i < 4; i++)
+ sum += (__u32)saddr->in6_u.u6_addr32[i];
+
+#pragma unroll
+ for (i = 0; i < 4; i++)
+ sum += (__u32)daddr->in6_u.u6_addr32[i];
+
+ /* Don't combine additions to avoid 32-bit overflow. */
+ sum += bpf_htonl(len);
+ sum += bpf_htonl(proto);
+
+ sum = (sum & 0xffffffff) + (sum >> 32);
+ sum = (sum & 0xffffffff) + (sum >> 32);
+
+ return csum_fold((__u32)sum);
+}
+
+static __always_inline __u64 tcp_clock_ns(void)
+{
+ return bpf_ktime_get_ns();
+}
+
+static __always_inline __u32 tcp_ns_to_ts(__u64 ns)
+{
+ return ns / (NSEC_PER_SEC / TCP_TS_HZ);
+}
+
+static __always_inline __u32 tcp_time_stamp_raw(void)
+{
+ return tcp_ns_to_ts(tcp_clock_ns());
+}
+
+struct tcpopt_context {
+ __u8 *ptr;
+ __u8 *end;
+ void *data_end;
+ __be32 *tsecr;
+ __u8 wscale;
+ bool option_timestamp;
+ bool option_sack;
+};
+
+static int tscookie_tcpopt_parse(struct tcpopt_context *ctx)
+{
+ __u8 opcode, opsize;
+
+ if (ctx->ptr >= ctx->end)
+ return 1;
+ if (ctx->ptr >= ctx->data_end)
+ return 1;
+
+ opcode = ctx->ptr[0];
+
+ if (opcode == TCPOPT_EOL)
+ return 1;
+ if (opcode == TCPOPT_NOP) {
+ ++ctx->ptr;
+ return 0;
+ }
+
+ if (ctx->ptr + 1 >= ctx->end)
+ return 1;
+ if (ctx->ptr + 1 >= ctx->data_end)
+ return 1;
+ opsize = ctx->ptr[1];
+ if (opsize < 2)
+ return 1;
+
+ if (ctx->ptr + opsize > ctx->end)
+ return 1;
+
+ switch (opcode) {
+ case TCPOPT_WINDOW:
+ if (opsize == TCPOLEN_WINDOW && ctx->ptr + TCPOLEN_WINDOW <= ctx->data_end)
+ ctx->wscale = ctx->ptr[2] < TCP_MAX_WSCALE ? ctx->ptr[2] : TCP_MAX_WSCALE;
+ break;
+ case TCPOPT_TIMESTAMP:
+ if (opsize == TCPOLEN_TIMESTAMP && ctx->ptr + TCPOLEN_TIMESTAMP <= ctx->data_end) {
+ ctx->option_timestamp = true;
+ /* Client's tsval becomes our tsecr. */
+ *ctx->tsecr = get_unaligned((__be32 *)(ctx->ptr + 2));
+ }
+ break;
+ case TCPOPT_SACK_PERM:
+ if (opsize == TCPOLEN_SACK_PERM)
+ ctx->option_sack = true;
+ break;
+ }
+
+ ctx->ptr += opsize;
+
+ return 0;
+}
+
+static int tscookie_tcpopt_parse_batch(__u32 index, void *context)
+{
+ int i;
+
+ for (i = 0; i < 7; i++)
+ if (tscookie_tcpopt_parse(context))
+ return 1;
+ return 0;
+}
+
+static __always_inline bool tscookie_init(struct tcphdr *tcp_header,
+ __u16 tcp_len, __be32 *tsval,
+ __be32 *tsecr, void *data_end)
+{
+ struct tcpopt_context loop_ctx = {
+ .ptr = (__u8 *)(tcp_header + 1),
+ .end = (__u8 *)tcp_header + tcp_len,
+ .data_end = data_end,
+ .tsecr = tsecr,
+ .wscale = TS_OPT_WSCALE_MASK,
+ .option_timestamp = false,
+ .option_sack = false,
+ };
+ u32 cookie;
+
+ bpf_loop(6, tscookie_tcpopt_parse_batch, &loop_ctx, 0);
+
+ if (!loop_ctx.option_timestamp)
+ return false;
+
+ cookie = tcp_time_stamp_raw() & ~TSMASK;
+ cookie |= loop_ctx.wscale & TS_OPT_WSCALE_MASK;
+ if (loop_ctx.option_sack)
+ cookie |= TS_OPT_SACK;
+ if (tcp_header->ece && tcp_header->cwr)
+ cookie |= TS_OPT_ECN;
+ *tsval = bpf_htonl(cookie);
+
+ return true;
+}
+
+static __always_inline void values_get_tcpipopts(__u16 *mss, __u8 *wscale,
+ __u8 *ttl, bool ipv6)
+{
+ __u32 key = 0;
+ __u64 *value;
+
+ value = bpf_map_lookup_elem(&values, &key);
+ if (value && *value != 0) {
+ if (ipv6)
+ *mss = (*value >> 32) & 0xffff;
+ else
+ *mss = *value & 0xffff;
+ *wscale = (*value >> 16) & 0xf;
+ *ttl = (*value >> 24) & 0xff;
+ return;
+ }
+
+ *mss = ipv6 ? DEFAULT_MSS6 : DEFAULT_MSS4;
+ *wscale = DEFAULT_WSCALE;
+ *ttl = DEFAULT_TTL;
+}
+
+static __always_inline void values_inc_synacks(void)
+{
+ __u32 key = 1;
+ __u32 *value;
+
+ value = bpf_map_lookup_elem(&values, &key);
+ if (value)
+ __sync_fetch_and_add(value, 1);
+}
+
+static __always_inline bool check_port_allowed(__u16 port)
+{
+ __u32 i;
+
+ for (i = 0; i < MAX_ALLOWED_PORTS; i++) {
+ __u32 key = i;
+ __u16 *value;
+
+ value = bpf_map_lookup_elem(&allowed_ports, &key);
+
+ if (!value)
+ break;
+ /* 0 is a terminator value. Check it first to avoid matching on
+ * a forbidden port == 0 and returning true.
+ */
+ if (*value == 0)
+ break;
+
+ if (*value == port)
+ return true;
+ }
+
+ return false;
+}
+
+struct header_pointers {
+ struct ethhdr *eth;
+ struct iphdr *ipv4;
+ struct ipv6hdr *ipv6;
+ struct tcphdr *tcp;
+ __u16 tcp_len;
+};
+
+static __always_inline int tcp_dissect(void *data, void *data_end,
+ struct header_pointers *hdr)
+{
+ hdr->eth = data;
+ if (hdr->eth + 1 > data_end)
+ return XDP_DROP;
+
+ switch (bpf_ntohs(hdr->eth->h_proto)) {
+ case ETH_P_IP:
+ hdr->ipv6 = NULL;
+
+ hdr->ipv4 = (void *)hdr->eth + sizeof(*hdr->eth);
+ if (hdr->ipv4 + 1 > data_end)
+ return XDP_DROP;
+ if (hdr->ipv4->ihl * 4 < sizeof(*hdr->ipv4))
+ return XDP_DROP;
+ if (hdr->ipv4->version != 4)
+ return XDP_DROP;
+
+ if (hdr->ipv4->protocol != IPPROTO_TCP)
+ return XDP_PASS;
+
+ hdr->tcp = (void *)hdr->ipv4 + hdr->ipv4->ihl * 4;
+ break;
+ case ETH_P_IPV6:
+ hdr->ipv4 = NULL;
+
+ hdr->ipv6 = (void *)hdr->eth + sizeof(*hdr->eth);
+ if (hdr->ipv6 + 1 > data_end)
+ return XDP_DROP;
+ if (hdr->ipv6->version != 6)
+ return XDP_DROP;
+
+ /* XXX: Extension headers are not supported and could circumvent
+ * XDP SYN flood protection.
+ */
+ if (hdr->ipv6->nexthdr != NEXTHDR_TCP)
+ return XDP_PASS;
+
+ hdr->tcp = (void *)hdr->ipv6 + sizeof(*hdr->ipv6);
+ break;
+ default:
+ /* XXX: VLANs will circumvent XDP SYN flood protection. */
+ return XDP_PASS;
+ }
+
+ if (hdr->tcp + 1 > data_end)
+ return XDP_DROP;
+ hdr->tcp_len = hdr->tcp->doff * 4;
+ if (hdr->tcp_len < sizeof(*hdr->tcp))
+ return XDP_DROP;
+
+ return XDP_TX;
+}
+
+static __always_inline int tcp_lookup(void *ctx, struct header_pointers *hdr, bool xdp)
+{
+ struct bpf_ct_opts___local ct_lookup_opts = {
+ .netns_id = BPF_F_CURRENT_NETNS,
+ .l4proto = IPPROTO_TCP,
+ };
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+ __u32 tup_size;
+
+ if (hdr->ipv4) {
+ /* TCP doesn't normally use fragments, and XDP can't reassemble
+ * them.
+ */
+ if ((hdr->ipv4->frag_off & bpf_htons(IP_DF | IP_MF | IP_OFFSET)) != bpf_htons(IP_DF))
+ return XDP_DROP;
+
+ tup.ipv4.saddr = hdr->ipv4->saddr;
+ tup.ipv4.daddr = hdr->ipv4->daddr;
+ tup.ipv4.sport = hdr->tcp->source;
+ tup.ipv4.dport = hdr->tcp->dest;
+ tup_size = sizeof(tup.ipv4);
+ } else if (hdr->ipv6) {
+ __builtin_memcpy(tup.ipv6.saddr, &hdr->ipv6->saddr, sizeof(tup.ipv6.saddr));
+ __builtin_memcpy(tup.ipv6.daddr, &hdr->ipv6->daddr, sizeof(tup.ipv6.daddr));
+ tup.ipv6.sport = hdr->tcp->source;
+ tup.ipv6.dport = hdr->tcp->dest;
+ tup_size = sizeof(tup.ipv6);
+ } else {
+ /* The verifier can't track that either ipv4 or ipv6 is not
+ * NULL.
+ */
+ return XDP_ABORTED;
+ }
+ if (xdp)
+ ct = bpf_xdp_ct_lookup(ctx, &tup, tup_size, &ct_lookup_opts, sizeof(ct_lookup_opts));
+ else
+ ct = bpf_skb_ct_lookup(ctx, &tup, tup_size, &ct_lookup_opts, sizeof(ct_lookup_opts));
+ if (ct) {
+ unsigned long status = ct->status;
+
+ bpf_ct_release(ct);
+ if (status & IPS_CONFIRMED_BIT)
+ return XDP_PASS;
+ } else if (ct_lookup_opts.error != -ENOENT) {
+ return XDP_ABORTED;
+ }
+
+ /* error == -ENOENT || !(status & IPS_CONFIRMED_BIT) */
+ return XDP_TX;
+}
+
+static __always_inline __u8 tcp_mkoptions(__be32 *buf, __be32 *tsopt, __u16 mss,
+ __u8 wscale)
+{
+ __be32 *start = buf;
+
+ *buf++ = bpf_htonl((TCPOPT_MSS << 24) | (TCPOLEN_MSS << 16) | mss);
+
+ if (!tsopt)
+ return buf - start;
+
+ if (tsopt[0] & bpf_htonl(1 << 4))
+ *buf++ = bpf_htonl((TCPOPT_SACK_PERM << 24) |
+ (TCPOLEN_SACK_PERM << 16) |
+ (TCPOPT_TIMESTAMP << 8) |
+ TCPOLEN_TIMESTAMP);
+ else
+ *buf++ = bpf_htonl((TCPOPT_NOP << 24) |
+ (TCPOPT_NOP << 16) |
+ (TCPOPT_TIMESTAMP << 8) |
+ TCPOLEN_TIMESTAMP);
+ *buf++ = tsopt[0];
+ *buf++ = tsopt[1];
+
+ if ((tsopt[0] & bpf_htonl(0xf)) != bpf_htonl(0xf))
+ *buf++ = bpf_htonl((TCPOPT_NOP << 24) |
+ (TCPOPT_WINDOW << 16) |
+ (TCPOLEN_WINDOW << 8) |
+ wscale);
+
+ return buf - start;
+}
+
+static __always_inline void tcp_gen_synack(struct tcphdr *tcp_header,
+ __u32 cookie, __be32 *tsopt,
+ __u16 mss, __u8 wscale)
+{
+ void *tcp_options;
+
+ tcp_flag_word(tcp_header) = TCP_FLAG_SYN | TCP_FLAG_ACK;
+ if (tsopt && (tsopt[0] & bpf_htonl(1 << 5)))
+ tcp_flag_word(tcp_header) |= TCP_FLAG_ECE;
+ tcp_header->doff = 5; /* doff is part of tcp_flag_word. */
+ swap(tcp_header->source, tcp_header->dest);
+ tcp_header->ack_seq = bpf_htonl(bpf_ntohl(tcp_header->seq) + 1);
+ tcp_header->seq = bpf_htonl(cookie);
+ tcp_header->window = 0;
+ tcp_header->urg_ptr = 0;
+ tcp_header->check = 0; /* Calculate checksum later. */
+
+ tcp_options = (void *)(tcp_header + 1);
+ tcp_header->doff += tcp_mkoptions(tcp_options, tsopt, mss, wscale);
+}
+
+static __always_inline void tcpv4_gen_synack(struct header_pointers *hdr,
+ __u32 cookie, __be32 *tsopt)
+{
+ __u8 wscale;
+ __u16 mss;
+ __u8 ttl;
+
+ values_get_tcpipopts(&mss, &wscale, &ttl, false);
+
+ swap_eth_addr(hdr->eth->h_source, hdr->eth->h_dest);
+
+ swap(hdr->ipv4->saddr, hdr->ipv4->daddr);
+ hdr->ipv4->check = 0; /* Calculate checksum later. */
+ hdr->ipv4->tos = 0;
+ hdr->ipv4->id = 0;
+ hdr->ipv4->ttl = ttl;
+
+ tcp_gen_synack(hdr->tcp, cookie, tsopt, mss, wscale);
+
+ hdr->tcp_len = hdr->tcp->doff * 4;
+ hdr->ipv4->tot_len = bpf_htons(sizeof(*hdr->ipv4) + hdr->tcp_len);
+}
+
+static __always_inline void tcpv6_gen_synack(struct header_pointers *hdr,
+ __u32 cookie, __be32 *tsopt)
+{
+ __u8 wscale;
+ __u16 mss;
+ __u8 ttl;
+
+ values_get_tcpipopts(&mss, &wscale, &ttl, true);
+
+ swap_eth_addr(hdr->eth->h_source, hdr->eth->h_dest);
+
+ swap(hdr->ipv6->saddr, hdr->ipv6->daddr);
+ *(__be32 *)hdr->ipv6 = bpf_htonl(0x60000000);
+ hdr->ipv6->hop_limit = ttl;
+
+ tcp_gen_synack(hdr->tcp, cookie, tsopt, mss, wscale);
+
+ hdr->tcp_len = hdr->tcp->doff * 4;
+ hdr->ipv6->payload_len = bpf_htons(hdr->tcp_len);
+}
+
+static __always_inline int syncookie_handle_syn(struct header_pointers *hdr,
+ void *ctx,
+ void *data, void *data_end,
+ bool xdp)
+{
+ __u32 old_pkt_size, new_pkt_size;
+ /* Unlike clang 10, clang 11 and 12 generate code that doesn't pass the
+ * BPF verifier if tsopt is not volatile. Volatile forces it to store
+ * the pointer value and use it directly, otherwise tcp_mkoptions is
+ * (mis)compiled like this:
+ * if (!tsopt)
+ * return buf - start;
+ * reg = stored_return_value_of_tscookie_init;
+ * if (reg)
+ * tsopt = tsopt_buf;
+ * else
+ * tsopt = NULL;
+ * ...
+ * *buf++ = tsopt[1];
+ * It creates a dead branch where tsopt is assigned NULL, but the
+ * verifier can't prove it's dead and blocks the program.
+ */
+ __be32 * volatile tsopt = NULL;
+ __be32 tsopt_buf[2] = {};
+ __u16 ip_len;
+ __u32 cookie;
+ __s64 value;
+
+ /* Checksum is not yet verified, but both checksum failure and TCP
+ * header checks return XDP_DROP, so the order doesn't matter.
+ */
+ if (hdr->tcp->fin || hdr->tcp->rst)
+ return XDP_DROP;
+
+ /* Issue SYN cookies on allowed ports, drop SYN packets on blocked
+ * ports.
+ */
+ if (!check_port_allowed(bpf_ntohs(hdr->tcp->dest)))
+ return XDP_DROP;
+
+ if (hdr->ipv4) {
+ /* Check the IPv4 and TCP checksums before creating a SYNACK. */
+ value = bpf_csum_diff(0, 0, (void *)hdr->ipv4, hdr->ipv4->ihl * 4, 0);
+ if (value < 0)
+ return XDP_ABORTED;
+ if (csum_fold(value) != 0)
+ return XDP_DROP; /* Bad IPv4 checksum. */
+
+ value = bpf_csum_diff(0, 0, (void *)hdr->tcp, hdr->tcp_len, 0);
+ if (value < 0)
+ return XDP_ABORTED;
+ if (csum_tcpudp_magic(hdr->ipv4->saddr, hdr->ipv4->daddr,
+ hdr->tcp_len, IPPROTO_TCP, value) != 0)
+ return XDP_DROP; /* Bad TCP checksum. */
+
+ ip_len = sizeof(*hdr->ipv4);
+
+ value = bpf_tcp_raw_gen_syncookie_ipv4(hdr->ipv4, hdr->tcp,
+ hdr->tcp_len);
+ } else if (hdr->ipv6) {
+ /* Check the TCP checksum before creating a SYNACK. */
+ value = bpf_csum_diff(0, 0, (void *)hdr->tcp, hdr->tcp_len, 0);
+ if (value < 0)
+ return XDP_ABORTED;
+ if (csum_ipv6_magic(&hdr->ipv6->saddr, &hdr->ipv6->daddr,
+ hdr->tcp_len, IPPROTO_TCP, value) != 0)
+ return XDP_DROP; /* Bad TCP checksum. */
+
+ ip_len = sizeof(*hdr->ipv6);
+
+ value = bpf_tcp_raw_gen_syncookie_ipv6(hdr->ipv6, hdr->tcp,
+ hdr->tcp_len);
+ } else {
+ return XDP_ABORTED;
+ }
+
+ if (value < 0)
+ return XDP_ABORTED;
+ cookie = (__u32)value;
+
+ if (tscookie_init((void *)hdr->tcp, hdr->tcp_len,
+ &tsopt_buf[0], &tsopt_buf[1], data_end))
+ tsopt = tsopt_buf;
+
+ /* Check that there is enough space for a SYNACK. It also covers
+ * the check that the destination of the __builtin_memmove below
+ * doesn't overflow.
+ */
+ if (data + sizeof(*hdr->eth) + ip_len + TCP_MAXLEN > data_end)
+ return XDP_ABORTED;
+
+ if (hdr->ipv4) {
+ if (hdr->ipv4->ihl * 4 > sizeof(*hdr->ipv4)) {
+ struct tcphdr *new_tcp_header;
+
+ new_tcp_header = data + sizeof(*hdr->eth) + sizeof(*hdr->ipv4);
+ __builtin_memmove(new_tcp_header, hdr->tcp, sizeof(*hdr->tcp));
+ hdr->tcp = new_tcp_header;
+
+ hdr->ipv4->ihl = sizeof(*hdr->ipv4) / 4;
+ }
+
+ tcpv4_gen_synack(hdr, cookie, tsopt);
+ } else if (hdr->ipv6) {
+ tcpv6_gen_synack(hdr, cookie, tsopt);
+ } else {
+ return XDP_ABORTED;
+ }
+
+ /* Recalculate checksums. */
+ hdr->tcp->check = 0;
+ value = bpf_csum_diff(0, 0, (void *)hdr->tcp, hdr->tcp_len, 0);
+ if (value < 0)
+ return XDP_ABORTED;
+ if (hdr->ipv4) {
+ hdr->tcp->check = csum_tcpudp_magic(hdr->ipv4->saddr,
+ hdr->ipv4->daddr,
+ hdr->tcp_len,
+ IPPROTO_TCP,
+ value);
+
+ hdr->ipv4->check = 0;
+ value = bpf_csum_diff(0, 0, (void *)hdr->ipv4, sizeof(*hdr->ipv4), 0);
+ if (value < 0)
+ return XDP_ABORTED;
+ hdr->ipv4->check = csum_fold(value);
+ } else if (hdr->ipv6) {
+ hdr->tcp->check = csum_ipv6_magic(&hdr->ipv6->saddr,
+ &hdr->ipv6->daddr,
+ hdr->tcp_len,
+ IPPROTO_TCP,
+ value);
+ } else {
+ return XDP_ABORTED;
+ }
+
+ /* Set the new packet size. */
+ old_pkt_size = data_end - data;
+ new_pkt_size = sizeof(*hdr->eth) + ip_len + hdr->tcp->doff * 4;
+ if (xdp) {
+ if (bpf_xdp_adjust_tail(ctx, new_pkt_size - old_pkt_size))
+ return XDP_ABORTED;
+ } else {
+ if (bpf_skb_change_tail(ctx, new_pkt_size, 0))
+ return XDP_ABORTED;
+ }
+
+ values_inc_synacks();
+
+ return XDP_TX;
+}
+
+static __always_inline int syncookie_handle_ack(struct header_pointers *hdr)
+{
+ int err;
+
+ if (hdr->tcp->rst)
+ return XDP_DROP;
+
+ if (hdr->ipv4)
+ err = bpf_tcp_raw_check_syncookie_ipv4(hdr->ipv4, hdr->tcp);
+ else if (hdr->ipv6)
+ err = bpf_tcp_raw_check_syncookie_ipv6(hdr->ipv6, hdr->tcp);
+ else
+ return XDP_ABORTED;
+ if (err)
+ return XDP_DROP;
+
+ return XDP_PASS;
+}
+
+static __always_inline int syncookie_part1(void *ctx, void *data, void *data_end,
+ struct header_pointers *hdr, bool xdp)
+{
+ int ret;
+
+ ret = tcp_dissect(data, data_end, hdr);
+ if (ret != XDP_TX)
+ return ret;
+
+ ret = tcp_lookup(ctx, hdr, xdp);
+ if (ret != XDP_TX)
+ return ret;
+
+ /* Packet is TCP and doesn't belong to an established connection. */
+
+ if ((hdr->tcp->syn ^ hdr->tcp->ack) != 1)
+ return XDP_DROP;
+
+ /* Grow the TCP header to TCP_MAXLEN to be able to pass any hdr->tcp_len
+ * to bpf_tcp_raw_gen_syncookie_ipv{4,6} and pass the verifier.
+ */
+ if (xdp) {
+ if (bpf_xdp_adjust_tail(ctx, TCP_MAXLEN - hdr->tcp_len))
+ return XDP_ABORTED;
+ } else {
+ /* Without volatile the verifier throws this error:
+ * R9 32-bit pointer arithmetic prohibited
+ */
+ volatile u64 old_len = data_end - data;
+
+ if (bpf_skb_change_tail(ctx, old_len + TCP_MAXLEN - hdr->tcp_len, 0))
+ return XDP_ABORTED;
+ }
+
+ return XDP_TX;
+}
+
+static __always_inline int syncookie_part2(void *ctx, void *data, void *data_end,
+ struct header_pointers *hdr, bool xdp)
+{
+ if (hdr->ipv4) {
+ hdr->eth = data;
+ hdr->ipv4 = (void *)hdr->eth + sizeof(*hdr->eth);
+ /* IPV4_MAXLEN is needed when calculating checksum.
+ * At least sizeof(struct iphdr) is needed here to access ihl.
+ */
+ if ((void *)hdr->ipv4 + IPV4_MAXLEN > data_end)
+ return XDP_ABORTED;
+ hdr->tcp = (void *)hdr->ipv4 + hdr->ipv4->ihl * 4;
+ } else if (hdr->ipv6) {
+ hdr->eth = data;
+ hdr->ipv6 = (void *)hdr->eth + sizeof(*hdr->eth);
+ hdr->tcp = (void *)hdr->ipv6 + sizeof(*hdr->ipv6);
+ } else {
+ return XDP_ABORTED;
+ }
+
+ if ((void *)hdr->tcp + TCP_MAXLEN > data_end)
+ return XDP_ABORTED;
+
+ /* We run out of registers, tcp_len gets spilled to the stack, and the
+ * verifier forgets its min and max values checked above in tcp_dissect.
+ */
+ hdr->tcp_len = hdr->tcp->doff * 4;
+ if (hdr->tcp_len < sizeof(*hdr->tcp))
+ return XDP_ABORTED;
+
+ return hdr->tcp->syn ? syncookie_handle_syn(hdr, ctx, data, data_end, xdp) :
+ syncookie_handle_ack(hdr);
+}
+
+SEC("xdp")
+int syncookie_xdp(struct xdp_md *ctx)
+{
+ void *data_end = (void *)(long)ctx->data_end;
+ void *data = (void *)(long)ctx->data;
+ struct header_pointers hdr;
+ int ret;
+
+ ret = syncookie_part1(ctx, data, data_end, &hdr, true);
+ if (ret != XDP_TX)
+ return ret;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = (void *)(long)ctx->data;
+
+ return syncookie_part2(ctx, data, data_end, &hdr, true);
+}
+
+SEC("tc")
+int syncookie_tc(struct __sk_buff *skb)
+{
+ void *data_end = (void *)(long)skb->data_end;
+ void *data = (void *)(long)skb->data;
+ struct header_pointers hdr;
+ int ret;
+
+ ret = syncookie_part1(skb, data, data_end, &hdr, false);
+ if (ret != XDP_TX)
+ return ret == XDP_PASS ? TC_ACT_OK : TC_ACT_SHOT;
+
+ data_end = (void *)(long)skb->data_end;
+ data = (void *)(long)skb->data;
+
+ ret = syncookie_part2(skb, data, data_end, &hdr, false);
+ switch (ret) {
+ case XDP_PASS:
+ return TC_ACT_OK;
+ case XDP_TX:
+ return bpf_redirect(skb->ifindex, 0);
+ default:
+ return TC_ACT_SHOT;
+ }
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_bpftool_synctypes.py b/tools/testing/selftests/bpf/test_bpftool_synctypes.py
index c0e7acd698ed..a6410bebe603 100755
--- a/tools/testing/selftests/bpf/test_bpftool_synctypes.py
+++ b/tools/testing/selftests/bpf/test_bpftool_synctypes.py
@@ -58,7 +58,7 @@ class BlockParser(object):
class ArrayParser(BlockParser):
"""
- A parser for extracting dicionaries of values from some BPF-related arrays.
+ A parser for extracting a set of values from some BPF-related arrays.
@reader: a pointer to the open file to parse
@array_name: name of the array to parse
"""
@@ -66,7 +66,7 @@ class ArrayParser(BlockParser):
def __init__(self, reader, array_name):
self.array_name = array_name
- self.start_marker = re.compile(f'(static )?const char \* const {self.array_name}\[.*\] = {{\n')
+ self.start_marker = re.compile(f'(static )?const bool {self.array_name}\[.*\] = {{\n')
super().__init__(reader)
def search_block(self):
@@ -80,15 +80,15 @@ class ArrayParser(BlockParser):
Parse a block and return data as a dictionary. Items to extract must be
on separate lines in the file.
"""
- pattern = re.compile('\[(BPF_\w*)\]\s*= "(.*)",?$')
- entries = {}
+ pattern = re.compile('\[(BPF_\w*)\]\s*= (true|false),?$')
+ entries = set()
while True:
line = self.reader.readline()
if line == '' or re.match(self.end_marker, line):
break
capture = pattern.search(line)
if capture:
- entries[capture.group(1)] = capture.group(2)
+ entries |= {capture.group(1)}
return entries
class InlineListParser(BlockParser):
@@ -115,7 +115,7 @@ class InlineListParser(BlockParser):
class FileExtractor(object):
"""
A generic reader for extracting data from a given file. This class contains
- several helper methods that wrap arround parser objects to extract values
+ several helper methods that wrap around parser objects to extract values
from different structures.
This class does not offer a way to set a filename, which is expected to be
defined in children classes.
@@ -139,21 +139,19 @@ class FileExtractor(object):
def get_types_from_array(self, array_name):
"""
- Search for and parse an array associating names to BPF_* enum members,
- for example:
+ Search for and parse a list of allowed BPF_* enum members, for example:
- const char * const prog_type_name[] = {
- [BPF_PROG_TYPE_UNSPEC] = "unspec",
- [BPF_PROG_TYPE_SOCKET_FILTER] = "socket_filter",
- [BPF_PROG_TYPE_KPROBE] = "kprobe",
+ const bool prog_type_name[] = {
+ [BPF_PROG_TYPE_UNSPEC] = true,
+ [BPF_PROG_TYPE_SOCKET_FILTER] = true,
+ [BPF_PROG_TYPE_KPROBE] = true,
};
- Return a dictionary with the enum member names as keys and the
- associated names as values, for example:
+ Return a set of the enum members, for example:
- {'BPF_PROG_TYPE_UNSPEC': 'unspec',
- 'BPF_PROG_TYPE_SOCKET_FILTER': 'socket_filter',
- 'BPF_PROG_TYPE_KPROBE': 'kprobe'}
+ {'BPF_PROG_TYPE_UNSPEC',
+ 'BPF_PROG_TYPE_SOCKET_FILTER',
+ 'BPF_PROG_TYPE_KPROBE'}
@array_name: name of the array to parse
"""
@@ -186,6 +184,27 @@ class FileExtractor(object):
parser.search_block(start_marker)
return parser.parse(pattern, end_marker)
+ def make_enum_map(self, names, enum_prefix):
+ """
+ Search for and parse an enum containing BPF_* members, just as get_enum
+ does. However, instead of just returning a set of the variant names,
+ also generate a textual representation from them by (assuming and)
+ removing a provided prefix and lowercasing the remainder. Then return a
+ dict mapping from name to textual representation.
+
+ @enum_values: a set of enum values; e.g., as retrieved by get_enum
+ @enum_prefix: the prefix to remove from each of the variants to infer
+ textual representation
+ """
+ mapping = {}
+ for name in names:
+ if not name.startswith(enum_prefix):
+ raise Exception(f"enum variant {name} does not start with {enum_prefix}")
+ text = name[len(enum_prefix):].lower()
+ mapping[name] = text
+
+ return mapping
+
def __get_description_list(self, start_marker, pattern, end_marker):
parser = InlineListParser(self.reader)
parser.search_block(start_marker)
@@ -333,11 +352,9 @@ class ProgFileExtractor(SourceFileExtractor):
"""
filename = os.path.join(BPFTOOL_DIR, 'prog.c')
- def get_prog_types(self):
- return self.get_types_from_array('prog_type_name')
-
def get_attach_types(self):
- return self.get_types_from_array('attach_type_strings')
+ types = self.get_types_from_array('attach_types')
+ return self.make_enum_map(types, 'BPF_')
def get_prog_attach_help(self):
return self.get_help_list('ATTACH_TYPE')
@@ -348,9 +365,6 @@ class MapFileExtractor(SourceFileExtractor):
"""
filename = os.path.join(BPFTOOL_DIR, 'map.c')
- def get_map_types(self):
- return self.get_types_from_array('map_type_name')
-
def get_map_help(self):
return self.get_help_list('TYPE')
@@ -363,30 +377,6 @@ class CgroupFileExtractor(SourceFileExtractor):
def get_prog_attach_help(self):
return self.get_help_list('ATTACH_TYPE')
-class CommonFileExtractor(SourceFileExtractor):
- """
- An extractor for bpftool's common.c.
- """
- filename = os.path.join(BPFTOOL_DIR, 'common.c')
-
- def __init__(self):
- super().__init__()
- self.attach_types = {}
-
- def get_attach_types(self):
- if not self.attach_types:
- self.attach_types = self.get_types_from_array('attach_type_name')
- return self.attach_types
-
- def get_cgroup_attach_types(self):
- if not self.attach_types:
- self.get_attach_types()
- cgroup_types = {}
- for (key, value) in self.attach_types.items():
- if key.find('BPF_CGROUP') != -1:
- cgroup_types[key] = value
- return cgroup_types
-
class GenericSourceExtractor(SourceFileExtractor):
"""
An extractor for generic source code files.
@@ -403,14 +393,28 @@ class BpfHeaderExtractor(FileExtractor):
"""
filename = os.path.join(INCLUDE_DIR, 'uapi/linux/bpf.h')
+ def __init__(self):
+ super().__init__()
+ self.attach_types = {}
+
def get_prog_types(self):
return self.get_enum('bpf_prog_type')
- def get_map_types(self):
- return self.get_enum('bpf_map_type')
+ def get_map_type_map(self):
+ names = self.get_enum('bpf_map_type')
+ return self.make_enum_map(names, 'BPF_MAP_TYPE_')
- def get_attach_types(self):
- return self.get_enum('bpf_attach_type')
+ def get_attach_type_map(self):
+ if not self.attach_types:
+ names = self.get_enum('bpf_attach_type')
+ self.attach_types = self.make_enum_map(names, 'BPF_')
+ return self.attach_types
+
+ def get_cgroup_attach_type_map(self):
+ if not self.attach_types:
+ self.get_attach_type_map()
+ return {name: text for name, text in self.attach_types.items()
+ if name.startswith('BPF_CGROUP')}
class ManPageExtractor(FileExtractor):
"""
@@ -467,12 +471,6 @@ class BashcompExtractor(FileExtractor):
def get_prog_attach_types(self):
return self.get_bashcomp_list('BPFTOOL_PROG_ATTACH_TYPES')
- def get_map_types(self):
- return self.get_bashcomp_list('BPFTOOL_MAP_CREATE_TYPES')
-
- def get_cgroup_attach_types(self):
- return self.get_bashcomp_list('BPFTOOL_CGROUP_ATTACH_TYPES')
-
def verify(first_set, second_set, message):
"""
Print all values that differ between two sets.
@@ -495,21 +493,12 @@ def main():
""")
args = argParser.parse_args()
- # Map types (enum)
-
bpf_info = BpfHeaderExtractor()
- ref = bpf_info.get_map_types()
-
- map_info = MapFileExtractor()
- source_map_items = map_info.get_map_types()
- map_types_enum = set(source_map_items.keys())
-
- verify(ref, map_types_enum,
- f'Comparing BPF header (enum bpf_map_type) and {MapFileExtractor.filename} (map_type_name):')
# Map types (names)
- source_map_types = set(source_map_items.values())
+ map_info = MapFileExtractor()
+ source_map_types = set(bpf_info.get_map_type_map().values())
source_map_types.discard('unspec')
help_map_types = map_info.get_map_help()
@@ -521,41 +510,16 @@ def main():
man_map_types = man_map_info.get_map_types()
man_map_info.close()
- bashcomp_info = BashcompExtractor()
- bashcomp_map_types = bashcomp_info.get_map_types()
-
verify(source_map_types, help_map_types,
- f'Comparing {MapFileExtractor.filename} (map_type_name) and {MapFileExtractor.filename} (do_help() TYPE):')
+ f'Comparing {BpfHeaderExtractor.filename} (bpf_map_type) and {MapFileExtractor.filename} (do_help() TYPE):')
verify(source_map_types, man_map_types,
- f'Comparing {MapFileExtractor.filename} (map_type_name) and {ManMapExtractor.filename} (TYPE):')
+ f'Comparing {BpfHeaderExtractor.filename} (bpf_map_type) and {ManMapExtractor.filename} (TYPE):')
verify(help_map_options, man_map_options,
f'Comparing {MapFileExtractor.filename} (do_help() OPTIONS) and {ManMapExtractor.filename} (OPTIONS):')
- verify(source_map_types, bashcomp_map_types,
- f'Comparing {MapFileExtractor.filename} (map_type_name) and {BashcompExtractor.filename} (BPFTOOL_MAP_CREATE_TYPES):')
-
- # Program types (enum)
-
- ref = bpf_info.get_prog_types()
-
- prog_info = ProgFileExtractor()
- prog_types = set(prog_info.get_prog_types().keys())
-
- verify(ref, prog_types,
- f'Comparing BPF header (enum bpf_prog_type) and {ProgFileExtractor.filename} (prog_type_name):')
-
- # Attach types (enum)
-
- ref = bpf_info.get_attach_types()
- bpf_info.close()
-
- common_info = CommonFileExtractor()
- attach_types = common_info.get_attach_types()
-
- verify(ref, attach_types,
- f'Comparing BPF header (enum bpf_attach_type) and {CommonFileExtractor.filename} (attach_type_name):')
# Attach types (names)
+ prog_info = ProgFileExtractor()
source_prog_attach_types = set(prog_info.get_attach_types().values())
help_prog_attach_types = prog_info.get_prog_attach_help()
@@ -567,22 +531,23 @@ def main():
man_prog_attach_types = man_prog_info.get_attach_types()
man_prog_info.close()
- bashcomp_info.reset_read() # We stopped at map types, rewind
+
+ bashcomp_info = BashcompExtractor()
bashcomp_prog_attach_types = bashcomp_info.get_prog_attach_types()
+ bashcomp_info.close()
verify(source_prog_attach_types, help_prog_attach_types,
- f'Comparing {ProgFileExtractor.filename} (attach_type_strings) and {ProgFileExtractor.filename} (do_help() ATTACH_TYPE):')
+ f'Comparing {ProgFileExtractor.filename} (bpf_attach_type) and {ProgFileExtractor.filename} (do_help() ATTACH_TYPE):')
verify(source_prog_attach_types, man_prog_attach_types,
- f'Comparing {ProgFileExtractor.filename} (attach_type_strings) and {ManProgExtractor.filename} (ATTACH_TYPE):')
+ f'Comparing {ProgFileExtractor.filename} (bpf_attach_type) and {ManProgExtractor.filename} (ATTACH_TYPE):')
verify(help_prog_options, man_prog_options,
f'Comparing {ProgFileExtractor.filename} (do_help() OPTIONS) and {ManProgExtractor.filename} (OPTIONS):')
verify(source_prog_attach_types, bashcomp_prog_attach_types,
- f'Comparing {ProgFileExtractor.filename} (attach_type_strings) and {BashcompExtractor.filename} (BPFTOOL_PROG_ATTACH_TYPES):')
+ f'Comparing {ProgFileExtractor.filename} (bpf_attach_type) and {BashcompExtractor.filename} (BPFTOOL_PROG_ATTACH_TYPES):')
# Cgroup attach types
-
- source_cgroup_attach_types = set(common_info.get_cgroup_attach_types().values())
- common_info.close()
+ source_cgroup_attach_types = set(bpf_info.get_cgroup_attach_type_map().values())
+ bpf_info.close()
cgroup_info = CgroupFileExtractor()
help_cgroup_attach_types = cgroup_info.get_prog_attach_help()
@@ -594,17 +559,12 @@ def main():
man_cgroup_attach_types = man_cgroup_info.get_attach_types()
man_cgroup_info.close()
- bashcomp_cgroup_attach_types = bashcomp_info.get_cgroup_attach_types()
- bashcomp_info.close()
-
verify(source_cgroup_attach_types, help_cgroup_attach_types,
- f'Comparing {CommonFileExtractor.filename} (attach_type_strings) and {CgroupFileExtractor.filename} (do_help() ATTACH_TYPE):')
+ f'Comparing {BpfHeaderExtractor.filename} (bpf_attach_type) and {CgroupFileExtractor.filename} (do_help() ATTACH_TYPE):')
verify(source_cgroup_attach_types, man_cgroup_attach_types,
- f'Comparing {CommonFileExtractor.filename} (attach_type_strings) and {ManCgroupExtractor.filename} (ATTACH_TYPE):')
+ f'Comparing {BpfHeaderExtractor.filename} (bpf_attach_type) and {ManCgroupExtractor.filename} (ATTACH_TYPE):')
verify(help_cgroup_options, man_cgroup_options,
f'Comparing {CgroupFileExtractor.filename} (do_help() OPTIONS) and {ManCgroupExtractor.filename} (OPTIONS):')
- verify(source_cgroup_attach_types, bashcomp_cgroup_attach_types,
- f'Comparing {CommonFileExtractor.filename} (attach_type_strings) and {BashcompExtractor.filename} (BPFTOOL_CGROUP_ATTACH_TYPES):')
# Options for remaining commands
diff --git a/tools/testing/selftests/bpf/test_btf.h b/tools/testing/selftests/bpf/test_btf.h
index 128989bed8b7..fb4f4714eeb4 100644
--- a/tools/testing/selftests/bpf/test_btf.h
+++ b/tools/testing/selftests/bpf/test_btf.h
@@ -4,6 +4,8 @@
#ifndef _TEST_BTF_H
#define _TEST_BTF_H
+#define BTF_END_RAW 0xdeadbeef
+
#define BTF_INFO_ENC(kind, kind_flag, vlen) \
((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
@@ -39,6 +41,7 @@
#define BTF_MEMBER_ENC(name, type, bits_offset) \
(name), (type), (bits_offset)
#define BTF_ENUM_ENC(name, val) (name), (val)
+#define BTF_ENUM64_ENC(name, val_lo32, val_hi32) (name), (val_lo32), (val_hi32)
#define BTF_MEMBER_OFFSET(bitfield_size, bits_offset) \
((bitfield_size) << 24 | (bits_offset))
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index c639f2e56fc5..3561c97701f2 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -1604,11 +1604,8 @@ int main(int argc, char **argv)
struct prog_test_def *test = &prog_test_defs[i];
test->test_num = i + 1;
- if (should_run(&env.test_selector,
- test->test_num, test->test_name))
- test->should_run = true;
- else
- test->should_run = false;
+ test->should_run = should_run(&env.test_selector,
+ test->test_num, test->test_name);
if ((test->run_test == NULL && test->run_serial_test == NULL) ||
(test->run_test != NULL && test->run_serial_test != NULL)) {
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 372579c9f45e..f9d553fbf68a 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -51,12 +51,24 @@
#endif
#define MAX_INSNS BPF_MAXINSNS
+#define MAX_EXPECTED_INSNS 32
+#define MAX_UNEXPECTED_INSNS 32
#define MAX_TEST_INSNS 1000000
#define MAX_FIXUPS 8
#define MAX_NR_MAPS 23
#define MAX_TEST_RUNS 8
#define POINTER_VALUE 0xcafe4all
#define TEST_DATA_LEN 64
+#define MAX_FUNC_INFOS 8
+#define MAX_BTF_STRINGS 256
+#define MAX_BTF_TYPES 256
+
+#define INSN_OFF_MASK ((__s16)0xFFFF)
+#define INSN_IMM_MASK ((__s32)0xFFFFFFFF)
+#define SKIP_INSNS() BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef)
+
+#define DEFAULT_LIBBPF_LOG_LEVEL 4
+#define VERBOSE_LIBBPF_LOG_LEVEL 1
#define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS (1 << 0)
#define F_LOAD_WITH_STRICT_ALIGNMENT (1 << 1)
@@ -79,6 +91,23 @@ struct bpf_test {
const char *descr;
struct bpf_insn insns[MAX_INSNS];
struct bpf_insn *fill_insns;
+ /* If specified, test engine looks for this sequence of
+ * instructions in the BPF program after loading. Allows to
+ * test rewrites applied by verifier. Use values
+ * INSN_OFF_MASK and INSN_IMM_MASK to mask `off` and `imm`
+ * fields if content does not matter. The test case fails if
+ * specified instructions are not found.
+ *
+ * The sequence could be split into sub-sequences by adding
+ * SKIP_INSNS instruction at the end of each sub-sequence. In
+ * such case sub-sequences are searched for one after another.
+ */
+ struct bpf_insn expected_insns[MAX_EXPECTED_INSNS];
+ /* If specified, test engine applies same pattern matching
+ * logic as for `expected_insns`. If the specified pattern is
+ * matched test case is marked as failed.
+ */
+ struct bpf_insn unexpected_insns[MAX_UNEXPECTED_INSNS];
int fixup_map_hash_8b[MAX_FIXUPS];
int fixup_map_hash_48b[MAX_FIXUPS];
int fixup_map_hash_16b[MAX_FIXUPS];
@@ -135,6 +164,14 @@ struct bpf_test {
};
enum bpf_attach_type expected_attach_type;
const char *kfunc;
+ struct bpf_func_info func_info[MAX_FUNC_INFOS];
+ int func_info_cnt;
+ char btf_strings[MAX_BTF_STRINGS];
+ /* A set of BTF types to load when specified,
+ * use macro definitions from test_btf.h,
+ * must end with BTF_END_RAW
+ */
+ __u32 btf_types[MAX_BTF_TYPES];
};
/* Note we want this to be 64 bit aligned so that the end of our array is
@@ -388,6 +425,45 @@ static void bpf_fill_torturous_jumps(struct bpf_test *self)
}
}
+static void bpf_fill_big_prog_with_loop_1(struct bpf_test *self)
+{
+ struct bpf_insn *insn = self->fill_insns;
+ /* This test was added to catch a specific use after free
+ * error, which happened upon BPF program reallocation.
+ * Reallocation is handled by core.c:bpf_prog_realloc, which
+ * reuses old memory if page boundary is not crossed. The
+ * value of `len` is chosen to cross this boundary on bpf_loop
+ * patching.
+ */
+ const int len = getpagesize() - 25;
+ int callback_load_idx;
+ int callback_idx;
+ int i = 0;
+
+ insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1);
+ callback_load_idx = i;
+ insn[i++] = BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW,
+ BPF_REG_2, BPF_PSEUDO_FUNC, 0,
+ 777 /* filled below */);
+ insn[i++] = BPF_RAW_INSN(0, 0, 0, 0, 0);
+ insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0);
+ insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0);
+ insn[i++] = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop);
+
+ while (i < len - 3)
+ insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0);
+ insn[i++] = BPF_EXIT_INSN();
+
+ callback_idx = i;
+ insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0);
+ insn[i++] = BPF_EXIT_INSN();
+
+ insn[callback_load_idx].imm = callback_idx - callback_load_idx - 1;
+ self->func_info[1].insn_off = callback_idx;
+ self->prog_len = i;
+ assert(i == len);
+}
+
/* BPF_SK_LOOKUP contains 13 instructions, if you need to fix up maps */
#define BPF_SK_LOOKUP(func) \
/* struct bpf_sock_tuple tuple = {} */ \
@@ -664,34 +740,66 @@ static __u32 btf_raw_types[] = {
BTF_MEMBER_ENC(71, 13, 128), /* struct prog_test_member __kptr_ref *ptr; */
};
-static int load_btf(void)
+static char bpf_vlog[UINT_MAX >> 8];
+
+static int load_btf_spec(__u32 *types, int types_len,
+ const char *strings, int strings_len)
{
struct btf_header hdr = {
.magic = BTF_MAGIC,
.version = BTF_VERSION,
.hdr_len = sizeof(struct btf_header),
- .type_len = sizeof(btf_raw_types),
- .str_off = sizeof(btf_raw_types),
- .str_len = sizeof(btf_str_sec),
+ .type_len = types_len,
+ .str_off = types_len,
+ .str_len = strings_len,
};
void *ptr, *raw_btf;
int btf_fd;
+ LIBBPF_OPTS(bpf_btf_load_opts, opts,
+ .log_buf = bpf_vlog,
+ .log_size = sizeof(bpf_vlog),
+ .log_level = (verbose
+ ? VERBOSE_LIBBPF_LOG_LEVEL
+ : DEFAULT_LIBBPF_LOG_LEVEL),
+ );
- ptr = raw_btf = malloc(sizeof(hdr) + sizeof(btf_raw_types) +
- sizeof(btf_str_sec));
+ raw_btf = malloc(sizeof(hdr) + types_len + strings_len);
+ ptr = raw_btf;
memcpy(ptr, &hdr, sizeof(hdr));
ptr += sizeof(hdr);
- memcpy(ptr, btf_raw_types, hdr.type_len);
+ memcpy(ptr, types, hdr.type_len);
ptr += hdr.type_len;
- memcpy(ptr, btf_str_sec, hdr.str_len);
+ memcpy(ptr, strings, hdr.str_len);
ptr += hdr.str_len;
- btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, NULL);
- free(raw_btf);
+ btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, &opts);
if (btf_fd < 0)
- return -1;
- return btf_fd;
+ printf("Failed to load BTF spec: '%s'\n", strerror(errno));
+
+ free(raw_btf);
+
+ return btf_fd < 0 ? -1 : btf_fd;
+}
+
+static int load_btf(void)
+{
+ return load_btf_spec(btf_raw_types, sizeof(btf_raw_types),
+ btf_str_sec, sizeof(btf_str_sec));
+}
+
+static int load_btf_for_test(struct bpf_test *test)
+{
+ int types_num = 0;
+
+ while (types_num < MAX_BTF_TYPES &&
+ test->btf_types[types_num] != BTF_END_RAW)
+ ++types_num;
+
+ int types_len = types_num * sizeof(test->btf_types[0]);
+
+ return load_btf_spec(test->btf_types, types_len,
+ test->btf_strings, sizeof(test->btf_strings));
}
static int create_map_spin_lock(void)
@@ -770,8 +878,6 @@ static int create_map_kptr(void)
return fd;
}
-static char bpf_vlog[UINT_MAX >> 8];
-
static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
struct bpf_insn *prog, int *map_fds)
{
@@ -1126,10 +1232,218 @@ static bool cmp_str_seq(const char *log, const char *exp)
return true;
}
+static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt)
+{
+ struct bpf_prog_info info = {};
+ __u32 info_len = sizeof(info);
+ __u32 xlated_prog_len;
+ __u32 buf_element_size = sizeof(struct bpf_insn);
+
+ if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
+ perror("bpf_obj_get_info_by_fd failed");
+ return -1;
+ }
+
+ xlated_prog_len = info.xlated_prog_len;
+ if (xlated_prog_len % buf_element_size) {
+ printf("Program length %d is not multiple of %d\n",
+ xlated_prog_len, buf_element_size);
+ return -1;
+ }
+
+ *cnt = xlated_prog_len / buf_element_size;
+ *buf = calloc(*cnt, buf_element_size);
+ if (!buf) {
+ perror("can't allocate xlated program buffer");
+ return -ENOMEM;
+ }
+
+ bzero(&info, sizeof(info));
+ info.xlated_prog_len = xlated_prog_len;
+ info.xlated_prog_insns = (__u64)*buf;
+ if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) {
+ perror("second bpf_obj_get_info_by_fd failed");
+ goto out_free_buf;
+ }
+
+ return 0;
+
+out_free_buf:
+ free(*buf);
+ return -1;
+}
+
+static bool is_null_insn(struct bpf_insn *insn)
+{
+ struct bpf_insn null_insn = {};
+
+ return memcmp(insn, &null_insn, sizeof(null_insn)) == 0;
+}
+
+static bool is_skip_insn(struct bpf_insn *insn)
+{
+ struct bpf_insn skip_insn = SKIP_INSNS();
+
+ return memcmp(insn, &skip_insn, sizeof(skip_insn)) == 0;
+}
+
+static int null_terminated_insn_len(struct bpf_insn *seq, int max_len)
+{
+ int i;
+
+ for (i = 0; i < max_len; ++i) {
+ if (is_null_insn(&seq[i]))
+ return i;
+ }
+ return max_len;
+}
+
+static bool compare_masked_insn(struct bpf_insn *orig, struct bpf_insn *masked)
+{
+ struct bpf_insn orig_masked;
+
+ memcpy(&orig_masked, orig, sizeof(orig_masked));
+ if (masked->imm == INSN_IMM_MASK)
+ orig_masked.imm = INSN_IMM_MASK;
+ if (masked->off == INSN_OFF_MASK)
+ orig_masked.off = INSN_OFF_MASK;
+
+ return memcmp(&orig_masked, masked, sizeof(orig_masked)) == 0;
+}
+
+static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq,
+ int seq_len, int subseq_len)
+{
+ int i, j;
+
+ if (subseq_len > seq_len)
+ return -1;
+
+ for (i = 0; i < seq_len - subseq_len + 1; ++i) {
+ bool found = true;
+
+ for (j = 0; j < subseq_len; ++j) {
+ if (!compare_masked_insn(&seq[i + j], &subseq[j])) {
+ found = false;
+ break;
+ }
+ }
+ if (found)
+ return i;
+ }
+
+ return -1;
+}
+
+static int find_skip_insn_marker(struct bpf_insn *seq, int len)
+{
+ int i;
+
+ for (i = 0; i < len; ++i)
+ if (is_skip_insn(&seq[i]))
+ return i;
+
+ return -1;
+}
+
+/* Return true if all sub-sequences in `subseqs` could be found in
+ * `seq` one after another. Sub-sequences are separated by a single
+ * nil instruction.
+ */
+static bool find_all_insn_subseqs(struct bpf_insn *seq, struct bpf_insn *subseqs,
+ int seq_len, int max_subseqs_len)
+{
+ int subseqs_len = null_terminated_insn_len(subseqs, max_subseqs_len);
+
+ while (subseqs_len > 0) {
+ int skip_idx = find_skip_insn_marker(subseqs, subseqs_len);
+ int cur_subseq_len = skip_idx < 0 ? subseqs_len : skip_idx;
+ int subseq_idx = find_insn_subseq(seq, subseqs,
+ seq_len, cur_subseq_len);
+
+ if (subseq_idx < 0)
+ return false;
+ seq += subseq_idx + cur_subseq_len;
+ seq_len -= subseq_idx + cur_subseq_len;
+ subseqs += cur_subseq_len + 1;
+ subseqs_len -= cur_subseq_len + 1;
+ }
+
+ return true;
+}
+
+static void print_insn(struct bpf_insn *buf, int cnt)
+{
+ int i;
+
+ printf(" addr op d s off imm\n");
+ for (i = 0; i < cnt; ++i) {
+ struct bpf_insn *insn = &buf[i];
+
+ if (is_null_insn(insn))
+ break;
+
+ if (is_skip_insn(insn))
+ printf(" ...\n");
+ else
+ printf(" %04x: %02x %1x %x %04hx %08x\n",
+ i, insn->code, insn->dst_reg,
+ insn->src_reg, insn->off, insn->imm);
+ }
+}
+
+static bool check_xlated_program(struct bpf_test *test, int fd_prog)
+{
+ struct bpf_insn *buf;
+ int cnt;
+ bool result = true;
+ bool check_expected = !is_null_insn(test->expected_insns);
+ bool check_unexpected = !is_null_insn(test->unexpected_insns);
+
+ if (!check_expected && !check_unexpected)
+ goto out;
+
+ if (get_xlated_program(fd_prog, &buf, &cnt)) {
+ printf("FAIL: can't get xlated program\n");
+ result = false;
+ goto out;
+ }
+
+ if (check_expected &&
+ !find_all_insn_subseqs(buf, test->expected_insns,
+ cnt, MAX_EXPECTED_INSNS)) {
+ printf("FAIL: can't find expected subsequence of instructions\n");
+ result = false;
+ if (verbose) {
+ printf("Program:\n");
+ print_insn(buf, cnt);
+ printf("Expected subsequence:\n");
+ print_insn(test->expected_insns, MAX_EXPECTED_INSNS);
+ }
+ }
+
+ if (check_unexpected &&
+ find_all_insn_subseqs(buf, test->unexpected_insns,
+ cnt, MAX_UNEXPECTED_INSNS)) {
+ printf("FAIL: found unexpected subsequence of instructions\n");
+ result = false;
+ if (verbose) {
+ printf("Program:\n");
+ print_insn(buf, cnt);
+ printf("Un-expected subsequence:\n");
+ print_insn(test->unexpected_insns, MAX_UNEXPECTED_INSNS);
+ }
+ }
+
+ free(buf);
+ out:
+ return result;
+}
+
static void do_test_single(struct bpf_test *test, bool unpriv,
int *passes, int *errors)
{
- int fd_prog, expected_ret, alignment_prevented_execution;
+ int fd_prog, btf_fd, expected_ret, alignment_prevented_execution;
int prog_len, prog_type = test->prog_type;
struct bpf_insn *prog = test->insns;
LIBBPF_OPTS(bpf_prog_load_opts, opts);
@@ -1141,8 +1455,10 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
__u32 pflags;
int i, err;
+ fd_prog = -1;
for (i = 0; i < MAX_NR_MAPS; i++)
map_fds[i] = -1;
+ btf_fd = -1;
if (!prog_type)
prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
@@ -1175,11 +1491,11 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
opts.expected_attach_type = test->expected_attach_type;
if (verbose)
- opts.log_level = 1;
+ opts.log_level = VERBOSE_LIBBPF_LOG_LEVEL;
else if (expected_ret == VERBOSE_ACCEPT)
opts.log_level = 2;
else
- opts.log_level = 4;
+ opts.log_level = DEFAULT_LIBBPF_LOG_LEVEL;
opts.prog_flags = pflags;
if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) {
@@ -1197,6 +1513,19 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
opts.attach_btf_id = attach_btf_id;
}
+ if (test->btf_types[0] != 0) {
+ btf_fd = load_btf_for_test(test);
+ if (btf_fd < 0)
+ goto fail_log;
+ opts.prog_btf_fd = btf_fd;
+ }
+
+ if (test->func_info_cnt != 0) {
+ opts.func_info = test->func_info;
+ opts.func_info_cnt = test->func_info_cnt;
+ opts.func_info_rec_size = sizeof(test->func_info[0]);
+ }
+
opts.log_buf = bpf_vlog;
opts.log_size = sizeof(bpf_vlog);
fd_prog = bpf_prog_load(prog_type, NULL, "GPL", prog, prog_len, &opts);
@@ -1262,6 +1591,9 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
if (verbose)
printf(", verifier log:\n%s", bpf_vlog);
+ if (!check_xlated_program(test, fd_prog))
+ goto fail_log;
+
run_errs = 0;
run_successes = 0;
if (!alignment_prevented_execution && fd_prog >= 0 && test->runs >= 0) {
@@ -1305,6 +1637,7 @@ close_fds:
if (test->fill_insns)
free(test->fill_insns);
close(fd_prog);
+ close(btf_fd);
for (i = 0; i < MAX_NR_MAPS; i++)
close(map_fds[i]);
sched_yield();
diff --git a/tools/testing/selftests/bpf/test_xdp_veth.sh b/tools/testing/selftests/bpf/test_xdp_veth.sh
index 392d28cc4e58..49936c4c8567 100755
--- a/tools/testing/selftests/bpf/test_xdp_veth.sh
+++ b/tools/testing/selftests/bpf/test_xdp_veth.sh
@@ -106,9 +106,9 @@ bpftool prog loadall \
bpftool map update pinned $BPF_DIR/maps/tx_port key 0 0 0 0 value 122 0 0 0
bpftool map update pinned $BPF_DIR/maps/tx_port key 1 0 0 0 value 133 0 0 0
bpftool map update pinned $BPF_DIR/maps/tx_port key 2 0 0 0 value 111 0 0 0
-ip link set dev veth1 xdp pinned $BPF_DIR/progs/redirect_map_0
-ip link set dev veth2 xdp pinned $BPF_DIR/progs/redirect_map_1
-ip link set dev veth3 xdp pinned $BPF_DIR/progs/redirect_map_2
+ip link set dev veth1 xdp pinned $BPF_DIR/progs/xdp_redirect_map_0
+ip link set dev veth2 xdp pinned $BPF_DIR/progs/xdp_redirect_map_1
+ip link set dev veth3 xdp pinned $BPF_DIR/progs/xdp_redirect_map_2
ip -n ${NS1} link set dev veth11 xdp obj xdp_dummy.o sec xdp
ip -n ${NS2} link set dev veth22 xdp obj xdp_tx.o sec xdp
diff --git a/tools/testing/selftests/bpf/test_xdping.sh b/tools/testing/selftests/bpf/test_xdping.sh
index c2f0ddb45531..c3d82e0a7378 100755
--- a/tools/testing/selftests/bpf/test_xdping.sh
+++ b/tools/testing/selftests/bpf/test_xdping.sh
@@ -95,5 +95,9 @@ for server_args in "" "-I veth0 -s -S" ; do
test "$client_args" "$server_args"
done
+# Test drv mode
+test "-I veth1 -N" "-I veth0 -s -N"
+test "-I veth1 -N -c 10" "-I veth0 -s -N"
+
echo "OK. All tests passed"
exit 0
diff --git a/tools/testing/selftests/bpf/test_xsk.sh b/tools/testing/selftests/bpf/test_xsk.sh
index 567500299231..096a957594cd 100755
--- a/tools/testing/selftests/bpf/test_xsk.sh
+++ b/tools/testing/selftests/bpf/test_xsk.sh
@@ -47,7 +47,7 @@
# conflict with any existing interface
# * tests the veth and xsk layers of the topology
#
-# See the source xdpxceiver.c for information on each test
+# See the source xskxceiver.c for information on each test
#
# Kernel configuration:
# ---------------------
@@ -160,14 +160,14 @@ statusList=()
TEST_NAME="XSK_SELFTESTS_SOFTIRQ"
-execxdpxceiver
+exec_xskxceiver
cleanup_exit ${VETH0} ${VETH1} ${NS1}
TEST_NAME="XSK_SELFTESTS_BUSY_POLL"
busy_poll=1
setup_vethPairs
-execxdpxceiver
+exec_xskxceiver
## END TESTS
diff --git a/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
new file mode 100644
index 000000000000..a535d41dc20d
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
@@ -0,0 +1,264 @@
+#define BTF_TYPES \
+ .btf_strings = "\0int\0i\0ctx\0callback\0main\0", \
+ .btf_types = { \
+ /* 1: int */ BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), \
+ /* 2: int* */ BTF_PTR_ENC(1), \
+ /* 3: void* */ BTF_PTR_ENC(0), \
+ /* 4: int __(void*) */ BTF_FUNC_PROTO_ENC(1, 1), \
+ BTF_FUNC_PROTO_ARG_ENC(7, 3), \
+ /* 5: int __(int, int*) */ BTF_FUNC_PROTO_ENC(1, 2), \
+ BTF_FUNC_PROTO_ARG_ENC(5, 1), \
+ BTF_FUNC_PROTO_ARG_ENC(7, 2), \
+ /* 6: main */ BTF_FUNC_ENC(20, 4), \
+ /* 7: callback */ BTF_FUNC_ENC(11, 5), \
+ BTF_END_RAW \
+ }
+
+#define MAIN_TYPE 6
+#define CALLBACK_TYPE 7
+
+/* can't use BPF_CALL_REL, jit_subprogs adjusts IMM & OFF
+ * fields for pseudo calls
+ */
+#define PSEUDO_CALL_INSN() \
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, \
+ INSN_OFF_MASK, INSN_IMM_MASK)
+
+/* can't use BPF_FUNC_loop constant,
+ * do_mix_fixups adjusts the IMM field
+ */
+#define HELPER_CALL_INSN() \
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, INSN_OFF_MASK, INSN_IMM_MASK)
+
+{
+ "inline simple bpf_loop call",
+ .insns = {
+ /* main */
+ /* force verifier state branching to verify logic on first and
+ * subsequent bpf_loop insn processing steps
+ */
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 2),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+ BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* callback */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .expected_insns = { PSEUDO_CALL_INSN() },
+ .unexpected_insns = { HELPER_CALL_INSN() },
+ .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .result = ACCEPT,
+ .runs = 0,
+ .func_info = { { 0, MAIN_TYPE }, { 12, CALLBACK_TYPE } },
+ .func_info_cnt = 2,
+ BTF_TYPES
+},
+{
+ "don't inline bpf_loop call, flags non-zero",
+ .insns = {
+ /* main */
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_7, BPF_REG_0),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 0, 9),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 7),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 1),
+ BPF_JMP_IMM(BPF_JA, 0, 0, -10),
+ /* callback */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .expected_insns = { HELPER_CALL_INSN() },
+ .unexpected_insns = { PSEUDO_CALL_INSN() },
+ .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .result = ACCEPT,
+ .runs = 0,
+ .func_info = { { 0, MAIN_TYPE }, { 16, CALLBACK_TYPE } },
+ .func_info_cnt = 2,
+ BTF_TYPES
+},
+{
+ "don't inline bpf_loop call, callback non-constant",
+ .insns = {
+ /* main */
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 4), /* pick a random callback */
+
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 10),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_JMP_IMM(BPF_JA, 0, 0, 3),
+
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* callback */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* callback #2 */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .expected_insns = { HELPER_CALL_INSN() },
+ .unexpected_insns = { PSEUDO_CALL_INSN() },
+ .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .result = ACCEPT,
+ .runs = 0,
+ .func_info = {
+ { 0, MAIN_TYPE },
+ { 14, CALLBACK_TYPE },
+ { 16, CALLBACK_TYPE }
+ },
+ .func_info_cnt = 3,
+ BTF_TYPES
+},
+{
+ "bpf_loop_inline and a dead func",
+ .insns = {
+ /* main */
+
+ /* A reference to callback #1 to make verifier count it as a func.
+ * This reference is overwritten below and callback #1 is dead.
+ */
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 9),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* callback */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* callback #2 */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .expected_insns = { PSEUDO_CALL_INSN() },
+ .unexpected_insns = { HELPER_CALL_INSN() },
+ .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .result = ACCEPT,
+ .runs = 0,
+ .func_info = {
+ { 0, MAIN_TYPE },
+ { 10, CALLBACK_TYPE },
+ { 12, CALLBACK_TYPE }
+ },
+ .func_info_cnt = 3,
+ BTF_TYPES
+},
+{
+ "bpf_loop_inline stack locations for loop vars",
+ .insns = {
+ /* main */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -12, 0x77),
+ /* bpf_loop call #1 */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 22),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ /* bpf_loop call #2 */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 16),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ /* call func and exit */
+ BPF_CALL_REL(2),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* func */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2),
+ BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6),
+ BPF_RAW_INSN(0, 0, 0, 0, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop),
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* callback */
+ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .expected_insns = {
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -12, 0x77),
+ SKIP_INSNS(),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
+ SKIP_INSNS(),
+ /* offsets are the same as in the first call */
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24),
+ SKIP_INSNS(),
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55),
+ SKIP_INSNS(),
+ /* offsets differ from main because of different offset
+ * in BPF_ST_MEM instruction
+ */
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -56),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -48),
+ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -40),
+ },
+ .unexpected_insns = { HELPER_CALL_INSN() },
+ .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .result = ACCEPT,
+ .func_info = {
+ { 0, MAIN_TYPE },
+ { 16, MAIN_TYPE },
+ { 25, CALLBACK_TYPE },
+ },
+ .func_info_cnt = 3,
+ BTF_TYPES
+},
+{
+ "inline bpf_loop call in a big program",
+ .insns = {},
+ .fill_helper = bpf_fill_big_prog_with_loop_1,
+ .expected_insns = { PSEUDO_CALL_INSN() },
+ .unexpected_insns = { HELPER_CALL_INSN() },
+ .result = ACCEPT,
+ .prog_type = BPF_PROG_TYPE_TRACEPOINT,
+ .func_info = { { 0, MAIN_TYPE }, { 16, CALLBACK_TYPE } },
+ .func_info_cnt = 2,
+ BTF_TYPES
+},
+
+#undef HELPER_CALL_INSN
+#undef PSEUDO_CALL_INSN
+#undef CALLBACK_TYPE
+#undef MAIN_TYPE
+#undef BTF_TYPES
diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
index 743ed34c1238..3fb4f69b1962 100644
--- a/tools/testing/selftests/bpf/verifier/calls.c
+++ b/tools/testing/selftests/bpf/verifier/calls.c
@@ -219,6 +219,59 @@
.errstr = "variable ptr_ access var_off=(0x0; 0x7) disallowed",
},
{
+ "calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID",
+ .insns = {
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),
+ BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+ BPF_EXIT_INSN(),
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, 16),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .fixup_kfunc_btf_id = {
+ { "bpf_kfunc_call_test_acquire", 3 },
+ { "bpf_kfunc_call_test_ref", 8 },
+ { "bpf_kfunc_call_test_ref", 10 },
+ },
+ .result_unpriv = REJECT,
+ .result = REJECT,
+ .errstr = "R1 must be referenced",
+},
+{
+ "calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID",
+ .insns = {
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),
+ BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+ BPF_EXIT_INSN(),
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .fixup_kfunc_btf_id = {
+ { "bpf_kfunc_call_test_acquire", 3 },
+ { "bpf_kfunc_call_test_ref", 8 },
+ { "bpf_kfunc_call_test_release", 10 },
+ },
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+},
+{
"calls: basic sanity",
.insns = {
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
diff --git a/tools/testing/selftests/bpf/vmtest.sh b/tools/testing/selftests/bpf/vmtest.sh
index e0bb04a97e10..b86ae4a2e5c5 100755
--- a/tools/testing/selftests/bpf/vmtest.sh
+++ b/tools/testing/selftests/bpf/vmtest.sh
@@ -30,8 +30,7 @@ DEFAULT_COMMAND="./test_progs"
MOUNT_DIR="mnt"
ROOTFS_IMAGE="root.img"
OUTPUT_DIR="$HOME/.bpf_selftests"
-KCONFIG_URL="https://raw.githubusercontent.com/libbpf/libbpf/master/travis-ci/vmtest/configs/config-latest.${ARCH}"
-KCONFIG_API_URL="https://api.github.com/repos/libbpf/libbpf/contents/travis-ci/vmtest/configs/config-latest.${ARCH}"
+KCONFIG_REL_PATHS=("tools/testing/selftests/bpf/config" "tools/testing/selftests/bpf/config.${ARCH}")
INDEX_URL="https://raw.githubusercontent.com/libbpf/ci/master/INDEX"
NUM_COMPILE_JOBS="$(nproc)"
LOG_FILE_BASE="$(date +"bpf_selftests.%Y-%m-%d_%H-%M-%S")"
@@ -269,26 +268,42 @@ is_rel_path()
[[ ${path:0:1} != "/" ]]
}
+do_update_kconfig()
+{
+ local kernel_checkout="$1"
+ local kconfig_file="$2"
+
+ rm -f "$kconfig_file" 2> /dev/null
+
+ for config in "${KCONFIG_REL_PATHS[@]}"; do
+ local kconfig_src="${kernel_checkout}/${config}"
+ cat "$kconfig_src" >> "$kconfig_file"
+ done
+}
+
update_kconfig()
{
- local kconfig_file="$1"
- local update_command="curl -sLf ${KCONFIG_URL} -o ${kconfig_file}"
- # Github does not return the "last-modified" header when retrieving the
- # raw contents of the file. Use the API call to get the last-modified
- # time of the kernel config and only update the config if it has been
- # updated after the previously cached config was created. This avoids
- # unnecessarily compiling the kernel and selftests.
- if [[ -f "${kconfig_file}" ]]; then
- local last_modified_date="$(curl -sL -D - "${KCONFIG_API_URL}" -o /dev/null | \
- grep "last-modified" | awk -F ': ' '{print $2}')"
- local remote_modified_timestamp="$(date -d "${last_modified_date}" +"%s")"
- local local_creation_timestamp="$(stat -c %Y "${kconfig_file}")"
+ local kernel_checkout="$1"
+ local kconfig_file="$2"
- if [[ "${remote_modified_timestamp}" -gt "${local_creation_timestamp}" ]]; then
- ${update_command}
- fi
+ if [[ -f "${kconfig_file}" ]]; then
+ local local_modified="$(stat -c %Y "${kconfig_file}")"
+
+ for config in "${KCONFIG_REL_PATHS[@]}"; do
+ local kconfig_src="${kernel_checkout}/${config}"
+ local src_modified="$(stat -c %Y "${kconfig_src}")"
+ # Only update the config if it has been updated after the
+ # previously cached config was created. This avoids
+ # unnecessarily compiling the kernel and selftests.
+ if [[ "${src_modified}" -gt "${local_modified}" ]]; then
+ do_update_kconfig "$kernel_checkout" "$kconfig_file"
+ # Once we have found one outdated configuration
+ # there is no need to check other ones.
+ break
+ fi
+ done
else
- ${update_command}
+ do_update_kconfig "$kernel_checkout" "$kconfig_file"
fi
}
@@ -372,7 +387,7 @@ main()
mkdir -p "${OUTPUT_DIR}"
mkdir -p "${mount_dir}"
- update_kconfig "${kconfig_file}"
+ update_kconfig "${kernel_checkout}" "${kconfig_file}"
recompile_kernel "${kernel_checkout}" "${make_command}"
diff --git a/tools/testing/selftests/bpf/xdp_synproxy.c b/tools/testing/selftests/bpf/xdp_synproxy.c
new file mode 100644
index 000000000000..d874ddfb39c4
--- /dev/null
+++ b/tools/testing/selftests/bpf/xdp_synproxy.c
@@ -0,0 +1,466 @@
+// SPDX-License-Identifier: LGPL-2.1 OR BSD-2-Clause
+/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#include <stdnoreturn.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#include <getopt.h>
+#include <signal.h>
+#include <sys/types.h>
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#include <net/if.h>
+#include <linux/if_link.h>
+#include <linux/limits.h>
+
+static unsigned int ifindex;
+static __u32 attached_prog_id;
+static bool attached_tc;
+
+static void noreturn cleanup(int sig)
+{
+ LIBBPF_OPTS(bpf_xdp_attach_opts, opts);
+ int prog_fd;
+ int err;
+
+ if (attached_prog_id == 0)
+ exit(0);
+
+ if (attached_tc) {
+ LIBBPF_OPTS(bpf_tc_hook, hook,
+ .ifindex = ifindex,
+ .attach_point = BPF_TC_INGRESS);
+
+ err = bpf_tc_hook_destroy(&hook);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_tc_hook_destroy: %s\n", strerror(-err));
+ fprintf(stderr, "Failed to destroy the TC hook\n");
+ exit(1);
+ }
+ exit(0);
+ }
+
+ prog_fd = bpf_prog_get_fd_by_id(attached_prog_id);
+ if (prog_fd < 0) {
+ fprintf(stderr, "Error: bpf_prog_get_fd_by_id: %s\n", strerror(-prog_fd));
+ err = bpf_xdp_attach(ifindex, -1, 0, NULL);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_set_link_xdp_fd: %s\n", strerror(-err));
+ fprintf(stderr, "Failed to detach XDP program\n");
+ exit(1);
+ }
+ } else {
+ opts.old_prog_fd = prog_fd;
+ err = bpf_xdp_attach(ifindex, -1, XDP_FLAGS_REPLACE, &opts);
+ close(prog_fd);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_set_link_xdp_fd_opts: %s\n", strerror(-err));
+ /* Not an error if already replaced by someone else. */
+ if (err != -EEXIST) {
+ fprintf(stderr, "Failed to detach XDP program\n");
+ exit(1);
+ }
+ }
+ }
+ exit(0);
+}
+
+static noreturn void usage(const char *progname)
+{
+ fprintf(stderr, "Usage: %s [--iface <iface>|--prog <prog_id>] [--mss4 <mss ipv4> --mss6 <mss ipv6> --wscale <wscale> --ttl <ttl>] [--ports <port1>,<port2>,...] [--single] [--tc]\n",
+ progname);
+ exit(1);
+}
+
+static unsigned long parse_arg_ul(const char *progname, const char *arg, unsigned long limit)
+{
+ unsigned long res;
+ char *endptr;
+
+ errno = 0;
+ res = strtoul(arg, &endptr, 10);
+ if (errno != 0 || *endptr != '\0' || arg[0] == '\0' || res > limit)
+ usage(progname);
+
+ return res;
+}
+
+static void parse_options(int argc, char *argv[], unsigned int *ifindex, __u32 *prog_id,
+ __u64 *tcpipopts, char **ports, bool *single, bool *tc)
+{
+ static struct option long_options[] = {
+ { "help", no_argument, NULL, 'h' },
+ { "iface", required_argument, NULL, 'i' },
+ { "prog", required_argument, NULL, 'x' },
+ { "mss4", required_argument, NULL, 4 },
+ { "mss6", required_argument, NULL, 6 },
+ { "wscale", required_argument, NULL, 'w' },
+ { "ttl", required_argument, NULL, 't' },
+ { "ports", required_argument, NULL, 'p' },
+ { "single", no_argument, NULL, 's' },
+ { "tc", no_argument, NULL, 'c' },
+ { NULL, 0, NULL, 0 },
+ };
+ unsigned long mss4, mss6, wscale, ttl;
+ unsigned int tcpipopts_mask = 0;
+
+ if (argc < 2)
+ usage(argv[0]);
+
+ *ifindex = 0;
+ *prog_id = 0;
+ *tcpipopts = 0;
+ *ports = NULL;
+ *single = false;
+
+ while (true) {
+ int opt;
+
+ opt = getopt_long(argc, argv, "", long_options, NULL);
+ if (opt == -1)
+ break;
+
+ switch (opt) {
+ case 'h':
+ usage(argv[0]);
+ break;
+ case 'i':
+ *ifindex = if_nametoindex(optarg);
+ if (*ifindex == 0)
+ usage(argv[0]);
+ break;
+ case 'x':
+ *prog_id = parse_arg_ul(argv[0], optarg, UINT32_MAX);
+ if (*prog_id == 0)
+ usage(argv[0]);
+ break;
+ case 4:
+ mss4 = parse_arg_ul(argv[0], optarg, UINT16_MAX);
+ tcpipopts_mask |= 1 << 0;
+ break;
+ case 6:
+ mss6 = parse_arg_ul(argv[0], optarg, UINT16_MAX);
+ tcpipopts_mask |= 1 << 1;
+ break;
+ case 'w':
+ wscale = parse_arg_ul(argv[0], optarg, 14);
+ tcpipopts_mask |= 1 << 2;
+ break;
+ case 't':
+ ttl = parse_arg_ul(argv[0], optarg, UINT8_MAX);
+ tcpipopts_mask |= 1 << 3;
+ break;
+ case 'p':
+ *ports = optarg;
+ break;
+ case 's':
+ *single = true;
+ break;
+ case 'c':
+ *tc = true;
+ break;
+ default:
+ usage(argv[0]);
+ }
+ }
+ if (optind < argc)
+ usage(argv[0]);
+
+ if (tcpipopts_mask == 0xf) {
+ if (mss4 == 0 || mss6 == 0 || wscale == 0 || ttl == 0)
+ usage(argv[0]);
+ *tcpipopts = (mss6 << 32) | (ttl << 24) | (wscale << 16) | mss4;
+ } else if (tcpipopts_mask != 0) {
+ usage(argv[0]);
+ }
+
+ if (*ifindex != 0 && *prog_id != 0)
+ usage(argv[0]);
+ if (*ifindex == 0 && *prog_id == 0)
+ usage(argv[0]);
+}
+
+static int syncookie_attach(const char *argv0, unsigned int ifindex, bool tc)
+{
+ struct bpf_prog_info info = {};
+ __u32 info_len = sizeof(info);
+ char xdp_filename[PATH_MAX];
+ struct bpf_program *prog;
+ struct bpf_object *obj;
+ int prog_fd;
+ int err;
+
+ snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv0);
+ obj = bpf_object__open_file(xdp_filename, NULL);
+ err = libbpf_get_error(obj);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_object__open_file: %s\n", strerror(-err));
+ return err;
+ }
+
+ err = bpf_object__load(obj);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_object__open_file: %s\n", strerror(-err));
+ return err;
+ }
+
+ prog = bpf_object__find_program_by_name(obj, tc ? "syncookie_tc" : "syncookie_xdp");
+ if (!prog) {
+ fprintf(stderr, "Error: bpf_object__find_program_by_name: program was not found\n");
+ return -ENOENT;
+ }
+
+ prog_fd = bpf_program__fd(prog);
+
+ err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_obj_get_info_by_fd: %s\n", strerror(-err));
+ goto out;
+ }
+ attached_tc = tc;
+ attached_prog_id = info.id;
+ signal(SIGINT, cleanup);
+ signal(SIGTERM, cleanup);
+ if (tc) {
+ LIBBPF_OPTS(bpf_tc_hook, hook,
+ .ifindex = ifindex,
+ .attach_point = BPF_TC_INGRESS);
+ LIBBPF_OPTS(bpf_tc_opts, opts,
+ .handle = 1,
+ .priority = 1,
+ .prog_fd = prog_fd);
+
+ err = bpf_tc_hook_create(&hook);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_tc_hook_create: %s\n",
+ strerror(-err));
+ goto fail;
+ }
+ err = bpf_tc_attach(&hook, &opts);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_tc_attach: %s\n",
+ strerror(-err));
+ goto fail;
+ }
+
+ } else {
+ err = bpf_xdp_attach(ifindex, prog_fd,
+ XDP_FLAGS_UPDATE_IF_NOEXIST, NULL);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_set_link_xdp_fd: %s\n",
+ strerror(-err));
+ goto fail;
+ }
+ }
+ err = 0;
+out:
+ bpf_object__close(obj);
+ return err;
+fail:
+ signal(SIGINT, SIG_DFL);
+ signal(SIGTERM, SIG_DFL);
+ attached_prog_id = 0;
+ goto out;
+}
+
+static int syncookie_open_bpf_maps(__u32 prog_id, int *values_map_fd, int *ports_map_fd)
+{
+ struct bpf_prog_info prog_info;
+ __u32 map_ids[8];
+ __u32 info_len;
+ int prog_fd;
+ int err;
+ int i;
+
+ *values_map_fd = -1;
+ *ports_map_fd = -1;
+
+ prog_fd = bpf_prog_get_fd_by_id(prog_id);
+ if (prog_fd < 0) {
+ fprintf(stderr, "Error: bpf_prog_get_fd_by_id: %s\n", strerror(-prog_fd));
+ return prog_fd;
+ }
+
+ prog_info = (struct bpf_prog_info) {
+ .nr_map_ids = 8,
+ .map_ids = (__u64)map_ids,
+ };
+ info_len = sizeof(prog_info);
+
+ err = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &info_len);
+ if (err != 0) {
+ fprintf(stderr, "Error: bpf_obj_get_info_by_fd: %s\n", strerror(-err));
+ goto out;
+ }
+
+ if (prog_info.nr_map_ids < 2) {
+ fprintf(stderr, "Error: Found %u BPF maps, expected at least 2\n",
+ prog_info.nr_map_ids);
+ err = -ENOENT;
+ goto out;
+ }
+
+ for (i = 0; i < prog_info.nr_map_ids; i++) {
+ struct bpf_map_info map_info = {};
+ int map_fd;
+
+ err = bpf_map_get_fd_by_id(map_ids[i]);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_map_get_fd_by_id: %s\n", strerror(-err));
+ goto err_close_map_fds;
+ }
+ map_fd = err;
+
+ info_len = sizeof(map_info);
+ err = bpf_obj_get_info_by_fd(map_fd, &map_info, &info_len);
+ if (err != 0) {
+ fprintf(stderr, "Error: bpf_obj_get_info_by_fd: %s\n", strerror(-err));
+ close(map_fd);
+ goto err_close_map_fds;
+ }
+ if (strcmp(map_info.name, "values") == 0) {
+ *values_map_fd = map_fd;
+ continue;
+ }
+ if (strcmp(map_info.name, "allowed_ports") == 0) {
+ *ports_map_fd = map_fd;
+ continue;
+ }
+ close(map_fd);
+ }
+
+ if (*values_map_fd != -1 && *ports_map_fd != -1) {
+ err = 0;
+ goto out;
+ }
+
+ err = -ENOENT;
+
+err_close_map_fds:
+ if (*values_map_fd != -1)
+ close(*values_map_fd);
+ if (*ports_map_fd != -1)
+ close(*ports_map_fd);
+ *values_map_fd = -1;
+ *ports_map_fd = -1;
+
+out:
+ close(prog_fd);
+ return err;
+}
+
+int main(int argc, char *argv[])
+{
+ int values_map_fd, ports_map_fd;
+ __u64 tcpipopts;
+ bool firstiter;
+ __u64 prevcnt;
+ __u32 prog_id;
+ char *ports;
+ bool single;
+ int err = 0;
+ bool tc;
+
+ parse_options(argc, argv, &ifindex, &prog_id, &tcpipopts, &ports,
+ &single, &tc);
+
+ if (prog_id == 0) {
+ if (!tc) {
+ err = bpf_xdp_query_id(ifindex, 0, &prog_id);
+ if (err < 0) {
+ fprintf(stderr, "Error: bpf_get_link_xdp_id: %s\n",
+ strerror(-err));
+ goto out;
+ }
+ }
+ if (prog_id == 0) {
+ err = syncookie_attach(argv[0], ifindex, tc);
+ if (err < 0)
+ goto out;
+ prog_id = attached_prog_id;
+ }
+ }
+
+ err = syncookie_open_bpf_maps(prog_id, &values_map_fd, &ports_map_fd);
+ if (err < 0)
+ goto out;
+
+ if (ports) {
+ __u16 port_last = 0;
+ __u32 port_idx = 0;
+ char *p = ports;
+
+ fprintf(stderr, "Replacing allowed ports\n");
+
+ while (p && *p != '\0') {
+ char *token = strsep(&p, ",");
+ __u16 port;
+
+ port = parse_arg_ul(argv[0], token, UINT16_MAX);
+ err = bpf_map_update_elem(ports_map_fd, &port_idx, &port, BPF_ANY);
+ if (err != 0) {
+ fprintf(stderr, "Error: bpf_map_update_elem: %s\n", strerror(-err));
+ fprintf(stderr, "Failed to add port %u (index %u)\n",
+ port, port_idx);
+ goto out_close_maps;
+ }
+ fprintf(stderr, "Added port %u\n", port);
+ port_idx++;
+ }
+ err = bpf_map_update_elem(ports_map_fd, &port_idx, &port_last, BPF_ANY);
+ if (err != 0) {
+ fprintf(stderr, "Error: bpf_map_update_elem: %s\n", strerror(-err));
+ fprintf(stderr, "Failed to add the terminator value 0 (index %u)\n",
+ port_idx);
+ goto out_close_maps;
+ }
+ }
+
+ if (tcpipopts) {
+ __u32 key = 0;
+
+ fprintf(stderr, "Replacing TCP/IP options\n");
+
+ err = bpf_map_update_elem(values_map_fd, &key, &tcpipopts, BPF_ANY);
+ if (err != 0) {
+ fprintf(stderr, "Error: bpf_map_update_elem: %s\n", strerror(-err));
+ goto out_close_maps;
+ }
+ }
+
+ if ((ports || tcpipopts) && attached_prog_id == 0 && !single)
+ goto out_close_maps;
+
+ prevcnt = 0;
+ firstiter = true;
+ while (true) {
+ __u32 key = 1;
+ __u64 value;
+
+ err = bpf_map_lookup_elem(values_map_fd, &key, &value);
+ if (err != 0) {
+ fprintf(stderr, "Error: bpf_map_lookup_elem: %s\n", strerror(-err));
+ goto out_close_maps;
+ }
+ if (firstiter) {
+ prevcnt = value;
+ firstiter = false;
+ }
+ if (single) {
+ printf("Total SYNACKs generated: %llu\n", value);
+ break;
+ }
+ printf("SYNACKs generated: %llu (total %llu)\n", value - prevcnt, value);
+ prevcnt = value;
+ sleep(1);
+ }
+
+out_close_maps:
+ close(values_map_fd);
+ close(ports_map_fd);
+out:
+ return err == 0 ? 0 : 1;
+}
diff --git a/tools/testing/selftests/bpf/xsk.c b/tools/testing/selftests/bpf/xsk.c
new file mode 100644
index 000000000000..f2721a4ae7c5
--- /dev/null
+++ b/tools/testing/selftests/bpf/xsk.c
@@ -0,0 +1,1268 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+/*
+ * AF_XDP user-space access library.
+ *
+ * Copyright(c) 2018 - 2019 Intel Corporation.
+ *
+ * Author(s): Magnus Karlsson <magnus.karlsson@intel.com>
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <arpa/inet.h>
+#include <asm/barrier.h>
+#include <linux/compiler.h>
+#include <linux/ethtool.h>
+#include <linux/filter.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/if_xdp.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/sockios.h>
+#include <net/if.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/socket.h>
+#include <sys/types.h>
+#include <linux/if_link.h>
+
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#include "xsk.h"
+
+#ifndef SOL_XDP
+ #define SOL_XDP 283
+#endif
+
+#ifndef AF_XDP
+ #define AF_XDP 44
+#endif
+
+#ifndef PF_XDP
+ #define PF_XDP AF_XDP
+#endif
+
+#define pr_warn(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
+
+enum xsk_prog {
+ XSK_PROG_FALLBACK,
+ XSK_PROG_REDIRECT_FLAGS,
+};
+
+struct xsk_umem {
+ struct xsk_ring_prod *fill_save;
+ struct xsk_ring_cons *comp_save;
+ char *umem_area;
+ struct xsk_umem_config config;
+ int fd;
+ int refcount;
+ struct list_head ctx_list;
+ bool rx_ring_setup_done;
+ bool tx_ring_setup_done;
+};
+
+struct xsk_ctx {
+ struct xsk_ring_prod *fill;
+ struct xsk_ring_cons *comp;
+ __u32 queue_id;
+ struct xsk_umem *umem;
+ int refcount;
+ int ifindex;
+ struct list_head list;
+ int prog_fd;
+ int link_fd;
+ int xsks_map_fd;
+ char ifname[IFNAMSIZ];
+ bool has_bpf_link;
+};
+
+struct xsk_socket {
+ struct xsk_ring_cons *rx;
+ struct xsk_ring_prod *tx;
+ __u64 outstanding_tx;
+ struct xsk_ctx *ctx;
+ struct xsk_socket_config config;
+ int fd;
+};
+
+struct xsk_nl_info {
+ bool xdp_prog_attached;
+ int ifindex;
+ int fd;
+};
+
+/* Up until and including Linux 5.3 */
+struct xdp_ring_offset_v1 {
+ __u64 producer;
+ __u64 consumer;
+ __u64 desc;
+};
+
+/* Up until and including Linux 5.3 */
+struct xdp_mmap_offsets_v1 {
+ struct xdp_ring_offset_v1 rx;
+ struct xdp_ring_offset_v1 tx;
+ struct xdp_ring_offset_v1 fr;
+ struct xdp_ring_offset_v1 cr;
+};
+
+int xsk_umem__fd(const struct xsk_umem *umem)
+{
+ return umem ? umem->fd : -EINVAL;
+}
+
+int xsk_socket__fd(const struct xsk_socket *xsk)
+{
+ return xsk ? xsk->fd : -EINVAL;
+}
+
+static bool xsk_page_aligned(void *buffer)
+{
+ unsigned long addr = (unsigned long)buffer;
+
+ return !(addr & (getpagesize() - 1));
+}
+
+static void xsk_set_umem_config(struct xsk_umem_config *cfg,
+ const struct xsk_umem_config *usr_cfg)
+{
+ if (!usr_cfg) {
+ cfg->fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
+ cfg->comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
+ cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
+ cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM;
+ cfg->flags = XSK_UMEM__DEFAULT_FLAGS;
+ return;
+ }
+
+ cfg->fill_size = usr_cfg->fill_size;
+ cfg->comp_size = usr_cfg->comp_size;
+ cfg->frame_size = usr_cfg->frame_size;
+ cfg->frame_headroom = usr_cfg->frame_headroom;
+ cfg->flags = usr_cfg->flags;
+}
+
+static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
+ const struct xsk_socket_config *usr_cfg)
+{
+ if (!usr_cfg) {
+ cfg->rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
+ cfg->tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
+ cfg->libbpf_flags = 0;
+ cfg->xdp_flags = 0;
+ cfg->bind_flags = 0;
+ return 0;
+ }
+
+ if (usr_cfg->libbpf_flags & ~XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)
+ return -EINVAL;
+
+ cfg->rx_size = usr_cfg->rx_size;
+ cfg->tx_size = usr_cfg->tx_size;
+ cfg->libbpf_flags = usr_cfg->libbpf_flags;
+ cfg->xdp_flags = usr_cfg->xdp_flags;
+ cfg->bind_flags = usr_cfg->bind_flags;
+
+ return 0;
+}
+
+static void xsk_mmap_offsets_v1(struct xdp_mmap_offsets *off)
+{
+ struct xdp_mmap_offsets_v1 off_v1;
+
+ /* getsockopt on a kernel <= 5.3 has no flags fields.
+ * Copy over the offsets to the correct places in the >=5.4 format
+ * and put the flags where they would have been on that kernel.
+ */
+ memcpy(&off_v1, off, sizeof(off_v1));
+
+ off->rx.producer = off_v1.rx.producer;
+ off->rx.consumer = off_v1.rx.consumer;
+ off->rx.desc = off_v1.rx.desc;
+ off->rx.flags = off_v1.rx.consumer + sizeof(__u32);
+
+ off->tx.producer = off_v1.tx.producer;
+ off->tx.consumer = off_v1.tx.consumer;
+ off->tx.desc = off_v1.tx.desc;
+ off->tx.flags = off_v1.tx.consumer + sizeof(__u32);
+
+ off->fr.producer = off_v1.fr.producer;
+ off->fr.consumer = off_v1.fr.consumer;
+ off->fr.desc = off_v1.fr.desc;
+ off->fr.flags = off_v1.fr.consumer + sizeof(__u32);
+
+ off->cr.producer = off_v1.cr.producer;
+ off->cr.consumer = off_v1.cr.consumer;
+ off->cr.desc = off_v1.cr.desc;
+ off->cr.flags = off_v1.cr.consumer + sizeof(__u32);
+}
+
+static int xsk_get_mmap_offsets(int fd, struct xdp_mmap_offsets *off)
+{
+ socklen_t optlen;
+ int err;
+
+ optlen = sizeof(*off);
+ err = getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, off, &optlen);
+ if (err)
+ return err;
+
+ if (optlen == sizeof(*off))
+ return 0;
+
+ if (optlen == sizeof(struct xdp_mmap_offsets_v1)) {
+ xsk_mmap_offsets_v1(off);
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+static int xsk_create_umem_rings(struct xsk_umem *umem, int fd,
+ struct xsk_ring_prod *fill,
+ struct xsk_ring_cons *comp)
+{
+ struct xdp_mmap_offsets off;
+ void *map;
+ int err;
+
+ err = setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING,
+ &umem->config.fill_size,
+ sizeof(umem->config.fill_size));
+ if (err)
+ return -errno;
+
+ err = setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING,
+ &umem->config.comp_size,
+ sizeof(umem->config.comp_size));
+ if (err)
+ return -errno;
+
+ err = xsk_get_mmap_offsets(fd, &off);
+ if (err)
+ return -errno;
+
+ map = mmap(NULL, off.fr.desc + umem->config.fill_size * sizeof(__u64),
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, fd,
+ XDP_UMEM_PGOFF_FILL_RING);
+ if (map == MAP_FAILED)
+ return -errno;
+
+ fill->mask = umem->config.fill_size - 1;
+ fill->size = umem->config.fill_size;
+ fill->producer = map + off.fr.producer;
+ fill->consumer = map + off.fr.consumer;
+ fill->flags = map + off.fr.flags;
+ fill->ring = map + off.fr.desc;
+ fill->cached_cons = umem->config.fill_size;
+
+ map = mmap(NULL, off.cr.desc + umem->config.comp_size * sizeof(__u64),
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, fd,
+ XDP_UMEM_PGOFF_COMPLETION_RING);
+ if (map == MAP_FAILED) {
+ err = -errno;
+ goto out_mmap;
+ }
+
+ comp->mask = umem->config.comp_size - 1;
+ comp->size = umem->config.comp_size;
+ comp->producer = map + off.cr.producer;
+ comp->consumer = map + off.cr.consumer;
+ comp->flags = map + off.cr.flags;
+ comp->ring = map + off.cr.desc;
+
+ return 0;
+
+out_mmap:
+ munmap(map, off.fr.desc + umem->config.fill_size * sizeof(__u64));
+ return err;
+}
+
+int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area,
+ __u64 size, struct xsk_ring_prod *fill,
+ struct xsk_ring_cons *comp,
+ const struct xsk_umem_config *usr_config)
+{
+ struct xdp_umem_reg mr;
+ struct xsk_umem *umem;
+ int err;
+
+ if (!umem_area || !umem_ptr || !fill || !comp)
+ return -EFAULT;
+ if (!size && !xsk_page_aligned(umem_area))
+ return -EINVAL;
+
+ umem = calloc(1, sizeof(*umem));
+ if (!umem)
+ return -ENOMEM;
+
+ umem->fd = socket(AF_XDP, SOCK_RAW | SOCK_CLOEXEC, 0);
+ if (umem->fd < 0) {
+ err = -errno;
+ goto out_umem_alloc;
+ }
+
+ umem->umem_area = umem_area;
+ INIT_LIST_HEAD(&umem->ctx_list);
+ xsk_set_umem_config(&umem->config, usr_config);
+
+ memset(&mr, 0, sizeof(mr));
+ mr.addr = (uintptr_t)umem_area;
+ mr.len = size;
+ mr.chunk_size = umem->config.frame_size;
+ mr.headroom = umem->config.frame_headroom;
+ mr.flags = umem->config.flags;
+
+ err = setsockopt(umem->fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr));
+ if (err) {
+ err = -errno;
+ goto out_socket;
+ }
+
+ err = xsk_create_umem_rings(umem, umem->fd, fill, comp);
+ if (err)
+ goto out_socket;
+
+ umem->fill_save = fill;
+ umem->comp_save = comp;
+ *umem_ptr = umem;
+ return 0;
+
+out_socket:
+ close(umem->fd);
+out_umem_alloc:
+ free(umem);
+ return err;
+}
+
+struct xsk_umem_config_v1 {
+ __u32 fill_size;
+ __u32 comp_size;
+ __u32 frame_size;
+ __u32 frame_headroom;
+};
+
+static enum xsk_prog get_xsk_prog(void)
+{
+ enum xsk_prog detected = XSK_PROG_FALLBACK;
+ char data_in = 0, data_out;
+ struct bpf_insn insns[] = {
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_MOV64_IMM(BPF_REG_2, 0),
+ BPF_MOV64_IMM(BPF_REG_3, XDP_PASS),
+ BPF_EMIT_CALL(BPF_FUNC_redirect_map),
+ BPF_EXIT_INSN(),
+ };
+ LIBBPF_OPTS(bpf_test_run_opts, opts,
+ .data_in = &data_in,
+ .data_size_in = 1,
+ .data_out = &data_out,
+ );
+
+ int prog_fd, map_fd, ret, insn_cnt = ARRAY_SIZE(insns);
+
+ map_fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, NULL, sizeof(int), sizeof(int), 1, NULL);
+ if (map_fd < 0)
+ return detected;
+
+ insns[0].imm = map_fd;
+
+ prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, NULL);
+ if (prog_fd < 0) {
+ close(map_fd);
+ return detected;
+ }
+
+ ret = bpf_prog_test_run_opts(prog_fd, &opts);
+ if (!ret && opts.retval == XDP_PASS)
+ detected = XSK_PROG_REDIRECT_FLAGS;
+ close(prog_fd);
+ close(map_fd);
+ return detected;
+}
+
+static int xsk_load_xdp_prog(struct xsk_socket *xsk)
+{
+ static const int log_buf_size = 16 * 1024;
+ struct xsk_ctx *ctx = xsk->ctx;
+ char log_buf[log_buf_size];
+ int prog_fd;
+
+ /* This is the fallback C-program:
+ * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
+ * {
+ * int ret, index = ctx->rx_queue_index;
+ *
+ * // A set entry here means that the correspnding queue_id
+ * // has an active AF_XDP socket bound to it.
+ * ret = bpf_redirect_map(&xsks_map, index, XDP_PASS);
+ * if (ret > 0)
+ * return ret;
+ *
+ * // Fallback for pre-5.3 kernels, not supporting default
+ * // action in the flags parameter.
+ * if (bpf_map_lookup_elem(&xsks_map, &index))
+ * return bpf_redirect_map(&xsks_map, index, 0);
+ * return XDP_PASS;
+ * }
+ */
+ struct bpf_insn prog[] = {
+ /* r2 = *(u32 *)(r1 + 16) */
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16),
+ /* *(u32 *)(r10 - 4) = r2 */
+ BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -4),
+ /* r1 = xskmap[] */
+ BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd),
+ /* r3 = XDP_PASS */
+ BPF_MOV64_IMM(BPF_REG_3, 2),
+ /* call bpf_redirect_map */
+ BPF_EMIT_CALL(BPF_FUNC_redirect_map),
+ /* if w0 != 0 goto pc+13 */
+ BPF_JMP32_IMM(BPF_JSGT, BPF_REG_0, 0, 13),
+ /* r2 = r10 */
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ /* r2 += -4 */
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),
+ /* r1 = xskmap[] */
+ BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd),
+ /* call bpf_map_lookup_elem */
+ BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+ /* r1 = r0 */
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+ /* r0 = XDP_PASS */
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ /* if r1 == 0 goto pc+5 */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 5),
+ /* r2 = *(u32 *)(r10 - 4) */
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_10, -4),
+ /* r1 = xskmap[] */
+ BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd),
+ /* r3 = 0 */
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ /* call bpf_redirect_map */
+ BPF_EMIT_CALL(BPF_FUNC_redirect_map),
+ /* The jumps are to this instruction */
+ BPF_EXIT_INSN(),
+ };
+
+ /* This is the post-5.3 kernel C-program:
+ * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
+ * {
+ * return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
+ * }
+ */
+ struct bpf_insn prog_redirect_flags[] = {
+ /* r2 = *(u32 *)(r1 + 16) */
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16),
+ /* r1 = xskmap[] */
+ BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd),
+ /* r3 = XDP_PASS */
+ BPF_MOV64_IMM(BPF_REG_3, 2),
+ /* call bpf_redirect_map */
+ BPF_EMIT_CALL(BPF_FUNC_redirect_map),
+ BPF_EXIT_INSN(),
+ };
+ size_t insns_cnt[] = {ARRAY_SIZE(prog),
+ ARRAY_SIZE(prog_redirect_flags),
+ };
+ struct bpf_insn *progs[] = {prog, prog_redirect_flags};
+ enum xsk_prog option = get_xsk_prog();
+ LIBBPF_OPTS(bpf_prog_load_opts, opts,
+ .log_buf = log_buf,
+ .log_size = log_buf_size,
+ );
+
+ prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "LGPL-2.1 or BSD-2-Clause",
+ progs[option], insns_cnt[option], &opts);
+ if (prog_fd < 0) {
+ pr_warn("BPF log buffer:\n%s", log_buf);
+ return prog_fd;
+ }
+
+ ctx->prog_fd = prog_fd;
+ return 0;
+}
+
+static int xsk_create_bpf_link(struct xsk_socket *xsk)
+{
+ DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
+ struct xsk_ctx *ctx = xsk->ctx;
+ __u32 prog_id = 0;
+ int link_fd;
+ int err;
+
+ err = bpf_xdp_query_id(ctx->ifindex, xsk->config.xdp_flags, &prog_id);
+ if (err) {
+ pr_warn("getting XDP prog id failed\n");
+ return err;
+ }
+
+ /* if there's a netlink-based XDP prog loaded on interface, bail out
+ * and ask user to do the removal by himself
+ */
+ if (prog_id) {
+ pr_warn("Netlink-based XDP prog detected, please unload it in order to launch AF_XDP prog\n");
+ return -EINVAL;
+ }
+
+ opts.flags = xsk->config.xdp_flags & ~(XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_REPLACE);
+
+ link_fd = bpf_link_create(ctx->prog_fd, ctx->ifindex, BPF_XDP, &opts);
+ if (link_fd < 0) {
+ pr_warn("bpf_link_create failed: %s\n", strerror(errno));
+ return link_fd;
+ }
+
+ ctx->link_fd = link_fd;
+ return 0;
+}
+
+/* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst
+ * is zero-terminated string no matter what (unless sz == 0, in which case
+ * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs
+ * in what is returned. Given this is internal helper, it's trivial to extend
+ * this, when necessary. Use this instead of strncpy inside libbpf source code.
+ */
+static inline void libbpf_strlcpy(char *dst, const char *src, size_t sz)
+{
+ size_t i;
+
+ if (sz == 0)
+ return;
+
+ sz--;
+ for (i = 0; i < sz && src[i]; i++)
+ dst[i] = src[i];
+ dst[i] = '\0';
+}
+
+static int xsk_get_max_queues(struct xsk_socket *xsk)
+{
+ struct ethtool_channels channels = { .cmd = ETHTOOL_GCHANNELS };
+ struct xsk_ctx *ctx = xsk->ctx;
+ struct ifreq ifr = {};
+ int fd, err, ret;
+
+ fd = socket(AF_LOCAL, SOCK_DGRAM | SOCK_CLOEXEC, 0);
+ if (fd < 0)
+ return -errno;
+
+ ifr.ifr_data = (void *)&channels;
+ libbpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ);
+ err = ioctl(fd, SIOCETHTOOL, &ifr);
+ if (err && errno != EOPNOTSUPP) {
+ ret = -errno;
+ goto out;
+ }
+
+ if (err) {
+ /* If the device says it has no channels, then all traffic
+ * is sent to a single stream, so max queues = 1.
+ */
+ ret = 1;
+ } else {
+ /* Take the max of rx, tx, combined. Drivers return
+ * the number of channels in different ways.
+ */
+ ret = max(channels.max_rx, channels.max_tx);
+ ret = max(ret, (int)channels.max_combined);
+ }
+
+out:
+ close(fd);
+ return ret;
+}
+
+static int xsk_create_bpf_maps(struct xsk_socket *xsk)
+{
+ struct xsk_ctx *ctx = xsk->ctx;
+ int max_queues;
+ int fd;
+
+ max_queues = xsk_get_max_queues(xsk);
+ if (max_queues < 0)
+ return max_queues;
+
+ fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, "xsks_map",
+ sizeof(int), sizeof(int), max_queues, NULL);
+ if (fd < 0)
+ return fd;
+
+ ctx->xsks_map_fd = fd;
+
+ return 0;
+}
+
+static void xsk_delete_bpf_maps(struct xsk_socket *xsk)
+{
+ struct xsk_ctx *ctx = xsk->ctx;
+
+ bpf_map_delete_elem(ctx->xsks_map_fd, &ctx->queue_id);
+ close(ctx->xsks_map_fd);
+}
+
+static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
+{
+ __u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info);
+ __u32 map_len = sizeof(struct bpf_map_info);
+ struct bpf_prog_info prog_info = {};
+ struct xsk_ctx *ctx = xsk->ctx;
+ struct bpf_map_info map_info;
+ int fd, err;
+
+ err = bpf_obj_get_info_by_fd(ctx->prog_fd, &prog_info, &prog_len);
+ if (err)
+ return err;
+
+ num_maps = prog_info.nr_map_ids;
+
+ map_ids = calloc(prog_info.nr_map_ids, sizeof(*map_ids));
+ if (!map_ids)
+ return -ENOMEM;
+
+ memset(&prog_info, 0, prog_len);
+ prog_info.nr_map_ids = num_maps;
+ prog_info.map_ids = (__u64)(unsigned long)map_ids;
+
+ err = bpf_obj_get_info_by_fd(ctx->prog_fd, &prog_info, &prog_len);
+ if (err)
+ goto out_map_ids;
+
+ ctx->xsks_map_fd = -1;
+
+ for (i = 0; i < prog_info.nr_map_ids; i++) {
+ fd = bpf_map_get_fd_by_id(map_ids[i]);
+ if (fd < 0)
+ continue;
+
+ memset(&map_info, 0, map_len);
+ err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len);
+ if (err) {
+ close(fd);
+ continue;
+ }
+
+ if (!strncmp(map_info.name, "xsks_map", sizeof(map_info.name))) {
+ ctx->xsks_map_fd = fd;
+ break;
+ }
+
+ close(fd);
+ }
+
+ if (ctx->xsks_map_fd == -1)
+ err = -ENOENT;
+
+out_map_ids:
+ free(map_ids);
+ return err;
+}
+
+static int xsk_set_bpf_maps(struct xsk_socket *xsk)
+{
+ struct xsk_ctx *ctx = xsk->ctx;
+
+ return bpf_map_update_elem(ctx->xsks_map_fd, &ctx->queue_id,
+ &xsk->fd, 0);
+}
+
+static int xsk_link_lookup(int ifindex, __u32 *prog_id, int *link_fd)
+{
+ struct bpf_link_info link_info;
+ __u32 link_len;
+ __u32 id = 0;
+ int err;
+ int fd;
+
+ while (true) {
+ err = bpf_link_get_next_id(id, &id);
+ if (err) {
+ if (errno == ENOENT) {
+ err = 0;
+ break;
+ }
+ pr_warn("can't get next link: %s\n", strerror(errno));
+ break;
+ }
+
+ fd = bpf_link_get_fd_by_id(id);
+ if (fd < 0) {
+ if (errno == ENOENT)
+ continue;
+ pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
+ err = -errno;
+ break;
+ }
+
+ link_len = sizeof(struct bpf_link_info);
+ memset(&link_info, 0, link_len);
+ err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
+ if (err) {
+ pr_warn("can't get link info: %s\n", strerror(errno));
+ close(fd);
+ break;
+ }
+ if (link_info.type == BPF_LINK_TYPE_XDP) {
+ if (link_info.xdp.ifindex == ifindex) {
+ *link_fd = fd;
+ if (prog_id)
+ *prog_id = link_info.prog_id;
+ break;
+ }
+ }
+ close(fd);
+ }
+
+ return err;
+}
+
+static bool xsk_probe_bpf_link(void)
+{
+ LIBBPF_OPTS(bpf_link_create_opts, opts, .flags = XDP_FLAGS_SKB_MODE);
+ struct bpf_insn insns[2] = {
+ BPF_MOV64_IMM(BPF_REG_0, XDP_PASS),
+ BPF_EXIT_INSN()
+ };
+ int prog_fd, link_fd = -1, insn_cnt = ARRAY_SIZE(insns);
+ int ifindex_lo = 1;
+ bool ret = false;
+ int err;
+
+ err = xsk_link_lookup(ifindex_lo, NULL, &link_fd);
+ if (err)
+ return ret;
+
+ if (link_fd >= 0)
+ return true;
+
+ prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, NULL);
+ if (prog_fd < 0)
+ return ret;
+
+ link_fd = bpf_link_create(prog_fd, ifindex_lo, BPF_XDP, &opts);
+ close(prog_fd);
+
+ if (link_fd >= 0) {
+ ret = true;
+ close(link_fd);
+ }
+
+ return ret;
+}
+
+static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
+{
+ char ifname[IFNAMSIZ];
+ struct xsk_ctx *ctx;
+ char *interface;
+
+ ctx = calloc(1, sizeof(*ctx));
+ if (!ctx)
+ return -ENOMEM;
+
+ interface = if_indextoname(ifindex, &ifname[0]);
+ if (!interface) {
+ free(ctx);
+ return -errno;
+ }
+
+ ctx->ifindex = ifindex;
+ libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
+
+ xsk->ctx = ctx;
+ xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
+
+ return 0;
+}
+
+static int xsk_init_xdp_res(struct xsk_socket *xsk,
+ int *xsks_map_fd)
+{
+ struct xsk_ctx *ctx = xsk->ctx;
+ int err;
+
+ err = xsk_create_bpf_maps(xsk);
+ if (err)
+ return err;
+
+ err = xsk_load_xdp_prog(xsk);
+ if (err)
+ goto err_load_xdp_prog;
+
+ if (ctx->has_bpf_link)
+ err = xsk_create_bpf_link(xsk);
+ else
+ err = bpf_xdp_attach(xsk->ctx->ifindex, ctx->prog_fd,
+ xsk->config.xdp_flags, NULL);
+
+ if (err)
+ goto err_attach_xdp_prog;
+
+ if (!xsk->rx)
+ return err;
+
+ err = xsk_set_bpf_maps(xsk);
+ if (err)
+ goto err_set_bpf_maps;
+
+ return err;
+
+err_set_bpf_maps:
+ if (ctx->has_bpf_link)
+ close(ctx->link_fd);
+ else
+ bpf_xdp_detach(ctx->ifindex, 0, NULL);
+err_attach_xdp_prog:
+ close(ctx->prog_fd);
+err_load_xdp_prog:
+ xsk_delete_bpf_maps(xsk);
+ return err;
+}
+
+static int xsk_lookup_xdp_res(struct xsk_socket *xsk, int *xsks_map_fd, int prog_id)
+{
+ struct xsk_ctx *ctx = xsk->ctx;
+ int err;
+
+ ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
+ if (ctx->prog_fd < 0) {
+ err = -errno;
+ goto err_prog_fd;
+ }
+ err = xsk_lookup_bpf_maps(xsk);
+ if (err)
+ goto err_lookup_maps;
+
+ if (!xsk->rx)
+ return err;
+
+ err = xsk_set_bpf_maps(xsk);
+ if (err)
+ goto err_set_maps;
+
+ return err;
+
+err_set_maps:
+ close(ctx->xsks_map_fd);
+err_lookup_maps:
+ close(ctx->prog_fd);
+err_prog_fd:
+ if (ctx->has_bpf_link)
+ close(ctx->link_fd);
+ return err;
+}
+
+static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp, int *xsks_map_fd)
+{
+ struct xsk_socket *xsk = _xdp;
+ struct xsk_ctx *ctx = xsk->ctx;
+ __u32 prog_id = 0;
+ int err;
+
+ if (ctx->has_bpf_link)
+ err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd);
+ else
+ err = bpf_xdp_query_id(ctx->ifindex, xsk->config.xdp_flags, &prog_id);
+
+ if (err)
+ return err;
+
+ err = !prog_id ? xsk_init_xdp_res(xsk, xsks_map_fd) :
+ xsk_lookup_xdp_res(xsk, xsks_map_fd, prog_id);
+
+ if (!err && xsks_map_fd)
+ *xsks_map_fd = ctx->xsks_map_fd;
+
+ return err;
+}
+
+int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd)
+{
+ return __xsk_setup_xdp_prog(xsk, xsks_map_fd);
+}
+
+static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex,
+ __u32 queue_id)
+{
+ struct xsk_ctx *ctx;
+
+ if (list_empty(&umem->ctx_list))
+ return NULL;
+
+ list_for_each_entry(ctx, &umem->ctx_list, list) {
+ if (ctx->ifindex == ifindex && ctx->queue_id == queue_id) {
+ ctx->refcount++;
+ return ctx;
+ }
+ }
+
+ return NULL;
+}
+
+static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap)
+{
+ struct xsk_umem *umem = ctx->umem;
+ struct xdp_mmap_offsets off;
+ int err;
+
+ if (--ctx->refcount)
+ return;
+
+ if (!unmap)
+ goto out_free;
+
+ err = xsk_get_mmap_offsets(umem->fd, &off);
+ if (err)
+ goto out_free;
+
+ munmap(ctx->fill->ring - off.fr.desc, off.fr.desc + umem->config.fill_size *
+ sizeof(__u64));
+ munmap(ctx->comp->ring - off.cr.desc, off.cr.desc + umem->config.comp_size *
+ sizeof(__u64));
+
+out_free:
+ list_del(&ctx->list);
+ free(ctx);
+}
+
+static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk,
+ struct xsk_umem *umem, int ifindex,
+ const char *ifname, __u32 queue_id,
+ struct xsk_ring_prod *fill,
+ struct xsk_ring_cons *comp)
+{
+ struct xsk_ctx *ctx;
+ int err;
+
+ ctx = calloc(1, sizeof(*ctx));
+ if (!ctx)
+ return NULL;
+
+ if (!umem->fill_save) {
+ err = xsk_create_umem_rings(umem, xsk->fd, fill, comp);
+ if (err) {
+ free(ctx);
+ return NULL;
+ }
+ } else if (umem->fill_save != fill || umem->comp_save != comp) {
+ /* Copy over rings to new structs. */
+ memcpy(fill, umem->fill_save, sizeof(*fill));
+ memcpy(comp, umem->comp_save, sizeof(*comp));
+ }
+
+ ctx->ifindex = ifindex;
+ ctx->refcount = 1;
+ ctx->umem = umem;
+ ctx->queue_id = queue_id;
+ libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
+
+ ctx->fill = fill;
+ ctx->comp = comp;
+ list_add(&ctx->list, &umem->ctx_list);
+ ctx->has_bpf_link = xsk_probe_bpf_link();
+ return ctx;
+}
+
+static void xsk_destroy_xsk_struct(struct xsk_socket *xsk)
+{
+ free(xsk->ctx);
+ free(xsk);
+}
+
+int xsk_socket__update_xskmap(struct xsk_socket *xsk, int fd)
+{
+ xsk->ctx->xsks_map_fd = fd;
+ return xsk_set_bpf_maps(xsk);
+}
+
+int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd)
+{
+ struct xsk_socket *xsk;
+ int res;
+
+ xsk = calloc(1, sizeof(*xsk));
+ if (!xsk)
+ return -ENOMEM;
+
+ res = xsk_create_xsk_struct(ifindex, xsk);
+ if (res) {
+ free(xsk);
+ return -EINVAL;
+ }
+
+ res = __xsk_setup_xdp_prog(xsk, xsks_map_fd);
+
+ xsk_destroy_xsk_struct(xsk);
+
+ return res;
+}
+
+int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
+ const char *ifname,
+ __u32 queue_id, struct xsk_umem *umem,
+ struct xsk_ring_cons *rx,
+ struct xsk_ring_prod *tx,
+ struct xsk_ring_prod *fill,
+ struct xsk_ring_cons *comp,
+ const struct xsk_socket_config *usr_config)
+{
+ bool unmap, rx_setup_done = false, tx_setup_done = false;
+ void *rx_map = NULL, *tx_map = NULL;
+ struct sockaddr_xdp sxdp = {};
+ struct xdp_mmap_offsets off;
+ struct xsk_socket *xsk;
+ struct xsk_ctx *ctx;
+ int err, ifindex;
+
+ if (!umem || !xsk_ptr || !(rx || tx))
+ return -EFAULT;
+
+ unmap = umem->fill_save != fill;
+
+ xsk = calloc(1, sizeof(*xsk));
+ if (!xsk)
+ return -ENOMEM;
+
+ err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
+ if (err)
+ goto out_xsk_alloc;
+
+ xsk->outstanding_tx = 0;
+ ifindex = if_nametoindex(ifname);
+ if (!ifindex) {
+ err = -errno;
+ goto out_xsk_alloc;
+ }
+
+ if (umem->refcount++ > 0) {
+ xsk->fd = socket(AF_XDP, SOCK_RAW | SOCK_CLOEXEC, 0);
+ if (xsk->fd < 0) {
+ err = -errno;
+ goto out_xsk_alloc;
+ }
+ } else {
+ xsk->fd = umem->fd;
+ rx_setup_done = umem->rx_ring_setup_done;
+ tx_setup_done = umem->tx_ring_setup_done;
+ }
+
+ ctx = xsk_get_ctx(umem, ifindex, queue_id);
+ if (!ctx) {
+ if (!fill || !comp) {
+ err = -EFAULT;
+ goto out_socket;
+ }
+
+ ctx = xsk_create_ctx(xsk, umem, ifindex, ifname, queue_id,
+ fill, comp);
+ if (!ctx) {
+ err = -ENOMEM;
+ goto out_socket;
+ }
+ }
+ xsk->ctx = ctx;
+
+ if (rx && !rx_setup_done) {
+ err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
+ &xsk->config.rx_size,
+ sizeof(xsk->config.rx_size));
+ if (err) {
+ err = -errno;
+ goto out_put_ctx;
+ }
+ if (xsk->fd == umem->fd)
+ umem->rx_ring_setup_done = true;
+ }
+ if (tx && !tx_setup_done) {
+ err = setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING,
+ &xsk->config.tx_size,
+ sizeof(xsk->config.tx_size));
+ if (err) {
+ err = -errno;
+ goto out_put_ctx;
+ }
+ if (xsk->fd == umem->fd)
+ umem->tx_ring_setup_done = true;
+ }
+
+ err = xsk_get_mmap_offsets(xsk->fd, &off);
+ if (err) {
+ err = -errno;
+ goto out_put_ctx;
+ }
+
+ if (rx) {
+ rx_map = mmap(NULL, off.rx.desc +
+ xsk->config.rx_size * sizeof(struct xdp_desc),
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
+ xsk->fd, XDP_PGOFF_RX_RING);
+ if (rx_map == MAP_FAILED) {
+ err = -errno;
+ goto out_put_ctx;
+ }
+
+ rx->mask = xsk->config.rx_size - 1;
+ rx->size = xsk->config.rx_size;
+ rx->producer = rx_map + off.rx.producer;
+ rx->consumer = rx_map + off.rx.consumer;
+ rx->flags = rx_map + off.rx.flags;
+ rx->ring = rx_map + off.rx.desc;
+ rx->cached_prod = *rx->producer;
+ rx->cached_cons = *rx->consumer;
+ }
+ xsk->rx = rx;
+
+ if (tx) {
+ tx_map = mmap(NULL, off.tx.desc +
+ xsk->config.tx_size * sizeof(struct xdp_desc),
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
+ xsk->fd, XDP_PGOFF_TX_RING);
+ if (tx_map == MAP_FAILED) {
+ err = -errno;
+ goto out_mmap_rx;
+ }
+
+ tx->mask = xsk->config.tx_size - 1;
+ tx->size = xsk->config.tx_size;
+ tx->producer = tx_map + off.tx.producer;
+ tx->consumer = tx_map + off.tx.consumer;
+ tx->flags = tx_map + off.tx.flags;
+ tx->ring = tx_map + off.tx.desc;
+ tx->cached_prod = *tx->producer;
+ /* cached_cons is r->size bigger than the real consumer pointer
+ * See xsk_prod_nb_free
+ */
+ tx->cached_cons = *tx->consumer + xsk->config.tx_size;
+ }
+ xsk->tx = tx;
+
+ sxdp.sxdp_family = PF_XDP;
+ sxdp.sxdp_ifindex = ctx->ifindex;
+ sxdp.sxdp_queue_id = ctx->queue_id;
+ if (umem->refcount > 1) {
+ sxdp.sxdp_flags |= XDP_SHARED_UMEM;
+ sxdp.sxdp_shared_umem_fd = umem->fd;
+ } else {
+ sxdp.sxdp_flags = xsk->config.bind_flags;
+ }
+
+ err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
+ if (err) {
+ err = -errno;
+ goto out_mmap_tx;
+ }
+
+ if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
+ err = __xsk_setup_xdp_prog(xsk, NULL);
+ if (err)
+ goto out_mmap_tx;
+ }
+
+ *xsk_ptr = xsk;
+ umem->fill_save = NULL;
+ umem->comp_save = NULL;
+ return 0;
+
+out_mmap_tx:
+ if (tx)
+ munmap(tx_map, off.tx.desc +
+ xsk->config.tx_size * sizeof(struct xdp_desc));
+out_mmap_rx:
+ if (rx)
+ munmap(rx_map, off.rx.desc +
+ xsk->config.rx_size * sizeof(struct xdp_desc));
+out_put_ctx:
+ xsk_put_ctx(ctx, unmap);
+out_socket:
+ if (--umem->refcount)
+ close(xsk->fd);
+out_xsk_alloc:
+ free(xsk);
+ return err;
+}
+
+int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
+ __u32 queue_id, struct xsk_umem *umem,
+ struct xsk_ring_cons *rx, struct xsk_ring_prod *tx,
+ const struct xsk_socket_config *usr_config)
+{
+ if (!umem)
+ return -EFAULT;
+
+ return xsk_socket__create_shared(xsk_ptr, ifname, queue_id, umem,
+ rx, tx, umem->fill_save,
+ umem->comp_save, usr_config);
+}
+
+int xsk_umem__delete(struct xsk_umem *umem)
+{
+ struct xdp_mmap_offsets off;
+ int err;
+
+ if (!umem)
+ return 0;
+
+ if (umem->refcount)
+ return -EBUSY;
+
+ err = xsk_get_mmap_offsets(umem->fd, &off);
+ if (!err && umem->fill_save && umem->comp_save) {
+ munmap(umem->fill_save->ring - off.fr.desc,
+ off.fr.desc + umem->config.fill_size * sizeof(__u64));
+ munmap(umem->comp_save->ring - off.cr.desc,
+ off.cr.desc + umem->config.comp_size * sizeof(__u64));
+ }
+
+ close(umem->fd);
+ free(umem);
+
+ return 0;
+}
+
+void xsk_socket__delete(struct xsk_socket *xsk)
+{
+ size_t desc_sz = sizeof(struct xdp_desc);
+ struct xdp_mmap_offsets off;
+ struct xsk_umem *umem;
+ struct xsk_ctx *ctx;
+ int err;
+
+ if (!xsk)
+ return;
+
+ ctx = xsk->ctx;
+ umem = ctx->umem;
+
+ xsk_put_ctx(ctx, true);
+
+ if (!ctx->refcount) {
+ xsk_delete_bpf_maps(xsk);
+ close(ctx->prog_fd);
+ if (ctx->has_bpf_link)
+ close(ctx->link_fd);
+ }
+
+ err = xsk_get_mmap_offsets(xsk->fd, &off);
+ if (!err) {
+ if (xsk->rx) {
+ munmap(xsk->rx->ring - off.rx.desc,
+ off.rx.desc + xsk->config.rx_size * desc_sz);
+ }
+ if (xsk->tx) {
+ munmap(xsk->tx->ring - off.tx.desc,
+ off.tx.desc + xsk->config.tx_size * desc_sz);
+ }
+ }
+
+ umem->refcount--;
+ /* Do not close an fd that also has an associated umem connected
+ * to it.
+ */
+ if (xsk->fd != umem->fd)
+ close(xsk->fd);
+ free(xsk);
+}
diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h
new file mode 100644
index 000000000000..997723b0bfb2
--- /dev/null
+++ b/tools/testing/selftests/bpf/xsk.h
@@ -0,0 +1,316 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+
+/*
+ * AF_XDP user-space access library.
+ *
+ * Copyright (c) 2018 - 2019 Intel Corporation.
+ * Copyright (c) 2019 Facebook
+ *
+ * Author(s): Magnus Karlsson <magnus.karlsson@intel.com>
+ */
+
+#ifndef __XSK_H
+#define __XSK_H
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <linux/if_xdp.h>
+
+#include <bpf/libbpf.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* This whole API has been deprecated and moved to libxdp that can be found at
+ * https://github.com/xdp-project/xdp-tools. The APIs are exactly the same so
+ * it should just be linking with libxdp instead of libbpf for this set of
+ * functionality. If not, please submit a bug report on the aforementioned page.
+ */
+
+/* Load-Acquire Store-Release barriers used by the XDP socket
+ * library. The following macros should *NOT* be considered part of
+ * the xsk.h API, and is subject to change anytime.
+ *
+ * LIBRARY INTERNAL
+ */
+
+#define __XSK_READ_ONCE(x) (*(volatile typeof(x) *)&x)
+#define __XSK_WRITE_ONCE(x, v) (*(volatile typeof(x) *)&x) = (v)
+
+#if defined(__i386__) || defined(__x86_64__)
+# define libbpf_smp_store_release(p, v) \
+ do { \
+ asm volatile("" : : : "memory"); \
+ __XSK_WRITE_ONCE(*p, v); \
+ } while (0)
+# define libbpf_smp_load_acquire(p) \
+ ({ \
+ typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
+ asm volatile("" : : : "memory"); \
+ ___p1; \
+ })
+#elif defined(__aarch64__)
+# define libbpf_smp_store_release(p, v) \
+ asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory")
+# define libbpf_smp_load_acquire(p) \
+ ({ \
+ typeof(*p) ___p1; \
+ asm volatile ("ldar %w0, %1" \
+ : "=r" (___p1) : "Q" (*p) : "memory"); \
+ ___p1; \
+ })
+#elif defined(__riscv)
+# define libbpf_smp_store_release(p, v) \
+ do { \
+ asm volatile ("fence rw,w" : : : "memory"); \
+ __XSK_WRITE_ONCE(*p, v); \
+ } while (0)
+# define libbpf_smp_load_acquire(p) \
+ ({ \
+ typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
+ asm volatile ("fence r,rw" : : : "memory"); \
+ ___p1; \
+ })
+#endif
+
+#ifndef libbpf_smp_store_release
+#define libbpf_smp_store_release(p, v) \
+ do { \
+ __sync_synchronize(); \
+ __XSK_WRITE_ONCE(*p, v); \
+ } while (0)
+#endif
+
+#ifndef libbpf_smp_load_acquire
+#define libbpf_smp_load_acquire(p) \
+ ({ \
+ typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
+ __sync_synchronize(); \
+ ___p1; \
+ })
+#endif
+
+/* LIBRARY INTERNAL -- END */
+
+/* Do not access these members directly. Use the functions below. */
+#define DEFINE_XSK_RING(name) \
+struct name { \
+ __u32 cached_prod; \
+ __u32 cached_cons; \
+ __u32 mask; \
+ __u32 size; \
+ __u32 *producer; \
+ __u32 *consumer; \
+ void *ring; \
+ __u32 *flags; \
+}
+
+DEFINE_XSK_RING(xsk_ring_prod);
+DEFINE_XSK_RING(xsk_ring_cons);
+
+/* For a detailed explanation on the memory barriers associated with the
+ * ring, please take a look at net/xdp/xsk_queue.h.
+ */
+
+struct xsk_umem;
+struct xsk_socket;
+
+static inline __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill,
+ __u32 idx)
+{
+ __u64 *addrs = (__u64 *)fill->ring;
+
+ return &addrs[idx & fill->mask];
+}
+
+static inline const __u64 *
+xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx)
+{
+ const __u64 *addrs = (const __u64 *)comp->ring;
+
+ return &addrs[idx & comp->mask];
+}
+
+static inline struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx,
+ __u32 idx)
+{
+ struct xdp_desc *descs = (struct xdp_desc *)tx->ring;
+
+ return &descs[idx & tx->mask];
+}
+
+static inline const struct xdp_desc *
+xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx)
+{
+ const struct xdp_desc *descs = (const struct xdp_desc *)rx->ring;
+
+ return &descs[idx & rx->mask];
+}
+
+static inline int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r)
+{
+ return *r->flags & XDP_RING_NEED_WAKEUP;
+}
+
+static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
+{
+ __u32 free_entries = r->cached_cons - r->cached_prod;
+
+ if (free_entries >= nb)
+ return free_entries;
+
+ /* Refresh the local tail pointer.
+ * cached_cons is r->size bigger than the real consumer pointer so
+ * that this addition can be avoided in the more frequently
+ * executed code that computs free_entries in the beginning of
+ * this function. Without this optimization it whould have been
+ * free_entries = r->cached_prod - r->cached_cons + r->size.
+ */
+ r->cached_cons = libbpf_smp_load_acquire(r->consumer);
+ r->cached_cons += r->size;
+
+ return r->cached_cons - r->cached_prod;
+}
+
+static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
+{
+ __u32 entries = r->cached_prod - r->cached_cons;
+
+ if (entries == 0) {
+ r->cached_prod = libbpf_smp_load_acquire(r->producer);
+ entries = r->cached_prod - r->cached_cons;
+ }
+
+ return (entries > nb) ? nb : entries;
+}
+
+static inline __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx)
+{
+ if (xsk_prod_nb_free(prod, nb) < nb)
+ return 0;
+
+ *idx = prod->cached_prod;
+ prod->cached_prod += nb;
+
+ return nb;
+}
+
+static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
+{
+ /* Make sure everything has been written to the ring before indicating
+ * this to the kernel by writing the producer pointer.
+ */
+ libbpf_smp_store_release(prod->producer, *prod->producer + nb);
+}
+
+static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
+{
+ __u32 entries = xsk_cons_nb_avail(cons, nb);
+
+ if (entries > 0) {
+ *idx = cons->cached_cons;
+ cons->cached_cons += entries;
+ }
+
+ return entries;
+}
+
+static inline void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
+{
+ cons->cached_cons -= nb;
+}
+
+static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
+{
+ /* Make sure data has been read before indicating we are done
+ * with the entries by updating the consumer pointer.
+ */
+ libbpf_smp_store_release(cons->consumer, *cons->consumer + nb);
+
+}
+
+static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
+{
+ return &((char *)umem_area)[addr];
+}
+
+static inline __u64 xsk_umem__extract_addr(__u64 addr)
+{
+ return addr & XSK_UNALIGNED_BUF_ADDR_MASK;
+}
+
+static inline __u64 xsk_umem__extract_offset(__u64 addr)
+{
+ return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT;
+}
+
+static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
+{
+ return xsk_umem__extract_addr(addr) + xsk_umem__extract_offset(addr);
+}
+
+int xsk_umem__fd(const struct xsk_umem *umem);
+int xsk_socket__fd(const struct xsk_socket *xsk);
+
+#define XSK_RING_CONS__DEFAULT_NUM_DESCS 2048
+#define XSK_RING_PROD__DEFAULT_NUM_DESCS 2048
+#define XSK_UMEM__DEFAULT_FRAME_SHIFT 12 /* 4096 bytes */
+#define XSK_UMEM__DEFAULT_FRAME_SIZE (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
+#define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0
+#define XSK_UMEM__DEFAULT_FLAGS 0
+
+struct xsk_umem_config {
+ __u32 fill_size;
+ __u32 comp_size;
+ __u32 frame_size;
+ __u32 frame_headroom;
+ __u32 flags;
+};
+
+int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd);
+int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
+int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
+
+/* Flags for the libbpf_flags field. */
+#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)
+
+struct xsk_socket_config {
+ __u32 rx_size;
+ __u32 tx_size;
+ __u32 libbpf_flags;
+ __u32 xdp_flags;
+ __u16 bind_flags;
+};
+
+/* Set config to NULL to get the default configuration. */
+int xsk_umem__create(struct xsk_umem **umem,
+ void *umem_area, __u64 size,
+ struct xsk_ring_prod *fill,
+ struct xsk_ring_cons *comp,
+ const struct xsk_umem_config *config);
+int xsk_socket__create(struct xsk_socket **xsk,
+ const char *ifname, __u32 queue_id,
+ struct xsk_umem *umem,
+ struct xsk_ring_cons *rx,
+ struct xsk_ring_prod *tx,
+ const struct xsk_socket_config *config);
+int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
+ const char *ifname,
+ __u32 queue_id, struct xsk_umem *umem,
+ struct xsk_ring_cons *rx,
+ struct xsk_ring_prod *tx,
+ struct xsk_ring_prod *fill,
+ struct xsk_ring_cons *comp,
+ const struct xsk_socket_config *config);
+
+/* Returns 0 for success and -EBUSY if the umem is still in use. */
+int xsk_umem__delete(struct xsk_umem *umem);
+void xsk_socket__delete(struct xsk_socket *xsk);
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif /* __XSK_H */
diff --git a/tools/testing/selftests/bpf/xsk_prereqs.sh b/tools/testing/selftests/bpf/xsk_prereqs.sh
index 684e813803ec..a0b71723a818 100755
--- a/tools/testing/selftests/bpf/xsk_prereqs.sh
+++ b/tools/testing/selftests/bpf/xsk_prereqs.sh
@@ -8,7 +8,7 @@ ksft_xfail=2
ksft_xpass=3
ksft_skip=4
-XSKOBJ=xdpxceiver
+XSKOBJ=xskxceiver
validate_root_exec()
{
@@ -77,7 +77,7 @@ validate_ip_utility()
[ ! $(type -P ip) ] && { echo "'ip' not found. Skipping tests."; test_exit $ksft_skip; }
}
-execxdpxceiver()
+exec_xskxceiver()
{
if [[ $busy_poll -eq 1 ]]; then
ARGS+="-b "
diff --git a/tools/testing/selftests/bpf/xdpxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index e5992a6b5e09..74d56d971baf 100644
--- a/tools/testing/selftests/bpf/xdpxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -97,12 +97,12 @@
#include <time.h>
#include <unistd.h>
#include <stdatomic.h>
-#include <bpf/xsk.h>
-#include "xdpxceiver.h"
+#include "xsk.h"
+#include "xskxceiver.h"
#include "../kselftest.h"
/* AF_XDP APIs were moved into libxdp and marked as deprecated in libbpf.
- * Until xdpxceiver is either moved or re-writed into libxdp, suppress
+ * Until xskxceiver is either moved or re-writed into libxdp, suppress
* deprecation warnings in this file
*/
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
@@ -1085,6 +1085,7 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
{
u64 umem_sz = ifobject->umem->num_frames * ifobject->umem->frame_size;
int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
+ LIBBPF_OPTS(bpf_xdp_query_opts, opts);
int ret, ifindex;
void *bufs;
u32 i;
@@ -1130,10 +1131,26 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
if (!ifindex)
exit_with_error(errno);
- ret = xsk_setup_xdp_prog(ifindex, &ifobject->xsk_map_fd);
+ ret = xsk_setup_xdp_prog_xsk(ifobject->xsk->xsk, &ifobject->xsk_map_fd);
if (ret)
exit_with_error(-ret);
+ ret = bpf_xdp_query(ifindex, ifobject->xdp_flags, &opts);
+ if (ret)
+ exit_with_error(-ret);
+
+ if (ifobject->xdp_flags & XDP_FLAGS_SKB_MODE) {
+ if (opts.attach_mode != XDP_ATTACHED_SKB) {
+ ksft_print_msg("ERROR: [%s] XDP prog not in SKB mode\n");
+ exit_with_error(-EINVAL);
+ }
+ } else if (ifobject->xdp_flags & XDP_FLAGS_DRV_MODE) {
+ if (opts.attach_mode != XDP_ATTACHED_DRV) {
+ ksft_print_msg("ERROR: [%s] XDP prog not in DRV mode\n");
+ exit_with_error(-EINVAL);
+ }
+ }
+
ret = xsk_socket__update_xskmap(ifobject->xsk->xsk, ifobject->xsk_map_fd);
if (ret)
exit_with_error(-ret);
diff --git a/tools/testing/selftests/bpf/xdpxceiver.h b/tools/testing/selftests/bpf/xskxceiver.h
index 8f672b0fe0e1..3d17053f98e5 100644
--- a/tools/testing/selftests/bpf/xdpxceiver.h
+++ b/tools/testing/selftests/bpf/xskxceiver.h
@@ -2,8 +2,8 @@
* Copyright(c) 2020 Intel Corporation.
*/
-#ifndef XDPXCEIVER_H_
-#define XDPXCEIVER_H_
+#ifndef XSKXCEIVER_H_
+#define XSKXCEIVER_H_
#ifndef SOL_XDP
#define SOL_XDP 283
@@ -169,4 +169,4 @@ pthread_cond_t pacing_cond = PTHREAD_COND_INITIALIZER;
int pkts_in_flight;
-#endif /* XDPXCEIVER_H */
+#endif /* XSKXCEIVER_H_ */
diff --git a/tools/testing/selftests/drivers/net/dsa/Makefile b/tools/testing/selftests/drivers/net/dsa/Makefile
new file mode 100644
index 000000000000..2a731d5c6d85
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/dsa/Makefile
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0+ OR MIT
+
+TEST_PROGS = bridge_locked_port.sh \
+ bridge_mdb.sh \
+ bridge_mld.sh \
+ bridge_vlan_aware.sh \
+ bridge_vlan_mcast.sh \
+ bridge_vlan_unaware.sh \
+ local_termination.sh \
+ no_forwarding.sh \
+ test_bridge_fdb_stress.sh
+
+TEST_PROGS_EXTENDED := lib.sh
+
+TEST_FILES := forwarding.config
+
+include ../../../lib.mk
diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_linecard.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_linecard.sh
index 08a922d8b86a..224ca3695c89 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/devlink_linecard.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_linecard.sh
@@ -84,6 +84,13 @@ lc_wait_until_port_count_is()
busywait "$timeout" until_lc_port_count_is "$port_count" lc_port_count_get "$lc"
}
+lc_nested_devlink_dev_get()
+{
+ local lc=$1
+
+ devlink lc show $DEVLINK_DEV lc $lc -j | jq -e -r ".[][][].nested_devlink"
+}
+
PROV_UNPROV_TIMEOUT=8000 # ms
POST_PROV_ACT_TIMEOUT=2000 # ms
PROV_PORTS_INSTANTIATION_TIMEOUT=15000 # ms
@@ -191,12 +198,30 @@ ports_check()
check_err $? "Unexpected port count linecard $lc (got $port_count, expected $expected_port_count)"
}
+lc_dev_info_provisioned_check()
+{
+ local lc=$1
+ local nested_devlink_dev=$2
+ local fixed_hw_revision
+ local running_ini_version
+
+ fixed_hw_revision=$(devlink dev info $nested_devlink_dev -j | \
+ jq -e -r '.[][].versions.fixed."hw.revision"')
+ check_err $? "Failed to get linecard $lc fixed.hw.revision"
+ log_info "Linecard $lc fixed.hw.revision: \"$fixed_hw_revision\""
+ running_ini_version=$(devlink dev info $nested_devlink_dev -j | \
+ jq -e -r '.[][].versions.running."ini.version"')
+ check_err $? "Failed to get linecard $lc running.ini.version"
+ log_info "Linecard $lc running.ini.version: \"$running_ini_version\""
+}
+
provision_test()
{
RET=0
local lc
local type
local state
+ local nested_devlink_dev
lc=$LC_SLOT
supported_types_check $lc
@@ -207,6 +232,11 @@ provision_test()
fi
provision_one $lc $LC_16X100G_TYPE
ports_check $lc $LC_16X100G_PORT_COUNT
+
+ nested_devlink_dev=$(lc_nested_devlink_dev_get $lc)
+ check_err $? "Failed to get nested devlink handle of linecard $lc"
+ lc_dev_info_provisioned_check $lc $nested_devlink_dev
+
log_test "Provision"
}
@@ -220,12 +250,32 @@ interface_check()
setup_wait
}
+lc_dev_info_active_check()
+{
+ local lc=$1
+ local nested_devlink_dev=$2
+ local fixed_device_fw_psid
+ local running_device_fw
+
+ fixed_device_fw_psid=$(devlink dev info $nested_devlink_dev -j | \
+ jq -e -r ".[][].versions.fixed" | \
+ jq -e -r '."fw.psid"')
+ check_err $? "Failed to get linecard $lc fixed fw PSID"
+ log_info "Linecard $lc fixed.fw.psid: \"$fixed_device_fw_psid\""
+
+ running_device_fw=$(devlink dev info $nested_devlink_dev -j | \
+ jq -e -r ".[][].versions.running.fw")
+ check_err $? "Failed to get linecard $lc running.fw.version"
+ log_info "Linecard $lc running.fw: \"$running_device_fw\""
+}
+
activation_16x100G_test()
{
RET=0
local lc
local type
local state
+ local nested_devlink_dev
lc=$LC_SLOT
type=$LC_16X100G_TYPE
@@ -238,6 +288,10 @@ activation_16x100G_test()
interface_check
+ nested_devlink_dev=$(lc_nested_devlink_dev_get $lc)
+ check_err $? "Failed to get nested devlink handle of linecard $lc"
+ lc_dev_info_active_check $lc $nested_devlink_dev
+
log_test "Activation 16x100G"
}
diff --git a/tools/testing/selftests/drivers/net/mlxsw/rif_counter_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/rif_counter_scale.sh
new file mode 100644
index 000000000000..a43a9926e690
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/mlxsw/rif_counter_scale.sh
@@ -0,0 +1,107 @@
+# SPDX-License-Identifier: GPL-2.0
+
+RIF_COUNTER_NUM_NETIFS=2
+
+rif_counter_addr4()
+{
+ local i=$1; shift
+ local p=$1; shift
+
+ printf 192.0.%d.%d $((i / 64)) $(((4 * i % 256) + p))
+}
+
+rif_counter_addr4pfx()
+{
+ rif_counter_addr4 $@
+ printf /30
+}
+
+rif_counter_h1_create()
+{
+ simple_if_init $h1
+}
+
+rif_counter_h1_destroy()
+{
+ simple_if_fini $h1
+}
+
+rif_counter_h2_create()
+{
+ simple_if_init $h2
+}
+
+rif_counter_h2_destroy()
+{
+ simple_if_fini $h2
+}
+
+rif_counter_setup_prepare()
+{
+ h1=${NETIFS[p1]}
+ h2=${NETIFS[p2]}
+
+ vrf_prepare
+
+ rif_counter_h1_create
+ rif_counter_h2_create
+}
+
+rif_counter_cleanup()
+{
+ local count=$1; shift
+
+ pre_cleanup
+
+ for ((i = 1; i <= count; i++)); do
+ vlan_destroy $h2 $i
+ done
+
+ rif_counter_h2_destroy
+ rif_counter_h1_destroy
+
+ vrf_cleanup
+
+ if [[ -v RIF_COUNTER_BATCH_FILE ]]; then
+ rm -f $RIF_COUNTER_BATCH_FILE
+ fi
+}
+
+
+rif_counter_test()
+{
+ local count=$1; shift
+ local should_fail=$1; shift
+
+ RIF_COUNTER_BATCH_FILE="$(mktemp)"
+
+ for ((i = 1; i <= count; i++)); do
+ vlan_create $h2 $i v$h2 $(rif_counter_addr4pfx $i 2)
+ done
+ for ((i = 1; i <= count; i++)); do
+ cat >> $RIF_COUNTER_BATCH_FILE <<-EOF
+ stats set dev $h2.$i l3_stats on
+ EOF
+ done
+
+ ip -b $RIF_COUNTER_BATCH_FILE
+ check_err_fail $should_fail $? "RIF counter enablement"
+}
+
+rif_counter_traffic_test()
+{
+ local count=$1; shift
+ local i;
+
+ for ((i = count; i > 0; i /= 2)); do
+ $MZ $h1 -Q $i -c 1 -d 20msec -p 100 -a own -b $(mac_get $h2) \
+ -A $(rif_counter_addr4 $i 1) \
+ -B $(rif_counter_addr4 $i 2) \
+ -q -t udp sp=54321,dp=12345
+ done
+ for ((i = count; i > 0; i /= 2)); do
+ busywait "$TC_HIT_TIMEOUT" until_counter_is "== 1" \
+ hw_stats_get l3_stats $h2.$i rx packets > /dev/null
+ check_err $? "Traffic not seen at RIF $h2.$i"
+ done
+}
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/resource_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/resource_scale.sh
index e9f65bd2e299..688338bbeb97 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/resource_scale.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/resource_scale.sh
@@ -25,7 +25,16 @@ cleanup()
trap cleanup EXIT
-ALL_TESTS="router tc_flower mirror_gre tc_police port rif_mac_profile"
+ALL_TESTS="
+ router
+ tc_flower
+ mirror_gre
+ tc_police
+ port
+ rif_mac_profile
+ rif_counter
+"
+
for current_test in ${TESTS:-$ALL_TESTS}; do
RET_FIN=0
source ${current_test}_scale.sh
@@ -36,16 +45,32 @@ for current_test in ${TESTS:-$ALL_TESTS}; do
for should_fail in 0 1; do
RET=0
target=$(${current_test}_get_target "$should_fail")
+ if ((target == 0)); then
+ log_test_skip "'$current_test' should_fail=$should_fail test"
+ continue
+ fi
+
${current_test}_setup_prepare
setup_wait $num_netifs
+ # Update target in case occupancy of a certain resource changed
+ # following the test setup.
+ target=$(${current_test}_get_target "$should_fail")
${current_test}_test "$target" "$should_fail"
- ${current_test}_cleanup
- devlink_reload
if [[ "$should_fail" -eq 0 ]]; then
log_test "'$current_test' $target"
+
+ if ((!RET)); then
+ tt=${current_test}_traffic_test
+ if [[ $(type -t $tt) == "function" ]]; then
+ $tt "$target"
+ log_test "'$current_test' $target traffic test"
+ fi
+ fi
else
log_test "'$current_test' overflow $target"
fi
+ ${current_test}_cleanup $target
+ devlink_reload
RET_FIN=$(( RET_FIN || RET ))
done
done
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/rif_counter_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/rif_counter_scale.sh
new file mode 120000
index 000000000000..1f5752e8ffc0
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/rif_counter_scale.sh
@@ -0,0 +1 @@
+../spectrum/rif_counter_scale.sh \ No newline at end of file
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower_scale.sh
index efd798a85931..4444bbace1a9 100644
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower_scale.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower_scale.sh
@@ -4,17 +4,22 @@ source ../tc_flower_scale.sh
tc_flower_get_target()
{
local should_fail=$1; shift
+ local max_cnts
# The driver associates a counter with each tc filter, which means the
# number of supported filters is bounded by the number of available
# counters.
- # Currently, the driver supports 30K (30,720) flow counters and six of
- # these are used for multicast routing.
- local target=30714
+ max_cnts=$(devlink_resource_size_get counters flow)
+
+ # Remove already allocated counters.
+ ((max_cnts -= $(devlink_resource_occ_get counters flow)))
+
+ # Each rule uses two counters, for packets and bytes.
+ ((max_cnts /= 2))
if ((! should_fail)); then
- echo $target
+ echo $max_cnts
else
- echo $((target + 1))
+ echo $((max_cnts + 1))
fi
}
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum/resource_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum/resource_scale.sh
index dea33dc93790..95d9f710a630 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum/resource_scale.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum/resource_scale.sh
@@ -22,7 +22,16 @@ cleanup()
devlink_sp_read_kvd_defaults
trap cleanup EXIT
-ALL_TESTS="router tc_flower mirror_gre tc_police port rif_mac_profile"
+ALL_TESTS="
+ router
+ tc_flower
+ mirror_gre
+ tc_police
+ port
+ rif_mac_profile
+ rif_counter
+"
+
for current_test in ${TESTS:-$ALL_TESTS}; do
RET_FIN=0
source ${current_test}_scale.sh
@@ -41,15 +50,31 @@ for current_test in ${TESTS:-$ALL_TESTS}; do
for should_fail in 0 1; do
RET=0
target=$(${current_test}_get_target "$should_fail")
+ if ((target == 0)); then
+ log_test_skip "'$current_test' [$profile] should_fail=$should_fail test"
+ continue
+ fi
${current_test}_setup_prepare
setup_wait $num_netifs
+ # Update target in case occupancy of a certain resource
+ # changed following the test setup.
+ target=$(${current_test}_get_target "$should_fail")
${current_test}_test "$target" "$should_fail"
- ${current_test}_cleanup
if [[ "$should_fail" -eq 0 ]]; then
log_test "'$current_test' [$profile] $target"
+
+ if ((!RET)); then
+ tt=${current_test}_traffic_test
+ if [[ $(type -t $tt) == "function" ]]
+ then
+ $tt "$target"
+ log_test "'$current_test' [$profile] $target traffic test"
+ fi
+ fi
else
log_test "'$current_test' [$profile] overflow $target"
fi
+ ${current_test}_cleanup $target
RET_FIN=$(( RET_FIN || RET ))
done
done
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum/rif_counter_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum/rif_counter_scale.sh
new file mode 100644
index 000000000000..d44536276e8a
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum/rif_counter_scale.sh
@@ -0,0 +1,34 @@
+# SPDX-License-Identifier: GPL-2.0
+source ../rif_counter_scale.sh
+
+rif_counter_get_target()
+{
+ local should_fail=$1; shift
+ local max_cnts
+ local max_rifs
+ local target
+
+ max_rifs=$(devlink_resource_size_get rifs)
+ max_cnts=$(devlink_resource_size_get counters rif)
+
+ # Remove already allocated RIFs.
+ ((max_rifs -= $(devlink_resource_occ_get rifs)))
+
+ # 10 KVD slots per counter, ingress+egress counters per RIF
+ ((max_cnts /= 20))
+
+ # Pointless to run the overflow test if we don't have enough RIFs to
+ # host all the counters.
+ if ((max_cnts > max_rifs && should_fail)); then
+ echo 0
+ return
+ fi
+
+ target=$((max_rifs < max_cnts ? max_rifs : max_cnts))
+
+ if ((! should_fail)); then
+ echo $target
+ else
+ echo $((target + 1))
+ fi
+}
diff --git a/tools/testing/selftests/drivers/net/mlxsw/tc_flower_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/tc_flower_scale.sh
index aa74be9f47c8..d3d9e60d6ddf 100644
--- a/tools/testing/selftests/drivers/net/mlxsw/tc_flower_scale.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/tc_flower_scale.sh
@@ -77,6 +77,7 @@ tc_flower_rules_create()
filter add dev $h2 ingress \
prot ipv6 \
pref 1000 \
+ handle 42$i \
flower $tcflags dst_ip $(tc_flower_addr $i) \
action drop
EOF
@@ -121,3 +122,19 @@ tc_flower_test()
tcflags="skip_sw"
__tc_flower_test $count $should_fail
}
+
+tc_flower_traffic_test()
+{
+ local count=$1; shift
+ local i;
+
+ for ((i = count - 1; i > 0; i /= 2)); do
+ $MZ -6 $h1 -c 1 -d 20msec -p 100 -a own -b $(mac_get $h2) \
+ -A $(tc_flower_addr 0) -B $(tc_flower_addr $i) \
+ -q -t udp sp=54321,dp=12345
+ done
+ for ((i = count - 1; i > 0; i /= 2)); do
+ tc_check_packets "dev $h2 ingress" 42$i 1
+ check_err $? "Traffic not seen at rule #$i"
+ done
+}
diff --git a/tools/testing/selftests/drivers/net/netdevsim/fib.sh b/tools/testing/selftests/drivers/net/netdevsim/fib.sh
index fc794cd30389..6800de816e8b 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/fib.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/fib.sh
@@ -16,6 +16,7 @@ ALL_TESTS="
ipv4_replay
ipv4_flush
ipv4_error_path
+ ipv4_delete_fail
ipv6_add
ipv6_metric
ipv6_append_single
@@ -29,11 +30,13 @@ ALL_TESTS="
ipv6_replay_single
ipv6_replay_multipath
ipv6_error_path
+ ipv6_delete_fail
"
NETDEVSIM_PATH=/sys/bus/netdevsim/
DEV_ADDR=1337
DEV=netdevsim${DEV_ADDR}
SYSFS_NET_DIR=/sys/bus/netdevsim/devices/$DEV/net/
+DEBUGFS_DIR=/sys/kernel/debug/netdevsim/$DEV/
NUM_NETIFS=0
source $lib_dir/lib.sh
source $lib_dir/fib_offload_lib.sh
@@ -157,6 +160,27 @@ ipv4_error_path()
ipv4_error_path_replay
}
+ipv4_delete_fail()
+{
+ RET=0
+
+ echo "y" > $DEBUGFS_DIR/fib/fail_route_delete
+
+ ip -n testns1 link add name dummy1 type dummy
+ ip -n testns1 link set dev dummy1 up
+
+ ip -n testns1 route add 192.0.2.0/24 dev dummy1
+ ip -n testns1 route del 192.0.2.0/24 dev dummy1 &> /dev/null
+
+ # We should not be able to delete the netdev if we are leaking a
+ # reference.
+ ip -n testns1 link del dev dummy1
+
+ log_test "IPv4 route delete failure"
+
+ echo "n" > $DEBUGFS_DIR/fib/fail_route_delete
+}
+
ipv6_add()
{
fib_ipv6_add_test "testns1"
@@ -304,6 +328,27 @@ ipv6_error_path()
ipv6_error_path_replay
}
+ipv6_delete_fail()
+{
+ RET=0
+
+ echo "y" > $DEBUGFS_DIR/fib/fail_route_delete
+
+ ip -n testns1 link add name dummy1 type dummy
+ ip -n testns1 link set dev dummy1 up
+
+ ip -n testns1 route add 2001:db8:1::/64 dev dummy1
+ ip -n testns1 route del 2001:db8:1::/64 dev dummy1 &> /dev/null
+
+ # We should not be able to delete the netdev if we are leaking a
+ # reference.
+ ip -n testns1 link del dev dummy1
+
+ log_test "IPv6 route delete failure"
+
+ echo "n" > $DEBUGFS_DIR/fib/fail_route_delete
+}
+
fib_notify_on_flag_change_set()
{
local notify=$1; shift
diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
index ffc35a22e914..892306bdb47d 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -38,3 +38,4 @@ ioam6_parser
toeplitz
tun
cmsg_sender
+unix_connect \ No newline at end of file
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index d0460a969060..e2dfef8b78a7 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -35,9 +35,12 @@ TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh
TEST_PROGS += srv6_end_dt46_l3vpn_test.sh
TEST_PROGS += srv6_end_dt4_l3vpn_test.sh
TEST_PROGS += srv6_end_dt6_l3vpn_test.sh
+TEST_PROGS += srv6_hencap_red_l3vpn_test.sh
+TEST_PROGS += srv6_hl2encap_red_l2vpn_test.sh
TEST_PROGS += vrf_strict_mode_test.sh
TEST_PROGS += arp_ndisc_evict_nocarrier.sh
TEST_PROGS += ndisc_unsolicited_na_test.sh
+TEST_PROGS += arp_ndisc_untracked_subnets.sh
TEST_PROGS += stress_reuseport_listen.sh
TEST_PROGS_EXTENDED := in_netns.sh setup_loopback.sh setup_veth.sh
TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh
diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile
index df341648f818..969620ae9928 100644
--- a/tools/testing/selftests/net/af_unix/Makefile
+++ b/tools/testing/selftests/net/af_unix/Makefile
@@ -1,2 +1,3 @@
-TEST_GEN_PROGS := test_unix_oob
+TEST_GEN_PROGS := test_unix_oob unix_connect
+
include ../../lib.mk
diff --git a/tools/testing/selftests/net/af_unix/unix_connect.c b/tools/testing/selftests/net/af_unix/unix_connect.c
new file mode 100644
index 000000000000..d799fd8f5c7c
--- /dev/null
+++ b/tools/testing/selftests/net/af_unix/unix_connect.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <sched.h>
+
+#include <stddef.h>
+#include <stdio.h>
+#include <unistd.h>
+
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "../../kselftest_harness.h"
+
+FIXTURE(unix_connect)
+{
+ int server, client;
+ int family;
+};
+
+FIXTURE_VARIANT(unix_connect)
+{
+ int type;
+ char sun_path[8];
+ int len;
+ int flags;
+ int err;
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, stream_pathname)
+{
+ .type = SOCK_STREAM,
+ .sun_path = "test",
+ .len = 4 + 1,
+ .flags = 0,
+ .err = 0,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, stream_abstract)
+{
+ .type = SOCK_STREAM,
+ .sun_path = "\0test",
+ .len = 5,
+ .flags = 0,
+ .err = 0,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, stream_pathname_netns)
+{
+ .type = SOCK_STREAM,
+ .sun_path = "test",
+ .len = 4 + 1,
+ .flags = CLONE_NEWNET,
+ .err = 0,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, stream_abstract_netns)
+{
+ .type = SOCK_STREAM,
+ .sun_path = "\0test",
+ .len = 5,
+ .flags = CLONE_NEWNET,
+ .err = ECONNREFUSED,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, dgram_pathname)
+{
+ .type = SOCK_DGRAM,
+ .sun_path = "test",
+ .len = 4 + 1,
+ .flags = 0,
+ .err = 0,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, dgram_abstract)
+{
+ .type = SOCK_DGRAM,
+ .sun_path = "\0test",
+ .len = 5,
+ .flags = 0,
+ .err = 0,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, dgram_pathname_netns)
+{
+ .type = SOCK_DGRAM,
+ .sun_path = "test",
+ .len = 4 + 1,
+ .flags = CLONE_NEWNET,
+ .err = 0,
+};
+
+FIXTURE_VARIANT_ADD(unix_connect, dgram_abstract_netns)
+{
+ .type = SOCK_DGRAM,
+ .sun_path = "\0test",
+ .len = 5,
+ .flags = CLONE_NEWNET,
+ .err = ECONNREFUSED,
+};
+
+FIXTURE_SETUP(unix_connect)
+{
+ self->family = AF_UNIX;
+}
+
+FIXTURE_TEARDOWN(unix_connect)
+{
+ close(self->server);
+ close(self->client);
+
+ if (variant->sun_path[0])
+ remove("test");
+}
+
+TEST_F(unix_connect, test)
+{
+ socklen_t addrlen;
+ struct sockaddr_un addr = {
+ .sun_family = self->family,
+ };
+ int err;
+
+ self->server = socket(self->family, variant->type, 0);
+ ASSERT_NE(-1, self->server);
+
+ addrlen = offsetof(struct sockaddr_un, sun_path) + variant->len;
+ memcpy(&addr.sun_path, variant->sun_path, variant->len);
+
+ err = bind(self->server, (struct sockaddr *)&addr, addrlen);
+ ASSERT_EQ(0, err);
+
+ if (variant->type == SOCK_STREAM) {
+ err = listen(self->server, 32);
+ ASSERT_EQ(0, err);
+ }
+
+ err = unshare(variant->flags);
+ ASSERT_EQ(0, err);
+
+ self->client = socket(self->family, variant->type, 0);
+ ASSERT_LT(0, self->client);
+
+ err = connect(self->client, (struct sockaddr *)&addr, addrlen);
+ ASSERT_EQ(variant->err, err == -1 ? errno : 0);
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh b/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh
new file mode 100755
index 000000000000..c899b446acb6
--- /dev/null
+++ b/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh
@@ -0,0 +1,308 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# 2 namespaces: one host and one router. Use arping from the host to send a
+# garp to the router. Router accepts or ignores based on its arp_accept
+# or accept_untracked_na configuration.
+
+TESTS="arp ndisc"
+
+ROUTER_NS="ns-router"
+ROUTER_NS_V6="ns-router-v6"
+ROUTER_INTF="veth-router"
+ROUTER_ADDR="10.0.10.1"
+ROUTER_ADDR_V6="2001:db8:abcd:0012::1"
+
+HOST_NS="ns-host"
+HOST_NS_V6="ns-host-v6"
+HOST_INTF="veth-host"
+HOST_ADDR="10.0.10.2"
+HOST_ADDR_V6="2001:db8:abcd:0012::2"
+
+SUBNET_WIDTH=24
+PREFIX_WIDTH_V6=64
+
+cleanup() {
+ ip netns del ${HOST_NS}
+ ip netns del ${ROUTER_NS}
+}
+
+cleanup_v6() {
+ ip netns del ${HOST_NS_V6}
+ ip netns del ${ROUTER_NS_V6}
+}
+
+setup() {
+ set -e
+ local arp_accept=$1
+
+ # Set up two namespaces
+ ip netns add ${ROUTER_NS}
+ ip netns add ${HOST_NS}
+
+ # Set up interfaces veth0 and veth1, which are pairs in separate
+ # namespaces. veth0 is veth-router, veth1 is veth-host.
+ # first, set up the inteface's link to the namespace
+ # then, set the interface "up"
+ ip netns exec ${ROUTER_NS} ip link add name ${ROUTER_INTF} \
+ type veth peer name ${HOST_INTF}
+
+ ip netns exec ${ROUTER_NS} ip link set dev ${ROUTER_INTF} up
+ ip netns exec ${ROUTER_NS} ip link set dev ${HOST_INTF} netns ${HOST_NS}
+
+ ip netns exec ${HOST_NS} ip link set dev ${HOST_INTF} up
+ ip netns exec ${ROUTER_NS} ip addr add ${ROUTER_ADDR}/${SUBNET_WIDTH} \
+ dev ${ROUTER_INTF}
+
+ ip netns exec ${HOST_NS} ip addr add ${HOST_ADDR}/${SUBNET_WIDTH} \
+ dev ${HOST_INTF}
+ ip netns exec ${HOST_NS} ip route add default via ${HOST_ADDR} \
+ dev ${HOST_INTF}
+ ip netns exec ${ROUTER_NS} ip route add default via ${ROUTER_ADDR} \
+ dev ${ROUTER_INTF}
+
+ ROUTER_CONF=net.ipv4.conf.${ROUTER_INTF}
+ ip netns exec ${ROUTER_NS} sysctl -w \
+ ${ROUTER_CONF}.arp_accept=${arp_accept} >/dev/null 2>&1
+ set +e
+}
+
+setup_v6() {
+ set -e
+ local accept_untracked_na=$1
+
+ # Set up two namespaces
+ ip netns add ${ROUTER_NS_V6}
+ ip netns add ${HOST_NS_V6}
+
+ # Set up interfaces veth0 and veth1, which are pairs in separate
+ # namespaces. veth0 is veth-router, veth1 is veth-host.
+ # first, set up the inteface's link to the namespace
+ # then, set the interface "up"
+ ip -6 -netns ${ROUTER_NS_V6} link add name ${ROUTER_INTF} \
+ type veth peer name ${HOST_INTF}
+
+ ip -6 -netns ${ROUTER_NS_V6} link set dev ${ROUTER_INTF} up
+ ip -6 -netns ${ROUTER_NS_V6} link set dev ${HOST_INTF} netns \
+ ${HOST_NS_V6}
+
+ ip -6 -netns ${HOST_NS_V6} link set dev ${HOST_INTF} up
+ ip -6 -netns ${ROUTER_NS_V6} addr add \
+ ${ROUTER_ADDR_V6}/${PREFIX_WIDTH_V6} dev ${ROUTER_INTF} nodad
+
+ HOST_CONF=net.ipv6.conf.${HOST_INTF}
+ ip netns exec ${HOST_NS_V6} sysctl -qw ${HOST_CONF}.ndisc_notify=1
+ ip netns exec ${HOST_NS_V6} sysctl -qw ${HOST_CONF}.disable_ipv6=0
+ ip -6 -netns ${HOST_NS_V6} addr add ${HOST_ADDR_V6}/${PREFIX_WIDTH_V6} \
+ dev ${HOST_INTF}
+
+ ROUTER_CONF=net.ipv6.conf.${ROUTER_INTF}
+
+ ip netns exec ${ROUTER_NS_V6} sysctl -w \
+ ${ROUTER_CONF}.forwarding=1 >/dev/null 2>&1
+ ip netns exec ${ROUTER_NS_V6} sysctl -w \
+ ${ROUTER_CONF}.drop_unsolicited_na=0 >/dev/null 2>&1
+ ip netns exec ${ROUTER_NS_V6} sysctl -w \
+ ${ROUTER_CONF}.accept_untracked_na=${accept_untracked_na} \
+ >/dev/null 2>&1
+ set +e
+}
+
+verify_arp() {
+ local arp_accept=$1
+ local same_subnet=$2
+
+ neigh_show_output=$(ip netns exec ${ROUTER_NS} ip neigh get \
+ ${HOST_ADDR} dev ${ROUTER_INTF} 2>/dev/null)
+
+ if [ ${arp_accept} -eq 1 ]; then
+ # Neighbor entries expected
+ [[ ${neigh_show_output} ]]
+ elif [ ${arp_accept} -eq 2 ]; then
+ if [ ${same_subnet} -eq 1 ]; then
+ # Neighbor entries expected
+ [[ ${neigh_show_output} ]]
+ else
+ [[ -z "${neigh_show_output}" ]]
+ fi
+ else
+ [[ -z "${neigh_show_output}" ]]
+ fi
+ }
+
+arp_test_gratuitous() {
+ set -e
+ local arp_accept=$1
+ local same_subnet=$2
+
+ if [ ${arp_accept} -eq 2 ]; then
+ test_msg=("test_arp: "
+ "accept_arp=$1 "
+ "same_subnet=$2")
+ if [ ${same_subnet} -eq 0 ]; then
+ HOST_ADDR=10.0.11.3
+ else
+ HOST_ADDR=10.0.10.3
+ fi
+ else
+ test_msg=("test_arp: "
+ "accept_arp=$1")
+ fi
+ # Supply arp_accept option to set up which sets it in sysctl
+ setup ${arp_accept}
+ ip netns exec ${HOST_NS} arping -A -U ${HOST_ADDR} -c1 2>&1 >/dev/null
+
+ if verify_arp $1 $2; then
+ printf " TEST: %-60s [ OK ]\n" "${test_msg[*]}"
+ else
+ printf " TEST: %-60s [FAIL]\n" "${test_msg[*]}"
+ fi
+ cleanup
+ set +e
+}
+
+arp_test_gratuitous_combinations() {
+ arp_test_gratuitous 0
+ arp_test_gratuitous 1
+ arp_test_gratuitous 2 0 # Second entry indicates subnet or not
+ arp_test_gratuitous 2 1
+}
+
+cleanup_tcpdump() {
+ set -e
+ [[ ! -z ${tcpdump_stdout} ]] && rm -f ${tcpdump_stdout}
+ [[ ! -z ${tcpdump_stderr} ]] && rm -f ${tcpdump_stderr}
+ tcpdump_stdout=
+ tcpdump_stderr=
+ set +e
+}
+
+start_tcpdump() {
+ set -e
+ tcpdump_stdout=`mktemp`
+ tcpdump_stderr=`mktemp`
+ ip netns exec ${ROUTER_NS_V6} timeout 15s \
+ tcpdump --immediate-mode -tpni ${ROUTER_INTF} -c 1 \
+ "icmp6 && icmp6[0] == 136 && src ${HOST_ADDR_V6}" \
+ > ${tcpdump_stdout} 2> /dev/null
+ set +e
+}
+
+verify_ndisc() {
+ local accept_untracked_na=$1
+ local same_subnet=$2
+
+ neigh_show_output=$(ip -6 -netns ${ROUTER_NS_V6} neigh show \
+ to ${HOST_ADDR_V6} dev ${ROUTER_INTF} nud stale)
+
+ if [ ${accept_untracked_na} -eq 1 ]; then
+ # Neighbour entry expected to be present
+ [[ ${neigh_show_output} ]]
+ elif [ ${accept_untracked_na} -eq 2 ]; then
+ if [ ${same_subnet} -eq 1 ]; then
+ [[ ${neigh_show_output} ]]
+ else
+ [[ -z "${neigh_show_output}" ]]
+ fi
+ else
+ # Neighbour entry expected to be absent for all other cases
+ [[ -z "${neigh_show_output}" ]]
+ fi
+}
+
+ndisc_test_untracked_advertisements() {
+ set -e
+ test_msg=("test_ndisc: "
+ "accept_untracked_na=$1")
+
+ local accept_untracked_na=$1
+ local same_subnet=$2
+ if [ ${accept_untracked_na} -eq 2 ]; then
+ test_msg=("test_ndisc: "
+ "accept_untracked_na=$1 "
+ "same_subnet=$2")
+ if [ ${same_subnet} -eq 0 ]; then
+ # Not same subnet
+ HOST_ADDR_V6=2000:db8:abcd:0013::4
+ else
+ HOST_ADDR_V6=2001:db8:abcd:0012::3
+ fi
+ fi
+ setup_v6 $1 $2
+ start_tcpdump
+
+ if verify_ndisc $1 $2; then
+ printf " TEST: %-60s [ OK ]\n" "${test_msg[*]}"
+ else
+ printf " TEST: %-60s [FAIL]\n" "${test_msg[*]}"
+ fi
+
+ cleanup_tcpdump
+ cleanup_v6
+ set +e
+}
+
+ndisc_test_untracked_combinations() {
+ ndisc_test_untracked_advertisements 0
+ ndisc_test_untracked_advertisements 1
+ ndisc_test_untracked_advertisements 2 0
+ ndisc_test_untracked_advertisements 2 1
+}
+
+################################################################################
+# usage
+
+usage()
+{
+ cat <<EOF
+usage: ${0##*/} OPTS
+
+ -t <test> Test(s) to run (default: all)
+ (options: $TESTS)
+EOF
+}
+
+################################################################################
+# main
+
+while getopts ":t:h" opt; do
+ case $opt in
+ t) TESTS=$OPTARG;;
+ h) usage; exit 0;;
+ *) usage; exit 1;;
+ esac
+done
+
+if [ "$(id -u)" -ne 0 ];then
+ echo "SKIP: Need root privileges"
+ exit $ksft_skip;
+fi
+
+if [ ! -x "$(command -v ip)" ]; then
+ echo "SKIP: Could not run test without ip tool"
+ exit $ksft_skip
+fi
+
+if [ ! -x "$(command -v tcpdump)" ]; then
+ echo "SKIP: Could not run test without tcpdump tool"
+ exit $ksft_skip
+fi
+
+if [ ! -x "$(command -v arping)" ]; then
+ echo "SKIP: Could not run test without arping tool"
+ exit $ksft_skip
+fi
+
+# start clean
+cleanup &> /dev/null
+cleanup_v6 &> /dev/null
+
+for t in $TESTS
+do
+ case $t in
+ arp_test_gratuitous_combinations|arp) arp_test_gratuitous_combinations;;
+ ndisc_test_untracked_combinations|ndisc) \
+ ndisc_test_untracked_combinations;;
+ help) echo "Test names: $TESTS"; exit 0;;
+esac
+done
diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index bc2162909a1a..75dd83e39207 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -456,7 +456,7 @@ int main(int argc, char *argv[])
buf[1] = 0;
} else if (opt.sock.type == SOCK_RAW) {
struct udphdr hdr = { 1, 2, htons(opt.size), 0 };
- struct sockaddr_in6 *sin6 = (void *)ai->ai_addr;;
+ struct sockaddr_in6 *sin6 = (void *)ai->ai_addr;
memcpy(buf, &hdr, sizeof(hdr));
sin6->sin6_port = htons(opt.sock.proto);
diff --git a/tools/testing/selftests/net/fib_rule_tests.sh b/tools/testing/selftests/net/fib_rule_tests.sh
index bbe3b379927a..c245476fa29d 100755
--- a/tools/testing/selftests/net/fib_rule_tests.sh
+++ b/tools/testing/selftests/net/fib_rule_tests.sh
@@ -303,6 +303,29 @@ run_fibrule_tests()
log_section "IPv6 fib rule"
fib_rule6_test
}
+################################################################################
+# usage
+
+usage()
+{
+ cat <<EOF
+usage: ${0##*/} OPTS
+
+ -t <test> Test(s) to run (default: all)
+ (options: $TESTS)
+EOF
+}
+
+################################################################################
+# main
+
+while getopts ":t:h" opt; do
+ case $opt in
+ t) TESTS=$OPTARG;;
+ h) usage; exit 0;;
+ *) usage; exit 1;;
+ esac
+done
if [ "$(id -u)" -ne 0 ];then
echo "SKIP: Need root privileges"
diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile
index 57b84e0c879e..a9c5c1be5088 100644
--- a/tools/testing/selftests/net/forwarding/Makefile
+++ b/tools/testing/selftests/net/forwarding/Makefile
@@ -3,6 +3,7 @@
TEST_PROGS = bridge_igmp.sh \
bridge_locked_port.sh \
bridge_mdb.sh \
+ bridge_mdb_port_down.sh \
bridge_mld.sh \
bridge_port_isolation.sh \
bridge_sticky_fdb.sh \
diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb_port_down.sh b/tools/testing/selftests/net/forwarding/bridge_mdb_port_down.sh
new file mode 100755
index 000000000000..1a0480e71d83
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/bridge_mdb_port_down.sh
@@ -0,0 +1,118 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Verify that permanent mdb entries can be added to and deleted from bridge
+# interfaces that are down, and works correctly when done so.
+
+ALL_TESTS="add_del_to_port_down"
+NUM_NETIFS=4
+
+TEST_GROUP="239.10.10.10"
+TEST_GROUP_MAC="01:00:5e:0a:0a:0a"
+
+source lib.sh
+
+
+add_del_to_port_down() {
+ RET=0
+
+ ip link set dev $swp2 down
+ bridge mdb add dev br0 port "$swp2" grp $TEST_GROUP permanent 2>/dev/null
+ check_err $? "Failed adding mdb entry"
+
+ ip link set dev $swp2 up
+ setup_wait_dev $swp2
+ mcast_packet_test $TEST_GROUP_MAC 192.0.2.1 $TEST_GROUP $h1 $h2
+ check_fail $? "Traffic to $TEST_GROUP wasn't forwarded"
+
+ ip link set dev $swp2 down
+ bridge mdb show dev br0 | grep -q "$TEST_GROUP permanent" 2>/dev/null
+ check_err $? "MDB entry did not persist after link up/down"
+
+ bridge mdb del dev br0 port "$swp2" grp $TEST_GROUP 2>/dev/null
+ check_err $? "Failed deleting mdb entry"
+
+ ip link set dev $swp2 up
+ setup_wait_dev $swp2
+ mcast_packet_test $TEST_GROUP_MAC 192.0.2.1 $TEST_GROUP $h1 $h2
+ check_err $? "Traffic to $TEST_GROUP was forwarded after entry removed"
+
+ log_test "MDB add/del entry to port with state down "
+}
+
+h1_create()
+{
+ simple_if_init $h1 192.0.2.1/24 2001:db8:1::1/64
+}
+
+h1_destroy()
+{
+ simple_if_fini $h1 192.0.2.1/24 2001:db8:1::1/64
+}
+
+h2_create()
+{
+ simple_if_init $h2 192.0.2.2/24 2001:db8:1::2/64
+}
+
+h2_destroy()
+{
+ simple_if_fini $h2 192.0.2.2/24 2001:db8:1::2/64
+}
+
+switch_create()
+{
+ # Enable multicast filtering
+ ip link add dev br0 type bridge mcast_snooping 1 mcast_querier 1
+
+ ip link set dev $swp1 master br0
+ ip link set dev $swp2 master br0
+
+ ip link set dev br0 up
+ ip link set dev $swp1 up
+
+ bridge link set dev $swp2 mcast_flood off
+ # Bridge currently has a "grace time" at creation time before it
+ # forwards multicast according to the mdb. Since we disable the
+ # mcast_flood setting per port
+ sleep 10
+}
+
+switch_destroy()
+{
+ ip link set dev $swp1 down
+ ip link set dev $swp2 down
+ ip link del dev br0
+}
+
+setup_prepare()
+{
+ h1=${NETIFS[p1]}
+ swp1=${NETIFS[p2]}
+
+ swp2=${NETIFS[p3]}
+ h2=${NETIFS[p4]}
+
+ vrf_prepare
+
+ h1_create
+ h2_create
+ switch_create
+}
+
+cleanup()
+{
+ pre_cleanup
+
+ switch_destroy
+ h1_destroy
+ h2_destroy
+
+ vrf_cleanup
+}
+
+trap cleanup EXIT
+
+setup_prepare
+tests_run
+exit $EXIT_STATUS
diff --git a/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh b/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh
index 4b42dfd4efd1..072faa77f53b 100755
--- a/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh
+++ b/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh
@@ -11,6 +11,8 @@ NUM_NETIFS=2
source lib.sh
source ethtool_lib.sh
+TIMEOUT=$((WAIT_TIMEOUT * 1000)) # ms
+
setup_prepare()
{
swp1=${NETIFS[p1]}
@@ -18,7 +20,7 @@ setup_prepare()
swp3=$NETIF_NO_CABLE
}
-ethtool_extended_state_check()
+ethtool_ext_state()
{
local dev=$1; shift
local expected_ext_state=$1; shift
@@ -30,21 +32,27 @@ ethtool_extended_state_check()
| sed -e 's/^[[:space:]]*//')
ext_state=$(echo $ext_state | cut -d "," -f1)
- [[ $ext_state == $expected_ext_state ]]
- check_err $? "Expected \"$expected_ext_state\", got \"$ext_state\""
-
- [[ $ext_substate == $expected_ext_substate ]]
- check_err $? "Expected \"$expected_ext_substate\", got \"$ext_substate\""
+ if [[ $ext_state != $expected_ext_state ]]; then
+ echo "Expected \"$expected_ext_state\", got \"$ext_state\""
+ return 1
+ fi
+ if [[ $ext_substate != $expected_ext_substate ]]; then
+ echo "Expected \"$expected_ext_substate\", got \"$ext_substate\""
+ return 1
+ fi
}
autoneg()
{
+ local msg
+
RET=0
ip link set dev $swp1 up
- sleep 4
- ethtool_extended_state_check $swp1 "Autoneg" "No partner detected"
+ msg=$(busywait $TIMEOUT ethtool_ext_state $swp1 \
+ "Autoneg" "No partner detected")
+ check_err $? "$msg"
log_test "Autoneg, No partner detected"
@@ -53,6 +61,8 @@ autoneg()
autoneg_force_mode()
{
+ local msg
+
RET=0
ip link set dev $swp1 up
@@ -65,12 +75,13 @@ autoneg_force_mode()
ethtool_set $swp1 speed $speed1 autoneg off
ethtool_set $swp2 speed $speed2 autoneg off
- sleep 4
- ethtool_extended_state_check $swp1 "Autoneg" \
- "No partner detected during force mode"
+ msg=$(busywait $TIMEOUT ethtool_ext_state $swp1 \
+ "Autoneg" "No partner detected during force mode")
+ check_err $? "$msg"
- ethtool_extended_state_check $swp2 "Autoneg" \
- "No partner detected during force mode"
+ msg=$(busywait $TIMEOUT ethtool_ext_state $swp2 \
+ "Autoneg" "No partner detected during force mode")
+ check_err $? "$msg"
log_test "Autoneg, No partner detected during force mode"
@@ -83,12 +94,14 @@ autoneg_force_mode()
no_cable()
{
+ local msg
+
RET=0
ip link set dev $swp3 up
- sleep 1
- ethtool_extended_state_check $swp3 "No cable"
+ msg=$(busywait $TIMEOUT ethtool_ext_state $swp3 "No cable")
+ check_err $? "$msg"
log_test "No cable"
diff --git a/tools/testing/selftests/net/forwarding/mirror_gre_bridge_1q_lag.sh b/tools/testing/selftests/net/forwarding/mirror_gre_bridge_1q_lag.sh
index 28d568c48a73..91e431cd919e 100755
--- a/tools/testing/selftests/net/forwarding/mirror_gre_bridge_1q_lag.sh
+++ b/tools/testing/selftests/net/forwarding/mirror_gre_bridge_1q_lag.sh
@@ -141,12 +141,13 @@ switch_create()
ip link set dev $swp4 up
ip link add name br1 type bridge vlan_filtering 1
- ip link set dev br1 up
- __addr_add_del br1 add 192.0.2.129/32
- ip -4 route add 192.0.2.130/32 dev br1
team_create lag loadbalance $swp3 $swp4
ip link set dev lag master br1
+
+ ip link set dev br1 up
+ __addr_add_del br1 add 192.0.2.129/32
+ ip -4 route add 192.0.2.130/32 dev br1
}
switch_destroy()
diff --git a/tools/testing/selftests/net/forwarding/vxlan_asymmetric.sh b/tools/testing/selftests/net/forwarding/vxlan_asymmetric.sh
index 0727e2012b68..43469c7de118 100755
--- a/tools/testing/selftests/net/forwarding/vxlan_asymmetric.sh
+++ b/tools/testing/selftests/net/forwarding/vxlan_asymmetric.sh
@@ -525,7 +525,7 @@ arp_suppression()
log_test "neigh_suppress: on / neigh exists: yes"
- # Delete the neighbour from the the SVI. A single ARP request should be
+ # Delete the neighbour from the SVI. A single ARP request should be
# received by the remote VTEP
RET=0
diff --git a/tools/testing/selftests/net/ioam6.sh b/tools/testing/selftests/net/ioam6.sh
index a2b9fad5a9a6..4ceb401da1bf 100755
--- a/tools/testing/selftests/net/ioam6.sh
+++ b/tools/testing/selftests/net/ioam6.sh
@@ -117,6 +117,8 @@
# | Schema Data | |
# +-----------------------------------------------------------+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
################################################################################
# #
@@ -211,7 +213,7 @@ check_kernel_compatibility()
echo "SKIP: kernel version probably too old, missing ioam support"
ip link del veth0 2>/dev/null || true
ip netns del ioam-tmp-node || true
- exit 1
+ exit $ksft_skip
fi
ip -netns ioam-tmp-node route add db02::/64 encap ioam6 mode inline \
@@ -227,7 +229,7 @@ check_kernel_compatibility()
"without CONFIG_IPV6_IOAM6_LWTUNNEL?"
ip link del veth0 2>/dev/null || true
ip netns del ioam-tmp-node || true
- exit 1
+ exit $ksft_skip
fi
ip link del veth0 2>/dev/null || true
@@ -752,20 +754,20 @@ nfailed=0
if [ "$(id -u)" -ne 0 ]
then
echo "SKIP: Need root privileges"
- exit 1
+ exit $ksft_skip
fi
if [ ! -x "$(command -v ip)" ]
then
echo "SKIP: Could not run test without ip tool"
- exit 1
+ exit $ksft_skip
fi
ip ioam &>/dev/null
if [ $? = 1 ]
then
echo "SKIP: iproute2 too old, missing ioam command"
- exit 1
+ exit $ksft_skip
fi
check_kernel_compatibility
diff --git a/tools/testing/selftests/net/ipv6_flowlabel.c b/tools/testing/selftests/net/ipv6_flowlabel.c
index a7c41375374f..708a9822259d 100644
--- a/tools/testing/selftests/net/ipv6_flowlabel.c
+++ b/tools/testing/selftests/net/ipv6_flowlabel.c
@@ -9,6 +9,7 @@
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
+#include <linux/icmpv6.h>
#include <linux/in6.h>
#include <stdbool.h>
#include <stdio.h>
@@ -29,26 +30,48 @@
#ifndef IPV6_FLOWLABEL_MGR
#define IPV6_FLOWLABEL_MGR 32
#endif
+#ifndef IPV6_FLOWINFO_SEND
+#define IPV6_FLOWINFO_SEND 33
+#endif
#define FLOWLABEL_WILDCARD ((uint32_t) -1)
static const char cfg_data[] = "a";
static uint32_t cfg_label = 1;
+static bool use_ping;
+static bool use_flowinfo_send;
+
+static struct icmp6hdr icmp6 = {
+ .icmp6_type = ICMPV6_ECHO_REQUEST
+};
+
+static struct sockaddr_in6 addr = {
+ .sin6_family = AF_INET6,
+ .sin6_addr = IN6ADDR_LOOPBACK_INIT,
+};
static void do_send(int fd, bool with_flowlabel, uint32_t flowlabel)
{
char control[CMSG_SPACE(sizeof(flowlabel))] = {0};
struct msghdr msg = {0};
- struct iovec iov = {0};
+ struct iovec iov = {
+ .iov_base = (char *)cfg_data,
+ .iov_len = sizeof(cfg_data)
+ };
int ret;
- iov.iov_base = (char *)cfg_data;
- iov.iov_len = sizeof(cfg_data);
+ if (use_ping) {
+ iov.iov_base = &icmp6;
+ iov.iov_len = sizeof(icmp6);
+ }
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
- if (with_flowlabel) {
+ if (use_flowinfo_send) {
+ msg.msg_name = &addr;
+ msg.msg_namelen = sizeof(addr);
+ } else if (with_flowlabel) {
struct cmsghdr *cm;
cm = (void *)control;
@@ -94,6 +117,8 @@ static void do_recv(int fd, bool with_flowlabel, uint32_t expect)
ret = recvmsg(fd, &msg, 0);
if (ret == -1)
error(1, errno, "recv");
+ if (use_ping)
+ goto parse_cmsg;
if (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))
error(1, 0, "recv: truncated");
if (ret != sizeof(cfg_data))
@@ -101,6 +126,7 @@ static void do_recv(int fd, bool with_flowlabel, uint32_t expect)
if (memcmp(data, cfg_data, sizeof(data)))
error(1, 0, "recv: data mismatch");
+parse_cmsg:
cm = CMSG_FIRSTHDR(&msg);
if (with_flowlabel) {
if (!cm)
@@ -114,9 +140,11 @@ static void do_recv(int fd, bool with_flowlabel, uint32_t expect)
flowlabel = ntohl(*(uint32_t *)CMSG_DATA(cm));
fprintf(stderr, "recv with label %u\n", flowlabel);
- if (expect != FLOWLABEL_WILDCARD && expect != flowlabel)
+ if (expect != FLOWLABEL_WILDCARD && expect != flowlabel) {
fprintf(stderr, "recv: incorrect flowlabel %u != %u\n",
flowlabel, expect);
+ error(1, 0, "recv: flowlabel is wrong");
+ }
} else {
fprintf(stderr, "recv without label\n");
@@ -165,11 +193,17 @@ static void parse_opts(int argc, char **argv)
{
int c;
- while ((c = getopt(argc, argv, "l:")) != -1) {
+ while ((c = getopt(argc, argv, "l:ps")) != -1) {
switch (c) {
case 'l':
cfg_label = strtoul(optarg, NULL, 0);
break;
+ case 'p':
+ use_ping = true;
+ break;
+ case 's':
+ use_flowinfo_send = true;
+ break;
default:
error(1, 0, "%s: parse error", argv[0]);
}
@@ -178,27 +212,30 @@ static void parse_opts(int argc, char **argv)
int main(int argc, char **argv)
{
- struct sockaddr_in6 addr = {
- .sin6_family = AF_INET6,
- .sin6_port = htons(8000),
- .sin6_addr = IN6ADDR_LOOPBACK_INIT,
- };
const int one = 1;
int fdt, fdr;
+ int prot = 0;
+
+ addr.sin6_port = htons(8000);
parse_opts(argc, argv);
- fdt = socket(PF_INET6, SOCK_DGRAM, 0);
+ if (use_ping) {
+ fprintf(stderr, "attempting to use ping sockets\n");
+ prot = IPPROTO_ICMPV6;
+ }
+
+ fdt = socket(PF_INET6, SOCK_DGRAM, prot);
if (fdt == -1)
error(1, errno, "socket t");
- fdr = socket(PF_INET6, SOCK_DGRAM, 0);
+ fdr = use_ping ? fdt : socket(PF_INET6, SOCK_DGRAM, 0);
if (fdr == -1)
error(1, errno, "socket r");
if (connect(fdt, (void *)&addr, sizeof(addr)))
error(1, errno, "connect");
- if (bind(fdr, (void *)&addr, sizeof(addr)))
+ if (!use_ping && bind(fdr, (void *)&addr, sizeof(addr)))
error(1, errno, "bind");
flowlabel_get(fdt, cfg_label, IPV6_FL_S_EXCL, IPV6_FL_F_CREATE);
@@ -216,13 +253,21 @@ int main(int argc, char **argv)
do_recv(fdr, false, 0);
}
+ if (use_flowinfo_send) {
+ fprintf(stderr, "using IPV6_FLOWINFO_SEND to send label\n");
+ addr.sin6_flowinfo = htonl(cfg_label);
+ if (setsockopt(fdt, SOL_IPV6, IPV6_FLOWINFO_SEND, &one,
+ sizeof(one)) == -1)
+ error(1, errno, "setsockopt flowinfo_send");
+ }
+
fprintf(stderr, "send label\n");
do_send(fdt, true, cfg_label);
do_recv(fdr, true, cfg_label);
if (close(fdr))
error(1, errno, "close r");
- if (close(fdt))
+ if (!use_ping && close(fdt))
error(1, errno, "close t");
return 0;
diff --git a/tools/testing/selftests/net/ipv6_flowlabel.sh b/tools/testing/selftests/net/ipv6_flowlabel.sh
index d3bc6442704e..cee95e252bee 100755
--- a/tools/testing/selftests/net/ipv6_flowlabel.sh
+++ b/tools/testing/selftests/net/ipv6_flowlabel.sh
@@ -18,4 +18,20 @@ echo "TEST datapath (with auto-flowlabels)"
./in_netns.sh \
sh -c 'sysctl -q -w net.ipv6.auto_flowlabels=1 && ./ipv6_flowlabel -l 1'
+echo "TEST datapath (with ping-sockets)"
+./in_netns.sh \
+ sh -c 'sysctl -q -w net.ipv6.flowlabel_reflect=4 && \
+ sysctl -q -w net.ipv4.ping_group_range="0 2147483647" && \
+ ./ipv6_flowlabel -l 1 -p'
+
+echo "TEST datapath (with flowinfo-send)"
+./in_netns.sh \
+ sh -c './ipv6_flowlabel -l 1 -s'
+
+echo "TEST datapath (with ping-sockets flowinfo-send)"
+./in_netns.sh \
+ sh -c 'sysctl -q -w net.ipv6.flowlabel_reflect=4 && \
+ sysctl -q -w net.ipv4.ping_group_range="0 2147483647" && \
+ ./ipv6_flowlabel -l 1 -p -s'
+
echo OK. All tests passed
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index a4406b7a8064..ff83ef426df5 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -455,6 +455,12 @@ wait_mpj()
done
}
+kill_wait()
+{
+ kill $1 > /dev/null 2>&1
+ wait $1 2>/dev/null
+}
+
pm_nl_set_limits()
{
local ns=$1
@@ -654,6 +660,11 @@ do_transfer()
local port=$((10000 + TEST_COUNT - 1))
local cappid
+ local userspace_pm=0
+ local evts_ns1
+ local evts_ns1_pid
+ local evts_ns2
+ local evts_ns2_pid
:> "$cout"
:> "$sout"
@@ -690,10 +701,29 @@ do_transfer()
extra_args="-r ${speed:6}"
fi
+ if [[ "${addr_nr_ns1}" = "userspace_"* ]]; then
+ userspace_pm=1
+ addr_nr_ns1=${addr_nr_ns1:10}
+ fi
+
if [[ "${addr_nr_ns2}" = "fastclose_"* ]]; then
# disconnect
extra_args="$extra_args -I ${addr_nr_ns2:10}"
addr_nr_ns2=0
+ elif [[ "${addr_nr_ns2}" = "userspace_"* ]]; then
+ userspace_pm=1
+ addr_nr_ns2=${addr_nr_ns2:10}
+ fi
+
+ if [ $userspace_pm -eq 1 ]; then
+ evts_ns1=$(mktemp)
+ evts_ns2=$(mktemp)
+ :> "$evts_ns1"
+ :> "$evts_ns2"
+ ip netns exec ${listener_ns} ./pm_nl_ctl events >> "$evts_ns1" 2>&1 &
+ evts_ns1_pid=$!
+ ip netns exec ${connector_ns} ./pm_nl_ctl events >> "$evts_ns2" 2>&1 &
+ evts_ns2_pid=$!
fi
local local_addr
@@ -748,6 +778,8 @@ do_transfer()
if [ $addr_nr_ns1 -gt 0 ]; then
local counter=2
local add_nr_ns1=${addr_nr_ns1}
+ local id=10
+ local tk
while [ $add_nr_ns1 -gt 0 ]; do
local addr
if is_v6 "${connect_addr}"; then
@@ -755,9 +787,18 @@ do_transfer()
else
addr="10.0.$counter.1"
fi
- pm_nl_add_endpoint $ns1 $addr flags signal
+ if [ $userspace_pm -eq 0 ]; then
+ pm_nl_add_endpoint $ns1 $addr flags signal
+ else
+ tk=$(sed -n 's/.*\(token:\)\([[:digit:]]*\).*$/\2/p;q' "$evts_ns1")
+ ip netns exec ${listener_ns} ./pm_nl_ctl ann $addr token $tk id $id
+ sleep 1
+ ip netns exec ${listener_ns} ./pm_nl_ctl rem token $tk id $id
+ fi
+
counter=$((counter + 1))
add_nr_ns1=$((add_nr_ns1 - 1))
+ id=$((id + 1))
done
elif [ $addr_nr_ns1 -lt 0 ]; then
local rm_nr_ns1=$((-addr_nr_ns1))
@@ -804,6 +845,8 @@ do_transfer()
if [ $addr_nr_ns2 -gt 0 ]; then
local add_nr_ns2=${addr_nr_ns2}
local counter=3
+ local id=20
+ local tk da dp sp
while [ $add_nr_ns2 -gt 0 ]; do
local addr
if is_v6 "${connect_addr}"; then
@@ -811,9 +854,23 @@ do_transfer()
else
addr="10.0.$counter.2"
fi
- pm_nl_add_endpoint $ns2 $addr flags $flags
+ if [ $userspace_pm -eq 0 ]; then
+ pm_nl_add_endpoint $ns2 $addr flags $flags
+ else
+ tk=$(sed -n 's/.*\(token:\)\([[:digit:]]*\).*$/\2/p;q' "$evts_ns2")
+ da=$(sed -n 's/.*\(daddr4:\)\([0-9.]*\).*$/\2/p;q' "$evts_ns2")
+ dp=$(sed -n 's/.*\(dport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts_ns2")
+ ip netns exec ${connector_ns} ./pm_nl_ctl csf lip $addr lid $id \
+ rip $da rport $dp token $tk
+ sleep 1
+ sp=$(grep "type:10" "$evts_ns2" |
+ sed -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q')
+ ip netns exec ${connector_ns} ./pm_nl_ctl dsf lip $addr lport $sp \
+ rip $da rport $dp token $tk
+ fi
counter=$((counter + 1))
add_nr_ns2=$((add_nr_ns2 - 1))
+ id=$((id + 1))
done
elif [ $addr_nr_ns2 -lt 0 ]; then
local rm_nr_ns2=$((-addr_nr_ns2))
@@ -890,6 +947,12 @@ do_transfer()
kill $cappid
fi
+ if [ $userspace_pm -eq 1 ]; then
+ kill_wait $evts_ns1_pid
+ kill_wait $evts_ns2_pid
+ rm -rf $evts_ns1 $evts_ns2
+ fi
+
NSTAT_HISTORY=/tmp/${listener_ns}.nstat ip netns exec ${listener_ns} \
nstat | grep Tcp > /tmp/${listener_ns}.out
NSTAT_HISTORY=/tmp/${connector_ns}.nstat ip netns exec ${connector_ns} \
@@ -2365,6 +2428,36 @@ backup_tests()
chk_add_nr 1 1
chk_prio_nr 1 1
fi
+
+ if reset "mpc backup"; then
+ pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow,backup
+ run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
+ chk_join_nr 0 0 0
+ chk_prio_nr 0 1
+ fi
+
+ if reset "mpc backup both sides"; then
+ pm_nl_add_endpoint $ns1 10.0.1.1 flags subflow,backup
+ pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow,backup
+ run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow
+ chk_join_nr 0 0 0
+ chk_prio_nr 1 1
+ fi
+
+ if reset "mpc switch to backup"; then
+ pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow
+ run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow backup
+ chk_join_nr 0 0 0
+ chk_prio_nr 0 1
+ fi
+
+ if reset "mpc switch to backup both sides"; then
+ pm_nl_add_endpoint $ns1 10.0.1.1 flags subflow
+ pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow
+ run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow backup
+ chk_join_nr 0 0 0
+ chk_prio_nr 1 1
+ fi
}
add_addr_ports_tests()
@@ -2810,6 +2903,25 @@ userspace_tests()
chk_join_nr 0 0 0
chk_rm_nr 0 0
fi
+
+ # userspace pm add & remove address
+ if reset "userspace pm add & remove address"; then
+ set_userspace_pm $ns1
+ pm_nl_set_limits $ns2 1 1
+ run_tests $ns1 $ns2 10.0.1.1 0 userspace_1 0 slow
+ chk_join_nr 1 1 1
+ chk_add_nr 1 1
+ chk_rm_nr 1 1 invert
+ fi
+
+ # userspace pm create destroy subflow
+ if reset "userspace pm create destroy subflow"; then
+ set_userspace_pm $ns2
+ pm_nl_set_limits $ns1 0 1
+ run_tests $ns1 $ns2 10.0.1.1 0 0 userspace_1 slow
+ chk_join_nr 1 1 1
+ chk_rm_nr 0 1
+ fi
}
endpoint_tests()
diff --git a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
index cb79f0719e3b..abddf4c63e79 100644
--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
+++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
@@ -31,7 +31,7 @@
static void syntax(char *argv[])
{
- fprintf(stderr, "%s add|get|set|del|flush|dump|accept [<args>]\n", argv[0]);
+ fprintf(stderr, "%s add|ann|rem|csf|dsf|get|set|del|flush|dump|events|listen|accept [<args>]\n", argv[0]);
fprintf(stderr, "\tadd [flags signal|subflow|backup|fullmesh] [id <nr>] [dev <name>] <ip>\n");
fprintf(stderr, "\tann <local-ip> id <local-id> token <token> [port <local-port>] [dev <name>]\n");
fprintf(stderr, "\trem id <local-id> token <token>\n");
diff --git a/tools/testing/selftests/net/mptcp/simult_flows.sh b/tools/testing/selftests/net/mptcp/simult_flows.sh
index f441ff7904fc..ffa13a957a36 100755
--- a/tools/testing/selftests/net/mptcp/simult_flows.sh
+++ b/tools/testing/selftests/net/mptcp/simult_flows.sh
@@ -12,6 +12,7 @@ timeout_test=$((timeout_poll * 2 + 1))
test_cnt=1
ret=0
bail=0
+slack=50
usage() {
echo "Usage: $0 [ -b ] [ -c ] [ -d ]"
@@ -52,6 +53,7 @@ setup()
cout=$(mktemp)
capout=$(mktemp)
size=$((2 * 2048 * 4096))
+
dd if=/dev/zero of=$small bs=4096 count=20 >/dev/null 2>&1
dd if=/dev/zero of=$large bs=4096 count=$((size / 4096)) >/dev/null 2>&1
@@ -104,6 +106,16 @@ setup()
ip -net "$ns3" route add default via dead:beef:3::2
ip netns exec "$ns3" ./pm_nl_ctl limits 1 1
+
+ # debug build can slow down measurably the test program
+ # we use quite tight time limit on the run-time, to ensure
+ # maximum B/W usage.
+ # Use kmemleak/lockdep/kasan/prove_locking presence as a rough
+ # estimate for this being a debug kernel and increase the
+ # maximum run-time accordingly. Observed run times for CI builds
+ # running selftests, including kbuild, were used to determine the
+ # amount of time to add.
+ grep -q ' kmemleak_init$\| lockdep_init$\| kasan_init$\| prove_locking$' /proc/kallsyms && slack=$((slack+550))
}
# $1: ns, $2: port
@@ -241,7 +253,7 @@ run_test()
# mptcp_connect will do some sleeps to allow the mp_join handshake
# completion (see mptcp_connect): 200ms on each side, add some slack
- time=$((time + 450))
+ time=$((time + 400 + slack))
printf "%-60s" "$msg"
do_transfer $small $large $time
diff --git a/tools/testing/selftests/net/mptcp/userspace_pm.sh b/tools/testing/selftests/net/mptcp/userspace_pm.sh
index abe3d4ebe554..3229725b64b0 100755
--- a/tools/testing/selftests/net/mptcp/userspace_pm.sh
+++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh
@@ -37,6 +37,12 @@ rndh=$(stdbuf -o0 -e0 printf %x "$sec")-$(mktemp -u XXXXXX)
ns1="ns1-$rndh"
ns2="ns2-$rndh"
+kill_wait()
+{
+ kill $1 > /dev/null 2>&1
+ wait $1 2>/dev/null
+}
+
cleanup()
{
echo "cleanup"
@@ -48,16 +54,16 @@ cleanup()
kill -SIGUSR1 $client4_pid > /dev/null 2>&1
fi
if [ $server4_pid -ne 0 ]; then
- kill $server4_pid > /dev/null 2>&1
+ kill_wait $server4_pid
fi
if [ $client6_pid -ne 0 ]; then
kill -SIGUSR1 $client6_pid > /dev/null 2>&1
fi
if [ $server6_pid -ne 0 ]; then
- kill $server6_pid > /dev/null 2>&1
+ kill_wait $server6_pid
fi
if [ $evts_pid -ne 0 ]; then
- kill $evts_pid > /dev/null 2>&1
+ kill_wait $evts_pid
fi
local netns
for netns in "$ns1" "$ns2" ;do
@@ -153,7 +159,7 @@ make_connection()
sleep 1
# Capture client/server attributes from MPTCP connection netlink events
- kill $client_evts_pid
+ kill_wait $client_evts_pid
local client_token
local client_port
@@ -165,7 +171,7 @@ make_connection()
client_port=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$client_evts")
client_serverside=$(sed --unbuffered -n 's/.*\(server_side:\)\([[:digit:]]*\).*$/\2/p;q'\
"$client_evts")
- kill $server_evts_pid
+ kill_wait $server_evts_pid
server_token=$(sed --unbuffered -n 's/.*\(token:\)\([[:digit:]]*\).*$/\2/p;q' "$server_evts")
server_serverside=$(sed --unbuffered -n 's/.*\(server_side:\)\([[:digit:]]*\).*$/\2/p;q'\
"$server_evts")
@@ -286,7 +292,7 @@ test_announce()
verify_announce_event "$evts" "$ANNOUNCED" "$server4_token" "10.0.2.2"\
"$client_addr_id" "$new4_port"
- kill $evts_pid
+ kill_wait $evts_pid
# Capture events on the network namespace running the client
:>"$evts"
@@ -321,7 +327,7 @@ test_announce()
verify_announce_event "$evts" "$ANNOUNCED" "$client4_token" "10.0.2.1"\
"$server_addr_id" "$new4_port"
- kill $evts_pid
+ kill_wait $evts_pid
rm -f "$evts"
}
@@ -416,7 +422,7 @@ test_remove()
sleep 0.5
verify_remove_event "$evts" "$REMOVED" "$server6_token" "$client_addr_id"
- kill $evts_pid
+ kill_wait $evts_pid
# Capture events on the network namespace running the client
:>"$evts"
@@ -449,7 +455,7 @@ test_remove()
sleep 0.5
verify_remove_event "$evts" "$REMOVED" "$client6_token" "$server_addr_id"
- kill $evts_pid
+ kill_wait $evts_pid
rm -f "$evts"
}
@@ -553,7 +559,7 @@ test_subflows()
"10.0.2.2" "$client4_port" "23" "$client_addr_id" "ns1" "ns2"
# Delete the listener from the client ns, if one was created
- kill $listener_pid > /dev/null 2>&1
+ kill_wait $listener_pid
local sport
sport=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts")
@@ -592,7 +598,7 @@ test_subflows()
"$client_addr_id" "ns1" "ns2"
# Delete the listener from the client ns, if one was created
- kill $listener_pid > /dev/null 2>&1
+ kill_wait $listener_pid
sport=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts")
@@ -631,7 +637,7 @@ test_subflows()
"$client_addr_id" "ns1" "ns2"
# Delete the listener from the client ns, if one was created
- kill $listener_pid > /dev/null 2>&1
+ kill_wait $listener_pid
sport=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts")
@@ -647,7 +653,7 @@ test_subflows()
ip netns exec "$ns2" ./pm_nl_ctl rem id $client_addr_id token\
"$client4_token" > /dev/null 2>&1
- kill $evts_pid
+ kill_wait $evts_pid
# Capture events on the network namespace running the client
:>"$evts"
@@ -674,7 +680,7 @@ test_subflows()
"10.0.2.1" "$app4_port" "23" "$server_addr_id" "ns2" "ns1"
# Delete the listener from the server ns, if one was created
- kill $listener_pid> /dev/null 2>&1
+ kill_wait $listener_pid
sport=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts")
@@ -713,7 +719,7 @@ test_subflows()
"$server_addr_id" "ns2" "ns1"
# Delete the listener from the server ns, if one was created
- kill $listener_pid > /dev/null 2>&1
+ kill_wait $listener_pid
sport=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts")
@@ -750,7 +756,7 @@ test_subflows()
"10.0.2.2" "10.0.2.1" "$new4_port" "23" "$server_addr_id" "ns2" "ns1"
# Delete the listener from the server ns, if one was created
- kill $listener_pid > /dev/null 2>&1
+ kill_wait $listener_pid
sport=$(sed --unbuffered -n 's/.*\(sport:\)\([[:digit:]]*\).*$/\2/p;q' "$evts")
@@ -766,7 +772,7 @@ test_subflows()
ip netns exec "$ns1" ./pm_nl_ctl rem id $server_addr_id token\
"$server4_token" > /dev/null 2>&1
- kill $evts_pid
+ kill_wait $evts_pid
rm -f "$evts"
}
diff --git a/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh b/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh
new file mode 100755
index 000000000000..28a775654b92
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh
@@ -0,0 +1,879 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# author: Andrea Mayer <andrea.mayer@uniroma2.it>
+#
+# This script is designed for testing the SRv6 H.Encaps.Red behavior.
+#
+# Below is depicted the IPv6 network of an operator which offers advanced
+# IPv4/IPv6 VPN services to hosts, enabling them to communicate with each
+# other.
+# In this example, hosts hs-1 and hs-2 are connected through an IPv4/IPv6 VPN
+# service, while hs-3 and hs-4 are connected using an IPv6 only VPN.
+#
+# Routers rt-1,rt-2,rt-3 and rt-4 implement IPv4/IPv6 L3 VPN services
+# leveraging the SRv6 architecture. The key components for such VPNs are:
+#
+# i) The SRv6 H.Encaps.Red behavior applies SRv6 Policies on traffic received
+# by connected hosts, initiating the VPN tunnel. Such a behavior is an
+# optimization of the SRv6 H.Encap aiming to reduce the length of the SID
+# List carried in the pushed SRH. Specifically, the H.Encaps.Red removes
+# the first SID contained in the SID List (i.e. SRv6 Policy) by storing it
+# into the IPv6 Destination Address. When a SRv6 Policy is made of only one
+# SID, the SRv6 H.Encaps.Red behavior omits the SRH at all and pushes that
+# SID directly into the IPv6 DA;
+#
+# ii) The SRv6 End behavior advances the active SID in the SID List carried by
+# the SRH;
+#
+# iii) The SRv6 End.DT46 behavior is used for removing the SRv6 Policy and,
+# thus, it terminates the VPN tunnel. Such a behavior is capable of
+# handling, at the same time, both tunneled IPv4 and IPv6 traffic.
+#
+#
+# cafe::1 cafe::2
+# 10.0.0.1 10.0.0.2
+# +--------+ +--------+
+# | | | |
+# | hs-1 | | hs-2 |
+# | | | |
+# +---+----+ +--- +---+
+# cafe::/64 | | cafe::/64
+# 10.0.0.0/24 | | 10.0.0.0/24
+# +---+----+ +----+---+
+# | | fcf0:0:1:2::/64 | |
+# | rt-1 +-------------------+ rt-2 |
+# | | | |
+# +---+----+ +----+---+
+# | . . |
+# | fcf0:0:1:3::/64 . |
+# | . . |
+# | . . |
+# fcf0:0:1:4::/64 | . | fcf0:0:2:3::/64
+# | . . |
+# | . . |
+# | fcf0:0:2:4::/64 . |
+# | . . |
+# +---+----+ +----+---+
+# | | | |
+# | rt-4 +-------------------+ rt-3 |
+# | | fcf0:0:3:4::/64 | |
+# +---+----+ +----+---+
+# cafe::/64 | | cafe::/64
+# 10.0.0.0/24 | | 10.0.0.0/24
+# +---+----+ +--- +---+
+# | | | |
+# | hs-4 | | hs-3 |
+# | | | |
+# +--------+ +--------+
+# cafe::4 cafe::3
+# 10.0.0.4 10.0.0.3
+#
+#
+# Every fcf0:0:x:y::/64 network interconnects the SRv6 routers rt-x with rt-y
+# in the IPv6 operator network.
+#
+# Local SID table
+# ===============
+#
+# Each SRv6 router is configured with a Local SID table in which SIDs are
+# stored. Considering the given SRv6 router rt-x, at least two SIDs are
+# configured in the Local SID table:
+#
+# Local SID table for SRv6 router rt-x
+# +----------------------------------------------------------+
+# |fcff:x::e is associated with the SRv6 End behavior |
+# |fcff:x::d46 is associated with the SRv6 End.DT46 behavior |
+# +----------------------------------------------------------+
+#
+# The fcff::/16 prefix is reserved by the operator for implementing SRv6 VPN
+# services. Reachability of SIDs is ensured by proper configuration of the IPv6
+# operator's network and SRv6 routers.
+#
+# # SRv6 Policies
+# ===============
+#
+# An SRv6 ingress router applies SRv6 policies to the traffic received from a
+# connected host. SRv6 policy enforcement consists of encapsulating the
+# received traffic into a new IPv6 packet with a given SID List contained in
+# the SRH.
+#
+# IPv4/IPv6 VPN between hs-1 and hs-2
+# -----------------------------------
+#
+# Hosts hs-1 and hs-2 are connected using dedicated IPv4/IPv6 VPNs.
+# Specifically, packets generated from hs-1 and directed towards hs-2 are
+# handled by rt-1 which applies the following SRv6 Policies:
+#
+# i.a) IPv6 traffic, SID List=fcff:3::e,fcff:4::e,fcff:2::d46
+# ii.a) IPv4 traffic, SID List=fcff:2::d46
+#
+# Policy (i.a) steers tunneled IPv6 traffic through SRv6 routers
+# rt-3,rt-4,rt-2. Instead, Policy (ii.a) steers tunneled IPv4 traffic through
+# rt-2.
+# The H.Encaps.Red reduces the SID List (i.a) carried in SRH by removing the
+# first SID (fcff:3::e) and pushing it into the IPv6 DA. In case of IPv4
+# traffic, the H.Encaps.Red omits the presence of SRH at all, since the SID
+# List (ii.a) consists of only one SID that can be stored directly in the IPv6
+# DA.
+#
+# On the reverse path (i.e. from hs-2 to hs-1), rt-2 applies the following
+# policies:
+#
+# i.b) IPv6 traffic, SID List=fcff:1::d46
+# ii.b) IPv4 traffic, SID List=fcff:4::e,fcff:3::e,fcff:1::d46
+#
+# Policy (i.b) steers tunneled IPv6 traffic through the SRv6 router rt-1.
+# Conversely, Policy (ii.b) steers tunneled IPv4 traffic through SRv6 routers
+# rt-4,rt-3,rt-1.
+# The H.Encaps.Red omits the SRH at all in case of (i.b) by pushing the single
+# SID (fcff::1::d46) inside the IPv6 DA.
+# The H.Encaps.Red reduces the SID List (ii.b) in the SRH by removing the first
+# SID (fcff:4::e) and pushing it into the IPv6 DA.
+#
+# In summary:
+# hs-1->hs-2 |IPv6 DA=fcff:3::e|SRH SIDs=fcff:4::e,fcff:2::d46|IPv6|...| (i.a)
+# hs-1->hs-2 |IPv6 DA=fcff:2::d46|IPv4|...| (ii.a)
+#
+# hs-2->hs-1 |IPv6 DA=fcff:1::d46|IPv6|...| (i.b)
+# hs-2->hs-1 |IPv6 DA=fcff:4::e|SRH SIDs=fcff:3::e,fcff:1::d46|IPv4|...| (ii.b)
+#
+#
+# IPv6 VPN between hs-3 and hs-4
+# ------------------------------
+#
+# Hosts hs-3 and hs-4 are connected using a dedicated IPv6 only VPN.
+# Specifically, packets generated from hs-3 and directed towards hs-4 are
+# handled by rt-3 which applies the following SRv6 Policy:
+#
+# i.c) IPv6 traffic, SID List=fcff:2::e,fcff:4::d46
+#
+# Policy (i.c) steers tunneled IPv6 traffic through SRv6 routers rt-2,rt-4.
+# The H.Encaps.Red reduces the SID List (i.c) carried in SRH by pushing the
+# first SID (fcff:2::e) in the IPv6 DA.
+#
+# On the reverse path (i.e. from hs-4 to hs-3) the router rt-4 applies the
+# following SRv6 Policy:
+#
+# i.d) IPv6 traffic, SID List=fcff:1::e,fcff:3::d46.
+#
+# Policy (i.d) steers tunneled IPv6 traffic through SRv6 routers rt-1,rt-3.
+# The H.Encaps.Red reduces the SID List (i.d) carried in SRH by pushing the
+# first SID (fcff:1::e) in the IPv6 DA.
+#
+# In summary:
+# hs-3->hs-4 |IPv6 DA=fcff:2::e|SRH SIDs=fcff:4::d46|IPv6|...| (i.c)
+# hs-4->hs-3 |IPv6 DA=fcff:1::e|SRH SIDs=fcff:3::d46|IPv6|...| (i.d)
+#
+
+# Kselftest framework requirement - SKIP code is 4.
+readonly ksft_skip=4
+
+readonly RDMSUFF="$(mktemp -u XXXXXXXX)"
+readonly VRF_TID=100
+readonly VRF_DEVNAME="vrf-${VRF_TID}"
+readonly RT2HS_DEVNAME="veth-t${VRF_TID}"
+readonly LOCALSID_TABLE_ID=90
+readonly IPv6_RT_NETWORK=fcf0:0
+readonly IPv6_HS_NETWORK=cafe
+readonly IPv4_HS_NETWORK=10.0.0
+readonly VPN_LOCATOR_SERVICE=fcff
+readonly END_FUNC=000e
+readonly DT46_FUNC=0d46
+
+PING_TIMEOUT_SEC=4
+PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no}
+
+# IDs of routers and hosts are initialized during the setup of the testing
+# network
+ROUTERS=''
+HOSTS=''
+
+SETUP_ERR=1
+
+ret=${ksft_skip}
+nsuccess=0
+nfail=0
+
+log_test()
+{
+ local rc="$1"
+ local expected="$2"
+ local msg="$3"
+
+ if [ "${rc}" -eq "${expected}" ]; then
+ nsuccess=$((nsuccess+1))
+ printf "\n TEST: %-60s [ OK ]\n" "${msg}"
+ else
+ ret=1
+ nfail=$((nfail+1))
+ printf "\n TEST: %-60s [FAIL]\n" "${msg}"
+ if [ "${PAUSE_ON_FAIL}" = "yes" ]; then
+ echo
+ echo "hit enter to continue, 'q' to quit"
+ read a
+ [ "$a" = "q" ] && exit 1
+ fi
+ fi
+}
+
+print_log_test_results()
+{
+ printf "\nTests passed: %3d\n" "${nsuccess}"
+ printf "Tests failed: %3d\n" "${nfail}"
+
+ # when a test fails, the value of 'ret' is set to 1 (error code).
+ # Conversely, when all tests are passed successfully, the 'ret' value
+ # is set to 0 (success code).
+ if [ "${ret}" -ne 1 ]; then
+ ret=0
+ fi
+}
+
+log_section()
+{
+ echo
+ echo "################################################################################"
+ echo "TEST SECTION: $*"
+ echo "################################################################################"
+}
+
+test_command_or_ksft_skip()
+{
+ local cmd="$1"
+
+ if [ ! -x "$(command -v "${cmd}")" ]; then
+ echo "SKIP: Could not run test without \"${cmd}\" tool";
+ exit "${ksft_skip}"
+ fi
+}
+
+get_nodename()
+{
+ local name="$1"
+
+ echo "${name}-${RDMSUFF}"
+}
+
+get_rtname()
+{
+ local rtid="$1"
+
+ get_nodename "rt-${rtid}"
+}
+
+get_hsname()
+{
+ local hsid="$1"
+
+ get_nodename "hs-${hsid}"
+}
+
+__create_namespace()
+{
+ local name="$1"
+
+ ip netns add "${name}"
+}
+
+create_router()
+{
+ local rtid="$1"
+ local nsname
+
+ nsname="$(get_rtname "${rtid}")"
+
+ __create_namespace "${nsname}"
+}
+
+create_host()
+{
+ local hsid="$1"
+ local nsname
+
+ nsname="$(get_hsname "${hsid}")"
+
+ __create_namespace "${nsname}"
+}
+
+cleanup()
+{
+ local nsname
+ local i
+
+ # destroy routers
+ for i in ${ROUTERS}; do
+ nsname="$(get_rtname "${i}")"
+
+ ip netns del "${nsname}" &>/dev/null || true
+ done
+
+ # destroy hosts
+ for i in ${HOSTS}; do
+ nsname="$(get_hsname "${i}")"
+
+ ip netns del "${nsname}" &>/dev/null || true
+ done
+
+ # check whether the setup phase was completed successfully or not. In
+ # case of an error during the setup phase of the testing environment,
+ # the selftest is considered as "skipped".
+ if [ "${SETUP_ERR}" -ne 0 ]; then
+ echo "SKIP: Setting up the testing environment failed"
+ exit "${ksft_skip}"
+ fi
+
+ exit "${ret}"
+}
+
+add_link_rt_pairs()
+{
+ local rt="$1"
+ local rt_neighs="$2"
+ local neigh
+ local nsname
+ local neigh_nsname
+
+ nsname="$(get_rtname "${rt}")"
+
+ for neigh in ${rt_neighs}; do
+ neigh_nsname="$(get_rtname "${neigh}")"
+
+ ip link add "veth-rt-${rt}-${neigh}" netns "${nsname}" \
+ type veth peer name "veth-rt-${neigh}-${rt}" \
+ netns "${neigh_nsname}"
+ done
+}
+
+get_network_prefix()
+{
+ local rt="$1"
+ local neigh="$2"
+ local p="${rt}"
+ local q="${neigh}"
+
+ if [ "${p}" -gt "${q}" ]; then
+ p="${q}"; q="${rt}"
+ fi
+
+ echo "${IPv6_RT_NETWORK}:${p}:${q}"
+}
+
+# Setup the basic networking for the routers
+setup_rt_networking()
+{
+ local rt="$1"
+ local rt_neighs="$2"
+ local nsname
+ local net_prefix
+ local devname
+ local neigh
+
+ nsname="$(get_rtname "${rt}")"
+
+ for neigh in ${rt_neighs}; do
+ devname="veth-rt-${rt}-${neigh}"
+
+ net_prefix="$(get_network_prefix "${rt}" "${neigh}")"
+
+ ip -netns "${nsname}" addr \
+ add "${net_prefix}::${rt}/64" dev "${devname}" nodad
+
+ ip -netns "${nsname}" link set "${devname}" up
+ done
+
+ ip -netns "${nsname}" link set lo up
+
+ ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.all.accept_dad=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.default.accept_dad=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ ip netns exec "${nsname}" sysctl -wq net.ipv4.conf.all.rp_filter=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv4.conf.default.rp_filter=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv4.ip_forward=1
+}
+
+# Setup local SIDs for an SRv6 router
+setup_rt_local_sids()
+{
+ local rt="$1"
+ local rt_neighs="$2"
+ local net_prefix
+ local devname
+ local nsname
+ local neigh
+
+ nsname="$(get_rtname "${rt}")"
+
+ for neigh in ${rt_neighs}; do
+ devname="veth-rt-${rt}-${neigh}"
+
+ net_prefix="$(get_network_prefix "${rt}" "${neigh}")"
+
+ # set underlay network routes for SIDs reachability
+ ip -netns "${nsname}" -6 route \
+ add "${VPN_LOCATOR_SERVICE}:${neigh}::/32" \
+ table "${LOCALSID_TABLE_ID}" \
+ via "${net_prefix}::${neigh}" dev "${devname}"
+ done
+
+ # Local End behavior (note that "dev" is dummy and the VRF is chosen
+ # for the sake of simplicity).
+ ip -netns "${nsname}" -6 route \
+ add "${VPN_LOCATOR_SERVICE}:${rt}::${END_FUNC}" \
+ table "${LOCALSID_TABLE_ID}" \
+ encap seg6local action End dev "${VRF_DEVNAME}"
+
+ # Local End.DT46 behavior
+ ip -netns "${nsname}" -6 route \
+ add "${VPN_LOCATOR_SERVICE}:${rt}::${DT46_FUNC}" \
+ table "${LOCALSID_TABLE_ID}" \
+ encap seg6local action End.DT46 vrftable "${VRF_TID}" \
+ dev "${VRF_DEVNAME}"
+
+ # all SIDs for VPNs start with a common locator. Routes and SRv6
+ # Endpoint behavior instaces are grouped together in the 'localsid'
+ # table.
+ ip -netns "${nsname}" -6 rule \
+ add to "${VPN_LOCATOR_SERVICE}::/16" \
+ lookup "${LOCALSID_TABLE_ID}" prio 999
+
+ # set default routes to unreachable for both ipv4 and ipv6
+ ip -netns "${nsname}" -6 route \
+ add unreachable default metric 4278198272 \
+ vrf "${VRF_DEVNAME}"
+
+ ip -netns "${nsname}" -4 route \
+ add unreachable default metric 4278198272 \
+ vrf "${VRF_DEVNAME}"
+}
+
+# build and install the SRv6 policy into the ingress SRv6 router.
+# args:
+# $1 - destination host (i.e. cafe::x host)
+# $2 - SRv6 router configured for enforcing the SRv6 Policy
+# $3 - SRv6 routers configured for steering traffic (End behaviors)
+# $4 - SRv6 router configured for removing the SRv6 Policy (router connected
+# to the destination host)
+# $5 - encap mode (full or red)
+# $6 - traffic type (IPv6 or IPv4)
+__setup_rt_policy()
+{
+ local dst="$1"
+ local encap_rt="$2"
+ local end_rts="$3"
+ local dec_rt="$4"
+ local mode="$5"
+ local traffic="$6"
+ local nsname
+ local policy=''
+ local n
+
+ nsname="$(get_rtname "${encap_rt}")"
+
+ for n in ${end_rts}; do
+ policy="${policy}${VPN_LOCATOR_SERVICE}:${n}::${END_FUNC},"
+ done
+
+ policy="${policy}${VPN_LOCATOR_SERVICE}:${dec_rt}::${DT46_FUNC}"
+
+ # add SRv6 policy to incoming traffic sent by connected hosts
+ if [ "${traffic}" -eq 6 ]; then
+ ip -netns "${nsname}" -6 route \
+ add "${IPv6_HS_NETWORK}::${dst}" vrf "${VRF_DEVNAME}" \
+ encap seg6 mode "${mode}" segs "${policy}" \
+ dev "${VRF_DEVNAME}"
+
+ ip -netns "${nsname}" -6 neigh \
+ add proxy "${IPv6_HS_NETWORK}::${dst}" \
+ dev "${RT2HS_DEVNAME}"
+ else
+ # "dev" must be different from the one where the packet is
+ # received, otherwise the proxy arp does not work.
+ ip -netns "${nsname}" -4 route \
+ add "${IPv4_HS_NETWORK}.${dst}" vrf "${VRF_DEVNAME}" \
+ encap seg6 mode "${mode}" segs "${policy}" \
+ dev "${VRF_DEVNAME}"
+ fi
+}
+
+# see __setup_rt_policy
+setup_rt_policy_ipv6()
+{
+ __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 6
+}
+
+#see __setup_rt_policy
+setup_rt_policy_ipv4()
+{
+ __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 4
+}
+
+setup_hs()
+{
+ local hs="$1"
+ local rt="$2"
+ local hsname
+ local rtname
+
+ hsname="$(get_hsname "${hs}")"
+ rtname="$(get_rtname "${rt}")"
+
+ ip netns exec "${hsname}" sysctl -wq net.ipv6.conf.all.accept_dad=0
+ ip netns exec "${hsname}" sysctl -wq net.ipv6.conf.default.accept_dad=0
+
+ ip -netns "${hsname}" link add veth0 type veth \
+ peer name "${RT2HS_DEVNAME}" netns "${rtname}"
+
+ ip -netns "${hsname}" addr \
+ add "${IPv6_HS_NETWORK}::${hs}/64" dev veth0 nodad
+ ip -netns "${hsname}" addr add "${IPv4_HS_NETWORK}.${hs}/24" dev veth0
+
+ ip -netns "${hsname}" link set veth0 up
+ ip -netns "${hsname}" link set lo up
+
+ # configure the VRF on the router which is directly connected to the
+ # source host.
+ ip -netns "${rtname}" link \
+ add "${VRF_DEVNAME}" type vrf table "${VRF_TID}"
+ ip -netns "${rtname}" link set "${VRF_DEVNAME}" up
+
+ # enslave the veth interface connecting the router with the host to the
+ # VRF in the access router
+ ip -netns "${rtname}" link \
+ set "${RT2HS_DEVNAME}" master "${VRF_DEVNAME}"
+
+ ip -netns "${rtname}" addr \
+ add "${IPv6_HS_NETWORK}::254/64" dev "${RT2HS_DEVNAME}" nodad
+ ip -netns "${rtname}" addr \
+ add "${IPv4_HS_NETWORK}.254/24" dev "${RT2HS_DEVNAME}"
+
+ ip -netns "${rtname}" link set "${RT2HS_DEVNAME}" up
+
+ ip netns exec "${rtname}" \
+ sysctl -wq net.ipv6.conf."${RT2HS_DEVNAME}".proxy_ndp=1
+ ip netns exec "${rtname}" \
+ sysctl -wq net.ipv4.conf."${RT2HS_DEVNAME}".proxy_arp=1
+
+ # disable the rp_filter otherwise the kernel gets confused about how
+ # to route decap ipv4 packets.
+ ip netns exec "${rtname}" \
+ sysctl -wq net.ipv4.conf."${RT2HS_DEVNAME}".rp_filter=0
+
+ ip netns exec "${rtname}" sh -c "echo 1 > /proc/sys/net/vrf/strict_mode"
+}
+
+setup()
+{
+ local i
+
+ # create routers
+ ROUTERS="1 2 3 4"; readonly ROUTERS
+ for i in ${ROUTERS}; do
+ create_router "${i}"
+ done
+
+ # create hosts
+ HOSTS="1 2 3 4"; readonly HOSTS
+ for i in ${HOSTS}; do
+ create_host "${i}"
+ done
+
+ # set up the links for connecting routers
+ add_link_rt_pairs 1 "2 3 4"
+ add_link_rt_pairs 2 "3 4"
+ add_link_rt_pairs 3 "4"
+
+ # set up the basic connectivity of routers and routes required for
+ # reachability of SIDs.
+ setup_rt_networking 1 "2 3 4"
+ setup_rt_networking 2 "1 3 4"
+ setup_rt_networking 3 "1 2 4"
+ setup_rt_networking 4 "1 2 3"
+
+ # set up the hosts connected to routers
+ setup_hs 1 1
+ setup_hs 2 2
+ setup_hs 3 3
+ setup_hs 4 4
+
+ # set up default SRv6 Endpoints (i.e. SRv6 End and SRv6 End.DT46)
+ setup_rt_local_sids 1 "2 3 4"
+ setup_rt_local_sids 2 "1 3 4"
+ setup_rt_local_sids 3 "1 2 4"
+ setup_rt_local_sids 4 "1 2 3"
+
+ # set up SRv6 policies
+
+ # create an IPv6 VPN between hosts hs-1 and hs-2.
+ # the network path between hs-1 and hs-2 traverses several routers
+ # depending on the direction of traffic.
+ #
+ # Direction hs-1 -> hs-2 (H.Encaps.Red)
+ # - rt-3,rt-4 (SRv6 End behaviors)
+ # - rt-2 (SRv6 End.DT46 behavior)
+ #
+ # Direction hs-2 -> hs-1 (H.Encaps.Red)
+ # - rt-1 (SRv6 End.DT46 behavior)
+ setup_rt_policy_ipv6 2 1 "3 4" 2 encap.red
+ setup_rt_policy_ipv6 1 2 "" 1 encap.red
+
+ # create an IPv4 VPN between hosts hs-1 and hs-2
+ # the network path between hs-1 and hs-2 traverses several routers
+ # depending on the direction of traffic.
+ #
+ # Direction hs-1 -> hs-2 (H.Encaps.Red)
+ # - rt-2 (SRv6 End.DT46 behavior)
+ #
+ # Direction hs-2 -> hs-1 (H.Encaps.Red)
+ # - rt-4,rt-3 (SRv6 End behaviors)
+ # - rt-1 (SRv6 End.DT46 behavior)
+ setup_rt_policy_ipv4 2 1 "" 2 encap.red
+ setup_rt_policy_ipv4 1 2 "4 3" 1 encap.red
+
+ # create an IPv6 VPN between hosts hs-3 and hs-4
+ # the network path between hs-3 and hs-4 traverses several routers
+ # depending on the direction of traffic.
+ #
+ # Direction hs-3 -> hs-4 (H.Encaps.Red)
+ # - rt-2 (SRv6 End Behavior)
+ # - rt-4 (SRv6 End.DT46 behavior)
+ #
+ # Direction hs-4 -> hs-3 (H.Encaps.Red)
+ # - rt-1 (SRv6 End behavior)
+ # - rt-3 (SRv6 End.DT46 behavior)
+ setup_rt_policy_ipv6 4 3 "2" 4 encap.red
+ setup_rt_policy_ipv6 3 4 "1" 3 encap.red
+
+ # testing environment was set up successfully
+ SETUP_ERR=0
+}
+
+check_rt_connectivity()
+{
+ local rtsrc="$1"
+ local rtdst="$2"
+ local prefix
+ local rtsrc_nsname
+
+ rtsrc_nsname="$(get_rtname "${rtsrc}")"
+
+ prefix="$(get_network_prefix "${rtsrc}" "${rtdst}")"
+
+ ip netns exec "${rtsrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \
+ "${prefix}::${rtdst}" >/dev/null 2>&1
+}
+
+check_and_log_rt_connectivity()
+{
+ local rtsrc="$1"
+ local rtdst="$2"
+
+ check_rt_connectivity "${rtsrc}" "${rtdst}"
+ log_test $? 0 "Routers connectivity: rt-${rtsrc} -> rt-${rtdst}"
+}
+
+check_hs_ipv6_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+ local hssrc_nsname
+
+ hssrc_nsname="$(get_hsname "${hssrc}")"
+
+ ip netns exec "${hssrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \
+ "${IPv6_HS_NETWORK}::${hsdst}" >/dev/null 2>&1
+}
+
+check_hs_ipv4_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+ local hssrc_nsname
+
+ hssrc_nsname="$(get_hsname "${hssrc}")"
+
+ ip netns exec "${hssrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \
+ "${IPv4_HS_NETWORK}.${hsdst}" >/dev/null 2>&1
+}
+
+check_and_log_hs2gw_connectivity()
+{
+ local hssrc="$1"
+
+ check_hs_ipv6_connectivity "${hssrc}" 254
+ log_test $? 0 "IPv6 Hosts connectivity: hs-${hssrc} -> gw"
+
+ check_hs_ipv4_connectivity "${hssrc}" 254
+ log_test $? 0 "IPv4 Hosts connectivity: hs-${hssrc} -> gw"
+}
+
+check_and_log_hs_ipv6_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_hs_ipv6_connectivity "${hssrc}" "${hsdst}"
+ log_test $? 0 "IPv6 Hosts connectivity: hs-${hssrc} -> hs-${hsdst}"
+}
+
+check_and_log_hs_ipv4_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_hs_ipv4_connectivity "${hssrc}" "${hsdst}"
+ log_test $? 0 "IPv4 Hosts connectivity: hs-${hssrc} -> hs-${hsdst}"
+}
+
+check_and_log_hs_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_and_log_hs_ipv4_connectivity "${hssrc}" "${hsdst}"
+ check_and_log_hs_ipv6_connectivity "${hssrc}" "${hsdst}"
+}
+
+check_and_log_hs_ipv6_isolation()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ # in this case, the connectivity test must fail
+ check_hs_ipv6_connectivity "${hssrc}" "${hsdst}"
+ log_test $? 1 "IPv6 Hosts isolation: hs-${hssrc} -X-> hs-${hsdst}"
+}
+
+check_and_log_hs_ipv4_isolation()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ # in this case, the connectivity test must fail
+ check_hs_ipv4_connectivity "${hssrc}" "${hsdst}"
+ log_test $? 1 "IPv4 Hosts isolation: hs-${hssrc} -X-> hs-${hsdst}"
+}
+
+check_and_log_hs_isolation()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_and_log_hs_ipv6_isolation "${hssrc}" "${hsdst}"
+ check_and_log_hs_ipv4_isolation "${hssrc}" "${hsdst}"
+}
+
+router_tests()
+{
+ local i
+ local j
+
+ log_section "IPv6 routers connectivity test"
+
+ for i in ${ROUTERS}; do
+ for j in ${ROUTERS}; do
+ if [ "${i}" -eq "${j}" ]; then
+ continue
+ fi
+
+ check_and_log_rt_connectivity "${i}" "${j}"
+ done
+ done
+}
+
+host2gateway_tests()
+{
+ local hs
+
+ log_section "IPv4/IPv6 connectivity test among hosts and gateways"
+
+ for hs in ${HOSTS}; do
+ check_and_log_hs2gw_connectivity "${hs}"
+ done
+}
+
+host_vpn_tests()
+{
+ log_section "SRv6 VPN connectivity test hosts (h1 <-> h2, IPv4/IPv6)"
+
+ check_and_log_hs_connectivity 1 2
+ check_and_log_hs_connectivity 2 1
+
+ log_section "SRv6 VPN connectivity test hosts (h3 <-> h4, IPv6 only)"
+
+ check_and_log_hs_ipv6_connectivity 3 4
+ check_and_log_hs_ipv6_connectivity 4 3
+}
+
+host_vpn_isolation_tests()
+{
+ local l1="1 2"
+ local l2="3 4"
+ local tmp
+ local i
+ local j
+ local k
+
+ log_section "SRv6 VPN isolation test among hosts"
+
+ for k in 0 1; do
+ for i in ${l1}; do
+ for j in ${l2}; do
+ check_and_log_hs_isolation "${i}" "${j}"
+ done
+ done
+
+ # let us test the reverse path
+ tmp="${l1}"; l1="${l2}"; l2="${tmp}"
+ done
+
+ log_section "SRv6 VPN isolation test among hosts (h2 <-> h4, IPv4 only)"
+
+ check_and_log_hs_ipv4_isolation 2 4
+ check_and_log_hs_ipv4_isolation 4 2
+}
+
+test_iproute2_supp_or_ksft_skip()
+{
+ if ! ip route help 2>&1 | grep -qo "encap.red"; then
+ echo "SKIP: Missing SRv6 encap.red support in iproute2"
+ exit "${ksft_skip}"
+ fi
+}
+
+test_vrf_or_ksft_skip()
+{
+ modprobe vrf &>/dev/null || true
+ if [ ! -e /proc/sys/net/vrf/strict_mode ]; then
+ echo "SKIP: vrf sysctl does not exist"
+ exit "${ksft_skip}"
+ fi
+}
+
+if [ "$(id -u)" -ne 0 ]; then
+ echo "SKIP: Need root privileges"
+ exit "${ksft_skip}"
+fi
+
+# required programs to carry out this selftest
+test_command_or_ksft_skip ip
+test_command_or_ksft_skip ping
+test_command_or_ksft_skip sysctl
+test_command_or_ksft_skip grep
+
+test_iproute2_supp_or_ksft_skip
+test_vrf_or_ksft_skip
+
+set -e
+trap cleanup EXIT
+
+setup
+set +e
+
+router_tests
+host2gateway_tests
+host_vpn_tests
+host_vpn_isolation_tests
+
+print_log_test_results
diff --git a/tools/testing/selftests/net/srv6_hl2encap_red_l2vpn_test.sh b/tools/testing/selftests/net/srv6_hl2encap_red_l2vpn_test.sh
new file mode 100755
index 000000000000..cb4177d41b21
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_hl2encap_red_l2vpn_test.sh
@@ -0,0 +1,821 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# author: Andrea Mayer <andrea.mayer@uniroma2.it>
+#
+# This script is designed for testing the SRv6 H.L2Encaps.Red behavior.
+#
+# Below is depicted the IPv6 network of an operator which offers L2 VPN
+# services to hosts, enabling them to communicate with each other.
+# In this example, hosts hs-1 and hs-2 are connected through an L2 VPN service.
+# Currently, the SRv6 subsystem in Linux allows hosts hs-1 and hs-2 to exchange
+# full L2 frames as long as they carry IPv4/IPv6.
+#
+# Routers rt-1,rt-2,rt-3 and rt-4 implement L2 VPN services
+# leveraging the SRv6 architecture. The key components for such VPNs are:
+#
+# i) The SRv6 H.L2Encaps.Red behavior applies SRv6 Policies on traffic
+# received by connected hosts, initiating the VPN tunnel. Such a behavior
+# is an optimization of the SRv6 H.L2Encap aiming to reduce the
+# length of the SID List carried in the pushed SRH. Specifically, the
+# H.L2Encaps.Red removes the first SID contained in the SID List (i.e. SRv6
+# Policy) by storing it into the IPv6 Destination Address. When a SRv6
+# Policy is made of only one SID, the SRv6 H.L2Encaps.Red behavior omits
+# the SRH at all and pushes that SID directly into the IPv6 DA;
+#
+# ii) The SRv6 End behavior advances the active SID in the SID List
+# carried by the SRH;
+#
+# iii) The SRv6 End.DX2 behavior is used for removing the SRv6 Policy
+# and, thus, it terminates the VPN tunnel. The decapsulated L2 frame is
+# sent over the interface connected with the destination host.
+#
+# cafe::1 cafe::2
+# 10.0.0.1 10.0.0.2
+# +--------+ +--------+
+# | | | |
+# | hs-1 | | hs-2 |
+# | | | |
+# +---+----+ +--- +---+
+# cafe::/64 | | cafe::/64
+# 10.0.0.0/24 | | 10.0.0.0/24
+# +---+----+ +----+---+
+# | | fcf0:0:1:2::/64 | |
+# | rt-1 +-------------------+ rt-2 |
+# | | | |
+# +---+----+ +----+---+
+# | . . |
+# | fcf0:0:1:3::/64 . |
+# | . . |
+# | . . |
+# fcf0:0:1:4::/64 | . | fcf0:0:2:3::/64
+# | . . |
+# | . . |
+# | fcf0:0:2:4::/64 . |
+# | . . |
+# +---+----+ +----+---+
+# | | | |
+# | rt-4 +-------------------+ rt-3 |
+# | | fcf0:0:3:4::/64 | |
+# +---+----+ +----+---+
+#
+#
+# Every fcf0:0:x:y::/64 network interconnects the SRv6 routers rt-x with rt-y
+# in the IPv6 operator network.
+#
+# Local SID table
+# ===============
+#
+# Each SRv6 router is configured with a Local SID table in which SIDs are
+# stored. Considering the given SRv6 router rt-x, at least two SIDs are
+# configured in the Local SID table:
+#
+# Local SID table for SRv6 router rt-x
+# +----------------------------------------------------------+
+# |fcff:x::e is associated with the SRv6 End behavior |
+# |fcff:x::d2 is associated with the SRv6 End.DX2 behavior |
+# +----------------------------------------------------------+
+#
+# The fcff::/16 prefix is reserved by the operator for implementing SRv6 VPN
+# services. Reachability of SIDs is ensured by proper configuration of the IPv6
+# operator's network and SRv6 routers.
+#
+# SRv6 Policies
+# =============
+#
+# An SRv6 ingress router applies SRv6 policies to the traffic received from a
+# connected host. SRv6 policy enforcement consists of encapsulating the
+# received traffic into a new IPv6 packet with a given SID List contained in
+# the SRH.
+#
+# L2 VPN between hs-1 and hs-2
+# ----------------------------
+#
+# Hosts hs-1 and hs-2 are connected using a dedicated L2 VPN.
+# Specifically, packets generated from hs-1 and directed towards hs-2 are
+# handled by rt-1 which applies the following SRv6 Policies:
+#
+# i.a) L2 traffic, SID List=fcff:2::d2
+#
+# Policy (i.a) steers tunneled L2 traffic through SRv6 router rt-2.
+# The H.L2Encaps.Red omits the presence of SRH at all, since the SID List
+# consists of only one SID (fcff:2::d2) that can be stored directly in the IPv6
+# DA.
+#
+# On the reverse path (i.e. from hs-2 to hs-1), rt-2 applies the following
+# policies:
+#
+# i.b) L2 traffic, SID List=fcff:4::e,fcff:3::e,fcff:1::d2
+#
+# Policy (i.b) steers tunneled L2 traffic through the SRv6 routers
+# rt-4,rt-3,rt2. The H.L2Encaps.Red reduces the SID List in the SRH by removing
+# the first SID (fcff:4::e) and pushing it into the IPv6 DA.
+#
+# In summary:
+# hs-1->hs-2 |IPv6 DA=fcff:2::d2|eth|...| (i.a)
+# hs-2->hs-1 |IPv6 DA=fcff:4::e|SRH SIDs=fcff:3::e,fcff:1::d2|eth|...| (i.b)
+#
+
+# Kselftest framework requirement - SKIP code is 4.
+readonly ksft_skip=4
+
+readonly RDMSUFF="$(mktemp -u XXXXXXXX)"
+readonly DUMMY_DEVNAME="dum0"
+readonly RT2HS_DEVNAME="veth-hs"
+readonly HS_VETH_NAME="veth0"
+readonly LOCALSID_TABLE_ID=90
+readonly IPv6_RT_NETWORK=fcf0:0
+readonly IPv6_HS_NETWORK=cafe
+readonly IPv4_HS_NETWORK=10.0.0
+readonly VPN_LOCATOR_SERVICE=fcff
+readonly MAC_PREFIX=00:00:00:c0:01
+readonly END_FUNC=000e
+readonly DX2_FUNC=00d2
+
+PING_TIMEOUT_SEC=4
+PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no}
+
+# IDs of routers and hosts are initialized during the setup of the testing
+# network
+ROUTERS=''
+HOSTS=''
+
+SETUP_ERR=1
+
+ret=${ksft_skip}
+nsuccess=0
+nfail=0
+
+log_test()
+{
+ local rc="$1"
+ local expected="$2"
+ local msg="$3"
+
+ if [ "${rc}" -eq "${expected}" ]; then
+ nsuccess=$((nsuccess+1))
+ printf "\n TEST: %-60s [ OK ]\n" "${msg}"
+ else
+ ret=1
+ nfail=$((nfail+1))
+ printf "\n TEST: %-60s [FAIL]\n" "${msg}"
+ if [ "${PAUSE_ON_FAIL}" = "yes" ]; then
+ echo
+ echo "hit enter to continue, 'q' to quit"
+ read a
+ [ "$a" = "q" ] && exit 1
+ fi
+ fi
+}
+
+print_log_test_results()
+{
+ printf "\nTests passed: %3d\n" "${nsuccess}"
+ printf "Tests failed: %3d\n" "${nfail}"
+
+ # when a test fails, the value of 'ret' is set to 1 (error code).
+ # Conversely, when all tests are passed successfully, the 'ret' value
+ # is set to 0 (success code).
+ if [ "${ret}" -ne 1 ]; then
+ ret=0
+ fi
+}
+
+log_section()
+{
+ echo
+ echo "################################################################################"
+ echo "TEST SECTION: $*"
+ echo "################################################################################"
+}
+
+test_command_or_ksft_skip()
+{
+ local cmd="$1"
+
+ if [ ! -x "$(command -v "${cmd}")" ]; then
+ echo "SKIP: Could not run test without \"${cmd}\" tool";
+ exit "${ksft_skip}"
+ fi
+}
+
+get_nodename()
+{
+ local name="$1"
+
+ echo "${name}-${RDMSUFF}"
+}
+
+get_rtname()
+{
+ local rtid="$1"
+
+ get_nodename "rt-${rtid}"
+}
+
+get_hsname()
+{
+ local hsid="$1"
+
+ get_nodename "hs-${hsid}"
+}
+
+__create_namespace()
+{
+ local name="$1"
+
+ ip netns add "${name}"
+}
+
+create_router()
+{
+ local rtid="$1"
+ local nsname
+
+ nsname="$(get_rtname "${rtid}")"
+
+ __create_namespace "${nsname}"
+}
+
+create_host()
+{
+ local hsid="$1"
+ local nsname
+
+ nsname="$(get_hsname "${hsid}")"
+
+ __create_namespace "${nsname}"
+}
+
+cleanup()
+{
+ local nsname
+ local i
+
+ # destroy routers
+ for i in ${ROUTERS}; do
+ nsname="$(get_rtname "${i}")"
+
+ ip netns del "${nsname}" &>/dev/null || true
+ done
+
+ # destroy hosts
+ for i in ${HOSTS}; do
+ nsname="$(get_hsname "${i}")"
+
+ ip netns del "${nsname}" &>/dev/null || true
+ done
+
+ # check whether the setup phase was completed successfully or not. In
+ # case of an error during the setup phase of the testing environment,
+ # the selftest is considered as "skipped".
+ if [ "${SETUP_ERR}" -ne 0 ]; then
+ echo "SKIP: Setting up the testing environment failed"
+ exit "${ksft_skip}"
+ fi
+
+ exit "${ret}"
+}
+
+add_link_rt_pairs()
+{
+ local rt="$1"
+ local rt_neighs="$2"
+ local neigh
+ local nsname
+ local neigh_nsname
+
+ nsname="$(get_rtname "${rt}")"
+
+ for neigh in ${rt_neighs}; do
+ neigh_nsname="$(get_rtname "${neigh}")"
+
+ ip link add "veth-rt-${rt}-${neigh}" netns "${nsname}" \
+ type veth peer name "veth-rt-${neigh}-${rt}" \
+ netns "${neigh_nsname}"
+ done
+}
+
+get_network_prefix()
+{
+ local rt="$1"
+ local neigh="$2"
+ local p="${rt}"
+ local q="${neigh}"
+
+ if [ "${p}" -gt "${q}" ]; then
+ p="${q}"; q="${rt}"
+ fi
+
+ echo "${IPv6_RT_NETWORK}:${p}:${q}"
+}
+
+# Setup the basic networking for the routers
+setup_rt_networking()
+{
+ local rt="$1"
+ local rt_neighs="$2"
+ local nsname
+ local net_prefix
+ local devname
+ local neigh
+
+ nsname="$(get_rtname "${rt}")"
+
+ for neigh in ${rt_neighs}; do
+ devname="veth-rt-${rt}-${neigh}"
+
+ net_prefix="$(get_network_prefix "${rt}" "${neigh}")"
+
+ ip -netns "${nsname}" addr \
+ add "${net_prefix}::${rt}/64" dev "${devname}" nodad
+
+ ip -netns "${nsname}" link set "${devname}" up
+ done
+
+ ip -netns "${nsname}" link add "${DUMMY_DEVNAME}" type dummy
+
+ ip -netns "${nsname}" link set "${DUMMY_DEVNAME}" up
+ ip -netns "${nsname}" link set lo up
+
+ ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.all.accept_dad=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.default.accept_dad=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ ip netns exec "${nsname}" sysctl -wq net.ipv4.conf.all.rp_filter=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv4.conf.default.rp_filter=0
+ ip netns exec "${nsname}" sysctl -wq net.ipv4.ip_forward=1
+}
+
+# Setup local SIDs for an SRv6 router
+setup_rt_local_sids()
+{
+ local rt="$1"
+ local rt_neighs="$2"
+ local net_prefix
+ local devname
+ local nsname
+ local neigh
+
+ nsname="$(get_rtname "${rt}")"
+
+ for neigh in ${rt_neighs}; do
+ devname="veth-rt-${rt}-${neigh}"
+
+ net_prefix="$(get_network_prefix "${rt}" "${neigh}")"
+
+ # set underlay network routes for SIDs reachability
+ ip -netns "${nsname}" -6 route \
+ add "${VPN_LOCATOR_SERVICE}:${neigh}::/32" \
+ table "${LOCALSID_TABLE_ID}" \
+ via "${net_prefix}::${neigh}" dev "${devname}"
+ done
+
+ # Local End behavior (note that dev "${DUMMY_DEVNAME}" is a dummy
+ # interface)
+ ip -netns "${nsname}" -6 route \
+ add "${VPN_LOCATOR_SERVICE}:${rt}::${END_FUNC}" \
+ table "${LOCALSID_TABLE_ID}" \
+ encap seg6local action End dev "${DUMMY_DEVNAME}"
+
+ # all SIDs for VPNs start with a common locator. Routes and SRv6
+ # Endpoint behaviors instaces are grouped together in the 'localsid'
+ # table.
+ ip -netns "${nsname}" -6 rule add \
+ to "${VPN_LOCATOR_SERVICE}::/16" \
+ lookup "${LOCALSID_TABLE_ID}" prio 999
+}
+
+# build and install the SRv6 policy into the ingress SRv6 router.
+# args:
+# $1 - destination host (i.e. cafe::x host)
+# $2 - SRv6 router configured for enforcing the SRv6 Policy
+# $3 - SRv6 routers configured for steering traffic (End behaviors)
+# $4 - SRv6 router configured for removing the SRv6 Policy (router connected
+# to the destination host)
+# $5 - encap mode (full or red)
+# $6 - traffic type (IPv6 or IPv4)
+__setup_rt_policy()
+{
+ local dst="$1"
+ local encap_rt="$2"
+ local end_rts="$3"
+ local dec_rt="$4"
+ local mode="$5"
+ local traffic="$6"
+ local nsname
+ local policy=''
+ local n
+
+ nsname="$(get_rtname "${encap_rt}")"
+
+ for n in ${end_rts}; do
+ policy="${policy}${VPN_LOCATOR_SERVICE}:${n}::${END_FUNC},"
+ done
+
+ policy="${policy}${VPN_LOCATOR_SERVICE}:${dec_rt}::${DX2_FUNC}"
+
+ # add SRv6 policy to incoming traffic sent by connected hosts
+ if [ "${traffic}" -eq 6 ]; then
+ ip -netns "${nsname}" -6 route \
+ add "${IPv6_HS_NETWORK}::${dst}" \
+ encap seg6 mode "${mode}" segs "${policy}" \
+ dev dum0
+ else
+ ip -netns "${nsname}" -4 route \
+ add "${IPv4_HS_NETWORK}.${dst}" \
+ encap seg6 mode "${mode}" segs "${policy}" \
+ dev dum0
+ fi
+}
+
+# see __setup_rt_policy
+setup_rt_policy_ipv6()
+{
+ __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 6
+}
+
+#see __setup_rt_policy
+setup_rt_policy_ipv4()
+{
+ __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 4
+}
+
+setup_decap()
+{
+ local rt="$1"
+ local nsname
+
+ nsname="$(get_rtname "${rt}")"
+
+ # Local End.DX2 behavior
+ ip -netns "${nsname}" -6 route \
+ add "${VPN_LOCATOR_SERVICE}:${rt}::${DX2_FUNC}" \
+ table "${LOCALSID_TABLE_ID}" \
+ encap seg6local action End.DX2 oif "${RT2HS_DEVNAME}" \
+ dev "${RT2HS_DEVNAME}"
+}
+
+setup_hs()
+{
+ local hs="$1"
+ local rt="$2"
+ local hsname
+ local rtname
+
+ hsname="$(get_hsname "${hs}")"
+ rtname="$(get_rtname "${rt}")"
+
+ ip netns exec "${hsname}" sysctl -wq net.ipv6.conf.all.accept_dad=0
+ ip netns exec "${hsname}" sysctl -wq net.ipv6.conf.default.accept_dad=0
+
+ ip -netns "${hsname}" link add "${HS_VETH_NAME}" type veth \
+ peer name "${RT2HS_DEVNAME}" netns "${rtname}"
+
+ ip -netns "${hsname}" addr add "${IPv6_HS_NETWORK}::${hs}/64" \
+ dev "${HS_VETH_NAME}" nodad
+ ip -netns "${hsname}" addr add "${IPv4_HS_NETWORK}.${hs}/24" \
+ dev "${HS_VETH_NAME}"
+
+ ip -netns "${hsname}" link set "${HS_VETH_NAME}" up
+ ip -netns "${hsname}" link set lo up
+
+ ip -netns "${rtname}" addr add "${IPv6_HS_NETWORK}::254/64" \
+ dev "${RT2HS_DEVNAME}" nodad
+ ip -netns "${rtname}" addr \
+ add "${IPv4_HS_NETWORK}.254/24" dev "${RT2HS_DEVNAME}"
+
+ ip -netns "${rtname}" link set "${RT2HS_DEVNAME}" up
+
+ # disable the rp_filter otherwise the kernel gets confused about how
+ # to route decap ipv4 packets.
+ ip netns exec "${rtname}" \
+ sysctl -wq net.ipv4.conf."${RT2HS_DEVNAME}".rp_filter=0
+}
+
+# set an auto-generated mac address
+# args:
+# $1 - name of the node (e.g.: hs-1, rt-3, etc)
+# $2 - id of the node (e.g.: 1 for hs-1, 3 for rt-3, etc)
+# $3 - host part of the IPv6 network address
+# $4 - name of the network interface to which the generated mac address must
+# be set.
+set_mac_address()
+{
+ local nodename="$1"
+ local nodeid="$2"
+ local host="$3"
+ local ifname="$4"
+ local nsname
+
+ nsname=$(get_nodename "${nodename}")
+
+ ip -netns "${nsname}" link set dev "${ifname}" down
+
+ ip -netns "${nsname}" link set address "${MAC_PREFIX}:${nodeid}" \
+ dev "${ifname}"
+
+ # the IPv6 address must be set once again after the MAC address has
+ # been changed.
+ ip -netns "${nsname}" addr add "${IPv6_HS_NETWORK}::${host}/64" \
+ dev "${ifname}" nodad
+
+ ip -netns "${nsname}" link set dev "${ifname}" up
+}
+
+set_host_l2peer()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+ local ipprefix="$3"
+ local proto="$4"
+ local hssrc_name
+ local ipaddr
+
+ hssrc_name="$(get_hsname "${hssrc}")"
+
+ if [ "${proto}" -eq 6 ]; then
+ ipaddr="${ipprefix}::${hsdst}"
+ else
+ ipaddr="${ipprefix}.${hsdst}"
+ fi
+
+ ip -netns "${hssrc_name}" route add "${ipaddr}" dev "${HS_VETH_NAME}"
+
+ ip -netns "${hssrc_name}" neigh \
+ add "${ipaddr}" lladdr "${MAC_PREFIX}:${hsdst}" \
+ dev "${HS_VETH_NAME}"
+}
+
+# setup an SRv6 L2 VPN between host hs-x and hs-y (currently, the SRv6
+# subsystem only supports L2 frames whose layer-3 is IPv4/IPv6).
+# args:
+# $1 - source host
+# $2 - SRv6 routers configured for steering tunneled traffic
+# $3 - destination host
+setup_l2vpn()
+{
+ local hssrc="$1"
+ local end_rts="$2"
+ local hsdst="$3"
+ local rtsrc="${hssrc}"
+ local rtdst="${hsdst}"
+
+ # set fixed mac for source node and the neigh MAC address
+ set_mac_address "hs-${hssrc}" "${hssrc}" "${hssrc}" "${HS_VETH_NAME}"
+ set_host_l2peer "${hssrc}" "${hsdst}" "${IPv6_HS_NETWORK}" 6
+ set_host_l2peer "${hssrc}" "${hsdst}" "${IPv4_HS_NETWORK}" 4
+
+ # we have to set the mac address of the veth-host (on ingress router)
+ # to the mac address of the remote peer (L2 VPN destination host).
+ # Otherwise, traffic coming from the source host is dropped at the
+ # ingress router.
+ set_mac_address "rt-${rtsrc}" "${hsdst}" 254 "${RT2HS_DEVNAME}"
+
+ # set the SRv6 Policies at the ingress router
+ setup_rt_policy_ipv6 "${hsdst}" "${rtsrc}" "${end_rts}" "${rtdst}" \
+ l2encap.red 6
+ setup_rt_policy_ipv4 "${hsdst}" "${rtsrc}" "${end_rts}" "${rtdst}" \
+ l2encap.red 4
+
+ # set the decap behavior
+ setup_decap "${rtsrc}"
+}
+
+setup()
+{
+ local i
+
+ # create routers
+ ROUTERS="1 2 3 4"; readonly ROUTERS
+ for i in ${ROUTERS}; do
+ create_router "${i}"
+ done
+
+ # create hosts
+ HOSTS="1 2"; readonly HOSTS
+ for i in ${HOSTS}; do
+ create_host "${i}"
+ done
+
+ # set up the links for connecting routers
+ add_link_rt_pairs 1 "2 3 4"
+ add_link_rt_pairs 2 "3 4"
+ add_link_rt_pairs 3 "4"
+
+ # set up the basic connectivity of routers and routes required for
+ # reachability of SIDs.
+ setup_rt_networking 1 "2 3 4"
+ setup_rt_networking 2 "1 3 4"
+ setup_rt_networking 3 "1 2 4"
+ setup_rt_networking 4 "1 2 3"
+
+ # set up the hosts connected to routers
+ setup_hs 1 1
+ setup_hs 2 2
+
+ # set up default SRv6 Endpoints (i.e. SRv6 End and SRv6 End.DX2)
+ setup_rt_local_sids 1 "2 3 4"
+ setup_rt_local_sids 2 "1 3 4"
+ setup_rt_local_sids 3 "1 2 4"
+ setup_rt_local_sids 4 "1 2 3"
+
+ # create a L2 VPN between hs-1 and hs-2.
+ # NB: currently, H.L2Encap* enables tunneling of L2 frames whose
+ # layer-3 is IPv4/IPv6.
+ #
+ # the network path between hs-1 and hs-2 traverses several routers
+ # depending on the direction of traffic.
+ #
+ # Direction hs-1 -> hs-2 (H.L2Encaps.Red)
+ # - rt-2 (SRv6 End.DX2 behavior)
+ #
+ # Direction hs-2 -> hs-1 (H.L2Encaps.Red)
+ # - rt-4,rt-3 (SRv6 End behaviors)
+ # - rt-1 (SRv6 End.DX2 behavior)
+ setup_l2vpn 1 "" 2
+ setup_l2vpn 2 "4 3" 1
+
+ # testing environment was set up successfully
+ SETUP_ERR=0
+}
+
+check_rt_connectivity()
+{
+ local rtsrc="$1"
+ local rtdst="$2"
+ local prefix
+ local rtsrc_nsname
+
+ rtsrc_nsname="$(get_rtname "${rtsrc}")"
+
+ prefix="$(get_network_prefix "${rtsrc}" "${rtdst}")"
+
+ ip netns exec "${rtsrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \
+ "${prefix}::${rtdst}" >/dev/null 2>&1
+}
+
+check_and_log_rt_connectivity()
+{
+ local rtsrc="$1"
+ local rtdst="$2"
+
+ check_rt_connectivity "${rtsrc}" "${rtdst}"
+ log_test $? 0 "Routers connectivity: rt-${rtsrc} -> rt-${rtdst}"
+}
+
+check_hs_ipv6_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+ local hssrc_nsname
+
+ hssrc_nsname="$(get_hsname "${hssrc}")"
+
+ ip netns exec "${hssrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \
+ "${IPv6_HS_NETWORK}::${hsdst}" >/dev/null 2>&1
+}
+
+check_hs_ipv4_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+ local hssrc_nsname
+
+ hssrc_nsname="$(get_hsname "${hssrc}")"
+
+ ip netns exec "${hssrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \
+ "${IPv4_HS_NETWORK}.${hsdst}" >/dev/null 2>&1
+}
+
+check_and_log_hs2gw_connectivity()
+{
+ local hssrc="$1"
+
+ check_hs_ipv6_connectivity "${hssrc}" 254
+ log_test $? 0 "IPv6 Hosts connectivity: hs-${hssrc} -> gw"
+
+ check_hs_ipv4_connectivity "${hssrc}" 254
+ log_test $? 0 "IPv4 Hosts connectivity: hs-${hssrc} -> gw"
+}
+
+check_and_log_hs_ipv6_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_hs_ipv6_connectivity "${hssrc}" "${hsdst}"
+ log_test $? 0 "IPv6 Hosts connectivity: hs-${hssrc} -> hs-${hsdst}"
+}
+
+check_and_log_hs_ipv4_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_hs_ipv4_connectivity "${hssrc}" "${hsdst}"
+ log_test $? 0 "IPv4 Hosts connectivity: hs-${hssrc} -> hs-${hsdst}"
+}
+
+check_and_log_hs_connectivity()
+{
+ local hssrc="$1"
+ local hsdst="$2"
+
+ check_and_log_hs_ipv4_connectivity "${hssrc}" "${hsdst}"
+ check_and_log_hs_ipv6_connectivity "${hssrc}" "${hsdst}"
+}
+
+router_tests()
+{
+ local i
+ local j
+
+ log_section "IPv6 routers connectivity test"
+
+ for i in ${ROUTERS}; do
+ for j in ${ROUTERS}; do
+ if [ "${i}" -eq "${j}" ]; then
+ continue
+ fi
+
+ check_and_log_rt_connectivity "${i}" "${j}"
+ done
+ done
+}
+
+host2gateway_tests()
+{
+ local hs
+
+ log_section "IPv4/IPv6 connectivity test among hosts and gateways"
+
+ for hs in ${HOSTS}; do
+ check_and_log_hs2gw_connectivity "${hs}"
+ done
+}
+
+host_vpn_tests()
+{
+ log_section "SRv6 L2 VPN connectivity test hosts (h1 <-> h2)"
+
+ check_and_log_hs_connectivity 1 2
+ check_and_log_hs_connectivity 2 1
+}
+
+test_dummy_dev_or_ksft_skip()
+{
+ local test_netns
+
+ test_netns="dummy-$(mktemp -u XXXXXXXX)"
+
+ if ! ip netns add "${test_netns}"; then
+ echo "SKIP: Cannot set up netns for testing dummy dev support"
+ exit "${ksft_skip}"
+ fi
+
+ modprobe dummy &>/dev/null || true
+ if ! ip -netns "${test_netns}" link \
+ add "${DUMMY_DEVNAME}" type dummy; then
+ echo "SKIP: dummy dev not supported"
+
+ ip netns del "${test_netns}"
+ exit "${ksft_skip}"
+ fi
+
+ ip netns del "${test_netns}"
+}
+
+test_iproute2_supp_or_ksft_skip()
+{
+ if ! ip route help 2>&1 | grep -qo "l2encap.red"; then
+ echo "SKIP: Missing SRv6 l2encap.red support in iproute2"
+ exit "${ksft_skip}"
+ fi
+}
+
+if [ "$(id -u)" -ne 0 ]; then
+ echo "SKIP: Need root privileges"
+ exit "${ksft_skip}"
+fi
+
+# required programs to carry out this selftest
+test_command_or_ksft_skip ip
+test_command_or_ksft_skip ping
+test_command_or_ksft_skip sysctl
+test_command_or_ksft_skip grep
+
+test_iproute2_supp_or_ksft_skip
+test_dummy_dev_or_ksft_skip
+
+set -e
+trap cleanup EXIT
+
+setup
+set +e
+
+router_tests
+host2gateway_tests
+host_vpn_tests
+
+print_log_test_results
diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 5d70b04c482c..2cbb12736596 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -235,6 +235,7 @@ FIXTURE_VARIANT(tls)
{
uint16_t tls_version;
uint16_t cipher_type;
+ bool nopad;
};
FIXTURE_VARIANT_ADD(tls, 12_aes_gcm)
@@ -297,9 +298,17 @@ FIXTURE_VARIANT_ADD(tls, 13_aes_gcm_256)
.cipher_type = TLS_CIPHER_AES_GCM_256,
};
+FIXTURE_VARIANT_ADD(tls, 13_nopad)
+{
+ .tls_version = TLS_1_3_VERSION,
+ .cipher_type = TLS_CIPHER_AES_GCM_128,
+ .nopad = true,
+};
+
FIXTURE_SETUP(tls)
{
struct tls_crypto_info_keys tls12;
+ int one = 1;
int ret;
tls_crypto_info_init(variant->tls_version, variant->cipher_type,
@@ -315,6 +324,12 @@ FIXTURE_SETUP(tls)
ret = setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
+
+ if (variant->nopad) {
+ ret = setsockopt(self->cfd, SOL_TLS, TLS_RX_EXPECT_NO_PAD,
+ (void *)&one, sizeof(one));
+ ASSERT_EQ(ret, 0);
+ }
}
FIXTURE_TEARDOWN(tls)
@@ -629,12 +644,14 @@ TEST_F(tls, splice_from_pipe2)
int p2[2];
int p[2];
+ memrnd(mem_send, sizeof(mem_send));
+
ASSERT_GE(pipe(p), 0);
ASSERT_GE(pipe(p2), 0);
- EXPECT_GE(write(p[1], mem_send, 8000), 0);
- EXPECT_GE(splice(p[0], NULL, self->fd, NULL, 8000, 0), 0);
- EXPECT_GE(write(p2[1], mem_send + 8000, 8000), 0);
- EXPECT_GE(splice(p2[0], NULL, self->fd, NULL, 8000, 0), 0);
+ EXPECT_EQ(write(p[1], mem_send, 8000), 8000);
+ EXPECT_EQ(splice(p[0], NULL, self->fd, NULL, 8000, 0), 8000);
+ EXPECT_EQ(write(p2[1], mem_send + 8000, 8000), 8000);
+ EXPECT_EQ(splice(p2[0], NULL, self->fd, NULL, 8000, 0), 8000);
EXPECT_EQ(recv(self->cfd, mem_recv, send_len, MSG_WAITALL), send_len);
EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
}
@@ -668,10 +685,12 @@ TEST_F(tls, splice_to_pipe)
char mem_recv[TLS_PAYLOAD_MAX_LEN];
int p[2];
+ memrnd(mem_send, sizeof(mem_send));
+
ASSERT_GE(pipe(p), 0);
- EXPECT_GE(send(self->fd, mem_send, send_len, 0), 0);
- EXPECT_GE(splice(self->cfd, NULL, p[1], NULL, send_len, 0), 0);
- EXPECT_GE(read(p[0], mem_recv, send_len), 0);
+ EXPECT_EQ(send(self->fd, mem_send, send_len, 0), send_len);
+ EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, send_len, 0), send_len);
+ EXPECT_EQ(read(p[0], mem_recv, send_len), send_len);
EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
}
@@ -860,6 +879,8 @@ TEST_F(tls, multiple_send_single_recv)
char recv_mem[2 * 10];
char send_mem[10];
+ memrnd(send_mem, sizeof(send_mem));
+
EXPECT_GE(send(self->fd, send_mem, send_len, 0), 0);
EXPECT_GE(send(self->fd, send_mem, send_len, 0), 0);
memset(recv_mem, 0, total_len);
@@ -876,6 +897,8 @@ TEST_F(tls, single_send_multiple_recv_non_align)
char recv_mem[recv_len * 2];
char send_mem[total_len];
+ memrnd(send_mem, sizeof(send_mem));
+
EXPECT_GE(send(self->fd, send_mem, total_len, 0), 0);
memset(recv_mem, 0, total_len);
@@ -921,10 +944,10 @@ TEST_F(tls, recv_peek)
char buf[15];
EXPECT_EQ(send(self->fd, test_str, send_len, 0), send_len);
- EXPECT_NE(recv(self->cfd, buf, send_len, MSG_PEEK), -1);
+ EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_PEEK), send_len);
EXPECT_EQ(memcmp(test_str, buf, send_len), 0);
memset(buf, 0, sizeof(buf));
- EXPECT_NE(recv(self->cfd, buf, send_len, 0), -1);
+ EXPECT_EQ(recv(self->cfd, buf, send_len, 0), send_len);
EXPECT_EQ(memcmp(test_str, buf, send_len), 0);
}
@@ -1582,6 +1605,38 @@ TEST_F(tls_err, bad_cmsg)
EXPECT_EQ(errno, EBADMSG);
}
+TEST_F(tls_err, timeo)
+{
+ struct timeval tv = { .tv_usec = 10000, };
+ char buf[128];
+ int ret;
+
+ if (self->notls)
+ SKIP(return, "no TLS support");
+
+ ret = setsockopt(self->cfd2, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
+ ASSERT_EQ(ret, 0);
+
+ ret = fork();
+ ASSERT_GE(ret, 0);
+
+ if (ret) {
+ usleep(1000); /* Give child a head start */
+
+ EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ wait(&ret);
+ } else {
+ EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
+ EXPECT_EQ(errno, EAGAIN);
+ exit(0);
+ }
+}
+
TEST(non_established) {
struct tls12_crypto_info_aes_gcm_256 tls12;
struct sockaddr_in addr;
@@ -1659,6 +1714,57 @@ TEST(keysizes) {
close(cfd);
}
+TEST(no_pad) {
+ struct tls12_crypto_info_aes_gcm_256 tls12;
+ int ret, fd, cfd, val;
+ socklen_t len;
+ bool notls;
+
+ memset(&tls12, 0, sizeof(tls12));
+ tls12.info.version = TLS_1_3_VERSION;
+ tls12.info.cipher_type = TLS_CIPHER_AES_GCM_256;
+
+ ulp_sock_pair(_metadata, &fd, &cfd, &notls);
+
+ if (notls)
+ exit(KSFT_SKIP);
+
+ ret = setsockopt(fd, SOL_TLS, TLS_TX, &tls12, sizeof(tls12));
+ EXPECT_EQ(ret, 0);
+
+ ret = setsockopt(cfd, SOL_TLS, TLS_RX, &tls12, sizeof(tls12));
+ EXPECT_EQ(ret, 0);
+
+ val = 1;
+ ret = setsockopt(cfd, SOL_TLS, TLS_RX_EXPECT_NO_PAD,
+ (void *)&val, sizeof(val));
+ EXPECT_EQ(ret, 0);
+
+ len = sizeof(val);
+ val = 2;
+ ret = getsockopt(cfd, SOL_TLS, TLS_RX_EXPECT_NO_PAD,
+ (void *)&val, &len);
+ EXPECT_EQ(ret, 0);
+ EXPECT_EQ(val, 1);
+ EXPECT_EQ(len, 4);
+
+ val = 0;
+ ret = setsockopt(cfd, SOL_TLS, TLS_RX_EXPECT_NO_PAD,
+ (void *)&val, sizeof(val));
+ EXPECT_EQ(ret, 0);
+
+ len = sizeof(val);
+ val = 2;
+ ret = getsockopt(cfd, SOL_TLS, TLS_RX_EXPECT_NO_PAD,
+ (void *)&val, &len);
+ EXPECT_EQ(ret, 0);
+ EXPECT_EQ(val, 0);
+ EXPECT_EQ(len, 4);
+
+ close(fd);
+ close(cfd);
+}
+
TEST(tls_v6ops) {
struct tls_crypto_info_keys tls12;
struct sockaddr_in6 addr, addr2;
diff --git a/tools/testing/selftests/tc-testing/.gitignore b/tools/testing/selftests/tc-testing/.gitignore
index d52f65de23b4..9fe1cef72728 100644
--- a/tools/testing/selftests/tc-testing/.gitignore
+++ b/tools/testing/selftests/tc-testing/.gitignore
@@ -1,7 +1,6 @@
# SPDX-License-Identifier: GPL-2.0-only
__pycache__/
*.pyc
-plugins/
*.xml
*.tap
tdc_config_local.py
diff --git a/tools/testing/selftests/wireguard/qemu/Makefile b/tools/testing/selftests/wireguard/qemu/Makefile
index 9700358e4337..fda76282d34b 100644
--- a/tools/testing/selftests/wireguard/qemu/Makefile
+++ b/tools/testing/selftests/wireguard/qemu/Makefile
@@ -248,8 +248,13 @@ QEMU_MACHINE := -cpu host,accel=kvm -machine s390-ccw-virtio -append $(KERNEL_CM
else
QEMU_MACHINE := -cpu max -machine s390-ccw-virtio -append $(KERNEL_CMDLINE)
endif
+else ifeq ($(ARCH),um)
+CHOST := $(HOST_ARCH)-linux-musl
+KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux
+KERNEL_ARCH := um
+KERNEL_CMDLINE := $(shell sed -n 's/CONFIG_CMDLINE=\(.*\)/\1/p' arch/um.config)
else
-$(error I only build: x86_64, i686, arm, armeb, aarch64, aarch64_be, mips, mipsel, mips64, mips64el, powerpc64, powerpc64le, powerpc, m68k, riscv64, riscv32, s390x)
+$(error I only build: x86_64, i686, arm, armeb, aarch64, aarch64_be, mips, mipsel, mips64, mips64el, powerpc64, powerpc64le, powerpc, m68k, riscv64, riscv32, s390x, um)
endif
TOOLCHAIN_FILENAME := $(CHOST)-cross.tgz
@@ -262,7 +267,9 @@ $(eval $(call file_download,$(TOOLCHAIN_FILENAME),$(TOOLCHAIN_DIR),,$(DISTFILES_
STRIP := $(CHOST)-strip
CROSS_COMPILE_FLAG := --build=$(CBUILD) --host=$(CHOST)
$(info Building for $(CHOST) using $(CBUILD))
+ifneq ($(ARCH),um)
export CROSS_COMPILE := $(CHOST)-
+endif
export PATH := $(TOOLCHAIN_PATH)/bin:$(PATH)
export CC := $(CHOST)-gcc
CCACHE_PATH := $(shell which ccache 2>/dev/null)
@@ -279,6 +286,7 @@ comma := ,
build: $(KERNEL_BZIMAGE)
qemu: $(KERNEL_BZIMAGE)
rm -f $(BUILD_PATH)/result
+ifneq ($(ARCH),um)
timeout --foreground 20m qemu-system-$(QEMU_ARCH) \
-nodefaults \
-nographic \
@@ -291,6 +299,13 @@ qemu: $(KERNEL_BZIMAGE)
-no-reboot \
-monitor none \
-kernel $<
+else
+ timeout --foreground 20m $< \
+ $(KERNEL_CMDLINE) \
+ mem=$$(grep -q CONFIG_DEBUG_KMEMLEAK=y $(KERNEL_BUILD_PATH)/.config && echo 1G || echo 256M) \
+ noreboot \
+ con1=fd:51 51>$(BUILD_PATH)/result </dev/null 2>&1 | cat
+endif
grep -Fq success $(BUILD_PATH)/result
$(BUILD_PATH)/init-cpio-spec.txt: $(TOOLCHAIN_PATH)/.installed $(BUILD_PATH)/init
diff --git a/tools/testing/selftests/wireguard/qemu/arch/um.config b/tools/testing/selftests/wireguard/qemu/arch/um.config
new file mode 100644
index 000000000000..c8b229e0810e
--- /dev/null
+++ b/tools/testing/selftests/wireguard/qemu/arch/um.config
@@ -0,0 +1,3 @@
+CONFIG_64BIT=y
+CONFIG_CMDLINE="wg.success=tty1 panic_on_warn=1"
+CONFIG_FRAME_WARN=1280
diff --git a/tools/testing/selftests/wireguard/qemu/debug.config b/tools/testing/selftests/wireguard/qemu/debug.config
index 2b321b8a96cf..9d172210e2c6 100644
--- a/tools/testing/selftests/wireguard/qemu/debug.config
+++ b/tools/testing/selftests/wireguard/qemu/debug.config
@@ -18,15 +18,12 @@ CONFIG_DEBUG_VM=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
-CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y
CONFIG_UBSAN=y
CONFIG_UBSAN_SANITIZE_ALL=y
-CONFIG_UBSAN_NULL=y
CONFIG_DEBUG_KMEMLEAK=y
-CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=8192
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_WQ_WATCHDOG=y
@@ -35,7 +32,6 @@ CONFIG_SCHED_INFO=y
CONFIG_SCHEDSTATS=y
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_DEBUG_TIMEKEEPING=y
-CONFIG_TIMER_STATS=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
@@ -49,7 +45,6 @@ CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_PLIST=y
CONFIG_PROVE_RCU=y
-CONFIG_SPARSE_RCU_POINTER=y
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_TRACE=y
CONFIG_RCU_EQS_DEBUG=y
diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config
index e1858ce7003f..ce2a04717300 100644
--- a/tools/testing/selftests/wireguard/qemu/kernel.config
+++ b/tools/testing/selftests/wireguard/qemu/kernel.config
@@ -19,7 +19,6 @@ CONFIG_NETFILTER_XTABLES=y
CONFIG_NETFILTER_XT_NAT=y
CONFIG_NETFILTER_XT_MATCH_LENGTH=y
CONFIG_NETFILTER_XT_MARK=y
-CONFIG_NF_NAT_IPV4=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_MANGLE=y