|
netdevsim currently only sets HW_TC in its hw_features, but other
features should also be present to better reflect the behavior of real
HW.
In my macsec offload testing, this shows up as HW_CSUM missing from
hw_features, so it doesn't stick in wanted_features when offload is
turned off. Then HW_CSUM (and thus TSO, thanks to netdev_fix_features)
is not automatically turned back on when offload is re-enabled.
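A minimal sketch of the change's shape (the exact feature set added
here is an assumption, not a quote of the patch):

  dev->hw_features |= NETIF_F_HW_TC | NETIF_F_SG | NETIF_F_FRAGLIST |
                      NETIF_F_HW_CSUM;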
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/b918dc4dd76410a57f7516a855f66b0a2bd58326.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is about a 24-byte binary size increase for
__page_frag_cache_refill() after the refactoring on an arm64 system
with 64K PAGE_SIZE. Gdb disassembly shows we can get more than a
100-byte decrease in binary size by using __alloc_pages() to replace
alloc_pages_node(), as the latter does some unnecessary checking for
nid being NUMA_NO_NODE, which is avoidable especially now that
page_frag is part of the mm system.
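A hedged sketch of the substitution (assuming the allocation path wants
the local memory node when no node was specified):

  /* before: alloc_pages_node() re-checks nid == NUMA_NO_NODE at runtime */
  page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);

  /* after: resolve the preferred node up front, skipping that check */
  page = __alloc_pages(gfp_mask, order, numa_mem_id(), NULL);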
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-8-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently there is one 'struct page_frag' for every 'struct sock' and
'struct task_struct'; we are about to replace the 'struct page_frag'
with 'struct page_frag_cache' for them.
Before starting the replacement, we need to ensure the size of
'struct page_frag_cache' is not bigger than the size of
'struct page_frag', as there may be tens of thousands of
'struct sock' and 'struct task_struct' instances in the system.
By or'ing the page order and pfmemalloc flag into the lower bits of
'va', instead of using a 'u16' or 'u32' for the page size and a 'u8'
for pfmemalloc, we avoid wasting 3 or 5 bytes. Since the page address,
pfmemalloc flag, and order are unchanged for the same page within the
same 'page_frag_cache' instance, it makes sense to pack them together.
After this patch, the size of 'struct page_frag_cache' should be
the same as the size of 'struct page_frag'.
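A minimal sketch of the packing idea; the macro and helper names here
are illustrative, not necessarily those used in the patch:

  /* 'va' is at least page-aligned, so its low PAGE_SHIFT bits are free
   * to carry the page order and the pfmemalloc flag.
   */
  #define ENCODED_ORDER_MASK      0x3UL   /* orders 0..3 fit in two bits */
  #define ENCODED_PFMEMALLOC_BIT  0x4UL

  static unsigned long encode_va(void *va, unsigned int order, bool pfmemalloc)
  {
          return (unsigned long)va | order |
                 (pfmemalloc ? ENCODED_PFMEMALLOC_BIT : 0);
  }

  static void *encoded_va_address(unsigned long encoded)
  {
          return (void *)(encoded & PAGE_MASK);
  }

  static unsigned int encoded_va_order(unsigned long encoded)
  {
          return encoded & ENCODED_ORDER_MASK;
  }

  static bool encoded_va_pfmemalloc(unsigned long encoded)
  {
          return encoded & ENCODED_PFMEMALLOC_BIT;
  }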
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-7-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The get_order() implemented by xtensa using the 'nsau' instruction is
effectively the same as the generic implementation in
include/asm-generic/getorder.h when size is not a constant value,
since the fls*() called by the generic implementation also compiles
down to the 'nsau' instruction on xtensa.
So remove the xtensa-specific get_order(): the generic implementation
lets the compiler do the computation at build time when size is a
constant value instead of computing at runtime, which also enables the
use of get_order() in the BUILD_BUG_ON() macro in the next patch.
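For illustration, the constant folding is what makes the generic
version usable in compile-time assertions; a sketch assuming a 4K
PAGE_SIZE:

  /* get_order(SZ_32K) folds to the constant 3 at build time with 4K
   * pages, so this check costs nothing at runtime; a non-constant size
   * falls back to fls(), which xtensa implements via 'nsau' anyway.
   */
  BUILD_BUG_ON(get_order(SZ_32K) != 3);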
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-6-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the appropriate page_frag API instead of having callers access
'page_frag_cache' internals directly.
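A hedged sketch of the kind of accessors this adds (shapes assumed
from the description, not quoted from the patch):

  static inline void page_frag_cache_init(struct page_frag_cache *nc)
  {
          nc->va = NULL;
  }

  static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
  {
          return !!nc->pfmemalloc;
  }

Callers can then use such helpers instead of poking at the struct's
fields, which keeps them insulated from the layout change in the
following patches.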
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20241028115343.3405838-5-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We are about to use the page_frag_alloc_*() API not just to allocate
memory for skb->data, but also to do the memory allocation for skb
frags. Currently the page_frag implementation in the mm subsystem runs
the offset as a countdown rather than a count-up value; there may be
several advantages to that, as mentioned in [1], but it also has
disadvantages: for example, it may prevent skb frag coalescing and
more effective cache prefetching.
We have a trade-off to make in order to have a unified implementation
and API for page_frag, so use an initial offset of zero in this patch;
the following patch will try to optimize away the disadvantages as
much as possible.
1. https://lore.kernel.org/all/f4abe71b3439b39d17a6fb2d410180f367cadf5c.camel@gmail.com/
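A simplified sketch contrasting the two offset schemes (refill,
alignment, and pfmemalloc handling omitted; field names illustrative):

  int offset;

  /* countdown (old): the offset starts at the end and shrinks to 0 */
  offset = nc->offset - fragsz;
  if (offset >= 0)
          nc->offset = offset;            /* frag lives at va + offset */

  /* count-up (this patch): the offset starts at 0 and grows, so
   * consecutive frags sit adjacent in ascending address order,
   * which keeps skb frag coalescing possible.
   */
  offset = nc->offset;
  if (offset + fragsz <= size)
          nc->offset = offset + fragsz;   /* frag lives at va + offset */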
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-4-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Inspired by [1], move the page fragment allocator from page_alloc
into its own .c file and header file, as we are about to make more
changes to it in order to replace another page_frag implementation in
sock.c.
As this patchset is going to replace 'struct page_frag' with
'struct page_frag_cache' in sched.h, including page_frag_cache.h
in sched.h causes a compiler error due to the interdependence between
mm_types.h and mm.h for asm-offsets.c, see [2]. So avoid the compiler
error by moving 'struct page_frag_cache' to mm_types_task.h, as
suggested by Alexander, see [3].
1. https://lore.kernel.org/all/20230411160902.4134381-3-dhowells@redhat.com/
2. https://lore.kernel.org/all/15623dac-9358-4597-b3ee-3694a5956920@gmail.com/
3. https://lore.kernel.org/all/CAKgT0UdH1yD=LSCXFJ=YM_aiA4OomD-2wXykO42bizaWMt_HOA@mail.gmail.com/
CC: David Howells <dhowells@redhat.com>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-3-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The test works by having a kthread bound to a specified CPU allocate
fragments from a page_frag_cache instance and push them into a
ptr_ring instance, while another kthread bound to a different
specified CPU pops the fragments from the ptr_ring and frees them.
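A hedged sketch of the test's producer/consumer shape (names, sizes,
and throttling are illustrative, not the actual test module code):

  #include <linux/gfp.h>
  #include <linux/kthread.h>
  #include <linux/ptr_ring.h>

  static struct ptr_ring ring;            /* ptr_ring_init()'d at test start */
  static struct page_frag_cache test_nc;  /* zero-initialized, so va == NULL */

  static int producer_fn(void *arg)       /* bound to one CPU */
  {
          while (!kthread_should_stop()) {
                  void *frag = page_frag_alloc(&test_nc, 256, GFP_KERNEL);

                  /* ptr_ring_produce() returns non-zero when full */
                  if (frag && ptr_ring_produce(&ring, frag))
                          page_frag_free(frag);
          }
          return 0;
  }

  static int consumer_fn(void *arg)       /* bound to another CPU */
  {
          while (!kthread_should_stop()) {
                  void *frag = ptr_ring_consume(&ring);

                  if (frag)
                          page_frag_free(frag);
          }
          return 0;
  }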
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-2-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Most of the original conversion is from the spatch below,
but I edited some and left out other instances that were
either buggy after conversion (where default values don't
fit into the type) or just looked strange.
@@
expression attr, def;
expression val;
identifier fn =~ "^nla_get_.*";
fresh identifier dfn = fn ## "_default";
@@
(
-if (attr)
- val = fn(attr);
-else
- val = def;
+val = dfn(attr, def);
|
-if (!attr)
- val = def;
-else
- val = fn(attr);
+val = dfn(attr, def);
|
-if (!attr)
- return def;
-return fn(attr);
+return dfn(attr, def);
|
-attr ? fn(attr) : def
+dfn(attr, def)
|
-!attr ? def : fn(attr)
+dfn(attr, def)
)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
Link: https://patch.msgid.link/20241108114145.0580b8684e7f.I740beeaa2f70ebfc19bfca1045a24d6151992790@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are quite a number of places that use patterns such as

	if (attr)
		val = nla_get_u16(attr);
	else
		val = DEFAULT;

Add nla_get_u16_default() and friends so that this pattern doesn't
have to be typed out every time.
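The likely shape of one such helper, sketched for the u16 case (see
the patch for the actual definitions and the full set of types):

  static inline u16 nla_get_u16_default(const struct nlattr *nla,
                                        u16 defvalue)
  {
          if (!nla)
                  return defvalue;

          return nla_get_u16(nla);
  }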
Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20241108114145.acd2aadb03ac.I3df6aac71d38a5baa1c0a03d0c7e82d4395c030e@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It is currently impossible to delete individual FDB entries (as opposed
to flushing) that were added with a VLAN that no longer exists:
# ip link add name dummy1 up type dummy
# ip link add name br1 up type bridge vlan_filtering 1
# ip link set dev dummy1 master br1
# bridge fdb add 00:11:22:33:44:55 dev dummy1 master static vlan 1
# bridge vlan del vid 1 dev dummy1
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
00:11:22:33:44:55 dev dummy1 vlan 1 master br1 static
# bridge fdb del 00:11:22:33:44:55 dev dummy1 master vlan 1
RTNETLINK answers: Invalid argument
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
00:11:22:33:44:55 dev dummy1 vlan 1 master br1 static
This is in contrast to MDB entries that can be deleted after the VLAN
was deleted:
# bridge vlan add vid 10 dev dummy1
# bridge mdb add dev br1 port dummy1 grp 239.1.1.1 permanent vid 10
# bridge vlan del vid 10 dev dummy1
# bridge mdb get dev br1 grp 239.1.1.1 vid 10
dev br1 port dummy1 grp 239.1.1.1 permanent vid 10
# bridge mdb del dev br1 port dummy1 grp 239.1.1.1 permanent vid 10
# bridge mdb get dev br1 grp 239.1.1.1 vid 10
Error: bridge: MDB entry not found.
Align the two interfaces and allow user space to delete FDB entries that
were added with a VLAN that no longer exists:
# ip link add name dummy1 up type dummy
# ip link add name br1 up type bridge vlan_filtering 1
# ip link set dev dummy1 master br1
# bridge fdb add 00:11:22:33:44:55 dev dummy1 master static vlan 1
# bridge vlan del vid 1 dev dummy1
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
00:11:22:33:44:55 dev dummy1 vlan 1 master br1 static
# bridge fdb del 00:11:22:33:44:55 dev dummy1 master vlan 1
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
Error: Fdb entry not found.
Add a selftest to make sure this behavior does not regress:
# ./rtnetlink.sh -t kci_test_fdb_del
PASS: bridge fdb del
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Andy Roulin <aroulin@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20241105133954.350479-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, the mlx5_eq_comp_int() interrupt handler schedules a tasklet
to call mlx5_cq_tasklet_cb() if it processes any completions. For CQs
whose completions don't need to be processed in tasklet context, this
adds unnecessary overhead. In a heavy TCP workload, we see 4% of CPU
time spent on the tasklet_trylock() in tasklet_action_common(), with a
smaller amount spent on the atomic operations in tasklet_schedule(),
tasklet_clear_sched(), and locking the spinlock in mlx5_cq_tasklet_cb().
TCP completions are handled by mlx5e_completion_event(), which schedules
NAPI to poll the queue, so they don't need tasklet processing.
Schedule the tasklet in mlx5_add_cq_to_tasklet() instead to avoid this
overhead. mlx5_add_cq_to_tasklet() is responsible for enqueuing the CQs
to be processed in tasklet context, so it can schedule the tasklet. CQs
that need tasklet processing have their interrupt comp handler set to
mlx5_add_cq_to_tasklet(), so they will schedule the tasklet. CQs that
don't need tasklet processing won't schedule the tasklet. To avoid
scheduling the tasklet multiple times during the same interrupt, only
schedule the tasklet in mlx5_add_cq_to_tasklet() if the tasklet work
queue was empty before the new CQ was pushed to it.
The additional branch in mlx5_add_cq_to_tasklet(), called for each EQE,
may add a small cost for the userspace Infiniband CQs whose completions
are processed in tasklet context. But this seems worth it to avoid the
tasklet overhead for CQs that don't need it.
Note that the mlx4 driver works the same way: it schedules the tasklet
in mlx4_add_cq_to_tasklet() and only if the work queue was empty before.
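A hedged sketch of the "schedule only on first enqueue" pattern,
modeled on the mlx4 equivalent (locking and refcounting simplified):

  spin_lock_irqsave(&tasklet_ctx->lock, flags);
  kick = list_empty(&tasklet_ctx->list);  /* first CQ this interrupt? */
  list_add_tail(&cq->tasklet_ctx.list, &tasklet_ctx->list);
  if (kick)
          tasklet_schedule(&tasklet_ctx->task);
  spin_unlock_irqrestore(&tasklet_ctx->lock, flags);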
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://patch.msgid.link/20241105204000.1807095-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create a mapping between a netdev and its neighbours,
allowing for much cheaper flushes.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241107160444.2913124-7-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the now-unused neighbour::next pointer, leaving struct neighbour
solely with the hlist_node implementation.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-6-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove all usages of the bare neighbour::next pointer,
replacing them with neighbour::hash and its for_each macro.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-5-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert seq_file-related neighbour functionality to use neighbour::hash
and the related for_each macro.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-4-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce neigh_for_each_in_bucket in neighbour.h, to help iterate over
the neighbour table more succinctly.
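A plausible shape for the helper, given the hlist node added earlier
in the series (a sketch, not a quote of neighbour.h):

  #define neigh_for_each_in_bucket(pos, head) \
          hlist_for_each_entry(pos, head, hash)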
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-3-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a doubly-linked node to neighbours, so that they
can be deleted without iterating the entire bucket they're in.
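A self-contained illustration of why a doubly-linked node gives O(1)
removal (the struct here is illustrative, not the real struct
neighbour):

  #include <linux/list.h>

  struct neigh_like {
          struct hlist_node hash;
  };

  static void remove_entry(struct neigh_like *n)
  {
          /* hlist nodes track their predecessor via pprev, so
           * unlinking needs no walk of the bucket the entry lives in.
           */
          hlist_del(&n->hash);
  }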
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-2-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Vendor drivers r8125/r8126 apply this additional magic setting when
enabling WAKE_PHY, so do the same here.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/51130715-45be-4db5-abb7-05d87e1f5df9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make use of the new helper r8169_mod_reg8_cond() and move from a
switch() to an if() clause. The benefit is that we don't have to touch
this piece of code each time support for a new chip version is added.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/e1ccdb85-a4ed-4800-89c2-89770ff06452@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add helper r8169_mod_reg8_cond(), which allows significantly
simplifying __rtl8169_set_wol().
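A plausible shape for the helper (a sketch; the in-driver version may
also skip the register write when the value is unchanged):

  static void r8169_mod_reg8_cond(struct rtl8169_private *tp, int reg,
                                  u8 mask, bool set)
  {
          u8 val = RTL_R8(tp, reg);

          RTL_W8(tp, reg, set ? val | mask : val & ~mask);
  }

With this, each per-flag branch in __rtl8169_set_wol() can collapse
into a single conditional call that sets or clears a wake bit based on
the corresponding wolopts flag.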
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/697b197a-8eac-40c6-8847-27093cacec36@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix spelling of "probability" in tc.yaml documentation. This corrects
the max-P field description in struct tc_sfq_qopt_v1.
Signed-off-by: Abhinav Saxena <xandfury@gmail.com>
Link: https://patch.msgid.link/20241108195642.139315-1-xandfury@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix typos:
- syncronized -> synchronized
- interfacs -> interface
- otherwhise -> otherwise
- ony -> only
- busses -> buses
- maxinum -> maximum
Via codespell.
Reported-by: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241106112513.9559-1-algonell@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When hvs is released, vsk->trans may not be reset to NULL, which could
leave a dangling pointer. Resolve this by setting vsk->trans to NULL
on release.
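A sketch of the fix's shape in the Hyper-V transport destructor
(function body abridged):

  static void hvs_destruct(struct vsock_sock *vsk)
  {
          struct hvsock *hvs = vsk->trans;

          /* ... existing teardown ... */
          kfree(hvs);
          vsk->trans = NULL;      /* don't leave a dangling pointer behind */
  }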
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/Zys4hCj61V+mQfX2@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The latter is the preferred way to copy ethtool strings: it avoids
manually incrementing the pointer and cleans up the code quite well.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20241105231855.235894-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The variable has already been assigned in subflow_create_ctx(),
so we don't need to reassign it in subflow_ulp_clone().
Signed-off-by: MoYuanhao <moyuanhao3676@163.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241106071035.2591-1-moyuanhao3676@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MCTP control protocol implementations are transport-binding dependent:
whether endpoint discovery is mandatory depends on the transport
binding, and message timing requirements are specified in each
respective transport binding specification.
However, we currently have no means to get this information from MCTP
links.
Add an IFLA_MCTP_PHYS_BINDING netlink link attribute, which represents
the transport type using the DMTF DSP0239-defined type numbers, returned
as part of RTM_GETLINK data.
We get an IFLA_MCTP_PHYS_BINDING attribute for each MCTP link, for
example:
- 0x00 (unspec) for loopback interface;
- 0x01 (SMBus/I2C) for mctpi2c%d interfaces; and
- 0x05 (serial) for mctpserial%d interfaces.
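A sketch of how such an attribute is typically emitted in a link's
RTM_GETLINK fill path (the mdev->binding field name is an assumption):

  if (nla_put_u8(skb, IFLA_MCTP_PHYS_BINDING, mdev->binding))
          return -EMSGSIZE;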
Signed-off-by: Khang Nguyen <khangng@os.amperecomputing.com>
Reviewed-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20241105071915.821871-1-khangng@os.amperecomputing.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add NETIF_F_GSO_ESP bit to bond's gso_partial_features if all slaves
support it, such that ESP segmentation is handled by hardware if possible.
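A hedged sketch of the feature intersection across slaves (simplified;
variable placement and surrounding logic are assumptions):

  struct list_head *iter;
  struct slave *slave;
  netdev_features_t gso_partial = NETIF_F_GSO_ESP;

  bond_for_each_slave(bond, slave, iter)
          gso_partial &= slave->dev->gso_partial_features;

  bond_dev->gso_partial_features = gso_partial;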
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241105192721.584822-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|