wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2024-11-18	ipv4/udp: Add 4-tuple hash for connected socket	Philo Lu	3	-5/+210
	Currently, the udp_table has two hash table, the port hash and portaddr hash. Usually for UDP servers, all sockets have the same local port and addr, so they are all on the same hash slot within a reuseport group. In some applications, UDP servers use connect() to manage clients. In particular, when firstly receiving from an unseen 4 tuple, a new socket is created and connect()ed to the remote addr:port, and then the fd is used exclusively by the client. Once there are connected sks in a reuseport group, udp has to score all sks in the same hash2 slot to find the best match. This could be inefficient with a large number of connections, resulting in high softirq overhead. To solve the problem, this patch implement 4-tuple hash for connected udp sockets. During connect(), hash4 slot is updated, as well as a corresponding counter, hash4_cnt, in hslot2. In __udp4_lib_lookup(), hslot4 will be searched firstly if the counter is non-zero. Otherwise, hslot2 is used like before. Note that only connected sockets enter this hash4 path, while un-connected ones are not affected. hlist_nulls is used for hash4, because we probably move to another hslot wrongly when lookup with concurrent rehash. Then we check nulls at the list end to see if we should restart lookup. Because udp does not use SLAB_TYPESAFE_BY_RCU, we don't need to touch sk_refcnt when lookup. Stress test results (with 1 cpu fully used) are shown below, in pps: (1) _un-connected_ socket as server [a] w/o hash4: 1,825176 [b] w/ hash4: 1,831750 (+0.36%) (2) 500 _connected_ sockets as server [c] w/o hash4: 290860 (only 16% of [a]) [d] w/ hash4: 1,889658 (+3.1% compared with [b]) With hash4, compute_score is skipped when lookup, so [d] is slightly better than [b]. Co-developed-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Co-developed-by: Fred Chen <fred.cc@alibaba-inc.com> Signed-off-by: Fred Chen <fred.cc@alibaba-inc.com> Co-developed-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com> Signed-off-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com> Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-18	net/udp: Add 4-tuple hash list basis	Philo Lu	3	-5/+97
	Add a new hash list, hash4, in udp table. It will be used to implement 4-tuple hash for connected udp sockets. This patch adds the hlist to table, and implements helpers and the initialization. 4-tuple hash is implemented in the following patch. hash4 uses hlist_nulls to avoid moving wrongly onto another hlist due to concurrent rehash, because rehash() can happen with lookup(). Co-developed-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Co-developed-by: Fred Chen <fred.cc@alibaba-inc.com> Signed-off-by: Fred Chen <fred.cc@alibaba-inc.com> Co-developed-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com> Signed-off-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com> Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-18	net/udp: Add a new struct for hash2 slot	Philo Lu	3	-34/+63
	Preparing for udp 4-tuple hash (uhash4 for short). To implement uhash4 without cache line missing when lookup, hslot2 is used to record the number of hashed sockets in hslot4. Thus adding a new struct udp_hslot_main with field hash4_cnt, which is used by hash2. The new struct is used to avoid doubling the size of udp_hslot. Before uhash4 lookup, firstly checking hash4_cnt to see if there are hashed sks in hslot4. Because hslot2 is always used in lookup, there is no cache line miss. Related helpers are updated, and use the helpers as possible. uhash4 is implemented in following patches. Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-15	virtio_net: xdp_features add NETDEV_XDP_ACT_XSK_ZEROCOPY	Xuan Zhuo	1	-1/+2
	Now, we support AF_XDP(xsk). Add NETDEV_XDP_ACT_XSK_ZEROCOPY to xdp_features. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-14-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: update tx timeout record	Xuan Zhuo	1	-0/+7
	If send queue sent some packets, we update the tx timeout record to prevent the tx timeout. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-13-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xsk: tx: support xmit xsk buffer	Xuan Zhuo	1	-11/+168
	The driver's tx napi is very important for XSK. It is responsible for obtaining data from the XSK queue and sending it out. At the beginning, we need to trigger tx napi. virtnet_free_old_xmit distinguishes three type ptr(skb, xdp frame, xsk buffer) by the last bits of the pointer. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-12-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xsk: prevent disable tx napi	Xuan Zhuo	1	-1/+9
	Since xsk's TX queue is consumed by TX NAPI, if sq is bound to xsk, then we must stop tx napi from being disabled. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-11-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xsk: bind/unbind xsk for tx	Xuan Zhuo	1	-0/+53
	This patch implement the logic of bind/unbind xsk pool to sq and rq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-10-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: refactor the xmit type	Xuan Zhuo	1	-39/+51
	Because the af-xdp will introduce a new xmit type, so I refactor the xmit type mechanism first. We know both xdp_frame and sk_buff are at least 4 bytes aligned. For the xdp tx, we do not pass any pointer to virtio core as data, we just need to pass the len of the packet. So we will push len to the void pointer. We can make sure the pointer is 4 bytes aligned. And the data structure of AF_XDP also is at least 4 bytes aligned. So the last two bits of the pointers are free, we can't use these to distinguish them. 00 for skb 01 for SKB_ORPHAN 10 for XDP 11 for AF-XDP tx Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-9-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: remove API virtqueue_set_dma_premapped	Xuan Zhuo	3	-63/+0
	Now, this API is useless. remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-8-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio-net: rq submits premapped per-buffer	Xuan Zhuo	1	-11/+11
	virtio-net rq submits premapped per-buffer by setting sg page to NULL; Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-7-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: introduce add api for premapped	Xuan Zhuo	2	-0/+59
	Two APIs are introduced to submit premapped per-buffers. int virtqueue_add_inbuf_premapped(struct virtqueue vq, struct scatterlist sg, unsigned int num, void data, void ctx, gfp_t gfp); int virtqueue_add_outbuf_premapped(struct virtqueue vq, struct scatterlist sg, unsigned int num, void *data, gfp_t gfp); Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-6-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: perform premapped operations based on per-buffer	Xuan Zhuo	1	-48/+53
	The current configuration sets the virtqueue (vq) to premapped mode, implying that all buffers submitted to this queue must be mapped ahead of time. This presents a challenge for the virtnet send queue (sq): the virtnet driver would be required to keep track of dma information for vq size * 17, which can be substantial. However, if the premapped mode were applied on a per-buffer basis, the complexity would be greatly reduced. With AF_XDP enabled, AF_XDP buffers would become premapped, while kernel skb buffers could remain unmapped. And consider that some sgs are not generated by the virtio driver, that may be passed from the block stack. So we can not change the sgs, new APIs are the better way. So we pass the new argument 'premapped' to indicate the buffers submitted to virtio are premapped in advance. Additionally, DMA unmap operations for these buffers will be bypassed. Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-5-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: packed: record extras for indirect buffers	Xuan Zhuo	1	-24/+36
	The subsequent commit needs to know whether every indirect buffer is premapped or not. So we need to introduce an extra struct for every indirect buffer to record this info. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-4-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: split: record extras for indirect buffers	Xuan Zhuo	1	-60/+52
	The subsequent commit needs to know whether every indirect buffer is premapped or not. So we need to introduce an extra struct for every indirect buffer to record this info. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-3-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: introduce vring_need_unmap_buffer	Xuan Zhuo	1	-15/+12
	To make the code readable, introduce vring_need_unmap_buffer() to replace do_unmap. use_dma_api premapped -> vring_need_unmap_buffer() 1. false false false 2. true false true 3. true true false Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-2-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	selftests: net: fdb_notify: Add a test for FDB notifications	Petr Machata	3	-1/+114
	Check that only one notification is produced for various FDB edit operations. Regarding the ip_link_add() and ip_link_master() helpers. This pattern of action plus corresponding defer is bound to come up often, and a dedicated vocabulary to capture it will be handy. tunnel_create() and vlan_create() from forwarding/lib.sh are somewhat opaque and perhaps too kitchen-sinky, so I tried to go in the opposite direction with these ones, and wrapped only the bare minimum to schedule a corresponding cleanup. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/910c5880ae6d3b558d6889cbdba2be690c2615c6.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	selftests: net: lib: Add kill_process	Petr Machata	15	-34/+41
	A number of selftests run processes in the background and need to kill them afterwards. Instead for everyone to open-code the kill / wait / redirect mantra, add a helper in net/lib.sh. Convert existing open-code sites. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Link: https://patch.msgid.link/a9db102067d741c118f0bd93b10c75e2a34665ea.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	selftests: net: lib: Move checks from forwarding/lib.sh here	Petr Machata	2	-73/+73
	For logging to be useful, something has to set RET and retmsg by calling ret_set_ksft_status(). There is a suite of functions to that end in forwarding/lib: check_err, check_fail et.al. Move them to net/lib.sh so that every net test can use them. Existing lib.sh users might be using these same names for their functions. However lib.sh is always sourced near the top of the file (checked), and whatever new definitions will simply override the ones provided by lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/f488a00dc85b8e0c1f3c71476b32b21b5189a847.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	selftests: net: lib: Move tests_run from forwarding/lib.sh here	Petr Machata	2	-10/+10
	It would be good to use the same mechanism for scheduling and dispatching general net tests as the many forwarding tests already use. To that end, move the logging helpers to net/lib.sh so that every net test can use them. Existing lib.sh users might be using the name themselves. However lib.sh is always sourced near the top of the file (checked), and whatever new definition will simply override the one provided by lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/a6fc083486493425b2c61185c327845b6ce3233a.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	selftests: net: lib: Move logging from forwarding/lib.sh here	Petr Machata	2	-113/+115
	Many net selftests invent their own logging helpers. These really should be in a library sourced by these tests. Currently forwarding/lib.sh has a suite of perfectly fine logging helpers, but sourcing a forwarding/ library from a higher-level directory smells of layering violation. In this patch, move the logging helpers to net/lib.sh so that every net test can use them. Together with the logging helpers, it's also necessary to move pause_on_fail(), and EXIT_STATUS and RET. Existing lib.sh users might be using these same names for their functions or variables. However lib.sh is always sourced near the top of the file (checked), and whatever new definitions will simply override the ones provided by lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/edd3785a3bd72ffbe1409300989e993ee50ae98b.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	ndo_fdb_del: Add a parameter to report whether notification was sent	Petr Machata	9	-18/+34
	In a similar fashion to ndo_fdb_add, which was covered in the previous patch, add the bool *notified argument to ndo_fdb_del. Callees that send a notification on their own set the flag to true. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/06b1acf4953ef0a5ed153ef1f32d7292044f2be6.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	ndo_fdb_add: Add a parameter to report whether notification was sent	Petr Machata	12	-18/+32
	Currently when FDB entries are added to or deleted from a VXLAN netdevice, the VXLAN driver emits one notification, including the VXLAN-specific attributes. The core however always sends a notification as well, a generic one. Thus two notifications are unnecessarily sent for these operations. A similar situation comes up with bridge driver, which also emits notifications on its own: # ip link add name vx type vxlan id 1000 dstport 4789 # bridge monitor fdb & [1] 1981693 # bridge fdb add de:ad:be:ef:13:37 dev vx self dst 192.0.2.1 de:ad:be:ef:13:37 dev vx dst 192.0.2.1 self permanent de:ad:be:ef:13:37 dev vx self permanent In order to prevent this duplicity, add a paremeter to ndo_fdb_add, bool *notified. The flag is primed to false, and if the callee sends a notification on its own, it sets it to true, thus informing the core that it should not generate another notification. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/cbf6ae8195e85cbf922f8058ce4eba770f3b71ed.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	rtase: Modify the content format of the enum rtase_registers	Justin Lai	1	-1/+1
	Remove unnecessary spaces. Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241114112549.376101-3-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	rtase: Modify the name of the goto label	Justin Lai	1	-5/+5
	Modify the name of the goto label in rtase_init_one(). Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241114112549.376101-2-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	net: netpoll: flush skb pool during cleanup	Breno Leitao	1	-1/+13
	The netpoll subsystem maintains a pool of 32 pre-allocated SKBs per instance, but these SKBs are not freed when the netpoll user is brought down. This leads to memory waste as these buffers remain allocated but unused. Add skb_pool_flush() to properly clean up these SKBs when netconsole is terminated, improving memory efficiency. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20241114-skb_buffers_v2-v3-2-9be9f52a8b69@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	net: netpoll: Individualize the skb pool	Breno Leitao	2	-18/+14
	The current implementation of the netpoll system uses a global skb pool, which can lead to inefficient memory usage and waste when targets are disabled or no longer in use. This can result in a significant amount of memory being unnecessarily allocated and retained, potentially causing performance issues and limiting the availability of resources for other system components. Modify the netpoll system to assign a skb pool to each target instead of using a global one. This approach allows for more fine-grained control over memory allocation and deallocation, ensuring that resources are only allocated and retained as needed. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20241114-skb_buffers_v2-v3-1-9be9f52a8b69@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	octeontx2-pf: Fix spelling mistake "reprentator" -> "representor"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a NL_SET_ERR_MSG_MOD error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20241114102012.1868514-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	net/netlink: Correct the comment on netlink message max cap	Dmitry Safonov	1	-2/+2
	Since commit d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()") the cap is 32KiB. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20241113-tcp-md5-diag-prep-v2-5-00a2a7feb1fa@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Move kdump check into enic_adjust_resources()	Nelson Escobar	1	-16/+9
	Move the kdump check into enic_adjust_resources() so that everything that modifies resources is in the same function. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-7-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Move enic resource adjustments to separate function	Nelson Escobar	1	-57/+70
	Move the enic resource adjustments out of enic_set_intr_mode() and into its own function, enic_adjust_resources(). Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-6-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Adjust used MSI-X wq/rq/cq/interrupt resources in a more robust way	Nelson Escobar	1	-59/+57
	Instead of failing to use MSI-X if resources aren't configured exactly right, use the resources we do have. Since we could start using large numbers of rq resources, we do limit the rq count to what netif_get_num_default_rss_queues() recommends. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-5-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Allocate arrays in enic struct based on VIC config	Nelson Escobar	2	-21/+87
	Allocate wq, rq, cq, intr, and napi arrays based on the number of resources configured in the VIC. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-4-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Save resource counts we read from HW	Nelson Escobar	3	-9/+18
	Save the resources counts for wq,rq,cq, and interrupts in *_avail variables so that we don't lose the information when adjusting the counts we are actually using. Report the wq_avail and rq_avail as the channel maximums in 'ethtool -l' output. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-3-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Make MSI-X I/O interrupts come after the other required ones	Nelson Escobar	2	-9/+22
	The VIC hardware has a constraint that the MSIX interrupt used for errors be specified as a 7 bit number. Before this patch, it was allocated after the I/O interrupts, which would cause a problem if 128 or more I/O interrupts are in use. So make the required interrupts come before the I/O interrupts to guarantee the error interrupt offset never exceeds 7 bits. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-2-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	enic: Create enic_wq/rq structures to bundle per wq/rq data	Nelson Escobar	4	-73/+81
	Bundling the wq/rq specific data into dedicated enic_wq/rq structures cleans up the enic structure and simplifies future changes related to wq/rq. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-1-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	net: phy: microchip_t1: Clause-45 PHY loopback support for LAN887x	Tarun Alle	1	-0/+1
	Adds support for clause-45 PHY loopback for the Microchip LAN887x driver. Signed-off-by: Tarun Alle <Tarun.Alle@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241114101951.382996-1-Tarun.Alle@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	dt-bindings: net: dsa: microchip,ksz: Drop undocumented "id"	Rob Herring (Arm)	1	-1/+0
	"id" is not a documented property, so drop it. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241113225642.1783485-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	bnxt_en: optimize gettimex64	Vadim Fedorenko	2	-6/+37
	Current implementation of gettimex64() makes at least 3 PCIe reads to get current PHC time. It takes at least 2.2us to get this value back to userspace. At the same time there is cached value of upper bits of PHC available for packet timestamps already. This patch reuses cached value to speed up reading of PHC time. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20241114114820.1411660-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	netdev-genl: Hold rcu_read_lock in napi_set	Joe Damato	1	-0/+2
	Hold rcu_read_lock during netdev_nl_napi_set_doit, which calls napi_by_id and requires rcu_read_lock to be held. Closes: https://lore.kernel.org/netdev/719083c2-e277-447b-b6ea-ca3acb293a03@redhat.com/ Fixes: 1287c1ae0fc2 ("netdev-genl: Support setting per-NAPI config values") Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241114175600.18882-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	netfilter: bitwise: add support for doing AND, OR and XOR directly	Jeremy Sowden	2	-11/+131
	Hitherto, these operations have been converted in user space to mask-and-xor operations on one register and two immediate values, and it is the latter which have been evaluated by the kernel. We add support for evaluating these operations directly in kernel space on one register and either an immediate value or a second register. Pablo made a few changes to the original patch: - EINVAL if NFTA_BITWISE_SREG2 is used with fast version. - Allow _AND,_OR,_XOR with _DATA != sizeof(u32) - Dump _SREG2 or _DATA with _AND,_OR,_XOR Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	netfilter: bitwise: rename some boolean operation functions	Jeremy Sowden	2	-20/+24
	In the next patch we add support for doing AND, OR and XOR operations directly in the kernel, so rename some functions and an enum constant related to mask-and-xor boolean operations. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.	Guillaume Nault	1	-1/+1
	Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.	Guillaume Nault	1	-1/+2
	Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.	Guillaume Nault	1	-1/+1
	Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	netfilter: flow_offload: Convert nft_flow_route() to dscp_t.	Guillaume Nault	1	-2/+2
	Use ip4h_dscp()instead of reading ip_hdr()->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Also, remove the comment about the net/ip.h include file, since it's now required for the ip4h_dscp() helper too. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.	Guillaume Nault	1	-1/+1
	Use ip4h_dscp()instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15	xfrm: Fix acquire state insertion.	Steffen Klassert	1	-0/+1
	A recent commit jumped over the dst hash computation and left the symbol uninitialized. Fix this by explicitly computing the dst hash before it is used. Fixes: 0045e3d80613 ("xfrm: Cache used outbound xfrm states at the policy.") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-14	selftest: extend test_rss_context_queue_reconfigure for action addition	Edward Cree	1	-0/+26
	The combination of ntuple action (ring_cookie) and RSS context can cause an ntuple rule to target a higher queue than appears in any RSS indirection table or directly in the ntuple rule, since the two numbers are added together. Verify the logic that prevents reducing the queue count in this case. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/58276b800ab78c0a79c1918046ccae7fe45ba802.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14	selftest: validate RSS+ntuple filters with nonzero ring_cookie	Edward Cree	1	-1/+40
	Test creates an ntuple filter with 'action 2' and an RSS context whose indirection table has entries 0 and 1. Resulting traffic should go to queues 2 and 3; verify that it never hits queues 0 and 1. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/114afdf4d2867f72ed27751e8e08fe8b128a8529.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>