Age | Commit message (Collapse) | Author | Files | Lines |
|
With multipath routes, try to ensure that packets leave on the device
that is associated with the source address.
Avoid the following tcpdump example:
veth0 Out IP 10.1.0.2.38640 > 10.2.0.3.8000: Flags [S]
veth1 Out IP 10.1.0.2.38648 > 10.2.0.3.8000: Flags [S]
Which can happen easily with the most straightforward setup:
ip addr add 10.0.0.1/24 dev veth0
ip addr add 10.1.0.1/24 dev veth1
ip route add 10.2.0.3 nexthop via 10.0.0.2 dev veth0 \
nexthop via 10.1.0.2 dev veth1
This is apparently considered WAI, based on the comment in
ip_route_output_key_hash_rcu:
* 2. Moreover, we are allowed to send packets with saddr
* of another iface. --ANK
It may be ok for some uses of multipath, but not all. For instance,
when using two ISPs, a router may drop packets with unknown source.
The behavior occurs because tcp_v4_connect makes three route
lookups when establishing a connection:
1. ip_route_connect calls to select a source address, with saddr zero.
2. ip_route_connect calls again now that saddr and daddr are known.
3. ip_route_newports calls again after a source port is also chosen.
With a route with multiple nexthops, each lookup may make a different
choice depending on available entropy to fib_select_multipath. So it
is possible for 1 to select the saddr from the first entry, but 3 to
select the second entry. Leading to the above situation.
Address this by preferring a match that matches the flowi4 saddr. This
will make 2 and 3 make the same choice as 1. Continue to update the
backup choice until a choice that matches saddr is found.
Do this in fib_select_multipath itself, rather than passing an fl4_oif
constraint, to avoid changing non-multipath route selection. Commit
e6b45241c57a ("ipv4: reset flowi parameters on route connect") shows
how that may cause regressions.
Also read ipv4.sysctl_fib_multipath_use_neigh only once. No need to
refresh in the loop.
This does not happen in IPv6, which performs only one lookup.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250424143549.669426-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The ICSSG firmware maintains set of stats called PA_STATS.
Currently the driver only dumps 4 stats. Add support for dumping more
stats.
The offset for different stats are defined as MACROs in icssg_switch_map.h
file. All the offsets are for Slice0. Slice1 offsets are slice0 + 4.
The offset calculation is taken care while reading the stats in
emac_update_hardware_stats().
The statistics are documented in
Documentation/networking/device_drivers/icssg_prueth.rst
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://patch.msgid.link/20250424095316.2643573-1-danishanwar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add targets to build, clean, and install ynl headers, libynl.a, and
python tooling.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250423204647.190784-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Modify the format specifier in snprintf to %u.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250425064057.30035-1-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
thunder_bgx's PCI device is enabled with pcim_enable_device(), a managed
devres function which ensures that the device gets enabled on driver
detach automatically.
Remove the calls to pci_disable_device().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-10-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Furthermore, the PCI function being managed implies that it's not
necessary to call pci_release_regions() manually.
Remove the calls to pci_release_regions().
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-9-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Furthermore, the PCI function being managed implies that it's not
necessary to call pci_release_regions() manually.
Remove the calls to pci_release_regions().
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-8-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Daniele Venzano <venza@brownhat.org>
Link: https://patch.msgid.link/20250425085740.65304-7-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-6-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-5-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Furthermore, the PCI function being managed implies that it's not
necessary to call pci_release_regions() manually.
Remove the calls to pci_release_regions().
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20250425085740.65304-4-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Furthermore, the PCI function being managed implies that it's not
necessary to call pci_release_regions() manually.
Remove the calls to pci_release_regions().
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Acked-by: Elad Nachman <enachman@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-3-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The selftest reproduces the deadlock scenario when binding/unbinding XDP
program, XDP socket, rx ring resize on virtio_net interface.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20250425071018.36078-5-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When binding the XDP socket, we may get EBUSY because the deferred
destructor of XDP socket in previous test has not been executed yet. If
that is the case, just sleep and retry some times.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20250425071018.36078-4-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This commit adds an optional -z flag to xdp_helper. When this flag is
provided, the XDP socket binding is forced to be in zerocopy mode.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20250425071018.36078-3-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move xdp_helper to net/lib to make it easier for other selftests to use
the helper.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20250425071018.36078-2-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In production, we're seeing TX drops on veth devices when the ptr_ring
fills up. This can occur when NAPI mode is enabled, though it's
relatively rare. However, with threaded NAPI - which we use in
production - the drops become significantly more frequent.
The underlying issue is that with threaded NAPI, the consumer often runs
on a different CPU than the producer. This increases the likelihood of
the ring filling up before the consumer gets scheduled, especially under
load, leading to drops in veth_xmit() (ndo_start_xmit()).
This patch introduces backpressure by returning NETDEV_TX_BUSY when the
ring is full, signaling the qdisc layer to requeue the packet. The txq
(netdev queue) is stopped in this condition and restarted once
veth_poll() drains entries from the ring, ensuring coordination between
NAPI and qdisc.
Backpressure is only enabled when a qdisc is attached. Without a qdisc,
the driver retains its original behavior - dropping packets immediately
when the ring is full. This avoids unexpected behavior changes in setups
without a configured qdisc.
With a qdisc in place (e.g. fq, sfq) this allows Active Queue Management
(AQM) to fairly schedule packets across flows and reduce collateral
damage from elephant flows.
A known limitation of this approach is that the full ring sits in front
of the qdisc layer, effectively forming a FIFO buffer that introduces
base latency. While AQM still improves fairness and mitigates flow
dominance, the latency impact is measurable.
In hardware drivers, this issue is typically addressed using BQL (Byte
Queue Limits), which tracks in-flight bytes needed based on physical link
rate. However, for virtual drivers like veth, there is no fixed bandwidth
constraint - the bottleneck is CPU availability and the scheduler's ability
to run the NAPI thread. It is unclear how effective BQL would be in this
context.
This patch serves as a first step toward addressing TX drops. Future work
may explore adapting a BQL-like mechanism to better suit virtual devices
like veth.
Reported-by: Yan Zhai <yan@cloudflare.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Link: https://patch.msgid.link/174559294022.827981.1282809941662942189.stgit@firesoul
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The "noqueue" qdisc can either be directly attached, or get default
attached if net_device priv_flags has IFF_NO_QUEUE. In both cases, the
allocated Qdisc structure gets it's enqueue function pointer reset to
NULL by noqueue_init() via noqueue_qdisc_ops.
This is a common case for software virtual net_devices. For these devices
with no-queue, the transmission path in __dev_queue_xmit() will bypass
the qdisc layer. Directly invoking device drivers ndo_start_xmit (via
dev_hard_start_xmit). In this mode the device driver is not allowed to
ask for packets to be queued (either via returning NETDEV_TX_BUSY or
stopping the TXQ).
The simplest and most reliable way to identify this no-queue case is by
checking if enqueue == NULL.
The vrf driver currently open-codes this check (!qdisc->enqueue). While
functionally correct, this low-level detail is better encapsulated in a
dedicated helper for clarity and long-term maintainability.
To make this behavior more explicit and reusable, this patch introduce a
new helper: qdisc_txq_has_no_queue(). Helper will also be used by the
veth driver in the next patch, which introduces optional qdisc-based
backpressure.
This is a non-functional change.
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/174559293172.827981.7583862632045264175.stgit@firesoul
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The newly added rtl9300 driver needs MDIO_DEVRES:
x86_64-linux-ld: drivers/net/mdio/mdio-realtek-rtl9300.o: in function `rtl9300_mdiobus_probe':
mdio-realtek-rtl9300.c:(.text+0x941): undefined reference to `devm_mdiobus_alloc_size'
x86_64-linux-ld: mdio-realtek-rtl9300.c:(.text+0x9e2): undefined reference to `__devm_mdiobus_register'
Since this is a hidden symbol, it needs to be selected by each user,
rather than the usual 'depends on'. I see that there are a few other
drivers that accidentally use 'depends on', so fix these as well for
consistency and to avoid dependency loops.
Fixes: 37f9b2a6c086 ("net: ethernet: Add missing depends on MDIO_DEVRES")
Fixes: 24e31e474769 ("net: mdio: Add RTL9300 MDIO driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Link: https://patch.msgid.link/20250425112819.1645342-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new GMAC's PCI device ID (0x7a23) support which is used in
Loongson-2K3000/Loongson-3B6000M. The new GMAC device use external PHY,
so it reuses loongson_gmac_data() as the old GMAC device (0x7a03), and
the new GMAC device still doesn't support flow control now.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Tested-by: Henry Chen <chenx97@aosc.io>
Tested-by: Biao Dong <dongbiao@loongson.cn>
Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20250424072209.3134762-4-chenhuacai@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new multi-chan IP core (0x12) support which is used in Loongson-
2K3000/Loongson-3B6000M. Compared with the 0x10 core, the new 0x12 core
reduces channel numbers from 8 to 4, but checksum is supported for all
channels.
Add a "multichan" flag to loongson_data, so that we can simply use a
"if (ld->multichan)" condition rather than the complicated condition
"if (ld->loongson_id == DWMAC_CORE_MULTICHAN_V1 || ld->loongson_id ==
DWMAC_CORE_MULTICHAN_V2)".
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Henry Chen <chenx97@aosc.io>
Tested-by: Biao Dong <dongbiao@loongson.cn>
Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://patch.msgid.link/20250424072209.3134762-3-chenhuacai@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When dwmac-socfpga was converted to using the Lynx PCS (previously
referred to in the driver as the Altera TSE PCS), the
lynx_pcs_create_mdiodev() was used to create the pcs instance.
As this function didn't exist in the early versions of the series, a
local mdiodev object was stored for PCS creation. It was never used, but
still made it into the driver, so remove it.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250424071223.221239-4-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SGMII adapter needs to be enabled for both Cisco SGMII and 1000BaseX
operations. It doesn't make sense to check for an attached phydev here,
as we simply might not have any, in particular if we're using the
1000BaseX interface mode.
Make so that we only re-enable the SGMII adapter when it's present, and
when we use a phy_mode that is handled by said adapter.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250424071223.221239-3-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Dwmac Socfpga may be used with an instance of a Lynx / Altera TSE PCS,
in which case it gains support for 1000BaseX.
It appears that the PCS is wired to the MAC through an internal GMII
bus. Make sure that we enable the GMII_MII mode for the internal MAC when
using 1000BaseX.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250424071223.221239-2-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, the tx and rx queue number initialization is duplicated in
loongson_gmac_data() and loongson_gnet_data(), so move it to the common
function loongson_default_data().
This is a preparation for later patches.
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Tested-by: Henry Chen <chenx97@aosc.io>
Tested-by: Biao Dong <dongbiao@loongson.cn>
Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change addresses a MAC address conflict issue in failover scenarios,
similar to the problem described in commit a951bc1e6ba5 ("bonding: correct
the MAC address for 'follow' fail_over_mac policy").
In fail_over_mac=follow mode, the bonding driver expects the formerly active
slave to swap MAC addresses with the newly active slave during failover.
However, under certain conditions, two slaves may end up with the same MAC
address, which breaks this policy:
1) ip link set eth0 master bond0
-> bond0 adopts eth0's MAC address (MAC0).
2) ip link set eth1 master bond0
-> eth1 is added as a backup with its own MAC (MAC1).
3) ip link set eth0 nomaster
-> eth0 is released and restores its MAC (MAC0).
-> eth1 becomes the active slave, and bond0 assigns MAC0 to eth1.
4) ip link set eth0 master bond0
-> eth0 is re-added to bond0, now both eth0 and eth1 have MAC0.
This results in a MAC address conflict and violates the expected behavior
of the failover policy.
To fix this, we assign a random MAC address to any newly added slave if
its current MAC address matches that of the bond. The original (permanent)
MAC address is saved and will be restored when the device is released
from the bond.
This ensures that each slave has a unique MAC address during failover
transitions, preserving the integrity of the fail_over_mac=follow policy.
Fixes: 3915c1e8634a ("bonding: Add "follow" option to fail_over_mac")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RSS contexts are used to shard work across multiple queues for an
application using io_uring zero copy receive. Add a test case checking
that steering flows into an RSS context works.
Until I add multi-thread support to the selftest binary, this test case
only has 1 queue in the RSS context.
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250425022049.3474590-4-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Setting hds_thresh to 0 is required for queue reset.
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250425022049.3474590-3-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Switch to using defer() for putting the NIC back to the original state
prior to running the selftest.
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250425022049.3474590-2-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Test that the SO_INCOMING_NAPI_ID of a network file descriptor is
non-zero. This ensures that either the core networking stack or, in some
cases like netdevsim, the driver correctly sets the NAPI ID.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250424002746.16891-4-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Factor ksft C helpers to a header so they can be used by other C-based
tests.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250424002746.16891-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previously, nsim_rcv was not marking the NAPI ID on the skb, leading to
applications seeing a napi ID of 0 when using SO_INCOMING_NAPI_ID.
To add to the userland confusion, netlink appears to correctly report
the NAPI IDs for netdevsim queues but the resulting file descriptor from
a call to accept() was reporting a NAPI ID of 0.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250424002746.16891-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Thorsten reports that after upgrading system headers from linux-next
the YNL build breaks. I typo'ed the header guard, _H is missing.
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/59ba7a94-17b9-485f-aa6d-14e4f01a7a39@leemhuis.info
Fixes: 12b196568a3a ("tools: ynl: add missing header deps")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250423220231.1035931-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are some sparse warnings in wifi, and it seems that
it's actually possible to annotate a function pointer with
__releases(), making the sparse warnings go away. In a way
that also serves as documentation that rcu_read_unlock()
must be called in the attach method, so add that annotation.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20250423150811.456205-2-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tcp: fastopen: pass TFO child indication through getsockopt
Note that this uses up the last bit of a field in struct tcp_info
Signed-off-by: Jeremy Harris <jgh@exim.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20250423124334.4916-3-jgh@exim.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tcp: fastopen: note that a child socket was created
This uses up the last bit in a field of tcp_sock.
Signed-off-by: Jeremy Harris <jgh@exim.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20250423124334.4916-2-jgh@exim.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a spelling mistake in a pr_info message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20250423113719.173539-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These paths should call rxgk_put(gk) but they don't. In the
rxgk_construct_response() function the "goto error;" will free the
"response" skb as well calling rxgk_put() so that's a bonus.
Fixes: 9d1d2b59341f ("rxrpc: rxgk: Implement the yfs-rxgk security class (GSSAPI)")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/aAikCbsnnzYtVmIA@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With commit 51a4df60db5c2 ("net: ethernet: mtk_eth_soc: convert caps in
mtk_soc_data struct to u64") the capabilities bitfield was converted to
a 64-bit value, but a cap_bit in struct mtk_eth_muxc which is used to
store a full bitfield (rather than the bit number, as the name would
suggest) still holds only a 32-bit value.
Change the type of cap_bit to u64 in order to avoid truncating the
bitfield which results in path selection to not work with capabilities
above the 32-bit limit.
The values currently stored in the cap_bit field are
MTK_ETH_MUX_GDM1_TO_GMAC1_ESW:
BIT_ULL(18) | BIT_ULL(5)
MTK_ETH_MUX_GMAC2_GMAC0_TO_GEPHY:
BIT_ULL(19) | BIT_ULL(5) | BIT_ULL(6)
MTK_ETH_MUX_U3_GMAC2_TO_QPHY:
BIT_ULL(20) | BIT_ULL(5) | BIT_ULL(6)
MTK_ETH_MUX_GMAC1_GMAC2_TO_SGMII_RGMII:
BIT_ULL(20) | BIT_ULL(5) | BIT_ULL(7)
MTK_ETH_MUX_GMAC12_TO_GEPHY_SGMII:
BIT_ULL(21) | BIT_ULL(5)
While all those values are currently still within 32-bit boundaries,
the addition of new capabilities of MT7988 as well as future SoC's
like MT7987 will exceed them. Also, the use of a 32-bit 'int' type to
store the result of a BIT_ULL(...) is misleading.
Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/ded98b0d716c3203017a7a92151516ec2bf1abee.1745369249.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove three functions that are no longer used.
rxrpc_get_txbuf() last use was removed by 2020's
commit 5e6ef4f1017c ("rxrpc: Make the I/O thread take over the call and
local processor work")
rxrpc_kernel_get_epoch() last use was removed by 2020's
commit 44746355ccb1 ("afs: Don't get epoch from a server because it may be
ambiguous")
rxrpc_kernel_set_max_life() last use was removed by 2023's
commit db099c625b13 ("rxrpc: Fix timeout of a call that hasn't yet been
granted a channel")
Both of the rxrpc_kernel_* functions were documented. Remove that
documentation as well as the code.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20250422235147.146460-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add mdio compat string for asp-v3.0 ethernet driver.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-9-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The asp-v3.0 is a major HW revision that reduced the number of
channels and filters. The goal was to save cost by reducing the
feature set.
Changes for asp-v3.0
- Number of network filters were reduced.
- Number of channels were reduced.
- EDPKT stats were removed.
- Fix a bug with csum offload.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-8-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The asp-v3.0 Ethernet controller uses a brcm unimac like its
predecessor.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-7-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add asp-v3.0 support. v3.0 is a major revision that reduces
the feature set for cost savings. We have a reduced amount of
channels and network filters.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-6-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove asp-v2.0 which will no longer be supported.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-5-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SoC that supported asp-v2.0 never saw the light of day. asp-v2.0 has
quirks that makes the logic overly complicated. For example, asp-v2.0 is
the only revision that has a different wake up IRQ hook up. Remove asp-v2.0
support to make supporting future HW revisions cleaner.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-4-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove asp-v2.0 which was only supported on one SoC that never
saw the light of day.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-3-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove asp-v2.0 which was only supported on one SoC that never
saw the light of day.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250422233645.1931036-2-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the API `sysconf()` to query page size at runtime, instead of using
hard code number 4096.
And use `posix_memalign` to allocate the page size aligned momory.
Signed-off-by: Haiyue Wang <haiyuewa@163.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250419141044.10304-1-haiyuewa@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When a VNI is deleted from a VXLAN device in 'vnifilter' mode, the FDB
entry associated with the default remote (assuming one was configured)
is deleted without holding the hash lock. This is wrong and will result
in a warning [1] being generated by the lockdep annotation that was
added by commit ebe642067455 ("vxlan: Create wrappers for FDB lookup").
Reproducer:
# ip link add vx0 up type vxlan dstport 4789 external vnifilter local 192.0.2.1
# bridge vni add vni 10010 remote 198.51.100.1 dev vx0
# bridge vni del vni 10010 dev vx0
Fix by acquiring the hash lock before the deletion and releasing it
afterwards. Blame the original commit that introduced the issue rather
than the one that exposed it.
[1]
WARNING: CPU: 3 PID: 392 at drivers/net/vxlan/vxlan_core.c:417 vxlan_find_mac+0x17f/0x1a0
[...]
RIP: 0010:vxlan_find_mac+0x17f/0x1a0
[...]
Call Trace:
<TASK>
__vxlan_fdb_delete+0xbe/0x560
vxlan_vni_delete_group+0x2ba/0x940
vxlan_vni_del.isra.0+0x15f/0x580
vxlan_process_vni_filter+0x38b/0x7b0
vxlan_vnifilter_process+0x3bb/0x510
rtnetlink_rcv_msg+0x2f7/0xb70
netlink_rcv_skb+0x131/0x360
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
__sys_sendmsg+0x121/0x1b0
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250423145131.513029-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|