linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-10-25	net/tls: getsockopt supports complete algorithm list	Tianjia Zhang	1	-0/+42
	AES_CCM_128 and CHACHA20_POLY1305 are already supported by tls, similar to setsockopt, getsockopt also needs to support these two algorithms. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net/tls: tls_crypto_context add supported algorithms context	Tianjia Zhang	1	-0/+2
	tls already supports the SM4 GCM/CCM algorithms. It is also necessary to add support for these two algorithms in tls_crypto_context to avoid potential issues caused by forced type conversion. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	mlxsw: spectrum: Use 'bitmap_zalloc()' when applicable	Christophe JAILLET	4	-27/+16
	Use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	usbb: catc: use correct API for MAC addresses	Oliver Neukum	1	-5/+17
	Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. In the case of catc we need a new temporary buffer to conform to the rules for DMA coherency. That in turn necessitates a reworking of error handling in probe(). Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	gve: Add a jumbo-frame device option.	Shailend Chand	2	-4/+68
	A widely deployed driver has a bug that will cause the driver not to load when a max_mtu > 2048 is present in the device descriptor. To avoid this bug while still enabling jumbo frames, we present a lower max_mtu in the device descriptor and pass the actual max_mtu in a separate device option. The driver supports 2 different queue formats. To enable features on one queue format, but not the other, a supported_features mask was added to the device options in the device descriptor. Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	gve: Implement packet continuation for RX.	David Awogbemila	9	-126/+292
	This enables the driver to receive RX packets spread across multiple buffers: For a given multi-fragment packet the "packet continuation" bit is set on all descriptors except the last one. These descriptors' payloads are combined into a single SKB before the SKB is handed to the networking stack. This change adds a "packet buffer size" notion for RX queues. The CreateRxQueue AdminQueue command sent to the device now includes the packet_buffer_size. We opt for a packet_buffer_size of PAGE_SIZE / 2 to give the driver the opportunity to flip pages where we can instead of copying. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	gve: Add RX context.	David Awogbemila	2	-37/+44
	This refactor moves the skb_head and skb_tail fields into a new gve_rx_ctx struct. This new struct will contain information about the current packet being processed. This is in preparation for multi-descriptor RX packets. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	selftests: mlxsw: Reduce test run time	Ido Schimmel	2	-18/+20
	Instead of iterating over all the available trap policers, only perform the tests with three policers: The first, the last and the one in the middle of the range. On a Spectrum-3 system, this reduces the run time from almost an hour to a few minutes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	selftests: mlxsw: Use permanent neighbours instead of reachable ones	Ido Schimmel	1	-11/+11
	The nexthop objects tests configure dummy reachable neighbours so that the nexthops will have a MAC address and be programmed to the device. Since these are dummy reachable neighbours, they can be transitioned by the kernel to a failed state if they are around for too long. This can happen, for example, if the "TIMEOUT" variable is configured with a too high value. Make the tests more robust by configuring the neighbours as permanent, so that the tests do not depend on the configured timeout value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	selftests: mlxsw: Add helpers for skipping selftests	Petr Machata	8	-24/+81
	A number of mlxsw-specific selftests currently detect whether they are run on a compatible machine, and bail out silently when not. These tests are however done in a somewhat impenetrable manner by directly comparing PCI IDs against a blacklist or a whitelist, and bailing out silently if the machine is not compatible. Instead, add a helper, mlxsw_only_on_spectrum(), which allows specifying the supported machines in a human-readable manner. If the current machine is incompatible, the helper emits a SKIP message and returns an error code, based on which the caller can gracefully bail out in a suitable way. This allows a more readable conditions such as: mlxsw_only_on_spectrum 2+ \|\| return Convert all existing open-coded guards to the new helper. Also add two new guards to do_mark_test() and do_drop_test(), which are supported only on Spectrum-2+, but the corresponding check was not there. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 cdt feature	Luo Jie	1	-3/+191
	To perform CDT of qca8081 phy: 1. disable hibernation. 2. force phy working in MDI mode. 3. force phy working in 1000BASE-T mode. 4. configure the related thresholds. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: adjust qca8081 master/slave seed value if link down	Luo Jie	1	-0/+16
	1. The master/slave seed needs to be updated when the link can't be created. 2. The case where two qca8081 PHYs are connected each other and master/slave seed is generated as the same value also needs to be considered, so adding this code change into read_status instead of link_change_notify. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 soft_reset and enable master/slave seed	Luo Jie	1	-0/+48
	qca8081 phy is a single port phy, configure phy the lower seed value to make it linked as slave mode easier. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 config_init	Luo Jie	1	-0/+107
	Add the qca8081 phy driver config_init function, which includes: 1. Enable fast restrain. 2. Add 802.3az configurations. 3. Initialize ADC threshold as 100mv. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add genphy_c45_fast_retrain	Luo Jie	2	-0/+35
	Add generic fast retrain auto-negotiation function for C45 PHYs. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add constants for fast retrain related register	Luo Jie	1	-0/+9
	Add the constants for 2.5G fast retrain capability in 10G AN control register, fast retrain status and control register and THP bypass register into mdio.h. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 config_aneg	Luo Jie	1	-1/+25
	Reuse at803x phy driver config_aneg excepting adding 2500M auto-negotiation. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 get_features	Luo Jie	1	-0/+10
	Reuse the at803x phy driver get_features excepting adding 2500M capability. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 read_status	Luo Jie	1	-22/+73
	1. Separate the function at803x_read_specific_status from the at803x_read_status, since it can be reused by the read_status of qca8081 phy driver excepting adding the 2500M speed. 2. Add the qca8081 read_status function qca808x_read_status. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: add qca8081 ethernet phy driver	Luo Jie	1	-1/+16
	qca8081 is a single port ethernet phy chip that supports 10/100/1000/2500 Mbps mode. Add the basic phy driver features, and reuse the at803x phy driver functions. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: at803x: use GENMASK() for speed status	Luo Jie	1	-5/+5
	Use GENMASK() for the current speed value. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: at803x: improve the WOL feature	Luo Jie	1	-7/+38
	The wol feature is controlled by the MMD3.8012 bit5, need to set this bit when the wol function is enabled. The reg18 bit0 is for enabling WOL interrupt, when wol occurs, the wol interrupt status reg19 bit0 is set to 1. Call phy_trigger_machine if there are any other interrupt pending in the function set_wol. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: at803x: use phy_modify()	Luo Jie	1	-6/+2
	Convert at803x_set_wol to use phy_modify. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: phy: at803x: replace AT803X_DEVICE_ADDR with MDIO_MMD_PCS	Luo Jie	1	-3/+3
	Replace AT803X_DEVICE_ADDR with MDIO_MMD_PCS defined in mdio.h. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: add error recovery module and type for himac	Jiaran Zhang	2	-0/+12
	This patch adds himac error recovery module, link_error type and ptp_error type for himac. Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: add new ras error type for roce	Weihang Li	2	-1/+5
	This patch adds one ras error of bus related for roce, this error including RRESP/BRESP and read poison error. Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: add update ethtool advertised link modes for FIBRE port when autoneg off	Guangbin Huang	1	-1/+77
	Currently, the ethtool advertised link modes of FIBRE port is cleared to zero when autoneg is off, so user can not get the advertised link modes info directly from "ethtool <dev>" command. In order to ameliorate this situation, update data of speeds, fec and pause of advertised link modes when autoneg is off. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: modify functions of converting speed ability to ethtool link mode	Guangbin Huang	1	-33/+37
	The functions of converting speed ability to ethtool link mode just support setting mac->supported currently, to reuse these functions to set ethtool link mode for others(i.e. advertising), delete the argument mac and add argument link_mode. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: add support pause/pfc durations for mac statistics	Guangbin Huang	4	-92/+182
	The mac statistics add pause/pfc durations in device version V3, we can get total active cycle of pause/pfc from these durations. As driver gets register number from firmware to calculate desc number to query mac statistics, it needs to set mac statistics extended enable bit in firmware command 0x701A to tell firmware that driver supports extended mac statistics, otherwise firmware only returns register number of version V1. As pause/pfc durations are not supported by hardware of old version, they should not been shown in command "ethtool -S ethX" in this case, so add checking max register number of each mac statistic in their version. If the max register number of one mac statistic is greater than register number got from firmware, it means hardware does not support this mac statistic, so ignore this statistic when get string and data of mac statistic. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: device specifications add number of mac statistics	Guangbin Huang	3	-11/+26
	Currently, driver queries number of mac statistics before querying mac statistics. As the number of mac statistics is a fixed value in firmware, it is redundant to query this number everytime before querying mac statistics, it can just be queried once in initialization process and saved in device specifications. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: modify mac statistics update process for compatibility	Guangbin Huang	2	-47/+48
	After querying mac statistics from firmware, driver copies data from descriptors to struct mac_stats of hdev, and the number of copied data is just according to the register number queried from firmware. There is a problem that if the register number queried from firmware is larger than data number of struct mac_stats, it will cause a copy overflow. So if the firmware adds more mac statistics in later version, it is not compatible with driver of old version. To fix this problem, the number of copied data needs to be used the minimum value between the register number queried from firmware and data number of struct mac_stats. The first descriptor has three data and there is one reserved, to optimize the copy process, add this reserverd data to struct mac_stats. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: hns3: add debugfs support for interrupt coalesce	Huazhong Tan	3	-1/+122
	Since user may need to check the current configuration of the interrupt coalesce, so add debugfs support for query this info. Create a single file "coalesce_info" for it, and query it by "cat coalesce_info", return the result to userspace. For device whose version is above V3(include V3), the GL's register contains usecs and 1us unit configuration. When get the usecs configuration from this register, it will include the confusing unit configuration, so add a GL mask to get the correct value, and add a QL mask for the frames configuration as well. The display style is below: $ cat coalesce_info tx interrupt coalesce info: VEC_ID ALGO_STATE PROFILE_ID CQE_MODE TUNE_STATE STEPS_LEFT... 0 IN_PROG 4 EQE ON_TOP 0... 1 START 3 EQE LEFT 1... rx interrupt coalesce info: VEC_ID ALGO_STATE PROFILE_ID CQE_MODE TUNE_STATE STEPS_LEFT... 0 IN_PROG 3 EQE LEFT 1... 1 IN_PROG 0 EQE ON_TOP 0... Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: update kerneldoc for qeth_add_hw_header()	Julian Wiedmann	1	-0/+2
	qeth_add_hw_header() is missing documentation for some of its parameters, fix that up. Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: fix kernel doc comments	Heiko Carstens	2	-6/+6
	Fix kernel doc comments and remove incorrect kernel doc indicators. Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: add __printf format attribute to qeth_dbf_longtext	Heiko Carstens	1	-0/+1
	Allow the compiler to recognize and check format strings and parameters. As reported with allmodconfig and W=1: drivers/s390/net/qeth_core_main.c: In function ‘qeth_dbf_longtext’: drivers/s390/net/qeth_core_main.c:6190:9: error: function ‘qeth_dbf_longtext’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format] 6190 \| vsnprintf(dbf_txt_buf, sizeof(dbf_txt_buf), fmt, args); \| ^~~~~~~~~ Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: fix various format strings	Heiko Carstens	1	-7/+7
	Various format strings don't match with types of parameters. Fix all of them. Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: don't keep track of Input Queue count	Julian Wiedmann	2	-11/+7
	The only actual user of qdio.no_input_queues is qeth_qdio_establish(), and there we already have full awareness of the current Input Queue configuration (1 RX queue, plus potentially 1 TX Completion queue). So avoid this state tracking, and the ambiguity it brings with it. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: clarify remaining dev_kfree_skb_any() users	Julian Wiedmann	1	-3/+3
	For none of the users we are under risk of running in HW IRQ context or or with IRQs disabled. Thus we always end up in consume_skb(). But the two occurences in the RX path should really report the dropped packet to dropmon, so have them use kfree_skb() instead. That's also consistent with what napi_free_frags() does internally. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: move qdio's QAOB cache into qeth	Julian Wiedmann	3	-36/+19
	qdio.ko no longer needs to care about how the QAOBs are allocated, from its perspective they are merely another parameter to do_QDIO(). So for a start, shift the cache into the only qdio driver that uses QAOBs (ie. qeth). Here there's further opportunity to optimize its usage in the future - eg. make it per-{device, TX queue}, or only compile it when the driver is built with CQ/QAOB support. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: remove .do_ioctl() callback from driver discipline	Julian Wiedmann	4	-12/+6
	With commit 18787eeebd71 ("qeth: use ndo_siocdevprivate") this callback is now actually used to handle transport mode-specific _private_ ioctls. We only have such ioctls for L3 devices. So wire up a L3-specific .ndo_siocdevprivate() callback that handles those ioctls, and defers to the core qeth_siocdevprivate() for all other private ioctls. This takes the discipline one step closer to its original purpose of providing an internal extension for the qeth_core_ccwgroup_driver. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	s390/qeth: improve trace entries for MAC address (un)registration	Julian Wiedmann	1	-6/+6
	Add the failed MAC address into the trace message. Also fix up one format string to use %x instead of %u for the CARD_DEVID. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	selftests: net: dsa: add a stress test for unlocked FDB operations	Vladimir Oltean	2	-0/+48
	This test is a bit strange in that it is perhaps more manual than others: it does not transmit a clear OK/FAIL verdict, because user space does not have synchronous feedback from the kernel. If a hardware access fails, it is in deferred context. Nonetheless, on sja1105 I have used it successfully to find and solve a concurrency issue, so it can be used as a starting point for other driver maintainers too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	selftests: lib: forwarding: allow tests to not require mz and jq	Vladimir Oltean	1	-2/+8
	These programs are useful, but not all selftests require them. Additionally, on embedded boards without package management (things like buildroot), installing mausezahn or jq is not always as trivial as downloading a package from the web. So it is actually a bit annoying to require programs that are not used. Introduce options that can be set by scripts to not enforce these dependencies. For compatibility, default to "yes". Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Guillaume Nault <gnault@redhat.com> Cc: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work	Vladimir Oltean	1	-2/+0
	After talking with Ido Schimmel, it became clear that rtnl_lock is not actually required for anything that is done inside the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE deferred work handlers. The reason why it was probably added by Arkadi Sharshevsky in commit c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification") was to offer the same locking/serialization guarantees as .ndo_fdb_{add,del} and avoid reworking any drivers. DSA has implemented .ndo_fdb_add and .ndo_fdb_del until commit b117e1e8a86d ("net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_del") - that is to say, until fairly recently. But those methods have been deleted, so now we are free to drop the rtnl_lock as well. Note that exposing DSA switch drivers to an unlocked method which was previously serialized by the rtnl_mutex is a potentially dangerous affair. Driver writers couldn't ensure that their internal locking scheme does the right thing even if they wanted. We could err on the side of paranoia and introduce a switch-wide lock inside the DSA framework, but that seems way overreaching. Instead, we could check as many drivers for regressions as we can, fix those first, then let this change go in once it is assumed to be fairly safe. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: dsa: introduce locking for the address lists on CPU and DSA ports	Vladimir Oltean	3	-24/+54
	Now that the rtnl_mutex is going away for dsa_port_{host_,}fdb_{add,del}, no one is serializing access to the address lists that DSA keeps for the purpose of reference counting on shared ports (CPU and cascade ports). It can happen for one dsa_switch_do_fdb_del to do list_del on a dp->fdbs element while another dsa_switch_do_fdb_{add,del} is traversing dp->fdbs. We need to avoid that. Currently dp->mdbs is not at risk, because dsa_switch_do_mdb_{add,del} still runs under the rtnl_mutex. But it would be nice if it would not depend on that being the case. So let's introduce a mutex per port (the address lists are per port too) and share it between dp->mdbs and dp->fdbs. The place where we put the locking is interesting. It could be tempting to put a DSA-level lock which still serializes calls to .port_fdb_{add,del}, but it would still not avoid concurrency with other driver code paths that are currently under rtnl_mutex (.port_fdb_dump, .port_fast_age). So it would add a very false sense of security (and adding a global switch-wide lock in DSA to resynchronize with the rtnl_lock is also counterproductive and hard). So the locking is intentionally done only where the dp->fdbs and dp->mdbs lists are traversed. That means, from a driver perspective, that .port_fdb_add will be called with the dp->addr_lists_lock mutex held on the CPU port, but not held on user ports. This is done so that driver writers are not encouraged to rely on any guarantee offered by dp->addr_lists_lock. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: dsa: lantiq_gswip: serialize access to the PCE registers	Vladimir Oltean	1	-5/+23
	The GSWIP switch accesses various bridging layer tables (VLANs, FDBs, forwarding rules) indirectly through PCE registers. These hardware accesses are non-atomic, being comprised of several register reads and writes. These accesses are currently serialized by the rtnl_lock, but DSA is changing its driver API and that lock will no longer be held when calling ->port_fdb_add() and ->port_fdb_del(). So this driver needs to serialize the access to the PCE registers using its own locking scheme. This patch adds that. Note that the driver also uses the gswip_pce_load_microcode() function to load a static configuration for the packet classification engine into a table using the same registers. It is currently not protected, but since that configuration is only done from the dsa_switch_ops :: setup method, there is no risk of it being concurrent with other operations. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: dsa: b53: serialize access to the ARL table	Vladimir Oltean	2	-7/+30
	The b53 driver performs non-atomic transactions to the ARL table when adding, deleting and reading FDB and MDB entries. Traditionally these were all serialized by the rtnl_lock(), but now it is possible that DSA calls ->port_fdb_add and ->port_fdb_del without holding that lock. So the driver must have its own serialization logic. Add a mutex and hold it from all entry points (->port_fdb_{add,del,dump}, ->port_mdb_{add,del}). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: mscc: ocelot: serialize access to the MAC table	Vladimir Oltean	2	-12/+44
	DSA would like to remove the rtnl_lock from its SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handlers, and the felix driver uses the same MAC table functions as ocelot. This means that the MAC table functions will no longer be implicitly serialized with respect to each other by the rtnl_mutex, we need to add a dedicated lock in ocelot for the non-atomic operations of selecting a MAC table row, reading/writing what we want and polling for completion. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: dsa: sja1105: serialize access to the dynamic config interface	Vladimir Oltean	3	-2/+13
	The sja1105 hardware seems as concurrent as can be, but when we create a background script that adds/removes a rain of FDB entries without the rtnl_mutex taken, then in parallel we do another operation like run 'bridge fdb show', we can notice these errors popping up: sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:40 vid 0: -ENOENT sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:40 vid 0 to fdb: -2 sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:46 vid 0: -ENOENT sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:46 vid 0 to fdb: -2 Luckily what is going on does not require a major rework in the driver. The sja1105_dynamic_config_read() function sends multiple SPI buffers to the peripheral until the operation completes. We should not do anything until the hardware clears the VALID bit. But since there is no locking (i.e. right now we are implicitly serialized by the rtnl_mutex, but if we remove that), it might be possible that the process which performs the dynamic config read is preempted and another one performs a dynamic config write. What will happen in that case is that sja1105_dynamic_config_read(), when it resumes, expects to see VALIDENT set for the entry it reads back. But it won't. This can be corrected by introducing a mutex for serializing SPI accesses to the dynamic config interface which should be atomic with respect to each other. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25	net: dsa: sja1105: wait for dynamic config command completion on writes too	Vladimir Oltean	1	-22/+59
	The hardware manual says that software should attempt a new dynamic config access (be it a a write or a read-back) only while the VALID bit is cleared. The VALID bit is set by software to 1, and it remains set as long as the hardware is still processing the request. Currently the driver only polls for the command completion only for reads, because that's when we need the actual data read back. Writes have been more or less "asynchronous", although this has never been an observable issue. This change makes sja1105_dynamic_config_write poll the VALID bit as well, to absolutely ensure that a follow-up access to the static config finds the VALID bit cleared. So VALID means "work in progress", while VALIDENT means "entry being read is valid". On reads we check the VALIDENT bit too, while on writes that bit is not always defined. So we need to factor it out of the loop, and make the loop provide back the unpacked command structure, so that sja1105_dynamic_config_read can check the VALIDENT bit. The change also attempts to convert the open-coded loop to use the read_poll_timeout macro, since I know this will come up during review. It's more code, but hey, it uses read_poll_timeout! Tested on SJA1105T, SJA1105S, SJA1110A. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>