linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-04-29	dt-bindings: net: micrel: add coma-mode-gpios property	Michael Walle	1	-0/+9
	The LAN8814 has a coma mode pin which is used to put the PHY into isolate and power-down mode. Usually strapped to be asserted by default. A GPIO is then used to take the PHY out of this mode. Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29	qeth: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	4	-5/+3
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: velocity: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-3/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: spider: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-2/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Acked-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: vxge: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-3/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: gfar: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-4/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: benet: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-3/+2
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: atlantic: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	3	-4/+2
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	net: bgmac: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-3/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	slic: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-3/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	usb: lan78xx: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	1	-3/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: mtk_eth_soc: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-3/+2
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: pch_gbe: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	1	-7/+5
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: cpsw: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	4	-11/+10
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: smsc: remove a copy of the NAPI_POLL_WEIGHT define	Jakub Kicinski	2	-2/+1
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29	eth: remove copies of the NAPI_POLL_WEIGHT define	Jakub Kicinski	7	-16/+8
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Drop the special defines in a bunch of drivers where the removal is relatively simple so grouping into one patch does not impact reviewability. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-28	net: dsa: mv88e6xxx: Single chip mode detection for MV88E6*41	Nathan Rossi	1	-7/+39
	The mv88e6xxx driver expects switches that are configured in single chip addressing mode to have the MDIO address configured as 0. This is due to the switch ADDR pins representing the single chip addressing mode as 0. However depending on the device (e.g. MV88E6*41) the switch does not respond on address 0 or any other address below 16 (the first port address) in single chip addressing mode. This allows for other devices to be on the same shared MDIO bus despite the switch being in single chip addressing mode. When using a switch that works this way it is not possible to configure switch driver as single chip addressing via device tree, along with another MDIO device on the same bus with address 0, as both devices would have the same address of 0 resulting in mdiobus_register_device -EBUSY errors for one of the devices with address 0. In order to support this configuration the switch node can have its MDIO address configured as 16 (the first address that the device responds to). During initialization the driver will treat this address similar to how address 0 is, however because this address is also a valid multi-chip address (in certain switch models, but not all) the driver will configure the SMI in single chip addressing mode and attempt to detect the switch model. If the device is configured in single chip addressing mode this will succeed and the initialization process can continue. If it fails to detect a valid model this is because the switch model register is not a valid register when in multi-chip mode, it will then fall back to the existing SMI initialization process using the MDIO address as the multi-chip mode address. This detection method is safe if the device is in either mode because the single chip addressing mode read is a direct SMI/MDIO read operation and has no side effects compared to the SMI writes required for the multi-chip addressing mode. In order to implement this change, the reset gpio configuration is moved to occur before any SMI initialization. This ensures that the device has the same/correct reset gpio state for both mv88e6xxx_smi_init calls. Signed-off-by: Nathan Rossi <nathan@nathanrossi.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220427130928.540007-1-nathan@nathanrossi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	net: prestera: add police action support	Volodymyr Mytnyk	5	-2/+149
	- Add HW api to configure policer: - SR TCM policer mode is only supported for now. - Policer ingress/egress direction support. - Add police action support into flower Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu> Link: https://lore.kernel.org/r/1651061148-21321-1-git-send-email-volodymyr.mytnyk@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	net: phy: Deduplicate interrupt disablement on PHY attach	Lukas Wunner	1	-4/+2
	phy_attach_direct() first calls phy_init_hw() (which restores interrupt settings through ->config_intr()), then calls phy_disable_interrupts(). So if phydev->interrupts was previously set to 1, interrupts are briefly enabled, then disabled, which seems nonsensical. If it was previously set to 0, interrupts are disabled twice, which is equally nonsensical. Deduplicate interrupt disablement. Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/805ccdc606bd8898d59931bd4c7c68537ed6e550.1651040826.git.lukas@wunner.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	net: SO_RCVMARK socket option for SO_MARK with recvmsg()	Erin MacNeil	23	-27/+56
	Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK should be included in the ancillary data returned by recvmsg(). Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs(). Signed-off-by: Erin MacNeil <lnx.erin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	tcp: fix F-RTO may not work correctly when receiving DSACK	Pengcheng Yang	1	-1/+2
	Currently DSACK is regarded as a dupack, which may cause F-RTO to incorrectly enter "loss was real" when receiving DSACK. Packetdrill to demonstrate: // Enable F-RTO and TLP 0 `sysctl -q net.ipv4.tcp_frto=2` 0 `sysctl -q net.ipv4.tcp_early_retrans=3` 0 `sysctl -q net.ipv4.tcp_congestion_control=cubic` // Establish a connection +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 // RTT 10ms, RTO 210ms +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <...> +.01 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 // Send 2 data segments +0 write(4, ..., 2000) = 2000 +0 > P. 1:2001(2000) ack 1 // TLP +.022 > P. 1001:2001(1000) ack 1 // Continue to send 8 data segments +0 write(4, ..., 10000) = 10000 +0 > P. 2001:10001(8000) ack 1 // RTO +.188 > . 1:1001(1000) ack 1 // The original data is acked and new data is sent(F-RTO step 2.b) +0 < . 1:1(0) ack 2001 win 257 +0 > P. 10001:12001(2000) ack 1 // D-SACK caused by TLP is regarded as a dupack, this results in // the incorrect judgment of "loss was real"(F-RTO step 3.a) +.022 < . 1:1(0) ack 2001 win 257 <sack 1001:2001,nop,nop> // Never-retransmitted data(3001:4001) are acked and // expect to switch to open state(F-RTO step 3.b) +0 < . 1:1(0) ack 4001 win 257 +0 %{ assert tcpi_ca_state == 0, tcpi_ca_state }% Fixes: e33099f96d99 ("tcp: implement RFC5682 F-RTO") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1650967419-2150-1-git-send-email-yangpc@wangsu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"	Dany Madden	2	-100/+35
	This reverts commit 723ad916134784b317b72f3f6cf0f7ba774e5dae When client requests channel or ring size larger than what the server can support the server will cap the request to the supported max. So, the client would not be able to successfully request resources that exceed the server limit. Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits") Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASK	Vladimir Oltean	1	-4/+0
	The Time-Specified Departure feature is indeed mutually exclusive with TX IP checksumming in ENETC, but TX checksumming in itself is broken and was removed from this driver in commit 82728b91f124 ("enetc: Remove Tx checksumming offload code"). The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply with software TSO's expectations, and still did the checksumming in software by calling skb_checksum_help(). So there isn't any restriction for the Time-Specified Departure feature. However, enetc_setup_tc_txtime() doesn't understand that, and blindly looks for NETIF_F_CSUM_MASK. Instead of checking for things which can literally never happen in the current code base, just remove the check and let the driver offload tc-etf qdiscs. Fixes: acede3c5dad5 ("net: enetc: declare NETIF_F_HW_CSUM and do it in software") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	ixgbe: ensure IPsec VF<->PF compatibility	Leon Romanovsky	1	-1/+2
	The VF driver can forward any IPsec flags and such makes the function is not extendable and prone to backward/forward incompatibility. If new software runs on VF, it won't know that PF configured something completely different as it "knows" only XFRM_OFFLOAD_INBOUND flag. Fixes: eda0333ac293 ("ixgbe: add VF IPsec management") Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220427173152.443102-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	MAINTAINERS: Update BNXT entry with firmware files	Florian Fainelli	1	-0/+2
	There appears to be a maintainer gap for BNXT TEE firmware files which causes some patches to be missed. Update the entry for the BNXT Ethernet controller with its companion firmware files. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220427163606.126154-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	netfilter: nft_socket: only do sk lookups when indev is available	Florian Westphal	1	-14/+38
	Check if the incoming interface is available and NFT_BREAK in case neither skb->sk nor input device are set. Because nf_sk_lookup_slow*() assume packet headers are in the 'in' direction, use in postrouting is not going to yield a meaningful result. Same is true for the forward chain, so restrict the use to prerouting, input and output. Use in output work if a socket is already attached to the skb. Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-04-28	gfs2: No short reads or writes upon glock contention	Andreas Gruenbacher	1	-4/+0
	Commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O") changed gfs2_file_read_iter() and gfs2_file_buffered_write() to allow dropping the inode glock while faulting in user buffers. When the lock was dropped, a short result was returned to indicate that the operation was interrupted. As pointed out by Linus (see the link below), this behavior is broken and the operations should always re-acquire the inode glock and resume the operation instead. Link: https://lore.kernel.org/lkml/CAHk-=whaz-g_nOOoo8RRiWNjnv2R+h6_xk2F1J4TuSRxk1MtLw@mail.gmail.com/ Fixes: 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-04-28	net: make sure net_rx_action() calls skb_defer_free_flush()	Eric Dumazet	1	-1/+2
	I missed a stray return; in net_rx_action(), which very well is taken whenever trigger_rx_softirq() has been called on a cpu that is no longer receiving network packets, or receiving too few of them. Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220427204147.1310161-1-eric.dumazet@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-28	ARM: dts: aspeed: add reset properties into MDIO nodes	Dylan Hung	1	-0/+4
	Add reset control properties into MDIO nodes. The 4 MDIO controllers in AST2600 SOC share one reset control bit SCU50[3]. Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-28	net: mdio: add reset control for Aspeed MDIO	Dylan Hung	1	-1/+14
	Add reset assertion/deassertion for Aspeed MDIO. There are 4 MDIO controllers embedded in Aspeed AST2600 SOC and share one reset control register SCU50[3]. To work with old DT blobs which don't have the reset property, devm_reset_control_get_optional_shared is used in this change. Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-28	dt-bindings: net: add reset property for aspeed, ast2600-mdio binding	Dylan Hung	1	-0/+6
	The AST2600 MDIO bus controller has a reset control bit and must be deasserted before manipulating the MDIO controller. By default, the hardware asserts the reset so the driver only need to deassert it. Regarding to the old DT blobs which don't have reset property in them, the reset deassertion is usually done by the bootloader so the reset property is optional to work with them. Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-27	net: fec: add missing of_node_put() in fec_enet_init_stop_mode()	Yang Yingliang	1	-1/+1
	Put device node in error path in fec_enet_init_stop_mode(). Fixes: 8a448bf832af ("net: ethernet: fec: move GPR register offset and bit into DT") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220426125231.375688-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27	bnx2x: fix napi API usage sequence	Manish Chopra	1	-4/+5
	While handling PCI errors (AER flow) driver tries to disable NAPI [napi_disable()] after NAPI is deleted [__netif_napi_del()] which causes unexpected system hang/crash. System message log shows the following: ======================================= [ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures. [ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)' [ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->error_detected(IO frozen) [ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports: 'need reset' [ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking bnx2x->error_detected(IO frozen) [ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports: 'need reset' [ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset' [ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow: [ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows: [ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow: [ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows: [ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset' [ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->slot_reset() [ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing... [ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset --> driver unload Uninterruptible tasks ===================== crash> ps \| grep UN 213 2 11 c000000004c89e00 UN 0.0 0 0 [eehd] 215 2 0 c000000004c80000 UN 0.0 0 0 [kworker/0:2] 2196 1 28 c000000004504f00 UN 0.1 15936 11136 wickedd 4287 1 9 c00000020d076800 UN 0.0 4032 3008 agetty 4289 1 20 c00000020d056680 UN 0.0 7232 3840 agetty 32423 2 26 c00000020038c580 UN 0.0 0 0 [kworker/26:3] 32871 4241 27 c0000002609ddd00 UN 0.1 18624 11648 sshd 32920 10130 16 c00000027284a100 UN 0.1 48512 12608 sendmail 33092 32987 0 c000000205218b00 UN 0.1 48512 12608 sendmail 33154 4567 16 c000000260e51780 UN 0.1 48832 12864 pickup 33209 4241 36 c000000270cb6500 UN 0.1 18624 11712 sshd 33473 33283 0 c000000205211480 UN 0.1 48512 12672 sendmail 33531 4241 37 c00000023c902780 UN 0.1 18624 11648 sshd EEH handler hung while bnx2x sleeping and holding RTNL lock =========================================================== crash> bt 213 PID: 213 TASK: c000000004c89e00 CPU: 11 COMMAND: "eehd" #0 [c000000004d477e0] __schedule at c000000000c70808 #1 [c000000004d478b0] schedule at c000000000c70ee0 #2 [c000000004d478e0] schedule_timeout at c000000000c76dec #3 [c000000004d479c0] msleep at c0000000002120cc #4 [c000000004d479f0] napi_disable at c000000000a06448 ^^^^^^^^^^^^^^^^ #5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x] #6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x] #7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc #8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8 #9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64 And the sleeping source code ============================ crash> dis -ls c000000000a06448 FILE: ../net/core/dev.c LINE: 6702 6697 { 6698 might_sleep(); 6699 set_bit(NAPI_STATE_DISABLE, &n->state); 6700 6701 while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) * 6702 msleep(1); 6703 while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state)) 6704 msleep(1); 6705 6706 hrtimer_cancel(&n->timer); 6707 6708 clear_bit(NAPI_STATE_DISABLE, &n->state); 6709 } EEH calls into bnx2x twice based on the system log above, first through bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes the following call chains: bnx2x_io_error_detected() +-> bnx2x_eeh_nic_unload() +-> bnx2x_del_all_napi() +-> __netif_napi_del() bnx2x_io_slot_reset() +-> bnx2x_netif_stop() +-> bnx2x_napi_disable() +->napi_disable() Fix this by correcting the sequence of NAPI APIs usage, that is delete the NAPI after disabling it. Fixes: 7fa6f34081f1 ("bnx2x: AER revised") Reported-by: David Christensen <drc@linux.vnet.ibm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220426153913.6966-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27	net: dsa: ksz9477: move get_stats64 to ksz_common.c	Arun Ramadoss	3	-96/+101
	The mib counters for the ksz9477 is same for the ksz9477 switch and LAN937x switch. Hence moving it to ksz_common.c file in order to have it generic function. The DSA hook get_stats64 now can call ksz_get_stats64. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220426091048.9311-1-arun.ramadoss@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27	tls: Skip tls_append_frag on zero copy size	Maxim Mikityanskiy	1	-5/+7
	Calling tls_append_frag when max_open_record_len == record->len might add an empty fragment to the TLS record if the call happens to be on the page boundary. Normally tls_append_frag coalesces the zero-sized fragment to the previous one, but not if it's on page boundary. If a resync happens then, the mlx5 driver posts dump WQEs in tx_post_resync_dump, and the empty fragment may become a data segment with byte_count == 0, which will confuse the NIC and lead to a CQE error. This commit fixes the described issue by skipping tls_append_frag on zero size to avoid adding empty fragments. The fix is not in the driver, because an empty fragment is hardly the desired behavior. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220426154949.159055-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27	docs: vm/page_owner: use literal blocks for param description	Akira Yokosawa	1	-2/+3
	Sphinx generates hard-to-read lists of parameters at the bottom of the page. Fix them by putting literal-block markers of "::" in front of them. Link: https://lkml.kernel.org/r/cfd3bcc0-b51d-0c68-c065-ca1c4c202447@gmail.com Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Fixes: 57f2b54a9379 ("Documentation/vm/page_owner.rst: update the documentation") Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn> Cc: Haowen Bai <baihaowen@meizu.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Alex Shi <seakeel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27	kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time	Zqiang	1	-0/+7
	kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/ destroy(). The kasan_quarantine_remove_cache() call is protected by cpuslock in kmem_cache_destroy() to ensure serialization with kasan_cpu_offline(). However the kasan_quarantine_remove_cache() call is not protected by cpuslock in kmem_cache_shrink(). When a CPU is going offline and cache shrink occurs at same time, the cpu_quarantine may be corrupted by interrupt (per_cpu_remove_cache operation). So add a cpu_quarantine offline flags check in per_cpu_remove_cache(). [akpm@linux-foundation.org: add comment, per Zqiang] Link: https://lkml.kernel.org/r/20220414025925.2423818-1-qiang1.zhang@intel.com Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27	intel_idle: Fix SPR C6 optimization	Artem Bityutskiy	1	-5/+3
	The Sapphire Rapids (SPR) C6 optimization was added to the end of the 'spr_idle_state_table_update()' function. However, the function has a 'return' which may happen before the optimization has a chance to run. And this may prevent the optimization from happening. This is an unlikely scenario, but possible if user boots with, say, the 'intel_idle.preferred_cstates=6' kernel boot option. This patch fixes the issue by eliminating the problematic 'return' statement. Fixes: 3a9cf77b60dc ("intel_idle: add core C6 optimization for SPR") Suggested-by: Jan Beulich <jbeulich@suse.com> Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-27	intel_idle: Fix the 'preferred_cstates' module parameter	Artem Bityutskiy	1	-7/+12
	Problem description. When user boots kernel up with the 'intel_idle.preferred_cstates=4' option, we enable C1E and disable C1 states on Sapphire Rapids Xeon (SPR). In order for C1E to work on SPR, we have to enable the C1E promotion bit on all CPUs. However, we enable it only on one CPU. Fix description. The 'intel_idle' driver already has the infrastructure for disabling C1E promotion on every CPU. This patch uses the same infrastructure for enabling C1E promotion on every CPU. It changes the boolean 'disable_promotion_to_c1e' variable to a tri-state 'c1e_promotion' variable. Tested on a 2-socket SPR system. I verified the following combinations: * C1E promotion enabled and disabled in BIOS. * Booted with and without the 'intel_idle.preferred_cstates=4' kernel argument. In all 4 cases C1E promotion was correctly set on all CPUs. Also tested on an old Broadwell system, just to make sure it does not cause a regression. C1E promotion was correctly disabled on that system, both C1 and C1E were exposed (as expected). Fixes: da0e58c038e6 ("intel_idle: add 'preferred_cstates' module argument") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-27	hex2bin: fix access beyond string end	Mikulas Patocka	1	-3/+6
	If we pass too short string to "hex2bin" (and the string size without the terminating NUL character is even), "hex2bin" reads one byte after the terminating NUL character. This patch fixes it. Note that hex_to_bin returns -1 on error and hex2bin return -EINVAL on error - so we can't just return the variable "hi" or "lo" on error. This inconsistency may be fixed in the next merge window, but for the purpose of fixing this bug, we just preserve the existing behavior and return -1 and -EINVAL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Fixes: b78049831ffe ("lib: add error checking to hex2bin") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27	hex2bin: make the function hex_to_bin constant-time	Mikulas Patocka	2	-8/+26
	The function hex2bin is used to load cryptographic keys into device mapper targets dm-crypt and dm-integrity. It should take constant time independent on the processed data, so that concurrently running unprivileged code can't infer any information about the keys via microarchitectural convert channels. This patch changes the function hex_to_bin so that it contains no branches and no memory accesses. Note that this shouldn't cause performance degradation because the size of the new function is the same as the size of the old function (on x86-64) - and the new function causes no branch misprediction penalties. I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64 i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32 sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64 powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no branches in the generated code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27	Add Eric Dumazet to networking maintainers	Jakub Kicinski	1	-0/+2
	Welcome Eric! Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Link: https://lore.kernel.org/r/20220426175723.417614-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27	floppy: disable FDRAWCMD by default	Willy Tarreau	2	-11/+48
	Minh Yuan reported a concurrency use-after-free issue in the floppy code between raw_cmd_ioctl and seek_interrupt. [ It turns out this has been around, and that others have reported the KASAN splats over the years, but Minh Yuan had a reproducer for it and so gets primary credit for reporting it for this fix - Linus ] The problem is, this driver tends to break very easily and nowadays, nobody is expected to use FDRAWCMD anyway since it was used to manipulate non-standard formats. The risk of breaking the driver is higher than the risk presented by this race, and accessing the device requires privileges anyway. Let's just add a config option to completely disable this ioctl and leave it disabled by default. Distros shouldn't use it, and only those running on antique hardware might need to enable it. Link: https://lore.kernel.org/all/000000000000b71cdd05d703f6bf@google.com/ Link: https://lore.kernel.org/lkml/CAKcFiNC=MfYVW-Jt9A3=FPJpTwCD2PL_ULNCpsCVE5s8ZeBQgQ@mail.gmail.com Link: https://lore.kernel.org/all/CAEAjamu1FRhz6StCe_55XY5s389ZP_xmCF69k987En+1z53=eg@mail.gmail.com Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> Reported-by: syzbot+8e8958586909d62b6840@syzkaller.appspotmail.com Reported-by: cruise k <cruise4k@gmail.com> Reported-by: Kyungtae Kim <kt0755@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Tested-by: Denis Efremov <efremov@linux.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27	platform/x86/intel: pmc/core: change pmc_lpm_modes to static	Tom Rix	1	-1/+1
	Sparse reports this issue core.c: note: in included file: core.h:239:12: warning: symbol 'pmc_lpm_modes' was not declared. Should it be static? Global variables should not be defined in headers. This only works because core.h is only included by core.c. Single file use variables should be static, so change its storage-class specifier to static. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220423123048.591405-1-trix@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27	platform/x86/intel/sdsi: Fix bug in multi packet reads	David E. Box	1	-5/+3
	Fix bug that added an offset to the mailbox addr during multi-packet reads. Did not affect current ABI since it doesn't support multi-packet transactions. Fixes: 2546c6000430 ("platform/x86: Add Intel Software Defined Silicon driver") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220420155622.1763633-4-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27	platform/x86/intel/sdsi: Poll on ready bit for writes	David E. Box	1	-2/+2
	Due to change in firmware flow, update mailbox writes to poll on ready bit instead of run_busy bit. This change makes the polling method consistent for both writes and reads, which also uses the ready bit. Fixes: 2546c6000430 ("platform/x86: Add Intel Software Defined Silicon driver") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220420155622.1763633-3-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27	platform/x86/intel/sdsi: Handle leaky bucket	David E. Box	1	-7/+25
	To prevent an agent from indefinitely holding the mailbox firmware has implemented a leaky bucket algorithm. Repeated access to the mailbox may now incur a delay of up to 2.1 seconds. Add a retry loop that tries for up to 2.5 seconds to acquire the mailbox. Fixes: 2546c6000430 ("platform/x86: Add Intel Software Defined Silicon driver") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220420155622.1763633-2-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27	platform/x86: intel-uncore-freq: Prevent driver loading in guests	Srinivas Pandruvada	1	-0/+3
	Loading this driver in guests results in unchecked MSR access error for MSR 0x620. There is no use of reading and modifying package/die scope uncore MSRs in guests. So check for CPU feature X86_FEATURE_HYPERVISOR to prevent loading of this driver in guests. Fixes: dbce412a7733 ("platform/x86/intel-uncore-freq: Split common and enumeration part") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215870 Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20220427100304.2562990-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27	platform/x86: gigabyte-wmi: added support for B660 GAMING X DDR4 motherboard	Darryn Anton Jordan	1	-0/+1
	This works on my system. Signed-off-by: Darryn Anton Jordan <darrynjordan@icloud.com> Acked-by: Thomas Weißschuh <thomas@weissschuh.net> Link: https://lore.kernel.org/r/Ylguq87YG+9L3foV@hark Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27	platform/x86: dell-laptop: Add quirk entry for Latitude 7520	Gabriele Mazzotta	1	-0/+13
	The Latitude 7520 supports AC timeouts, but it has no KBD_LED_AC_TOKEN and so changes to stop_timeout appear to have no effect if the laptop is plugged in. Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Acked-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20220426120827.12363-1-gabriele.mzt@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>