linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2020-09-08	net: hns3: skip periodic service task if reset failed	Guangbin Huang	2	-0/+6
	When reset fails, if there are some pending jobs for the periodic service task, it does not do anything except print error each time the task is scheduled. So skip the periodic service task if reset failed. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08	net: hns3: narrow two local variable range in hclgevf_reset_prepare_wait()	Huazhong Tan	1	-6/+6
	Since variable send_msg and ret only used in if branch, so move their definition into the if branch. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08	net: sched: skip an unnecessay check	Tom Rix	1	-2/+3
	Reviewing the error handling in tcf_action_init_1() most of the early handling uses err_out: if (cookie) { kfree(cookie->data); kfree(cookie); } before cookie could ever be set. So skip the unnecessay check. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08	rxrpc: Allow multiple client connections to the same peer	David Howells	1	-0/+6
	Allow the number of parallel connections to a machine to be expanded from a single connection to a maximum of four. This allows up to 16 calls to be in progress at the same time to any particular peer instead of 4. Signed-off-by: David Howells <dhowells@redhat.com>
2020-09-08	rxrpc: Rewrite the client connection manager	David Howells	12	-691/+559
	Rewrite the rxrpc client connection manager so that it can support multiple connections for a given security key to a peer. The following changes are made: (1) For each open socket, the code currently maintains an rbtree with the connections placed into it, keyed by communications parameters. This is tricky to maintain as connections can be culled from the tree or replaced within it. Connections can require replacement for a number of reasons, e.g. their IDs span too great a range for the IDR data type to represent efficiently, the call ID numbers on that conn would overflow or the conn got aborted. This is changed so that there's now a connection bundle object placed in the tree, keyed on the same parameters. The bundle, however, does not need to be replaced. (2) An rxrpc_bundle object can now manage the available channels for a set of parallel connections. The lock that manages this is moved there from the rxrpc_connection struct (channel_lock). (3) There'a a dummy bundle for all incoming connections to share so that they have a channel_lock too. It might be better to give each incoming connection its own bundle. This bundle is not needed to manage which channels incoming calls are made on because that's the solely at whim of the client. (4) The restrictions on how many client connections are around are removed. Instead, a previous patch limits the number of client calls that can be allocated. Ordinarily, client connections are reaped after 2 minutes on the idle queue, but when more than a certain number of connections are in existence, the reaper starts reaping them after 2s of idleness instead to get the numbers back down. It could also be made such that new call allocations are forced to wait until the number of outstanding connections subsides. Signed-off-by: David Howells <dhowells@redhat.com>
2020-09-08	rxrpc: Impose a maximum number of client calls	David Howells	3	-3/+49
	Impose a maximum on the number of client rxrpc calls that are allowed simultaneously. This will be in lieu of a maximum number of client connections as this is easier to administed as, unlike connections, calls aren't reusable (to be changed in a subsequent patch).. This doesn't affect the limits on service calls and connections. Signed-off-by: David Howells <dhowells@redhat.com>
2020-09-08	netfilter: nf_tables: add userdata support for nft_object	Jose M. Guisado Gomez	3	-8/+31
	Enables storing userdata for nft_object. Initially this will store an optional comment but can be extended in the future as needed. Adds new attribute NFTA_OBJ_USERDATA to nft_object. Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-09-08	selftests/net: replace obsolete NFT_CHAIN configuration	Fabian Frederick	1	-2/+1
	Replace old parameters with global NFT_NAT from commit db8ab38880e0 ("netfilter: nf_tables: merge ipv4 and ipv6 nat chain types") Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-09-08	netfilter: ebt_stp: Remove unused macro BPDU_TYPE_TCN	Wang Hai	1	-1/+0
	BPDU_TYPE_TCN is never used after it was introduced. So better to remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-09-07	net: dsa: don't print non-fatal MTU error if not supported	Vladimir Oltean	1	-1/+1
	Commit 72579e14a1d3 ("net: dsa: don't fail to probe if we couldn't set the MTU") changed, for some reason, the "err && err != -EOPNOTSUPP" check into a simple "err". This causes the MTU warning to be printed even for drivers that don't have the MTU operations implemented. Fix that. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: dsa: change PHY error message again	Vladimir Oltean	1	-2/+3
	slave_dev->name is only populated at this stage if it was specified through a label in the device tree. However that is not mandatory. When it isn't, the error message looks like this: [ 5.037057] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d [ 5.044672] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d [ 5.052275] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d [ 5.059877] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d which is especially confusing since the error gets printed on behalf of the DSA master (fsl_enetc in this case). Printing an error message that contains a valid reference to the DSA port's name is difficult at this point in the initialization stage, so at least we should print some info that is more reliable, even if less user-friendly. That may be the driver name and the hardware port index. After this change, the error is printed as: [ 6.051587] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 0 [ 6.061192] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 1 [ 6.070765] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 2 [ 6.080324] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 3 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: tighten the definition of interface statistics	Jakub Kicinski	3	-17/+320
	This patch is born out of an investigation into which IEEE statistics correspond to which struct rtnl_link_stats64 members. Turns out that there seems to be reasonable consensus on the matter, among many drivers. To save others the time (and it took more time than I'm comfortable admitting) I'm adding comments referring to IEEE attributes to struct rtnl_link_stats64. Up until now we had two forms of documentation for stats - in Documentation/ABI/testing/sysfs-class-net-statistics and the comments on struct rtnl_link_stats64 itself. While the former is very cautious in defining the expected behavior, the latter feel quite dated and may not be easy to understand for modern day driver author (e.g. rx_over_errors). At the same time modern systems are far more complex and once obvious definitions lost their clarity. For example - does rx_packet count at the MAC layer (aFramesReceivedOK)? packets processed correctly by hardware? received by the driver? or maybe received by the stack? I tried to clarify the expectations, further clarifications from others are very welcome. The part hardest to untangle is rx_over_errors vs rx_fifo_errors vs rx_missed_errors. After much deliberation I concluded that for modern HW only two of the counters will make sense. The distinction between internal FIFO overflow and packets dropped due to back-pressure from the host is likely too implementation (driver and device) specific to expose in the standard stats. Now - which two of those counters we select to use is anyone's pick: sysfs documentation suggests rx_over_errors counts packets which did not fit into buffers due to MTU being too small, which I reused. There don't seem to be many modern drivers using it (well, CAN drivers seem to love this statistic). Of the remaining two I picked rx_missed_errors to report device drops. bnxt reports it and it's folded into "drop"s in procfs (while rx_fifo_errors is an error, and modern devices usually receive the frame OK, they just can't admit it into the pipeline). Of the drivers I looked at only AMD Lance-like and NS8390-like use all three of these counters. rx_missed_errors counts missed frames, rx_over_errors counts overflow events, and rx_fifo_errors counts frames which were truncated because they didn't fit into buffers. This suggests that rx_fifo_errors may be the correct stat for truncated packets, but I'd think a FIFO stat counting truncated packets would be very confusing to a modern reader. v2: - add driver developer notes about ethtool stat count and reset - replace Ethernet with IEEE 802.3 to better indicate source of attrs - mention byte counters don't count FCS - clarify RX counter is from device to host - drop "sightly" from sysfs paragraph - add examples of ethtool stats - s/incoming/received/ s/incoming/transmitted/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	rxrpc: Remove unused macro rxrpc_min_rtt_wlen	Wang Hai	1	-1/+0
	rxrpc_min_rtt_wlen is never used after it was introduced. So better to remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	sfc: simplify DMA mask setting	Edward Cree	1	-11/+1
	Christoph says[1] that dma_set_mask_and_coherent() is smart enough to truncate the mask itself if it's too long. So we can get rid of our "lop off one bit and retry" loop in efx_init_io(). [1]: https://www.spinics.net/lists/netdev/msg677266.html Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	sfc: remove EFX_DRIVER_VERSION	Edward Cree	3	-5/+1
	Per-module versions for in-tree drivers are deprecated. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	sfc: handle limited FEC support	Edward Cree	2	-14/+36
	If the reported PHY capabilities do not include a given FEC mode, don't attempt to select that FEC mode anyway. If the user tries to set a mode through ethtool that is not supported, return an error. The _REQUESTED bits don't appear in the supported caps, but are implied by the corresponding FEC bits. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	sfc: add ethtool ops and miscellaneous ndos to EF100	Edward Cree	3	-1/+51
	Mostly just calls to existing common functions. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	sfc: remove phy_op indirection	Edward Cree	9	-712/+601
	Originally there were several implementations of PHY operations for the several different PHYs used on Falcon boards. But Falcon is now in a separate driver, and all sfc NICs since then have had MCDI-managed PHYs. Thus, there is no need to indirect through function pointers in efx->phy_op; we can simply call the efx_mcdi_phy_* functions directly. This also hooks up these functions for EF100, which was previously using the dummy_phy_ops. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	sfc: don't double-down() filters in ef100_reset()	Edward Cree	1	-12/+0
	dev_close(), by way of ef100_net_stop(), already brings down the filter table, so there's no need to do it again (which just causes lots of WARN_ONs). Similarly, don't bring it up ourselves, as dev_open() -> ef100_net_open() will do it, and will fail if it's already been brought up. Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: ethernet: dnet: Remove set but unused variable 'len'	Wang Hai	1	-4/+1
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/dnet.c: In function dnet_start_xmit drivers/net/ethernet/dnet.c:511:15: warning: variable ‘len’ set but not used [-Wunused-but-set-variable] commit 4796417417a6 ("dnet: Dave DNET ethernet controller driver (updated)") involved this unused variable, remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: ethernet: dwmac: remove redundant null check before clk_disable_unprepare()	Zhang Changzhong	1	-2/+1
	Because clk_prepare_enable() and clk_disable_unprepare() already checked NULL clock parameter, so the additional checks are unnecessary, just remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: ethernet: fec: remove redundant null check before clk_disable_unprepare()	Zhang Changzhong	1	-2/+1
	Because clk_prepare_enable() and clk_disable_unprepare() already checked NULL clock parameter, so the additional checks are unnecessary, just remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: stmmac: remove redundant null check before clk_disable_unprepare()	Zhang Changzhong	1	-4/+2
	Because clk_prepare_enable() and clk_disable_unprepare() already checked NULL clock parameter, so the additional checks are unnecessary, just remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: xilinx: remove redundant null check before clk_disable_unprepare()	Zhang Changzhong	1	-2/+1
	Because clk_prepare_enable() and clk_disable_unprepare() already checked NULL clock parameter, so the additional checks are unnecessary, just remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: destroy all entries via gc	Nikolay Aleksandrov	2	-42/+89
	Since each entry type has timers that can be running simultaneously we need to make sure that entries are not freed before their timers have finished. In order to do that generalize the src gc work to mcast gc work and use a callback to free the entries (mdb, port group or src). v3: add IPv6 support v2: force mcast gc on port del to make sure all port group timers have finished before freeing the bridge port Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: improve IGMPv3/MLDv2 query processing	Nikolay Aleksandrov	1	-3/+12
	When an IGMPv3/MLDv2 query is received and we're operating in such mode then we need to avoid updating group timers if the suppress flag is set. Also we should update only timers for groups in exclude mode. v3: add IPv6/MLDv2 support Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: support for IGMPV3/MLDv2 BLOCK_OLD_SOURCES report	Nikolay Aleksandrov	1	-0/+97
	We already have all necessary helpers, so process IGMPV3/MLDv2 BLOCK_OLD_SOURCES as per the RFCs. v3: add IPv6/MLDv2 support v2: directly do flag bit operations Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: support for IGMPV3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report	Nikolay Aleksandrov	1	-0/+306
	In order to process IGMPV3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report types we need new helpers which allow us to mark entries based on their timer state and to query only marked entries. v3: add IPv6/MLDv2 support, fix other_query checks v2: directly do flag bit operations Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: support for IGMPV3/MLDv2 MODE_IS_INCLUDE/EXCLUDE report	Nikolay Aleksandrov	1	-0/+126
	In order to process IGMPV3/MLDv2_MODE_IS_INCLUDE/EXCLUDE report types we need some new helpers which allow us to set/clear flags for all current entries and later delete marked entries after the report sources have been processed. v3: add IPv6/MLDv2 support v2: drop flag helpers and directly do flag bit operations Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: support for IGMPv3/MLDv2 ALLOW_NEW_SOURCES report	Nikolay Aleksandrov	2	-22/+137
	This patch adds handling for the ALLOW_NEW_SOURCES IGMPv3/MLDv2 report types and limits them only when multicast_igmp_version == 3 or multicast_mld_version == 2 respectively. Now that IGMPv3/MLDv2 handling functions will be managing timers we need to delay their activation, thus a new argument is added which controls if the timer should be updated. We also disable host IGMPv3/MLDv2 handling as it's not yet implemented and could cause inconsistent group state, the host can only join a group as EXCLUDE {} or leave it. v4: rename update_timer to igmpv2_mldv1 and use the passed value from br_multicast_add_group's callers v3: Add IPv6/MLDv2 support Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: delete expired port groups without srcs	Nikolay Aleksandrov	1	-1/+20
	If an expired port group is in EXCLUDE mode, then we have to turn it into INCLUDE mode, remove all srcs with zero timer and finally remove the group itself if there are no more srcs with an active timer. For IGMPv2 use there would be no sources, so this will reduce to just removing the group as before. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mdb: use mdb and port entries in notifications	Nikolay Aleksandrov	3	-68/+92
	We have to use mdb and port entries when sending mdb notifications in order to fill in all group attributes properly. Before this change we would've used a fake br_mdb_entry struct to fill in only partial information about the mdb. Now we can also reuse the mdb dump fill function and thus have only a single central place which fills the mdb attributes. v3: add IPv6 support Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mdb: push notifications in __br_mdb_add/del	Nikolay Aleksandrov	1	-12/+8
	This change is in preparation for using the mdb port group entries when sending a notification, so their full state and additional attributes can be filled in. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: add support for group query retransmit	Nikolay Aleksandrov	2	-10/+71
	We need to be able to retransmit group-specific and group-and-source specific queries. The new timer takes care of those. v3: add IPv6 support Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: add support for group-and-source specific queries	Nikolay Aleksandrov	2	-54/+183
	Allows br_multicast_alloc_query to build queries with the port group's source lists and sends a query for sources over and under lmqt when necessary as per RFCs 3376 and 3810 with the suppress flag set appropriately. v3: add IPv6 support Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: add support for src list and filter mode dumping	Nikolay Aleksandrov	2	-2/+104
	Support per port group src list (address and timer) and filter mode dumping. Protected by either multicast_lock or rcu. v3: add IPv6 support v2: require RCU or multicast_lock to traverse src groups Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: add support for group source list	Nikolay Aleksandrov	3	-14/+179
	Initial functions for group source lists which are needed for IGMPv3 and MLDv2 include/exclude lists. Both IPv4 and IPv6 sources are supported. User-added mdb entries are created with exclude filter mode, we can extend that later to allow user-supplied mode. When group src entries are deleted, they're freed from a workqueue to make sure their timers are not still running. Source entries are protected by the multicast_lock and rcu. The number of src groups per port group is limited to 32. v4: use the new port group del function directly add igmpv2/mldv1 bool to denote if the entry was added in those modes, it will later replace the old update_timer bool v3: add IPv6 support v2: allow src groups to be traversed under rcu Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mcast: factor out port group del	Nikolay Aleksandrov	3	-35/+28
	In order to avoid future errors and reduce code duplication we should factor out the port group del sequence. This allows us to have one function which takes care of all details when removing a port group. v4: set pg's fast leave flag when deleting due to fast leave move the patch before adding source lists Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: bridge: mdb: arrange internal structs so fast-path fields are close	Nikolay Aleksandrov	1	-5/+9
	Before this patch we'd need 2 cache lines for fast-path, now all used fields are in the first cache line. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	net: dsa: rtl8366rb: Switch to phylink	Linus Walleij	1	-7/+37
	This switches the RTL8366RB over to using phylink callbacks instead of .adjust_link(). This is a pretty template switchover. All we adjust is the CPU port so that is why the code only inspects this port. We enhance by adding proper error messages, also disabling the CPU port on the way down and moving dev_info() to dev_dbg(). Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07	tipc: fix a deadlock when flushing scheduled work	Hoang Huu Le	4	-19/+19
	In the commit fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode"), we're trying to make sure the tipc_net_finalize_work work item finished if it enqueued. But calling flush_scheduled_work() is not just affecting above work item but either any scheduled work. This has turned out to be overkill and caused to deadlock as syzbot reported: ====================================================== WARNING: possible circular locking dependency detected 5.9.0-rc2-next-20200828-syzkaller #0 Not tainted ------------------------------------------------------ kworker/u4:6/349 is trying to acquire lock: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777 but task is already holding lock: ffffffff8a879430 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x9b/0xb10 net/core/net_namespace.c:565 [...] Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pernet_ops_rwsem); lock(&sb->s_type->i_mutex_key#13); lock(pernet_ops_rwsem); lock((wq_completion)events); * DEADLOCK * [...] v1: To fix the original issue, we replace above calling by introducing a bit flag. When a namespace cleaned-up, bit flag is set to zero and: - tipc_net_finalize functionial just does return immediately. - tipc_net_finalize_work does not enqueue into the scheduled work queue. v2: Use cancel_work_sync() helper to make sure ONLY the tipc_net_finalize_work() stopped before releasing bcbase object. Reported-by: syzbot+d5aa7e0385f6a5d0f4fd@syzkaller.appspotmail.com Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06	enic: switch from 'pci_' to 'dma_' API	Christophe JAILLET	2	-70/+72
	The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested. When memory is allocated in 'vnic_dev_classifier()', 'vnic_dev_fw_info()', 'vnic_dev_notify_set()' and 'vnic_dev_stats_dump()' (vnic_dev.c) GFP_ATOMIC must be used because its callers take a spinlock before calling these functions. When memory is allocated in '__enic_set_rsskey()' and 'enic_set_rsscpu()' GFP_ATOMIC must be used because they can be called with a spinlock. The call chain is: enic_reset <-- takes 'enic->enic_api_lock' --> enic_set_rss_nic_cfg --> enic_set_rsskey --> __enic_set_rsskey <-- uses dma_alloc_coherent --> enic_set_rsscpu <-- uses dma_alloc_coherent When memory is allocated in 'vnic_dev_init_prov2()' GFP_ATOMIC must be used because a spinlock is hidden in the ENIC_DEVCMD_PROXY_BY_INDEX macro, when this function is called in 'enic_set_port_profile()'. When memory is allocated in 'vnic_dev_alloc_desc_ring()' GFP_KERNEL can be used because it is only called from 5 functions ('vnic_dev_init_devcmd2()', 'vnic_cq_alloc()', 'vnic_rq_alloc()', 'vnic_wq_alloc()' and 'enic_wq_devcmd2_alloc()'. 'vnic_dev_init_devcmd2()': already uses GFP_KERNEL and no lock is taken in the between. 'enic_wq_devcmd2_alloc()': is called from ' vnic_dev_init_devcmd2()' which already uses GFP_KERNEL and no lock is taken in the between. 'vnic_cq_alloc()', 'vnic_rq_alloc()', 'vnic_wq_alloc()': are called from 'enic_alloc_vnic_resources()' 'enic_alloc_vnic_resources()' has only 2 call chains: 1) enic_probe --> enic_dev_init --> enic_alloc_vnic_resources 'enic_probe()' is a probe function and no lock is taken in the between 2) enic_set_ringparam --> enic_alloc_vnic_resources 'enic_set_ringparam()' is a .set_ringparam function (see struct ethtool_ops). It seems to only take a mutex and no spinlock. So all paths are safe to use GFP_KERNEL. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06	net: gemini: Clean up phy registration	Linus Walleij	1	-21/+11
	It's nice if the phy is online before we register the netdev so try to do that first. Stop trying to do "second tried" to register the phy, it works perfectly fine the first time. Stop remvoving the phy in uninit. Remove it when the driver is remove():d, symmetric to where it is added, in probe(). Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06	net: Add a missing word	Jonathan Neuschäfer	1	-1/+1
	Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06	net: dsa: rtl8366rb: Support setting MTU	Linus Walleij	1	-1/+37
	This implements the missing MTU setting for the RTL8366RB switch. Apart from supporting jumboframes, this rids us of annoying boot messages like this: realtek-smi switch: nonfatal error -95 setting MTU on port 0 Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06	net/packet: Remove unused macro BLOCK_PRIV	Wang Hai	1	-1/+0
	BLOCK_PRIV is never used after it was introduced. So better to remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05	NFC: digital: Remove two unused macroes	Wang Hai	1	-3/+0
	DIGITAL_NFC_DEP_REQ_RES_TAILROOM is never used after it was introduced. DIGITAL_NFC_DEP_REQ_RES_HEADROOM is no more used after below commit e8e7f4217564 ("NFC: digital: Remove useless call to skb_reserve()") Remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05	caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE	Wang Hai	1	-1/+0
	Remove SRVL_CTRL_PKT_SIZE which is defined more than once. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05	MAINTAINERS: repair reference in LYNX PCS MODULE	Lukas Bulwahn	1	-1/+1
	Commit 0da4c3d393e4 ("net: phy: add Lynx PCS module") added the files in ./drivers/net/pcs/, but the new LYNX PCS MODULE section refers to ./drivers/net/phy/. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains: warning: no file matches F: drivers/net/phy/pcs-lynx.c Repair the LYNX PCS MODULE section by referring to the right location. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05	net: dsa: bcm_sf2: Ensure that MDIO diversion is used	Florian Fainelli	1	-2/+29
	Registering our slave MDIO bus outside of the OF infrastructure is necessary in order to avoid creating double references of the same Device Tree nodes, however it is not sufficient to guarantee that the MDIO bus diversion is used because of_phy_connect() will still resolve to a valid PHY phandle and it will connect to the PHY using its parent MDIO bus which is still the SF2 master MDIO bus. The reason for that is because BCM7445 systems were already shipped with a Device Tree blob looking like this (irrelevant parts omitted for simplicity): ports { #address-cells = <1>; #size-cells = <0>; port@1 { phy-mode = "rgmii-txid"; phy-handle = <&phy0>; reg = <1>; label = "rgmii_1"; }; ... mdio@403c0 { ... phy0: ethernet-phy@0 { broken-turn-around; device_type = "ethernet-phy"; max-speed = <0x3e8>; reg = <0>; compatible = "brcm,bcm53125", "ethernet-phy-ieee802.3-c22"; }; }; There is a hardware issue with chip revisions (Dx) that lead to the development of the following commits: 461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus") 536fab5bf582 ("net: dsa: bcm_sf2: Do not register slave MDIO bus with OF") b8c6cd1d316f ("net: dsa: bcm_sf2: do not use indirect reads and writes for 7445E0") There should have been an internal MDIO bus node created for the chip revision (Dx) that suffers from this problem, but it did not happen back then. Had that happen, that we should have correctly parented phy@0 (bcm53125 below) as child node of the internal MDIO bus, but the production Device Tree blob that was shipped with the firmware targeted the fixed version of the chip, despite both the affected and corrected chips being shipped into production. The problem is that of_phy_connect() for port@1 will happily resolve the 'phy-handle' from the mdio@403c0 node, which bypasses the diversion completely. This results in this double programming that the diversion refers to and aims to avoid. In order to force of_phy_connect() to fail, and have DSA call to dsa_slave_phy_connect(), we must deactivate ethernet-phy@0 from mdio@403c0, and the best way to do that is by removing the phandle property completely. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>