aboutsummaryrefslogtreecommitdiffstatshomepage
AgeCommit message (Collapse)AuthorFilesLines
2020-03-23r8169: add new helper rtl8168g_enable_gphy_10mHeiner Kallweit1-8/+10
Factor out setting GPHY 10M to new helper rtl8168g_enable_gphy_10m. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller18-304/+826
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-03-21 Implement basic support for the devlink interface in the ice driver. Additionally pave some necessary changes for adding a devlink region that exposes the NVM contents. This series first contains 5 patches for enabling and implementing full NVM read access via the ETHTOOL_GEEPROM interface. This includes some cleanup of endian-types, a new function for reading from the NVM and Shadow RAM as a flat addressable space, a function to calculate the available flash size during load, and a change to how some of the NVM version fields are stored in the ice_nvm_info structure. Following this is 3 patches for implementing devlink support. First, one patch which implements the basic framework and introduces the ice_devlink.c file. Second, a patch to implement basic .info_get support. Finally, a patch which reads the device PBA identifier and reports it as the `board.id` value in the .info_get response. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23Merge branch 'octeontx2-vf-Add-network-driver-for-virtual-function'David S. Miller10-112/+1693
Sunil Goutham says: ==================== octeontx2-vf: Add network driver for virtual function This patch series adds network driver for the virtual functions of OcteonTX2 SOC's resource virtualization unit (RVU). Changes from v3: * Removed missed out EXPORT symbols in VF driver. Changes from v2: * Removed Copyright license text. * Removed wrapper fn()s around mutex_lock and unlock. * Got rid of using macro with 'return'. * Removed __weak fn()s. - Sugested by Leon Romanovsky and Andrew Lunn Changes from v1: * Removed driver version and fixed authorship * Removed driver version and fixed authorship in the already upstreamed AF, PF drivers. * Removed unnecessary checks in sriov_enable and xmit fn()s. * Removed WQ_MEM_RECLAIM flag while creating workqueue. * Added lock in tx_timeout task. * Added 'supported_coalesce_params' in ethtool ops. * Minor other cleanups. - Sugested by Jakub Kicinski ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-pf: Remove wrapper APIs for mutex lock and unlockSunil Goutham5-89/+75
This patch removes wrapper fn()s around mutex_init/lock/unlock. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-af: Remove driver version and fix authorshipSunil Goutham2-6/+2
Removed MODULE_VERSION and fixed MODULE_AUTHOR. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-pf: Cleanup all receive buffers in SG descriptorGeetha sowjanya2-9/+34
With MTU sized receive buffers it is not expected to have CQE_RX with multiple receive buffer pointers. But since same physcial link is shared by PF and it's VFs, the max receive packet configured at link could be morethan MTU. Hence there is a chance of receiving plts morethan MTU which then gets DMA'ed into multiple buffers and notified in a single CQE_RX. This patch treats such pkts as errors and frees up receive buffers pointers back to hardware. Also on the transmit side this patch sets SMQ MAXLEN to max value to avoid HW length errors for the packets whose size > MTU, eg due to path MTU. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-vf: Link event notification supportTomasz Duszynski2-2/+93
VF shares physical link with PF. Admin function (AF) sends notification to PF whenever a link change event happens. PF has to forward the same notification to each of the enabled VF. PF traps START/STOP_RX messages sent by VF to AF to keep track of VF's enabled/disabled state. Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-vf: Ethtool supportTomasz Duszynski3-14/+117
Added ethtool support for VF devices for - Driver stats, Tx/Rx perqueue stats - Set/show Rx/Tx queue count - Set/show Rx/Tx ring sizes - Set/show IRQ coalescing parameters - RSS configuration etc It's the PF which owns the interface, hence VF cannot display underlying CGX interface stats. Except for this rest ethtool support reuses PF's APIs. Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-vf: Virtual function driver supportTomasz Duszynski8-1/+699
On OcteonTx2 silicon there two two types VFs, VFs that share the physical link with their parent SR-IOV PF and the VFs which work in pairs using internal HW loopback channels (LBK). Except for the underlying Rx/Tx channel mapping from netdev functionality perspective they are almost identical. This patch adds netdev driver support for these VFs. Unlike it's parent PF a VF cannot directly communicate with admin function (AF) and it has to go through PF for the same. The mailbox communication with AF works like 'VF <=> PF <=> AF'. Also functionality wise VF and PF are identical, hence to avoid code duplication PF driver's APIs are resued here for HW initialization, packet handling etc etc ie almost everything. For VF driver to compile as module exported few of the existing PF driver APIs. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-pf: Handle VF function level resetGeetha sowjanya2-3/+239
When FLR is initiated for a VF (PCI function level reset), the parent PF gets a interrupt. PF then sends a message to admin function (AF), which then cleanups all resources attached to that VF. Also handled IRQs triggered when master enable bit is cleared or set for VFs. This handler just clears the transaction pending ie TRPEND bit. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23octeontx2-pf: Enable SRIOV and added VF mbox handlingSunil Goutham2-0/+446
Added 'sriov_configure' to enable/disable virtual functions (VFs). Also added handling of mailbox messages from these VFs. Admin function (AF) is the only one with all priviliges to configure HW, alloc resources etc etc, PFs and it's VFs have to request AF via mbox for all their needs. But unlike PFs, their VFs cannot send a mbox request directly. A VF shares a mailbox region with it's parent PF, so VF sends a mailbox msg to PF and then PF forwards it to AF. Then AF after processing sends response to PF which it again forwards to VF. This patch adds support for this 'VF <=> PF <=> AF' mailbox communication. Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23Merge branch 'phy_check_downshift'David S. Miller5-49/+45
Heiner Kallweit says: ==================== net: phy: add and use phy_check_downshift So far PHY drivers have to check whether a downshift occurred to be able to notify the user. To make life of drivers authors a little bit easier move the downshift notification to phylib. phy_check_downshift() compares the highest mutually advertised speed with the actual value of phydev->speed (typically read by the PHY driver from a vendor-specific register) to detect a downshift. v2: Add downshift hint to phy_print_status(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: aquantia: remove downshift warning now that phylib takes careHeiner Kallweit1-24/+1
Now that phylib notifies the user of a downshift we can remove this functionality from the driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: marvell: remove downshift warning now that phylib takes careHeiner Kallweit1-24/+0
Now that phylib notifies the user of a downshift we can remove this functionality from the driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: add and use phy_check_downshiftHeiner Kallweit3-1/+44
So far PHY drivers have to check whether a downshift occurred to be able to notify the user. To make life of drivers authors a little bit easier move the downshift notification to phylib. phy_check_downshift() compares the highest mutually advertised speed with the actual value of phydev->speed (typically read by the PHY driver from a vendor-specific register) to detect a downshift. v2: - Add downshift hint to phy_print_status Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23Merge branch 'net-phy-xpcs-Improvements-for-next'David S. Miller1-4/+10
Jose Abreu says: ==================== net: phy: xpcs: Improvements for -next Misc set of improvements for XPCS. All for net-next. Patch 1/4, returns link error upon 10GKR faults are detected. Patch 2/4, resets XPCS upon probe so that we start from well known state. Patch 3/4, sets Link as down if AutoNeg is enabled but did not finish with success. Patch 4/4, restarts AutoNeg process if previous outcome was not valid. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: xpcs: Restart AutoNeg if outcome was invalidJose Abreu1-1/+3
Restart AutoNeg if we didn't get a valid result from previous run. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: xpcs: Set Link down if AutoNeg is enabled and did not finishJose Abreu1-1/+3
Set XPCS Link as down when AutoNeg is enabled but it didn't finish with success. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: xpcs: Reset XPCS upon probeJose Abreu1-1/+1
Reset the XPCS upon probe stage so that we start it from well known state. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: xpcs: Return error when 10GKR link errors are foundJose Abreu1-1/+3
For 10GKR rate, when link errors are found we need to return fault status so that XPCS is correctly resumed. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23mlxsw: spectrum_cnt: Fix 64-bit division in mlxsw_sp_counter_resources_registerNathan Chancellor1-1/+1
When building arm32 allyesconfig: ld.lld: error: undefined symbol: __aeabi_uldivmod >>> referenced by spectrum_cnt.c >>> net/ethernet/mellanox/mlxsw/spectrum_cnt.o:(mlxsw_sp_counter_resources_register) in archive drivers/built-in.a >>> did you mean: __aeabi_uidivmod >>> defined in: arch/arm/lib/lib.a(lib1funcs.o) pool_size and bank_size are u64; use div64_u64 so that 32-bit platforms do not error. Fixes: ab8c4cc60420 ("mlxsw: spectrum_cnt: Move config validation along with resource register") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: sched: rename more stats_typesJakub Kicinski7-58/+55
Commit 53eca1f3479f ("net: rename flow_action_hw_stats_types* -> flow_action_hw_stats*") renamed just the flow action types and helpers. For consistency rename variables, enums, struct members and UAPI too (note that this UAPI was not in any official release, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: mptcp: don't hang in mptcp_sendmsg() after TCP fallbackDavide Caratti2-4/+6
it's still possible for packetdrill to hang in mptcp_sendmsg(), when the MPTCP socket falls back to regular TCP (e.g. after receiving unsupported flags/version during the three-way handshake). Adjust MPTCP socket state earlier, to ensure correct functionality of mptcp_sendmsg() even in case of TCP fallback. Fixes: 767d3ded5fb8 ("net: mptcp: don't hang before sending 'MP capable with data'") Fixes: 1954b86016cf ("mptcp: Check connection state before attempting send") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23Merge branch 'MSCC-PHY-RGMII-delays-and-VSC8502-support'David S. Miller2-12/+52
Vladimir Oltean says: ==================== MSCC PHY: RGMII delays and VSC8502 support This series makes RGMII delays configurable as they should be on Vitesse/Microsemi/Microchip RGMII PHYs, and adds support for a new RGMII PHY. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: mscc: add support for VSC8502Vladimir Oltean2-0/+25
This is a dual copper PHY with support for MII/GMII/RGMII on MAC side, as well as a bunch of other features such as SyncE and Ring Resiliency. I haven't tested interrupts and WoL, but I am confident that they work since support is already present in the driver and the register map is no different for this PHY. PHY statistics work, PHY tunables appear to work, suspend/resume works. Signed-off-by: Wes Li <wes.li@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: mscc: configure both RX and TX internal delays for RGMIIVladimir Oltean2-3/+15
The driver appears to be secretly enabling the RX clock skew irrespective of PHY interface type, which is generally considered a big no-no. Make them configurable instead, and add TX internal delays when necessary too. While at it, configure a more canonical clock skew of 2.0 nanoseconds than the current default of 1.1 ns. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: mscc: accept all RGMII species in vsc85xx_mac_if_setVladimir Oltean1-0/+3
The helper for configuring the pinout of the MII side of the PHY should do so irrespective of whether RGMII delays are used or not. So accept the ID, TXID and RXID variants as well, not just the no-delay RGMII variant. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: phy: mscc: rename enum rgmii_rx_clock_delay to rgmii_clock_delayVladimir Oltean2-10/+10
There is nothing RX-specific about these clock skew values. So remove "RX" from the name in preparation for the next patch where TX delays are also going to be configured. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23enetc: Remove unused variable 'enetc_drv_name'YueHaibing2-2/+0
commit ed0a72e0de16 ("net/freescale: Clean drivers from static versions") leave behind this, remove it . Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23Crypto/chtls: add/delete TLS header in driverRohit Maheshwari1-14/+59
Kernel TLS forms TLS header in kernel during encryption and removes while decryption before giving packet back to user application. The similar logic is introduced in chtls code as well. v1->v2: - tls_proccess_cmsg() uses tls_handle_open_record() which is not required in TOE-TLS. Don't mix TOE with other TLS types. Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: Make skb_segment not to compute checksum if network controller supports checksummingYadu Kishore1-8/+15
Problem: TCP checksum in the output path is not being offloaded during GSO in the following case: The network driver does not support scatter-gather but supports checksum offload with NETIF_F_HW_CSUM. Cause: skb_segment calls skb_copy_and_csum_bits if the network driver does not announce NETIF_F_SG. It does not check if the driver supports NETIF_F_HW_CSUM. So for devices which might want to offload checksum but do not support SG there is currently no way to do so if GSO is enabled. Solution: In skb_segment check if the network controller does checksum and if so call skb_copy_bits instead of skb_copy_and_csum_bits. Testing: Without the patch, ran iperf TCP traffic with NETIF_F_HW_CSUM enabled in the network driver. Observed the TCP checksum offload is not happening since the skbs received by the driver in the output path have skb->ip_summed set to CHECKSUM_NONE. With the patch ran iperf TCP traffic and observed that TCP checksum is being offloaded with skb->ip_summed set to CHECKSUM_PARTIAL. Also tested with the patch by disabling NETIF_F_HW_CSUM in the driver to cover the newly introduced if-else code path in skb_segment. Link: https://lore.kernel.org/netdev/CA+FuTSeYGYr3Umij+Mezk9CUcaxYwqEe5sPSuXF8jPE2yMFJAw@mail.gmail.com Signed-off-by: Yadu Kishore <kyk.segfault@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21Merge branch 'net-hns3-add-three-optimizations-for-mailbox-handling'David S. Miller5-386/+420
Huazhong Tan says: ==================== net: hns3: add three optimizations for mailbox handling This patchset includes three code optimizations for mailbox handling. [patch 1] adds a response code conversion. [patch 2] refactors some structure definitions about PF and VF mailbox. [patch 3] refactors the condition whether PF responds VF's mailbox. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21net: hns3: refactor mailbox response scheme between PF and VFHuazhong Tan3-155/+113
Currently, PF responds to VF depending on what mailbox it is handling, it is a bit inflexible. The correct way is, PF should check the mbx_need_resp field to decide whether gives response to VF. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21net: hns3: refactor the mailbox message between PF and VFYufeng Mo5-234/+291
For making the code more readable, this adds several new structure to replace the msg field in structure hclge_mbx_vf_to_pf_cmd and hclge_mbx_pf_to_vf_cmd. Also uses macro to instead of some magic number. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21net: hns3: add a conversion for mailbox's response codeJian Shen2-2/+21
Currently, when mailbox handling fails, the PF driver just responds 1 to the VF driver. It is not sufficient for the VF driver to find out why its mailbox fails. So the error should be responded to VF, but the error is type int and the response field in struct hclge_mbx_pf_to_vf_cmd is type u16, a conversion is needed. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21mptcp: Remove set but not used variable 'can_ack'YueHaibing1-2/+0
Fixes gcc '-Wunused-but-set-variable' warning: net/mptcp/options.c: In function 'mptcp_established_options_dss': net/mptcp/options.c:338:7: warning: variable 'can_ack' set but not used [-Wunused-but-set-variable] commit dc093db5cc05 ("mptcp: drop unneeded checks") leave behind this unused, remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21Merge branch 'selftests-expand-txtimestamp-with-new-features'David S. Miller2-23/+187
Jian Yang says: ==================== selftests: expand txtimestamp with new features Current txtimestamp selftest issues requests with no delay, or fixed 50 usec delay. Nsec granularity is useful to measure fine-grained latency. A configurable delay is useful to simulate the case with cold cachelines. This patchset adds new flags and features to the txtimestamp selftest, including: - Printing in nsec (-N) - Polling interval (-b, -S) - Using epoll (-E, -e) - Printing statistics - Running individual tests in txtimestamp.sh ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21selftests: txtimestamp: print statistics for timestamp events.Jian Yang1-0/+58
Statistics on timestamps is useful to quantify average and tail latency. Print timestamp statistics in count/avg/min/max format. Signed-off-by: Jian Yang <jianyang@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21selftests: txtimestamp: add support for epoll().Jian Yang1-5/+48
Add the following new flags: -e: use level-triggered epoll() instead of poll(). -E: use event-triggered epoll() instead of poll(). Signed-off-by: Jian Yang <jianyang@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21selftests: txtimestamp: add new command-line flags.Jian Yang1-9/+15
A longer sleep duration between sendmsg()s makes more cachelines to be evicted and results in higher latency. Making the duration configurable. Add the following new flags: -S: Configurable sleep duration. -b: Busy loop instead of poll(). Remove the following flag: -D: No delay between packets: subsumed by -S. Signed-off-by: Jian Yang <jianyang@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21selftests: txtimestamp: allow printing latencies in nsec.Jian Yang1-12/+44
Txtimestamp reports latencies in uses resolution, while nsec is needed in cases such as measuring latencies on localhost. Add the following new flag: -N: print timestamps and durations in nsec (instead of usec) Signed-off-by: Jian Yang <jianyang@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21selftests: txtimestamp: allow individual txtimestamp tests.Jian Yang1-3/+28
The wrapper script txtimestamp.sh executes a pre-defined list of testcases sequentially without configuration options available. Add an option (-r/--run) to setup the test namespace and pass remaining arguments to txtimestamp binary. The script still runs all tests when no argument is passed. Signed-off-by: Jian Yang <jianyang@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21Merge branch 'net-tls-Annotate-lockless-access-to-sk_prot'David S. Miller2-14/+16
Jakub Sitnicki says: ==================== net/tls: Annotate lockless access to sk_prot We have recently noticed that there is a case of lockless read/write to sk->sk_prot [0]. sockmap code on psock tear-down writes to sk->sk_prot, while holding sk_callback_lock. Concurrently, tcp can access it. Usually to read out the sk_prot pointer and invoke one of the ops, sk->sk_prot->handler(). The lockless write (lockless in regard to concurrent reads) happens on the following paths: tcp_bpf_{recvmsg|sendmsg} / sock_map_unref sk_psock_put sk_psock_drop sk_psock_restore_proto WRITE_ONCE(sk->sk_prot, proto) To prevent load/store tearing [1], and to make tooling aware of intentional shared access [2], we need to annotate sites that access sk_prot with READ_ONCE/WRITE_ONCE. This series kicks off the effort to do it. Starting with net/tls. [0] https://lore.kernel.org/bpf/a6bf279e-a998-84ab-4371-cd6c1ccbca5d@gmail.com/ [1] https://lwn.net/Articles/793253/ [2] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCEJakub Sitnicki2-5/+6
sockmap performs lockless writes to sk->sk_prot on the following paths: tcp_bpf_{recvmsg|sendmsg} / sock_map_unref sk_psock_put sk_psock_drop sk_psock_restore_proto WRITE_ONCE(sk->sk_prot, proto) To prevent load/store tearing [1], and to make tooling aware of intentional shared access [2], we need to annotate other sites that access sk_prot with READ_ONCE/WRITE_ONCE macros. Change done with Coccinelle with following semantic patch: @@ expression E; identifier I; struct sock *sk; identifier sk_prot =~ "^sk_prot$"; @@ ( E = -sk->sk_prot +READ_ONCE(sk->sk_prot) | -sk->sk_prot = E +WRITE_ONCE(sk->sk_prot, E) | -sk->sk_prot +READ_ONCE(sk->sk_prot) ->I ) Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21net/tls: Read sk_prot once when building tls proto opsJakub Sitnicki1-8/+9
Apart from being a "tremendous" win when it comes to generated machine code (see bloat-o-meter output for x86-64 below) this mainly prepares ground for annotating access to sk_prot with READ_ONCE, so that we don't pepper the code with access annotations and needlessly repeat loads. add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-46 (-46) Function old new delta tls_init 851 805 -46 Total: Before=21063, After=21017, chg -0.22% Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21net/tls: Constify base proto ops used for building tls protoJakub Sitnicki1-2/+2
The helper that builds kTLS proto ops doesn't need to and should not modify the base proto ops. Annotate the parameter as read-only. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21Merge branch 'ionic-error-recovery-fixes'David S. Miller3-22/+50
Shannon Nelson says: ==================== ionic error recovery fixes These are a few little patches to make error recovery a little more safe and successful. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21ionic: check for NULL structs on teardownShannon Nelson2-13/+20
Make sure the queue structs exist before trying to tear them down to make for safer error recovery. Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21ionic: clean irq affinity on queue deinitShannon Nelson1-0/+2
Add a little more cleanup when tearing down the queues. Fixes: 1d062b7b6f64 ("ionic: Add basic adminq support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21ionic: ignore eexist on rx filter addShannon Nelson1-2/+2
Don't worry if the rx filter add firmware request fails on EEXIST, at least we know the filter is there. Same for the delete request, at least we know it isn't there. Fixes: 2a654540be10 ("ionic: Add Rx filter and rx_mode ndo support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>