aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-08-27sctp: allow users to set ep ecn flag by sockoptXin Long2-0/+74
SCTP_ECN_SUPPORTED sockopt will be added to allow users to change ep ecn flag, and it's similar with other feature flags. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27sctp: allow users to set netns ecn flag with sysctlXin Long1-0/+7
sysctl net.sctp.ecn_enable is added in this patch. It will allow users to change the default sctp ecn flag, net.sctp.ecn_enable. This feature was also required on this thread: http://lkml.iu.edu/hypermail/linux/kernel/0812.1/01858.html Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27sctp: make ecn flag per netns and endpointXin Long5-5/+21
This patch is to add ecn flag for both netns_sctp and sctp_endpoint, net->sctp.ecn_enable is set 1 by default, and ep->ecn_enable will be initialized with net->sctp.ecn_enable. asoc->peer.ecn_capable will be set during negotiation only when ep->ecn_enable is set on both sides. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: mediatek: remove set but not used variable 'status'Mao Wenan1-2/+0
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function mtk_handle_irq: drivers/net/ethernet/mediatek/mtk_eth_soc.c:1951:6: warning: variable status set but not used [-Wunused-but-set-variable] Fixes: 296c9120752b ("net: ethernet: mediatek: Add MT7628/88 SoC support") Signed-off-by: Mao Wenan <maowenan@huawei.com> Reviewed-by: Stefan Roese <sr@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge branch 'Simplify-DSA-handling-of-VLAN-subinterface-offload'David S. Miller2-9/+22
Vladimir Oltean says: ==================== Simplify DSA handling of VLAN subinterface offload Depends on Vivien Didelot's patchset: https://patchwork.ozlabs.org/project/netdev/list/?series=127197&state=* This patchset removes a few strange-looking guards for -EOPNOTSUPP in dsa_slave_vlan_rx_add_vid and dsa_slave_vlan_rx_kill_vid, making that code path no longer possible. It also disables the code path for the sja1105 driver, which does support editing the VLAN table, but not hardware-accelerated VLAN sub-interfaces, therefore the check in the DSA core would be wrong. There was no better DSA callback to do this than .port_enable, i.e. at ndo_open time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: sja1105: Clear VLAN filtering offload netdev featureVladimir Oltean1-0/+16
The switch barely supports traffic I/O, and it does that by repurposing VLANs when there is no bridge that is taking control of them. Letting DSA declare this netdev feature as supported (see dsa_slave_create) would mean that VLAN sub-interfaces created on sja1105 switch ports will be hardware offloaded. That means that net/8021q/vlan_core.c would install the VLAN into the filter tables of the switch, potentially interfering with the tag_8021q VLANs. We need to prevent that from happening and not let the 8021q core offload VLANs to the switch hardware tables. In vlan_filtering=0 modes of operation, the switch ports can pass through VLAN-tagged frames with no problem. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: Advertise the VLAN offload netdev ability only if switch supports itVladimir Oltean1-9/+6
When adding a VLAN sub-interface on a DSA slave port, the 8021q core checks NETIF_F_HW_VLAN_CTAG_FILTER and, if the netdev is capable of filtering, calls .ndo_vlan_rx_add_vid or .ndo_vlan_rx_kill_vid to configure the VLAN offloading. DSA sets this up counter-intuitively: it always advertises this netdev feature, but the underlying driver may not actually support VLAN table manipulation. In that case, the DSA core is forced to ignore the error, because not being able to offload the VLAN is still fine - and should result in the creation of a non-accelerated VLAN sub-interface. Change this so that the netdev feature is only advertised for switch drivers that support VLAN manipulation, instead of checking for -EOPNOTSUPP at runtime. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge branch 'net-ethernet-mediatek-convert-to-PHYLINK'David S. Miller8-292/+470
René van Dorst says: ==================== net: ethernet: mediatek: convert to PHYLINK These patches converts mediatek driver to PHYLINK API. v3->v4: * Phylink improvements and clean-ups after review v2->v3: * Phylink improvements and clean-ups after review v1->v2: * Rebase for mt76x8 changes * Phylink improvements and clean-ups after review * SGMII port doesn't support 2.5Gbit in SGMII mode only in BASE-X mode. Refactor the code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27dt-bindings: net: ethernet: Update mt7622 docs and dts to reflect the new phylink APIRené van Dorst3-12/+19
This patch the removes the recently added mediatek,physpeed property. Use the fixed-link property speed = <2500> to set the phy in 2.5Gbit. See mt7622-bananapi-bpi-r64.dts for a working example. Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: ethernet: mediatek: Re-add support SGMIIRené van Dorst4-115/+213
* Re-add SGMII support but now with PHYLINK API support So the SGMII changes are more clear * Move SGMII block setup from mtk_gmac_sgmii_path_setup() to mtk_mac_config() * Merge mtk_setup_hw_path() into mtk_mac_config() * Remove mediatek,physpeed property, fixed-link supports now any speed so speed = <2500>; is now valid with PHYLINK * Demagic SGMII register values * Use phylink state to setup fixed-link mode Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: ethernet: mediatek: Add basic PHYLINK supportRené van Dorst3-192/+265
This convert the basics to PHYLINK API. SGMII support is not in this patch. Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge branch 'net-dsa-explicit-programmation-of-VLAN-on-CPU-ports'David S. Miller5-109/+136
Vivien Didelot says: ==================== net: dsa: explicit programmation of VLAN on CPU ports When a VLAN is programmed on a user port, every switch of the fabric also program the CPU ports and the DSA links as part of the VLAN. To do that, DSA makes use of bitmaps to prepare all members of a VLAN. While this is expected for DSA links which are used as conduit between interconnected switches, only the dedicated CPU port of the slave must be programmed, not all CPU ports of the fabric. This may also cause problems in other corners of DSA such as the tag_8021q.c driver, which needs to program its ports manually, CPU port included. We need the dsa_port_vlan_{add,del} functions and its dsa_port_vid_{add,del} variants to simply trigger the VLAN programmation without any logic in them, but they may currently skip the operation based on the bridge device state. This patchset gets rid of the bitmap operations, and moves the bridge device check as well as the explicit programmation of CPU ports where they belong, in the slave code. While at it, clear the VLAN flags before programming a CPU port, as it doesn't make sense to forward the PVID flag for example for such ports. Changes in v2: only clear the PVID flag. ==================== Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: clear VLAN PVID flag for CPU portVivien Didelot1-0/+6
When the bridge offloads a VLAN on a slave port, we also need to program its dedicated CPU port as a member of the VLAN. Drivers may handle the CPU port's membership as they want. For example, Marvell as a special "Unmodified" mode to pass frames as is through such ports. Even though DSA expects the drivers to handle the CPU port membership, it does not make sense to program user VLANs as PVID on the CPU port. This patch clears this flag before programming the CPU port. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: program VLAN on CPU port from slaveVivien Didelot2-1/+18
DSA currently programs a VLAN on the CPU port implicitly after the related notifier is received by a switch. While we still need to do this transparent programmation of the DSA links in the fabric, programming the CPU port this way may cause problems in some corners such as the tag_8021q driver. Because the dedicated CPU port is specific to a slave, make their programmation explicit a few layers up, in the slave code. Note that technically, DSA links have a dedicated CPU port as well, but since they are only used as conduit between interconnected switches of a fabric, programming them transparently this way is what we want. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: check bridge VLAN in slave operationsVivien Didelot2-8/+14
The bridge VLANs are not offloaded by dsa_port_vlan_* if the port is not bridged or if its bridge is not VLAN aware. This is a good thing but other corners of DSA, such as the tag_8021q driver, may need to program VLANs regardless the bridge state. And also because bridge_dev is specific to user ports anyway, move these checks were it belongs, one layer up in the slave code. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: add slave VLAN helpersVivien Didelot1-7/+33
Add dsa_slave_vlan_add and dsa_slave_vlan_del helpers to handle SWITCHDEV_OBJ_ID_PORT_VLAN switchdev objects. Also copy the switchdev_obj_port_vlan structure on add since we will modify it in future patches. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: do not skip -EOPNOTSUPP in dsa_port_vid_addVivien Didelot2-4/+7
Currently dsa_port_vid_add returns 0 if the switch returns -EOPNOTSUPP. This function is used in the tag_8021q.c code to offload the PVID of ports, which would simply not work if .port_vlan_add is not supported by the underlying switch. Do not skip -EOPNOTSUPP in dsa_port_vid_add but only when necessary, that is to say in dsa_slave_vlan_rx_add_vid. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27net: dsa: remove bitmap operationsVivien Didelot3-90/+59
The bitmap operations were introduced to simplify the switch drivers in the future, since most of them could implement the common VLAN and MDB operations (add, del, dump) with simple functions taking all target ports at once, and thus limiting the number of hardware accesses. Programming an MDB or VLAN this way in a single operation would clearly simplify the drivers a lot but would require a new get-set interface in DSA. The usage of such bitmap from the stack also raised concerned in the past, leading to the dynamic allocation of a new ds->_bitmap member in the dsa_switch structure. So let's get rid of them for now. This commit nicely wraps the ds->ops->port_{mdb,vlan}_{prepare,add} switch operations into new dsa_switch_{mdb,vlan}_{prepare,add} variants not using any bitmap argument anymore. New dsa_switch_{mdb,vlan}_match helpers have been introduced to make clear which local port of a switch must be programmed with the target object. While the targeted user port is an obvious candidate, the DSA links must also be programmed, as well as the CPU port for VLANs. While at it, also remove local variables that are only used once. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller279-1287/+2599
Minor conflict in r8169, bug fix had two versions in net and net-next, take the net-next hunks. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds14-118/+163
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page lock leak in nfs_pageio_resend() - Ensure O_DIRECT reports an error if the bytes read/written is 0 - Don't handle errors if the bind/connect succeeded - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidat ed" Bugfixes: - Don't refresh attributes with mounted-on-file information - Fix return values for nfs4_file_open() and nfs_finish_open() - Fix pnfs layoutstats reporting of I/O errors - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort for soft I/O errors when the user specifies a hard mount. - Various fixes to the error handling in sunrpc - Don't report writepage()/writepages() errors twice" * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: remove set but not used variable 'mapping' NFSv2: Fix write regression NFSv2: Fix eof handling NFS: Fix writepage(s) error handling to not report errors twice NFS: Fix spurious EIO read errors pNFS/flexfiles: Don't time out requests on hard mounts SUNRPC: Handle connection breakages correctly in call_status() Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" SUNRPC: Handle EADDRINUSE and ENOBUFS correctly pNFS/flexfiles: Turn off soft RPC calls SUNRPC: Don't handle errors if the bind/connect succeeded NFS: On fatal writeback errors, we need to call nfs_inode_remove_request() NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend() NFSv4: Fix return value in nfs_finish_open() NFSv4: Fix return values for nfs4_file_open() NFS: Don't refresh attributes with mounted-on-file information
2019-08-27Merge tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arcLinus Torvalds10-38/+172
Pull ARC updates from Vineet Gupta: - support for Edge Triggered IRQs in ARC IDU intc - other fixes here and there * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: prefer __section from compiler_attributes.h dt-bindings: IDU-intc: Add support for edge-triggered interrupts dt-bindings: IDU-intc: Clean up documentation ARCv2: IDU-intc: Add support for edge-triggered interrupts ARC: unwind: Mark expected switch fall-throughs ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations ARC: fix typo in setup_dma_ops log message ARCv2: entry: early return from exception need not clear U & DE bits
2019-08-27Merge tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfdLinus Torvalds1-3/+3
Pull MFD fix from Lee Jones: "Identify potentially unused functions in rk808 driver when !PM" * tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: rk808: Make PM function declaration static mfd: rk808: Mark pm functions __maybe_unused
2019-08-27Merge tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds10-32/+73
Pull sound fixes from Takashi Iwai: "A collection of small fixes as usual: - More coverage of USB-audio descriptor sanity checks - A fix for mute LED regression on Conexant HD-audio codecs - A few device-specific fixes and quirks for USB-audio and HD-audio - A fix for (die-hard remaining) possible race in sequencer core - FireWire oxfw regression fix that was introduced in 5.3-rc1" * tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: oxfw: fix to handle correct stream for PCM playback ALSA: seq: Fix potential concurrent access to the deleted pool ALSA: usb-audio: Check mixer unit bitmap yet more strictly ALSA: line6: Fix memory leak at line6_init_pcm() error path ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate() ALSA: hda/ca0132 - Add new SBZ quirk ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604 ALSA: hda - Fixes inverted Conexant GPIO mic mute led
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds67-204/+415
Pull networking fixes from David Miller: 1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya Leoshkevich. 2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for SMC from Jason Baron. 3) ipv6_mc_may_pull() should return 0 for malformed packets, not -EINVAL. From Stefano Brivio. 4) Don't forget to unpin umem xdp pages in error path of xdp_umem_reg(). From Ivan Khoronzhuk. 5) Fix sta object leak in mac80211, from Johannes Berg. 6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2 switches. From Florian Fainelli. 7) Revert DMA sync removal from r8169 which was causing regressions on some MIPS Loongson platforms. From Heiner Kallweit. 8) Use after free in flow dissector, from Jakub Sitnicki. 9) Fix NULL derefs of net devices during ICMP processing across collect_md tunnels, from Hangbin Liu. 10) proto_register() memory leaks, from Zhang Lin. 11) Set NLM_F_MULTI flag in multipart netlink messages consistently, from John Fastabend. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) r8152: Set memory to all 0xFFs on failed reg reads openvswitch: Fix conntrack cache with timeout ipv4: mpls: fix mpls_xmit for iptunnel nexthop: Fix nexthop_num_path for blackhole nexthops net: rds: add service level support in rds-info net: route dump netlink NLM_F_MULTI flag missing s390/qeth: reject oversized SNMP requests sock: fix potential memory leak in proto_register() MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode ipv4/icmp: fix rt dst dev null pointer dereference openvswitch: Fix log message in ovs conntrack bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0 bpf: fix use after free in prog symbol exposure bpf: fix precision tracking in presence of bpf2bpf calls flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH Revert "r8169: remove not needed call to dma_sync_single_for_device" ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev net/ncsi: Fix the payload copying for the request coming from Netlink qed: Add cleanup in qed_slowpath_start() ...
2019-08-27NFS: remove set but not used variable 'mapping'YueHaibing1-2/+0
Fixes gcc '-Wunused-but-set-variable' warning: fs/nfs/write.c: In function nfs_page_async_flush: fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable] It is not use since commit aefb623c422e ("NFS: Fix writepage(s) error handling to not report errors twice") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-27NFSv2: Fix write regressionTrond Myklebust1-1/+3
Ensure we update the write result count on success, since the RPC call itself does not do so. Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Jan Stancek <jstancek@redhat.com>
2019-08-27NFSv2: Fix eof handlingTrond Myklebust1-1/+2
If we received a reply from the server with a zero length read and no error, then that implies we are at eof. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-27mfd: rk808: Make PM function declaration staticLee Jones1-1/+1
Avoids: ../drivers/mfd/rk808.c:771:1: warning: symbol 'rk8xx_pm_ops' \ was not declared. Should it be static? Fixes: 5752bc4373b2 ("mfd: rk808: Mark pm functions __maybe_unused") Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-08-27mfd: rk808: Mark pm functions __maybe_unusedArnd Bergmann1-2/+2
The newly added suspend/resume functions are only used if CONFIG_PM is enabled: drivers/mfd/rk808.c:752:12: error: 'rk8xx_resume' defined but not used [-Werror=unused-function] drivers/mfd/rk808.c:732:12: error: 'rk8xx_suspend' defined but not used [-Werror=unused-function] Mark them as __maybe_unused so the compiler can silently drop them when they are not needed. Fixes: 586c1b4125b3 ("mfd: rk808: Add RK817 and RK809 support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-08-26nfp: add AMDA0058 boards to firmware listJakub Kicinski1-0/+2
Add MODULE_FIRMWARE entries for AMDA0058 boards. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26r8169: improve DMA handling in rtl_rxHeiner Kallweit1-4/+3
Move the call to dma_sync_single_for_cpu after calling napi_alloc_skb. This avoids calling dma_sync_single_for_cpu w/o handing control back to device if the memory allocation should fail. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26Merge branch 'cls-hw-offload-rtnl'David S. Miller11-172/+478
Vlad Buslov says: ==================== Refactor cls hardware offload API to support rtnl-independent drivers Currently, all cls API hardware offloads driver callbacks require caller to hold rtnl lock when calling them. This patch set introduces new API that allows drivers to register callbacks that are not dependent on rtnl lock and unlocked classifiers to offload filters without obtaining rtnl lock first, which is intended to allow offloading tc rules in parallel. Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added. TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are already registered with this flag and only take rtnl lock when qdisc or classifier requires it. Classifiers can indicate that their ops callbacks don't require caller to hold rtnl lock by setting the TCF_PROTO_OPS_DOIT_UNLOCKED flag. Unlocked implementation of flower classifier is now upstreamed. However, this implementation still obtains rtnl lock before calling hardware offloads API. Implement following cls API changes: - Introduce new "unlocked_driver_cb" flag to struct flow_block_offload to allow registering and unregistering block hardware offload callbacks that do not require caller to hold rtnl lock. Drivers that doesn't require users of its tc offload callbacks to hold rtnl lock sets the flag to true on block bind/unbind. Internally tcf_block is extended with additional lockeddevcnt counter that is used to count number of devices that require rtnl lock that block is bound to. When this counter is zero, tc_setup_cb_*() functions execute callbacks without obtaining rtnl lock. - Extend cls API single hardware rule update tc_setup_cb_call() function with tc_setup_cb_add(), tc_setup_cb_replace(), tc_setup_cb_destroy() and tc_setup_cb_reoffload() functions. These new APIs are needed to move management of block offload counter, filter in hardware counter and flag from classifier implementations to cls API, which is now responsible for managing them in concurrency-safe manner. Access to cb_list from callback execution code is synchronized by obtaining new 'cb_lock' rw_semaphore in read mode, which allows executing callbacks in parallel, but excludes any modifications of data from register/unregister code. tcf_block offloads counter type is changed to atomic integer to allow updating the counter concurrently. - Extend classifier ops with new ops->hw_add() and ops->hw_del() callbacks which are used to notify unlocked classifiers when filter is successfully added or deleted to hardware without releasing cb_lock. This is necessary to update classifier state atomically with callback list traversal and updating of all relevant counters and allows unlocked classifiers to synchronize with concurrent reoffload without requiring any changes to driver callback API implementations. New tc flow_action infrastructure is also modified to allow its user to execute without rtnl lock protection. Function tc_setup_flow_action() is modified to conditionally obtain rtnl lock before accessing action state. Action data that is accessed by reference is either copied or reference counted to prevent concurrent action overwrite from deallocating it. New function tc_cleanup_flow_action() is introduced to cleanup/release all such data obtained by tc_setup_flow_action(). Flower classifier (only unlocked classifier at the moment) is modified to use new cls hardware offloads API and no longer obtains rtnl lock before calling it. ==================== Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: flower: don't take rtnl lock for cls hw offloads APIVlad Buslov1-37/+16
Don't manually take rtnl lock in flower classifier before calling cls hardware offloads API. Instead, pass rtnl lock status via 'rtnl_held' parameter. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: copy tunnel info when setting flow_action entry->tunnelVlad Buslov2-1/+25
In order to remove dependency on rtnl lock, modify tc_setup_flow_action() to copy tunnel info, instead of just saving pointer to tunnel_key action tunnel info. This is necessary to prevent concurrent action overwrite from releasing tunnel info while it is being used by rtnl-unlocked driver. Implement helper tcf_tunnel_info_copy() that is used to copy tunnel info with all its options to dynamically allocated memory block. Modify tc_cleanup_flow_action() to free dynamically allocated tunnel info. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: take reference to action dev before calling offloadsVlad Buslov3-0/+36
In order to remove dependency on rtnl lock when calling hardware offload API, take reference to action mirred dev when initializing flow_action structure in tc_setup_flow_action(). Implement function tc_cleanup_flow_action(), use it to release the device after hardware offload API is done using it. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: take rtnl lock in tc_setup_flow_action()Vlad Buslov4-9/+20
In order to allow using new flow_action infrastructure from unlocked classifiers, modify tc_setup_flow_action() to accept new 'rtnl_held' argument. Take rtnl lock before accessing tc_action data. This is necessary to protect from concurrent action replace. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: conditionally obtain rtnl lock in cls hw offloads APIVlad Buslov1-0/+65
In order to remove dependency on rtnl lock from offloads code of classifiers, take rtnl lock conditionally before executing driver callbacks. Only obtain rtnl lock if block is bound to devices that require it. Block bind/unbind code is rtnl-locked and obtains block->cb_lock while holding rtnl lock. Obtain locks in same order in tc_setup_cb_*() functions to prevent deadlock. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: add API for registering unlocked offload block callbacksVlad Buslov5-0/+13
Extend struct flow_block_offload with "unlocked_driver_cb" flag to allow registering and unregistering block hardware offload callbacks that do not require caller to hold rtnl lock. Extend tcf_block with additional lockeddevcnt counter that is incremented for each non-unlocked driver callback attached to device. This counter is necessary to conditionally obtain rtnl lock before calling hardware callbacks in following patches. Register mlx5 tc block offload callbacks as "unlocked". Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: notify classifier on successful offload add/deleteVlad Buslov3-9/+47
To remove dependency on rtnl lock, extend classifier ops with new ops->hw_add() and ops->hw_del() callbacks. Call them from cls API while holding cb_lock every time filter if successfully added to or deleted from hardware. Implement the new API in flower classifier. Use it to manage hw_filters list under cb_lock protection, instead of relying on rtnl lock to synchronize with concurrent fl_reoffload() call. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: refactor block offloads counter usageVlad Buslov7-123/+233
Without rtnl lock protection filters can no longer safely manage block offloads counter themselves. Refactor cls API to protect block offloadcnt with tcf_block->cb_lock that is already used to protect driver callback list and nooffloaddevcnt counter. The counter can be modified by concurrent tasks by new functions that execute block callbacks (which is safe with previous patch that changed its type to atomic_t), however, block bind/unbind code that checks the counter value takes cb_lock in write mode to exclude any concurrent modifications. This approach prevents race conditions between bind/unbind and callback execution code but allows for concurrency for tc rule update path. Move block offload counter, filter in hardware counter and filter flags management from classifiers into cls hardware offloads API. Make functions tcf_block_offload_{inc|dec}() and tc_cls_offload_cnt_update() to be cls API private. Implement following new cls API to be used instead: tc_setup_cb_add() - non-destructive filter add. If filter that wasn't already in hardware is successfully offloaded, increment block offloads counter, set filter in hardware counter and flag. On failure, previously offloaded filter is considered to be intact and offloads counter is not decremented. tc_setup_cb_replace() - destructive filter replace. Release existing filter block offload counter and reset its in hardware counter and flag. Set new filter in hardware counter and flag. On failure, previously offloaded filter is considered to be destroyed and offload counter is decremented. tc_setup_cb_destroy() - filter destroy. Unconditionally decrement block offloads counter. tc_setup_cb_reoffload() - reoffload filter to single cb. Execute cb() and call tc_cls_offload_cnt_update() if cb() didn't return an error. Refactor all offload-capable classifiers to atomically offload filters to hardware, change block offload counter, and set filter in hardware counter and flag by means of the new cls API functions. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: change tcf block offload counter type to atomic_tVlad Buslov2-4/+5
As a preparation for running proto ops functions without rtnl lock, change offload counter type to atomic. This is necessary to allow updating the counter by multiple concurrent users when offloading filters to hardware from unlocked classifiers. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26net: sched: protect block offload-related fields with rw_semaphoreVlad Buslov2-9/+38
In order to remove dependency on rtnl lock, extend tcf_block with 'cb_lock' rwsem and use it to protect flow_block->cb_list and related counters from concurrent modification. The lock is taken in read mode for read-only traversal of cb_list in tc_setup_cb_call() and write mode in all other cases. This approach ensures that: - cb_list is not changed concurrently while filters is being offloaded on block. - block->nooffloaddevcnt is checked while holding the lock in read mode, but is only changed by bind/unbind code when holding the cb_lock in write mode to prevent concurrent modification. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-26NFS: Fix writepage(s) error handling to not report errors twiceTrond Myklebust1-8/+13
If writepage()/writepages() saw an error, but handled it without reporting it, we should not be re-reporting that error on exit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26NFS: Fix spurious EIO read errorsTrond Myklebust3-21/+36
If the client attempts to read a page, but the read fails due to some spurious error (e.g. an ACCESS error or a timeout, ...) then we need to allow other processes to retry. Also try to report errors correctly when doing a synchronous readpage. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26pNFS/flexfiles: Don't time out requests on hard mountsTrond Myklebust1-2/+9
If the mount is hard, we should ignore the 'io_maxretrans' module parameter so that we always keep retrying. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26SUNRPC: Handle connection breakages correctly in call_status()Trond Myklebust1-1/+1
If the connection breaks while we're waiting for a reply from the server, then we want to immediately try to reconnect. Fixes: ec6017d90359 ("SUNRPC fix regression in umount of a secure mount") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"Trond Myklebust3-25/+0
This reverts commit a79f194aa4879e9baad118c3f8bb2ca24dbef765. The mechanism for aborting I/O is racy, since we are not guaranteed that the request is asleep while we're changing both task->tk_status and task->tk_action. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v5.1
2019-08-26SUNRPC: Handle EADDRINUSE and ENOBUFS correctlyTrond Myklebust1-3/+7
If a connect or bind attempt returns EADDRINUSE, that means we want to retry with a different port. It is not a fatal connection error. Similarly, ENOBUFS is not fatal, but just indicates a memory allocation issue. Retry after a short delay. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26pNFS/flexfiles: Turn off soft RPC callsTrond Myklebust1-5/+10
The pNFS/flexfiles I/O requests are sent with the SOFTCONN flag set, so they automatically time out if the connection breaks. It should therefore not be necessary to have the soft flag set in addition. Fixes: 5f01d9539496 ("nfs41: create NFSv3 DS connection if specified") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26SUNRPC: Don't handle errors if the bind/connect succeededTrond Myklebust1-11/+24
Don't handle errors in call_bind_status()/call_connect_status() if it turns out that a previous call caused it to succeed. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v5.1+