aboutsummaryrefslogtreecommitdiffstats
path: root/CREDITS (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-06-20net: dsa: mv88e6xxx: better IEEE Prio Mapping Table descriptionVivien Didelot2-7/+8
Kill the remaining shift macro in favor of calculating at compile time its value from the more descriptive mask, which gives us a better representation of the register layout. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 remaining macrosVivien Didelot2-35/+57
Prefix and document the remaining Global 2 registers macros. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 Watchdog macrosVivien Didelot2-43/+52
The Marvell 88E6352 family has a Global 2 register dedicated to the watchdog setup. But the 88E6390 turned it into an indirect table. Prefix and document that. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 Switch MAC macrosVivien Didelot2-2/+7
Prefix and document the Global 2 Switch MAC registers macros. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 EEPROM macrosVivien Didelot2-25/+38
Prefix and document the Global 2 EEPROM registers macros. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 PVT macrosVivien Didelot2-12/+21
Prefix and document the Global 2 Cross-chip Port VLAN registers macros. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 MGMT macrosVivien Didelot2-13/+21
Prefix and document the Global 2 MGMT registers macros. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 Device Mapping macrosVivien Didelot2-5/+7
Prefix and document the Global 2 Device Mapping macros. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: prefix Global 2 Trunk macrosVivien Didelot2-14/+17
Prefix and document the Global 2 Trunk registers macros. At the same time, fix the hask -> hash typo and use the mv88e6xxx_port_mask helper. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: clarify SMI PHY functionsVivien Didelot2-100/+143
Marvell chips with an SMI PHY access in Global 2 registers handle both Clause 22 and Clause 45 of IEEE 802.3. The 88E6390 family has addition bits to target the internal or external PHYs connected to the device, and a Setup function in addition to the default (register) Access function. Prefix the SMI PHY Command and Data registers macros, implement clear helpers for Clause 22 and 44 Access functions, rename variable to match the SMI and switch vocabulary (device and register addresses for Clause 22 and port and device class for Clause 45.) Finally do not use complex macros but simple 16-bit mask to document the registers organization. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net: dsa: mv88e6xxx: add irl_init_all opVivien Didelot4-50/+116
Some Marvell chips have an Ingress Rate Limit unit. But the command values slightly differs between models: 88E6352 use 3-bit for operations while 88E6390 use different 2-bit operations. This commit kills the IRL flags in favor of a new operation implementing the "Init all resources to the initial state" operation. This fixes the operation of 88E6390 family where 0x1000 means Read the selected resource 0, register 0 on port 16, instead of init all. A mv88e6xxx_irl_setup helper is added to wrap the operation call. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net-next: stmmac: dwmac-sun8i: add support for V3s EMACIcenowy Zheng2-0/+9
Allwinner V3s SoC has an Ethernet MAC and an internal PHY like the ones in H3 SoC, however the MAC has no external *MII interfaces available at GPIOs, thus only MII connection to internal PHY is supported. Add this variant of EMAC to dwmac-sun8i driver. The default value of the syscon EMAC-related register seems to have changed from H3, but it seems to be a harmless change. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20net-next: stmmac: dwmac-sun8i: force EPHY clock freq to 24MHzIcenowy Zheng1-0/+4
The EPHY control part of the EMAC syscon register has a bit called CLK_SEL. On the datasheet it says that if it's 0 the EPHY clock is 25MHz and if it's 1 the clock is 24MHz. However, according to the datasheets, no Allwinner SoC with EPHY has any extra xtal input pins for the EPHY, and the system xtal is 24MHz. That means the EPHY is not possible to get a 25MHz xtal input, and thus the frequency can only be 24MHz. It doesn't matter on H3 as the default value of H3 is 24MHz, however on V3s the default value is wrongly set to 25MHz, which prevented the EPHY from working properly. Force the EPHY clock frequency to 24MHz. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20dt-bindings: syscon: Add DT bindings documentation for Allwinner V3s sysconIcenowy Zheng1-0/+1
Allwinner V3s SoC has a syscon like the one in H3. Add its compatible string. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20dt-bindings: net-next: Add DT bindings documentation for Allwinner V3s EMACIcenowy Zheng1-2/+8
Allwinner V3s SoC has a Ethernet MAC like the one in Allwinner H3, but have no external MII capability. That means that it can only use the EPHY and cannot do Gbps transmission. Add binding for it. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20selftests: Introduce tc testsuiteLucas Bates10-0/+1863
Add the beginnings of a testsuite for tc functionality in the kernel. These are a series of unit tests that use the tc executable and verify the success of those commands by checking both the exit codes and the output from tc's 'show' operation. To run the tests: # cd tools/testing/selftests/tc-testing # sudo ./tdc.py You can specify the tc executable to use with the -p argument on the command line or editing the 'TC' variable in tdc_config.py. Refer to the README for full details on how to run. The initial complement of test cases are limited mostly to tc actions. Test cases are most welcome; see the creating-testcases subdirectory for help in creating them. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed: SPQ async callback registrationMichal Kalderon7-55/+96
Whenever firmware indicates that there's an async indication it needs to handle, there's a switch-case where the right functionality is called based on function's personality and information. Before iWARP is added [as yet another client], switch over the SPQ into a callback-registered mechanism, allowing registration of the relevant event-processing logic based on the function's personality. This allows us to tidy the code by removing protocol-specifics from a common file. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed: Wait for resources before FUNC_CLOSEMichal Kalderon1-15/+20
Driver needs to wait for all resources to return from FW before it can send the FUNC_CLOSE ramrod. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed*: Set rdma generic functions prefixMichal Kalderon5-100/+101
Rename the functions common to both iWARP and RoCE to have a prefix of _rdma_ instead of _roce_. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed*: qede_roce.[ch] -> qede_rdma.[ch]Michal Kalderon7-5/+10
Once we have iWARP support, the qede portion of the qedr<->qede would serve all the RDMA protocols - so rename the file to be appropriate to its function. While we're at it, we're also moving a couple of inclusions to it into .h files and adding includes to make sure it contains all type definitions it requires. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed: Disable RoCE dpm when DCBx change occursMintz, Yuval3-0/+49
If DCBx update occurs while QPs are open, stop sending edpms until all QPs are closed. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed: RoCE EDPM to honor PFCMintz, Yuval2-0/+22
Configure device according to DCBx results so that EDPMs made by RoCE would honor flow-control. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20qed: Chain support for external PBLMintz, Yuval10-28/+56
iWARP would require the chains to allocate/free their PBL memory independently, so add the infrastructure to provide it externally. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19tcp: md5: add TCP_MD5SIG_EXT socket option to set a key address prefixIvan Delalande5-15/+41
Replace first padding in the tcp_md5sig structure with a new flag field and address prefix length so it can be specified when configuring a new key for TCP MD5 signature. The tcpm_flags field will only be used if the socket option is TCP_MD5SIG_EXT to avoid breaking existing programs, and tcpm_prefixlen only when the TCP_MD5SIG_FLAG_PREFIX flag is set. Signed-off-by: Bob Gilligan <gilligan@arista.com> Signed-off-by: Eric Mowat <mowat@arista.com> Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19tcp: md5: add an address prefix for key lookupIvan Delalande3-16/+70
This allows the keys used for TCP MD5 signature to be used for whole range of addresses, specified with a prefix length, instead of only one address as it currently is. Signed-off-by: Bob Gilligan <gilligan@arista.com> Signed-off-by: Eric Mowat <mowat@arista.com> Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19cxgb4: notify uP to route ctrlq compl to rdma rspqRaju Rangoju1-4/+6
During the module initialisation there is a possible race (basically race between uld and lld) where neither the uld nor lld notifies the uP about where to route the ctrl queue completions. LLD skips notifying uP as the rdma queues were not created by then (will leave it to ULD to notify the uP). As the ULD comes up, it also skips notifying the uP as the flag FULL_INIT_DONE is not set yet (ULD assumes that the interface is not up yet). Consequently, this race between uld and lld leaves uP unnotified about where to send the ctrl queue completions to, leading to iwarp RI_RES WR failure. Here is the race: CPU 0 CPU1 - allocates nic rx queus - t4_sge_alloc_ctrl_txq() (if rdma rsp queues exists, tell uP to route ctrl queue compl to rdma rspq) - acquires the mutex_lock - allocates rdma response queues - if FULL_INIT_DONE set, tell uP to route ctrl queue compl to rdma rspq - relinquishes mutex_lock - acquires the mutex_lock - enable_rx() - set FULL_INIT_DONE - relinquishes mutex_lock This patch fixes the above issue. Fixes: e7519f9926f1('cxgb4: avoid enabling napi twice to the same queue') Signed-off-by: Raju Rangoju <rajur@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> CC: Stable <stable@vger.kernel.org> # 4.9+ Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19cxgb4: add new T6 pci device id'sGanesh Goudar1-0/+3
Add 0x6082, 0x6083 and 0x6084 T6 device id's Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19nfp: add VLAN filtering supportPablo Cascón2-1/+100
Add general use per-vNIC mailbox area and use it for VLAN filtering support. Initially proto is hardcoded to 802.1q. Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19cxgb4: fix a NULL dereferenceGanesh Goudar1-1/+4
Avoid NULL dereference in setup_sge_queues() when the adapter is in non offload mode. Fixes: 0fbc81b3ad51 ('chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's') Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-18liquidio: replace info-pointer mode with buffer-pointer-only modePrasad Kanneganti11-109/+37
Each Octeon output ring can DMA packets to host memory in two modes: info- pointer mode and buffer-pointer-only mode. In info-pointer mode, Octeon takes two buffer pointers for each packet and places the length of the packet along with specified number of bytes from the beginning of the packet into one buffer and the rest of the packet in a separate buffer. In buffer-pointer-only mode, Octeon takes single buffer pointer and places the length of the packet at the beginning of the buffer followed by the packet data. This patch switches all Octeon output rings from info-pointer mode to buffer-pointer-only mode. This results in fewer DMA setups and cache line snoops. Signed-off-by: Prasad Kanneganti <pkanneganti@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-18pptp: Remove unused variable in pptp_release()Christos Gkekas1-2/+0
Variable opt in pptp_release() is set but never used, thus needs to be removed. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-18liquidio: implement vlan filter enable and disablePrasad Kanneganti3-10/+32
Add implementation to support ethtool -K ethX rx-vlan-filter on/off. Rename OCTNET_CMD_ENABLE_VLAN_FILTER command to OCTNET_CMD_VLAN_FILTER_CTL and add OCTNET_CMD_VLAN_FILTER_ENABLE and OCTNET_CMD_VLAN_FILTER_DISABLE parameters so that it can be used to enable or disable the filter. Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: dsa: Fix legacy probingFlorian Fainelli1-11/+8
After commit 6d3c8c0dd88a ("net: dsa: Remove master_netdev and use dst->cpu_dp->netdev") and a29342e73911 ("net: dsa: Associate slave network device with CPU port") we would be seeing NULL pointer dereferences when accessing dst->cpu_dp->netdev too early. In the legacy code, we actually know early in advance the master network device, so pass it down to the relevant functions. Fixes: 6d3c8c0dd88a ("net: dsa: Remove master_netdev and use dst->cpu_dp->netdev") Fixes: a29342e73911 ("net: dsa: Associate slave network device with CPU port") Reported-by: Jason Cobham <jcobham@questertangent.com> Tested-by: Jason Cobham <jcobham@questertangent.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17tls: update KconfigDave Watson1-2/+5
Missing crypto deps for some platforms. Default to n for new module. config: m68k-amcore_defconfig (attached as .config) compiler: m68k-linux-gcc (GCC) 4.9.0 make.cross ARCH=m68k All errors (new ones prefixed by >>): net/built-in.o: In function `tls_set_sw_offload': >> (.text+0x732f8): undefined reference to `crypto_alloc_aead' net/built-in.o: In function `tls_set_sw_offload': >> (.text+0x7333c): undefined reference to `crypto_aead_setkey' net/built-in.o: In function `tls_set_sw_offload': >> (.text+0x73354): undefined reference to `crypto_aead_setauthsize' Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: add debug atomic_inc_not_zero() in dst_hold()Wei Wang1-1/+1
This patch is meant to add a debug warning on the situation where dst is being held during its destroy phase. This could potentially cause double free issue on the dst. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: reorder all the dst flagsWei Wang1-5/+5
As some dst flags are removed, reorder the dst flags to fill in the blanks. Note: these flags are not exposed into user space. So it is safe to reorder. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: remove DST_NOCACHE flagWei Wang8-25/+17
DST_NOCACHE flag check has been removed from dst_release() and dst_hold_safe() in a previous patch because all the dst are now ref counted properly and can be released based on refcnt only. Looking at the rest of the DST_NOCACHE use, all of them can now be removed or replaced with other checks. So this patch gets rid of all the DST_NOCACHE usage and remove this flag completely. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: remove DST_NOGC flagWei Wang6-19/+9
Now that all the components have been changed to release dst based on refcnt only and not depend on dst gc anymore, we can remove the temporary flag DST_NOGC. Note that we also need to remove the DST_NOCACHE check in dst_release() and dst_hold_safe() because now all the dst are released based on refcnt and behaves as DST_NOCACHE. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: remove dst gc related codeWei Wang3-235/+0
This patch removes all dst gc related code and all the dst free functions Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17decnet: take dst->__refcnt when struct dn_route is createdWei Wang1-17/+19
struct dn_route is inserted into dn_rt_hash_table but no dst->__refcnt is taken. This patch makes sure the dn_rt_hash_table's reference to the dst is ref counted. As the dst is always ref counted properly, we can safely mark DST_NOGC flag so dst_release() will release dst based on refcnt only. And dst gc is no longer needed and all dst_free() or its related function calls should be replaced with dst_release() or dst_release_immediate(). And dst_dev_put() is called when removing dst from the hash table to release the reference on dst->dev before we lose pointer to it. Also, correct the logic in dn_dst_check_expire() and dn_dst_gc() to check dst->__refcnt to be > 1 to indicate it is referenced by other users. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17xfrm: take refcnt of dst when creating struct xfrm_dst bundleWei Wang3-36/+32
During the creation of xfrm_dst bundle, always take ref count when allocating the dst. This way, xfrm_bundle_create() will form a linked list of dst with dst->child pointing to a ref counted dst child. And the returned dst pointer is also ref counted. This makes the link from the flow cache to this dst now ref counted properly. As the dst is always ref counted properly, we can safely mark DST_NOGC flag so dst_release() will release dst based on refcnt only. And dst gc is no longer needed and all dst_free() and its related function calls should be replaced with dst_release() or dst_release_immediate(). The special handling logic for dst->child in dst_destroy() can be replaced with a simple dst_release_immediate() call on the child to release the whole list linked by dst->child pointer. Previously used DST_NOHASH flag is not needed anymore as well. The reason that DST_NOHASH is used in the existing code is mainly to prevent the dst inserted in the fib tree to be wrongly destroyed during the deletion of the xfrm_dst bundle. So in the existing code, DST_NOHASH flag is marked in all the dst children except the one which is in the fib tree. However, with this patch series to remove dst gc logic and release dst only based on ref count, it is safe to release all the children from a xfrm_dst bundle as long as the dst children are all ref counted properly which is already the case in the existing code. So, this patch removes the use of DST_NOHASH flag. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv6: get rid of icmp6 dst garbage collectorWei Wang3-49/+1
icmp6 dst route is currently ref counted during creation and will be freed by user during its call of dst_release(). So no need of a garbage collector for it. Remove all icmp6 dst garbage collector related code. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv6: mark DST_NOGC and remove the operation of dst_free()Wei Wang2-45/+19
With the previous preparation patches, we are ready to get rid of the dst gc operation in ipv6 code and release dst based on refcnt only. So this patch adds DST_NOGC flag for all IPv6 dst and remove the calls to dst_free() and its related functions. At this point, all dst created in ipv6 code do not use the dst gc anymore and will be destroyed at the point when refcnt drops to 0. Also, as icmp6 dst route is refcounted during creation and will be freed by user during its call of dst_release(), there is no need to add this dst to the icmp6 gc list as well. Instead, we need to add it into uncached list so that when a NETDEV_DOWN/NETDEV_UNREGISRER event comes, we can properly go through these icmp6 dst as well and release the net device properly. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv6: call dst_hold_safe() properlyWei Wang2-4/+4
Similar as ipv4, ipv6 path also needs to call dst_hold_safe() when necessary to avoid double free issue on the dst. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv6: call dst_dev_put() properlyWei Wang1-0/+2
As the intend of this patch series is to completely remove dst gc, we need to call dst_dev_put() to release the reference to dst->dev when removing routes from fib because we won't keep the gc list anymore and will lose the dst pointer right after removing the routes. Without the gc list, there is no way to find all the dst's that have dst->dev pointing to the going-down dev. Hence, we are doing dst_dev_put() immediately before we lose the last reference of the dst from the routing code. The next dst_check() will trigger a route re-lookup to find another route (if there is any). Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv6: take dst->__refcnt for insertion into fib6 treeWei Wang3-21/+50
In IPv6 routing code, struct rt6_info is created for each static route and RTF_CACHE route and inserted into fib6 tree. In both cases, dst ref count is not taken. As explained in the previous patch, this leads to the need of the dst garbage collector. This patch holds ref count of dst before inserting the route into fib6 tree and properly releases the dst when deleting it from the fib6 tree as a preparation in order to fully get rid of dst gc later. Also, correct fib6_age() logic to check dst->__refcnt to be 1 to indicate no user is referencing the dst. And remove dst_hold() in vrf_rt6_create() as ip6_dst_alloc() already puts dst->__refcnt to 1. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv4: mark DST_NOGC and remove the operation of dst_free()Wei Wang2-16/+5
With the previous preparation patches, we are ready to get rid of the dst gc operation in ipv4 code and release dst based on refcnt only. So this patch adds DST_NOGC flag for all IPv4 dst and remove the calls to dst_free(). At this point, all dst created in ipv4 code do not use the dst gc anymore and will be destroyed at the point when refcnt drops to 0. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv4: call dst_hold_safe() properlyWei Wang2-4/+4
This patch checks all the calls to dst_hold()/skb_dst_force()/dst_clone()/dst_use() to see if dst_hold_safe() is needed to avoid double free issue if dst gc is removed and dst_release() directly destroys dst when dst->__refcnt drops to 0. In tx path, TCP hold sk->sk_rx_dst ref count and also hold sock_lock(). UDP and other similar protocols always hold refcount for skb->_skb_refdst. So both paths seem to be safe. In rx path, as it is lockless and skb_dst_set_noref() is likely to be used, dst_hold_safe() should always be used when trying to hold dst. In the routing code, if dst is held during an rcu protected session, it is necessary to call dst_hold_safe() as the current dst might be in its rcu grace period. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv4: call dst_dev_put() properlyWei Wang2-0/+6
As the intend of this patch series is to completely remove dst gc, we need to call dst_dev_put() to release the reference to dst->dev when removing routes from fib because we won't keep the gc list anymore and will lose the dst pointer right after removing the routes. Without the gc list, there is no way to find all the dst's that have dst->dev pointing to the going-down dev. Hence, we are doing dst_dev_put() immediately before we lose the last reference of the dst from the routing code. The next dst_check() will trigger a route re-lookup to find another route (if there is any). Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv4: take dst->__refcnt when caching dst in fibWei Wang2-4/+20
In IPv4 routing code, fib_nh and fib_nh_exception can hold pointers to struct rtable but they never increment dst->__refcnt. This leads to the need of the dst garbage collector because when user is done with this dst and calls dst_release(), it can only decrement dst->__refcnt and can not free the dst even it sees dst->__refcnt drops from 1 to 0 (unless DST_NOCACHE flag is set) because the routing code might still hold reference to it. And when the routing code tries to delete a route, it has to put the dst to the gc_list if dst->__refcnt is not yet 0 and have a gc thread running periodically to check on dst->__refcnt and finally to free dst when refcnt becomes 0. This patch increments dst->__refcnt when fib_nh/fib_nh_exception holds reference to this dst and properly release the dst when fib_nh/fib_nh_exception has been updated with a new dst. This patch is a preparation in order to fully get rid of dst gc later. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>