aboutsummaryrefslogtreecommitdiffstats
path: root/fs (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-12-05tipc: fix node keep alive interval calculationHoang Le1-0/+6
When setting LINK tolerance, node timer interval will be calculated base on the LINK with lowest tolerance. But when calculated, the old node timer interval only updated if current setting value (tolerance/4) less than old ones regardless of number of links as well as links' lowest tolerance value. This caused to two cases missing if tolerance changed as following: Case 1: 1.1/ There is one link (L1) available in the system 1.2/ Set L1's tolerance from 1500ms => lower (i.e 500ms) 1.3/ Then, fallback to default (1500ms) or higher (i.e 2000ms) Expected: node timer interval is 1500/4=375ms after 1.3 Result: node timer interval will not being updated after changing tolerance at 1.3 since its value 1500/4=375ms is not less than 500/4=125ms at 1.2. Case 2: 2.1/ There are two links (L1, L2) available in the system 2.2/ L1 and L2 tolerance value are 2000ms as initial 2.3/ Set L2's tolerance from 2000ms => lower 1500ms 2.4/ Disable link L2 (bring down its bearer) Expected: node timer interval is 2000ms/4=500ms after 2.4 Result: node timer interval will not being updated after disabling L2 since its value 2000ms/4=500ms is still not less than 1500/4=375ms at 2.3 although L2 is already not available in the system. To fix this, we start the node interval calculation by initializing it to a value larger than any conceivable calculated value. This way, the link with the lowest tolerance will always determine the calculated value. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: Use of_node_name_eq for node name comparisonsRob Herring5-9/+9
Convert string compares of DT node names to use of_node_name_eq helper instead. This removes direct access to the node name pointer. For instances using of_node_cmp, this has the side effect of now using case sensitive comparisons. This should not matter for any FDT based system which all of these are. Cc: "David S. Miller" <davem@davemloft.net> Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: netdev@vger.kernel.org Cc: linux-omap@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: netem: use a list in addition to rbtreePeter Oskolkov1-20/+69
When testing high-bandwidth TCP streams with large windows, high latency, and low jitter, netem consumes a lot of CPU cycles doing rbtree rebalancing. This patch uses a linear list/queue in addition to the rbtree: if an incoming packet is past the tail of the linear queue, it is added there, otherwise it is inserted into the rbtree. Without this patch, perf shows netem_enqueue, netem_dequeue, and rb_* functions among the top offenders. With this patch, only netem_enqueue is noticeable if jitter is low/absent. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: bridge: increase multicast's default maximum number of entriesNikolay Aleksandrov2-1/+3
bridge's default hash_max was 512 which is rather conservative, now that we're using the generic rhashtable API which autoshrinks let's increase it to 4096 and move it to a define in br_private.h. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: bridge: mark hash_elasticity as obsoleteNikolay Aleksandrov4-12/+7
Now that the bridge multicast uses the generic rhashtable interface we can drop the hash_elasticity option as that is already done for us and it's hardcoded to a maximum of RHT_ELASTICITY (16 currently). Add a warning about the obsolete option when the hash_elasticity is set. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: bridge: multicast: use non-bh rcu flavorNikolay Aleksandrov3-23/+6
The bridge multicast code has been using a mix of RCU and RCU-bh flavors sometimes in questionable way. Since we've moved to rhashtable just use non-bh RCU everywhere. In addition this simplifies freeing of objects and allows us to remove some unnecessary callback functions. v3: new patch Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: bridge: convert multicast to generic rhashtableNikolay Aleksandrov6-436/+163
The bridge multicast code currently uses a custom resizable hashtable which predates the generic rhashtable interface. It has many shortcomings compared and duplicates functionality that is presently available via the generic rhashtable, so this patch removes the custom rhashtable implementation in favor of the kernel's generic rhashtable. The hash maximum is kept and the rhashtable's size is used to do a loose check if it's reached in which case we revert to the old behaviour and disable further bridge multicast processing. Also now we can support any hash maximum, doesn't need to be a power of 2. v3: add non-rcu br_mdb_get variant and use it where multicast_lock is held to avoid RCU splat, drop hash_max function and just set it directly v2: handle when IGMP snooping is undefined, add br_mdb_init/uninit placeholders Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: phy: Fix ioctl handler when modifing MII_ADVERTISEAndrew Lunn1-2/+2
When the MII_ADVERTISE register is modified by the IOCTL handler, phydev->advertising needs recalculating. Use the _mod_ variant of mii_adv_to_linkmode_adv_t so that bits outside of the advertise registers are not cleared. Fixes: c0ec3c273677 ("net: phy: Convert u32 phydev->lp_advertising to linkmode") Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: mii: mii_lpa_mod_linkmode_lpa_t: Make use of linkmode_mod_bit helperAndrew Lunn1-6/+2
Replace the if else code structure with a call to the helper linkmode_mod_bit. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: mii: Add mii_lpa_mod_linkmode_lpa_tAndrew Lunn2-17/+53
Add a _mod_ variant of mii_lpa_to_linkmode_lpa_t. Use this to fix the genphy_read_status() where the 1G link partner features are getting lost. Fixes: c0ec3c273677 ("net: phy: Convert u32 phydev->lp_advertising to linkmode") Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05phy: marvell: Rename mii_lpa_to_linkmode_lpa_tAndrew Lunn1-11/+11
Rename mii_lpa_to_linkmode_lpa_t to mii_lpa_mod_linkmode_lpa_t to indicate it modifies the passed linkmode bitmap, without clearing any other bits. Also, ensure bit are clear which the lpa indicates should not be set. Fixes: c0ec3c273677 ("net: phy: Convert u32 phydev->lp_advertising to linkmode") Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: mii: Rename mii_stat1000_to_linkmode_lpa_tAndrew Lunn5-14/+23
Rename mii_stat1000_to_linkmode_lpa_t to mii_stat1000_mod_linkmode_lpa_t to indicate it modifies the passed linkmode bitmap, without clearing any other bits. Add a helper to set/clear bits in a linkmode. Use this helper to ensure bit are clear which the stat1000 indicates should not be set. Fixes: c0ec3c273677 ("net: phy: Convert u32 phydev->lp_advertising to linkmode") Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net: mii: Fix autoneg in mii_lpa_to_linkmode_lpa_t()Andrew Lunn1-3/+6
mii_adv_to_linkmode_adv_t() clears all bits before setting it needs to set. This means the freshly set Autoneg gets cleared. Change the order, and add comments about it clearing the old content of the bitmap. Fixes: c0ec3c273677 ("net: phy: Convert u32 phydev->lp_advertising to linkmode") Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05net/mlx5e: Improve ethtool private-flags code structureTariq Toukan2-53/+41
Refactor the code of private-flags setter. Replace consecutive calls to mlx5e_handle_pflag with a loop that uses a preset set of parameters. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-05net/mlx5e: ethtool, Support user configuration for RX hash fieldsAya Levin3-6/+134
Enable user configuration of RX hash fields that are used for traffic spreading into RX queues. User can change built-in RSS (Receive Side Scaling) profiles on the following traffic types: UDP4, UDP6, TCP4 and TCP6. This configuration effects both outer and inner headers. Added support for ethtool commands: ETHTOOL_SRXFH and ETHTOOL_GRXFH. Command example respectively: $ethtool -N eth1 rx-flow-hash tcp4 sdfn $ethtool -n eth1 rx-flow-hash tcpp4 IP SA IP DA L4 bytes 0 & 1 [TCP/UDP src port] L4 bytes 2 & 3 [TCP/UDP dst port] Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-05net/mlx5e: Move RSS params to a dedicated structAya Levin6-47/+62
Remove RSS params from params struct under channels, and introduce a new struct with RSS configuration params under priv struct. There is no functional change here. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-05net/mlx5e: Refactor TIR configuration functionAya Levin4-104/+87
Refactor mlx5e_build_indir_tir_ctx_hash for better code re-use. TIR stands for Transport Interface Receive, which is responsible for all transport related operations on the receive side. Added a static array with TIR default configuration values. This separates configuration values from command setting, which is needed for downstream patch. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-05net: documentation: build a directory structure for driversJakub Kicinski58-66/+71
Documentation/networking/ is full of cryptically named files with driver documentation. This makes finding interesting information at a glance really hard. Move all those files into a directory called device_drivers (since not all drivers are for device) and fix up references. RFC v0.1 -> RFC v1: - also add .txt suffix to the files which are missing it (Quentin) Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Henrik Austad <henrik@austad.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-04tcp: reduce POLLOUT events caused by TCP_NOTSENT_LOWATEric Dumazet3-8/+22
TCP_NOTSENT_LOWAT socket option or sysctl was added in linux-3.12 as a step to enable bigger tcp sndbuf limits. It works reasonably well, but the following happens : Once the limit is reached, TCP stack generates an [E]POLLOUT event for every incoming ACK packet. This causes a high number of context switches. This patch implements the strategy David Miller added in sock_def_write_space() : - If TCP socket has a notsent_lowat constraint of X bytes, allow sendmsg() to fill up to X bytes, but send [E]POLLOUT only if number of notsent bytes is below X/2 This considerably reduces TCP_NOTSENT_LOWAT overhead, while allowing to keep the pipe full. Tested: 100 ms RTT netem testbed between A and B, 100 concurrent TCP_STREAM A:/# cat /proc/sys/net/ipv4/tcp_wmem 4096 262144 64000000 A:/# super_netperf 100 -H B -l 1000 -- -K bbr & A:/# grep TCP /proc/net/sockstat TCP: inuse 203 orphan 0 tw 19 alloc 414 mem 1364904 # This is about 54 MB of memory per flow :/ A:/# vmstat 5 5 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 256220672 13532 694976 0 0 10 0 28 14 0 1 99 0 0 2 0 0 256320016 13532 698480 0 0 512 0 715901 5927 0 10 90 0 0 0 0 0 256197232 13532 700992 0 0 735 13 771161 5849 0 11 89 0 0 1 0 0 256233824 13532 703320 0 0 512 23 719650 6635 0 11 89 0 0 2 0 0 256226880 13532 705780 0 0 642 4 775650 6009 0 12 88 0 0 A:/# echo 2097152 >/proc/sys/net/ipv4/tcp_notsent_lowat A:/# grep TCP /proc/net/sockstat TCP: inuse 203 orphan 0 tw 19 alloc 414 mem 86411 # 3.5 MB per flow A:/# vmstat 5 5 # check that context switches have not inflated too much. procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 0 0 260386512 13592 662148 0 0 10 0 17 14 0 1 99 0 0 0 0 0 260519680 13592 604184 0 0 512 13 726843 12424 0 10 90 0 0 1 1 0 260435424 13592 598360 0 0 512 25 764645 12925 0 10 90 0 0 1 0 0 260855392 13592 578380 0 0 512 7 722943 13624 0 11 88 0 0 1 0 0 260445008 13592 601176 0 0 614 34 772288 14317 0 10 90 0 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-04net/sched: act_tunnel_key: Don't dump dst port if it wasn't setAdi Nissim1-1/+3
It's possible to set a tunnel without a destination port. However, on dump(), a zero dst port is returned to user space even if it was not set, fix that. Note that so far it wasn't required, b/c key less tunnels were not supported and the UDP tunnels do require destination port. Signed-off-by: Adi Nissim <adin@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-04net/sched: act_tunnel_key: Allow key-less tunnelsAdi Nissim1-10/+11
Allow setting a tunnel without a tunnel key. This is required for tunneling protocols, such as GRE, that define the key as an optional field. Signed-off-by: Adi Nissim <adin@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-04qed: fix spelling mistake "Dispalying" -> "Displaying"Colin Ian King1-1/+1
There is a spelling mistake in a DP_NOTICE message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-04net/mlx5e: Move modify tirs hash functionalityAya Levin3-25/+29
Move modify tirs hash functionality (mlx5e_modify_tirs_hash) from en_ethtool.c to en_main.c. This allows future use of this fuctionality from en_fs_ethtool.c, while keeping current convention: en_ethtool.c doesn't have an API. There is no functional change here. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-04net/mlx5e: Cleanup unused definesGal Pressman1-3/+0
Remove couple of defines that are no longer used. Signed-off-by: Gal Pressman <pressmangal@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-04net/mlx5e: Remove trailing space of tx_pause ethtool counter nameSaeed Mahameed1-1/+1
tx_pause_storm_warning_events ethtool counter name has a trailing space, remove it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>