aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/Kconfig.freezer (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-05-17tcp: replace misc tcp_time_stamp to tcp_jiffies32Eric Dumazet5-6/+6
After this patch, all uses of tcp_time_stamp will require a change when we introduce 1 ms and/or 1 us TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp_lp: cache tcp_time_stampEric Dumazet1-3/+4
tcp_time_stamp will become slightly more expensive soon, cache its value. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp_westwood: use tcp_jiffies32 instead of tcp_time_stampEric Dumazet1-3/+3
This CC does not need 1 ms tcp_time_stamp and can use the jiffy based 'timestamp'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 in __tcp_oow_rate_limited()Eric Dumazet1-2/+2
This place wants to use tcp_jiffies32, this is good enough. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: uses jiffies_32 to feed tp->chrono_startEric Dumazet2-2/+2
tcp_time_stamp will no longer be tied to jiffies. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 to feed probe_timestampEric Dumazet2-4/+4
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtimeEric Dumazet5-8/+8
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: bic, cubic: use tcp_jiffies32 instead of tcp_time_stampEric Dumazet2-9/+9
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp_bbr: use tcp_jiffies32 instead of tcp_time_stampEric Dumazet1-6/+6
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stampEric Dumazet3-12/+12
Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->snd_cwnd_stamp. tcp_time_stamp will soon be a litle bit more expensive than simply reading 'jiffies'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 to feed tp->lsndtimeEric Dumazet6-9/+9
Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->lsndtime. tcp_time_stamp will soon be a litle bit more expensive than simply reading 'jiffies'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17dccp: do not use tcp_time_stampEric Dumazet2-5/+5
Use our own macro instead of abusing tcp_time_stamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: introduce tcp_jiffies32Eric Dumazet1-5/+8
We abuse tcp_time_stamp for two different cases : 1) base to generate TCP Timestamp options (RFC 7323) 2) A 32bit version of jiffies since some TCP fields are 32bit wide to save memory. Since we want in the future to have 1ms TCP TS clock, regardless of HZ value, we want to cleanup things. tcp_jiffies32 is the truncated jiffies value, which will be used only in places where we want a 'host' timestamp. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tp->tcp_mstamp in output pathEric Dumazet4-12/+14
Idea is to later convert tp->tcp_mstamp to a full u64 counter using usec resolution, so that we can later have fine grained TCP TS clock (RFC 7323), regardless of HZ value. We try to refresh tp->tcp_mstamp only when necessary. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17sch_dsmark: Fix uninitialized variable warning.David S. Miller1-1/+1
We still need to initialize err to -EINVAL for the case where 'opt' is NULL in dsmark_init(). Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: add termination action to allow goto chainJiri Pirko5-3/+54
Introduce new type of termination action called "goto_chain". This allows user to specify a chain to be processed. This action type is then processed as a return value in tcf_classify loop in similar way as "reclassify" is, only it does not reset to the first filter in chain but rather reset to the first filter of the desired chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: push tp down to action initJiri Pirko3-17/+19
Tp pointer will be needed by the next patch in order to get the chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: introduce multichain support for filtersJiri Pirko4-18/+98
Instead of having only one filter per block, introduce a list of chains for every block. Create chain 0 by default. UAPI is extended so the user can specify which chain he wants to change. If the new attribute is not specified, chain 0 is used. That allows to maintain backward compatibility. If chain does not exist and user wants to manipulate with it, new chain is created with specified index. Also, when last filter is removed from the chain, the chain is destroyed. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: push chain dump to a separate functionJiri Pirko1-43/+52
Since there will be multiple chains to dump, push chain dumping code to a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: introduce helpers to work with filter chainsJiri Pirko2-42/+113
Introduce struct tcf_chain object and set of helpers around it. Wraps up insertion, deletion and search in the filter chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: move TC_H_MAJ macro call into tcf_auto_prioJiri Pirko1-2/+2
Call the helper from the function rather than to always adjust the return value of the function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: replace nprio by a bool to make the function more readableJiri Pirko1-6/+7
The use of "nprio" variable in tc_ctl_tfilter is a bit cryptic and makes a reader wonder what is going on for a while. So help him to understand this priority allocation dance a litte bit better. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: rename tcf_destroy_chain helperJiri Pirko1-3/+3
Make the name consistent with the rest of the helpers around. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: introduce tcf block infractructureJiri Pirko17-99/+243
Currently, the filter chains are direcly put into the private structures of qdiscs. In order to be able to have multiple chains per qdisc and to allow filter chains sharing among qdiscs, there is a need for common object that would hold the chains. This introduces such object and calls it "tcf_block". Helpers to get and put the blocks are provided to be called from individual qdisc code. Also, the original filter_list pointers are left in qdisc privs to allow the entry into tcf_block processing without any added overhead of possible multiple pointer dereference on fast path. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: move tc_classify function to cls_api.cJiri Pirko17-65/+72
Move tc_classify function to cls_api.c where it belongs, rename it to fit the namespace. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17drivers: net: DSA: Sort driversAndrew Lunn2-23/+23
With more drivers being added, it is time to sort the drivers to impose some order. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: dsa: Sort DSA tagging protocol driversAndrew Lunn5-29/+29
With more tag protocols being added, regain some order by sorting the entries in various places. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17liquidio: fix PF falsely indicating success at setting MAC address of a nonexistent VFFelix Manlunas1-0/+3
In the function assigned to .ndo_set_vf_mac, check the validity of the vfidx argument before proceeding to tell the firmware to set the VF MAC address. Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17liquidio: fix insmod failure when multiple NICs are plugged inRick Farrington3-10/+123
When multiple liquidio NICs are plugged in, the first insmod of the PF driver succeeds. But after an rmmod, a subsequent insmod fails. Reason is during rmmod, the PF driver resets the Octeon of only one of the NICs; it neglects to reset the Octeons of the other NICs. Fix the insmod failure by adding the missing Octeon resets at rmmod. Keep a per-NIC refcount that indicates the number of active PFs in a given NIC. When the refcount goes to zero, then reset the Octeon of that NIC. Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: dsa: store CPU port pointer in the treeVivien Didelot11-35/+30
A dsa_switch_tree instance holds a dsa_switch pointer and a port index to identify the switch port to which the CPU is attached. Now that the DSA layer has a dsa_port structure to hold this data, use it to point the switch CPU port. This patch simply substitutes s/dst->cpu_switch/dst->cpu_dp->ds/ and s/dst->cpu_port/dst->cpu_dp->index/. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum: Default ports to non-virtual modeIdo Schimmel1-0/+8
In virtual mode, packets are classified to FIDs based on their ingress port and VLAN whereas in non-virtual mode only the VLAN is taken into account. Currently ports are initialized to use virtual mode due to the presence of the PVID vPort. However, we're going to transition ports between both modes based on the FIDs they use and not merely based on the presence on a VLAN upper. Therefore, during initialization, no mode will be explicitly set. Since the Programmer's Reference Manual (PRM) doesn't specify a default, explicitly set the port to non-virtual mode and later transition the port between both modes based on the FIDs it uses. In a follow-up patchset, this step will be moved to the common FID core where it logically belongs. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum: Move PVID code to appropriate placeIdo Schimmel3-58/+46
PVID is a port attribute and should therefore reside in the main driver file and not the switchdev specific one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_switchdev: Don't batch learning operationsIdo Schimmel3-20/+7
We no longer batch VLAN operations, so there's no need to set the learning state for a range of VLANs. Use a common function to set the learning state for a Port-VLAN, thereby making the code saner more receptive for upcoming changes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_switchdev: Don't batch STP operationsIdo Schimmel1-42/+17
Simplify the code by using the common function that sets an STP state for a Port-VLAN and remove the existing one that tries to batch it for several VLANs. This will help us in a follow-up patchset to introduce a unified infrastructure for bridge ports, regardless if the bridge is VLAN-aware or not. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_switchdev: Don't batch VLAN operationsIdo Schimmel3-139/+121
switchdev's VLAN object has the ability to describe a range of VLAN IDs, but this is only used when VLAN operations are done using the SELF flag, which is something we would like to remove as it allows one to bypass the bridge driver. Do VLAN operations on a per-VLAN basis, thereby simplifying the code and preparing it for refactoring in a follow-up patchset. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_switchdev: Remove redundant checkIdo Schimmel1-9/+0
Since commit 97c242902c20 ("switchdev: Execute bridge ndos only for bridge ports") switchdev code checks that port is bridged, so no need to perform the same check in the driver. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_router: Initialize RIFs in a separate functionIdo Schimmel1-18/+30
The router interfaces (RIFs) array is currently initialized together with the general router configuration. However, in a follow-up patchset we're going to introduce a common RIF core that will require us to initialize more RIF constructs, so move the RIF initialization to its own function. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_router: Move FIB notification block to router structIdo Schimmel2-8/+10
The FIB notification block logically belongs inside the router specific struct, so move it there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_router: Move RIFs array to its rightful placeIdo Schimmel4-24/+35
The router interfaces (RIFs) array is of no interest to code outside the routing realm, so declare it inside the router specific struct instead of the chip-wide one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_switchdev: Reduce scope of bridge structIdo Schimmel4-37/+69
Some attributes in the global chip struct are only relevant for bridge operation, so encapsulate them in their own struct that isn't exposed to non-bridge code. This will also help us later, when we add more bridge-specific attributes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_router: Reduce scope of router structIdo Schimmel2-114/+130
In a similar fashion to previous patch, the router structure ('mlxsw_sp_router') doesn't need to be accessible to anyone, but the router code located at spectrum_router.c Make this apparent and reduce its scope by defining it there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17mlxsw: spectrum_buffer: Reduce scope of shared buffer structIdo Schimmel2-59/+68
The shared buffer structure ('mlxsw_sp_sb') doesn't need to be accessible to anyone, but the shared buffer code located at spectrum_buffers.c Make this apparent and reduce its scope by defining it there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17cxgb4: add new T5 pci device idGanesh Goudar1-0/+1
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17cxgb4: reduce resource allocation in kdump kernelGanesh Goudar1-6/+6
When is_kdump_kernel() is true, reduce memory footprint of cxgb4 by using a single "Queue Set". Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16liquidio: use pcie_flr instead of duplicating itChristoph Hellwig1-16/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16net: phy: Remove residual magic from PHY driversAndrew Lunn5-38/+36
commit fa8cddaf903c ("net phylib: Remove unnecessary condition check in phy") removed the only place where the PHY flag PHY_HAS_MAGICANEG was checked. But it left the flag being set in the drivers. Remove the flag. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16bnx2x: Remove open coded carrier checkLeon Romanovsky1-1/+1
There is inline function to test if carrier present, so it makes open-coded solution redundant. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16tcp: internal implementation for pacingEric Dumazet8-5/+113
BBR congestion control depends on pacing, and pacing is currently handled by sch_fq packet scheduler for performance reasons, and also because implemening pacing with FQ was convenient to truly avoid bursts. However there are many cases where this packet scheduler constraint is not practical. - Many linux hosts are not focusing on handling thousands of TCP flows in the most efficient way. - Some routers use fq_codel or other AQM, but still would like to use BBR for the few TCP flows they initiate/terminate. This patch implements an automatic fallback to internal pacing. Pacing is requested either by BBR or use of SO_MAX_PACING_RATE option. If sch_fq happens to be in the egress path, pacing is delegated to the qdisc, otherwise pacing is done by TCP itself. One advantage of pacing from TCP stack is to get more precise rtt estimations, and less work done from TX completion, since TCP Small queue limits are not generally hit. Setups with single TX queue but many cpus might even benefit from this. Note that unlike sch_fq, we do not take into account header sizes. Taking care of these headers would add additional complexity for no practical differences in behavior. Some performance numbers using 800 TCP_STREAM flows rate limited to ~48 Mbit per second on 40Gbit NIC. If MQ+pfifo_fast is used on the NIC : $ sar -n DEV 1 5 | grep eth 14:48:44 eth0 725743.00 2932134.00 46776.76 4335184.68 0.00 0.00 1.00 14:48:45 eth0 725349.00 2932112.00 46751.86 4335158.90 0.00 0.00 0.00 14:48:46 eth0 725101.00 2931153.00 46735.07 4333748.63 0.00 0.00 0.00 14:48:47 eth0 725099.00 2931161.00 46735.11 4333760.44 0.00 0.00 1.00 14:48:48 eth0 725160.00 2931731.00 46738.88 4334606.07 0.00 0.00 0.00 Average: eth0 725290.40 2931658.20 46747.54 4334491.74 0.00 0.00 0.40 $ vmstat 1 5 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 4 0 0 259825920 45644 2708324 0 0 21 2 247 98 0 0 100 0 0 4 0 0 259823744 45644 2708356 0 0 0 0 2400825 159843 0 19 81 0 0 0 0 0 259824208 45644 2708072 0 0 0 0 2407351 159929 0 19 81 0 0 1 0 0 259824592 45644 2708128 0 0 0 0 2405183 160386 0 19 80 0 0 1 0 0 259824272 45644 2707868 0 0 0 32 2396361 158037 0 19 81 0 0 Now use MQ+FQ : lpaa23:~# echo fq >/proc/sys/net/core/default_qdisc lpaa23:~# tc qdisc replace dev eth0 root mq $ sar -n DEV 1 5 | grep eth 14:49:57 eth0 678614.00 2727930.00 43739.13 4033279.14 0.00 0.00 0.00 14:49:58 eth0 677620.00 2723971.00 43674.69 4027429.62 0.00 0.00 1.00 14:49:59 eth0 676396.00 2719050.00 43596.83 4020125.02 0.00 0.00 0.00 14:50:00 eth0 675197.00 2714173.00 43518.62 4012938.90 0.00 0.00 1.00 14:50:01 eth0 676388.00 2719063.00 43595.47 4020171.64 0.00 0.00 0.00 Average: eth0 676843.00 2720837.40 43624.95 4022788.86 0.00 0.00 0.40 $ vmstat 1 5 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 0 0 259832240 46008 2710912 0 0 21 2 223 192 0 1 99 0 0 1 0 0 259832896 46008 2710744 0 0 0 0 1702206 198078 0 17 82 0 0 0 0 0 259830272 46008 2710596 0 0 0 0 1696340 197756 1 17 83 0 0 4 0 0 259829168 46024 2710584 0 0 16 0 1688472 197158 1 17 82 0 0 3 0 0 259830224 46024 2710408 0 0 0 0 1692450 197212 0 18 82 0 0 As expected, number of interrupts per second is very different. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Van Jacobson <vanj@google.com> Cc: Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16udp: keep the sk_receive_queue held when splicingPaolo Abeni1-10/+26
On packet reception, when we are forced to splice the sk_receive_queue, we can keep the related lock held, so that we can avoid re-acquiring it, if fwd memory scheduling is required. v1 -> v2: the rx_queue_lock_held param in udp_rmem_release() is now a bool Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16udp: use a separate rx queue for packet receptionPaolo Abeni5-24/+131
under udp flood the sk_receive_queue spinlock is heavily contended. This patch try to reduce the contention on such lock adding a second receive queue to the udp sockets; recvmsg() looks first in such queue and, only if empty, tries to fetch the data from sk_receive_queue. The latter is spliced into the newly added queue every time the receive path has to acquire the sk_receive_queue lock. The accounting of forward allocated memory is still protected with the sk_receive_queue lock, so udp_rmem_release() needs to acquire both locks when the forward deficit is flushed. On specific scenarios we can end up acquiring and releasing the sk_receive_queue lock multiple times; that will be covered by the next patch Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>