aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2012-07-16net: make sock diag per-namespaceAndrey Vagin4-17/+50
Before this patch sock_diag works for init_net only and dumps information about sockets from all namespaces. This patch expands sock_diag for all name-spaces. It creates a netlink kernel socket for each netns and filters data during dumping. v2: filter accoding with netns in all places remove an unused variable. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pavel Emelyanov <xemul@parallels.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16tcp: add OFO snmp countersEric Dumazet2-2/+8
Add three SNMP TCP counters, to better track TCP behavior at global stage (netstat -s), when packets are received Out Of Order (OFO) TCPOFOQueue : Number of packets queued in OFO queue TCPOFODrop : Number of packets meant to be queued in OFO but dropped because socket rcvbuf limit hit. TCPOFOMerge : Number of packets in OFO that were merged with other packets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-14xfrm: Initialize the struct xfrm_dst behind the dst_enty fieldSteffen Klassert1-2/+3
We start initializing the struct xfrm_dst at the first field behind the struct dst_enty. This is error prone because it might leave a new field uninitialized. So start initializing the struct xfrm_dst right behind the dst_entry. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-14ipv6: Initialize the struct rt6_info behind the dst_enty fieldSteffen Klassert1-5/+6
We start initializing the struct rt6_info at the first field behind the struct dst_enty. This is error prone because it might leave a new field uninitialized. So start initializing the struct rt6_info right behind the dst_entry. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-13Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-nextDavid S. Miller34-480/+1138
John Linville says: ==================== Several drivers see updates: mwifiex, ath9k, iwlwifi, brcmsmac, wlcore/wl12xx/wl18xx, and a handful of others. The bcma bus got a lot of attention from Hauke Mehrtens. The cfg80211 component gets a flurry of patches for multi-channel support, and the mac80211 component gets the first few VHT (11ac) and 60GHz (11ad) patches. This also includes the removal of the iwmc3200 drivers, since the hardware never became available to normal people. Additionally, the NFC subsystem gets a series of updates. According to Samuel, "Here are the interesting bits: - A better error management for the HCI stack. - An LLCP "late" binding implementation for a better NFC SAP usage. SAPs are now reserved only when there's a client for it. - Support for Sony RC-S360 (a.k.a. PaSoRi) pn533 based dongle. We can read and write NFC tags and also establish a p2p link with this dongle now. - A few LLCP fixes." Finally, this includes another pull of the fixes from the wireless tree in order to resolve some merge issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-13ipv4: Don't store a rule pointer in fib_result.David S. Miller3-21/+8
We only use it to fetch the rule's tclassid, so just store the tclassid there instead. This also decreases the size of fib_result by a full 8 bytes on 64-bit. On 32-bits it's a wash. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-13tcp: add LAST_ACK as a valid state for TSQEric Dumazet1-2/+2
Socket state LAST_ACK should allow TSQ to send additional frames, or else we rely on incoming ACKS or timers to send them. Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-13net: Update alloc frag to reduce get/put page usage and recycle pagesAlexander Duyck1-8/+20
This patch is meant to help improve performance by reducing the number of locked operations required to allocate a frag on x86 and other platforms. This is accomplished by using atomic_set operations on the page count instead of calling get_page and put_page. It is based on work originally provided by Eric Dumazet. In addition it also helps to reduce memory overhead when using TCP. This is done by recycling the page if the only holder of the frame is the netdev_alloc_frag call itself. This can occur when skb heads are stolen by either GRO or TCP and the driver providing the packets is using paged frags to store all of the data for the packets. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davemJohn W. Linville34-480/+1138
2012-07-12ipv4: Remove tb_peers from fib_table.David S. Miller1-3/+0
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-nextDavid S. Miller1-57/+33
2012-07-12ipv4: Put proper checks into icmp_socket_deliver().David S. Miller1-6/+6
All handler->err() routines expect that we've done a pskb_may_pull() test to make sure that IP header length + 8 bytes can be safely pulled. Reported-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12net: sched: add ipset ematchFlorian Westphal3-0/+146
Can be used to match packets against netfilter ip sets created via ipset(8). skb->sk_iif is used as 'incoming interface', skb->dev is 'outgoing interface'. Since ipset is usually called from netfilter, the ematch initializes a fake xt_action_param, pulls the ip header into the linear area and also sets skb->data to the IP header (otherwise matching Layer 4 set types doesn't work). Tested-by: Mr Dash Four <mr.dash.four@googlemail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-126lowpan: rework fragment-deleting routinealex.bluesman.smirnov@gmail.com1-19/+20
6lowpan module starts collecting incomming frames and fragments right after lowpan_module_init() therefor it will be better to clean unfinished fragments in lowpan_cleanup_module() function instead of doing it when link goes down. Changed spinlocks type to prevent deadlock with expired timer event and removed unused one. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-126lowpan: fix tag variable sizealex.bluesman.smirnov@gmail.com1-1/+1
Function lowpan_alloc_new_frame() takes u8 tag as an argument. However, its only caller, lowpan_process_data() passes down a u16. Hence, the tag value can get corrupted. This prevent 6lowpan fragment reassembly of a message when the fragment tag value is over 256. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Cc: Tony Cheneau <tony.cheneau@amnesiak.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12mac802154: sparse warnings: make symbols staticalex.bluesman.smirnov@gmail.com2-2/+2
Make symbols static to avoid the following warning shown up by sparse: warning: symbol ... was not declared. Should it be static? Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-126lowpan: get extra headroom in allocated framealex.bluesman.smirnov@gmail.com1-2/+2
Use netdev_alloc_skb_ip_align() instead of alloc_skb() to get some extra headroom in case we need to forward this frame in a tunnel or something else. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12mac802154: add get short address methodalex.bluesman.smirnov@gmail.com3-0/+17
Add method to get the device short 802.15.4 address. This call needed by ieee802154 layer to satisfy 'iz list' request from the user space. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-126lowpan: revert: add missing spin_lock_init()alex.bluesman.smirnov@gmail.com1-3/+1
Revert the commit 768f7c7c121e80f458a9d013b2e8b169e5dfb1e5 to initialize spinlock in the more preferable way and make it static to avoid sparse warning. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv4: Fix warnings in ip_do_redirect() for some configurations.David S. Miller1-4/+6
Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12net: Remove checks for dst_ops->redirect being NULL.David S. Miller7-11/+8
No longer necessary. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12net: Add dummy dst_ops->redirect method where needed.David S. Miller4-0/+21
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().David S. Miller3-109/+2
And delete rt6_redirect(), since it is no longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Add redirect support to all protocol icmp error handlers.David S. Miller12-14/+67
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.David S. Miller1-0/+27
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().David S. Miller1-31/+49
Hook it into dst_ops->redirect as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv6: Move bulk of redirect handling into rt6_redirect().David S. Miller2-76/+71
This sets things up so that we can have the protocol error handlers call down into the ipv6 route code for redirects just as ipv4 already does. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv6: Export ndisc option parsing from ndisc.cDavid S. Miller1-45/+2
This is going to be used internally by the rt6 redirect code. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Kill ip_rt_redirect().David S. Miller2-45/+0
No longer needed, as the protocol handlers now all properly propagate the redirect back into the routing code. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Add redirect support to all protocol icmp error handlers.David S. Miller12-16/+110
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.David S. Miller1-0/+28
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.David S. Miller1-40/+54
All of the redirect acceptance policy is now contained within. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Rearrange arguments to ip_rt_redirect()David S. Miller2-34/+25
Pass in the SKB rather than just the IP addresses, so that policy and other aspects can reside in ip_rt_redirect() rather then icmp_redirect(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Pull redirect instantiation out into a helper function.David S. Miller1-15/+22
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Deliver ICMP redirects to sockets too.David S. Miller1-7/+1
And thus, we can remove the ping_err() hack. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Pull icmp socket delivery out into a helper function.David S. Miller1-15/+16
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11tcp: TCP Small QueuesEric Dumazet7-1/+173
This introduce TSQ (TCP Small Queues) TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklest per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11tcp: Fix out of bounds access to tcpm_valsAlexander Duyck1-1/+1
The recent patch "tcp: Maintain dynamic metrics in local cache." introduced an out of bounds access due to what appears to be a typo. I believe this change should resolve the issue by replacing the access to RTAX_CWND with TCP_METRIC_CWND. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11bridge: fix endianLi RongQing1-1/+1
mld->mld_maxdelay is net endian, so we should use ntohs, not htons CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller13-53/+62
Conflicts: net/batman-adv/bridge_loop_avoidance.c net/batman-adv/bridge_loop_avoidance.h net/batman-adv/soft-interface.c net/mac80211/mlme.c With merge help from Antonio Quartulli (batman-adv) and Stephen Rothwell (drivers/net/usb/qmi_wwan.c). The net/mac80211/mlme.c conflict seemed easy enough, accounting for a conversion to some new tracing macros. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix memory leak - vlan_info structAmir Hanania1-0/+3
In driver reload test there is a memory leak. The structure vlan_info was not freed when the driver was removed. It was not released since the nr_vids var is one after last vlan was removed. The nr_vids is one, since vlan zero is added to the interface when the interface is being set, but the vlan zero is not deleted at unregister. Fix - delete vlan zero when we unregister the device. Signed-off-by: Amir Hanania <amir.hanania@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller3-7/+19
Included changes: - fix a bug generated by the wrong interaction between the GW feature and the Bridge Loop Avoidance
2012-07-10net: Fix non-kernel-doc comments with kernel-doc start markerBen Hutchings4-17/+8
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings46-103/+163
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Properly define functions with no parametersBen Hutchings1-1/+1
Defining a function with no parameters as 'T foo()' is the deprecated K&R style, and is not strictly equivalent to defining it as 'T foo(void)'. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Remove inetpeer from routes.David S. Miller2-61/+6
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Calling ->cow_metrics() now is a bug.David S. Miller1-28/+2
Nothing every writes to ipv4 metrics any longer. PMTU is stored in rt->rt_pmtu. Dynamic TCP metrics are stored in a special TCP metrics cache, completely outside of the routes. Therefore ->cow_metrics() can simply nothing more than a WARN_ON trigger so we can catch anyone who tries to add new writes to ipv4 route metrics. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route().David S. Miller1-1/+0
Blackhole routes have a COW metrics operation that returns NULL always, therefore this dst_copy_metrics() call did absolutely nothing. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Enforce max MTU metric at route insertion time.David S. Miller2-6/+3
Rather than at every struct rtable creation. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Maintain redirect and PMTU info in struct rtable again.David S. Miller3-149/+40
Maintaining this in the inetpeer entries was not the right way to do this at all. Signed-off-by: David S. Miller <davem@davemloft.net>