path: root/net
Age | Commit message | Author | Files | Lines
2012-07-11ipv4: Add redirect support to all protocol icmp error handlers.David S. Miller12-16/+110
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.David S. Miller1-0/+28
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.David S. Miller1-40/+54
All of the redirect acceptance policy is now contained within. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Rearrange arguments to ip_rt_redirect()David S. Miller2-34/+25
Pass in the SKB rather than just the IP addresses, so that policy and other aspects can reside in ip_rt_redirect() rather than icmp_redirect(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Pull redirect instantiation out into a helper function.David S. Miller1-15/+22
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Deliver ICMP redirects to sockets too.David S. Miller1-7/+1
And thus, we can remove the ping_err() hack. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Pull icmp socket delivery out into a helper function.David S. Miller1-15/+16
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11tcp: TCP Small QueuesEric Dumazet7-1/+173
This introduces TSQ (TCP Small Queues).

The goal of TSQ is to reduce the number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem.

sk->sk_wmem_alloc is not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time.

TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use.

As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2: it can help to reduce latencies of high prio packets, having smaller TSO packets.

This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs.

Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender:
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)

I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and socket autotuning on both sides no longer uses 4 MBytes.

As the skb destructor cannot restart xmit itself (as the qdisc lock might be taken at this point), we delegate the work to a tasklet. We use one tasklet per cpu for performance reasons.

If the tasklet finds a socket owned by the user, it sets the TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments.

[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
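The arithmetic described in the TSQ message can be sketched in plain C. This is only an illustration under stated assumptions: `tso_cap_bytes()` and `tsq_may_queue()` are hypothetical helper names, not functions from the patch, and the exact comparison the kernel performs against the limit is assumed here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): a single TSO packet is capped to
 * half of the per-socket output limit, and never exceeds the GSO maximum.
 */
static uint32_t tso_cap_bytes(uint32_t gso_max_bytes, uint32_t limit_output_bytes)
{
    uint32_t half = limit_output_bytes / 2;

    return half < gso_max_bytes ? half : gso_max_bytes;
}

/*
 * Hypothetical helper (not kernel code): a socket may hand more data to the
 * qdisc/device layers only while the bytes it already has queued there stay
 * below the limit. The strict "<" is an assumption of this sketch.
 */
static int tsq_may_queue(uint32_t wmem_alloc_bytes, uint32_t limit_output_bytes)
{
    return wmem_alloc_bytes < limit_output_bytes;
}
```

With a limit of 40000 and the standard GSO maximum of 65536, `tso_cap_bytes()` yields 20000, which matches the 40000/2 figure quoted in the message.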
2012-07-11tcp: Fix out of bounds access to tcpm_valsAlexander Duyck1-1/+1
The recent patch "tcp: Maintain dynamic metrics in local cache." introduced an out of bounds access due to what appears to be a typo. I believe this change should resolve the issue by replacing the access to RTAX_CWND with TCP_METRIC_CWND. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
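The class of bug described here, indexing an array with a constant from a different and larger enumeration, can be sketched as follows. All names below are hypothetical stand-ins chosen for illustration; they mirror the idea of RTAX_CWND versus TCP_METRIC_CWND but are not the kernel definitions, and the enum values are assumptions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a route-attribute enumeration (larger index space). */
enum route_attr_index {
    RA_UNSPEC, RA_LOCK, RA_MTU, RA_WINDOW, RA_RTT, RA_RTTVAR, RA_SSTHRESH, RA_CWND
};

/* Hypothetical stand-in for the metrics-cache enumeration (smaller index space). */
enum metric_index {
    M_RTT, M_RTTVAR, M_SSTHRESH, M_CWND, M_REORDERING, M_MAX
};

struct metrics_block {
    uint32_t vals[M_MAX];   /* sized by the metrics enumeration only */
};

/* Returns nonzero when idx is a valid position inside vals[]. */
static int index_in_bounds(int idx)
{
    return idx >= 0 && idx < M_MAX;
}

/* Typed accessor: taking enum metric_index documents which index space applies. */
static uint32_t metric_get(const struct metrics_block *m, enum metric_index idx)
{
    return m->vals[idx];
}
```

In this sketch `RA_CWND` falls outside `vals[]` while `M_CWND` is inside it, which is the same shape of mistake as the one-identifier typo the patch corrects.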
2012-07-11bridge: fix endianLi RongQing1-1/+1
mld->mld_maxdelay is net endian, so we should use ntohs, not htons CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
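As a small userspace illustration of the direction of the conversion: a 16-bit field read from the wire is in network byte order and is converted with ntohs() before use. The `struct wire_query` type and `max_delay_ms()` helper below are hypothetical and only stand in for the MLD header field named in the message.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical wire-format header: the field is stored in network byte order. */
struct wire_query {
    uint16_t max_delay_be;
};

/* Convert the network-order field to host order before doing arithmetic on it. */
static unsigned int max_delay_ms(const struct wire_query *q)
{
    return ntohs(q->max_delay_be);
}
```

On any given host ntohs() and htons() perform the same byte operation, so a fix of this kind mainly makes the code state the correct direction, which matters to readers and to tools that check endianness annotations.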
2012-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller13-53/+62
Conflicts:
net/batman-adv/bridge_loop_avoidance.c
net/batman-adv/bridge_loop_avoidance.h
net/batman-adv/soft-interface.c
net/mac80211/mlme.c

With merge help from Antonio Quartulli (batman-adv) and Stephen Rothwell (drivers/net/usb/qmi_wwan.c). The net/mac80211/mlme.c conflict seemed easy enough, accounting for a conversion to some new tracing macros. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix memory leak - vlan_info structAmir Hanania1-0/+3
A driver reload test shows a memory leak: the vlan_info structure was not freed when the driver was removed. It was not released because nr_vids is still one after the last vlan has been removed. nr_vids is one because vlan zero is added to the interface when the interface is set up, but vlan zero is not deleted at unregister. Fix: delete vlan zero when we unregister the device. Signed-off-by: Amir Hanania <amir.hanania@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller3-7/+19
Included changes: - fix a bug generated by the wrong interaction between the GW feature and the Bridge Loop Avoidance
2012-07-10net: Fix non-kernel-doc comments with kernel-doc start markerBen Hutchings4-17/+8
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings46-103/+163
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Properly define functions with no parametersBen Hutchings1-1/+1
Defining a function with no parameters as 'T foo()' is the deprecated K&R style, and is not strictly equivalent to defining it as 'T foo(void)'. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
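A minimal sketch of the preferred form, using a hypothetical function name: writing `(void)` gives the function a real prototype, so the compiler can reject calls that pass arguments.

```c
/*
 * Prototype form: explicitly takes no parameters.
 * With the older 'int default_queue_len()' spelling, C standards before C23
 * treat the parameter list as unspecified rather than empty.
 * The function name is hypothetical and used only for illustration.
 */
static int default_queue_len(void)
{
    return 1000;
}
```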
2012-07-10ipv4: Remove inetpeer from routes.David S. Miller2-61/+6
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Calling ->cow_metrics() now is a bug.David S. Miller1-28/+2
Nothing ever writes to ipv4 metrics any longer. PMTU is stored in rt->rt_pmtu. Dynamic TCP metrics are stored in a special TCP metrics cache, completely outside of the routes. Therefore ->cow_metrics() can simply be nothing more than a WARN_ON trigger so we can catch anyone who tries to add new writes to ipv4 route metrics. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route().David S. Miller1-1/+0
Blackhole routes have a COW metrics operation that returns NULL always, therefore this dst_copy_metrics() call did absolutely nothing. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Enforce max MTU metric at route insertion time.David S. Miller2-6/+3
Rather than at every struct rtable creation. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Maintain redirect and PMTU info in struct rtable again.David S. Miller3-149/+40
Maintaining this in the inetpeer entries was not the right way to do this at all. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo().David S. Miller4-8/+4
Nobody provides non-zero values any longer. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Kill FLOWI_FLAG_PRECOW_METRICS.David S. Miller3-11/+4
No longer needed. TCP writes metrics, but now in its own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Minimize use of cached route inetpeer.David S. Miller5-22/+35
Only use it in the absolutely required cases: 1) COW'ing metrics 2) ipv4 PMTU 3) ipv4 redirects Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Remove ->get_peer() method.David S. Miller2-32/+0
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Remove tw->tw_peerDavid S. Miller1-14/+2
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Move timestamps from inetpeer to metrics cache.David S. Miller7-123/+144
With help from Lin Ming. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Don't report route RTT metric value in cache dumps.David S. Miller2-18/+15
We don't maintain it dynamically any longer, so reporting it would be extremely misleading. Report zero instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Maintain dynamic metrics in local cache.David S. Miller2-93/+464
Maintain a local hash table of TCP dynamic metrics blobs. Computed TCP metrics are no longer maintained in the route metrics. The table uses RCU and an extremely simple hash so that it has low latency and low overhead. A simple hash is legitimate because we only make metrics blobs for fully established connections. The default hash table sizes, metric timeouts, and the hash chain length limit certainly could use some tweaking. But the basic design seems sound. With help from Eric Dumazet and Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Abstract back handling peer aliveness test into helper function.David S. Miller3-2/+12
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Move dynamnic metrics handling into seperate file.David S. Miller3-187/+195
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09net/rxrpc/ar-peer.c: remove invalid reference to list iterator variableJulia Lawall1-1/+1
If list_for_each_entry, etc complete a traversal of the list, the iterator variable ends up pointing to an address at an offset from the list head, and not a meaningful structure. Thus this value should not be used after the end of the iterator. This seems to be a copy-paste bug from a previous debugging message, and so the meaningless value is just deleted. This problem was found using Coccinelle (http://coccinelle.lip6.fr/). Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
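The pitfall can be sketched with a small userspace imitation of an intrusive circular list. Everything here (`struct list_node`, `struct peer`, `node_to_peer()`, `find_peer()`) is hypothetical and written for illustration; it is not the kernel's list.h nor the rxrpc code.

```c
#include <stddef.h>

/* Minimal circular list: the head is a bare node embedded in no structure. */
struct list_node {
    struct list_node *next;
};

struct peer {
    int id;
    struct list_node link;
};

/* Recover the containing struct peer from its embedded list node. */
#define node_to_peer(n) \
    ((struct peer *)((char *)(n) - offsetof(struct peer, link)))

/*
 * Return the peer with the given id, or NULL if none matches.
 * The result is kept in a separate 'found' variable so that the loop cursor
 * is never interpreted as a struct peer after the traversal has completed.
 */
static struct peer *find_peer(struct list_node *head, int id)
{
    struct peer *found = NULL;
    struct list_node *n;

    for (n = head->next; n != head; n = n->next) {
        struct peer *p = node_to_peer(n);

        if (p->id == id) {
            found = p;
            break;
        }
    }

    /*
     * If the loop ran to completion, n == head here. Applying node_to_peer()
     * to it would yield an address at an offset from the list head, which is
     * not a meaningful struct peer.
     */
    return found;
}
```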
2012-07-09net: cgroup: fix out of bounds accessesEric Dumazet2-4/+8
dev->priomap is allocated by extend_netdev_table() called from update_netdev_tables(). And this is only called if write_priomap() is called. But if write_priomap() is not called, it seems we can have out of bounds accesses in cgrp_destroy(), read_priomap() & skb_update_prio() With help from Gao Feng Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davemJohn W. Linville3-6/+4
2012-07-09mac80211: destroy assoc_data correctly if assoc failsEliad Peller1-4/+2
If association failed due to internal error (e.g. no supported rates IE), we call ieee80211_destroy_assoc_data() with assoc=true, while we actually reject the association. This results in the BSSID not being zeroed out. After passing assoc=false, we no longer have to call sta_info_destroy_addr() explicitly. While on it, move the "associated" message after the assoc_success check. Cc: stable@vger.kernel.org [3.4+] Signed-off-by: Eliad Peller <eliad@wizery.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-07-09NFC: Prevent NULL deref when getting socket nameSasha Levin1-1/+1
llcp_sock_getname can be called without a device attached to the nfc_llcp_sock. This would lead to the following BUG:

[ 362.341807] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 362.341815] IP: [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
[ 362.341818] PGD 31b35067 PUD 30631067 PMD 0
[ 362.341821] Oops: 0000 [#627] PREEMPT SMP DEBUG_PAGEALLOC
[ 362.341826] CPU 3
[ 362.341827] Pid: 7816, comm: trinity-child55 Tainted: G D W 3.5.0-rc4-next-20120628-sasha-00005-g9f23eb7 #479
[ 362.341831] RIP: 0010:[<ffffffff836258e5>] [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
[ 362.341832] RSP: 0018:ffff8800304fde88 EFLAGS: 00010286
[ 362.341834] RAX: 0000000000000000 RBX: ffff880033cb8000 RCX: 0000000000000001
[ 362.341835] RDX: ffff8800304fdec4 RSI: ffff8800304fdec8 RDI: ffff8800304fdeda
[ 362.341836] RBP: ffff8800304fdea8 R08: 7ebcebcb772b7ffb R09: 5fbfcb9c35bdfd53
[ 362.341838] R10: 4220020c54326244 R11: 0000000000000246 R12: ffff8800304fdec8
[ 362.341839] R13: ffff8800304fdec4 R14: ffff8800304fdec8 R15: 0000000000000044
[ 362.341841] FS: 00007effa376e700(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000
[ 362.341843] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 362.341844] CR2: 0000000000000000 CR3: 0000000030438000 CR4: 00000000000406e0
[ 362.341851] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 362.341856] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 362.341858] Process trinity-child55 (pid: 7816, threadinfo ffff8800304fc000, task ffff880031270000)
[ 362.341858] Stack:
[ 362.341862] ffff8800304fdea8 ffff880035156780 0000000000000000 0000000000001000
[ 362.341865] ffff8800304fdf78 ffffffff83183b40 00000000304fdec8 0000006000000000
[ 362.341868] ffff8800304f0027 ffffffff83729649 ffff8800304fdee8 ffff8800304fdf48
[ 362.341869] Call Trace:
[ 362.341874] [<ffffffff83183b40>] sys_getpeername+0xa0/0x110
[ 362.341877] [<ffffffff83729649>] ? _raw_spin_unlock_irq+0x59/0x80
[ 362.341882] [<ffffffff810f342b>] ? do_setitimer+0x23b/0x290
[ 362.341886] [<ffffffff81985ede>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 362.341889] [<ffffffff8372a539>] system_call_fastpath+0x16/0x1b
[ 362.341921] Code: 84 00 00 00 00 00 b8 b3 ff ff ff 48 85 db 74 54 66 41 c7 04 24 27 00 49 8d 7c 24 12 41 c7 45 00 60 00 00 00 48 8b 83 28 05 00 00 <8b> 00 41 89 44 24 04 0f b6 83 41 05 00 00 41 88 44 24 10 0f b6
[ 362.341924] RIP [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
[ 362.341925] RSP <ffff8800304fde88>
[ 362.341926] CR2: 0000000000000000
[ 362.341928] ---[ end trace 6d450e935ee18bf3 ]---

Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-07-09mac80211: correct size the argument to kzalloc in minstrel_htThomas Huehn1-1/+1
msp has type struct minstrel_ht_sta_priv not struct minstrel_ht_sta. (This incorporates the fixup originally posted as "mac80211: fix kzalloc memory corruption introduced in minstrel_ht". -- JWL) Reported-by: Fengguang Wu <wfg@linux.intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-07-09Merge branch 'master' of git://1984.lsi.us.es/nfDavid S. Miller1-1/+3
Pablo Neira Ayuso says:
====================
* One to get the timeout special parameter for the SET target back working (this was introduced while trying to fix another bug in 3.4), from Jozsef Kadlecsik.
* One fix for a crash when containers and nf_conntrack are used, reported by Hans Schillstrom, from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09netfilter: ipset: timeout fixing bug broke SET target special timeout valueJozsef Kadlecsik1-1/+3
The patch "127f559 netfilter: ipset: fix timeout value overflow bug" broke the SET target when no timeout was specified. Reported-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-07-09cgroup: fix panic in netprio_cgroupGao feng1-1/+2
We set max_prioidx to the index of the first zero bit of prioidx_map in get_prioidx(). So when we delete a low-index netprio cgroup and then add a new netprio cgroup, max_prioidx is set to that low index. When we then set the high-index cgroup's net_prio.ifpriomap, write_priomap() calls update_netdev_tables() to allocate memory of size sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the array that map->priomap points to has max_prioidx + 1 entries, which is fewer than we actually need. Fix this by adding a check in get_prioidx(): only set max_prioidx when it is lower than the new prioidx. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
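The index bookkeeping described above can be sketched in userspace C. This is an illustration under assumptions: `struct prio_alloc`, `get_idx()`, `put_idx()` and `table_entries()` are hypothetical names, the bitmap is a single 32-bit word, and none of it is the netprio_cgroup code itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator state: which indexes are in use, and the high-water mark. */
struct prio_alloc {
    uint32_t used_bitmap;
    uint32_t max_idx;
};

/*
 * Hand out the lowest free index. The high-water mark is only ever raised,
 * so reusing a freed low index cannot shrink it below an index that is
 * still in use. Returns -1 when all 32 indexes are taken.
 */
static int get_idx(struct prio_alloc *a)
{
    uint32_t i;

    for (i = 0; i < 32; i++) {
        if (!(a->used_bitmap & (UINT32_C(1) << i))) {
            a->used_bitmap |= UINT32_C(1) << i;
            if (a->max_idx < i)
                a->max_idx = i;
            return (int)i;
        }
    }
    return -1;
}

/* Release an index (caller guarantees i < 32). */
static void put_idx(struct prio_alloc *a, uint32_t i)
{
    a->used_bitmap &= ~(UINT32_C(1) << i);
}

/* Number of entries a per-device table must hold to cover every live index. */
static size_t table_entries(const struct prio_alloc *a)
{
    return (size_t)a->max_idx + 1;
}
```

Without the `a->max_idx < i` check, freeing index 0 and allocating again would reset the mark to 0 while index 2 is still live, so a table sized from it would be too small.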
2012-07-09small cleanup in ax25_addr_parse()Dan Carpenter1-2/+4
The comments were wrong here because "AX25_MAX_DIGIS" is 8 but the comments say 6. Also I've changed the "7" to "AX25_ADDR_LEN". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09netem: add limitation to reordered packetsEric Dumazet1-27/+15
Fix two netem bugs:
1) When a frame was dropped by tfifo_enqueue(), the drop counter was incremented twice.
2) When reordering is triggered, we enqueue a packet without checking the queue limit. This can OOM pretty fast when it is repeated enough: since skbs are orphaned, no socket limit can help in this situation.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mark Gordon <msg@google.com> Cc: Andreas Terzis <aterzis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-08sctp: refactor sctp_packet_append_chunk and clenup some memory leaksNeil Horman1-27/+52
While doing some recent work on sctp sack bundling I noted that sctp_packet_append_chunk was pretty inefficient. Specifically, it was called recursively while trying to bundle auth and sack chunks. Because of that we call sctp_packet_bundle_sack and sctp_packet_bundle_auth a total of 4 times for every call to sctp_packet_append_chunk, knowing that at least 3 of those calls will do nothing. So let's refactor sctp_packet_bundle_auth to have an outer part that does the attempted bundling, and an inner part that just does the chunk appends. This saves us several calls per iteration that we just don't need. Also, I noticed that the auth and sack bundling fail to free the chunks they allocate if the append fails, so make sure we add that in. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-08ieee802154: verify packet size before trying to allocate itSasha Levin1-6/+6
Currently when sending data over datagram, the send function will attempt to allocate any size passed on from the userspace. We should make sure that this size is checked and limited. We'll limit it to the MTU of the device, which is checked later anyway. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
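The idea of validating a caller-supplied size before allocating can be sketched in userspace C. The function name `alloc_datagram()`, the use of malloc(), and the choice of -EMSGSIZE as the error code are all assumptions of this sketch, not details taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: reject oversized requests before any allocation.
 * Returns 0 and stores the buffer in *out on success, or a negative errno
 * value on failure, leaving *out untouched.
 */
static int alloc_datagram(size_t size, size_t mtu, void **out)
{
    void *buf;

    if (size > mtu)
        return -EMSGSIZE;   /* assumed error code for this sketch */

    buf = malloc(size ? size : 1);
    if (!buf)
        return -ENOMEM;

    *out = buf;
    return 0;
}
```

Checking first means an arbitrarily large size passed from userspace never reaches the allocator, which is the point the message makes about limiting to the device MTU.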
2012-07-07Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-nextDavid S. Miller5-3/+254
2012-07-07Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller13-256/+540
2012-07-06ipv6: fix a bad cast in ip6_dst_lookup_tail()Eric Dumazet1-1/+1
Fix a bug in ip6_dst_lookup_tail(), where typeof(dst) is "struct dst_entry **", not "struct dst_entry *" Reported-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: remove redundant declarationsEric Dumazet1-3/+0
remove redundant declarations, they belong in include/net/tcp.h Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv4: Avoid overhead when no custom FIB rules are installed.David S. Miller3-12/+33
If the user hasn't actually installed any custom rules, or fiddled with the default ones, don't go through the whole FIB rules layer. It's just pure overhead. Instead do what we do with CONFIG_IP_MULTIPLE_TABLES disabled, check the individual tables by hand, one by one. Also, move fib_num_tclassid_users into the ipv4 network namespace. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-06batman-adv: check incoming packet type for blaSimon Wunderlich3-7/+19
If the gateway functionality is used, some broadcast packets (DHCP requests) may be transmitted as unicast packets. As the bridge loop avoidance code now only considers the payload Ethernet destination, it may drop the DHCP request for clients which are claimed by other backbone gateways, because it falsely infers from the broadcast address that the right backbone gateway should have handled the broadcast. Fix this by checking and delegating the batman-adv packet type used for transmission. Reported-by: Guido Iribarren <guidoiribarren@buenosaireslibre.org> Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>