aboutsummaryrefslogtreecommitdiffstats
path: root/net (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2009-12-03net: Batch inet_twsk_purgeEric W. Biederman3-11/+21
This function walks the whole hashtable so there is no point in passing it a network namespace. Instead I purge all timewait sockets from dead network namespaces that I find. If the namespace is one of the once I am trying to purge I am guaranteed no new timewait sockets can be formed so this will get them all. If the namespace is one I am not acting for it might form a few more but I will call inet_twsk_purge again and shortly to get rid of them. In any even if the network namespace is dead timewait sockets are useless. Move the calls of inet_twsk_purge into batch_exit routines so that if I am killing a bunch of namespaces at once I will just call inet_twsk_purge once and save a lot of redundant unnecessary work. My simple 4k network namespace exit test the cleanup time dropped from roughly 8.2s to 1.6s. While the time spent running inet_twsk_purge fell to about 2ms. 1ms for ipv4 and 1ms for ipv6. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net: Use rcu lookups in inet_twsk_purge.Eric W. Biederman1-15/+24
While we are looking up entries to free there is no reason to take the lock in inet_twsk_purge. We have to drop locks and restart occassionally anyway so adding a few more in case we get on the wrong list because of a timewait move is no big deal. At the same time not taking the lock for long periods of time is much more polite to the rest of the users of the hash table. In my test configuration of killing 4k network namespaces this change causes 4k back to back runs of inet_twsk_purge on an empty hash table to go from roughly 20.7s to 3.3s, and the total time to destroy 4k network namespaces goes from roughly 44s to 3.3s. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net: Allow fib_rule_unregister to batchEric W. Biederman4-37/+55
Refactor the code so fib_rules_register always takes a template instead of the actual fib_rules_ops structure that will be used. This is required for network namespace support so 2 out of the 3 callers already do this, it allows the error handling to be made common, and it allows fib_rules_unregister to free the template for hte caller. Modify fib_rules_unregister to use call_rcu instead of syncrhonize_rcu to allw multiple namespaces to be cleaned up in the same rcu grace period. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03netns: Add an explicit rcu_barrier to unregister_pernet_{device|subsys}Eric W. Biederman1-2/+6
This allows namespace exit methods to batch work that comes requires an rcu barrier using call_rcu without having to treat the unregister_pernet_operations cases specially. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net: Allow xfrm_user_net_exit to batch efficiently.Eric W. Biederman1-8/+10
xfrm.nlsk is provided by the xfrm_user module and is access via rcu from other parts of the xfrm code. Add xfrm.nlsk_stash a copy of xfrm.nlsk that will never be set to NULL. This allows the synchronize_net and netlink_kernel_release to be deferred until a whole batch of xfrm.nlsk sockets have been set to NULL. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net: Move network device exit batchingEric W. Biederman2-24/+25
Move network device exit batching from a special case in net_namespace.c to using common mechanisms in dev.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net: Add support for batching network namespace cleanupsEric W. Biederman1-61/+61
- Add exit_list to struct net to support building lists of network namespaces to cleanup. - Add exit_batch to pernet_operations to allow running operations only once during a network namespace exit. Instead of once per network namespace. - Factor opt ops_exit_list and ops_exit_free so the logic with cleanup up a network namespace does not need to be duplicated. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03ipv4 05/05: add sysctl to accept packets with local source addressesPatrick McHardy2-4/+8
commit 8ec1e0ebe26087bfc5c0394ada5feb5758014fc8 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:16:35 2009 +0100 ipv4: add sysctl to accept packets with local source addresses Change fib_validate_source() to accept packets with a local source address when the "accept_local" sysctl is set for the incoming inet device. Combined with the previous patches, this allows to communicate between multiple local interfaces over the wire. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net 04/05: fib_rules: allow to delete local rulePatrick McHardy3-3/+3
commit d124356ce314fff22a047ea334379d5105b2d834 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:16:35 2009 +0100 net: fib_rules: allow to delete local rule Allow to delete the local rule and recreate it with a higher priority. This can be used to force packets with a local destination out on the wire instead of routing them to loopback. Additionally this patch allows to recreate rules with a priority of 0. Combined with the previous patch to allow oif classification, a socket can be bound to the desired interface and packets routed to the wire like this: # move local rule to lower priority ip rule add pref 1000 lookup local ip rule del pref 0 # route packets of sockets bound to eth0 to the wire independant # of the destination address ip rule add pref 100 oif eth0 lookup 100 ip route add default dev eth0 table 100 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net 03/05: fib_rules: add oif classificationPatrick McHardy1-1/+32
commit 68144d350f4f6c348659c825cde6a82b34c27a91 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:05:25 2009 +0100 net: fib_rules: add oif classification Support routing table lookup based on the flow's oif. This is useful to classify packets originating from sockets bound to interfaces differently. The route cache already includes the oif and needs no changes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03net 02/05: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAMEPatrick McHardy1-18/+18
commit 229e77eec406ad68662f18e49fda8b5d366768c5 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:05:23 2009 +0100 net: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME The next patch will add oif classification, rename interface related members and attributes to reflect that they're used for iif classification. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02tcp: sysctl_tcp_cookie_size needs to be exported to modules.David S. Miller1-0/+1
Otherwise: ERROR: "sysctl_tcp_cookie_size" [net/ipv6/ipv6.ko] undefined! make[1]: *** [__modpost] Error 1 Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02tcp: Fix warning on 64-bit.David S. Miller1-1/+1
net/ipv4/tcp_output.c: In function ‘tcp_make_synack’: net/ipv4/tcp_output.c:2488: warning: cast from pointer to integer of different size Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02net: Teach vlans to cleanup as a pernet subsystemEric W. Biederman1-5/+3
Take advantage of the fact that an explicit rtnl_kill_links is unnecessary (and skipping it improves batching), as network namespace exit calls dellink on all remaining virtual devices, and rtnl_link_unregister calls dellink on all outstanding devices in that network namespace. To do this we need to leave the vlan proc directories in place until after network device exit time, which is done by using register_pernet_subsys instead of register_pernet_device. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1g: Responder Cookie => InitiatorWilliam Allen Simpson7-43/+258
Parse incoming TCP_COOKIE option(s). Calculate <SYN,ACK> TCP_COOKIE option. Send optional <SYN,ACK> data. This is a significantly revised implementation of an earlier (year-old) patch that no longer applies cleanly, with permission of the original author (Adam Langley): http://thread.gmane.org/gmane.linux.network/102586 Requires: TCPCT part 1a: add request_values parameter for sending SYNACK TCPCT part 1b: generate Responder Cookie secret TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS TCPCT part 1d: define TCP cookie option, extend existing struct's TCPCT part 1e: implement socket option TCP_COOKIE_TRANSACTIONS TCPCT part 1f: Initiator Cookie => Responder Signed-off-by: William.Allen.Simpson@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1f: Initiator Cookie => ResponderWilliam Allen Simpson1-30/+163
Calculate and format <SYN> TCP_COOKIE option. This is a significantly revised implementation of an earlier (year-old) patch that no longer applies cleanly, with permission of the original author (Adam Langley): http://thread.gmane.org/gmane.linux.network/102586 Requires: TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS TCPCT part 1d: define TCP cookie option, extend existing struct's Signed-off-by: William.Allen.Simpson@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1e: implement socket option TCP_COOKIE_TRANSACTIONSWilliam Allen Simpson1-2/+131
Provide per socket control of the TCP cookie option and SYN/SYNACK data. This is a straightforward re-implementation of an earlier (year-old) patch that no longer applies cleanly, with permission of the original author (Adam Langley): http://thread.gmane.org/gmane.linux.network/102586 The principle difference is using a TCP option to carry the cookie nonce, instead of a user configured offset in the data. Allocations have been rearranged to avoid requiring GFP_ATOMIC. Requires: net: TCP_MSS_DEFAULT, TCP_MSS_DESIRED TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS TCPCT part 1d: define TCP cookie option, extend existing struct's Signed-off-by: William.Allen.Simpson@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1d: define TCP cookie option, extend existing struct'sWilliam Allen Simpson3-8/+71
Data structures are carefully composed to require minimal additions. For example, the struct tcp_options_received cookie_plus variable fits between existing 16-bit and 8-bit variables, requiring no additional space (taking alignment into consideration). There are no additions to tcp_request_sock, and only 1 pointer in tcp_sock. This is a significantly revised implementation of an earlier (year-old) patch that no longer applies cleanly, with permission of the original author (Adam Langley): http://thread.gmane.org/gmane.linux.network/102586 The principle difference is using a TCP option to carry the cookie nonce, instead of a user configured offset in the data. This is more flexible and less subject to user configuration error. Such a cookie option has been suggested for many years, and is also useful without SYN data, allowing several related concepts to use the same extension option. "Re: SYN floods (was: does history repeat itself?)", September 9, 1996. http://www.merit.net/mail.archives/nanog/1996-09/msg00235.html "Re: what a new TCP header might look like", May 12, 1998. ftp://ftp.isi.edu/end2end/end2end-interest-1998.mail These functions will also be used in subsequent patches that implement additional features. Requires: TCPCT part 1a: add request_values parameter for sending SYNACK TCPCT part 1b: generate Responder Cookie secret TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS Signed-off-by: William.Allen.Simpson@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONSWilliam Allen Simpson2-0/+11
Define sysctl (tcp_cookie_size) to turn on and off the cookie option default globally, instead of a compiled configuration option. Define per socket option (TCP_COOKIE_TRANSACTIONS) for setting constant data values, retrieving variable cookie values, and other facilities. Move inline tcp_clear_options() unchanged from net/tcp.h to linux/tcp.h, near its corresponding struct tcp_options_received (prior to changes). This is a straightforward re-implementation of an earlier (year-old) patch that no longer applies cleanly, with permission of the original author (Adam Langley): http://thread.gmane.org/gmane.linux.network/102586 These functions will also be used in subsequent patches that implement additional features. Requires: net: TCP_MSS_DEFAULT, TCP_MSS_DESIRED Signed-off-by: William.Allen.Simpson@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1b: generate Responder Cookie secretWilliam Allen Simpson1-0/+140
Define (missing) hash message size for SHA1. Define hashing size constants specific to TCP cookies. Add new function: tcp_cookie_generator(). Maintain global secret values for tcp_cookie_generator(). This is a significantly revised implementation of earlier (15-year-old) Photuris [RFC-2522] code for the KA9Q cooperative multitasking platform. Linux RCU technique appears to be well-suited to this application, though neither of the circular queue items are freed. These functions will also be used in subsequent patches that implement additional features. Signed-off-by: William.Allen.Simpson@gmail.com Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02TCPCT part 1a: add request_values parameter for sending SYNACKWilliam Allen Simpson8-31/+33
Add optional function parameters associated with sending SYNACK. These parameters are not needed after sending SYNACK, and are not used for retransmission. Avoids extending struct tcp_request_sock, and avoids allocating kernel memory. Also affects DCCP as it uses common struct request_sock_ops, but this parameter is currently reserved for future use. Signed-off-by: William.Allen.Simpson@gmail.com Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02skbuff: remove skb_dma_map/unmapAlexander Duyck2-66/+0
The two functions skb_dma_map/unmap are unsafe to use as they cause problems when packets are cloned and sent to multiple devices while a HW IOMMU is enabled. Due to this it is best to remove the code so it is not used by any other network driver maintainters. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02net: compat_sys_recvmmsg user timespec arg can be NULLJean-Mickael Guerin1-2/+6
We must test if user timespec is non-NULL before copying from userpace, same as sys_recvmmsg(). Commiter note: changed it so that we have just one branch. Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02net: compat_mmsghdr must be used in sys_recvmmsgJean-Mickael Guerin1-6/+18
Both to traverse the entries and to set the msg_len field. Commiter note: folded two patches and avoided one branch repeating the compat test. Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02sctp: fix sctp_setsockopt_autoclose compile warningAndrei Pelinescu-Onciul1-1/+1
Fix the following warning, when building on 64 bits: net/sctp/socket.c:2091: warning: large integer implicitly truncated to unsigned type Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify ipip6 aka sit pernet operations.Eric W. Biederman1-19/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify ip6_tunnel pernet operations.Eric W. Biederman1-19/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify ipip pernet operations.Eric W. Biederman1-18/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify ip_gre pernet operations.Eric W. Biederman1-18/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify phonet pernet operations.Eric W. Biederman1-10/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify conntrack_proto_gre pernet operations.Eric W. Biederman1-14/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify conntrack_proto_dccp pernet operations.Eric W. Biederman1-23/+8
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify af_key pernet operations.Eric W. Biederman1-19/+6
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplify vlan pernet operations.Eric W. Biederman1-26/+7
Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Simplfy default_device_exit and improve batching.Eric W. Biederman1-10/+6
- Defer dellink to net_cleanup() allowing for batching. - Fix comment. - Use for_each_netdev_safe again as dev_change_net_namespace touches at most one network device (unlike veth dellink). Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Automatically allocate per namespace data.Eric W. Biederman1-86/+102
To get the full benefit of batched network namespace cleanup netowrk device deletion needs to be performed by the generic code. When using register_pernet_gen_device and freeing the data in exit_net it is impossible to delay allocation until after exit_net has called as the device uninit methods are no longer safe. To correct this, and to simplify working with per network namespace data I have moved allocation and deletion of per network namespace data into the network namespace core. The core now frees the data only after all of the network namespace exit routines have run. Now it is only required to set the new fields .id and .size in the pernet_operations structure if you want network namespace data to be managed for you automatically. This makes the current register_pernet_gen_device and register_pernet_gen_subsys routines unnecessary. For the moment I have left them as compatibility wrappers in net_namespace.h They will be removed once all of the users have been updated. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: Batch network namespace destruction.Eric W. Biederman1-8/+58
It is fairly common to kill several network namespaces at once. Either because they are nested one inside the other or because they are cooperating in multiple machine networking experiments. As the network stack control logic does not parallelize easily batch up multiple network namespaces existing together. To get the full benefit of batching the virtual network devices to be removed must be all removed in one batch. For that purpose I have added a loop after the last network device operations have run that batches up all remaining network devices and deletes them. An extra benefit is that the reorganization slightly shrinks the size of the per network namespace data structures replaceing a work_struct with a list_head. In a trivial test with 4K namespaces this change reduced the cost of a destroying 4K namespaces from 7+ minutes (at 12% cpu) to 44 seconds (at 60% cpu). The bulk of that 44s was spent in inet_twsk_purge. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01net: NETDEV_UNREGISTER_PERNET -> NETDEV_UNREGISTER_BATCHEric W. Biederman3-28/+18
The motivation for an additional notifier in batched netdevice notification (rt_do_flush) only needs to be called once per batch not once per namespace. For further batching improvements I need a guarantee that the netdevices are unregistered in order allowing me to unregister an all of the network devices in a network namespace at the same time with the guarantee that the loopback device is really and truly unregistered last. Additionally it appears that we moved the route cache flush after the final synchronize_net, which seems wrong and there was no explanation. So I have restored the original location of the final synchronize_net. Cc: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01ip_fragment: also adjust skb->truesize for packets not owned by a socketPatrick McHardy1-1/+1
When a large packet gets reassembled by ip_defrag(), the head skb accounts for all the fragments in skb->truesize. If this packet is refragmented again, skb->truesize is not re-adjusted to reflect only the head size since its not owned by a socket. If the head fragment then gets recycled and reused for another received fragment, it might exceed the defragmentation limits due to its large truesize value. skb_recycle_check() explicitly checks for linear skbs, so any recycled skb should reflect its true size in skb->truesize. Change ip_fragment() to also adjust the truesize value of skbs not owned by a socket. Reported-and-tested-by: Ben Menchaca <ben@bigfootnetworks.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01ipsec: can not add camellia cipher algorithm when using "ip xfrm state" commandLi Yewang1-0/+1
can not add camellia cipher algorithm when using "ip xfrm state" command. Signed-off-by: Li Yewang <lyw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-30tcp: tcp_disconnect() should clear window_clampEric Dumazet1-0/+1
NFS can reuse its TCP socket after calling tcp_disconnect(). We noticed window scaling was not negotiated in SYN packet of next connection request. Fix is to clear tp->window_clamp in tcp_disconnect(). Reported-by: Krzysztof Oledzki <ole@ans.pl> Tested-by: Krzysztof Oledzki <ole@ans.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-30mac80211: fix spurious delBA handlingJohannes Berg3-13/+12
Lennert Buytenhek noticed that delBA handling in mac80211 was broken and has remotely triggerable problems, some of which are due to some code shuffling I did that ended up changing the order in which things were done -- this was commit d75636ef9c1af224f1097941879d5a8db7cd04e5 Author: Johannes Berg <johannes@sipsolutions.net> Date: Tue Feb 10 21:25:53 2009 +0100 mac80211: RX aggregation: clean up stop session and other parts were already present in the original commit d92684e66091c0f0101819619b315b4bb8b5bcc5 Author: Ron Rindjunsky <ron.rindjunsky@intel.com> Date: Mon Jan 28 14:07:22 2008 +0200 mac80211: A-MPDU Tx add delBA from recipient support The first problem is that I moved a BUG_ON before various checks -- thereby making it possible to hit. As the comment indicates, the BUG_ON can be removed since the ampdu_action callback must already exist when the state is != IDLE. The second problem isn't easily exploitable but there's a race condition due to unconditionally setting the state to OPERATIONAL when a delBA frame is received, even when no aggregation session was ever initiated. All the drivers accept stopping the session even then, but that opens a race window where crashes could happen before the driver accepts it. Right now, a WARN_ON may happen with non-HT drivers, while the race opens only for HT drivers. For this case, there are two things necessary to fix it: 1) don't process spurious delBA frames, and be more careful about the session state; don't drop the lock 2) HT drivers need to be prepared to handle a session stop even before the session was really started -- this is true for all drivers (that support aggregation) but iwlwifi which can be fixed easily. The other HT drivers (ath9k and ar9170) are behaving properly already. Reported-by: Lennert Buytenhek <buytenh@marvell.com> Cc: stable@kernel.org Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-30mac80211: fix two remote exploitsJohannes Berg2-5/+1
Lennert Buytenhek noticed a remotely triggerable problem in mac80211, which is due to some code shuffling I did that ended up changing the order in which things were done -- this was in commit d75636ef9c1af224f1097941879d5a8db7cd04e5 Author: Johannes Berg <johannes@sipsolutions.net> Date: Tue Feb 10 21:25:53 2009 +0100 mac80211: RX aggregation: clean up stop session The problem is that the BUG_ON moved before the various checks, and as such can be triggered. As the comment indicates, the BUG_ON can be removed since the ampdu_action callback must already exist when the state is OPERATIONAL. A similar code path leads to a WARN_ON in ieee80211_stop_tx_ba_session, which can also be removed. Cc: stable@kernel.org [2.6.29+] Cc: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-29ipv4: additional update of dev_net(dev) to struct *net in ip_fragment.c, NULL ptr OOPSDavid Ford1-1/+1
ipv4 ip_frag_reasm(), fully replace 'dev_net(dev)' with 'net', defined previously patched into 2.6.29. Between 2.6.28.10 and 2.6.29, net/ipv4/ip_fragment.c was patched, changing from dev_net(dev) to container_of(...). Unfortunately the goto section (out_fail) on oversized packets inside ip_frag_reasm() didn't get touched up as well. Oversized IP packets cause a NULL pointer dereference and immediate hang. I discovered this running openvasd and my previous email on this is titled: NULL pointer dereference at 2.6.32-rc8:net/ipv4/ip_fragment.c:566 Signed-off-by: David Ford <david@blue-labs.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29net: Move && and || to end of previous lineJoe Perches63-233/+235
Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29pktgen: NUMA awareEric Dumazet1-3/+6
pktgen threads are bound to given CPU, we can allocate memory for these threads in a NUMA aware way. After a pktgen session on two threads, we can check flows memory was allocated on right node, instead of a not related one. # grep pktgen_thread_write /proc/vmallocinfo 0xffffc90007204000-0xffffc90007385000 1576960 pktgen_thread_write+0x3a4/0x6b0 [pktgen] pages=384 vmalloc N0=384 0xffffc90007386000-0xffffc90007507000 1576960 pktgen_thread_write+0x3a4/0x6b0 [pktgen] pages=384 vmalloc N1=384 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29X25: Fix oops and refcnt problems from x25_dev_getandrew hendry1-1/+3
Calls to x25_dev_get check for dev = NULL which was not set. It allowed x25 to set routes and ioctls on down interfaces. This caused oopses and refcnt problems on device_unregister. Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29X25: Check for errors in x25_initandrew hendry1-3/+16
Adds error checking to x25_init. Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29X25: Move SYSCTL ifdefs into headerandrew hendry1-4/+0
Moves the CONFIG_SYSCTL ifdefs in x25_init into header. Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29sctp: on T3_RTX retransmit all the in-flight chunksAndrei Pelinescu-Onciul3-14/+2
When retransmitting due to T3 timeout, retransmit all the in-flight chunks for the corresponding transport/path, including chunks sent less then 1 rto ago. This is the correct behaviour according to rfc4960 section 6.3.3 E3 and "Note: Any DATA chunks that were sent to the address for which the T3-rtx timer expired but did not fit in one MTU (rule E3 above) should be marked for retransmission and sent as soon as cwnd allows (normally, when a SACK arrives). ". This fixes problems when more then one path is present and the T3 retransmission of the first chunk that timeouts stops the T3 timer for the initial active path, leaving all the other in-flight chunks waiting forever or until a new chunk is transmitted on the same path and timeouts (and this will happen only if the cwnd allows sending new chunks, but since cwnd was dropped to MTU by the timeout => it will wait until the first heartbeat). Example: 10 packets in flight, sent at 0.1 s intervals on the primary path. The primary path is down and the first packet timeouts. The first packet is retransmitted on another path, the T3 timer for the primary path is stopped and cwnd is set to MTU. All the other 9 in-flight packets will not be retransmitted (unless more new packets are sent on the primary path which depend on cwnd allowing it, and even in this case the 9 packets will be retransmitted only after a new packet timeouts which even in the best case would be more then RTO). This commit reverts d0ce92910bc04e107b2f3f2048f07e94f570035d and also removes the now unused transport->last_rto, introduced in b6157d8e03e1e780660a328f7183bcbfa4a93a19. p.s The problem is not only when multiple paths are there. It can happen in a single homed environment. If the application stops sending data, it possible to have a hung association. Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>