aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2015-04-13Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsDavid S. Miller10-111/+196
Al Viro says: ==================== netdev-related stuff in vfs.git There are several commits sitting in vfs.git that probably ought to go in via net-next.git. First of all, there's merge with vfs.git#iocb - that's Christoph's aio rework, which has triggered conflicts with the ->sendmsg() and ->recvmsg() patches a while ago. It's not so much Christoph's stuff that ought to be in net-next, as (pretty simple) conflict resolution on merge. The next chunk is switch to {compat_,}import_iovec/import_single_range - new safer primitives for initializing iov_iter. The primitives themselves come from vfs/git#iov_iter (and they are used quite a lot in vfs part of queue), conversion of net/socket.c syscalls belongs in net-next, IMO. Next there's afs and rxrpc stuff from dhowells. And then there's sanitizing kernel_sendmsg et.al. + missing inlined helper for "how much data is left in msg->msg_iter" - this stuff is used in e.g. cifs stuff, but it belongs in net-next. That pile is pullable from git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem I'll post the individual patches in there in followups; could you take a look and tell if everything in there is OK with you? ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13tcp/dccp: get rid of central timewait timerEric Dumazet10-288/+60
Using a timer wheel for timewait sockets was nice ~15 years ago when memory was expensive and machines had a single processor. This does not scale, code is ugly and source of huge latencies (Typically 30 ms have been seen, cpus spinning on death_lock spinlock.) We can afford to use an extra 64 bytes per timewait sock and spread timewait load to all cpus to have better behavior. Tested: On following test, /proc/sys/net/ipv4/tcp_tw_recycle is set to 1 on the target (lpaa24) Before patch : lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 419594 lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 437171 While test is running, we can observe 25 or even 33 ms latencies. lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23 ... 1000 packets transmitted, 1000 received, 0% packet loss, time 20601ms rtt min/avg/max/mdev = 0.020/0.217/25.771/1.535 ms, pipe 2 lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23 ... 1000 packets transmitted, 1000 received, 0% packet loss, time 20702ms rtt min/avg/max/mdev = 0.019/0.183/33.761/1.441 ms, pipe 2 After patch : About 90% increase of throughput : lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 810442 lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 800992 And latencies are kept to minimal values during this load, even if network utilization is 90% higher : lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23 ... 1000 packets transmitted, 1000 received, 0% packet loss, time 19991ms rtt min/avg/max/mdev = 0.023/0.064/0.360/0.042 ms Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13netfilter: Fix format string of nfnetlink_log proc fileRichard Weinberger1-1/+1
The printed values are all of type unsigned integer, therefore use %u instead of %d. Otherwise an user can face negative values. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13netfilter: Fix format string of nfnetlink_queue proc fileRichard Weinberger1-1/+1
The printed values are all of type unsigned integer, therefore use %u instead of %d. Otherwise an user can face negative values. Fixes: $ cat /proc/net/netfilter/nfnetlink_queue 0 29508 278 2 65531 0 2004213241 -2129885586 1 1 -27747 0 2 65531 0 0 0 1 2 -27748 0 2 65531 0 0 0 1 Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13netfilter: Fix portid typesRichard Weinberger2-6/+5
The netlink portid is an unsigned integer, use this type also in netfilter. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13nfc: Fix portid type in urelease_workRichard Weinberger1-1/+1
portid is an unsigned integer. Fix urelease_work to match all other portid user in the kernel. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13tcp: fix bogus RTT for CC when retransmissions are ackedKenneth Klette Jonassen1-6/+4
Since retransmitted segments are not used for RTT estimation, previously SACKed segments present in the rtx queue are used. This estimation can be several times larger than the actual RTT. When a cumulative ack covers both previously SACKed and retransmitted segments, CC may thus get a bogus RTT. Such segments previously had an RTT estimation in tcp_sacktag_one(), so it seems reasonable to not reuse them in tcp_clean_rtx_queue() at all. Afaik, this has had no effect on SRTT/RTO because of Karn's check. Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13net: use jump label patching for ingress qdisc in __netif_receive_skb_coreDaniel Borkmann2-7/+33
Even if we make use of classifier and actions from the egress path, we're going into handle_ing() executing additional code on a per-packet cost for ingress qdisc, just to realize that nothing is attached on ingress. Instead, this can just be blinded out as a no-op entirely with the use of a static key. On input fast-path, we already make use of static keys in various places, e.g. skb time stamping, in RPS, etc. It makes sense to not waste time when we're assured that no ingress qdisc is attached anywhere. Enabling/disabling of that code path is being done via two helpers, namely net_{inc,dec}_ingress_queue(), that are being invoked under RTNL mutex when a ingress qdisc is being either initialized or destructed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller1-6/+26
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-04-11 This series contains updates to iflink, ixgbe and ixgbevf. The entire set of changes come from Vlad Zolotarov to ultimately add the ethtool ops to VF driver to allow querying the RSS indirection table and RSS random key. Currently we support only 82599 and x540 devices. On those devices, VFs share the RSS redirection table and hash key with a PF. Letting the VF query this information may introduce some security risks, therefore this feature will be disabled by default. The new netdev op allows a system administrator to change the default behaviour with "ip link set" command. The relevant iproute2 patch has already been sent and awaits for this series upstream. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: implement FOU_CMD_GETWANG Cong1-0/+109
Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: add network namespace supportWANG Cong1-39/+67
Also convert the spinlock to a mutex. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: always use be16 for portWANG Cong1-3/+3
udp_config.local_udp_port is be16. And iproute2 passes network order for FOU_ATTR_PORT. This doesn't fix any bug, just for consistency. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: exit early when parsing config failsWANG Cong1-1/+4
Not a big deal, just for corretness. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: avoid calling udp_del_offload() twiceWANG Cong1-2/+2
This fixes the following harmless warning: ./ip/ip fou del port 7777 [ 122.907516] udp_del_offload: didn't find offload for port 7777 Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12tcp: do not cache align timewait socketsEric Dumazet1-2/+1
With recent adoption of skc_cookie in struct sock_common, struct tcp_timewait_sock size increased from 192 to 200 bytes on 64bit arches. SLAB rounds then to 256 bytes. It is time to drop SLAB_HWCACHE_ALIGN constraint for twsk_slab. This saves about 12 MB of memory on typical configuration reaching 262144 timewait sockets, and has no noticeable impact on performance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12Merge tag 'mac80211-next-for-davem-2015-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-nextDavid S. Miller25-383/+1113
Johannes Berg says: ==================== There isn't much left, but we have * new mac80211 internal software queue to allow drivers to have shorter hardware queues and pull on-demand * use rhashtable for mac80211 station table * minstrel rate control debug improvements and some refactoring * fix noisy message about TX power reduction * fix continuous message printing and activity if CRDA doesn't respond * fix VHT-related capabilities with "iw connect" or "iwconfig ..." * fix Kconfig for cfg80211 wireless extensions compatibility ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-11new helper: msg_data_left()Al Viro4-17/+16
convert open-coded instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11Merge remote-tracking branch 'dh/afs' into for-davemAl Viro4-29/+148
2015-04-11get rid of the size argument of sock_sendmsg()Al Viro2-14/+15
it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-10if_link: Add an additional parameter to ifla_vf_info for RSS queryingVlad Zolotarov1-6/+26
Add configuration setting for drivers to allow/block an RSS Redirection Table and a Hash Key querying for discrete VFs. On some devices VF share the mentioned above information with PF and querying it may adduce a theoretical security risk. We want to let a system administrator to decide if he/she wants to take this risk or not. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-10rtnetlink: Mark name argument of rtnl_create_link() constThomas Graf1-1/+1
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextDavid S. Miller3-151/+258
Johan Hedberg says: ==================== pull request: bluetooth-next 2015-04-09 We've had enough new patches during the past week (especially from Marcel) that it'd be good to still get these queued for 4.1. The majority of the changes are from Marcel with lots of cleanup & refactoring patches for the HCI UART driver. Marcel also split out some Broadcom & Intel vendor specific functionality into two new btintel & btbcm modules. In addition to the HCI driver changes there's the completion of our local OOB data interface for pairing, added support for requesting remote LE features when connecting, as well as a couple of minor fixes for mac802154. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09tcp: md5: fix a typo in tcp_v4_md5_lookup()Eric Dumazet1-2/+2
Lookup key for tcp_md5_do_lookup() has to be taken from addr_sk, not sk (which can be the listener) Fixes: fd3a154a00fb ("tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09net: Pass VLAN ID to rtnl_fdb_notify.Hubert Sokolowski1-10/+10
When an FDB entry is added or deleted the information about VLAN is not passed to listening applications like 'bridge monitor fdb'. With this patch VLAN ID is passed if it was set in the original netlink message. Also remove an unused bdev variable. Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller18-173/+747
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. They are: * nf_tables set timeout infrastructure from Patrick Mchardy. 1) Add support for set timeout support. 2) Add support for set element timeouts using the new set extension infrastructure. 4) Add garbage collection helper functions to get rid of stale elements. Elements are accumulated in a batch that are asynchronously released via RCU when the batch is full. 5) Add garbage collection synchronization helpers. This introduces a new element busy bit to address concurrent access from the netlink API and the garbage collector. 5) Add timeout support for the nft_hash set implementation. The garbage collector peridically checks for stale elements from the workqueue. * iptables/nftables cgroup fixes: 6) Ignore non full-socket objects from the input path, otherwise cgroup match may crash, from Daniel Borkmann. 7) Fix cgroup in nf_tables. 8) Save some cycles from xt_socket by skipping packet header parsing when skb->sk is already set because of early demux. Also from Daniel. * br_netfilter updates from Florian Westphal. 9) Save frag_max_size and restore it from the forward path too. 10) Use a per-cpu area to restore the original source MAC address when traffic is DNAT'ed. 11) Add helper functions to access physical devices. 12) Use these new physdev helper function from xt_physdev. 13) Add another nf_bridge_info_get() helper function to fetch the br_netfilter state information. 14) Annotate original layer 2 protocol number in nf_bridge info, instead of using kludgy flags. 15) Also annotate the pkttype mangling when the packet travels back and forth from the IP to the bridge layer, instead of using a flag. * More nf_tables set enhancement from Patrick: 16) Fix possible usage of set variant that doesn't support timeouts. 17) Avoid spurious "set is full" errors from Netlink API when there are pending stale elements scheduled to be released. 18) Restrict loop checks to set maps. 19) Add support for dynamic set updates from the packet path. 20) Add support to store optional user data (eg. comments) per set element. BTW, I have also pulled net-next into nf-next to anticipate the conflict resolution between your okfn() signature changes and Florian's br_netfilter updates. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller1-0/+5
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2015-04-09 1) Prohibit the use/abuse of the xfrm netlink interface on 32/64 bit compatibility tasks. We need a full compat layer before we can allow this. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09Bluetooth: Read LE remote features during connection establishmentMarcel Holtmann1-2/+105
When establishing a Bluetooth LE connection, read the remote used features mask to determine which features are supported. This was not really needed with Bluetooth 4.0, but since Bluetooth 4.1 and also 4.2 have introduced new optional features, this becomes more important. This works the same as with BR/EDR where the connection enters the BT_CONFIG stage and hci_connect_cfm call is delayed until the remote features have been retrieved. Only after successfully receiving the remote features, the connection enters the BT_CONNECTED state. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-09switch kernel_sendmsg() and kernel_recvmsg() to iov_iter_kvec()Al Viro1-17/+3
For kernel_sendmsg() that eliminates the need to play with setfs(); for kernel_recvmsg() it does *not* - a couple of callers are using it with non-NULL ->msg_control, which would be treated as userland address on recvmsg side of things. In all cases we are really setting a kvec-backed iov_iter, though. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09net: switch importing msghdr from userland to {compat_,}import_iovec()Al Viro2-30/+19
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09net: switch sendto() and recvfrom() to import_single_range()Al Viro1-16/+8
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09Merge branch 'iocb' into for-davemAl Viro2-4/+3
trivial conflict in net/socket.c and non-trivial one in crypto - that one had evaded aio_complete() removal. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-08tcp: do not rearm rsk_timer on FastOpen requestsEric Dumazet1-4/+10
FastOpen requests are not like other regular request sockets. They do not yet use rsk_timer : tcp_fastopen_queue_check() simply manually removes one expired request from fastopenq->rskq_rst list. Therefore, tcp_check_req() must not call mod_timer_pending(), otherwise we crash because rsk_timer was not initialized. Fixes: fa76ce7328b ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08Merge tag 'nfc-next-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-nextDavid S. Miller1-0/+11
Samuel Ortiz says: ==================== NFC: 4.1 pull request This is the NFC pull request for 4.1. This is a shorter one than usual, as the Intel Field Peak NFC driver could not make it in time. We have: - A new driver for NXP NCI based chipsets, like e.g. the NPC100 or the PN7150. It currently only supports an i2c physical layer, but could easily be extended to work on top of e.g. SPI. This driver also includes support for user space triggered firmware updates. - A few minor st21nfc[ab] fixes, cleanups, and comments improvements. - A pn533 error return fix. - A few NFC related logs formatting cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08netfilter: Fix switch statement warnings with recent gcc.David Miller5-3/+17
More recent GCC warns about two kinds of switch statement uses: 1) Switching on an enumeration, but not having an explicit case statement for all members of the enumeration. To show the compiler this is intentional, we simply add a default case with nothing more than a break statement. 2) Switching on a boolean value. I think this warning is dumb but nevertheless you get it wholesale with -Wswitch. This patch cures all such warnings in netfilter. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextPablo Neira Ayuso171-1639/+2047
Resolve conflicts between 5888b93 ("Merge branch 'nf-hook-compress'") and Florian Westphal br_netfilter works. Conflicts: net/bridge/br_netfilter.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08ipv6: call iptunnel_xmit with NULL sock pointer if no tunnel sock is availableHannes Frederic Sowa1-1/+1
Fixes: 79b16aadea32cce ("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb().") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08ipv4: ip_tunnel: use net namespace from rtable not socketHannes Frederic Sowa1-1/+2
The socket parameter might legally be NULL, thus sock_net is sometimes causing a NULL pointer dereference. Using net_device pointer in dst_entry is more reliable. Fixes: b6a7719aedd7e5c ("ipv4: hash net ptr into fragmentation bucket selection") Reported-by: Rick Jones <rick.jones2@hp.com> Cc: Rick Jones <rick.jones2@hp.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08netfilter: nf_tables: support optional userdata for set elementsPatrick McHardy1-0/+34
Add an userdata set extension and allow the user to attach arbitrary data to set elements. This is intended to hold TLV encoded data like comments or DNS annotations that have no meaning to the kernel. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: add support for dynamic set updatesPatrick McHardy5-6/+268
Add a new "dynset" expression for dynamic set updates. A new set op ->update() is added which, for non existant elements, invokes an initialization callback and inserts the new element. For both new or existing elements the extenstion pointer is returned to the caller to optionally perform timer updates or other actions. Element removal is not supported so far, however that seems to be a rather exotic need and can be added later on. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: support different set binding typesPatrick McHardy2-3/+10
Currently a set binding is assumed to be related to a lookup and, in case of maps, a data load. In order to use bindings for set updates, the loop detection checks must be restricted to map operations only. Add a flags member to the binding struct to hold the set "action" flags such as NFT_SET_MAP, and perform loop detection based on these. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: prepare set element accounting for async updatesPatrick McHardy2-10/+14
Use atomic operations for the element count to avoid races with async updates. To properly handle the transactional semantics during netlink updates, deleted but not yet committed elements are accounted for seperately and are treated as being already removed. This means for the duration of a netlink transaction, the limit might be exceeded by the amount of elements deleted. Set implementations must be prepared to handle this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: fix set selection when timeouts are requestedPatrick McHardy1-1/+1
The NFT_SET_TIMEOUT flag is ignore in nft_select_set_ops, which may lead to selection of a set implementation that doesn't actually support timeouts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: make BRNF_PKT_TYPE flag a boolFlorian Westphal1-9/+9
nf_bridge_info->mask is used for several things, for example to remember if skb->pkt_type was set to OTHER_HOST. For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter though -- routing expects PACKET_HOST. Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook invocation and then un-does it after hook traversal. This information is irrelevant outside of br_netfilter. After this change, ->mask now only contains flags that need to be known outside of br_netfilter in fast-path. Future patch changes mask into a 2bit state field in sk_buff, so that we can remove skb->nf_bridge pointer for good and consider all remaining places that access nf_bridge info content a not-so fastpath. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: start splitting mask into public/private chunksFlorian Westphal1-4/+11
->mask is a bit info field that mixes various use cases. In particular, we have flags that are mutually exlusive, and flags that are only used within br_netfilter while others need to be exposed to other parts of the kernel. Remove BRNF_8021Q/PPPoE flags. They're mutually exclusive and only needed within br_netfilter context. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: add and use nf_bridge_info_get helperFlorian Westphal1-8/+16
Don't access skb->nf_bridge directly, this pointer will be removed soon. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: physdev: use helpersFlorian Westphal1-12/+22
Avoid skb->nf_bridge accesses where possible. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: add helpers for fetching physin/outdevFlorian Westphal7-33/+75
right now we store this in the nf_bridge_info struct, accessible via skb->nf_bridge. This patch prepares removal of this pointer from skb: Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out device (or ifindexes). Followup patches to netfilter will then allow nf_bridge_info to be obtained by a call into the br_netfilter core, rather than keeping a pointer to it in sk_buff. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: don't use nf_bridge_info data to store mac headerFlorian Westphal1-29/+41
br_netfilter maintains an extra state, nf_bridge_info, which is attached to skb via skb->nf_bridge pointer. Amongst other things we use skb->nf_bridge->data to store the original mac header for every processed skb. This is required for ip refragmentation when using conntrack on top of bridge, because ip_fragment doesn't copy it from original skb. However there is no need anymore to do this unconditionally. Move this to the one place where its needed -- when br_netfilter calls ip_fragment(). Also switch to percpu storage for this so we can handle fragmenting without accessing nf_bridge meta data. Only user left is neigh resolution when DNAT is detected, to hold the original source mac address (neigh resolution builds new mac header using bridge mac), so rename ->data and reduce its size to whats needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: x_tables: don't extract flow keys on early demuxed sks in socket matchDaniel Borkmann1-45/+50
Currently in xt_socket, we take advantage of early demuxed sockets since commit 00028aa37098 ("netfilter: xt_socket: use IP early demux") in order to avoid a second socket lookup in the fast path, but we only make partial use of this: We still unnecessarily parse headers, extract proto, {s,d}addr and {s,d}ports from the skb data, accessing possible conntrack information, etc even though we were not even calling into the socket lookup via xt_socket_get_sock_{v4,v6}() due to skb->sk hit, meaning those cycles can be spared. After this patch, we only proceed the slower, manual lookup path when we have a skb->sk miss, thus time to match verdict for early demuxed sockets will improve further, which might be i.e. interesting for use cases such as mentioned in 681f130f39e1 ("netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08cfg80211: don't allow disabling WEXT if it's requiredJohannes Berg1-1/+1
The change to only export WEXT symbols when required could break the build if CONFIG_CFG80211_WEXT was explicitly disabled while a driver like orinoco selected it. Fix this by hiding the symbol when it's required so it can't be disabled in that case. Fixes: 2afe38d15cee ("cfg80211-wext: export symbols only when needed") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>