aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-04-11rxrpc: Move some miscellaneous bits out into their own fileDavid Howells5-84/+106
Move some miscellaneous bits out into their own file to make it easier to split the call handling. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Disable a debugging statement that has been left enabled.David Howells1-1/+1
Disable a debugging statement that has been left enabled Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: do not pull udp headers on receiveWillem de Bruijn1-2/+2
Commit e6afc8ace6dd modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Rxrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11sunrpc: do not pull udp headers on receiveWillem de Bruijn3-7/+5
Commit e6afc8ace6dd modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Reported-by: Franklin S Cooper Jr. <fcooper@ti.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11net: ipv4: Consider failed nexthops in multipath routesDavid Ahern2-5/+40
Multipath route lookups should consider knowledge about next hops and not select a hop that is known to be failed. Example: [h2] [h3] 15.0.0.5 | | 3| 3| [SP1] [SP2]--+ 1 2 1 2 | | /-------------+ | | \ / | | X | | / \ | | / \---------------\ | 1 2 1 2 12.0.0.2 [TOR1] 3-----------------3 [TOR2] 12.0.0.3 4 4 \ / \ / \ / -------| |-----/ 1 2 [TOR3] 3| | [h1] 12.0.0.1 host h1 with IP 12.0.0.1 has 2 paths to host h3 at 15.0.0.5: root@h1:~# ip ro ls ... 12.0.0.0/24 dev swp1 proto kernel scope link src 12.0.0.1 15.0.0.0/16 nexthop via 12.0.0.2 dev swp1 weight 1 nexthop via 12.0.0.3 dev swp1 weight 1 ... If the link between tor3 and tor1 is down and the link between tor1 and tor2 then tor1 is effectively cut-off from h1. Yet the route lookups in h1 are alternating between the 2 routes: ping 15.0.0.5 gets one and ssh 15.0.0.5 gets the other. Connections that attempt to use the 12.0.0.2 nexthop fail since that neighbor is not reachable: root@h1:~# ip neigh show ... 12.0.0.3 dev swp1 lladdr 00:02:00:00:00:1b REACHABLE 12.0.0.2 dev swp1 FAILED ... The failed path can be avoided by considering known neighbor information when selecting next hops. If the neighbor lookup fails we have no knowledge about the nexthop, so give it a shot. If there is an entry then only select the nexthop if the state is sane. This is similar to what fib_detect_death does. To maintain backward compatibility use of the neighbor information is based on a new sysctl, fib_multipath_use_neigh. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller34-110/+203
2016-04-09ipv6: fix inet6_lookup_listener()Eric Dumazet1-2/+2
A stupid refactoring bug in inet6_lookup_listener() needs to be fixed in order to get proper SO_REUSEPORT behavior. Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds24-51/+144
Pull networking fixes from David Miller: 1) Stale SKB data pointer access across pskb_may_pull() calls in L2TP, from Haishuang Yan. 2) Fix multicast frame handling in mac80211 AP code, from Felix Fietkau. 3) mac80211 station hashtable insert errors not handled properly, fix from Johannes Berg. 4) Fix TX descriptor count limit handling in e1000, from Alexander Duyck. 5) Revert a buggy netdev refcount fix in netpoll, from Bjorn Helgaas. 6) Must assign rtnl_link_ops of the device before registering it, fix in ip6_tunnel from Thadeu Lima de Souza Cascardo. 7) Memory leak fix in tc action net exit, from WANG Cong. 8) Add missing AF_KCM entries to name tables, from Dexuan Cui. 9) Fix regression in GRE handling of csums wrt. FOU, from Alexander Duyck. 10) Fix memory allocation alignment and congestion map corruption in RDS, from Shamir Rabinovitch. 11) Fix default qdisc regression in tuntap driver, from Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) bridge, netem: mark mailing lists as moderated tuntap: restore default qdisc mpls: find_outdev: check for err ptr in addition to NULL check ipv6: Count in extension headers in skb->network_header RDS: fix congestion map corruption for PAGE_SIZE > 4k RDS: memory allocated must be align to 8 GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU net: add the AF_KCM entries to family name tables MAINTAINERS: intel-wired-lan list is moderated lib/test_bpf: Add additional BPF_ADD tests lib/test_bpf: Add test to check for result of 32-bit add that overflows lib/test_bpf: Add tests for unsigned BPF_JGT lib/test_bpf: Fix JMP_JSET tests VSOCK: Detach QP check should filter out non matching QPs. stmmac: fix adjust link call in case of a switch is attached af_packet: tone down the Tx-ring unsupported spew. net_sched: fix a memory leak in tc action samples/bpf: Enable powerpc support samples/bpf: Use llc in PATH, rather than a hardcoded value samples/bpf: Fix build breakage with map_perf_test_user.c ...
2016-04-08net: dsa: make the VLAN add function return voidVivien Didelot1-8/+3
The switchdev design implies that a software error should not happen in the commit phase since it must have been previously reported in the prepare phase. If an hardware error occurs during the commit phase, there is nothing switchdev can do about it. The DSA layer separates port_vlan_prepare and port_vlan_add for simplicity and convenience. If an hardware error occurs during the commit phase, there is no need to report it outside the driver itself. Make the DSA port_vlan_add routine return void for explicitness. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08net: dsa: make the FDB add function return voidVivien Didelot1-8/+8
The switchdev design implies that a software error should not happen in the commit phase since it must have been previously reported in the prepare phase. If an hardware error occurs during the commit phase, there is nothing switchdev can do about it. The DSA layer separates port_fdb_prepare and port_fdb_add for simplicity and convenience. If an hardware error occurs during the commit phase, there is no need to report it outside the DSA driver itself. Make the DSA port_fdb_add routine return void for explicitness. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08net: dsa: make the STP state function return voidVivien Didelot1-17/+15
The DSA layer doesn't care about the return code of the port_stp_update routine, so make it void in the layer and the DSA drivers. Replace the useless dsa_slave_stp_update function with a dsa_slave_stp_state function used to reply to the switchdev SWITCHDEV_ATTR_ID_PORT_STP_STATE attribute. In the meantime, rename port_stp_update to port_stp_state_set to explicit the state change. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08Merge tag 'mac80211-next-for-davem-2016-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-nextDavid S. Miller39-984/+1809
Johannes Berg says: ==================== For the 4.7 cycle, we have a number of changes: * Bob's mesh mode rhashtable conversion, this includes the rhashtable API change for allocation flags * BSSID scan, connect() command reassoc support (Jouni) * fast (optimised data only) and support for RSS in mac80211 (myself) * various smaller changes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08Merge tag 'mac80211-for-davem-2016-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211David S. Miller8-25/+86
Johannes Berg says: ==================== For the current RC series, we have the following fixes: * TDLS fixes from Arik and Ilan * rhashtable fixes from Ben and myself * documentation fixes from Luis * U-APSD fixes from Emmanuel * a TXQ fix from Felix * and a compiler warning suppression from Jeff ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08devlink: share user_ptr pointer for both devlink and devlink_portJiri Pirko1-8/+13
Ptr to devlink structure can be easily obtained from devlink_port->devlink. So share user_ptr[0] pointer for both and leave user_ptr[1] free for other users. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08devlink: remove implicit type set in port registerJiri Pirko1-1/+0
As we rely on caller zeroing or correctly set the struct before the call, this implicit type set is either no-op (DEVLINK_PORT_TYPE_NOTSET is 0) or it rewrites wanted value. So remove this. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-086lowpan: iphc: fix handling of link-local compressionAlexander Aring1-2/+9
This patch fixes handling in case of link-local address compression. A IPv6 link-local address is defined as fe80::/10 prefix which is also what ipv6_addr_type checks for link-local addresses. But IPHC compression for link-local addresses are for fe80::/64 types only. This patch adds additional checks for zero padded bits in case of link-local address compression to match on a fe80::/64 address only. Signed-off-by: Alexander Aring <aar@pengutronix.de> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: Allow setting BT_SECURITY_FIPS with setsockoptPatrik Flykt1-1/+1
Update the security level check to allow setting BT_SECURITY_FIPS for an L2CAP socket. Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: Ignore unknown advertising packet typesJohan Hedberg1-0/+13
In case of buggy controllers send advertising packet types that we don't know of we should simply ignore them instead of trying to react to them in some (potentially wrong) way. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08Bluetooth: Fix setting NO_BREDR advertising flagJohan Hedberg1-3/+3
If we're dealing with a single-mode controller or BR/EDR is disable for a dual-mode one, the NO_BREDR flag needs to be unconditionally present in the advertising data. This patch moves it out from behind an extra condition to be always set in the create_instance_adv_data() function if BR/EDR is disabled. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08mpls: find_outdev: check for err ptr in addition to NULL checkRoopa Prabhu1-0/+3
find_outdev calls inet{,6}_fib_lookup_dev() or dev_get_by_index() to find the output device. In case of an error, inet{,6}_fib_lookup_dev() returns error pointer and dev_get_by_index() returns NULL. But the function only checks for NULL and thus can end up calling dev_put on an ERR_PTR. This patch adds an additional check for err ptr after the NULL check. Before: Trying to add an mpls route with no oif from user, no available path to 10.1.1.8 and no default route: $ip -f mpls route add 100 as 200 via inet 10.1.1.8 [ 822.337195] BUG: unable to handle kernel NULL pointer dereference at 00000000000003a3 [ 822.340033] IP: [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182 [ 822.340033] PGD 1db38067 PUD 1de9e067 PMD 0 [ 822.340033] Oops: 0000 [#1] SMP [ 822.340033] Modules linked in: [ 822.340033] CPU: 0 PID: 11148 Comm: ip Not tainted 4.5.0-rc7+ #54 [ 822.340033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 [ 822.340033] task: ffff88001db82580 ti: ffff88001dad4000 task.ti: ffff88001dad4000 [ 822.340033] RIP: 0010:[<ffffffff8148781e>] [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182 [ 822.340033] RSP: 0018:ffff88001dad7a88 EFLAGS: 00010282 [ 822.340033] RAX: ffffffffffffff9b RBX: ffffffffffffff9b RCX: 0000000000000002 [ 822.340033] RDX: 00000000ffffff9b RSI: 0000000000000008 RDI: 0000000000000000 [ 822.340033] RBP: ffff88001ddc9ea0 R08: ffff88001e9f1768 R09: 0000000000000000 [ 822.340033] R10: ffff88001d9c1100 R11: ffff88001e3c89f0 R12: ffffffff8187e0c0 [ 822.340033] R13: ffffffff8187e0c0 R14: ffff88001ddc9e80 R15: 0000000000000004 [ 822.340033] FS: 00007ff9ed798700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 [ 822.340033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 822.340033] CR2: 00000000000003a3 CR3: 000000001de89000 CR4: 00000000000006f0 [ 822.340033] Stack: [ 822.340033] 0000000000000000 0000000100000000 0000000000000000 0000000000000000 [ 822.340033] 0000000000000000 0801010a00000000 0000000000000000 0000000000000000 [ 822.340033] 0000000000000004 ffffffff8148749b ffffffff8187e0c0 000000000000001c [ 822.340033] Call Trace: [ 822.340033] [<ffffffff8148749b>] ? mpls_rt_alloc+0x2b/0x3e [ 822.340033] [<ffffffff81488e66>] ? mpls_rtm_newroute+0x358/0x3e2 [ 822.340033] [<ffffffff810e7bbc>] ? get_page+0x5/0xa [ 822.340033] [<ffffffff813b7d94>] ? rtnetlink_rcv_msg+0x17e/0x191 [ 822.340033] [<ffffffff8111794e>] ? __kmalloc_track_caller+0x8c/0x9e [ 822.340033] [<ffffffff813c9393>] ? rht_key_hashfn.isra.20.constprop.57+0x14/0x1f [ 822.340033] [<ffffffff813b7c16>] ? __rtnl_unlock+0xc/0xc [ 822.340033] [<ffffffff813cb794>] ? netlink_rcv_skb+0x36/0x82 [ 822.340033] [<ffffffff813b4507>] ? rtnetlink_rcv+0x1f/0x28 [ 822.340033] [<ffffffff813cb2b1>] ? netlink_unicast+0x106/0x189 [ 822.340033] [<ffffffff813cb5b3>] ? netlink_sendmsg+0x27f/0x2c8 [ 822.340033] [<ffffffff81392ede>] ? sock_sendmsg_nosec+0x10/0x1b [ 822.340033] [<ffffffff81393df1>] ? ___sys_sendmsg+0x182/0x1e3 [ 822.340033] [<ffffffff810e4f35>] ? __alloc_pages_nodemask+0x11c/0x1e4 [ 822.340033] [<ffffffff8110619c>] ? PageAnon+0x5/0xd [ 822.340033] [<ffffffff811062fe>] ? __page_set_anon_rmap+0x45/0x52 [ 822.340033] [<ffffffff810e7bbc>] ? get_page+0x5/0xa [ 822.340033] [<ffffffff810e85ab>] ? __lru_cache_add+0x1a/0x3a [ 822.340033] [<ffffffff81087ea9>] ? current_kernel_time64+0x9/0x30 [ 822.340033] [<ffffffff813940c4>] ? __sys_sendmsg+0x3c/0x5a [ 822.340033] [<ffffffff8148f597>] ? entry_SYSCALL_64_fastpath+0x12/0x6a [ 822.340033] Code: 83 08 04 00 00 65 ff 00 48 8b 3c 24 e8 40 7c f2 ff eb 13 48 c7 c3 9f ff ff ff eb 0f 89 ce e8 f1 ae f1 ff 48 89 c3 48 85 db 74 15 <48> 8b 83 08 04 00 00 65 ff 08 48 81 fb 00 f0 ff ff 76 0d eb 07 [ 822.340033] RIP [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182 [ 822.340033] RSP <ffff88001dad7a88> [ 822.340033] CR2: 00000000000003a3 [ 822.435363] ---[ end trace 98cc65e6f6b8bf11 ]--- After patch: $ip -f mpls route add 100 as 200 via inet 10.1.1.8 RTNETLINK answers: Network is unreachable Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reported-by: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07ipv6: Count in extension headers in skb->network_headerJakub Sitnicki1-4/+4
When sending a UDPv6 message longer than MTU, account for the length of fragmentable IPv6 extension headers in skb->network_header offset. Same as we do in alloc_new_skb path in __ip6_append_data(). This ensures that later on __ip6_make_skb() will make space in headroom for fragmentable extension headers: /* move skb->data to ip header from ext header */ if (skb->data < skb_network_header(skb)) __skb_pull(skb, skb_network_offset(skb)); Prevents a splat due to skb_under_panic: skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \ head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] KASAN CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65 [...] Call Trace: [<ffffffff813eb7b9>] skb_push+0x79/0x80 [<ffffffff8143397b>] eth_header+0x2b/0x100 [<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310 [<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0 [<ffffffff814efe3a>] ip6_output+0x16a/0x280 [<ffffffff815440c1>] ip6_local_out+0xb1/0xf0 [<ffffffff814f1115>] ip6_send_skb+0x45/0xd0 [<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0 [<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090 [...] Reported-by: Ji Jianwen <jiji@redhat.com> Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07tipc: stricter filtering of packets in bearer layerJon Paul Maloy2-17/+38
Resetting a bearer/interface, with the consequence of resetting all its pertaining links, is not an atomic action. This becomes particularly evident in very large clusters, where a lot of traffic may happen on the remaining links while we are busy shutting them down. In extreme cases, we may even see links being re-created and re-established before we are finished with the job. To solve this, we now introduce a solution where we temporarily detach the bearer from the interface when the bearer is reset. This inhibits all packet reception, while sending still is possible. For the latter, we use the fact that the device's user pointer now is zero to filter out which packets can be sent during this situation; i.e., outgoing RESET messages only. This filtering serves to speed up the neighbors' detection of the loss event, and saves us from unnecessary probing. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07tipc: eliminate buffer leak in bearer layerJon Paul Maloy3-31/+29
When enabling a bearer we create a 'neigbor discoverer' instance by calling the function tipc_disc_create() before the bearer is actually registered in the list of enabled bearers. Because of this, the very first discovery broadcast message, created by the mentioned function, is lost, since it cannot find any valid bearer to use. Furthermore, the used send function, tipc_bearer_xmit_skb() does not free the given buffer when it cannot find a bearer, resulting in the leak of exactly one send buffer each time a bearer is enabled. This commit fixes this problem by introducing two changes: 1) Instead of attemting to send the discovery message directly, we let tipc_disc_create() return the discovery buffer to the calling function, tipc_enable_bearer(), so that the latter can send it when the enabling sequence is finished. 2) In tipc_bearer_xmit_skb(), as well as in the two other transmit functions at the bearer layer, we now free the indicated buffer or buffer chain when a valid bearer cannot be found. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07RDS: fix congestion map corruption for PAGE_SIZE > 4kshamir rabinovitch1-1/+1
When PAGE_SIZE > 4k single page can contain 2 RDS fragments. If 'rds_ib_cong_recv' ignore the RDS fragment offset in to the page it then read the data fragment as far congestion map update and lead to corruption of the RDS connection far congestion map. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07RDS: memory allocated must be align to 8shamir rabinovitch1-2/+2
Fix issue in 'rds_ib_cong_recv' when accessing unaligned memory allocated by 'rds_page_remainder_alloc' using uint64_t pointer. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOUAlexander Duyck4-3/+25
This patch fixes an issue I found in which we were dropping frames if we had enabled checksums on GRE headers that were encapsulated by either FOU or GUE. Without this patch I was barely able to get 1 Gb/s of throughput. With this patch applied I am now at least getting around 6 Gb/s. The issue is due to the fact that with FOU or GUE applied we do not provide a transport offset pointing to the GRE header, nor do we offload it in software as the GRE header is completely skipped by GSO and treated like a VXLAN or GENEVE type header. As such we need to prevent the stack from generating it and also prevent GRE from generating it via any interface we create. Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE") Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07udp: Remove udp_offloadsTom Herbert1-63/+0
Now that the UDP encapsulation GRO functions have been moved to the UDP socket we not longer need the udp_offload insfrastructure so removing it. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07fou: change to use UDP socket GROTom Herbert1-31/+17
Adapt gue_gro_receive, gue_gro_complete to take a socket argument. Don't set udp_offloads any more. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07udp: Add socket based GRO and configTom Herbert1-0/+2
Add gro_receive and gro_complete to struct udp_tunnel_sock_cfg. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07udp: Add GRO functions to UDP socketTom Herbert6-40/+41
This patch adds GRO functions (gro_receive and gro_complete) to UDP sockets. udp_gro_receive is changed to perform socket lookup on a packet. If a socket is found the related GRO functions are called. This features obsoletes using UDP offload infrastructure for GRO (udp_offload). This has the advantage of not being limited to provide offload on a per port basis, GRO is now applied to whatever individual UDP sockets are bound to. This also allows the possbility of "application defined GRO"-- that is we can attach something like a BPF program to a UDP socket to perfrom GRO on an application layer protocol. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07udp: Add udp6_lib_lookup_skb and udp4_lib_lookup_skbTom Herbert2-0/+26
Add externally visible functions to lookup a UDP socket by skb. This will be used for GRO in UDP sockets. These functions also check if skb->dst is set, and if it is not skb->dev is used to get dev_net. This allows calling lookup functions before dst has been set on the skbuff. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07tun: use socket locks for sk_{attach,detatch}_filterHannes Frederic Sowa1-22/+13
This reverts commit 5a5abb1fa3b05dd ("tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter") and replaces it to use lock_sock around sk_{attach,detach}_filter. The checks inside filter.c are updated with lockdep_sock_is_held to check for proper socket locks. It keeps the code cleaner by ensuring that only one lock governs the socket filter instead of two independent locks. Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07net: introduce lockdep_is_held and update various places to use itHannes Frederic Sowa9-15/+16
The socket is either locked if we hold the slock spin_lock for lock_sock_fast and unlock_sock_fast or we own the lock (sk_lock.owned != 0). Check for this and at the same time improve that the current thread/cpu is really holding the lock. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07sock: fix lockdep annotation in release_sockHannes Frederic Sowa1-5/+0
During release_sock we use callbacks to finish the processing of outstanding skbs on the socket. We actually are still locked, sk_locked.owned == 1, but we already told lockdep that the mutex is released. This could lead to false positives in lockdep for lockdep_sock_is_held (we don't hold the slock spinlock during processing the outstanding skbs). I took over this patch from Eric Dumazet and tested it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07netfilter: ipv6: unnecessary to check whether ip6_route_output() returns NULLHaishuang Yan1-1/+1
ip6_route_output() never returns NULL, so it is not appropriate to check if the return value is NULL. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-07tcp/dccp: fix inet_reuseport_add_sock()Eric Dumazet2-2/+4
David Ahern reported panics in __inet_hash() caused by my recent commit. The reason is inet_reuseport_add_sock() was still using sk_nulls_for_each_rcu() instead of sk_for_each_rcu(). SO_REUSEPORT enabled listeners were causing an instant crash. While chasing this bug, I found that I forgot to clear SOCK_RCU_FREE flag, as it is inherited from the parent at clone time. Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06net: add the AF_KCM entries to family name tablesDexuan Cui1-3/+6
This is for the recent kcm driver, which introduces AF_KCM(41) in b7ac4eb(kcm: Kernel Connection Multiplexor module). Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06ip_tunnel: implement __iptunnel_pull_headerJiri Benc1-4/+4
Allow calling of iptunnel_pull_header without special casing ETH_P_TEB inner protocol. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06VSOCK: Detach QP check should filter out non matching QPs.Jorgen Hansen1-2/+2
The check in vmci_transport_peer_detach_cb should only allow a detach when the qp handle of the transport matches the one in the detach message. Testing: Before this change, a detach from a peer on a different socket would cause an active stream socket to register a detach. Reviewed-by: George Zhang <georgezhang@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06af_packet: tone down the Tx-ring unsupported spew.Dave Jones1-1/+1
Trinity and other fuzzers can hit this WARN on far too easily, resulting in a tainted kernel that hinders automated fuzzing. Replace it with a rate-limited printk. Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06Revert "bridge: Fix incorrect variable assignment on error path in br_sysfs_addbr"David S. Miller1-1/+0
This reverts commit c862cc9b70526a71d07e7bd86d9b61d1c792cead. Patch lacks a real-name Signed-off-by. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mac80211: fix "warning: ‘target_metric’ may be used uninitialized"Jeff Mahoney1-1/+1
This fixes: net/mac80211/mesh_hwmp.c:603:26: warning: ‘target_metric’ may be used uninitialized in this function target_metric is only consumed when reply = true so no bug exists here, but not all versions of gcc realize it. Initialize to 0 to remove the warning. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06cfg80211: Allow reassociation to be requested with internal SMEJouni Malinen2-3/+14
If the user space issues a NL80211_CMD_CONNECT with NL80211_ATTR_PREV_BSSID when there is already a connection, allow this to proceed as a reassociation instead of rejecting the new connect command with EALREADY. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> [validate prev_bssid] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06cfg80211: Add option to specify previous BSSID for Connect commandJouni Malinen2-2/+8
This extends NL80211_CMD_CONNECT to allow the NL80211_ATTR_PREV_BSSID attribute to be used similarly to way this was already allowed with NL80211_CMD_ASSOCIATE. This allows user space to request reassociation (instead of association) when already connected to an AP. This provides an option to reassociate within an ESS without having to disconnect and associate with the AP. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06mac80211: minstrel_ht: set A-MSDU tx limits based on selected max_prob_rateFelix Fietkau1-0/+54
Prevents excessive A-MSDU aggregation at low data rates or bad conditions. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06mac80211: add A-MSDU tx supportFelix Fietkau5-0/+166
Requires software tx queueing and fast-xmit support. For good performance, drivers need frag_list support as well. This avoids the need for copying data of aggregated frames. Running without it is only supported for debugging purposes. To avoid performance and packet size issues, the rate control module or driver needs to limit the maximum A-MSDU size by setting max_rc_amsdu_len in struct ieee80211_sta. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [fix locking issue] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06mac80211: enable collecting station statistics per-CPUJohannes Berg4-46/+138
If the driver advertises the new HW flag USE_RSS, make the station statistics on the fast-rx path per-CPU. This will enable calling the RX in parallel, only hitting locking or shared cachelines when the fast-RX path isn't available. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06mac80211: add fast-rx pathJohannes Berg7-3/+409
The regular RX path has a lot of code, but with a few assumptions on the hardware it's possible to reduce the amount of code significantly. Currently the assumptions on the driver are the following: * hardware/driver reordering buffer (if supporting aggregation) * hardware/driver decryption & PN checking (if using encryption) * hardware/driver did de-duplication * hardware/driver did A-MSDU deaggregation * AP_LINK_PS is used (in AP mode) * no client powersave handling in mac80211 (in client mode) of which some are actually checked per packet: * de-duplication * PN checking * decryption and additionally packets must * not be A-MSDU (have been deaggregated by driver/device) * be data packets * not be fragmented * be unicast * have RFC 1042 header Additionally dynamically we assume: * no encryption or CCMP/GCMP, TKIP/WEP/other not allowed * station must be authorized * 4-addr format not enabled Some data needed for the RX path is cached in a new per-station "fast_rx" structure, so that we only need to look at this and the packet, no other memory when processing packets on the fast RX path. After doing the above per-packet checks, the data path collapses down to a pretty simple conversion function taking advantage of the data cached in the small fast_rx struct. This should speed up the RX processing, and will make it easier to reason about parallelizing RX (for which statistics will need to be per-CPU still.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06mac80211: fix RX u64 stats consistency on 32-bit platformsJohannes Berg3-29/+54
On 32-bit platforms, the 64-bit counters we keep need to be protected to be consistently read. Use the u64_stats_sync mechanism to do that. In order to not end up with overly long lines, refactor the tidstats assignments a bit. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-06mac80211: fix last RX rate data consistencyJohannes Berg3-49/+77
When storing the last_rate_* values in the RX code, there's nothing to guarantee consistency, so a concurrent reader could see, e.g. last_rate_idx on the new value, but last_rate_flag still on the old, getting completely bogus values in the end. To fix this, I lifted the sta_stats_encode_rate() function from my old rate statistics code, which encodes the entire rate data into a single 16-bit value, avoiding the consistency issue. Signed-off-by: Johannes Berg <johannes.berg@intel.com>