aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-04-20net: ipv6: Fix UDP early demux lookup with udp_l3mdev_accept=0subashab@codeaurora.org1-9/+15
David Ahern reported that 5425077d73e0c ("net: ipv6: Add early demux handler for UDP unicast") breaks udp_l3mdev_accept=0 since early demux for IPv6 UDP was doing a generic socket lookup which does not require an exact match. Fix this by making UDPv6 early demux match connected sockets only. v1->v2: Take reference to socket after match as suggested by Eric v2->v3: Add comment before break Fixes: 5425077d73e0c ("net: ipv6: Add early demux handler for UDP unicast") Reported-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20tcp: remove poll() flakes with FastOpenEric Dumazet1-7/+9
When using TCP FastOpen for an active session, we send one wakeup event from tcp_finish_connect(), right before the data eventually contained in the received SYNACK is queued to sk->sk_receive_queue. This means that depending on machine load or luck, poll() users might receive POLLOUT events instead of POLLIN|POLLOUT To fix this, we need to move the call to sk->sk_state_change() after the (optional) call to tcp_rcv_fastopen_synack() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20tcp: remove poll() flakes when receiving RSTEric Dumazet1-2/+2
When a RST packet is processed, we send two wakeup events to interested polling users. First one by a sk->sk_error_report(sk) from tcp_reset(), followed by a sk->sk_state_change(sk) from tcp_done(). Depending on machine load and luck, poll() can either return POLLERR, or POLLIN|POLLOUT|POLLERR|POLLHUP (this happens on 99 % of the cases) This is probably fine, but we can avoid the confusion by reordering things so that we have more TCP fields updated before the first wakeup. This might even allow us to remove some barriers we added in the past. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20ipv6: sr: fix out-of-bounds access in SRH validationDavid Lebrun1-0/+3
This patch fixes an out-of-bounds access in seg6_validate_srh() when the trailing data is less than sizeof(struct sr6_tlv). Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20mac80211: reject ToDS broadcast data framesJohannes Berg1-0/+21
AP/AP_VLAN modes don't accept any real 802.11 multicast data frames, but since they do need to accept broadcast management frames the same is currently permitted for data frames. This opens a security problem because such frames would be decrypted with the GTK, and could even contain unicast L3 frames. Since the spec says that ToDS frames must always have the BSSID as the RA (addr1), reject any other data frames. The problem was originally reported in "Predicting, Decrypting, and Abusing WPA2/802.11 Group Keys" at usenix https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef and brought to my attention by Jouni. Cc: stable@vger.kernel.org Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> -- Dave, I didn't want to send you a new pull request for a single commit yet again - can you apply this one patch as is? Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20bpf: remove reference to sock_filter_ext from kerneldoc commentTobias Klauser1-1/+2
struct sock_filter_ext didn't make it into the tree and is now called struct bpf_insn. Reword the kerneldoc comment for bpf_convert_filter() accordingly. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge tag 'mac80211-next-for-davem-2017-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-nextDavid S. Miller36-439/+1338
Johannes Berg says: ==================== My last pull request has been a while, we now have: * connection quality monitoring with multiple thresholds * support for FILS shared key authentication offload * pre-CAC regulatory compliance - only ETSI allows this * sanity check for some rate confusion that hit ChromeOS (but nobody else uses it, evidently) * some documentation updates * lots of cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20net: dsa: add support for the SMSC-LAN9303 tagging formatJuergen Beisert5-0/+152
To define the outgoing port and to discover the incoming port a regular VLAN tag is used by the LAN9303. But its VID meaning is 'special'. This tag handler/filter depends on some hardware features which must be enabled in the device to provide and make use of this special VLAN tag to control the destination and the source of an ethernet packet. Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge tag 'mac80211-for-davem-2017-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211David S. Miller1-18/+47
Johannes Berg says: ==================== A single fix, for the MU-MIMO monitor mode, that fixes bad SKB accesses if the SKB was paged, which is the case for the only driver supporting this - iwlwifi. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller6-25/+16
A function in kernel/bpf/syscall.c which got a bug fix in 'net' was moved to kernel/bpf/verifier.c in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-19esp4/6: Fix GSO path for non-GSO SW-crypto packetsIlan Tayari2-5/+6
If esp*_offload module is loaded, outbound packets take the GSO code path, being encapsulated at layer 3, but encrypted in layer 2. validate_xmit_xfrm calls esp*_xmit for that. esp*_xmit was wrongfully detecting these packets as going through hardware crypto offload, while in fact they should be encrypted in software, causing plaintext leakage to the network, and also dropping at the receiver side. Perform the encryption in esp*_xmit, if the SA doesn't have a hardware offload_handle. Also, align esp6 code to esp4 logic. Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output") Fixes: 383d0350f2cc ("esp6: Reorganize esp_output") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-19esp6: fix incorrect null pointer check on xoColin Ian King1-1/+1
The check for xo being null is incorrect, currently it is checking for non-null, it should be checking for null. Detected with CoverityScan, CID#1429349 ("Dereference after null check") Fixes: 7862b4058b9f ("esp: Add gso handlers for esp4 and esp6") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-18sctp: process duplicated strreset asoc request correctlyXin Long1-4/+15
This patch is to fix the replay attack issue for strreset asoc requests. When a duplicated strreset asoc request is received, reply it with bad seqno if it's seqno < asoc->strreset_inseq - 2, and reply it with the result saved in asoc if it's seqno >= asoc->strreset_inseq - 2. But note that if the result saved in asoc is performed, the sender's next tsn and receiver's next tsn for the response chunk should be set. It's safe to get them from asoc. Because if it's changed, which means the peer has received the response already, the new response with wrong tsn won't be accepted by peer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18sctp: process duplicated strreset in and addstrm in requests correctlyXin Long1-9/+29
This patch is to fix the replay attack issue for strreset and addstrm in requests. When a duplicated strreset in or addstrm in request is received, reply it with bad seqno if it's seqno < asoc->strreset_inseq - 2, and reply it with the result saved in asoc if it's seqno >= asoc->strreset_inseq - 2. For strreset in or addstrm in request, if the receiver side processes it successfully, a strreset out or addstrm out request(as a response for that request) will be sent back to peer. reconf_time will retransmit the out request even if it's lost. So when receiving a duplicated strreset in or addstrm in request and it's result was performed, it shouldn't reply this request, but drop it instead. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18sctp: process duplicated strreset out and addstrm out requests correctlyXin Long1-10/+29
Now sctp stream reconf will process a request again even if it's seqno is less than asoc->strreset_inseq. If one request has been done successfully and some data chunks have been accepted and then a duplicated strreset out request comes, the streamin's ssn will be cleared. It will cause that stream will never receive chunks any more because of unsynchronized ssn. It allows a replay attack. A similar issue also exists when processing addstrm out requests. It will cause more extra streams being added. This patch is to fix it by saving the last 2 results into asoc. When a duplicated strreset out or addstrm out request is received, reply it with bad seqno if it's seqno < asoc->strreset_inseq - 2, and reply it with the result saved in asoc if it's seqno >= asoc->strreset_inseq - 2. Note that it saves last 2 results instead of only last 1 result, because two requests can be sent together in one chunk. And note that when receiving a duplicated request, the receiver side will still reply it even if the peer has received the response. It's safe, As the response will be dropped by the peer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18nl80211: Fix enum type of variable in nl80211_put_sta_rate()Matthias Kaehlcke1-1/+1
rate_flg is of type 'enum nl80211_attrs', however it is assigned with 'enum nl80211_rate_info' values. Change the type of rate_flg accordingly. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18mac80211: ibss: Fix channel type enum in ieee80211_sta_join_ibss()Matthias Kaehlcke1-2/+2
cfg80211_chandef_create() expects an 'enum nl80211_channel_type' as channel type however in ieee80211_sta_join_ibss() NL80211_CHAN_WIDTH_20_NOHT is passed in two occasions, which is of the enum type 'nl80211_chan_width'. Change the value to NL80211_CHAN_NO_HT (20 MHz, non-HT channel) of the channel type enum. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18cfg80211: Fix array-bounds warning in fragment copyMatthias Kaehlcke1-3/+3
__ieee80211_amsdu_copy_frag intentionally initializes a pointer to array[-1] to increment it later to valid values. clang rightfully generates an array-bounds warning on the initialization statement. Initialize the pointer to array[0] and change the algorithm from increment before to increment after consume. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18mac80211: keep a separate list of monitor interfaces that are upJohannes Berg4-12/+22
In addition to keeping monitor interfaces on the regular list of interfaces, keep those that are up and not in cooked mode on a separate list. This saves having to iterate all interfaces when delivering to monitor interfaces. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18nl80211: add request id in scheduled scan event messagesArend Van Spriel3-17/+14
For multi-scheduled scan support in subsequent patch a request id will be added. This patch add this request id to the scheduled scan event messages. For now the request id will always be zero. With multi-scheduled scan its value will inform user-space to which scan the event relates. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-04-18af_key: Fix sadb_x_ipsecrequest parsingHerbert Xu1-21/+26
The parsing of sadb_x_ipsecrequest is broken in a number of ways. First of all we're not verifying sadb_x_ipsecrequest_len. This is needed when the structure carries addresses at the end. Worse we don't even look at the length when we parse those optional addresses. The migration code had similar parsing code that's better but it also has some deficiencies. The length is overcounted first of all as it includes the header itself. It also fails to check the length before dereferencing the sa_family field. This patch fixes those problems in parse_sockaddr_pair and then uses it in parse_ipsecrequest. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-17net: rtnetlink: plumb extended ack to doit functionDavid Ahern23-94/+154
Add netlink_ext_ack arg to rtnl_doit_func. Pass extack arg to nlmsg_parse for doit functions that call it directly. This is the first step to using extended error reporting in rtnetlink. >From here individual subsystems can be updated to set netlink_ext_ack as needed. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17ipv6: sr: fix BUG due to headroom too small after SRH pushDavid Lebrun1-0/+8
When a locally generated packet receives an SRH with two or more segments, the remaining headroom is too small to push an ethernet header. This patch ensures that the headroom is large enough after SRH push. The BUG generated the following trace. [ 192.950285] skbuff: skb_under_panic: text:ffffffff81809675 len:198 put:14 head:ffff88006f306400 data:ffff88006f3063fa tail:0xc0 end:0x2c0 dev:A-1 [ 192.952456] ------------[ cut here ]------------ [ 192.953218] kernel BUG at net/core/skbuff.c:105! [ 192.953411] invalid opcode: 0000 [#1] PREEMPT SMP [ 192.953411] Modules linked in: [ 192.953411] CPU: 5 PID: 3433 Comm: ping6 Not tainted 4.11.0-rc3+ #237 [ 192.953411] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014 [ 192.953411] task: ffff88007c2d42c0 task.stack: ffffc90000ef4000 [ 192.953411] RIP: 0010:skb_panic+0x61/0x70 [ 192.953411] RSP: 0018:ffffc90000ef7900 EFLAGS: 00010286 [ 192.953411] RAX: 0000000000000085 RBX: 00000000000086dd RCX: 0000000000000201 [ 192.953411] RDX: 0000000080000201 RSI: ffffffff81d104c5 RDI: 00000000ffffffff [ 192.953411] RBP: ffffc90000ef7920 R08: 0000000000000001 R09: 0000000000000000 [ 192.953411] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 192.953411] R13: ffff88007c5a4000 R14: ffff88007b363d80 R15: 00000000000000b8 [ 192.953411] FS: 00007f94b558b700(0000) GS:ffff88007fd40000(0000) knlGS:0000000000000000 [ 192.953411] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 192.953411] CR2: 00007fff5ecd5080 CR3: 0000000074141000 CR4: 00000000001406e0 [ 192.953411] Call Trace: [ 192.953411] skb_push+0x3b/0x40 [ 192.953411] eth_header+0x25/0xc0 [ 192.953411] neigh_resolve_output+0x168/0x230 [ 192.953411] ? ip6_finish_output2+0x242/0x8f0 [ 192.953411] ip6_finish_output2+0x242/0x8f0 [ 192.953411] ? ip6_finish_output2+0x76/0x8f0 [ 192.953411] ip6_finish_output+0xa8/0x1d0 [ 192.953411] ip6_output+0x64/0x2d0 [ 192.953411] ? ip6_output+0x73/0x2d0 [ 192.953411] ? ip6_dst_check+0xb5/0xc0 [ 192.953411] ? dst_cache_per_cpu_get.isra.2+0x40/0x80 [ 192.953411] seg6_output+0xb0/0x220 [ 192.953411] lwtunnel_output+0xcf/0x210 [ 192.953411] ? lwtunnel_output+0x59/0x210 [ 192.953411] ip6_local_out+0x38/0x70 [ 192.953411] ip6_send_skb+0x2a/0xb0 [ 192.953411] ip6_push_pending_frames+0x48/0x50 [ 192.953411] rawv6_sendmsg+0xa39/0xf10 [ 192.953411] ? __lock_acquire+0x489/0x890 [ 192.953411] ? __mutex_lock+0x1fc/0x970 [ 192.953411] ? __lock_acquire+0x489/0x890 [ 192.953411] ? __mutex_lock+0x1fc/0x970 [ 192.953411] ? tty_ioctl+0x283/0xec0 [ 192.953411] inet_sendmsg+0x45/0x1d0 [ 192.953411] ? _copy_from_user+0x54/0x80 [ 192.953411] sock_sendmsg+0x33/0x40 [ 192.953411] SYSC_sendto+0xef/0x170 [ 192.953411] ? entry_SYSCALL_64_fastpath+0x5/0xc2 [ 192.953411] ? trace_hardirqs_on_caller+0x12b/0x1b0 [ 192.953411] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 192.953411] SyS_sendto+0x9/0x10 [ 192.953411] entry_SYSCALL_64_fastpath+0x1f/0xc2 [ 192.953411] RIP: 0033:0x7f94b453db33 [ 192.953411] RSP: 002b:00007fff5ecd0578 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 192.953411] RAX: ffffffffffffffda RBX: 00007fff5ecd16e0 RCX: 00007f94b453db33 [ 192.953411] RDX: 0000000000000040 RSI: 000055a78352e9c0 RDI: 0000000000000003 [ 192.953411] RBP: 00007fff5ecd1690 R08: 000055a78352c940 R09: 000000000000001c [ 192.953411] R10: 0000000000000000 R11: 0000000000000246 R12: 000055a783321e10 [ 192.953411] R13: 000055a7839890c0 R14: 0000000000000004 R15: 0000000000000000 [ 192.953411] Code: 00 00 48 89 44 24 10 8b 87 c4 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 90 58 d2 81 48 89 04 24 31 c0 e8 4f 70 9a ff <0f> 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 97 d8 00 00 [ 192.953411] RIP: skb_panic+0x61/0x70 RSP: ffffc90000ef7900 [ 193.000186] ---[ end trace bd0b89fabdf2f92c ]--- [ 193.000951] Kernel panic - not syncing: Fatal exception in interrupt [ 193.001137] Kernel Offset: disabled [ 193.001169] ---[ end Kernel panic - not syncing: Fatal exception in interrupt Fixes: 19d5a26f5ef8de5dcb78799feaf404d717b1aac3 ("ipv6: sr: expand skb head only if necessary") Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17gso: Validate assumption of frag_list segementationIlan Tayari1-4/+14
Commit 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") assumes that all SKBs in a frag_list (except maybe the last one) contain the same amount of GSO payload. This assumption is not always correct, resulting in the following warning message in the log: skb_segment: too many frags For example, mlx5 driver in Striding RQ mode creates some RX SKBs with one frag, and some with 2 frags. After GRO, the frag_list SKBs end up having different amounts of payload. If this frag_list SKB is then forwarded, the aforementioned assumption is violated. Validate the assumption, and fall back to software GSO if it not true. Fixes: 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17sctp: get list_of_streams of strreset outreq earlierXin Long1-4/+4
Now when processing strreset out responses, it gets outreq->list_of_streams only when result is performed. But if result is not performed, str_p will be NULL. It will cause panic in sctp_ulpevent_make_stream_reset_event if nums is not 0. This patch is to fix it by getting outreq->list_of_streams earlier, and also to improve some codes for the strreset inreq process. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17Add uid and cookie bpf helper to cg_skb_func_protoChenbo Feng1-6/+1
BPF helper functions get_socket_cookie and get_socket_uid can be used for network traffic classifications, among others. Expose them also to programs of type BPF_PROG_TYPE_CGROUP_SKB. As of commit 8f917bba0042 ("bpf: pass sk to helper functions") the required skb->sk function is available at both cgroup bpf ingress and egress hooks. With these two new helper, cg_skb_func_proto is effectively the same as sk_filter_func_proto. Change since V1: Instead of add the helper to cg_skb_func_proto, redirect the cg_skb_func_proto to sk_filter_func_proto since all helper function in sk_filter_func_proto are applicable to cg_skb_func_proto now. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17ipv6: drop non loopback packets claiming to originate from ::1Florian Westphal1-2/+5
We lack a saddr check for ::1. This causes security issues e.g. with acls permitting connections from ::1 because of assumption that these originate from local machine. Assuming a source address of ::1 is local seems reasonable. RFC4291 doesn't allow such a source address either, so drop such packets. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextDavid S. Miller9-182/+176
Johan Hedberg says: ==================== pull request: bluetooth-next 2017-04-14 Here's the main batch of Bluetooth & 802.15.4 patches for the 4.12 kernel. - Many fixes to 6LoWPAN, in particular for BLE - New CA8210 IEEE 802.15.4 device driver (accounting for most of the lines of code added in this pull request) - Added Nokia Bluetooth (UART) HCI driver - Some serdev & TTY changes that are dependencies for the Nokia driver (with acks from relevant maintainers and an agreement that these come through the bluetooth tree) - Support for new Intel Bluetooth device - Various other minor cleanups/fixes here and there Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net: bridge: notify on hw fdb takeoverNikolay Aleksandrov1-1/+3
Recently we added support for SW fdbs to take over HW ones, but that results in changing a user-visible fdb flag thus we need to send a notification, also it's consistent with how HW takes over SW entries. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17kcm: remove a useless copy_from_user()WANG Cong1-4/+0
struct kcm_clone only contains fd, and kcm_clone() only writes this struct, so there is no need to copy it from user. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17Subject: net: allow configuring default qdiscstephen hemminger2-0/+54
Since 3.12 it has been possible to configure the default queuing discipline via sysctl. This patch adds ability to configure the default queue discipline in kernel configuration. This is useful for environments where configuring the value from userspace is difficult to manage. The default is still the same as before (pfifo_fast) and it is possible to change after kernel init with sysctl. This is similar to how TCP congestion control works. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17l2tp: device MTU setup, tunnel socket needs a lockR. Parameswaran2-1/+3
The MTU overhead calculation in L2TP device set-up merged via commit b784e7ebfce8cfb16c6f95e14e8532d0768ab7ff needs to be adjusted to lock the tunnel socket while referencing the sub-data structures to derive the socket's IP overhead. Reported-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: R. Parameswaran <rparames@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net-timestamp: avoid use-after-free in ip_recv_errorWillem de Bruijn3-14/+6
Syzkaller reported a use-after-free in ip_recv_error at line info->ipi_ifindex = skb->dev->ifindex; This function is called on dequeue from the error queue, at which point the device pointer may no longer be valid. Save ifindex on enqueue in __skb_complete_tx_timestamp, when the pointer is valid or NULL. Store it in temporary storage skb->cb. It is safe to reference skb->dev here, as called from device drivers or dev_queue_xmit. The exception is when called from tcp_ack_tstamp; in that case it is NULL and ifindex is set to 0 (invalid). Do not return a pktinfo cmsg if ifindex is 0. This maintains the current behavior of not returning a cmsg if skb->dev was NULL. On dequeue, the ipv4 path will cast from sock_exterr_skb to in_pktinfo. Both have ifindex as their first element, so no explicit conversion is needed. This is by design, introduced in commit 0b922b7a829c ("net: original ingress device index in PKTINFO"). For ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo. Fixes: 829ae9d61165 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17ipv4: fix a deadlock in ip_ra_controlWANG Cong3-9/+5
Similar to commit 87e9f0315952 ("ipv4: fix a potential deadlock in mcast getsockopt() path"), there is a deadlock scenario for IP_ROUTER_ALERT too: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(sk_lock-AF_INET); lock(rtnl_mutex); lock(sk_lock-AF_INET); Fix this by always locking RTNL first on all setsockopt() paths. Note, after this patch ip_ra_lock is no longer needed either. Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net: ipv6: send unsolicited NA on admin upDavid Ahern1-0/+2
ndisc_notify is the ipv6 equivalent to arp_notify. When arp_notify is set to 1, gratuitous arp requests are sent when the device is brought up. The same is expected when ndisc_notify is set to 1 (per ndisc_notify in Documentation/networking/ip-sysctl.txt). The NA is not sent on NETDEV_UP event; add it. Fixes: 5cb04436eef6 ("ipv6: add knob to send unsolicited ND on link-layer address change") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net: dsa: isolate legacy codeVivien Didelot4-767/+825
This patch moves as is the legacy DSA code from dsa.c to legacy.c, except the few shared symbols which remain in dsa.c. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller22-67/+134
Conflicts were simply overlapping changes. In the net/ipv4/route.c case the code had simply moved around a little bit and the same fix was made in both 'net' and 'net-next'. In the net/sched/sch_generic.c case a fix in 'net' happened at the same time that a new argument was added to qdisc_hash_add(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller8-25/+62
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Missing TCP header sanity check in TCPMSS target, from Eric Dumazet. 2) Incorrect event message type for related conntracks created via ctnetlink, from Liping Zhang. 3) Fix incorrect rcu locking when handling helpers from ctnetlink, from Gao feng. 4) Fix missing rcu locking when updating helper, from Liping Zhang. 5) Fix missing read_lock_bh when iterating over list of device addresses from TPROXY and redirect, also from Liping. 6) Fix crash when trying to dump expectations from conntrack with no helper via ctnetlink, from Liping. 7) Missing RCU protection to expecation list update given ctnetlink iterates over the list under rcu read lock side, from Liping too. 8) Don't dump autogenerated seed in nft_hash to userspace, this is very confusing to the user, again from Liping. 9) Fix wrong conntrack netns module refcount in ipt_CLUSTERIP, from Gao feng. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-14xfrm: Prepare the GRO codepath for hardware offloading.Steffen Klassert3-63/+71
On IPsec hardware offloading, we already get a secpath with valid state attached when the packet enters the GRO handlers. So check for hardware offload and skip the state lookup in this case. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14xfrm: Add encapsulation header offsets while SKB is not encryptedIlan Tayari5-0/+11
Both esp4 and esp6 used to assume that the SKB payload is encrypted and therefore the inner_network and inner_transport offsets are not relevant. When doing crypto offload in the NIC, this is no longer the case and the NIC driver needs these offsets so it can do TX TCP checksum offloading. This patch sets the inner_network and inner_transport members of the SKB, as well as encapsulation, to reflect the actual positions of these headers, and removes them only once encryption is done on the payload. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14net: Add a xfrm validate function to validate_xmit_skbSteffen Klassert2-0/+32
When we do IPsec offloading, we need a fallback for packets that were targeted to be IPsec offloaded but rerouted to a device that does not support IPsec offload. For that we add a function that checks the offloading features of the sending device and and flags the requirement of a fallback before it calls the IPsec output function. The IPsec output function adds the IPsec trailer and does encryption if needed. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14esp: Use a synchronous crypto algorithm on offloading.Steffen Klassert2-4/+20
We need a fallback algorithm for crypto offloading to a NIC. This is because packets can be rerouted to other NICs that don't support crypto offloading. The fallback is going to be implemented at layer2 where we know the final output device but can't handle asynchronous returns fron the crypto layer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14xfrm: Add xfrm_replay_overflow functions for offloadingSteffen Klassert1-2/+157
This patch adds functions that handles IPsec sequence numbers for GSO segments and TSO offloading. We need to calculate and update the sequence numbers based on the segments that GSO/TSO will generate. We need this to keep software and hardware sequence number counter in sync. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14esp: Add gso handlers for esp4 and esp6Steffen Klassert5-4/+203
This patch extends the xfrm_type by an encap function pointer and implements esp4_gso_encap and esp6_gso_encap. These functions doing the basic esp encapsulation for a GSO packet. In case the GSO packet needs to be segmented in software, we add gso_segment functions. This codepath is going to be used on esp hardware offloads. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14esp6: Reorganize esp_outputSteffen Klassert2-124/+243
We need a fallback for ESP at layer 2, so split esp6_output into generic functions that can be used at layer 3 and layer 2 and use them in esp_output. We also add esp6_xmit which is used for the layer 2 fallback. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14esp4: Reorganize esp_outputSteffen Klassert2-158/+285
We need a fallback for ESP at layer 2, so split esp_output into generic functions that can be used at layer 3 and layer 2 and use them in esp_output. We also add esp_xmit which is used for the layer 2 fallback. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14esp6: Remame esp_input_done2Steffen Klassert1-3/+3
We are going to export the ipv4 and the ipv6 version of esp_input_done2. They are not static anymore and can't have the same name. So rename the ipv6 version to esp6_input_done2. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14xfrm: Add an IPsec hardware offloading APISteffen Klassert11-23/+338
This patch adds all the bits that are needed to do IPsec hardware offload for IPsec states and ESP packets. We add xfrmdev_ops to the net_device. xfrmdev_ops has function pointers that are needed to manage the xfrm states in the hardware and to do a per packet offloading decision. Joint work with: Ilan Tayari <ilant@mellanox.com> Guy Shapiro <guysh@mellanox.com> Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14xfrm: Add mode handlers for IPsec on layer 2Steffen Klassert4-0/+114
This patch adds a gso_segment and xmit callback for the xfrm_mode and implement these functions for tunnel and transport mode. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-14xfrm: Move device notifications to a sepatate fileSteffen Klassert3-17/+45
This is needed for the upcomming IPsec device offloading. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>