aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-03-21ipv4: Allow amount of dirty memory from fib resizing to be controllableDavid Ahern2-6/+17
fib_trie implementation calls synchronize_rcu when a certain amount of pages are dirty from freed entries. The number of pages was determined experimentally in 2009 (commit c3059477fce2d). At the current setting, synchronize_rcu is called often -- 51 times in a second in one test with an average of an 8 msec delay adding a fib entry. The total impact is a lot of slow down modifying the fib. This is seen in the output of 'time' - the difference between real time and sys+user. For example, using 720,022 single path routes and 'ip -batch'[1]: $ time ./ip -batch ipv4/routes-1-hops real 0m14.214s user 0m2.513s sys 0m6.783s So roughly 35% of the actual time to install the routes is from the ip command getting scheduled out, most notably due to synchronize_rcu (this is observed using 'perf sched timehist'). This patch makes the amount of dirty memory configurable between 64k where the synchronize_rcu is called often (small, low end systems that are memory sensitive) to 64M where synchronize_rcu is called rarely during a large FIB change (for high end systems with lots of memory). The default is 512kB which corresponds to the current setting of 128 pages with a 4kB page size. As an example, at 16MB the worst interval shows 4 calls to synchronize_rcu in a second blocking for up to 30 msec in a single instance, and a total of almost 100 msec across the 4 calls in the second. The trade off is allowing FIB entries to consume more memory in a given time window but but with much better fib insertion rates (~30% increase in prefixes/sec). With this patch and net.ipv4.fib_sync_mem set to 16MB, the same batch file runs in: $ time ./ip -batch ipv4/routes-1-hops real 0m9.692s user 0m2.491s sys 0m6.769s So the dead time is reduced to about 1/2 second or <5% of the real time. [1] 'ip' modified to not request ACK messages which improves route insertion times by about 20% Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21ipv6: Change addrconf_f6i_alloc to use ip6_route_info_createDavid Ahern1-26/+16
Change addrconf_f6i_alloc to generate a fib6_config and call ip6_route_info_create. addrconf_f6i_alloc is the last caller to fib6_info_alloc besides ip6_route_info_create, and there is no reason for it to do its own initialization on a fib6_info. Host routes need to be created even if the device is down, so add a new flag, fc_ignore_dev_down, to fib6_config and update fib6_nh_init to not error out if device is not up. Notes on the conversion: - ip_fib_metrics_init is the same as fib6_config has fc_mx set to NULL and fc_mx_len set to 0 - dst_nocount is handled by the RTF_ADDRCONF flag - dst_host is handled by fc_dst_len = 128 nh_gw does not get set after the conversion to ip6_route_info_create but it should not be set in addrconf_f6i_alloc since this is a host route not a gateway route. Everything else is a straight forward map between fib6_info and fib6_config. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21ipv6: Move setting default metric for routesDavid Ahern1-4/+4
ip6_route_info_create is a low level function for ensuring fc_metric is set. Move the check and default setting to the 2 locations that do not already set fc_metric before calling ip6_route_info_create. This is required for the next patch which moves addrconf allocations to ip6_route_info_create and want the metric for host routes to be 0. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21net/tls: Replace kfree_skb() with consume_skb()Vakul Garg1-3/+3
To free the skb in normal course of processing, consume_skb() should be used. Only for failure paths, skb_free() is intended to be used. https://www.kernel.org/doc/htmldocs/networking/API-consume-skb.html Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21tipc: fix a null pointer derefHoang Le3-3/+6
In commit c55c8edafa91 ("tipc: smooth change between replicast and broadcast") we introduced new method to eliminate the risk of message reordering that happen in between different nodes. Unfortunately, we forgot checking at receiving side to ignore intra node. We fix this by checking and returning if arrived message from intra node. syzbot report: ================================================================== kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 7820 Comm: syz-executor418 Not tainted 5.0.0+ #61 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tipc_mcast_filter_msg+0x21b/0x13d0 net/tipc/bcast.c:782 Code: 45 c0 0f 84 39 06 00 00 48 89 5d 98 e8 ce ab a5 fa 49 8d bc 24 c8 00 00 00 48 b9 00 00 00 00 00 fc ff df 48 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 9a 0e 00 00 49 8b 9c 24 c8 00 00 00 48 be 00 00 RSP: 0018:ffff8880959defc8 EFLAGS: 00010202 RAX: 0000000000000019 RBX: ffff888081258a48 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffffffff86cab862 RDI: 00000000000000c8 RBP: ffff8880959df030 R08: ffff8880813d0200 R09: ffffed1015d05bc8 R10: ffffed1015d05bc7 R11: ffff8880ae82de3b R12: 0000000000000000 R13: 000000000000002c R14: 0000000000000000 R15: ffff888081258a48 FS: 000000000106a880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020001cc0 CR3: 0000000094a20000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tipc_sk_filter_rcv+0x182d/0x34f0 net/tipc/socket.c:2168 tipc_sk_enqueue net/tipc/socket.c:2254 [inline] tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 tipc_sk_mcast_rcv+0x724/0x1020 net/tipc/socket.c:1209 tipc_mcast_xmit+0x7fe/0x1200 net/tipc/bcast.c:410 tipc_sendmcast+0xb36/0xfc0 net/tipc/socket.c:820 __tipc_sendmsg+0x10df/0x18d0 net/tipc/socket.c:1358 tipc_sendmsg+0x53/0x80 net/tipc/socket.c:1291 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xdd/0x130 net/socket.c:661 ___sys_sendmsg+0x806/0x930 net/socket.c:2260 __sys_sendmsg+0x105/0x1d0 net/socket.c:2298 __do_sys_sendmsg net/socket.c:2307 [inline] __se_sys_sendmsg net/socket.c:2305 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2305 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4401c9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffd887fa9d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9 RDX: 0000000000000000 RSI: 0000000020002140 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a50 R13: 0000000000401ae0 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace ba79875754e1708f ]--- Reported-by: syzbot+be4bdf2cc3e85e952c50@syzkaller.appspotmail.com Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21tipc: fix use-after-free in tipc_sk_filter_rcvHoang Le1-1/+2
skb free-ed in: 1/ condition 1: tipc_sk_filter_rcv -> tipc_sk_proto_rcv 2/ condition 2: tipc_sk_filter_rcv -> tipc_group_filter_msg This leads to a "use-after-free" access in the next condition. We fix this by intializing the variable at declaration, then it is safe to check this variable to continue processing if condition matches. syzbot report: ================================================================== BUG: KASAN: use-after-free in tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 Read of size 4 at addr ffff88808ea58534 by task kworker/u4:0/7 CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0+ #61 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_send tipc_conn_send_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131 tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 tipc_sk_enqueue net/tipc/socket.c:2254 [inline] tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 tipc_topsrv_kern_evt+0x3b7/0x580 net/tipc/topsrv.c:610 tipc_conn_send_to_sock+0x43e/0x5f0 net/tipc/topsrv.c:283 tipc_conn_send_work+0x65/0x80 net/tipc/topsrv.c:303 process_one_work+0x98e/0x1790 kernel/workqueue.c:2269 worker_thread+0x98/0xe40 kernel/workqueue.c:2415 kthread+0x357/0x430 kernel/kthread.c:253 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 Reported-by: syzbot+e863893591cc7a622e40@syzkaller.appspotmail.com Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20ipv6: Add icmp_echo_ignore_anycast for ICMPv6Stephen Suryaputra2-2/+15
In addition to icmp_echo_ignore_multicast, there is a need to also prevent responding to pings to anycast addresses for security. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net: remove 'fallback' argument from dev->ndo_select_queue()Paolo Abeni3-12/+6
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20packet: rework packet_pick_tx_queue() to use common code selectionPaolo Abeni2-10/+10
Currently packet_pick_tx_queue() is the only caller of ndo_select_queue() using a fallback argument other than netdev_pick_tx. Leveraging rx queue, we can obtain a similar queue selection behavior using core helpers. After this change, ndo_select_queue() is always invoked with netdev_pick_tx() as fallback. We can change ndo_select_queue() signature in a followup patch, dropping an indirect call per transmitted packet in some scenarios (e.g. TCP syn and XDP generic xmit) This changes slightly how af packet queue selection happens when PACKET_QDISC_BYPASS is set. It's now more similar to plan dev_queue_xmit() tacking in account both XPS and TC mapping. v1 -> v2: - rebased after helper name change RFC -> v1: - initialize sender_cpu to the expected value Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net: dev: rename queue selection helpers.Paolo Abeni3-11/+11
With the following patches, we are going to use __netdev_pick_tx() in many modules. Rename it to netdev_pick_tx(), to make it clear is a public API. Also rename the existing netdev_pick_tx() to netdev_core_pick_tx(), to avoid name clashes. Suggested-by: Eric Dumazet <edumazet@google.com> Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net: bridge: use eth_broadcast_addr() to assign broadcast addressMao Wenan1-1/+1
This patch is to use eth_broadcast_addr() to assign broadcast address insetad of memset(). Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net/tls: Add support of AES128-CCM based ciphersVakul Garg2-29/+69
Added support for AES128-CCM based record encryption. AES128-CCM is similar to AES128-GCM. Both of them have same salt/iv/mac size. The notable difference between the two is that while invoking AES128-CCM operation, the salt||nonce (which is passed as IV) has to be prefixed with a hardcoded value '2'. Further, CCM implementation in kernel requires IV passed in crypto_aead_request() to be full '16' bytes. Therefore, the record structure 'struct tls_rec' has been modified to reserve '16' bytes for IV. This works for both GCM and CCM based cipher. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19ipv6: Add icmp_echo_ignore_multicast support for ICMPv6Stephen Suryaputra2-0/+13
IPv4 has icmp_echo_ignore_broadcast to prevent responding to broadcast pings. IPv6 needs a similar mechanism. v1->v2: - Remove NET_IPV6_ICMP_ECHO_IGNORE_MULTICAST. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19tcp: free request sock directly upon TFO or syncookies errorGuillaume Nault2-12/+10
Since the request socket is created locally, it'd make more sense to use reqsk_free() instead of reqsk_put() in TFO and syncookies' error path. However, tcp_get_cookie_sock() may set ->rsk_refcnt before freeing the socket; tcp_conn_request() may also have non-null ->rsk_refcnt because of tcp_try_fastopen(). In both cases 'req' hasn't been exposed to the outside world and is safe to free immediately, but that'd trigger the WARN_ON_ONCE in reqsk_free(). Define __reqsk_free() for these situations where we know nobody's referencing the socket, even though ->rsk_refcnt might be non-null. Now we can consolidate the error path of tcp_get_cookie_sock() and tcp_conn_request(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19datagram: Make __skb_datagram_iter staticYueHaibing1-4/+4
Fix sparse warning: net/core/datagram.c:411:5: warning: symbol '__skb_datagram_iter' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19tcp: add tcp_inet6_sk() helperEric Dumazet1-18/+26
TCP ipv6 fast path dereferences a pointer to get to the inet6 part of a tcp socket, but given the fixed memory placement, we can do better and avoid a possible cache line miss. This also reduces register pressure, since we let the compiler know about this memory placement. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19tipc: smooth change between replicast and broadcastHoang Le4-1/+184
Currently, a multicast stream may start out using replicast, because there are few destinations, and then it should ideally switch to L2/broadcast IGMP/multicast when the number of destinations grows beyond a certain limit. The opposite should happen when the number decreases below the limit. To eliminate the risk of message reordering caused by method change, a sending socket must stick to a previously selected method until it enters an idle period of 5 seconds. Means there is a 5 seconds pause in the traffic from the sender socket. If the sender never makes such a pause, the method will never change, and transmission may become very inefficient as the cluster grows. With this commit, we allow such a switch between replicast and broadcast without any need for a traffic pause. Solution is to send a dummy message with only the header, also with the SYN bit set, via broadcast or replicast. For the data message, the SYN bit is set and sending via replicast or broadcast (inverse method with dummy). Then, at receiving side any messages follow first SYN bit message (data or dummy message), they will be held in deferred queue until another pair (dummy or data message) arrived in other link. v2: reverse christmas tree declaration Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19tipc: introduce new capability flag for clusterHoang Le4-2/+27
As a preparation for introducing a smooth switching between replicast and broadcast method for multicast message, We have to introduce a new capability flag TIPC_MCAST_RBCTL to handle this new feature. During a cluster upgrade a node can come back with this new capabilities which also must be reflected in the cluster capabilities field. The new feature is only applicable if all node in the cluster supports this new capability. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19tipc: support broadcast/replicast configurable for bc-linkHoang Le4-5/+118
Currently, a multicast stream uses either broadcast or replicast as transmission method, based on the ratio between number of actual destinations nodes and cluster size. However, when an L2 interface (e.g., VXLAN) provides pseudo broadcast support, this becomes very inefficient, as it blindly replicates multicast packets to all cluster/subnet nodes, irrespective of whether they host actual target sockets or not. The TIPC multicast algorithm is able to distinguish real destination nodes from other nodes, and hence provides a smarter and more efficient method for transferring multicast messages than pseudo broadcast can do. Because of this, we now make it possible for users to force the broadcast link to permanently switch to using replicast, irrespective of which capabilities the bearer provides, or pretend to provide. Conversely, we also make it possible to force the broadcast link to always use true broadcast. While maybe less useful in deployed systems, this may at least be useful for testing the broadcast algorithm in small clusters. We retain the current AUTOSELECT ability, i.e., to let the broadcast link automatically select which algorithm to use, and to switch back and forth between broadcast and replicast as the ratio between destination node number and cluster size changes. This remains the default method. Furthermore, we make it possible to configure the threshold ratio for such switches. The default ratio is now set to 10%, down from 25% in the earlier implementation. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_STREAM_SCHEDULER sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_STREAM_SCHEDULER sockopt. Fixes: 7efba10d6bd2 ("sctp: add SCTP_FUTURE_ASOC and SCTP_CURRENT_ASSOC for SCTP_STREAM_SCHEDULER sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_EVENT sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_EVENT sockopt. Fixes: d251f05e3ba2 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_EVENT sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_ENABLE_STREAM_RESET sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_ENABLE_STREAM_RESET sockopt. Fixes: 99a62135e127 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_ENABLE_STREAM_RESET sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_PRINFO sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_DEFAULT_PRINFO sockopt. Fixes: 3a583059d187 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_PRINFO sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_AUTH_DEACTIVATE_KEY sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_AUTH_DEACTIVATE_KEY sockopt. Fixes: 2af66ff3edc7 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_DEACTIVATE_KEY sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_AUTH_DELETE_KEY sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_AUTH_DELETE_KEY sockopt. Fixes: 3adcc300603e ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_DELETE_KEY sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_AUTH_ACTIVE_KEY sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_AUTH_ACTIVE_KEY sockopt. Fixes: bf9fb6ad4f29 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_ACTIVE_KEY sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_AUTH_KEY sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_AUTH_KEY sockopt. Fixes: 7fb3be13a236 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_KEY sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_MAX_BURST sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_MAX_BURST sockopt. Fixes: e0651a0dc877 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_MAX_BURST sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_CONTEXT sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_CONTEXT sockopt. Fixes: 49b037acca8c ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_CONTEXT sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SNDINFO sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_DEFAULT_SNDINFO sockopt. Fixes: 92fc3bd928c9 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_SNDINFO sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DELAYED_SACK sockoptXin Long1-0/+3
A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_DELAYED_SACK sockopt. Fixes: 9c5829e1c49e ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DELAYED_SACK sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_SEND_PARAM sockoptMarcelo Ricardo Leitner1-0/+3
Currently if the user pass an invalid asoc_id to SCTP_DEFAULT_SEND_PARAM on a TCP-style socket, it will silently ignore the new parameters. That's because after not finding an asoc, it is checking asoc_id against the known values of CURRENT/FUTURE/ALL values and that fails to match. IOW, if the user supplies an invalid asoc id or not, it should either match the current asoc or the socket itself so that it will inherit these later. Fixes it by forcing asoc_id to SCTP_FUTURE_ASSOC in case it is a TCP-style socket without an asoc, so that the values get set on the socket. Fixes: 707e45b3dc5a ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_SEND_PARAM sockopt") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18sctp: not copy sctp_sock pd_lobby in sctp_copy_descendantXin Long1-2/+1
Now sctp_copy_descendant() copies pd_lobby from old sctp scok to new sctp sock. If sctp_sock_migrate() returns error, it will panic when releasing new sock and trying to purge pd_lobby due to the incorrect pointers in pd_lobby. [ 120.485116] kasan: CONFIG_KASAN_INLINE enabled [ 120.486270] kasan: GPF could be caused by NULL-ptr deref or user [ 120.509901] Call Trace: [ 120.510443] sctp_ulpevent_free+0x1e8/0x490 [sctp] [ 120.511438] sctp_queue_purge_ulpevents+0x97/0xe0 [sctp] [ 120.512535] sctp_close+0x13a/0x700 [sctp] [ 120.517483] inet_release+0xdc/0x1c0 [ 120.518215] __sock_release+0x1d2/0x2a0 [ 120.519025] sctp_do_peeloff+0x30f/0x3c0 [sctp] We fix it by not copying sctp_sock pd_lobby in sctp_copy_descendan(), and skb_queue_head_init() can also be removed in sctp_sock_migrate(). Reported-by: syzbot+85e0b422ff140b03672a@syzkaller.appspotmail.com Fixes: 89664c623617 ("sctp: sctp_sock_migrate() returns error if sctp_bind_addr_dup() fails") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18af_packet: fix the tx skb protocol in raw sockets with ETH_P_ALLYoshiki Komachi1-1/+2
I am using "protocol ip" filters in TC to manipulate TC flower classifiers, which are only available with "protocol ip". However, I faced an issue that packets sent via raw sockets with ETH_P_ALL did not match the ip filters even if they did satisfy the condition (e.g., DHCP offer from dhcpd). I have determined that the behavior was caused by an unexpected value stored in skb->protocol, namely, ETH_P_ALL instead of ETH_P_IP, when packets were sent via raw sockets with ETH_P_ALL set. IMHO, storing ETH_P_ALL in skb->protocol is not appropriate for packets sent via raw sockets because ETH_P_ALL is not a real ether type used on wire, but a virtual one. This patch fixes the tx protocol selection in cases of transmission via raw sockets created with ETH_P_ALL so that it asks the driver to extract protocol from the Ethernet header. Fixes: 75c65772c3 ("net/packet: Ask driver for protocol if not provided by user") Signed-off-by: Yoshiki Komachi <komachi.yoshiki@lab.ntt.co.jp> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18packets: Always register packet sk in the same orderMaxime Chevallier1-1/+1
When using fanouts with AF_PACKET, the demux functions such as fanout_demux_cpu will return an index in the fanout socket array, which corresponds to the selected socket. The ordering of this array depends on the order the sockets were added to a given fanout group, so for FANOUT_CPU this means sockets are bound to cpus in the order they are configured, which is OK. However, when stopping then restarting the interface these sockets are bound to, the sockets are reassigned to the fanout group in the reverse order, due to the fact that they were inserted at the head of the interface's AF_PACKET socket list. This means that traffic that was directed to the first socket in the fanout group is now directed to the last one after an interface restart. In the case of FANOUT_CPU, traffic from CPU0 will be directed to the socket that used to receive traffic from the last CPU after an interface restart. This commit introduces a helper to add a socket at the tail of a list, then uses it to register AF_PACKET sockets. Note that this changes the order in which sockets are listed in /proc and with sock_diag. Fixes: dc99f600698d ("packet: Add fanout support") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-18net: rose: fix a possible stack overflowEric Dumazet1-9/+12
rose_write_internal() uses a temp buffer of 100 bytes, but a manual inspection showed that given arbitrary input, rose_create_facilities() can fill up to 110 bytes. Lets use a tailroom of 256 bytes for peace of mind, and remove the bounce buffer : we can simply allocate a big enough skb and adjust its length as needed. syzbot report : BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline] BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline] BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116 Write of size 7 at addr ffff88808b1ffbef by task syz-executor.0/24854 CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x38/0x50 mm/kasan/common.c:131 memcpy include/linux/string.h:352 [inline] rose_create_facilities net/rose/rose_subr.c:521 [inline] rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116 rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826 __sys_connect+0x266/0x330 net/socket.c:1685 __do_sys_connect net/socket.c:1696 [inline] __se_sys_connect net/socket.c:1693 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1693 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458079 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f47b8d9dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079 RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000004 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f47b8d9e6d4 R13: 00000000004be4a4 R14: 00000000004ceca8 R15: 00000000ffffffff The buggy address belongs to the page: page:ffffea00022c7fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x1fffc0000000000() raw: 01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03 >ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3 ^ ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-17tipc: allow service ranges to be connect()'ed on RDM/DGRAMErik Hugne1-5/+15
We move the check that prevents connecting service ranges to after the RDM/DGRAM check, and move address sanity control to a separate function that also validates the service range. Fixes: 23998835be98 ("tipc: improve address sanity check in tipc_connect()") Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16net: tipc: fix a missing check for nla_nest_startKangjie Lu1-0/+2
nla_nest_start may fail. The fix check its status and returns -EMSGSIZE in case it fails. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-23/+23
Daniel Borkmann says: ==================== pull-request: bpf 2019-03-16 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a umem memory leak on cleanup in AF_XDP, from Björn. 2) Fix BTF to properly resolve forward-declared enums into their corresponding full enum definition types during deduplication, from Andrii. 3) Fix libbpf to reject invalid flags in xsk_socket__create(), from Magnus. 4) Fix accessing invalid pointer returned from bpf_tcp_sock() and bpf_sk_fullsock() after bpf_sk_release() was called, from Martin. 5) Fix generation of load/store DW instructions in PPC JIT, from Naveen. 6) Various fixes in BPF helper function documentation in bpf.h UAPI header used to bpf-helpers(7) man page, from Quentin. 7) Fix segfault in BPF test_progs when prog loading failed, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16net: tipc: fix a missing check of nla_nest_startKangjie Lu1-0/+3
nla_nest_start could fail and requires a check. The fix returns -EMSGSIZE if it fails. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16net: ncsi: fix a missing check for nla_nest_startKangjie Lu1-0/+4
nla_nest_start may fail and thus deserves a check. The fix returns -EMSGSIZE in case it fails. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16net: openvswitch: fix missing checks for nla_nest_startKangjie Lu1-0/+8
nla_nest_start may fail and thus deserves a check. The fix returns -EMSGSIZE when it fails. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16net: openvswitch: fix a NULL pointer dereferenceKangjie Lu1-0/+4
upcall is dereferenced even when genlmsg_put fails. The fix goto out to avoid the NULL pointer dereference in this case. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-16xsk: fix umem memory leak on cleanupBjörn Töpel1-18/+1
When the umem is cleaned up, the task that created it might already be gone. If the task was gone, the xdp_umem_release function did not free the pages member of struct xdp_umem. It turned out that the task lookup was not needed at all; The code was a left-over when we moved from task accounting to user accounting [1]. This patch fixes the memory leak by removing the task lookup logic completely. [1] https://lore.kernel.org/netdev/20180131135356.19134-3-bjorn.topel@gmail.com/ Link: https://lore.kernel.org/netdev/c1cb2ca8-6a14-3980-8672-f3de0bb38dfd@suse.cz/ Fixes: c0c77d8fb787 ("xsk: add user memory registration support sockopt") Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-15net: add documentation to socket.cPedro Tammela1-18/+259
Adds missing sphinx documentation to the socket.c's functions. Also fixes some whitespaces. I also changed the style of older documentation as an effort to have an uniform documentation style. Signed-off-by: Pedro Tammela <pctammela@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-15net: strparser: fix a missing check for create_singlethread_workqueueKangjie Lu1-0/+2
In case create_singlethread_workqueue fails, the check returns an error to callers to avoid potential NULL pointer dereferences. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-15sch_cake: Interpret fwmark parameter as a bitmaskToke Høiland-Jørgensen1-13/+12
We initially interpreted the fwmark parameter as a flag that simply turned on the feature, using the whole skb->mark field as the index into the CAKE tin_order array. However, it is quite common for different applications to use different parts of the mask field for their own purposes, each using a different mask. Support this use of subsets of the mark by interpreting the TCA_CAKE_FWMARK parameter as a bitmask to apply to the fwmark field when reading it. The result will be right-shifted by the number of unset lower bits of the mask before looking up the tin. In the original commit message we also failed to credit Felix Resch with originally suggesting the fwmark feature back in 2017; so the Suggested-By in this commit covers the whole fwmark feature. Fixes: 0b5c7efdfc6e ("sch_cake: Permit use of connmarks as tin classifiers") Suggested-by: Felix Resch <fuller@beif.de> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-15appletalk: Fix potential NULL pointer dereference in unregister_snap_clientYueHaibing2-11/+24
register_snap_client may return NULL, all the callers check it, but only print a warning. This will result in NULL pointer dereference in unregister_snap_client and other places. It has always been used like this since v2.6 Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds13-66/+134
Pull networking fixes from David Miller: "More fixes in the queue: 1) Netfilter nat can erroneously register the device notifier twice, fix from Florian Westphal. 2) Use after free in nf_tables, from Pablo Neira Ayuso. 3) Parallel update of steering rule fix in mlx5 river, from Eli Britstein. 4) RX processing panic in lan743x, fix from Bryan Whitehead. 5) Use before initialization of TCP_SKB_CB, fix from Christoph Paasch. 6) Fix locking in SRIOV mode of mlx4 driver, from Jack Morgenstein. 7) Fix TX stalls in lan743x due to mishandling of interrupt ACKing modes, from Bryan Whitehead. 8) Fix infoleak in l2tp_ip6_recvmsg(), from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) pptp: dst_release sk_dst_cache in pptp_sock_destruct MAINTAINERS: GENET & SYSTEMPORT: Add internal Broadcom list l2tp: fix infoleak in l2tp_ip6_recvmsg() net/tls: Inform user space about send buffer availability net_sched: return correct value for *notify* functions lan743x: Fix TX Stall Issue net/mlx4_core: Fix qp mtt size calculation net/mlx4_core: Fix locking in SRIOV mode when switching between events and polling net/mlx4_core: Fix reset flow when in command polling mode mlxsw: minimal: Initialize base_mac mlxsw: core: Prevent duplication during QSFP module initialization net: dwmac-sun8i: fix a missing check of of_get_phy_mode net: sh_eth: fix a missing check of of_get_phy_mode net: 8390: fix potential NULL pointer dereferences net: fujitsu: fix a potential NULL pointer dereference net: qlogic: fix a potential NULL pointer dereference isdn: hfcpci: fix potential NULL pointer dereference Documentation: devicetree: add a new optional property for port mac address net: rocker: fix a potential NULL pointer dereference net: qlge: fix a potential NULL pointer dereference ...
2019-03-13l2tp: fix infoleak in l2tp_ip6_recvmsg()Eric Dumazet1-3/+1
Back in 2013 Hannes took care of most of such leaks in commit bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") But the bug in l2tp_ip6_recvmsg() has not been fixed. syzbot report : BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 CPU: 1 PID: 10996 Comm: syz-executor362 Not tainted 5.0.0+ #11 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600 kmsan_internal_check_memory+0x9f4/0xb10 mm/kmsan/kmsan.c:694 kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601 _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 copy_to_user include/linux/uaccess.h:174 [inline] move_addr_to_user+0x311/0x570 net/socket.c:227 ___sys_recvmsg+0xb65/0x1310 net/socket.c:2283 do_recvmmsg+0x646/0x10c0 net/socket.c:2390 __sys_recvmmsg net/socket.c:2469 [inline] __do_sys_recvmmsg net/socket.c:2492 [inline] __se_sys_recvmmsg+0x1d1/0x350 net/socket.c:2485 __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2485 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x445819 Code: e8 6c b6 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f64453eddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006dac28 RCX: 0000000000445819 RDX: 0000000000000005 RSI: 0000000020002f80 RDI: 0000000000000003 RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac2c R13: 00007ffeba8f87af R14: 00007f64453ee9c0 R15: 20c49ba5e353f7cf Local variable description: ----addr@___sys_recvmsg Variable was created at: ___sys_recvmsg+0xf6/0x1310 net/socket.c:2244 do_recvmmsg+0x646/0x10c0 net/socket.c:2390 Bytes 0-31 of 32 are uninitialized Memory access of size 32 starts at ffff8880ae62fbb0 Data copied to user address 0000000020000000 Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>