aboutsummaryrefslogtreecommitdiffstats
path: root/net/core (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-02-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-23/+25
The conflict was an interaction between a bug fix in the netvsc driver in 'net' and an optimization of the RX path in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: pending_confirm is not used anymoreJulian Anastasov1-1/+0
When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. As last step, we can remove the pending_confirm flag. Reported-by: YueHaibing <yuehaibing@huawei.com> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.") Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07sock: add sk_dst_pending_confirm flagJulian Anastasov1-0/+2
Add new sock flag to allow sockets to confirm neighbour. When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. As not all call paths lock the socket use full word for the flag. Add sk_dst_confirm as replacement for dst_confirm when called for received packets. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07udp: properly cope with csum errorsEric Dumazet1-2/+6
Dmitry reported that UDP sockets being destroyed would trigger the WARN_ON(atomic_read(&sk->sk_rmem_alloc)); in inet_sock_destruct() It turns out we do not properly destroy skb(s) that have wrong UDP checksum. Thanks again to syzkaller team. Fixes : 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: remove ndo_neigh_{construct, destroy} from stacked devicesIdo Schimmel1-44/+0
In commit 18bfb924f000 ("net: introduce default neigh_construct/destroy ndo calls for L2 upper devices") we added these ndos to stacked devices such as team and bond, so that calls will be propagated to mlxsw. However, previous commit removed the reliance on these ndos and no new users of these ndos have appeared since above mentioned commit. We can therefore safely remove this dead code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05net: remove __napi_complete()Eric Dumazet1-21/+3
All __napi_complete() callers have been converted to use the more standard napi_complete_done(), we can now remove this NAPI method for good. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03net: skb_needs_check() accepts CHECKSUM_NONE for txEric Dumazet1-3/+4
My recent change missed fact that UFO would perform a complete UDP checksum before segmenting in frags. In this case skb->ip_summed is set to CHECKSUM_NONE. We need to add this valid case to skb_needs_check() Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03net: remove support for per driver ndo_busy_poll()Eric Dumazet2-16/+0
We added generic support for busy polling in NAPI layer in linux-4.5 No network driver uses ndo_busy_poll() anymore, we can get rid of the pointer in struct net_device_ops, and its use in sk_busy_loop() Saves NETIF_F_BUSY_POLL features bit. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-1/+1
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Stash ctinfo 3-bit field into pointer to nf_conntrack object from sk_buff so we only access one single cacheline in the conntrack hotpath. Patchset from Florian Westphal. 2) Don't leak pointer to internal structures when exporting x_tables ruleset back to userspace, from Willem DeBruijn. This includes new helper functions to copy data to userspace such as xt_data_to_user() as well as conversions of our ip_tables, ip6_tables and arp_tables clients to use it. Not surprinsingly, ebtables requires an ad-hoc update. There is also a new field in x_tables extensions to indicate the amount of bytes that we copy to userspace. 3) Add nf_log_all_netns sysctl: This new knob allows you to enable logging via nf_log infrastructure for all existing netnamespaces. Given the effort to provide pernet syslog has been discontinued, let's provide a way to restore logging using netfilter kernel logging facilities in trusted environments. Patch from Michal Kubecek. 4) Validate SCTP checksum from conntrack helper, from Davide Caratti. 5) Merge UDPlite conntrack and NAT helpers into UDP, this was mostly a copy&paste from the original helper, from Florian Westphal. 6) Reset netfilter state when duplicating packets, also from Florian. 7) Remove unnecessary check for broadcast in IPv6 in pkttype match and nft_meta, from Liping Zhang. 8) Add missing code to deal with loopback packets from nft_meta when used by the netdev family, also from Liping. 9) Several cleanups on nf_tables, one to remove unnecessary check from the netlink control plane path to add table, set and stateful objects and code consolidation when unregister chain hooks, from Gao Feng. 10) Fix harmless reference counter underflow in IPVS that, however, results in problems with the introduction of the new refcount_t type, from David Windsor. 11) Enable LIBCRC32C from nf_ct_sctp instead of nf_nat_sctp, from Davide Caratti. 12) Missing documentation on nf_tables uapi header, from Liping Zhang. 13) Use rb_entry() helper in xt_connlimit, from Geliang Tang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03net: use a work queue to defer net_disable_timestamp() workEric Dumazet1-18/+13
Dmitry reported a warning [1] showing that we were calling net_disable_timestamp() -> static_key_slow_dec() from a non process context. Grabbing a mutex while holding a spinlock or rcu_read_lock() is not allowed. As Cong suggested, we now use a work queue. It is possible netstamp_clear() exits while netstamp_needed_deferred is not zero, but it is probably not worth trying to do better than that. netstamp_needed_deferred atomic tracks the exact number of deferred decrements. [1] [ INFO: suspicious RCU usage. ] 4.10.0-rc5+ #192 Not tainted ------------------------------- ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 2 locks held by syz-executor14/23111: #0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>] lock_sock include/net/sock.h:1454 [inline] #0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>] rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919 #1: (rcu_read_lock){......}, at: [<ffffffff83ae2678>] nf_hook include/linux/netfilter.h:201 [inline] #1: (rcu_read_lock){......}, at: [<ffffffff83ae2678>] __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160 stack backtrace: CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452 rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline] ___might_sleep+0x560/0x650 kernel/sched/core.c:7748 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sock_wfree+0xae/0x120 net/core/sock.c:1645 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 inet_frag_put include/net/inet_frag.h:133 [inline] nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline] nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 nf_hook include/linux/netfilter.h:212 [inline] __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 rawv6_push_pending_frames net/ipv6/raw.c:613 [inline] rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 [inline] sock_sendmsg+0xca/0x110 net/socket.c:645 sock_write_iter+0x326/0x600 net/socket.c:848 do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695 do_readv_writev+0x42c/0x9b0 fs/read_write.c:872 vfs_writev+0x87/0xc0 fs/read_write.c:911 do_writev+0x110/0x2c0 fs/read_write.c:944 SYSC_writev fs/read_write.c:1017 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1014 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: 0033:0x445559 RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559 RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005 RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000 R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:752 in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14 INFO: lockdep is turned off. CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sock_wfree+0xae/0x120 net/core/sock.c:1645 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 inet_frag_put include/net/inet_frag.h:133 [inline] nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline] nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 nf_hook include/linux/netfilter.h:212 [inline] __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 rawv6_push_pending_frames net/ipv6/raw.c:613 [inline] rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 [inline] sock_sendmsg+0xca/0x110 net/socket.c:645 sock_write_iter+0x326/0x600 net/socket.c:848 do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695 do_readv_writev+0x42c/0x9b0 fs/read_write.c:872 vfs_writev+0x87/0xc0 fs/read_write.c:911 do_writev+0x110/0x2c0 fs/read_write.c:944 SYSC_writev fs/read_write.c:1017 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1014 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: 0033:0x445559 Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context") Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03ethtool: do not vzalloc(0) on registers dumpStanislaw Gruszka1-3/+6
If ->get_regs_len() callback return 0, we allocate 0 bytes of memory, what print ugly warning in dmesg, which can be found further below. This happen on mac80211 devices where ieee80211_get_regs_len() just return 0 and driver only fills ethtool_regs structure and actually do not provide any dump. However I assume this can happen on other drivers i.e. when for some devices driver provide regs dump and for others do not. Hence preventing to to print warning in ethtool code seems to be reasonable. ethtool: vmalloc: allocation failure: 0 bytes, mode:0x24080c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO) <snip> Call Trace: [<ffffffff813bde47>] dump_stack+0x63/0x8c [<ffffffff811b0a1f>] warn_alloc+0x13f/0x170 [<ffffffff811f0476>] __vmalloc_node_range+0x1e6/0x2c0 [<ffffffff811f0874>] vzalloc+0x54/0x60 [<ffffffff8169986c>] dev_ethtool+0xb4c/0x1b30 [<ffffffff816adbb1>] dev_ioctl+0x181/0x520 [<ffffffff816714d2>] sock_do_ioctl+0x42/0x50 <snip> Mem-Info: active_anon:435809 inactive_anon:173951 isolated_anon:0 active_file:835822 inactive_file:196932 isolated_file:0 unevictable:0 dirty:8 writeback:0 unstable:0 slab_reclaimable:157732 slab_unreclaimable:10022 mapped:83042 shmem:306356 pagetables:9507 bounce:0 free:130041 free_pcp:1080 free_cma:0 Node 0 active_anon:1743236kB inactive_anon:695804kB active_file:3343288kB inactive_file:787728kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:332168kB dirty:32kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1225424kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no Node 0 DMA free:15900kB min:136kB low:168kB high:200kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15900kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 3187 7643 7643 Node 0 DMA32 free:419732kB min:28124kB low:35152kB high:42180kB active_anon:541180kB inactive_anon:248988kB active_file:1466388kB inactive_file:389632kB unevictable:0kB writepending:0kB present:3370280kB managed:3290932kB mlocked:0kB slab_reclaimable:217184kB slab_unreclaimable:4180kB kernel_stack:160kB pagetables:984kB bounce:0kB free_pcp:2236kB local_pcp:660kB free_cma:0kB lowmem_reserve[]: 0 0 4456 4456 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: add LINUX_MIB_PFMEMALLOCDROP counterEric Dumazet1-2/+3
Debugging issues caused by pfmemalloc is often tedious. Add a new SNMP counter to more easily diagnose these problems. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Acked-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: remove useless pfmemalloc settingEric Dumazet1-1/+0
When __alloc_skb() allocates an skb from fast clone cache, setting pfmemalloc on the clone is not needed. Clone will be properly initialized later at skb_clone() time, including pfmemalloc field, as it is included in the headers_start/headers_end section which is fully copied. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02skbuff: add and use skb_nfct helperFlorian Westphal1-1/+1
Followup patch renames skb->nfct and changes its type so add a helper to avoid intrusive rename change later. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-01net: reduce skb_warn_bad_offload() noiseEric Dumazet1-3/+9
Dmitry reported warnings occurring in __skb_gso_segment() [1] All SKB_GSO_DODGY producers can allow user space to feed packets that trigger the current check. We could prevent them from doing so, rejecting packets, but this might add regressions to existing programs. It turns out our SKB_GSO_DODGY handlers properly set up checksum information that is needed anyway when packets needs to be segmented. By checking again skb_needs_check() after skb_mac_gso_segment(), we should remove these pesky warnings, at a very minor cost. With help from Willem de Bruijn [1] WARNING: CPU: 1 PID: 6768 at net/core/dev.c:2439 skb_warn_bad_offload+0x2af/0x390 net/core/dev.c:2434 lo: caps=(0x000000a2803b7c69, 0x0000000000000000) len=138 data_len=0 gso_size=15883 gso_type=4 ip_summed=0 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 6768 Comm: syz-executor1 Not tainted 4.9.0 #5 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ffff8801c063ecd8 ffffffff82346bdf ffffffff00000001 1ffff100380c7d2e ffffed00380c7d26 0000000041b58ab3 ffffffff84b37e38 ffffffff823468f1 ffffffff84820740 ffffffff84f289c0 dffffc0000000000 ffff8801c063ee20 Call Trace: [<ffffffff82346bdf>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffffff82346bdf>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 [<ffffffff81827e34>] panic+0x1fb/0x412 kernel/panic.c:179 [<ffffffff8141f704>] __warn+0x1c4/0x1e0 kernel/panic.c:542 [<ffffffff8141f7e5>] warn_slowpath_fmt+0xc5/0x100 kernel/panic.c:565 [<ffffffff8356cbaf>] skb_warn_bad_offload+0x2af/0x390 net/core/dev.c:2434 [<ffffffff83585cd2>] __skb_gso_segment+0x482/0x780 net/core/dev.c:2706 [<ffffffff83586f19>] skb_gso_segment include/linux/netdevice.h:3985 [inline] [<ffffffff83586f19>] validate_xmit_skb+0x5c9/0xc20 net/core/dev.c:2969 [<ffffffff835892bb>] __dev_queue_xmit+0xe6b/0x1e70 net/core/dev.c:3383 [<ffffffff8358a2d7>] dev_queue_xmit+0x17/0x20 net/core/dev.c:3424 [<ffffffff83ad161d>] packet_snd net/packet/af_packet.c:2930 [inline] [<ffffffff83ad161d>] packet_sendmsg+0x32ed/0x4d30 net/packet/af_packet.c:2955 [<ffffffff834f0aaa>] sock_sendmsg_nosec net/socket.c:621 [inline] [<ffffffff834f0aaa>] sock_sendmsg+0xca/0x110 net/socket.c:631 [<ffffffff834f329a>] ___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1954 [<ffffffff834f5e58>] __sys_sendmsg+0x138/0x300 net/socket.c:1988 [<ffffffff834f604d>] SYSC_sendmsg net/socket.c:1999 [inline] [<ffffffff834f604d>] SyS_sendmsg+0x2d/0x50 net/socket.c:1995 [<ffffffff84371941>] entry_SYSCALL_64_fastpath+0x1f/0xc2 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01rtnetlink: Handle IFLA_MASTER parameter when processing rtnl_newlinkTheuns Verwoerd1-1/+6
Allow a master interface to be specified as one of the parameters when creating a new interface via rtnl_newlink. Previously this would require invoking interface creation, waiting for it to complete, and then separately binding that new interface to a master. In particular, this is used when creating a macvlan child interface for VRRP in a VRF configuration, allowing the interface creator to specify directly what master interface should be inherited by the child, without having to deal with asynchronous complications and potential race conditions. Signed-off-by: Theuns Verwoerd <theuns.verwoerd@alliedtelesis.co.nz> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller1-0/+2
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-02-01 1) Some typo fixes, from Alexander Alemayhu. 2) Don't acquire state lock in get_mtu functions. The only rece against a dead state does not matter. From Florian Westphal. 3) Remove xfrm4_state_fini, it is unused for more than 10 years. From Florian Westphal. 4) Various rcu usage improvements. From Florian Westphal. 5) Properly handle crypto arrors in ah4/ah6. From Gilad Ben-Yossef. 6) Try to avoid skb linearization in esp4 and esp6. 7) The esp trailer is now set up in different places, add a helper for this. 8) With the upcomming usage of gro_cells in IPsec, a gro merged skb can have a secpath. Drop it before freeing or reusing the skb. 9) Add a xfrm dummy network device for napi. With this we can use gro_cells from within xfrm, it allows IPsec GRO without impact on the generic networking code. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: ethtool: convert large order kmalloc allocations to vzallocAlexei Starovoitov1-17/+22
under memory pressure 'ethtool -S' command may warn: [ 2374.385195] ethtool: page allocation failure: order:4, mode:0x242c0c0 [ 2374.405573] CPU: 12 PID: 40211 Comm: ethtool Not tainted [ 2374.423071] Call Trace: [ 2374.423076] [<ffffffff8148cb29>] dump_stack+0x4d/0x64 [ 2374.423080] [<ffffffff811667cb>] warn_alloc_failed+0xeb/0x150 [ 2374.423082] [<ffffffff81169cd3>] ? __alloc_pages_direct_compact+0x43/0xf0 [ 2374.423084] [<ffffffff8116a25c>] __alloc_pages_nodemask+0x4dc/0xbf0 [ 2374.423091] [<ffffffffa0023dc2>] ? cmd_exec+0x722/0xcd0 [mlx5_core] [ 2374.423095] [<ffffffff811b3dcc>] alloc_pages_current+0x8c/0x110 [ 2374.423097] [<ffffffff81168859>] alloc_kmem_pages+0x19/0x90 [ 2374.423099] [<ffffffff81186e5e>] kmalloc_order_trace+0x2e/0xe0 [ 2374.423101] [<ffffffff811c0084>] __kmalloc+0x204/0x220 [ 2374.423105] [<ffffffff816c269e>] dev_ethtool+0xe4e/0x1f80 [ 2374.423106] [<ffffffff816b967e>] ? dev_get_by_name_rcu+0x5e/0x80 [ 2374.423108] [<ffffffff816d6926>] dev_ioctl+0x156/0x560 [ 2374.423111] [<ffffffff811d4c68>] ? mem_cgroup_commit_charge+0x78/0x3c0 [ 2374.423117] [<ffffffff8169d542>] sock_do_ioctl+0x42/0x50 [ 2374.423119] [<ffffffff8169d9c3>] sock_ioctl+0x1b3/0x250 [ 2374.423121] [<ffffffff811f0f42>] do_vfs_ioctl+0x92/0x580 [ 2374.423123] [<ffffffff8100222b>] ? do_audit_syscall_entry+0x4b/0x70 [ 2374.423124] [<ffffffff8100287c>] ? syscall_trace_enter_phase1+0xfc/0x120 [ 2374.423126] [<ffffffff811f14a9>] SyS_ioctl+0x79/0x90 [ 2374.423127] [<ffffffff81002bb0>] do_syscall_64+0x50/0xa0 [ 2374.423129] [<ffffffff817e19bc>] entry_SYSCALL64_slow_path+0x25/0x25 ~1160 mlx5 counters ~= order 4 allocation which is unlikely to succeed under memory pressure. Convert them to vzalloc() as ethtool_get_regs() does. Also take care of drivers without counters similar to commit 67ae7cf1eeda ("ethtool: Allow zero-length register dumps again") and reduce warn_on to warn_on_once. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30lwtunnel: remove device arg to lwtunnel_build_stateDavid Ahern2-3/+3
Nothing about lwt state requires a device reference, so remove the input argument. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30net: Drop secpath on free after gro merge.Steffen Klassert1-0/+2
With a followup patch, a gro merged skb can have a secpath. So drop it before freeing or reusing the skb. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller4-9/+64
Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27net: adjust skb->truesize in pskb_expand_head()Eric Dumazet1-3/+11
Slava Shwartsman reported a warning in skb_try_coalesce(), when we detect skb->truesize is completely wrong. In his case, issue came from IPv6 reassembly coping with malicious datagrams, that forced various pskb_may_pull() to reallocate a bigger skb->head than the one allocated by NIC driver before entering GRO layer. Current code does not change skb->truesize, leaving this burden to callers if they care enough. Blindly changing skb->truesize in pskb_expand_head() is not easy, as some producers might track skb->truesize, for example in xmit path for back pressure feedback (sk->sk_wmem_alloc) We can detect the cases where it should be safe to change skb->truesize : 1) skb is not attached to a socket. 2) If it is attached to a socket, destructor is sock_edemux() My audit gave only two callers doing their own skb->truesize manipulation. I had to remove skb parameter in sock_edemux macro when CONFIG_INET is not set to avoid a compile error. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Slava Shwartsman <slavash@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24lwtunnel: Fix oops on state free after encap module unloadRobert Shearman1-1/+5
When attempting to free lwtunnel state after the module for the encap has been unloaded an oops occurs: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: lwtstate_free+0x18/0x40 [..] task: ffff88003e372380 task.stack: ffffc900001fc000 RIP: 0010:lwtstate_free+0x18/0x40 RSP: 0018:ffff88003fd83e88 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88002bbb3380 RCX: ffff88000c91a300 [..] Call Trace: <IRQ> free_fib_info_rcu+0x195/0x1a0 ? rt_fibinfo_free+0x50/0x50 rcu_process_callbacks+0x2d3/0x850 ? rcu_process_callbacks+0x296/0x850 __do_softirq+0xe4/0x4cb irq_exit+0xb0/0xc0 smp_apic_timer_interrupt+0x3d/0x50 apic_timer_interrupt+0x93/0xa0 [..] Code: e8 6e c6 fc ff 89 d8 5b 5d c3 bb de ff ff ff eb f4 66 90 66 66 66 66 90 55 48 89 e5 53 0f b7 07 48 89 fb 48 8b 04 c5 00 81 d5 81 <48> 8b 40 08 48 85 c0 74 13 ff d0 48 8d 7b 20 be 20 00 00 00 e8 The problem is after the module for the encap can be unloaded the corresponding ops is removed and is thus NULL here. Modules implementing lwtunnel ops should not be allowed to unload while there is state alive using those ops, so grab the module reference for the ops on creating lwtunnel state and of course release the reference when freeing the state. Fixes: 1104d9ba443a ("lwtunnel: Add destroy state operation") Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24net: Specify the owning module for lwtunnel opsRobert Shearman1-0/+1
Modules implementing lwtunnel ops should not be allowed to unload while there is state alive using those ops, so specify the owning module for all lwtunnel ops. Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24bpf: allow option for setting bpf_l4_csum_replace from scratchDaniel Borkmann1-3/+4
When programs need to calculate the csum from scratch for small UDP packets and use bpf_l4_csum_replace() to feed the result from helpers like bpf_csum_diff(), then we need a flag besides BPF_F_MARK_MANGLED_0 that would ignore the case of current csum being 0, and which would still allow for the helper to set the csum and transform when needed to CSUM_MANGLED_0. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24bpf: enable load bytes helper for filter/reuseport progsDaniel Borkmann1-15/+26
BPF_PROG_TYPE_SOCKET_FILTER are used in various facilities such as for SO_REUSEPORT and packet fanout demuxing, packet filtering, kcm, etc, and yet the only facility they can use is BPF_LD with {BPF_ABS, BPF_IND} for single byte/half/word access. Direct packet access is only restricted to tc programs right now, but we can still facilitate usage by allowing skb_load_bytes() helper added back then in 05c74e5e53f6 ("bpf: add bpf_skb_load_bytes helper") that calls skb_header_pointer() similarly to bpf_load_pointer(), but for stack buffers with larger access size. Name the previous sk_filter_func_proto() as bpf_base_func_proto() since this is used everywhere else as well, similarly for the ctx converter, that is, bpf_convert_ctx_access(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24bpf: simplify __is_valid_access test on cbDaniel Borkmann1-13/+2
The __is_valid_access() test for cb[] from 62c7989b24db ("bpf: allow b/h/w/dw access for bpf's cb in ctx") was done unnecessarily complex, we can just simplify it the same way as recent fix from 2d071c643f1c ("bpf, trace: make ctx access checks more robust") did. Overflow can never happen as size is 1/2/4/8 depending on access. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20net: remove duplicate code.Mahesh Bandewar1-3/+1
netdev_rx_handler_register() checks to see if the handler is already busy which was recently separated into netdev_is_rx_handler_busy(). So use the same function inside register() to avoid code duplication. Essentially this change should be a no-op Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20device: Implement a bus agnostic dev_num_vf routinePhil Sutter1-2/+1
Now that pci_bus_type has num_vf callback set, dev_num_vf can be implemented in a bus type independent way and the check for whether a PCI device is being handled in rtnetlink can be dropped. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18lwtunnel: fix autoload of lwt modulesDavid Ahern1-6/+56
Trying to add an mpls encap route when the MPLS modules are not loaded hangs. For example: CONFIG_MPLS=y CONFIG_NET_MPLS_GSO=m CONFIG_MPLS_ROUTING=m CONFIG_MPLS_IPTUNNEL=m $ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2 The ip command hangs: root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2 $ cat /proc/880/stack [<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134 [<ffffffff81065efc>] __request_module+0x27b/0x30a [<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178 [<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4 [<ffffffff814ae451>] fib_table_insert+0x90/0x41f [<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52 ... modprobe is trying to load rtnl-lwt-MPLS: root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS and it hangs after loading mpls_router: $ cat /proc/881/stack [<ffffffff81441537>] rtnl_lock+0x12/0x14 [<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179 [<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router] [<ffffffff81000471>] do_one_initcall+0x8e/0x13f [<ffffffff81119961>] do_init_module+0x5a/0x1e5 [<ffffffff810bd070>] load_module+0x13bd/0x17d6 ... The problem is that lwtunnel_build_state is called with rtnl lock held preventing mpls_init from registering. Given the potential references held by the time lwtunnel_build_state it can not drop the rtnl lock to the load module. So, extract the module loading code from lwtunnel_build_state into a new function to validate the encap type. The new function is called while converting the user request into a fib_config which is well before any table, device or fib entries are examined. Fixes: 745041e2aaf1 ("lwtunnel: autoload of lwt modules") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18net: fix harmonize_features() vs NETIF_F_HIGHDMAEric Dumazet1-2/+2
Ashizuka reported a highmem oddity and sent a patch for freescale fec driver. But the problem root cause is that core networking stack must ensure no skb with highmem fragment is ever sent through a device that does not assert NETIF_F_HIGHDMA in its features. We need to call illegal_highdma() from harmonize_features() regardless of CSUM checks. Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin Shelar <pshelar@ovn.org> Reported-by: "Ashizuka, Yuusuke" <ashiduka@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18net: ethtool: Initialize buffer when querying device channel settingsEran Ben Elisha1-1/+1
Ethtool channels respond struct was uninitialized when querying device channel boundaries settings. As a result, unreported fields by the driver hold garbage. This may cause sending unsupported params to driver. Fixes: 8bf368620486 ('ethtool: ensure channel counts are within bounds ...') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> CC: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17net: AF-specific RTM_GETSTATS attributesRobert Shearman1-0/+50
Add the functionality for including address-family-specific per-link stats in RTM_GETSTATS messages. This is done through adding a new IFLA_STATS_AF_SPEC attribute under which address family attributes are nested and then the AF-specific attributes can be further nested. This follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the advantage of presenting an easily extended hierarchy. The rtnl_af_ops structure is extended to provide AFs with the opportunity to fill and provide the size of their stats attributes. One alternative would have been to provide AFs with the ability to add attributes directly into the RTM_GETSTATS message without a nested hierarchy. I discounted this approach as it increases the rate at which the 32 attribute number space is used up and it makes implementation a little more tricky for stats dump resuming (at the moment the order in which attributes are added to the message has to match the numeric order of the attributes). Another alternative would have been to register per-AF RTM_GETSTATS handlers. I discounted this approach as I perceived a common use-case to be getting all the stats for an interface and this approach would necessitate multiple requests/dumps to retrieve them all. Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-16flow dissector: check if arp_eth is null rather than arpColin Ian King1-2/+2
arp is being checked instead of arp_eth to see if the call to __skb_header_pointer failed. Fix this by checking arp_eth is null instead of arp. Also fix to use length hlen rather than hlen - sizeof(_arp); thanks to Eric Dumazet for spotting this latter issue. CoverityScan CID#1396428 ("Logically dead code") on 2nd arp comparison (which should be arp_eth instead). Fixes: commit 55733350e5e8b70c5 ("flow disector: ARP support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12secure_seq: fix sparse errorsEric Dumazet1-1/+3
Fixes following warnings : net/core/secure_seq.c:125:28: warning: incorrect type in argument 1 (different base types) net/core/secure_seq.c:125:28: expected unsigned int const [unsigned] [usertype] a net/core/secure_seq.c:125:28: got restricted __be32 [usertype] saddr net/core/secure_seq.c:125:35: warning: incorrect type in argument 2 (different base types) net/core/secure_seq.c:125:35: expected unsigned int const [unsigned] [usertype] b net/core/secure_seq.c:125:35: got restricted __be32 [usertype] daddr net/core/secure_seq.c:125:43: warning: cast from restricted __be16 net/core/secure_seq.c:125:61: warning: restricted __be16 degrades to integer Fixes: 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12lwt_bpf: bpf_lwt_prog_cmp() can be staticWei Yongjun1-1/+1
Fixes the following sparse warning: net/core/lwt_bpf.c:355:5: warning: symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12bpf: allow b/h/w/dw access for bpf's cb in ctxDaniel Borkmann1-8/+33
When structs are used to store temporary state in cb[] buffer that is used with programs and among tail calls, then the generated code will not always access the buffer in bpf_w chunks. We can ease programming of it and let this act more natural by allowing for aligned b/h/w/dw sized access for cb[] ctx member. Various test cases are attached as well for the selftest suite. Potentially, this can also be reused for other program types to pass data around. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12bpf: pass original insn directly to convert_ctx_accessDaniel Borkmann1-65/+74
Currently, when calling convert_ctx_access() callback for the various program types, we pass in insn->dst_reg, insn->src_reg, insn->off from the original instruction. This information is needed to rewrite the instruction that is based on the user ctx structure into a kernel representation for the ctx. As we'd like to allow access size beyond just BPF_W, we'd need also insn->code for that in order to decode the original access size. Given that, lets just pass insn directly to the convert_ctx_access() callback and work on that to not clutter the callback with even more arguments we need to pass when everything is already contained in insn. So lets go through that once, no functional change. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12net: fix AF_SMC related typoUrsula Braun1-1/+1
When introducing the new socket family AF_SMC in commit ac7138746e14 ("smc: establish new socket family"), a typo in af_family_clock_key_strings has slipped in. This patch repairs it. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Fixes: ac7138746e14 ("smc: establish new socket family") Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12net: core: Make netif_wake_subqueue a wrapperFlorian Fainelli1-22/+0
netif_wake_subqueue() is duplicating the same thing that netif_tx_wake_queue() does, so make it call it directly after looking up the queue from the index. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller4-10/+12
Two AF_* families adding entries to the lockdep tables at the same time. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-4/+4
Merge fixes from Andrew Morton: "27 fixes. There are three patches that aren't actually fixes. They're simple function renamings which are nice-to-have in mainline as ongoing net development depends on them." * akpm: (27 commits) timerfd: export defines to userspace mm/hugetlb.c: fix reservation race when freeing surplus pages mm/slab.c: fix SLAB freelist randomization duplicate entries zram: support BDI_CAP_STABLE_WRITES zram: revalidate disk under init_lock mm: support anonymous stable page mm: add documentation for page fragment APIs mm: rename __page_frag functions to __page_frag_cache, drop order from drain mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free mm, memcg: fix the active list aging for lowmem requests when memcg is enabled mm: don't dereference struct page fields of invalid pages mailmap: add codeaurora.org names for nameless email commits signal: protect SIGNAL_UNKILLABLE from unintentional clearing. mm: pmd dirty emulation in page fault handler ipc/sem.c: fix incorrect sem_lock pairing lib/Kconfig.debug: fix frv build failure mm: get rid of __GFP_OTHER_NODE mm: fix remote numa hits statistics mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done} ocfs2: fix crash caused by stale lvb with fsdlm plugin ...
2017-01-11flow disector: ARP supportSimon Horman1-0/+57
Allow dissection of (R)ARP operation hardware and protocol addresses for Ethernet hardware and IPv4 protocol addresses. There are currently no users of FLOW_DISSECTOR_KEY_ARP. A follow-up patch will allow FLOW_DISSECTOR_KEY_ARP to be used by the flower classifier. Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11gro: use min_t() in skb_gro_reset_offset()Eric Dumazet1-2/+3
On 32bit arches, (skb->end - skb->data) is not 'unsigned int', so we shall use min_t() instead of min() to avoid a compiler error. Fixes: 1272ce87fa01 ("gro: Enter slow-path if there is no tailroom") Reported-by: kernel test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_freeAlexander Duyck1-4/+4
Patch series "Page fragment updates", v4. This patch series takes care of a few cleanups for the page fragments API. First we do some renames so that things are much more consistent. First we move the page_frag_ portion of the name to the front of the functions names. Secondly we split out the cache specific functions from the other page fragment functions by adding the word "cache" to the name. Finally I added a bit of documentation that will hopefully help to explain some of this. I plan to revisit this later as we get things more ironed out in the near future with the changes planned for the DMA setup to support eXpress Data Path. This patch (of 3): This patch renames the page frag functions to be more consistent with other APIs. Specifically we place the name page_frag first in the name and then have either an alloc or free call name that we append as the suffix. This makes it a bit clearer in terms of naming. In addition we drop the leading double underscores since we are technically no longer a backing interface and instead the front end that is called from the networking APIs. Link: http://lkml.kernel.org/r/20170104023854.13451.67390.stgit@localhost.localdomain Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10gro: Enter slow-path if there is no tailroomHerbert Xu1-1/+2
The GRO path has a fast-path where we avoid calling pskb_may_pull and pskb_expand by directly accessing frag0. However, this should only be done if we have enough tailroom in the skb as otherwise we'll have to expand it later anyway. This patch adds the check by capping frag0_len with the skb tailroom. Fixes: cb18978cbf45 ("gro: Open-code final pskb_may_pull") Reported-by: Slava Shwartsman <slavash@mellanox.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10net: add the AF_QIPCRTR entries to family name tablesAnna, Suman1-3/+3
Commit bdabad3e363d ("net: Add Qualcomm IPC router") introduced a new address family. Update the family name tables accordingly so that the lockdep initialization can use the proper names for this family. Cc: Courtney Cavin <courtney.cavin@sonymobile.com> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10net: skb_flow_get_be16() can be staticEric Dumazet1-2/+2
Removes following sparse complain : net/core/flow_dissector.c:70:8: warning: symbol 'skb_flow_get_be16' was not declared. Should it be static? Fixes: 972d3876faa8 ("flow dissector: ICMP support") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09bpf: rename ARG_PTR_TO_STACKAlexei Starovoitov1-20/+20
since ARG_PTR_TO_STACK is no longer just pointer to stack rename it to ARG_PTR_TO_MEM and adjust comment. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09smc: establish new socket familyUrsula Braun1-3/+3
* enable smc module loading and unloading * register new socket family * basic smc socket creation and deletion * use backing TCP socket to run CLC (Connection Layer Control) handshake of SMC protocol * Setup for infiniband traffic is implemented in follow-on patches. For now fallback to TCP socket is always used. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>