aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-06-06ipv6: fix EFAULT on sendto with icmpv6 and hdrinclOlivier Matz1-5/+8
The following code returns EFAULT (Bad address): s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6); setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1); sendto(ipv6_icmp6_packet, addr); /* returns -1, errno = EFAULT */ The IPv4 equivalent code works. A workaround is to use IPPROTO_RAW instead of IPPROTO_ICMPV6. The failure happens because 2 bytes are eaten from the msghdr by rawv6_probe_proto_opt() starting from commit 19e3c66b52ca ("ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt""), but at that time it was not a problem because IPV6_HDRINCL was not yet introduced. Only eat these 2 bytes if hdrincl == 0. Fixes: 715f504b1189 ("ipv6: add IPV6_HDRINCL option for raw sockets") Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06ipv6: use READ_ONCE() for inet->hdrincl as in ipv4Olivier Matz1-2/+10
As it was done in commit 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg") and commit 20b50d79974e ("net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the value of inet->hdrincl in a local variable, to avoid introducing a race condition in the next commit. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-05Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"Hangbin Liu1-3/+3
This reverts commit e9919a24d3022f72bcadc407e73a6ef17093a849. Nathan reported the new behaviour breaks Android, as Android just add new rules and delete old ones. If we return 0 without adding dup rules, Android will remove the new added rules and causing system to soft-reboot. Fixes: e9919a24d302 ("fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Yaro Slav <yaro330@gmail.com> Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-05ethtool: fix potential userspace buffer overflowVivien Didelot1-1/+4
ethtool_get_regs() allocates a buffer of size ops->get_regs_len(), and pass it to the kernel driver via ops->get_regs() for filling. There is no restriction about what the kernel drivers can or cannot do with the open ethtool_regs structure. They usually set regs->version and ignore regs->len or set it to the same size as ops->get_regs_len(). But if userspace allocates a smaller buffer for the registers dump, we would cause a userspace buffer overflow in the final copy_to_user() call, which uses the regs.len value potentially reset by the driver. To fix this, make this case obvious and store regs.len before calling ops->get_regs(), to only copy as much data as requested by userspace, up to the value returned by ops->get_regs_len(). While at it, remove the redundant check for non-null regbuf. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-05Fix memory leak in sctp_process_initNeil Horman2-10/+8
syzbot found the following leak in sctp_process_init BUG: memory leak unreferenced object 0xffff88810ef68400 (size 1024): comm "syz-executor273", pid 7046, jiffies 4294945598 (age 28.770s) hex dump (first 32 bytes): 1d de 28 8d de 0b 1b e3 b5 c2 f9 68 fd 1a 97 25 ..(........h...% 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000a02cebbd>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000a02cebbd>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000a02cebbd>] slab_alloc mm/slab.c:3326 [inline] [<00000000a02cebbd>] __do_kmalloc mm/slab.c:3658 [inline] [<00000000a02cebbd>] __kmalloc_track_caller+0x15d/0x2c0 mm/slab.c:3675 [<000000009e6245e6>] kmemdup+0x27/0x60 mm/util.c:119 [<00000000dfdc5d2d>] kmemdup include/linux/string.h:432 [inline] [<00000000dfdc5d2d>] sctp_process_init+0xa7e/0xc20 net/sctp/sm_make_chunk.c:2437 [<00000000b58b62f8>] sctp_cmd_process_init net/sctp/sm_sideeffect.c:682 [inline] [<00000000b58b62f8>] sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1384 [inline] [<00000000b58b62f8>] sctp_side_effects net/sctp/sm_sideeffect.c:1194 [inline] [<00000000b58b62f8>] sctp_do_sm+0xbdc/0x1d60 net/sctp/sm_sideeffect.c:1165 [<0000000044e11f96>] sctp_assoc_bh_rcv+0x13c/0x200 net/sctp/associola.c:1074 [<00000000ec43804d>] sctp_inq_push+0x7f/0xb0 net/sctp/inqueue.c:95 [<00000000726aa954>] sctp_backlog_rcv+0x5e/0x2a0 net/sctp/input.c:354 [<00000000d9e249a8>] sk_backlog_rcv include/net/sock.h:950 [inline] [<00000000d9e249a8>] __release_sock+0xab/0x110 net/core/sock.c:2418 [<00000000acae44fa>] release_sock+0x37/0xd0 net/core/sock.c:2934 [<00000000963cc9ae>] sctp_sendmsg+0x2c0/0x990 net/sctp/socket.c:2122 [<00000000a7fc7565>] inet_sendmsg+0x64/0x120 net/ipv4/af_inet.c:802 [<00000000b732cbd3>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000b732cbd3>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<00000000274c57ab>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2292 [<000000008252aedb>] __sys_sendmsg+0x80/0xf0 net/socket.c:2330 [<00000000f7bf23d1>] __do_sys_sendmsg net/socket.c:2339 [inline] [<00000000f7bf23d1>] __se_sys_sendmsg net/socket.c:2337 [inline] [<00000000f7bf23d1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2337 [<00000000a8b4131f>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:3 The problem was that the peer.cookie value points to an skb allocated area on the first pass through this function, at which point it is overwritten with a heap allocated value, but in certain cases, where a COOKIE_ECHO chunk is included in the packet, a second pass through sctp_process_init is made, where the cookie value is re-allocated, leaking the first allocation. Fix is to always allocate the cookie value, and free it when we are done using it. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: syzbot+f7e9153b037eac9b1df8@syzkaller.appspotmail.com CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-05net: rds: fix memory leak when unload rds_rdmaZhu Yanjun2-1/+4
When KASAN is enabled, after several rds connections are created, then "rmmod rds_rdma" is run. The following will appear. " BUG rds_ib_incoming (Not tainted): Objects remaining in rds_ib_incoming on __kmem_cache_shutdown() Call Trace: dump_stack+0x71/0xab slab_err+0xad/0xd0 __kmem_cache_shutdown+0x17d/0x370 shutdown_cache+0x17/0x130 kmem_cache_destroy+0x1df/0x210 rds_ib_recv_exit+0x11/0x20 [rds_rdma] rds_ib_exit+0x7a/0x90 [rds_rdma] __x64_sys_delete_module+0x224/0x2c0 ? __ia32_sys_delete_module+0x2c0/0x2c0 do_syscall_64+0x73/0x190 entry_SYSCALL_64_after_hwframe+0x44/0xa9 " This is rds connection memory leak. The root cause is: When "rmmod rds_rdma" is run, rds_ib_remove_one will call rds_ib_dev_shutdown to drop the rds connections. rds_ib_dev_shutdown will call rds_conn_drop to drop rds connections as below. " rds_conn_path_drop(&conn->c_path[0], false); " In the above, destroy is set to false. void rds_conn_path_drop(struct rds_conn_path *cp, bool destroy) { atomic_set(&cp->cp_state, RDS_CONN_ERROR); rcu_read_lock(); if (!destroy && rds_destroy_pending(cp->cp_conn)) { rcu_read_unlock(); return; } queue_work(rds_wq, &cp->cp_down_w); rcu_read_unlock(); } In the above function, destroy is set to false. rds_destroy_pending is called. This does not move rds connections to ib_nodev_conns. So destroy is set to true to move rds connections to ib_nodev_conns. In rds_ib_unregister_client, flush_workqueue is called to make rds_wq finsh shutdown rds connections. The function rds_ib_destroy_nodev_conns is called to shutdown rds connections finally. Then rds_ib_recv_exit is called to destroy slab. void rds_ib_recv_exit(void) { kmem_cache_destroy(rds_ib_incoming_slab); kmem_cache_destroy(rds_ib_frag_slab); } The above slab memory leak will not occur again. >From tests, 256 rds connections [root@ca-dev14 ~]# time rmmod rds_rdma real 0m16.522s user 0m0.000s sys 0m8.152s 512 rds connections [root@ca-dev14 ~]# time rmmod rds_rdma real 0m32.054s user 0m0.000s sys 0m15.568s To rmmod rds_rdma with 256 rds connections, about 16 seconds are needed. And with 512 rds connections, about 32 seconds are needed. >From ftrace, when one rds connection is destroyed, " 19) | rds_conn_destroy [rds]() { 19) 7.782 us | rds_conn_path_drop [rds](); 15) | rds_shutdown_worker [rds]() { 15) | rds_conn_shutdown [rds]() { 15) 1.651 us | rds_send_path_reset [rds](); 15) 7.195 us | } 15) + 11.434 us | } 19) 2.285 us | rds_cong_remove_conn [rds](); 19) * 24062.76 us | } " So if many rds connections will be destroyed, this function rds_ib_destroy_nodev_conns uses most of time. Suggested-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-05ipv4: not do cache for local delivery if bc_forwarding is enabledXin Long1-12/+12
With the topo: h1 ---| rp1 | | route rp3 |--- h3 (192.168.200.1) h2 ---| rp2 | If rp1 bc_forwarding is set while rp2 bc_forwarding is not, after doing "ping 192.168.200.255" on h1, then ping 192.168.200.255 on h2, and the packets can still be forwared. This issue was caused by the input route cache. It should only do the cache for either bc forwarding or local delivery. Otherwise, local delivery can use the route cache for bc forwarding of other interfaces. This patch is to fix it by not doing cache for local delivery if all.bc_forwarding is enabled. Note that we don't fix it by checking route cache local flag after rt_cache_valid() in "local_input:" and "ip_mkroute_input", as the common route code shouldn't be touched for bc_forwarding. Fixes: 5cbf777cfdf6 ("route: add support for directed broadcast forwarding") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04net: fix indirect calls helpers for ptype list hooks.Paolo Abeni1-3/+3
As Eric noted, the current wrapper for ptype func hook inside __netif_receive_skb_list_ptype() has no chance of avoiding the indirect call: we enter such code path only for protocols other than ipv4 and ipv6. Instead we can wrap the list_func invocation. v1 -> v2: - use the correct fix tag Fixes: f5737cbadb7d ("net: use indirect calls helpers for ptype hook") Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Edward Cree <ecree@solarflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04udp: only choose unbound UDP socket for multicast when not in a VRFTim Beale1-2/+1
By default, packets received in another VRF should not be passed to an unbound socket in the default VRF. This patch updates the IPv4 UDP multicast logic to match the unicast VRF logic (in compute_score()), as well as the IPv6 mcast logic (in __udp_v6_is_mcast_sock()). The particular case I noticed was DHCP discover packets going to the 255.255.255.255 address, which are handled by __udp4_lib_mcast_deliver(). The previous code meant that running multiple different DHCP server or relay agent instances across VRFs did not work correctly - any server/relay agent in the default VRF received DHCP discover packets for all other VRFs. Fixes: 6da5b0f027a8 ("net: ensure unbound datagram socket to be chosen when not in a VRF") Signed-off-by: Tim Beale <timbeale@catalyst.net.nz> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04net/tls: replace the sleeping lock around RX resync with a bit lockJakub Kicinski1-6/+21
Commit 38030d7cb779 ("net/tls: avoid NULL-deref on resync during device removal") tried to fix a potential NULL-dereference by taking the context rwsem. Unfortunately the RX resync may get called from soft IRQ, so we can't use the rwsem to protect from the device disappearing. Because we are guaranteed there can be only one resync at a time (it's called from strparser) use a bit to indicate resync is busy and make device removal wait for the bit to get cleared. Note that there is a leftover "flags" field in struct tls_context already. Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04Revert "net/tls: avoid NULL-deref on resync during device removal"Jakub Kicinski1-10/+5
This reverts commit 38030d7cb77963ba84cdbe034806e2b81245339f. Unfortunately the RX resync may get called from soft IRQ, so we can't take the rwsem to protect from the device disappearing. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-02packet: unconditionally free po->rolloverWillem de Bruijn1-1/+1
Rollover used to use a complex RCU mechanism for assignment, which had a race condition. The below patch fixed the bug and greatly simplified the logic. The feature depends on fanout, but the state is private to the socket. Fanout_release returns f only when the last member leaves and the fanout struct is to be freed. Destroy rollover unconditionally, regardless of fanout state. Fixes: 57f015f5eccf2 ("packet: fix crash in fanout_demux_rollover()") Reported-by: syzbot <syzkaller@googlegroups.com> Diagnosed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-31net: dsa: sja1105: Don't store frame type in skb->cbVladimir Oltean1-7/+3
Due to a confusion I thought that eth_type_trans() was called by the network stack whereas it can actually be called by network drivers to figure out the skb protocol and next packet_type handlers. In light of the above, it is not safe to store the frame type from the DSA tagger's .filter callback (first entry point on RX path), since GRO is yet to be invoked on the received traffic. Hence it is very likely that the skb->cb will actually get overwritten between eth_type_trans() and the actual DSA packet_type handler. Of course, what this patch fixes is the actual overwriting of the SJA1105_SKB_CB(skb)->type field from the GRO layer, which made all frames be seen as SJA1105_FRAME_TYPE_NORMAL (0). Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds27-194/+245
Pull networking fixes from David Miller: 1) Fix OOPS during nf_tables rule dump, from Florian Westphal. 2) Use after free in ip_vs_in, from Yue Haibing. 3) Fix various kTLS bugs (NULL deref during device removal resync, netdev notification ignoring, etc.) From Jakub Kicinski. 4) Fix ipv6 redirects with VRF, from David Ahern. 5) Memory leak fix in igmpv3_del_delrec(), from Eric Dumazet. 6) Missing memory allocation failure check in ip6_ra_control(), from Gen Zhang. And likewise fix ip_ra_control(). 7) TX clean budget logic error in aquantia, from Igor Russkikh. 8) SKB leak in llc_build_and_send_ui_pkt(), from Eric Dumazet. 9) Double frees in mlx5, from Parav Pandit. 10) Fix lost MAC address in r8169 during PCI D3, from Heiner Kallweit. 11) Fix botched register access in mvpp2, from Antoine Tenart. 12) Use after free in napi_gro_frags(), from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (89 commits) net: correct zerocopy refcnt with udp MSG_MORE ethtool: Check for vlan etype or vlan tci when parsing flow_rule net: don't clear sock->sk early to avoid trouble in strparser net-gro: fix use-after-free read in napi_gro_frags() net: dsa: tag_8021q: Create a stable binary format net: dsa: tag_8021q: Change order of rx_vid setup net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value ipv4: tcp_input: fix stack out of bounds when parsing TCP options. mlxsw: spectrum: Prevent force of 56G mlxsw: spectrum_acl: Avoid warning after identical rules insertion net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT r8169: fix MAC address being lost in PCI D3 net: core: support XDP generic on stacked devices. netvsc: unshare skb in VF rx handler udp: Avoid post-GRO UDP checksum recalculation net: phy: dp83867: Set up RGMII TX delay net: phy: dp83867: do not call config_init twice net: phy: dp83867: increase SGMII autoneg timer duration net: phy: dp83867: fix speed 10 in sgmii mode net: phy: marvell10g: report if the PHY fails to boot firmware ...
2019-05-30net: correct zerocopy refcnt with udp MSG_MOREWillem de Bruijn3-5/+9
TCP zerocopy takes a uarg reference for every skb, plus one for the tcp_sendmsg_locked datapath temporarily, to avoid reaching refcnt zero as it builds, sends and frees skbs inside its inner loop. UDP and RAW zerocopy do not send inside the inner loop so do not need the extra sock_zerocopy_get + sock_zerocopy_put pair. Commit 52900d22288ed ("udp: elide zerocopy operation in hot path") introduced extra_uref to pass the initial reference taken in sock_zerocopy_alloc to the first generated skb. But, sock_zerocopy_realloc takes this extra reference at the start of every call. With MSG_MORE, no new skb may be generated to attach the extra_uref to, so refcnt is incorrectly 2 with only one skb. Do not take the extra ref if uarg && !tcp, which implies MSG_MORE. Update extra_uref accordingly. This conditional assignment triggers a false positive may be used uninitialized warning, so have to initialize extra_uref at define. Changes v1->v2: fix typo in Fixes SHA1 Fixes: 52900d22288e7 ("udp: elide zerocopy operation in hot path") Reported-by: syzbot <syzkaller@googlegroups.com> Diagnosed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30ethtool: Check for vlan etype or vlan tci when parsing flow_ruleMaxime Chevallier1-2/+6
When parsing an ethtool flow spec to build a flow_rule, the code checks if both the vlan etype and the vlan tci are specified by the user to add a FLOW_DISSECTOR_KEY_VLAN match. However, when the user only specified a vlan etype or a vlan tci, this check silently ignores these parameters. For example, the following rule : ethtool -N eth0 flow-type udp4 vlan 0x0010 action -1 loc 0 will result in no error being issued, but the equivalent rule will be created and passed to the NIC driver : ethtool -N eth0 flow-type udp4 action -1 loc 0 In the end, neither the NIC driver using the rule nor the end user have a way to know that these keys were dropped along the way, or that incorrect parameters were entered. This kind of check should be left to either the driver, or the ethtool flow spec layer. This commit makes so that ethtool parameters are forwarded as-is to the NIC driver. Since none of the users of ethtool_rx_flow_rule_create are using the VLAN dissector, I don't think this qualifies as a regression. Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Acked-by: Pablo Neira Ayuso <pablo@gnumonks.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30net: don't clear sock->sk early to avoid trouble in strparserJakub Kicinski1-1/+1
af_inet sets sock->sk to NULL which trips strparser over: BUG: kernel NULL pointer dereference, address: 0000000000000012 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.2.0-rc1-00139-g14629453a6d3 #21 RIP: 0010:tcp_peek_len+0x10/0x60 RSP: 0018:ffffc02e41c54b98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff9cf924c4e030 RCX: 0000000000000051 RDX: 0000000000000000 RSI: 000000000000000c RDI: ffff9cf97128f480 RBP: ffff9cf9365e0300 R08: ffff9cf94fe7d2c0 R09: 0000000000000000 R10: 000000000000036b R11: ffff9cf939735e00 R12: ffff9cf91ad9ae40 R13: ffff9cf924c4e000 R14: ffff9cf9a8fcbaae R15: 0000000000000020 FS: 0000000000000000(0000) GS:ffff9cf9af7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000012 CR3: 000000013920a003 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> strp_data_ready+0x48/0x90 tls_data_ready+0x22/0xd0 [tls] tcp_rcv_established+0x569/0x620 tcp_v4_do_rcv+0x127/0x1e0 tcp_v4_rcv+0xad7/0xbf0 ip_protocol_deliver_rcu+0x2c/0x1c0 ip_local_deliver_finish+0x41/0x50 ip_local_deliver+0x6b/0xe0 ? ip_protocol_deliver_rcu+0x1c0/0x1c0 ip_rcv+0x52/0xd0 ? ip_rcv_finish_core.isra.20+0x380/0x380 __netif_receive_skb_one_core+0x7e/0x90 netif_receive_skb_internal+0x42/0xf0 napi_gro_receive+0xed/0x150 nfp_net_poll+0x7a2/0xd30 [nfp] ? kmem_cache_free_bulk+0x286/0x310 net_rx_action+0x149/0x3b0 __do_softirq+0xe3/0x30a ? handle_irq_event_percpu+0x6a/0x80 irq_exit+0xe8/0xf0 do_IRQ+0x85/0xd0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xbc/0x450 To avoid this issue set sock->sk after sk_prot->close. My grepping and testing did not discover any code which would depend on the current behaviour. Fixes: c46234ebb4d1 ("tls: RX path for ktls") Reported-by: David Beckett <david.beckett@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30net-gro: fix use-after-free read in napi_gro_frags()Eric Dumazet1-1/+1
If a network driver provides to napi_gro_frags() an skb with a page fragment of exactly 14 bytes, the call to gro_pull_from_frag0() will 'consume' the fragment by calling skb_frag_unref(skb, 0), and the page might be freed and reused. Reading eth->h_proto at the end of napi_frags_skb() might read mangled data, or crash under specific debugging features. BUG: KASAN: use-after-free in napi_frags_skb net/core/dev.c:5833 [inline] BUG: KASAN: use-after-free in napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841 Read of size 2 at addr ffff88809366840c by task syz-executor599/8957 CPU: 1 PID: 8957 Comm: syz-executor599 Not tainted 5.2.0-rc1+ #32 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 kasan_report+0x12/0x20 mm/kasan/common.c:614 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:142 napi_frags_skb net/core/dev.c:5833 [inline] napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841 tun_get_user+0x2f3c/0x3ff0 drivers/net/tun.c:1991 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037 call_write_iter include/linux/fs.h:1872 [inline] do_iter_readv_writev+0x5f8/0x8f0 fs/read_write.c:693 do_iter_write fs/read_write.c:970 [inline] do_iter_write+0x184/0x610 fs/read_write.c:951 vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015 do_writev+0x15b/0x330 fs/read_write.c:1058 Fixes: a50e233c50db ("net-gro: restore frag0 optimization") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30net: dsa: tag_8021q: Create a stable binary formatVladimir Oltean1-10/+50
Tools like tcpdump need to be able to decode the significance of fake VLAN headers that DSA uses to separate switch ports. But currently these have no global significance - they are simply an ordered list of DSA_MAX_SWITCHES x DSA_MAX_PORTS numbers ending at 4095. The reason why this is submitted as a fix is that the existing mapping of VIDs should not enter into a stable kernel, so we can pretend that only the new format exists. This way tcpdump won't need to try to make something out of the VLAN tags on 5.2 kernels. Fixes: f9bbe4477c30 ("net: dsa: Optional VLAN-based port separation for switches without tagging") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30net: dsa: tag_8021q: Change order of rx_vid setupIoana Ciornei1-4/+15
The 802.1Q tagging performs an unbalanced setup in terms of RX VIDs on the CPU port. For the ingress path of a 802.1Q switch to work, the RX VID of a port needs to be seen as tagged egress on the CPU port. While configuring the other front-panel ports to be part of this VID, for bridge scenarios, the untagged flag is applied even on the CPU port in dsa_switch_vlan_add. This happens because DSA applies the same flags on the CPU port as on the (bridge-controlled) slave ports, and the effect in this case is that the CPU port tagged settings get deleted. Instead of fixing DSA by introducing a way to control VLAN flags on the CPU port (and hence stop inheriting from the slave ports) - a hard, perhaps intractable problem - avoid this situation by moving the setup part of the RX VID on the CPU port after all the other front-panel ports have been added to the VID. Fixes: f9bbe4477c30 ("net: dsa: Optional VLAN-based port separation for switches without tagging") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30ipv4: tcp_input: fix stack out of bounds when parsing TCP options.Young Xiao1-0/+2
The TCP option parsing routines in tcp_parse_options function could read one byte out of the buffer of the TCP options. 1 while (length > 0) { 2 int opcode = *ptr++; 3 int opsize; 4 5 switch (opcode) { 6 case TCPOPT_EOL: 7 return; 8 case TCPOPT_NOP: /* Ref: RFC 793 section 3.1 */ 9 length--; 10 continue; 11 default: 12 opsize = *ptr++; //out of bound access If length = 1, then there is an access in line2. And another access is occurred in line 12. This would lead to out-of-bound access. Therefore, in the patch we check that the available data length is larger enough to pase both TCP option code and size. Signed-off-by: Young Xiao <92siuyang@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30net: core: support XDP generic on stacked devices.Stephen Hemminger1-46/+12
When a device is stacked like (team, bonding, failsafe or netvsc) the XDP generic program for the parent device was not called. Move the call to XDP generic inside __netif_receive_skb_core where it can be done multiple times for stacked case. Fixes: d445516966dc ("net: xdp: support xdp generic on virtual devices") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28llc: fix skb leak in llc_build_and_send_ui_pkt()Eric Dumazet1-0/+2
If llc_mac_hdr_init() returns an error, we must drop the skb since no llc_build_and_send_ui_pkt() caller will take care of this. BUG: memory leak unreferenced object 0xffff8881202b6800 (size 2048): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline] [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline] [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669 [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline] [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608 [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662 [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950 [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173 [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430 [<000000008bdec225>] sock_create net/socket.c:1481 [inline] [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523 [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline] [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline] [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 BUG: memory leak unreferenced object 0xffff88811d750d00 (size 224): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff ...$.....h+ .... backtrace: [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline] [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline] [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline] [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline] [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-26net/tls: fix no wakeup on partial readsJakub Kicinski1-6/+2
When tls_sw_recvmsg() partially copies a record it pops that record from ctx->recv_pkt and places it on rx_list. Next iteration of tls_sw_recvmsg() reads from rx_list via process_rx_list() before it enters the decryption loop. If there is no more records to be read tls_wait_data() will put the process on the wait queue and got to sleep. This is incorrect, because some data was already copied in process_rx_list(). In case of RPC connections process may never get woken up, because peer also simply blocks in read(). I think this may also fix a similar issue when BPF is at play, because after __tcp_bpf_recvmsg() returns some data we subtract it from len and use continue to restart the loop, but len could have just reached 0, so again we'd sleep unnecessarily. That's added by: commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records") Reported-by: David Beckett <david.beckett@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Tested-by: David Beckett <david.beckett@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-26net/tls: fix lowat calculation if some data came from previous recordJakub Kicinski1-7/+6
If some of the data came from the previous record, i.e. from the rx_list it had already been decrypted, so it's not counted towards the "decrypted" variable, but the "copied" variable. Take that into account when checking lowat. When calculating lowat target we need to pass the original len. E.g. if lowat is at 80, len is 100 and we had 30 bytes on rx_list target would currently be incorrectly calculated as 70, even though we only need 50 more bytes to make up the 80. Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Tested-by: David Beckett <david.beckett@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-25ip_sockglue: Fix missing-check bug in ip_ra_control()Gen Zhang1-0/+2
In function ip_ra_control(), the pointer new_ra is allocated a memory space via kmalloc(). And it is used in the following codes. However, when there is a memory allocation error, kmalloc() fails. Thus null pointer dereference may happen. And it will cause the kernel to crash. Therefore, we should check the return value and handle the error. Signed-off-by: Gen Zhang <blackgod016574@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-25ipv6_sockglue: Fix a missing-check bug in ip6_ra_control()Gen Zhang1-0/+2
In function ip6_ra_control(), the pointer new_ra is allocated a memory space via kmalloc(). And it is used in the following codes. However, when there is a memory allocation error, kmalloc() fails. Thus null pointer dereference may happen. And it will cause the kernel to crash. Therefore, we should check the return value and handle the error. Signed-off-by: Gen Zhang <blackgod016574@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-24Merge tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds68-730/+68
Pule more SPDX updates from Greg KH: "Here is another set of reviewed patches that adds SPDX tags to different kernel files, based on a set of rules that are being used to parse the comments to try to determine that the license of the file is "GPL-2.0-or-later". Only the "obvious" versions of these matches are included here, a number of "non-obvious" variants of text have been found but those have been postponed for later review and analysis. These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all whom names are on the patches are reviewers" * tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (85 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 116 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 114 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 113 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 111 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 110 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 106 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 105 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 101 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98 ...
2019-05-24net: sched: don't use tc_action->order during action dumpVlad Buslov1-2/+1
Function tcf_action_dump() relies on tc_action->order field when starting nested nla to send action data to userspace. This approach breaks in several cases: - When multiple filters point to same shared action, tc_action->order field is overwritten each time it is attached to filter. This causes filter dump to output action with incorrect attribute for all filters that have the action in different position (different order) from the last set tc_action->order value. - When action data is displayed using tc action API (RTM_GETACTION), action order is overwritten by tca_action_gd() according to its position in resulting array of nl attributes, which will break filter dump for all filters attached to that shared action that expect it to have different order value. Don't rely on tc_action->order when dumping actions. Set nla according to action position in resulting array of actions instead. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104Thomas Gleixner30-480/+30
Based on 1 normalized pattern(s): this sctp implementation is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 or at your option any later version this sctp implementation is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with gnu cc see the file copying if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 42 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091649.683323110@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103Thomas Gleixner1-16/+1
Based on 1 normalized pattern(s): the sctp implementation is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 or at your option any later version the sctp implementation is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with gnu cc see the file copying if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091649.592169384@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 93Thomas Gleixner1-17/+1
Based on 1 normalized pattern(s): this code is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520075212.233647300@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 77Thomas Gleixner5-20/+5
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation or any later at your option extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 5 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520075210.769496418@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 72Thomas Gleixner1-4/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 or any later at your option as published by the free software foundation extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520071859.749329557@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 62Thomas Gleixner1-6/+1
Based on 1 normalized pattern(s): released under the gpl version 2 or later and 1 additional normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520071858.828691433@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61Thomas Gleixner3-42/+3
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 675 mass ave cambridge ma 02139 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 441 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520071858.739733335@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 53Thomas Gleixner1-3/+1
Based on 1 normalized pattern(s): this code may be copied under the gpl v 2 or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520071858.029737698@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 41Thomas Gleixner16-96/+16
Based on 1 normalized pattern(s): this module is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 18 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170858.008906948@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36Thomas Gleixner9-46/+9
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public licence as published by the free software foundation either version 2 of the licence or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 114 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller9-60/+44
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree: 1) Fix crash when dumping rules after conversion to RCU, from Florian Westphal. 2) Fix incorrect hook reinjection from nf_queue in case NF_REPEAT, from Jagdish Motwani. 3) Fix check for route existence in fib extension, from Phil Sutter. 4) Fix use after free in ip_vs_in() hook, from YueHaibing. 5) Check for veth existence from netfilter selftests, from Jeffrin Jose T. 6) Checksum corruption in UDP NAT helpers due to typo, from Florian Westphal. 7) Pass up packets to classic forwarding path regardless of IPv4 DF bit, patch for the flowtable infrastructure from Florian. 8) Set liberal TCP tracking for flows that are placed in the flowtable, in case they need to go back to classic forwarding path, also from Florian. 9) Don't add flow with sequence adjustment to flowtable, from Florian. 10) Skip IPv4 options from IPv6 datapath in flowtable, from Florian. 11) Add selftest for the flowtable infrastructure, from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23hsr: fix don't prune the master node from the node_dbAndreas Oetken1-0/+8
Don't prune the master node in the hsr_prune_nodes function. Neither time_in[HSR_PT_SLAVE_A] nor time_in[HSR_PT_SLAVE_B] will ever be updated by hsr_register_frame_in for the master port. Thus, the master node will be repeatedly pruned leading to repeated packet loss. This bug never appeared because the hsr_prune_nodes function was only called once. Since commit 5150b45fd355 ("net: hsr: Fix node prune function for forget time expiry") this issue is fixed unveiling the issue described above. Fixes: 5150b45fd355 ("net: hsr: Fix node prune function for forget time expiry") Signed-off-by: Andreas Oetken <andreas.oetken@siemens.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22ipv4/igmp: fix build error if !CONFIG_IP_MULTICASTEric Dumazet1-11/+11
ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST Fixes: 3580d04aa674 ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22ipv4/igmp: fix another memory leak in igmpv3_del_delrec()Eric Dumazet1-17/+30
syzbot reported memory leaks [1] that I have back tracked to a missing cleanup from igmpv3_del_delrec() when (im->sfmode != MCAST_INCLUDE) Add ip_sf_list_clear_all() and kfree_pmc() helpers to explicitely handle the cleanups before freeing. [1] BUG: memory leak unreferenced object 0xffff888123e32b00 (size 64): comm "softirq", pid 0, jiffies 4294942968 (age 8.010s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 e0 00 00 01 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000006105011b>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000006105011b>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000006105011b>] slab_alloc mm/slab.c:3326 [inline] [<000000006105011b>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<000000004bba8073>] kmalloc include/linux/slab.h:547 [inline] [<000000004bba8073>] kzalloc include/linux/slab.h:742 [inline] [<000000004bba8073>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline] [<000000004bba8073>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085 [<00000000a46a65a0>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475 [<000000005956ca89>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:957 [<00000000848e2d2f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246 [<00000000b9db185c>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<000000003028e438>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130 [<0000000015b65589>] __sys_setsockopt+0x98/0x120 net/socket.c:2078 [<00000000ac198ef0>] __do_sys_setsockopt net/socket.c:2089 [inline] [<00000000ac198ef0>] __se_sys_setsockopt net/socket.c:2086 [inline] [<00000000ac198ef0>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<000000000a770437>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<00000000d3adb93b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 9c8bb163ae78 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22ipv6: Fix redirect with VRFDavid Ahern1-0/+6
IPv6 redirect is broken for VRF. __ip6_route_redirect walks the FIB entries looking for an exact match on ifindex. With VRF the flowi6_oif is updated by l3mdev_update_flow to the l3mdev index and the FLOWI_FLAG_SKIP_NH_OIF set in the flags to tell the lookup to skip the device match. For redirects the device match is requires so use that flag to know when the oif needs to be reset to the skb device index. Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22net/tls: don't ignore netdev notifications if no TLS featuresJakub Kicinski1-1/+2
On device surprise removal path (the notifier) we can't bail just because the features are disabled. They may have been enabled during the lifetime of the device. This bug leads to leaking netdev references and use-after-frees if there are active connections while device features are cleared. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22net/tls: fix state removal with feature flags offJakub Kicinski1-6/+0
TLS offload drivers shouldn't (and currently don't) block the TLS offload feature changes based on whether there are active offloaded connections or not. This seems to be a good idea, because we want the admin to be able to disable the TLS offload at any time, and there is no clean way of disabling it for active connections (TX side is quite problematic). So if features are cleared existing connections will stay offloaded until they close, and new connections will not attempt offload to a given device. However, the offload state removal handling is currently broken if feature flags get cleared while there are active TLS offloads. RX side will completely bail from cleanup, even on normal remove path, leaving device state dangling, potentially causing issues when the 5-tuple is reused. It will also fail to release the netdev reference. Remove the RX-side warning message, in next release cycle it should be printed when features are disabled, rather than when connection dies, but for that we need a more efficient method of finding connection of a given netdev (a'la BPF offload code). Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22net/tls: avoid NULL-deref on resync during device removalJakub Kicinski1-5/+10
When netdev with active kTLS sockets in unregistered notifier callback walks the offloaded sockets and cleans up offload state. RX data may still be processed, however, and if resync was requested prior to device removal we would hit a NULL pointer dereference on ctx->netdev use. Make sure resync is under the device offload lock and NULL-check the netdev pointer. This should be safe, because the pointer is set to NULL either in the netdev notifier (under said lock) or when socket is completely dead and no resync can happen. The other access to ctx->netdev in tls_validate_xmit_skb() does not dereference the pointer, it just checks it against other device pointer, so it should be pretty safe (perhaps we can add a READ_ONCE/WRITE_ONCE there, if paranoid). Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22Validate required parameters in inet6_validate_link_afMaxim Mikityanskiy1-22/+35
inet6_set_link_af requires that at least one of IFLA_INET6_TOKEN or IFLA_INET6_ADDR_GET_MODE is passed. If none of them is passed, it returns -EINVAL, which may cause do_setlink() to fail in the middle of processing other commands and give the following warning message: A link change request failed with some changes committed already. Interface eth0 may have been left with an inconsistent configuration, please check. Check the presence of at least one of them in inet6_validate_link_af to detect invalid parameters at an early stage, before do_setlink does anything. Also validate the address generation mode at an early stage. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds6-12/+32
Pull networking fixes from David Miller: 1) Clear up some recent tipc regressions because of registration ordering. Fix from Junwei Hu. 2) tipc's TLV_SET() can read past the end of the supplied buffer during the copy. From Chris Packham. 3) ptp example program doesn't match the kernel, from Richard Cochran. 4) Outgoing message type fix in qrtr, from Bjorn Andersson. 5) Flow control regression in stmmac, from Tan Tee Min. 6) Fix inband autonegotiation in phylink, from Russell King. 7) Fix sk_bound_dev_if handling in rawv6_bind(), from Mike Manning. 8) Fix usbnet crash after disconnect, from Kloetzke Jan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits) usbnet: fix kernel crash after disconnect selftests: fib_rule_tests: use pre-defined DEV_ADDR net-next: net: Fix typos in ip-sysctl.txt ipv6: Consider sk_bound_dev_if when binding a raw socket to an address net: phylink: ensure inband AN works correctly usbnet: ipheth: fix racing condition net: stmmac: dma channel control register need to be init first net: stmmac: fix ethtool flow control not able to get/set net: qrtr: Fix message type of outgoing packets networking: : fix typos in code comments ptp: Fix example program to match kernel. fddi: fix typos in code comments selftests: fib_rule_tests: enable forwarding before ipv4 from/iif test selftests: fib_rule_tests: fix local IPv4 address typo tipc: Avoid copying bytes beyond the supplied data 2/2] net: xilinx_emaclite: use readx_poll_timeout() in mdio wait function 1/2] net: axienet: use readx_poll_timeout() in mdio wait function vlan: Mark expected switch fall-through macvlan: Mark expected switch fall-through net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query ...
2019-05-22netfilter: nft_flow_offload: IPCB is only valid for ipv4 familyFlorian Westphal1-6/+11
Guard this with a check vs. ipv4, IPCB isn't valid in ipv6 case. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>