aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-11-28i40e: Fix deletion of MAC filtersLihong Yang1-1/+1
In __i40e_del_filter function, the flag __I40E_MACVLAN_SYNC_PENDING for the PF state is wrongly set for the VSI. Deleting any of the MAC filters has caused the incorrect syncing for the PF. Fix it by setting this state flag to the intended PF. CC: stable <stable@vger.kernel.org> Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-28igb: fix uninitialized variablesYunjian Wang1-0/+1
This patch fixes the variable 'phy_word' may be used uninitialized. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-27lan743x: Enable driver to work with LAN7431Bryan Whitehead2-0/+2
This driver was designed to work with both LAN7430 and LAN7431. The only difference between the two is the LAN7431 has support for external phy. This change adds LAN7431 to the list of recognized devices supported by this driver. Updates for v2: changed 'fixes' tag to match defined format fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27tipc: fix lockdep warning during node deleteJon Maloy1-2/+5
We see the following lockdep warning: [ 2284.078521] ====================================================== [ 2284.078604] WARNING: possible circular locking dependency detected [ 2284.078604] 4.19.0+ #42 Tainted: G E [ 2284.078604] ------------------------------------------------------ [ 2284.078604] rmmod/254 is trying to acquire lock: [ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0 [ 2284.078604] [ 2284.078604] but task is already holding lock: [ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc] [ 2284.078604] [ 2284.078604] which lock already depends on the new lock. [ 2284.078604] [ 2284.078604] [ 2284.078604] the existing dependency chain (in reverse order) is: [ 2284.078604] [ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}: [ 2284.078604] tipc_node_timeout+0x20a/0x330 [tipc] [ 2284.078604] call_timer_fn+0xa1/0x280 [ 2284.078604] run_timer_softirq+0x1f2/0x4d0 [ 2284.078604] __do_softirq+0xfc/0x413 [ 2284.078604] irq_exit+0xb5/0xc0 [ 2284.078604] smp_apic_timer_interrupt+0xac/0x210 [ 2284.078604] apic_timer_interrupt+0xf/0x20 [ 2284.078604] default_idle+0x1c/0x140 [ 2284.078604] do_idle+0x1bc/0x280 [ 2284.078604] cpu_startup_entry+0x19/0x20 [ 2284.078604] start_secondary+0x187/0x1c0 [ 2284.078604] secondary_startup_64+0xa4/0xb0 [ 2284.078604] [ 2284.078604] -> #0 ((&n->timer)#2){+.-.}: [ 2284.078604] del_timer_sync+0x34/0xa0 [ 2284.078604] tipc_node_delete+0x1a/0x40 [tipc] [ 2284.078604] tipc_node_stop+0xcb/0x190 [tipc] [ 2284.078604] tipc_net_stop+0x154/0x170 [tipc] [ 2284.078604] tipc_exit_net+0x16/0x30 [tipc] [ 2284.078604] ops_exit_list.isra.8+0x36/0x70 [ 2284.078604] unregister_pernet_operations+0x87/0xd0 [ 2284.078604] unregister_pernet_subsys+0x1d/0x30 [ 2284.078604] tipc_exit+0x11/0x6f2 [tipc] [ 2284.078604] __x64_sys_delete_module+0x1df/0x240 [ 2284.078604] do_syscall_64+0x66/0x460 [ 2284.078604] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 2284.078604] [ 2284.078604] other info that might help us debug this: [ 2284.078604] [ 2284.078604] Possible unsafe locking scenario: [ 2284.078604] [ 2284.078604] CPU0 CPU1 [ 2284.078604] ---- ---- [ 2284.078604] lock(&(&tn->node_list_lock)->rlock); [ 2284.078604] lock((&n->timer)#2); [ 2284.078604] lock(&(&tn->node_list_lock)->rlock); [ 2284.078604] lock((&n->timer)#2); [ 2284.078604] [ 2284.078604] *** DEADLOCK *** [ 2284.078604] [ 2284.078604] 3 locks held by rmmod/254: [ 2284.078604] #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30 [ 2284.078604] #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc] [ 2284.078604] #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19 [...} The reason is that the node timer handler sometimes needs to delete a node which has been disconnected for too long. To do this, it grabs the lock 'node_list_lock', which may at the same time be held by the generic node cleanup function, tipc_node_stop(), during module removal. Since the latter is calling del_timer_sync() inside the same lock, we have a potential deadlock. We fix this letting the timer cleanup function use spin_trylock() instead of just spin_lock(), and when it fails to grab the lock it just returns so that the timer handler can terminate its execution. This is safe to do, since tipc_node_stop() anyway is about to delete both the timer and the node instance. Fixes: 6a939f365bdb ("tipc: Auto removal of peer down node instance") Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27lan743x: fix return value for lan743x_tx_napi_pollBryan Whitehead1-5/+5
The lan743x driver, when under heavy traffic load, has been noticed to sometimes hang, or cause a kernel panic. Debugging reveals that the TX napi poll routine was returning the wrong value, 'weight'. Most other drivers return 0. And call napi_complete, instead of napi_complete_done. Additionally when creating the tx napi poll routine. Changed netif_napi_add, to netif_tx_napi_add. Updates for v3: changed 'fixes' tag to match defined format Updates for v2: use napi_complete, instead of napi_complete_done in lan743x_tx_napi_poll use netif_tx_napi_add, instead of netif_napi_add for registration of tx napi poll routine fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"Colin Ian King1-1/+1
The text in array velocity_gstrings contains a spelling mistake, rename rx_frame_alignement_errors to rx_frame_alignment_errors. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27qed: fix spelling mistake "attnetion" -> "attention"Colin Ian King1-1/+1
The text in array s_igu_fifo_error_strs contains a spelling mistake, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27net: thunderx: fix NULL pointer dereference in nic_removeLorenzo Bianconi1-0/+3
Fix a possible NULL pointer dereference in nic_remove routine removing the nicpf module if nic_probe fails. The issue can be triggered with the following reproducer: $rmmod nicvf $rmmod nicpf [ 521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014 [ 521.422777] Mem abort info: [ 521.425561] ESR = 0x96000004 [ 521.428624] Exception class = DABT (current EL), IL = 32 bits [ 521.434535] SET = 0, FnV = 0 [ 521.437579] EA = 0, S1PTW = 0 [ 521.440730] Data abort info: [ 521.443603] ISV = 0, ISS = 0x00000004 [ 521.447431] CM = 0, WnR = 0 [ 521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42 [ 521.457022] [0000000000000014] pgd=0000000000000000 [ 521.461916] Internal error: Oops: 96000004 [#1] SMP [ 521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO) [ 521.523451] pc : nic_remove+0x24/0x88 [nicpf] [ 521.527808] lr : pci_device_remove+0x48/0xd8 [ 521.532066] sp : ffff000013433cc0 [ 521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000 [ 521.540672] x27: 0000000000000000 x26: 0000000000000000 [ 521.545974] x25: 0000000056000000 x24: 0000000000000015 [ 521.551274] x23: ffff8007ff89a110 x22: ffff000001667070 [ 521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000 [ 521.561877] x19: 0000000000000000 x18: 0000000000000025 [ 521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000 [ 521.593683] x7 : 0000000000000000 x6 : 0000000000000001 [ 521.598983] x5 : 0000000000000002 x4 : 0000000000000003 [ 521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184 [ 521.609585] x1 : ffff000001662118 x0 : ffff000008557be0 [ 521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3) [ 521.621490] Call trace: [ 521.623928] nic_remove+0x24/0x88 [nicpf] [ 521.627927] pci_device_remove+0x48/0xd8 [ 521.631847] device_release_driver_internal+0x1b0/0x248 [ 521.637062] driver_detach+0x50/0xc0 [ 521.640628] bus_remove_driver+0x60/0x100 [ 521.644627] driver_unregister+0x34/0x60 [ 521.648538] pci_unregister_driver+0x24/0xd8 [ 521.652798] nic_cleanup_module+0x14/0x111c [nicpf] [ 521.657672] __arm64_sys_delete_module+0x150/0x218 [ 521.662460] el0_svc_handler+0x94/0x110 [ 521.666287] el0_svc+0x8/0xc [ 521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660) Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27sctp: increase sk_wmem_alloc when head->truesize is increasedXin Long1-0/+1
I changed to count sk_wmem_alloc by skb truesize instead of 1 to fix the sk_wmem_alloc leak caused by later truesize's change in xfrm in Commit 02968ccf0125 ("sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit"). But I should have also increased sk_wmem_alloc when head->truesize is increased in sctp_packet_gso_append() as xfrm does. Otherwise, sctp gso packet will cause sk_wmem_alloc underflow. Fixes: 02968ccf0125 ("sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27firestream: fix spelling mistake: "Inititing" -> "Initializing"Colin Ian King1-2/+2
There are spelling mistakes in debug messages, fix them. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27net: phy: add workaround for issue where PHY driver doesn't bind to the deviceHeiner Kallweit1-0/+8
After switching the r8169 driver to use phylib some user reported that their network is broken. This was caused by the genphy PHY driver being used instead of the dedicated PHY driver for the RTL8211B. Users reported that loading the Realtek PHY driver module upfront fixes the issue. See also this mail thread: https://marc.info/?t=154279781800003&r=1&w=2 The issue is quite weird and the root cause seems to be somewhere in the base driver core. The patch works around the issue and may be removed once the actual issue is fixed. The Fixes tag refers to the first reported occurrence of the issue. The issue itself may have been existing much longer and it may affect users of other network chips as well. Users typically will recognize this issue only if their PHY stops working when being used with the genphy driver. Fixes: f1e911d5d0df ("r8169: add basic phylib support") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2Bernd Eckstein1-6/+4
The bug is not easily reproducable, as it may occur very infrequently (we had machines with 20minutes heavy downloading before it occurred) However, on a virual machine (VMWare on Windows 10 host) it occurred pretty frequently (1-2 seconds after a speedtest was started) dev->tx_skb mab be freed via dev_kfree_skb_irq on a callback before it is set. This causes the following problems: - double free of the skb or potential memory leak - in dmesg: 'recvmsg bug' and 'recvmsg bug 2' and eventually general protection fault Example dmesg output: [ 134.841986] ------------[ cut here ]------------ [ 134.841987] recvmsg bug: copied 9C24A555 seq 9C24B557 rcvnxt 9C25A6B3 fl 0 [ 134.841993] WARNING: CPU: 7 PID: 2629 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/net/ipv4/tcp.c:1865 tcp_recvmsg+0x44d/0xab0 [ 134.841994] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi [ 134.842046] CPU: 7 PID: 2629 Comm: python Tainted: G W OE 4.15.0-34-generic #37~16.04.1-Ubuntu [ 134.842046] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 134.842048] RIP: 0010:tcp_recvmsg+0x44d/0xab0 [ 134.842048] RSP: 0018:ffffa6630422bcc8 EFLAGS: 00010286 [ 134.842049] RAX: 0000000000000000 RBX: ffff997616f4f200 RCX: 0000000000000006 [ 134.842049] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9976257d6490 [ 134.842050] RBP: ffffa6630422bd98 R08: 0000000000000001 R09: 000000000004bba4 [ 134.842050] R10: 0000000001e00c6f R11: 000000000004bba4 R12: ffff99760dee3000 [ 134.842051] R13: 0000000000000000 R14: ffff99760dee3514 R15: 0000000000000000 [ 134.842051] FS: 00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000 [ 134.842052] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 134.842053] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0 [ 134.842055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 134.842055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 134.842057] Call Trace: [ 134.842060] ? aa_sk_perm+0x53/0x1a0 [ 134.842064] inet_recvmsg+0x51/0xc0 [ 134.842066] sock_recvmsg+0x43/0x50 [ 134.842070] SYSC_recvfrom+0xe4/0x160 [ 134.842072] ? __schedule+0x3de/0x8b0 [ 134.842075] ? ktime_get_ts64+0x4c/0xf0 [ 134.842079] SyS_recvfrom+0xe/0x10 [ 134.842082] do_syscall_64+0x73/0x130 [ 134.842086] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 134.842086] RIP: 0033:0x7fe331f5a81d [ 134.842088] RSP: 002b:00007ffe8da98398 EFLAGS: 00000246 ORIG_RAX: 000000000000002d [ 134.842090] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007fe331f5a81d [ 134.842094] RDX: 00000000000003fb RSI: 0000000001e00874 RDI: 0000000000000003 [ 134.842095] RBP: 00007fe32f642c70 R08: 0000000000000000 R09: 0000000000000000 [ 134.842097] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe332347698 [ 134.842099] R13: 0000000001b7e0a0 R14: 0000000001e00874 R15: 0000000000000000 [ 134.842103] Code: 24 fd ff ff e9 cc fe ff ff 48 89 d8 41 8b 8c 24 10 05 00 00 44 8b 45 80 48 c7 c7 08 bd 59 8b 48 89 85 68 ff ff ff e8 b3 c4 7d ff <0f> 0b 48 8b 85 68 ff ff ff e9 e9 fe ff ff 41 8b 8c 24 10 05 00 [ 134.842126] ---[ end trace b7138fc08c83147f ]--- [ 134.842144] general protection fault: 0000 [#1] SMP PTI [ 134.842145] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi [ 134.842161] CPU: 7 PID: 2629 Comm: python Tainted: G W OE 4.15.0-34-generic #37~16.04.1-Ubuntu [ 134.842162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 134.842164] RIP: 0010:tcp_close+0x2c6/0x440 [ 134.842165] RSP: 0018:ffffa6630422bde8 EFLAGS: 00010202 [ 134.842167] RAX: 0000000000000000 RBX: ffff99760dee3000 RCX: 0000000180400034 [ 134.842168] RDX: 5c4afd407207a6c4 RSI: ffffe868495bd300 RDI: ffff997616f4f200 [ 134.842169] RBP: ffffa6630422be08 R08: 0000000016f4d401 R09: 0000000180400034 [ 134.842169] R10: ffffa6630422bd98 R11: 0000000000000000 R12: 000000000000600c [ 134.842170] R13: 0000000000000000 R14: ffff99760dee30c8 R15: ffff9975bd44fe00 [ 134.842171] FS: 00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000 [ 134.842173] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 134.842174] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0 [ 134.842177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 134.842178] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 134.842179] Call Trace: [ 134.842181] inet_release+0x42/0x70 [ 134.842183] __sock_release+0x42/0xb0 [ 134.842184] sock_close+0x15/0x20 [ 134.842187] __fput+0xea/0x220 [ 134.842189] ____fput+0xe/0x10 [ 134.842191] task_work_run+0x8a/0xb0 [ 134.842193] exit_to_usermode_loop+0xc4/0xd0 [ 134.842195] do_syscall_64+0xf4/0x130 [ 134.842197] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 134.842197] RIP: 0033:0x7fe331f5a560 [ 134.842198] RSP: 002b:00007ffe8da982e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 134.842200] RAX: 0000000000000000 RBX: 00007fe32f642c70 RCX: 00007fe331f5a560 [ 134.842201] RDX: 00000000008f5320 RSI: 0000000001cd4b50 RDI: 0000000000000003 [ 134.842202] RBP: 00007fe32f6500f8 R08: 000000000000003c R09: 00000000009343c0 [ 134.842203] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe32f6500d0 [ 134.842204] R13: 00000000008f5320 R14: 00000000008f5320 R15: 0000000001cd4770 [ 134.842205] Code: c8 00 00 00 45 31 e4 49 39 fe 75 4d eb 50 83 ab d8 00 00 00 01 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 42 08 48 89 10 0f b6 57 34 8b 47 2c 2b 47 28 83 e2 01 80 [ 134.842226] RIP: tcp_close+0x2c6/0x440 RSP: ffffa6630422bde8 [ 134.842227] ---[ end trace b7138fc08c831480 ]--- The proposed patch eliminates a potential racing condition. Before, usb_submit_urb was called and _after_ that, the skb was attached (dev->tx_skb). So, on a callback it was possible, however unlikely that the skb was freed before it was set. That way (because dev->tx_skb was not set to NULL after it was freed), it could happen that a skb from a earlier transmission was freed a second time (and the skb we should have freed did not get freed at all) Now we free the skb directly in ipheth_tx(). It is not passed to the callback anymore, eliminating the posibility of a double free of the same skb. Depending on the retval of usb_submit_urb() we use dev_kfree_skb_any() respectively dev_consume_skb_any() to free the skb. Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com> Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller6-58/+223
Daniel Borkmann says: ==================== pull-request: bpf 2018-11-27 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix several bugs in BPF sparc JIT, that is, convergence for fused branches, initialization of frame pointer register, and moving all arguments into output registers from input registers in prologue to fix BPF to BPF calls, from David. 2) Fix a bug in arm64 JIT for fetching BPF to BPF call addresses where they are not guaranteed to fit into imm field and therefore must be retrieved through prog aux data, from Daniel. 3) Explicitly add all JITs to MAINTAINERS file with developers able to help out in feature development, fixes, review, etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27sparc: Adjust bpf JIT prologue for PSEUDO calls.David Miller1-1/+7
Move all arguments into output registers from input registers. This path is exercised by test_verifier.c's "calls: two calls with args" test. Adjust BPF_TAILCALL_PROLOGUE_SKIP as needed. Let's also make the prologue length a constant size regardless of the combination of ->saw_frame_pointer and ->saw_tail_call settings. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-26bpf, doc: add entries of who looks over which jitsDaniel Borkmann1-1/+62
Make the high-level BPF JIT entry a general 'catch-all' and add architecture specific entries to make it more clear who actively maintains which BPF JIT compiler. The list (L) address implies that this eventually lands in the bpf patchwork bucket. Goal is that this set of responsible developers listed here is always up to date and a point of contact for helping out in e.g. feature development, fixes, review or testing patches in order to help long-term in ensuring quality of the BPF JITs and therefore BPF core under a given architecture. Every new JIT in future /must/ have an entry here as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Zi Shen Lim <zlim.lnx@gmail.com> Acked-by: Paul Burton <paul.burton@mips.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26sparc: Correct ctx->saw_frame_pointer logic.David Miller1-0/+12
We need to initialize the frame pointer register not just if it is seen as a source operand, but also if it is seen as the destination operand of a store or an atomic instruction (which effectively is a source operand). This is exercised by test_verifier's "non-invalid fp arithmetic" Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26sparc: Fix JIT fused branch convergance.David Miller1-28/+49
On T4 and later sparc64 cpus we can use the fused compare and branch instruction. However, it can only be used if the branch destination is in the range of a signed 10-bit immediate offset. This amounts to 1024 instructions forwards or backwards. After the commit referenced in the Fixes: tag, the largest possible size program seen by the JIT explodes by a significant factor. As a result of this convergance takes many more passes since the expanded "BPF_LDX | BPF_MSH | BPF_B" code sequence, for example, contains several embedded branch on condition instructions. On each pass, as suddenly new fused compare and branch instances become valid, this makes thousands more in range for the next pass. And so on and so forth. This is most greatly exemplified by "BPF_MAXINSNS: exec all MSH" which takes 35 passes to converge, and shrinks the image by about 64K. To decrease the cost of this number of convergance passes, do the convergance pass before we have the program image allocated, just like other JITs (such as x86) do. Fixes: e0cea7ce988c ("bpf: implement ld_abs/ld_ind in native bpf") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26Merge branch 'arm64-jit-fixes'Alexei Starovoitov4-28/+93
Daniel Borkmann says: ==================== This set contains a fix for arm64 BPF JIT. First patch generalizes ppc64 way of retrieving subprog into bpf_jit_get_func_addr() as core code and uses the same on arm64 in second patch. Tested on both arm64 and ppc64. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26bpf, arm64: fix getting subprog addr from aux for callsDaniel Borkmann1-9/+17
The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF to BPF call offset can be too far away from core kernel in that relative encoding into imm is not sufficient and could potentially be truncated, see also fd045f6cd98e ("arm64: add support for module PLTs") which adds spill-over space for module_alloc() and therefore bpf_jit_binary_alloc(). Therefore, use the recently added bpf_jit_get_func_addr() helper for properly fetching the address through prog->aux->func[off]->bpf_func instead. This also has the benefit to optimize normal helper calls since their address can use the optimized emission. Tested on Cavium ThunderX CN8890. Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26bpf, ppc64: generalize fetching subprog into bpf_jit_get_func_addrDaniel Borkmann3-19/+76
Make fetching of the BPF call address from ppc64 JIT generic. ppc64 was using a slightly different variant rather than through the insns' imm field encoding as the target address would not fit into that space. Therefore, the target subprog number was encoded into the insns' offset and fetched through fp->aux->func[off]->bpf_func instead. Given there are other JITs with this issue and the mechanism of fetching the address is JIT-generic, move it into the core as a helper instead. On the JIT side, we get information on whether the retrieved address is a fixed one, that is, not changing through JIT passes, or a dynamic one. For the former, JITs can optimize their imm emission because this doesn't change jump offsets throughout JIT process. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller18-39/+752
Daniel Borkmann says: ==================== pull-request: bpf 2018-11-25 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix an off-by-one bug when adjusting subprog start offsets after patching, from Edward. 2) Fix several bugs such as overflow in size allocation in queue / stack map creation, from Alexei. 3) Fix wrong IPv6 destination port byte order in bpf_sk_lookup_udp helper, from Andrey. 4) Fix several bugs in bpftool such as preventing an infinite loop in get_fdinfo, error handling and man page references, from Quentin. 5) Fix a warning in bpf_trace_printk() that wasn't catching an invalid format string, from Martynas. 6) Fix a bug in BPF cgroup local storage where non-atomic allocation was used in atomic context, from Roman. 7) Fix a NULL pointer dereference bug in bpftool from reallocarray() error handling, from Jakub and Wen. 8) Add a copy of pkt_cls.h and tc_bpf.h uapi headers to the tools include infrastructure so that bpftool compiles on older RHEL7-like user space which does not ship these headers, from Yonghong. 9) Fix BPF kselftests for user space where to get ping test working with ping6 and ping -6, from Li. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-24net: always initialize pagedlenWillem de Bruijn2-2/+4
In ip packet generation, pagedlen is initialized for each skb at the start of the loop in __ip(6)_append_data, before label alloc_new_skb. Depending on compiler options, code can be generated that jumps to this label, triggering use of an an uninitialized variable. In practice, at -O2, the generated code moves the initialization below the label. But the code should not rely on that for correctness. Fixes: 15e36f5b8e98 ("udp: paged allocation with gso") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-24tcp: address problems caused by EDT misshapsEric Dumazet2-10/+16
When a qdisc setup including pacing FQ is dismantled and recreated, some TCP packets are sent earlier than instructed by TCP stack. TCP can be fooled when ACK comes back, because the following operation can return a negative value. tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr; Some paths in TCP stack were not dealing properly with this, this patch addresses four of them. Fixes: ab408b6dc744 ("tcp: switch tcp and sch_fq to new earliest departure time model") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-24Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds2-3/+3
Pull arm64 fixes from Catalin Marinas:: - Fix wrong conflict resolution around CONFIG_ARM64_SSBD - Fix sparse warning on unsigned long constant * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD block arm64: sysreg: fix sparse warnings
2018-11-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds67-335/+537
Pull networking fixes from David Miller: 1) Need to take mutex in ath9k_add_interface(), from Dan Carpenter. 2) Fix mt76 build without CONFIG_LEDS_CLASS, from Arnd Bergmann. 3) Fix socket wmem accounting in SCTP, from Xin Long. 4) Fix failed resume crash in ena driver, from Arthur Kiyanovski. 5) qed driver passes bytes instead of bits into second arg of bitmap_weight(). From Denis Bolotin. 6) Fix reset deadlock in ibmvnic, from Juliet Kim. 7) skb_scrube_packet() needs to scrub the fwd marks too, from Petr Machata. 8) Make sure older TCP stacks see enough dup ACKs, and avoid doing SACK compression during this period, from Eric Dumazet. 9) Add atomicity to SMC protocol cursor handling, from Ursula Braun. 10) Don't leave dangling error pointer if bpf_prog_add() fails in thunderx driver, from Lorenzo Bianconi. Also, when we unmap TSO headers, set sq->tso_hdrs to NULL. 11) Fix race condition over state variables in act_police, from Davide Caratti. 12) Disable guest csum in the presence of XDP in virtio_net, from Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (64 commits) net: gemini: Fix copy/paste error net: phy: mscc: fix deadlock in vsc85xx_default_config dt-bindings: dsa: Fix typo in "probed" net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue net: amd: add missing of_node_put() team: no need to do team_notify_peers or team_mcast_rejoin when disabling port virtio-net: fail XDP set if guest csum is negotiated virtio-net: disable guest csum during XDP set net/sched: act_police: add missing spinlock initialization net: don't keep lonely packets forever in the gro hash net/ipv6: re-do dad when interface has IFF_NOARP flag change packet: copy user buffers before orphan or clone ibmvnic: Update driver queues after change in ring size support ibmvnic: Fix RX queue buffer cleanup net: thunderx: set xdp_prog to NULL if bpf_prog_add fails net/dim: Update DIM start sample after each DIM iteration net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts net/smc: use after free fix in smc_wr_tx_put_slot() net/smc: atomic SMCD cursor handling net/smc: add SMC-D shutdown signal ...
2018-11-24Merge tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds10-46/+104
Pull xfs fixes from Darrick Wong: "Dave and I have continued our work fixing corruption problems that can be found when running long-term burn-in exercisers on xfs. Here are some patches fixing most of the problems, but there will likely be more. :/ - Numerous corruption fixes for copy on write - Numerous corruption fixes for blocksize < pagesize writes - Don't miscalculate AG reservations for small final AGs - Fix page cache truncation to work properly for reflink and extent shifting - Fix use-after-free when retrying failed inode/dquot buffer logging - Fix corruptions seen when using copy_file_range in directio mode" * tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: readpages doesn't zero page tail beyond EOF vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP iomap: dio data corruption and spurious errors when pipes fill iomap: sub-block dio needs to zeroout beyond EOF iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents xfs: delalloc -> unwritten COW fork allocation can go wrong xfs: flush removing page cache in xfs_reflink_remap_prep xfs: extent shifting doesn't fully invalidate page cache xfs: finobt AG reserves don't consider last AG can be a runt xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers xfs: uncached buffer tracing needs to print bno xfs: make xfs_file_remap_range() static xfs: fix shared extent data corruption due to missing cow reservation
2018-11-23net: gemini: Fix copy/paste errorAndreas Fiedler1-1/+1
The TX stats should be started with the tx_stats_syncp, there seems to be a copy/paste error in the driver. Signed-off-by: Andreas Fiedler <andreas.fiedler@gmx.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23net: phy: mscc: fix deadlock in vsc85xx_default_configQuentin Schulz1-9/+5
The vsc85xx_default_config function called in the vsc85xx_config_init function which is used by VSC8530, VSC8531, VSC8540 and VSC8541 PHYs mistakenly calls phy_read and phy_write in-between phy_select_page and phy_restore_page. phy_select_page and phy_restore_page actually take and release the MDIO bus lock and phy_write and phy_read take and release the lock to write or read to a PHY register. Let's fix this deadlock by using phy_modify_paged which handles correctly a read followed by a write in a non-standard page. Fixes: 6a0bfbbe20b0 ("net: phy: mscc: migrate to phy_select/restore_page functions") Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23dt-bindings: dsa: Fix typo in "probed"Fabio Estevam1-1/+1
The correct form is "can be probed", so fix the typo. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queueLorenzo Bianconi1-1/+3
Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine since it is used to check if tso dma descriptor queue has been previously allocated. The issue can be triggered with the following reproducer: $ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o $ip link set dev enP2p1s0v0 xdpdrv off [ 341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0 [ 341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 341.521874] pstate: 60400005 (nZCv daif +PAN -UAO) [ 341.526654] pc : __vunmap+0x98/0xe0 [ 341.530132] lr : __vunmap+0x98/0xe0 [ 341.533609] sp : ffff00001c5db860 [ 341.536913] x29: ffff00001c5db860 x28: 0000000000020000 [ 341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000 [ 341.547515] x25: 0000000000000000 x24: 00000000fbd00000 [ 341.552816] x23: 0000000000000000 x22: ffff810feb5090b0 [ 341.558117] x21: 0000000000000000 x20: 0000000000000000 [ 341.563418] x19: ffff000017e57000 x18: 0000000000000000 [ 341.568719] x17: 0000000000000000 x16: 0000000000000000 [ 341.574020] x15: 0000000000000010 x14: ffffffffffffffff [ 341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f [ 341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510 [ 341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8 [ 341.595224] x7 : 3430303030303030 x6 : 00000000000006ef [ 341.600525] x5 : 00000000003fffff x4 : 0000000000000000 [ 341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff [ 341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038 [ 341.616428] Call trace: [ 341.618866] __vunmap+0x98/0xe0 [ 341.621997] vunmap+0x3c/0x50 [ 341.624961] arch_dma_free+0x68/0xa0 [ 341.628534] dma_direct_free+0x50/0x80 [ 341.632285] nicvf_free_resources+0x160/0x2d8 [nicvf] [ 341.637327] nicvf_config_data_transfer+0x174/0x5e8 [nicvf] [ 341.642890] nicvf_stop+0x298/0x340 [nicvf] [ 341.647066] __dev_close_many+0x9c/0x108 [ 341.650977] dev_close_many+0xa4/0x158 [ 341.654720] rollback_registered_many+0x140/0x530 [ 341.659414] rollback_registered+0x54/0x80 [ 341.663499] unregister_netdevice_queue+0x9c/0xe8 [ 341.668192] unregister_netdev+0x28/0x38 [ 341.672106] nicvf_remove+0xa4/0xa8 [nicvf] [ 341.676280] nicvf_shutdown+0x20/0x30 [nicvf] [ 341.680630] pci_device_shutdown+0x44/0x88 [ 341.684720] device_shutdown+0x144/0x250 [ 341.688640] kernel_restart_prepare+0x44/0x50 [ 341.692986] kernel_restart+0x20/0x68 [ 341.696638] __se_sys_reboot+0x210/0x238 [ 341.700550] __arm64_sys_reboot+0x24/0x30 [ 341.704555] el0_svc_handler+0x94/0x110 [ 341.708382] el0_svc+0x8/0xc [ 341.711252] ---[ end trace 3f4019c8439959c9 ]--- [ 341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4 [ 341.723872] flags: 0x1fffe000000000() [ 341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000 [ 341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 [ 341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) where xdp_dummy.c is a simple bpf program that forwards the incoming frames to the network stack (available here: https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c) Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support") Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23net: amd: add missing of_node_put()Yangtao Li1-1/+3
of_find_node_by_path() acquires a reference to the node returned by it and that reference needs to be dropped by its caller. This place doesn't do that, so fix it. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23team: no need to do team_notify_peers or team_mcast_rejoin when disabling portHangbin Liu1-2/+0
team_notify_peers() will send ARP and NA to notify peers. team_mcast_rejoin() will send multicast join group message to notify peers. We should do this when enabling/changed to a new port. But it doesn't make sense to do it when a port is disabled. On the other hand, when we set mcast_rejoin_count to 2, and do a failover, team_port_disable() will increase mcast_rejoin.count_pending to 2 and then team_port_enable() will increase mcast_rejoin.count_pending to 4. We will send 4 mcast rejoin messages at latest, which will make user confused. The same with notify_peers.count. Fix it by deleting team_notify_peers() and team_mcast_rejoin() in team_port_disable(). Reported-by: Liang Li <liali@redhat.com> Fixes: fc423ff00df3a ("team: add peer notification") Fixes: 492b200efdd20 ("team: add support for sending multicast rejoins") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23bpf: fix check of allowed specifiers in bpf_trace_printkMartynas Pumputis1-3/+5
A format string consisting of "%p" or "%s" followed by an invalid specifier (e.g. "%p%\n" or "%s%") could pass the check which would make format_decode (lib/vsprintf.c) to warn. Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()") Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-23virtio-net: fail XDP set if guest csum is negotiatedJason Wang1-2/+3
We don't support partial csumed packet since its metadata will be lost or incorrect during XDP processing. So fail the XDP set if guest_csum feature is negotiated. Fixes: f600b6905015 ("virtio_net: Add XDP support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pavel Popa <pashinho1990@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23virtio-net: disable guest csum during XDP setJason Wang1-6/+2
We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we can receive partial csumed packets with metadata kept in the vnet_hdr. This may have several side effects: - It could be overridden by header adjustment, thus is might be not correct after XDP processing. - There's no way to pass such metadata information through XDP_REDIRECT to another driver. - XDP does not support checksum offload right now. So simply disable guest csum if possible in this the case of XDP. Fixes: 3f93522ffab2d ("virtio-net: switch off offloads on demand if possible on XDP set") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pavel Popa <pashinho1990@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23Merge tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-clientLinus Torvalds1-3/+9
Pullk ceph fix from Ilya Dryomov: "A messenger fix, marked for stable" * tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-client: libceph: fall back to sendmsg for slab pages
2018-11-23Merge tag 'for-linus-20181123' of git://git.kernel.dk/linux-blockLinus Torvalds1-10/+63
Pull block fix from Jens Axboe: "Just a single fix for this week, fixing an issue with nvme-fc" * tag 'for-linus-20181123' of git://git.kernel.dk/linux-block: nvme-fc: resolve io failures during connect
2018-11-23net/sched: act_police: add missing spinlock initializationDavide Caratti1-0/+1
commit f2cbd4852820 ("net/sched: act_police: fix race condition on state variables") introduces a new spinlock, but forgets its initialization. Ensure that tcf_police_init() initializes 'tcfp_lock' every time a 'police' action is newly created, to avoid the following lockdep splat: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. <...> Call Trace: dump_stack+0x85/0xcb register_lock_class+0x581/0x590 __lock_acquire+0xd4/0x1330 ? tcf_police_init+0x2fa/0x650 [act_police] ? lock_acquire+0x9e/0x1a0 lock_acquire+0x9e/0x1a0 ? tcf_police_init+0x2fa/0x650 [act_police] ? tcf_police_init+0x55a/0x650 [act_police] _raw_spin_lock_bh+0x34/0x40 ? tcf_police_init+0x2fa/0x650 [act_police] tcf_police_init+0x2fa/0x650 [act_police] tcf_action_init_1+0x384/0x4c0 tcf_action_init+0xf6/0x160 tcf_action_add+0x73/0x170 tc_ctl_action+0x122/0x160 rtnetlink_rcv_msg+0x2a4/0x490 ? netlink_deliver_tap+0x99/0x400 ? validate_linkmsg+0x370/0x370 netlink_rcv_skb+0x4d/0x130 netlink_unicast+0x196/0x230 netlink_sendmsg+0x2e5/0x3e0 sock_sendmsg+0x36/0x40 ___sys_sendmsg+0x280/0x2f0 ? _raw_spin_unlock+0x24/0x30 ? handle_pte_fault+0xafe/0xf30 ? find_held_lock+0x2d/0x90 ? syscall_trace_enter+0x1df/0x360 ? __sys_sendmsg+0x5e/0xa0 __sys_sendmsg+0x5e/0xa0 do_syscall_64+0x60/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f1841c7cf10 Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24 RSP: 002b:00007ffcf9df4d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1841c7cf10 RDX: 0000000000000000 RSI: 00007ffcf9df4dc0 RDI: 0000000000000003 RBP: 000000005bf56105 R08: 0000000000000002 R09: 00007ffcf9df8edc R10: 00007ffcf9df47e0 R11: 0000000000000246 R12: 0000000000671be0 R13: 00007ffcf9df4e84 R14: 0000000000000008 R15: 0000000000000000 Fixes: f2cbd4852820 ("net/sched: act_police: fix race condition on state variables") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23net: don't keep lonely packets forever in the gro hashPaolo Abeni1-2/+5
Eric noted that with UDP GRO and NAPI timeout, we could keep a single UDP packet inside the GRO hash forever, if the related NAPI instance calls napi_gro_complete() at an higher frequency than the NAPI timeout. Willem noted that even TCP packets could be trapped there, till the next retransmission. This patch tries to address the issue, flushing the old packets - those with a NAPI_GRO_CB age before the current jiffy - before scheduling the NAPI timeout. The rationale is that such a timeout should be well below a jiffy and we are not flushing packets eligible for sane GRO. v1 -> v2: - clarified the commit message and comment RFC -> v1: - added 'Fixes tags', cleaned-up the wording. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 3b47d30396ba ("net: gro: add a per device gro flush timer") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23net/ipv6: re-do dad when interface has IFF_NOARP flag changeHangbin Liu1-6/+13
When we add a new IPv6 address, we should also join corresponding solicited-node multicast address, unless the interface has IFF_NOARP flag, as function addrconf_join_solict() did. But if we remove IFF_NOARP flag later, we do not do dad and add the mcast address. So we will drop corresponding neighbour discovery message that came from other nodes. A typical example is after creating a ipvlan with mode l3, setting up an ipv6 address and changing the mode to l2. Then we will not be able to ping this address as the interface doesn't join related solicited-node mcast address. Fix it by re-doing dad when interface changed IFF_NOARP flag. Then we will add corresponding mcast group and check if there is a duplicate address on the network. Reported-by: Jianlin Shi <jishi@redhat.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23Merge tag 'iommu-fixes-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds4-3/+7
Pull IOMMU fixes from Joerg Roedel: - Two fixes for the Intel VT-d driver to fix a NULL-ptr dereference and an unbalance in an allocate/free path (allocated with memremap, freed with iounmap) - Fix for a crash in the Renesas IOMMU driver - Fix for the Advanced Virtual Interrupt Controler (AVIC) code in the AMD IOMMU driver * tag 'iommu-fixes-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Use memunmap to free memremap amd/iommu: Fix Guest Virtual APIC Log Tail Address Register iommu/ipmmu-vmsa: Fix crash on early domain free iommu/vt-d: Fix NULL pointer dereference in prq_event_thread()
2018-11-23packet: copy user buffers before orphan or cloneWillem de Bruijn2-3/+19
tpacket_snd sends packets with user pages linked into skb frags. It notifies that pages can be reused when the skb is released by setting skb->destructor to tpacket_destruct_skb. This can cause data corruption if the skb is orphaned (e.g., on transmit through veth) or cloned (e.g., on mirror to another psock). Create a kernel-private copy of data in these cases, same as tun/tap zerocopy transmission. Reuse that infrastructure: mark the skb as SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx). Unlike other zerocopy packets, do not set shinfo destructor_arg to struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify when the original skb is released and a timestamp is recorded. Do not change this timestamp behavior. The ubuf_info->callback is not needed anyway, as no zerocopy notification is expected. Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The resulting value is not a valid ubuf_info pointer, nor a valid tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this. The fix relies on features introduced in commit 52267790ef52 ("sock: add MSG_ZEROCOPY"), so can be backported as is only to 4.14. Tested with from `./in_netns.sh ./txring_overwrite` from http://github.com/wdebruij/kerneltools/tests Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap") Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23Merge tag 'acpi-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds1-0/+1
Pull ACPI fix from Rafael Wysocki: "Prevent the ACPI core from registering a platform device for the SMB0001 HID to avoid IRQ allocation issues (Hans de Goede)" * tag 'acpi-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / platform: Add SMB0001 HID to forbidden_id_list
2018-11-23Merge tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds10-22/+42
Pull power management fixes from Rafael Wysocki: "These fix two issues in the Operating Performance Points (OPP) framework, one cpufreq driver issue, one problem related to the tasks freezer and a few build-related issues in the cpupower utility. Specifics: - Fix tasks freezer deadlock in de_thread() that occurs if one of its sub-threads has been frozen already (Chanho Min). - Avoid registering a platform device by the ti-cpufreq driver on platforms that cannot use it (Dave Gerlach). - Fix a mistake in the ti-opp-supply operating performance points (OPP) driver that caused an incorrect reference voltage to be used and make it adjust the minimum voltage dynamically to avoid hangs or crashes in some cases (Keerthy). - Fix issues related to compiler flags in the cpupower utility and correct a linking problem in it by renaming a file with a duplicate name (Jiri Olsa, Konstantin Khlebnikov)" * tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: exec: make de_thread() freezable cpufreq: ti-cpufreq: Only register platform_device when supported opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call opp: ti-opp-supply: Dynamically update u_volt_min tools cpupower: Override CFLAGS assignments tools cpupower debug: Allow to use outside build flags tools/power/cpupower: fix compilation with STATIC=true
2018-11-23arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD blockWill Deacon1-1/+1
When merging support for SSBD and the CRC32 instructions, the conflict resolution for the new capability entries in arm64_features[] inadvertedly predicated the availability of the CRC32 instructions on CONFIG_ARM64_SSBD, despite the functionality being entirely unrelated. Move the #ifdef CONFIG_ARM64_SSBD down so that it only covers the SSBD capability. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-23Merge tag 'gpio-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpioLinus Torvalds4-19/+30
Pull GPIO fixes from Linus Walleij: "Minor stuff except the IDA leak which was kind of important to fix. Also new maintainers, yay. - Do not lose an IDA on the gpiochip register errorpath. - Fix the PXA non-pincontrol GPIO-using platforms. - Fix the direction on the mockup GPIO driver. - Add some MAINTAINERS stuff: Bartosz stepped up as GPIO co-maintainer, and Andy established an Intel git tree" * tag 'gpio-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: MAINTAINERS: Do maintain Intel GPIO drivers via separate tree gpio: mockup: fix indicated direction gpio: pxa: fix legacy non pinctrl aware builds again gpio: don't free unallocated ida on gpiochip_add_data_with_key() error path MAINTAINERS: add myself as co-maintainer of gpiolib
2018-11-23Merge tag 'mmc-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds1-3/+83
Pull MMC fixes from Ulf Hansson: "MMC host: - sdhci-pci: Fixup card detect lookup - sdhci-pci: Workaround GLK firmware bug for tuning" * tag 'mmc-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-pci: Workaround GLK firmware failing to restore the tuning value mmc: sdhci-pci: Try "cd" for card-detect lookup before using NULL
2018-11-23Merge tag 'drm-fixes-2018-11-23' of git://anongit.freedesktop.org/drm/drmLinus Torvalds24-63/+273
Pull drm fixes from Dave Airlie: "Regular drm fixes: amdgpu: - Vega20 fixes - firmware loading fix - panel display fix - override fix i915: - Sandybridge lockup fix - fastboot DSI panel fix - GPU hang on Broxton - GPU reloc fixes on pineview/bearlake ast: - screen blurring fix - cursor appearance fix udmabuf: - mmap fix vc4: - NULL deref fix - async cursor update fix All seems pretty normal at this stage" * tag 'drm-fixes-2018-11-23' of git://anongit.freedesktop.org/drm/drm: drm/ast: fixed cursor may disappear sometimes drm/ast: change resolution may cause screen blurred drm/i915: Add rotation readout for plane initial config drm/i915: Force a LUT update in intel_initial_commit() drm/fb-helper: Blacklist writeback when adding connectors to fbdev drm/i915: Write GPU relocs harder with gen3 drm/amdgpu: Enable HDP memory light sleep drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture drm/amd/pp: handle negative values when reading OD drm/amdgpu: Add missing firmware entry for HAINAN drm/amd/powerplay: disable Vega20 DS related features drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset drm/i915: Disable LP3 watermarks on all SNB machines drm/ast: Remove existing framebuffers before loading driver udmabuf: set read/write flag when exporting drm/amd/display: Support amdgpu "max bpc" connector property (v2) drm/amdgpu: Add amdgpu "max bpc" connector property (v2) drm/vc4: Set ->legacy_cursor_update to false when doing non-async updates drm/vc4: Fix NULL pointer dereference in the async update path
2018-11-23arm64: sysreg: fix sparse warningsSergey Matyukevich1-2/+2
Specify correct type for the constants to avoid the following sparse complaints: ./arch/arm64/include/asm/sysreg.h:471:42: warning: constant 0xffffffffffffffff is so big it is unsigned long ./arch/arm64/include/asm/sysreg.h:512:42: warning: constant 0xffffffffffffffff is so big it is unsigned long Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-23Merge branches 'pm-cpufreq' and 'pm-sleep'Rafael J. Wysocki2-7/+24
* pm-cpufreq: cpufreq: ti-cpufreq: Only register platform_device when supported * pm-sleep: exec: make de_thread() freezable