wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2019-07-25	x86/apic: Cleanup the include maze	Thomas Gleixner	9	-112/+26
	All of these APIC files include the world and some more. Remove the unneeded cruft. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105219.342631201@linutronix.de
2019-07-25	x86/apic: Move IPI inlines into ipi.c	Thomas Gleixner	2	-22/+13
	No point in having them in an header file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105219.252225936@linutronix.de
2019-07-25	x86/apic: Make apic_pending_intr_clear() more robust	Thomas Gleixner	1	-44/+63
	In course of developing shorthand based IPI support issues with the function which tries to clear eventually pending ISR bits in the local APIC were observed. 1) O-day testing triggered the WARN_ON() in apic_pending_intr_clear(). This warning is emitted when the function fails to clear pending ISR bits or observes pending IRR bits which are not delivered to the CPU after the stale ISR bit(s) are ACK'ed. Unfortunately the function only emits a WARN_ON() and fails to dump the IRR/ISR content. That's useless for debugging. Feng added spot on debug printk's which revealed that the stale IRR bit belonged to the APIC timer interrupt vector, but adding ad hoc debug code does not help with sporadic failures in the field. Rework the loop so the full IRR/ISR contents are saved and on failure dumped. 2) The loop termination logic is interesting at best. If the machine has no TSC or cpu_khz is not known yet it tries 1 million times to ack stale IRR/ISR bits. What? With TSC it uses the TSC to calculate the loop termination. It takes a timestamp at entry and terminates the loop when: (rdtsc() - start_timestamp) >= (cpu_hkz << 10) That's roughly one second. Both methods are problematic. The APIC has 256 vectors, which means that in theory max. 256 IRR/ISR bits can be set. In practice this is impossible and the chance that more than a few bits are set is close to zero. With the pure loop based approach the 1 million retries are complete overkill. With TSC this can terminate too early in a guest which is running on a heavily loaded host even with only a couple of IRR/ISR bits set. The reason is that after acknowledging the highest priority ISR bit, pending IRRs must get serviced first before the next round of acknowledge can take place as the APIC (real and virtualized) does not honour EOI without a preceeding interrupt on the CPU. And every APIC read/write takes a VMEXIT if the APIC is virtualized. While trying to reproduce the issue 0-day reported it was observed that the guest was scheduled out long enough under heavy load that it terminated after 8 iterations. Make the loop terminate after 512 iterations. That's plenty enough in any case and does not take endless time to complete. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105219.158847694@linutronix.de
2019-07-25	x86/apic: Soft disable APIC before initializing it	Thomas Gleixner	1	-0/+8
	If the APIC was already enabled on entry of setup_local_APIC() then disabling it soft via the SPIV register makes a lot of sense. That masks all LVT entries and brings it into a well defined state. Otherwise previously enabled LVTs which are not touched in the setup function stay unmasked and might surprise the just booting kernel. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105219.068290579@linutronix.de
2019-07-25	x86/apic: Invoke perf_events_lapic_init() after enabling APIC	Thomas Gleixner	1	-3/+2
	If the APIC is soft disabled then unmasking an LVT entry does not work and the write is ignored. perf_events_lapic_init() tries to do so. Move the invocation after the point where the APIC has been enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105218.962517234@linutronix.de
2019-07-25	x86/kgbd: Use NMI_VECTOR not APIC_DM_NMI	Thomas Gleixner	1	-1/+1
	apic->send_IPI_allbutself() takes a vector number as argument. APIC_DM_NMI is clearly not a vector number. It's defined to 0x400 which is outside the vector space. Use NMI_VECTOR instead as that's what it is intended to be. Fixes: 82da3ff89dc2 ("x86: kgdb support") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105218.855189979@linutronix.de
2019-07-25	x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails	Grzegorz Halat	1	-19/+27
	A reboot request sends an IPI via the reboot vector and waits for all other CPUs to stop. If one or more CPUs are in critical regions with interrupts disabled then the IPI is not handled on those CPUs and the shutdown hangs if native_stop_other_cpus() is called with the wait argument set. Such a situation can happen when one CPU was stopped within a lock held section and another CPU is trying to acquire that lock with interrupts disabled. There are other scenarios which can cause such a lockup as well. In theory the shutdown should be attempted by an NMI IPI after the timeout period elapsed. Though the wait loop after sending the reboot vector IPI prevents this. It checks the wait request argument and the timeout. If wait is set, which is true for sys_reboot() then it won't fall through to the NMI shutdown method after the timeout period has finished. This was an oversight when the NMI shutdown mechanism was added to handle the 'reboot IPI is not working' situation. The mechanism was added to deal with stuck panic shutdowns, which do not have the wait request set, so the 'wait request' case was probably not considered. Remove the wait check from the post reboot vector IPI wait loop and enforce that the wait loop in the NMI fallback path is invoked even if NMI IPIs are disabled or the registration of the NMI handler fails. That second wait loop will then hang if not all CPUs shutdown and the wait argument is set. [ tglx: Avoid the hard to parse line break in the NMI fallback path, add comments and massage the changelog ] Fixes: 7d007d21e539 ("x86/reboot: Use NMI to assist in shutting down if IRQ fails") Signed-off-by: Grzegorz Halat <ghalat@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Don Zickus <dzickus@redhat.com> Link: https://lkml.kernel.org/r/20190628122813.15500-1-ghalat@redhat.com
2019-07-25	cpumask: Implement cpumask_or_equal()	Thomas Gleixner	3	-0/+57
	The IPI code of x86 needs to evaluate whether the target cpumask is equal to the cpu_online_mask or equal except for the calling CPU. To replace the current implementation which requires the usage of a temporary cpumask, which might involve allocations, add a new function which compares a cpumask to the result of two other cpumasks which are or'ed together before comparison. This allows to make the required decision in one go and the calling code then can check for the calling CPU being set in the target mask with cpumask_test_cpu(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105220.585449120@linutronix.de
2019-07-25	smp/hotplug: Track booted once CPUs in a cpumask	Thomas Gleixner	2	-4/+9
	The booted once information which is required to deal with the MCE broadcast issue on X86 correctly is stored in the per cpu hotplug state, which is perfectly fine for the intended purpose. X86 needs that information for supporting NMI broadcasting via shortcuts, but retrieving it from per cpu data is cumbersome. Move it to a cpumask so the information can be checked against the cpu_present_mask quickly. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190722105219.818822855@linutronix.de
2019-07-22	sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y	Thomas Gleixner	1	-4/+4
	The merge of the CONFIG_PREEMPT_RT stub renamed CONFIG_PREEMPT to CONFIG_PREEMPT_LL which causes all defconfigs which have CONFIG_PREEMPT=y set to fall back to CONFIG_PREEMPT_NONE because CONFIG_PREEMPT depends on the preemption mode choice wich defaults to NONE. This also affects oldconfig builds. So rather than changing 114 defconfig files and being an annoyance to users, revert the rename and select a new config symbol PREEMPTION. That keeps everything working smoothly and the revelant ifdef's are going to be fixed up step by step. Reported-by: Mark Rutland <mark.rutland@arm.com> Fixes: a50a3f4b6a31 ("sched/rt, Kconfig: Introduce CONFIG_PREEMPT_RT") Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-22	pidfd: fix a poll race when setting exit_state	Suren Baghdasaryan	1	-0/+1
	There is a race between reading task->exit_state in pidfd_poll and writing it after do_notify_parent calls do_notify_pidfd. Expected sequence of events is: CPU 0 CPU 1 ------------------------------------------------ exit_notify do_notify_parent do_notify_pidfd tsk->exit_state = EXIT_DEAD pidfd_poll if (tsk->exit_state) However nothing prevents the following sequence: CPU 0 CPU 1 ------------------------------------------------ exit_notify do_notify_parent do_notify_pidfd pidfd_poll if (tsk->exit_state) tsk->exit_state = EXIT_DEAD This causes a polling task to wait forever, since poll blocks because exit_state is 0 and the waiting task is not notified again. A stress test continuously doing pidfd poll and process exits uncovered this bug. To fix it, we make sure that the task's exit_state is always set before calling do_notify_pidfd. Fixes: b53b0b9d9a6 ("pidfd: add polling support") Cc: kernel-team@android.com Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org [christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie] Signed-off-by: Christian Brauner <christian@brauner.io>
2019-07-22	x86/paravirt: Drop {read,write}_cr8() hooks	Andrew Cooper	8	-66/+1
	There is a lot of infrastructure for functionality which is used exclusively in __{save,restore}_processor_state() on the suspend/resume path. cr8 is an alias of APIC_TASKPRI, and APIC_TASKPRI is saved/restored by lapic_{suspend,resume}(). Saving and restoring cr8 independently of the rest of the Local APIC state isn't a clever thing to be doing. Delete the suspend/resume cr8 handling, which shrinks the size of struct saved_context, and allows for the removal of both PVOPS. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lkml.kernel.org/r/20190715151641.29210-1-andrew.cooper3@citrix.com
2019-07-22	x86/apic: Initialize TPR to block interrupts 16-31	Andy Lutomirski	1	-2/+5
	The APIC, per spec, is fundamentally confused and thinks that interrupt vectors 16-31 are valid. This makes no sense -- the CPU reserves vectors 0-31 for exceptions (faults, traps, etc). Obviously, no device should actually produce an interrupt with vector 16-31, but robustness can be improved by setting the APIC TPR class to 1, which will prevent delivery of an interrupt with a vector below 32. Note: This is not intended as a security measure against attackers who control malicious hardware. Any PCI or similar hardware that can be controlled by an attacker MUST be behind a functional IOMMU that remaps interrupts. The purpose of this change is to reduce the chance that a certain class of device malfunctions crashes the kernel in hard-to-debug ways. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/dc04a9f8b234d7b0956a8d2560b8945bcd9c4bf7.1563117760.git.luto@kernel.org
2019-07-21	tcp: be more careful in tcp_fragment()	Eric Dumazet	2	-2/+16
	Some applications set tiny SO_SNDBUF values and expect TCP to just work. Recent patches to address CVE-2019-11478 broke them in case of losses, since retransmits might be prevented. We should allow these flows to make progress. This patch allows the first and last skb in retransmit queue to be split even if memory limits are hit. It also adds the some room due to the fact that tcp_sendmsg() and tcp_sendpage() might overshoot sk_wmem_queued by about one full TSO skb (64KB size). Note this allowance was already present in stable backports for kernels < 4.15 Note for < 4.15 backports : tcp_rtx_queue_tail() will probably look like : static inline struct sk_buff tcp_rtx_queue_tail(const struct sock sk) { struct sk_buff *skb = tcp_send_head(sk); return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk); } Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrew Prout <aprout@ll.mit.edu> Tested-by: Andrew Prout <aprout@ll.mit.edu> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Tested-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Christoph Paasch <cpaasch@apple.com> Cc: Jonathan Looney <jtl@netflix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()	Haiyang Zhang	1	-1/+0
	There is an extra rcu_read_unlock left in netvsc_recv_callback(), after a previous patch that removes RCU from this function. This patch removes the extra RCU unlock. Fixes: 345ac08990b8 ("hv_netvsc: pass netvsc_device to receive callback") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	Linus 5.3-rc1	Linus Torvalds	1	-2/+2

2019-07-21	vrf: make sure skb->data contains ip header to make routing	Peter Kosyh	1	-23/+35
	vrf_process_v4_outbound() and vrf_process_v6_outbound() do routing using ip/ipv6 addresses, but don't make sure the header is available in skb->data[] (skb_headlen() is less then header size). Case: 1) igb driver from intel. 2) Packet size is greater then 255. 3) MPLS forwards to VRF device. So, patch adds pskb_may_pull() calls in vrf_process_v4/v6_outbound() functions. Signed-off-by: Peter Kosyh <p.kosyh@gmail.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	connector: remove redundant input callback from cn_dev	Vasily Averin	2	-6/+1
	A small cleanup: this callback is never used. Originally fixed by Stanislav Kinsburskiy <skinsbursky@virtuozzo.com> for OpenVZ7 bug OVZ-6877 cc: stanislav.kinsburskiy@gmail.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	qed: Prefer pcie_capability_read_word()	Frederick Lawler	1	-3/+2
	Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability") added accessors for the PCI Express Capability so that drivers didn't need to be aware of differences between v1 and v2 of the PCI Express Capability. Replace pci_read_config_word() and pci_write_config_word() calls with pcie_capability_read_word() and pcie_capability_write_word(). Signed-off-by: Frederick Lawler <fred@fredlawl.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	igc: Prefer pcie_capability_read_word()	Frederick Lawler	1	-8/+4
	Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability") added accessors for the PCI Express Capability so that drivers didn't need to be aware of differences between v1 and v2 of the PCI Express Capability. Replace pci_read_config_word() and pci_write_config_word() calls with pcie_capability_read_word() and pcie_capability_write_word(). Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	cxgb4: Prefer pcie_capability_read_word()	Frederick Lawler	2	-10/+5
	Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability") added accessors for the PCI Express Capability so that drivers didn't need to be aware of differences between v1 and v2 of the PCI Express Capability. Replace pci_read_config_word() and pci_write_config_word() calls with pcie_capability_read_word() and pcie_capability_write_word(). Signed-off-by: Frederick Lawler <fred@fredlawl.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	be2net: Synchronize be_update_queues with dev_watchdog	Benjamin Poirier	1	-0/+5
	As pointed out by Firo Yang, a netdev tx timeout may trigger just before an ethtool set_channels operation is started. be_tx_timeout(), which dumps some queue structures, is not written to run concurrently with be_update_queues(), which frees/allocates those queues structures. Add some synchronization between the two. Message-id: <CH2PR18MB31898E033896F9760D36BFF288C90@CH2PR18MB3189.namprd18.prod.outlook.com> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	bnx2x: Prevent load reordering in tx completion processing	Brian King	1	-0/+3
	This patch fixes an issue seen on Power systems with bnx2x which results in the skb is NULL WARN_ON in bnx2x_free_tx_pkt firing due to the skb pointer getting loaded in bnx2x_free_tx_pkt prior to the hw_cons load in bnx2x_tx_int. Adding a read memory barrier resolves the issue. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	net: phy: sfp: hwmon: Fix scaling of RX power	Andrew Lunn	1	-1/+1
	The RX power read from the SFP uses units of 0.1uW. This must be scaled to units of uW for HWMON. This requires a divide by 10, not the current 100. With this change in place, sensors(1) and ethtool -m agree: sff2-isa-0000 Adapter: ISA adapter in0: +3.23 V temp1: +33.1 C power1: 270.00 uW power2: 200.00 uW curr1: +0.01 A Laser output power : 0.2743 mW / -5.62 dBm Receiver signal average optical power : 0.2014 mW / -6.96 dBm Reported-by: chris.healy@zii.aero Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: 1323061a018a ("net: phy: sfp: Add HWMON support for module sensors") Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	net: sched: verify that q!=NULL before setting q->flags	Vlad Buslov	1	-1/+3
	In function int tc_new_tfilter() q pointer can be NULL when adding filter on a shared block. With recent change that resets TCQ_F_CAN_BYPASS after filter creation, following NULL pointer dereference happens in case parent block is shared: [ 212.925060] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 212.925445] #PF: supervisor write access in kernel mode [ 212.925709] #PF: error_code(0x0002) - not-present page [ 212.925965] PGD 8000000827923067 P4D 8000000827923067 PUD 827924067 PMD 0 [ 212.926302] Oops: 0002 [#1] SMP KASAN PTI [ 212.926539] CPU: 18 PID: 2617 Comm: tc Tainted: G B 5.2.0+ #512 [ 212.926938] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 212.927364] RIP: 0010:tc_new_tfilter+0x698/0xd40 [ 212.927633] Code: 74 0d 48 85 c0 74 08 48 89 ef e8 03 aa 62 00 48 8b 84 24 a0 00 00 00 48 8d 78 10 48 89 44 24 18 e8 4d 0c 6b ff 48 8b 44 24 18 <83> 60 10 f b 48 85 ed 0f 85 3d fe ff ff e9 4f fe ff ff e8 81 26 f8 [ 212.928607] RSP: 0018:ffff88884fd5f5d8 EFLAGS: 00010296 [ 212.928905] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dffffc0000000000 [ 212.929201] RDX: 0000000000000007 RSI: 0000000000000004 RDI: 0000000000000297 [ 212.929402] RBP: ffff88886bedd600 R08: ffffffffb91d4b51 R09: fffffbfff7616e4d [ 212.929609] R10: fffffbfff7616e4c R11: ffffffffbb0b7263 R12: ffff88886bc61040 [ 212.929803] R13: ffff88884fd5f950 R14: ffffc900039c5000 R15: ffff88835e927680 [ 212.929999] FS: 00007fe7c50b6480(0000) GS:ffff88886f980000(0000) knlGS:0000000000000000 [ 212.930235] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 212.930394] CR2: 0000000000000010 CR3: 000000085bd04002 CR4: 00000000001606e0 [ 212.930588] Call Trace: [ 212.930682] ? tc_del_tfilter+0xa40/0xa40 [ 212.930811] ? __lock_acquire+0x5b5/0x2460 [ 212.930948] ? find_held_lock+0x85/0xa0 [ 212.931081] ? tc_del_tfilter+0xa40/0xa40 [ 212.931201] rtnetlink_rcv_msg+0x4ab/0x5f0 [ 212.931332] ? rtnl_dellink+0x490/0x490 [ 212.931454] ? lockdep_hardirqs_on+0x260/0x260 [ 212.931589] ? netlink_deliver_tap+0xab/0x5a0 [ 212.931717] ? match_held_lock+0x1b/0x240 [ 212.931844] netlink_rcv_skb+0xd0/0x200 [ 212.931958] ? rtnl_dellink+0x490/0x490 [ 212.932079] ? netlink_ack+0x440/0x440 [ 212.932205] ? netlink_deliver_tap+0x161/0x5a0 [ 212.932335] ? lock_downgrade+0x360/0x360 [ 212.932457] ? lock_acquire+0xe5/0x210 [ 212.932579] netlink_unicast+0x296/0x350 [ 212.932705] ? netlink_attachskb+0x390/0x390 [ 212.932834] ? _copy_from_iter_full+0xe0/0x3a0 [ 212.932976] netlink_sendmsg+0x394/0x600 [ 212.937998] ? netlink_unicast+0x350/0x350 [ 212.943033] ? move_addr_to_kernel.part.0+0x90/0x90 [ 212.948115] ? netlink_unicast+0x350/0x350 [ 212.953185] sock_sendmsg+0x96/0xa0 [ 212.958099] ___sys_sendmsg+0x482/0x520 [ 212.962881] ? match_held_lock+0x1b/0x240 [ 212.967618] ? copy_msghdr_from_user+0x250/0x250 [ 212.972337] ? lock_downgrade+0x360/0x360 [ 212.976973] ? rwlock_bug.part.0+0x60/0x60 [ 212.981548] ? __mod_node_page_state+0x1f/0xa0 [ 212.986060] ? match_held_lock+0x1b/0x240 [ 212.990567] ? find_held_lock+0x85/0xa0 [ 212.994989] ? do_user_addr_fault+0x349/0x5b0 [ 212.999387] ? lock_downgrade+0x360/0x360 [ 213.003713] ? find_held_lock+0x85/0xa0 [ 213.007972] ? __fget_light+0xa1/0xf0 [ 213.012143] ? sockfd_lookup_light+0x91/0xb0 [ 213.016165] __sys_sendmsg+0xba/0x130 [ 213.020040] ? __sys_sendmsg_sock+0xb0/0xb0 [ 213.023870] ? handle_mm_fault+0x337/0x470 [ 213.027592] ? page_fault+0x8/0x30 [ 213.031316] ? lockdep_hardirqs_off+0xbe/0x100 [ 213.034999] ? mark_held_locks+0x24/0x90 [ 213.038671] ? do_syscall_64+0x1e/0xe0 [ 213.042297] do_syscall_64+0x74/0xe0 [ 213.045828] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 213.049354] RIP: 0033:0x7fe7c527c7b8 [ 213.052792] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f 0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54 [ 213.060269] RSP: 002b:00007ffc3f7908a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 213.064144] RAX: ffffffffffffffda RBX: 000000005d34716f RCX: 00007fe7c527c7b8 [ 213.068094] RDX: 0000000000000000 RSI: 00007ffc3f790910 RDI: 0000000000000003 [ 213.072109] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fe7c5340cc0 [ 213.076113] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000080 [ 213.080146] R13: 0000000000480640 R14: 0000000000000080 R15: 0000000000000000 [ 213.084147] Modules linked in: act_gact cls_flower sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc intel_rapl_msr intel_rapl_common [<1;69;32Msb_edac rdma_ucm rdma_cm x86_pkg_temp_thermal iw_cm intel_powerclamp ib_cm coretemp kvm_intel kvm irqbypass mlx5_ib ib_uverbs ib_core crct10dif_pclmul crc32_pc lmul crc32c_intel ghash_clmulni_intel mlx5_core intel_cstate intel_uncore iTCO_wdt igb iTCO_vendor_support mlxfw mei_me ptp ses intel_rapl_perf mei pcspkr ipmi _ssif i2c_i801 joydev enclosure pps_core lpc_ich ioatdma wmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_vram_helpe r ttm drm_kms_helper drm mpt3sas raid_class scsi_transport_sas [ 213.112326] CR2: 0000000000000010 [ 213.117429] ---[ end trace adb58eb0a4ee6283 ]--- Verify that q pointer is not NULL before setting the 'flags' field. Fixes: 3f05e6886a59 ("net_sched: unset TCQ_F_CAN_BYPASS when adding filters") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	chelsio: Fix a typo in a function name	Christophe JAILLET	1	-2/+2
	It is likely that 'my3216_poll()' should be 'my3126_poll()'. (1 and 2 switched in 3126. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	allocate_flower_entry: should check for null deref	Navid Emamdoost	1	-1/+2
	allocate_flower_entry does not check for allocation success, but tries to deref the result. I only moved the spin_lock under null check, because the caller is checking allocation's status at line 652. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	net: hns3: typo in the name of a constant	Christophe JAILLET	3	-4/+4
	All constant in 'enum HCLGE_MBX_OPCODE' start with HCLGE, except 'HLCGE_MBX_PUSH_VLAN_INFO' (C and L switched) s/HLC/HCL/ Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.	Jeremy Sowden	1	-0/+1
	net/netfilter/nf_tables_offload.h includes net/netfilter/nf_tables.h which is itself on the blacklist. Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	tipc: Fix a typo	Christophe JAILLET	1	-1/+1
	s/tipc_toprsv_listener_data_ready/tipc_topsrv_listener_data_ready/ (r and s switched in topsrv) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21	iommu/amd: fix a crash in iova_magazine_free_pfns	Qian Cai	1	-1/+1
	The commit b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method") incorrectly changed the checking from dma_ops_alloc_iova() in map_sg() causes a crash under memory pressure as dma_ops_alloc_iova() never return DMA_MAPPING_ERROR on failure but 0, so the error handling is all wrong. kernel BUG at drivers/iommu/iova.c:801! Workqueue: kblockd blk_mq_run_work_fn RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0 Call Trace: free_cpu_cached_iovas+0xbd/0x150 alloc_iova_fast+0x8c/0xba dma_ops_alloc_iova.isra.6+0x65/0xa0 map_sg+0x8c/0x2a0 scsi_dma_map+0xc6/0x160 pqi_aio_submit_io+0x1f6/0x440 [smartpqi] pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi] scsi_queue_rq+0x79c/0x1200 blk_mq_dispatch_rq_list+0x4dc/0xb70 blk_mq_sched_dispatch_requests+0x249/0x310 __blk_mq_run_hw_queue+0x128/0x200 blk_mq_run_work_fn+0x27/0x30 process_one_work+0x522/0xa10 worker_thread+0x63/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x22/0x40 Fixes: b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-21	hexagon: switch to generic version of pte allocation	Mike Rapoport	1	-32/+2
	The hexagon implementation pte_alloc_one(), pte_alloc_one_kernel(), pte_free_kernel() and pte_free() is identical to the generic except of lack of __GFP_ACCOUNT for the user PTEs allocation. Switch hexagon to use generic version of these functions. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-21	parisc: Flush ITLB in flush_tlb_all_local() only on split TLB machines	Helge Deller	1	-1/+2
	flush_tlb_all_local() flushes the ITLB and DTLB of the CPU. In case the machine does not have separate ITLBs and DTLBs, use the alternative functionality to replace the code which flushes the ITLB with nops while keeping the code which flushes the DTLB. Signed-off-by: Helge Deller <deller@gmx.de>
2019-07-21	parisc: add kprobe_fault_handler()	Sven Schnelle	1	-0/+4
	Add kprobe_fault_handler() to fix compilation for PA-RISC. On PA-RISC we actually don't need that function as the recovery counter is restored after interrupt. See the PA-RISC 2.0 Architecture Manual, pg. 4-8, Figure 4-4: "Interruption Processing". Fixes: b98cca444d28 ("mm, kprobes: generalize and rename notify_page_fault() as kprobe_page_fault()") Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-07-20	typo fix: it's d_make_root, not d_make_inode...	Al Viro	1	-1/+1
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-20	dt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples	Rob Herring	1	-0/+4
	Now that examples are validated against the DT schema, an error with required 'clocks' property missing is exposed: Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \ pinctrl@40020000: gpio@0: 'clocks' is a required property Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \ pinctrl@50020000: gpio@1000: 'clocks' is a required property Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.example.dt.yaml: \ pinctrl@50020000: gpio@2000: 'clocks' is a required property Add the missing 'clocks' properties to the examples to fix the errors. Fixes: 2c9239c125f0 ("dt-bindings: pinctrl: Convert stm32 pinctrl bindings to json-schema") Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: linux-gpio@vger.kernel.org Cc: linux-stm32@st-md-mailman.stormreply.com Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	dt-bindings: iio: ad7124: Fix dtc warnings in example	Rob Herring	1	-33/+38
	With the conversion to DT schema, the examples are now compiled with dtc. The ad7124 binding example has the following warning: Documentation/devicetree/bindings/iio/adc/adi,ad7124.example.dts:19.11-21: \ Warning (reg_format): /example-0/adc@0:reg: property has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) There's a default #size-cells and #address-cells values of 1 for examples. For examples needing different values such as this one on a SPI bus, they need to provide a SPI bus parent node. Fixes: 26ae15e62d3c ("Convert AD7124 bindings documentation to YAML format.") Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	dt-bindings: iio: avia-hx711: Fix avdd-supply typo in example	Rob Herring	1	-1/+1
	Now that examples are validated against the DT schema, a typo in avia-hx711 example generates a warning: Documentation/devicetree/bindings/iio/adc/avia-hx711.example.dt.yaml: weight: 'avdd-supply' is a required property Fix the typo. Fixes: 5150ec3fe125 ("avia-hx711.yaml: transform DT binding to YAML") Cc: Andreas Klinger <ak@it-klinger.de> Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	dt-bindings: pinctrl: aspeed: Fix AST2500 example errors	Rob Herring	1	-4/+1
	The schema examples are now validated against the schema itself. The AST2500 pinctrl schema has a couple of errors: Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.example.dt.yaml: \ example-0: $nodename:0: 'example-0' does not match '^(bus\|soc\|axi\|ahb\|apb)(@[0-9a-f]+)?$' Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.example.dt.yaml: \ pinctrl: aspeed,external-nodes: [[1, 2]] is too short Fixes: 0a617de16730 ("dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema") Cc: Andrew Jeffery <andrew@aj.id.au> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Joel Stanley <joel@jms.id.au> Cc: linux-aspeed@lists.ozlabs.org Cc: linux-gpio@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	dt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors	Rob Herring	2	-2/+6
	The Aspeed pinctl schema have errors in the 'compatible' schema: Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml: \ properties:compatible:enum: ['aspeed', 'ast2400-pinctrl', 'aspeed', 'g4-pinctrl'] has non-unique elements Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml: \ properties:compatible:enum: ['aspeed', 'ast2500-pinctrl', 'aspeed', 'g5-pinctrl'] has non-unique elements Flow style sequences have to be quoted if the vales contain ','. Fix this by using the more common one line per entry formatting. Fixes: 0a617de16730 ("dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema") Fixes: 07457937bb5c ("dt-bindings: pinctrl: aspeed: Convert AST2400 bindings to json-schema") Cc: Andrew Jeffery <andrew@aj.id.au> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Joel Stanley <joel@jms.id.au> Cc: linux-aspeed@lists.ozlabs.org Cc: linux-gpio@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	dt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes	Rob Herring	1	-82/+61
	Matching on the 'cpus' node was a bad choice because the schema is incorrectly applied to non-RiscV cpus nodes. As we now have a common cpus schema which checks the general structure, it is also redundant to do so in the Risc-V CPU schema. The downside is one could conceivably mix different architecture's cpu nodes or have typos in the compatible string. The latter problem pretty much exists for every schema. Acked-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	dt-bindings: Ensure child nodes are of type 'object'	Rob Herring	6	-0/+8
	Properties which are child node definitions need to have an explict type. Otherwise, a matching (DT) property can silently match when an error is desired. Fix this up tree-wide. Once this is fixed, the meta-schema will enforce this on any child node definitions. Cc: Chen-Yu Tsai <wens@csie.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: linux-mtd@lists.infradead.org Cc: linux-gpio@vger.kernel.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-spi@vger.kernel.org Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Maxime Ripard <maxime.ripard@bootlin.com> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-20	mac80211: don't warn about CW params when not using them	Brian Norris	1	-4/+9
	ieee80211_set_wmm_default() normally sets up the initial CW min/max for each queue, except that it skips doing this if the driver doesn't support ->conf_tx. We still end up calling drv_conf_tx() in some cases (e.g., ieee80211_reconfig()), which also still won't do anything useful...except it complains here about the invalid CW parameters. Let's just skip the WARN if we weren't going to do anything useful with the parameters. Signed-off-by: Brian Norris <briannorris@chromium.org> Link: https://lore.kernel.org/r/20190718015712.197499-1-briannorris@chromium.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20	mac80211: fix possible memory leak in ieee80211_assign_beacon	Lorenzo Bianconi	1	-2/+6
	Free new beacon_data in ieee80211_assign_beacon whenever ieee80211_assign_beacon fails Fixes: 8860020e0be1 ("cfg80211: restructure AP/GO mode API") Fixes: bc847970f432 ("mac80211: support FTM responder configuration/statistic") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/770285772543c9fca33777bb4ad4760239e56256.1562105631.git.lorenzo@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20	nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN	John Crispin	1	-1/+1
	NL80211_HE_MAX_CAPABILITY_LEN has changed between D2.0 and D4.0. It is now MAC (6) + PHY (11) + MCS (12) + PPE (25) = 54. Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190627095832.19445-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20	nl80211: fix VENDOR_CMD_RAW_DATA	Johannes Berg	1	-1/+1
	Since ERR_PTR() is an inline, not a macro, just open-code it here so it's usable as an initializer, fixing the build in brcmfmac. Reported-by: Arend Van Spriel <arend.vanspriel@broadcom.com> Fixes: 901bb9891855 ("nl80211: require and validate vendor command policy") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20	wireless: fix nl80211 vendor commands	Johannes Berg	3	-0/+8
	In my previous commit to validate a policy I neglected to actually add one to the few drivers using vendor commands, fix that now. Reported-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Fixes: 901bb9891855 ("nl80211: require and validate vendor command policy") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20	r8169: fix RTL8168g PHY init	Thomas Voegtle	1	-2/+2
	This fixes a copy&paste error in the original patch. Setting the wrong register resulted in massive packet loss on some systems. Fixes: a2928d28643e ("r8169: use paged versions of phylib MDIO access functions") Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-20	x86/entry/64: Prevent clobbering of saved CR2 value	Thomas Gleixner	1	-1/+10
	The recent fix for CR2 corruption introduced a new way to reliably corrupt the saved CR2 value. CR2 is saved early in the entry code in RDX, which is the third argument to the fault handling functions. But it missed that between saving and invoking the fault handler enter_from_user_mode() can be called. RDX is a caller saved register so the invoked function can freely clobber it with the obvious consequences. The TRACE_IRQS_OFF call is safe as it calls through the thunk which preserves RDX, but TRACE_IRQS_OFF_DEBUG is not because it also calls into C-code outside of the thunk. Store CR2 in R12 instead which is a callee saved register and move R12 to RDX just before calling the fault handler. Fixes: a0d14b8909de ("x86/mm, tracing: Fix CR2 corruption") Reported-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907201020540.1782@nanos.tec.linutronix.de
2019-07-20	smp: Warn on function calls from softirq context	Peter Zijlstra	1	-0/+16
	It's clearly documented that smp function calls cannot be invoked from softirq handling context. Unfortunately nothing enforces that or emits a warning. A single function call can be invoked from softirq context only via smp_call_function_single_async(). The only legit context is task context, so add a warning to that effect. Reported-by: luferry <luferry@163.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass.net