aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-04-18Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcuIngo Molnar61-1458/+1374
Pull RCU and LKMM commits from Paul E. McKenney: - An LKMM commit adding support for synchronize_srcu_expedited() - A couple of straggling RCU flavor consolidation updates - Documentation updates. - Miscellaneous fixes - SRCU updates - RCU CPU stall-warning updates - Torture-test updates Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-17Merge tag '5.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds6-14/+62
Pull smb3 fixes from Steve French: "Five small SMB3 fixes, all also for stable - an important fix for an oplock (lease) bug, a handle leak, and three bugs spotted by KASAN" * tag '5.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: CIFS: keep FileInfo handle live during oplock break cifs: fix handle leak in smb2_query_symlink() cifs: Fix lease buffer length error cifs: Fix use-after-free in SMB2_read cifs: Fix use-after-free in SMB2_write
2019-04-17Merge tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmiLinus Torvalds3-3/+19
Pull IPMI fixes from Corey Minyard: "Fixes for some bugs cause by recent changes. One crash if you feed bad data to the module parameters, one BUG that sometimes occurs when a user closes the connection, and one bug that cause the driver to not work if the configuration information only comes in from SMBIOS" * tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi: ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash ipmi: Fix failure on SMBIOS specified devices
2019-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds125-741/+1118
Pull networking fixes from David Miller: 1) Handle init flow failures properly in iwlwifi driver, from Shahar S Matityahu. 2) mac80211 TXQs need to be unscheduled on powersave start, from Felix Fietkau. 3) SKB memory accounting fix in A-MDSU aggregation, from Felix Fietkau. 4) Increase RCU lock hold time in mlx5 FPGA code, from Saeed Mahameed. 5) Avoid checksum complete with XDP in mlx5, also from Saeed. 6) Fix netdev feature clobbering in ibmvnic driver, from Thomas Falcon. 7) Partial sent TLS record leak fix from Jakub Kicinski. 8) Reject zero size iova range in vhost, from Jason Wang. 9) Allow pending work to complete before clcsock release from Karsten Graul. 10) Fix XDP handling max MTU in thunderx, from Matteo Croce. 11) A lot of protocols look at the sa_family field of a sockaddr before validating it's length is large enough, from Tetsuo Handa. 12) Don't write to free'd pointer in qede ptp error path, from Colin Ian King. 13) Have to recompile IP options in ipv4_link_failure because it can be invoked from ARP, from Stephen Suryaputra. 14) Doorbell handling fixes in qed from Denis Bolotin. 15) Revert net-sysfs kobject register leak fix, it causes new problems. From Wang Hai. 16) Spectre v1 fix in ATM code, from Gustavo A. R. Silva. 17) Fix put of BROPT_VLAN_STATS_PER_PORT in bridging code, from Nikolay Aleksandrov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (111 commits) socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW tcp: tcp_grow_window() needs to respect tcp_space() ocelot: Clean up stats update deferred work ocelot: Don't sleep in atomic context (irqs_disabled()) net: bridge: fix netlink export of vlan_stats_per_port option qed: fix spelling mistake "faspath" -> "fastpath" tipc: set sysctl_tipc_rmem and named_timeout right range tipc: fix link established but not in session net: Fix missing meta data in skb with vlan packet net: atm: Fix potential Spectre v1 vulnerabilities net/core: work around section mismatch warning for ptp_classifier net: bridge: fix per-port af_packet sockets bnx2x: fix spelling mistake "dicline" -> "decline" route: Avoid crash from dereferencing NULL rt->from MAINTAINERS: normalize Woojung Huh's email address bonding: fix event handling for stacked bonds Revert "net-sysfs: Fix memory leak in netdev_register_kobject" rtnetlink: fix rtnl_valid_stats_req() nlmsg_len check qed: Fix the DORQ's attentions handling qed: Fix missing DORQ attentions ...
2019-04-17ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrierCorey Minyard1-2/+17
free_user() could be called in atomic context. This patch pushed the free operation off into a workqueue. Example: BUG: sleeping function called from invalid context at kernel/workqueue.c:2856 in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: ksoftirqd/27 CPU: 27 PID: 177 Comm: ksoftirqd/27 Not tainted 4.19.25-3 #1 Hardware name: AIC 1S-HV26-08/MB-DPSB04-06, BIOS IVYBV060 10/21/2015 Call Trace: dump_stack+0x5c/0x7b ___might_sleep+0xec/0x110 __flush_work+0x48/0x1f0 ? try_to_del_timer_sync+0x4d/0x80 _cleanup_srcu_struct+0x104/0x140 free_user+0x18/0x30 [ipmi_msghandler] ipmi_free_recv_msg+0x3a/0x50 [ipmi_msghandler] deliver_response+0xbd/0xd0 [ipmi_msghandler] deliver_local_response+0xe/0x30 [ipmi_msghandler] handle_one_recv_msg+0x163/0xc80 [ipmi_msghandler] ? dequeue_entity+0xa0/0x960 handle_new_recv_msgs+0x15c/0x1f0 [ipmi_msghandler] tasklet_action_common.isra.22+0x103/0x120 __do_softirq+0xf8/0x2d7 run_ksoftirqd+0x26/0x50 smpboot_thread_fn+0x11d/0x1e0 kthread+0x103/0x140 ? sort_range+0x20/0x20 ? kthread_destroy_worker+0x40/0x40 ret_from_fork+0x1f/0x40 Fixes: 77f8269606bf ("ipmi: fix use-after-free of user->release_barrier.rda") Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org # 5.0 Cc: Yang Yingliang <yangyingliang@huawei.com>
2019-04-16socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEWArnd Bergmann1-2/+2
It looks like the new socket options only work correctly for native execution, but in case of compat mode fall back to the old behavior as we ignore the 'old_timeval' flag. Rework so we treat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW the same way in compat and native 32-bit mode. Cc: Deepa Dinamani <deepa.kernel@gmail.com> Fixes: a9beb86ae6e5 ("sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16tcp: tcp_grow_window() needs to respect tcp_space()Eric Dumazet1-5/+5
For some reason, tcp_grow_window() correctly tests if enough room is present before attempting to increase tp->rcv_ssthresh, but does not prevent it to grow past tcp_space() This is causing hard to debug issues, like failing the (__tcp_select_window(sk) >= tp->rcv_wnd) test in __tcp_ack_snd_check(), causing ACK delays and possibly slow flows. Depending on tcp_rmem[2], MTU, skb->len/skb->truesize ratio, we can see the problem happening on "netperf -t TCP_RR -- -r 2000,2000" after about 60 round trips, when the active side no longer sends immediate acks. This bug predates git history. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16ocelot: Clean up stats update deferred workClaudiu Manoil1-8/+14
This is preventive cleanup that may save troubles later. No need to cancel repeateadly queued work if code is properly refactored. Don't let the ethtool -s process interfere with the stat workqueue scheduling. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16ocelot: Don't sleep in atomic context (irqs_disabled())Claudiu Manoil1-1/+1
Preemption disabled at: [<ffff000008cabd54>] dev_set_rx_mode+0x1c/0x38 Call trace: [<ffff00000808a5c0>] dump_backtrace+0x0/0x3d0 [<ffff00000808a9a4>] show_stack+0x14/0x20 [<ffff000008e6c0c0>] dump_stack+0xac/0xe4 [<ffff0000080fe76c>] ___might_sleep+0x164/0x238 [<ffff0000080fe890>] __might_sleep+0x50/0x88 [<ffff0000082261e4>] kmem_cache_alloc+0x17c/0x1d0 [<ffff000000ea0ae8>] ocelot_set_rx_mode+0x108/0x188 [mscc_ocelot_common] [<ffff000008cabcf0>] __dev_set_rx_mode+0x58/0xa0 [<ffff000008cabd5c>] dev_set_rx_mode+0x24/0x38 Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16net: bridge: fix netlink export of vlan_stats_per_port optionNikolay Aleksandrov1-1/+1
Since the introduction of the vlan_stats_per_port option the netlink export of it has been broken since I made a typo and used the ifla attribute instead of the bridge option to retrieve its state. Sysfs export is fine, only netlink export has been affected. Fixes: 9163a0fc1f0c0 ("net: bridge: add support for per-port vlan stats") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16qed: fix spelling mistake "faspath" -> "fastpath"Colin Ian King1-1/+1
There is a spelling mistake in a DP_INFO message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16tipc: set sysctl_tipc_rmem and named_timeout right rangeJie Liu1-2/+6
We find that sysctl_tipc_rmem and named_timeout do not have the right minimum setting. sysctl_tipc_rmem should be larger than zero, like sysctl_tcp_rmem. And named_timeout as a timeout setting should be not less than zero. Fixes: cc79dd1ba9c10 ("tipc: change socket buffer overflow control to respect sk_rcvbuf") Fixes: a5325ae5b8bff ("tipc: add name distributor resiliency queue") Signed-off-by: Jie Liu <liujie165@huawei.com> Reported-by: Qiang Ning <ningqiang1@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16tipc: fix link established but not in sessionTuong Lien1-0/+2
According to the link FSM, when a link endpoint got RESET_MSG (- a traditional one without the stopping bit) from its peer, it moves to PEER_RESET state and raises a LINK_DOWN event which then resets the link itself. Its state will become ESTABLISHING after the reset event and the link will be re-established soon after this endpoint starts to send ACTIVATE_MSG to the peer. There is no problem with this mechanism, however the link resetting has cleared the link 'in_session' flag (along with the other important link data such as: the link 'mtu') that was correctly set up at the 1st step (i.e. when this endpoint received the peer RESET_MSG). As a result, the link will become ESTABLISHED, but the 'in_session' flag is not set, and all STATE_MSG from its peer will be dropped at the link_validate_msg(). It means the link not synced and will sooner or later face a failure. Since the link reset action is obviously needed for a new link session (this is also true in the other situations), the problem here is that the link is re-established a bit too early when the link endpoints are not really in-sync yet. The commit forces a resync as already done in the previous commit 91986ee166cf ("tipc: fix link session and re-establish issues") by simply varying the link 'peer_session' value at the link_reset(). Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16net: Fix missing meta data in skb with vlan packetYuya Kusakabe1-1/+9
skb_reorder_vlan_header() should move XDP meta data with ethernet header if XDP meta data exists. Fixes: de8f3a83b0a0 ("bpf: add meta pointer for direct access") Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com> Signed-off-by: Takeru Hayasaka <taketarou2@gmail.com> Co-developed-by: Takeru Hayasaka <taketarou2@gmail.com> Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16net: atm: Fix potential Spectre v1 vulnerabilitiesGustavo A. R. Silva1-1/+5
arg is controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/atm/lec.c:715 lec_mcast_attach() warn: potential spectre issue 'dev_lec' [r] (local cap) Fix this by sanitizing arg before using it to index dev_lec. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/ Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16net/core: work around section mismatch warning for ptp_classifierArd Biesheuvel1-3/+4
The routine ptp_classifier_init() uses an initializer for an automatic struct type variable which refers to an __initdata symbol. This is perfectly legal, but may trigger a section mismatch warning when running the compiler in -fpic mode, due to the fact that the initializer may be emitted into an anonymous .data section thats lack the __init annotation. So work around it by using assignments instead. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16net: bridge: fix per-port af_packet socketsNikolay Aleksandrov1-9/+14
When the commit below was introduced it changed two visible things: - the skb was no longer passed through the protocol handlers with the original device - the skb was passed up the stack with skb->dev = bridge The first change broke af_packet sockets on bridge ports. For example we use them for hostapd which listens for ETH_P_PAE packets on the ports. We discussed two possible fixes: - create a clone and pass it through NF_HOOK(), act on the original skb based on the result - somehow signal to the caller from the okfn() that it was called, meaning the skb is ok to be passed, which this patch is trying to implement via returning 1 from the bridge link-local okfn() Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and drop/error would return < 0 thus the okfn() is called only when the return was 1, so we signal to the caller that it was called by preserving the return value from nf_hook(). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16ipmi: ipmi_si_hardcode.c: init si_type array to fix a crashTony Camuso1-0/+2
The intended behavior of function ipmi_hardcode_init_one() is to default to kcs interface when no type argument is presented when initializing ipmi with hard coded addresses. However, the array of char pointers allocated on the stack by function ipmi_hardcode_init() was not inited to zeroes, so it contained stack debris. Consequently, passing the cruft stored in this array to function ipmi_hardcode_init_one() caused a crash when it was unable to detect that the char * being passed was nonsense and tried to access the address specified by the bogus pointer. The fix is simply to initialize the si_type array to zeroes, so if there were no type argument given to at the command line, function ipmi_hardcode_init_one() could properly default to the kcs interface. Signed-off-by: Tony Camuso <tcamuso@redhat.com> Message-Id: <1554837603-40299-1-git-send-email-tcamuso@redhat.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2019-04-16ipmi: Fix failure on SMBIOS specified devicesCorey Minyard1-1/+0
An extra memset was put into a place that cleared the interface type. Reported-by: Tony Camuso <tcamuso@redhat.com> Fixes: 3cd83bac481dc4 ("ipmi: Consolidate the adding of platform devices") Signed-off-by: Corey Minyard <cminyard@mvista.com>
2019-04-16Merge tag 'riscv-for-linus-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linuxLinus Torvalds3-0/+110
Pull RISC-V fixes from Palmer Dabbelt: "This contains an assortment of RISC-V-related fixups that we found after rc4. They're all really unrelated: - The addition of a 32-bit defconfig, to emphasize testing the 32-bit port. - A device tree bindings patch, which is pre-work for some patches that target 5.2. - A fix to support booting on systems with more physical memory than the maximum supported by the kernel" * tag 'riscv-for-linus-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: RISC-V: Fix Maximum Physical Memory 2GiB option for 64bit systems dt-bindings: clock: sifive: add FU540-C000 PRCI clock constants RISC-V: Add separate defconfig for 32bit systems
2019-04-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds25-212/+496
Pull KVM fixes from Paolo Bonzini: "5.1 keeps its reputation as a big bugfix release for KVM x86. - Fix for a memory leak introduced during the merge window - Fixes for nested VMX with ept=0 - Fixes for AMD (APIC virtualization, NMI injection) - Fixes for Hyper-V under KVM and KVM under Hyper-V - Fixes for 32-bit SMM and tests for SMM virtualization - More array_index_nospec peppering" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing KVM: fix spectrev1 gadgets KVM: x86: fix warning Using plain integer as NULL pointer selftests: kvm: add a selftest for SMM selftests: kvm: fix for compilers that do not support -no-pie selftests: kvm/evmcs_test: complete I/O before migrating guest state KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU KVM: x86: clear SMM flags before loading state while leaving SMM KVM: x86: Open code kvm_set_hflags KVM: x86: Load SMRAM in a single shot when leaving SMM KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU KVM: x86: Raise #GP when guest vCPU do not support PMU x86/kvm: move kvm_load/put_guest_xcr0 into atomic context KVM: x86: svm: make sure NMI is injected after nmi_singlestep svm/avic: Fix invalidate logical APIC id entry Revert "svm: Fix AVIC incomplete IPI emulation" kvm: mmu: Fix overflow on kvm mmu page limit calculation KVM: nVMX: always use early vmcs check when EPT is disabled KVM: nVMX: allow tests to use bad virtual-APIC page address ...
2019-04-16CIFS: keep FileInfo handle live during oplock breakAurelien Aptel4-10/+53
In the oplock break handler, writing pending changes from pages puts the FileInfo handle. If the refcount reaches zero it closes the handle and waits for any oplock break handler to return, thus causing a deadlock. To prevent this situation: * We add a wait flag to cifsFileInfo_put() to decide whether we should wait for running/pending oplock break handlers * We keep an additionnal reference of the SMB FileInfo handle so that for the rest of the handler putting the handle won't close it. - The ref is bumped everytime we queue the handler via the cifs_queue_oplock_break() helper. - The ref is decremented at the end of the handler This bug was triggered by xfstest 464. Also important fix to address the various reports of oops in smb2_push_mandatory_locks Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-04-16cifs: fix handle leak in smb2_query_symlink()Ronnie Sahlberg1-0/+2
If we enter smb2_query_symlink() for something that is not a symlink and where the SMB2_open() would succeed we would never end up closing this handle and would thus leak a handle on the server. Fix this by immediately calling SMB2_close() on successfull open. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-04-16cifs: Fix lease buffer length errorZhangXiaoxu1-1/+4
There is a KASAN slab-out-of-bounds: BUG: KASAN: slab-out-of-bounds in _copy_from_iter_full+0x783/0xaa0 Read of size 80 at addr ffff88810c35e180 by task mount.cifs/539 CPU: 1 PID: 539 Comm: mount.cifs Not tainted 4.19 #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0xdd/0x12a print_address_description+0xa7/0x540 kasan_report+0x1ff/0x550 check_memory_region+0x2f1/0x310 memcpy+0x2f/0x80 _copy_from_iter_full+0x783/0xaa0 tcp_sendmsg_locked+0x1840/0x4140 tcp_sendmsg+0x37/0x60 inet_sendmsg+0x18c/0x490 sock_sendmsg+0xae/0x130 smb_send_kvec+0x29c/0x520 __smb_send_rqst+0x3ef/0xc60 smb_send_rqst+0x25a/0x2e0 compound_send_recv+0x9e8/0x2af0 cifs_send_recv+0x24/0x30 SMB2_open+0x35e/0x1620 open_shroot+0x27b/0x490 smb2_open_op_close+0x4e1/0x590 smb2_query_path_info+0x2ac/0x650 cifs_get_inode_info+0x1058/0x28f0 cifs_root_iget+0x3bb/0xf80 cifs_smb3_do_mount+0xe00/0x14c0 cifs_do_mount+0x15/0x20 mount_fs+0x5e/0x290 vfs_kern_mount+0x88/0x460 do_mount+0x398/0x31e0 ksys_mount+0xc6/0x150 __x64_sys_mount+0xea/0x190 do_syscall_64+0x122/0x590 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It can be reproduced by the following step: 1. samba configured with: server max protocol = SMB2_10 2. mount -o vers=default When parse the mount version parameter, the 'ops' and 'vals' was setted to smb30, if negotiate result is smb21, just update the 'ops' to smb21, but the 'vals' is still smb30. When add lease context, the iov_base is allocated with smb21 ops, but the iov_len is initiallited with the smb30. Because the iov_len is longer than iov_base, when send the message, copy array out of bounds. we need to keep the 'ops' and 'vals' consistent. Fixes: 9764c02fcbad ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)") Fixes: d5c7076b772a ("smb3: add smb3.1.1 to default dialect list") Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-04-16cifs: Fix use-after-free in SMB2_readZhangXiaoxu1-2/+2
There is a KASAN use-after-free: BUG: KASAN: use-after-free in SMB2_read+0x1136/0x1190 Read of size 8 at addr ffff8880b4e45e50 by task ln/1009 Should not release the 'req' because it will use in the trace. Fixes: eccb4422cf97 ("smb3: Add ftrace tracepoints for improved SMB3 debugging") Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> 4.18+ Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-04-16cifs: Fix use-after-free in SMB2_writeZhangXiaoxu1-1/+1
There is a KASAN use-after-free: BUG: KASAN: use-after-free in SMB2_write+0x1342/0x1580 Read of size 8 at addr ffff8880b6a8e450 by task ln/4196 Should not release the 'req' because it will use in the trace. Fixes: eccb4422cf97 ("smb3: Add ftrace tracepoints for improved SMB3 debugging") Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> 4.18+ Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-04-16Merge LKMM and RCU commitsPaul E. McKenney53-1276/+1085
2019-04-16KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracingVitaly Kuznetsov1-2/+2
In __apic_accept_irq() interface trig_mode is int and actually on some code paths it is set above u8: kvm_apic_set_irq() extracts it from 'struct kvm_lapic_irq' where trig_mode is u16. This is done on purpose as e.g. kvm_set_msi_irq() sets it to (1 << 15) & e->msi.data kvm_apic_local_deliver sets it to reg & (1 << 15). Fix the immediate issue by making 'tm' into u16. We may also want to adjust __apic_accept_irq() interface and use proper sizes for vector, level, trig_mode but this is not urgent. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: fix spectrev1 gadgetsPaolo Bonzini4-9/+16
These were found with smatch, and then generalized when applicable. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: fix warning Using plain integer as NULL pointerHariprasad Kelam1-1/+1
Changed passing argument as "0 to NULL" which resolves below sparse warning arch/x86/kvm/x86.c:3096:61: warning: Using plain integer as NULL pointer Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16selftests: kvm: add a selftest for SMMVitaly Kuznetsov4-6/+191
Add a simple test for SMM, based on VMX. The test implements its own sync between the guest and the host as using our ucall library seems to be too cumbersome: SMI handler is happening in real-address mode. This patch also fixes KVM_SET_NESTED_STATE to happen after KVM_SET_VCPU_EVENTS, in fact it places it last. This is because KVM needs to know whether the processor is in SMM or not. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16selftests: kvm: fix for compilers that do not support -no-piePaolo Bonzini1-1/+7
-no-pie was added to GCC at the same time as their configuration option --enable-default-pie. Compilers that were built before do not have -no-pie, but they also do not need it. Detect the option at build time. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16selftests: kvm/evmcs_test: complete I/O before migrating guest statePaolo Bonzini4-16/+17
Starting state migration after an IO exit without first completing IO may result in test failures. We already have two tests that need this (this patch in fact fixes evmcs_test, similar to what was fixed for state_test in commit 0f73bbc851ed, "KVM: selftests: complete IO before migrating guest state", 2019-03-13) and a third is coming. So, move the code to vcpu_save_state, and while at it do not access register state until after I/O is complete. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernelsSean Christopherson2-4/+16
Invoking the 64-bit variation on a 32-bit kenrel will crash the guest, trigger a WARN, and/or lead to a buffer overrun in the host, e.g. rsm_load_state_64() writes r8-r15 unconditionally, but enum kvm_reg and thus x86_emulate_ctxt._regs only define r8-r15 for CONFIG_X86_64. KVM allows userspace to report long mode support via CPUID, even though the guest is all but guaranteed to crash if it actually tries to enable long mode. But, a pure 32-bit guest that is ignorant of long mode will happily plod along. SMM complicates things as 64-bit CPUs use a different SMRAM save state area. KVM handles this correctly for 64-bit kernels, e.g. uses the legacy save state map if userspace has hid long mode from the guest, but doesn't fare well when userspace reports long mode support on a 32-bit host kernel (32-bit KVM doesn't support 64-bit guests). Since the alternative is to crash the guest, e.g. by not loading state or explicitly requesting shutdown, unconditionally use the legacy SMRAM save state map for 32-bit KVM. If a guest has managed to get far enough to handle SMIs when running under a weird/buggy userspace hypervisor, then don't deliberately crash the guest since there are no downsides (from KVM's perspective) to allow it to continue running. Fixes: 660a5d517aaab ("KVM: x86: save/load state on SMM switch") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPUSean Christopherson1-10/+11
Neither AMD nor Intel CPUs have an EFER field in the legacy SMRAM save state area, i.e. don't save/restore EFER across SMM transitions. KVM somewhat models this, e.g. doesn't clear EFER on entry to SMM if the guest doesn't support long mode. But during RSM, KVM unconditionally clears EFER so that it can get back to pure 32-bit mode in order to start loading CRs with their actual non-SMM values. Clear EFER only when it will be written when loading the non-SMM state so as to preserve bits that can theoretically be set on 32-bit vCPUs, e.g. KVM always emulates EFER_SCE. And because CR4.PAE is cleared only to play nice with EFER, wrap that code in the long mode check as well. Note, this may result in a compiler warning about cr4 being consumed uninitialized. Re-read CR4 even though it's technically unnecessary, as doing so allows for more readable code and RSM emulation is not a performance critical path. Fixes: 660a5d517aaab ("KVM: x86: save/load state on SMM switch") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: clear SMM flags before loading state while leaving SMMSean Christopherson3-16/+10
RSM emulation is currently broken on VMX when the interrupted guest has CR4.VMXE=1. Stop dancing around the issue of HF_SMM_MASK being set when loading SMSTATE into architectural state, e.g. by toggling it for problematic flows, and simply clear HF_SMM_MASK prior to loading architectural state (from SMRAM save state area). Reported-by: Jon Doron <arilou@gmail.com> Cc: Jim Mattson <jmattson@google.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Fixes: 5bea5123cbf0 ("KVM: VMX: check nested state and CR4.VMXE against SMM") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: Open code kvm_set_hflagsSean Christopherson3-18/+19
Prepare for clearing HF_SMM_MASK prior to loading state from the SMRAM save state map, i.e. kvm_smm_changed() needs to be called after state has been loaded and so cannot be done automatically when setting hflags from RSM. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: Load SMRAM in a single shot when leaving SMMSean Christopherson6-92/+92
RSM emulation is currently broken on VMX when the interrupted guest has CR4.VMXE=1. Rather than dance around the issue of HF_SMM_MASK being set when loading SMSTATE into architectural state, ideally RSM emulation itself would be reworked to clear HF_SMM_MASK prior to loading non-SMM architectural state. Ostensibly, the only motivation for having HF_SMM_MASK set throughout the loading of state from the SMRAM save state area is so that the memory accesses from GET_SMSTATE() are tagged with role.smm. Load all of the SMRAM save state area from guest memory at the beginning of RSM emulation, and load state from the buffer instead of reading guest memory one-by-one. This paves the way for clearing HF_SMM_MASK prior to loading state, and also aligns RSM with the enter_smm() behavior, which fills a buffer and writes SMRAM save state in a single go. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: nVMX: Expose RDPMC-exiting only when guest supports PMULiran Alon1-0/+25
Issue was discovered when running kvm-unit-tests on KVM running as L1 on top of Hyper-V. When vmx_instruction_intercept unit-test attempts to run RDPMC to test RDPMC-exiting, it is intercepted by L1 KVM which it's EXIT_REASON_RDPMC handler raise #GP because vCPU exposed by Hyper-V doesn't support PMU. Instead of unit-test expectation to be reflected with EXIT_REASON_RDPMC. The reason vmx_instruction_intercept unit-test attempts to run RDPMC even though Hyper-V doesn't support PMU is because L1 expose to L2 support for RDPMC-exiting. Which is reasonable to assume that is supported only in case CPU supports PMU to being with. Above issue can easily be simulated by modifying vmx_instruction_intercept config in x86/unittests.cfg to run QEMU with "-cpu host,+vmx,-pmu" and run unit-test. To handle issue, change KVM to expose RDPMC-exiting only when guest supports PMU. Reported-by: Saar Amar <saaramar@microsoft.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: Raise #GP when guest vCPU do not support PMULiran Alon1-0/+4
Before this change, reading a VMware pseduo PMC will succeed even when PMU is not supported by guest. This can easily be seen by running kvm-unit-test vmware_backdoors with "-cpu host,-pmu" option. Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16x86/kvm: move kvm_load/put_guest_xcr0 into atomic contextWANG Chao4-6/+12
guest xcr0 could leak into host when MCE happens in guest mode. Because do_machine_check() could schedule out at a few places. For example: kvm_load_guest_xcr0 ... kvm_x86_ops->run(vcpu) { vmx_vcpu_run vmx_complete_atomic_exit kvm_machine_check do_machine_check do_memory_failure memory_failure lock_page In this case, host_xcr0 is 0x2ff, guest vcpu xcr0 is 0xff. After schedule out, host cpu has guest xcr0 loaded (0xff). In __switch_to { switch_fpu_finish copy_kernel_to_fpregs XRSTORS If any bit i in XSTATE_BV[i] == 1 and xcr0[i] == 0, XRSTORS will generate #GP (In this case, bit 9). Then ex_handler_fprestore kicks in and tries to reinitialize fpu by restoring init fpu state. Same story as last #GP, except we get DOUBLE FAULT this time. Cc: stable@vger.kernel.org Signed-off-by: WANG Chao <chao.wang@ucloud.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: x86: svm: make sure NMI is injected after nmi_singlestepVitaly Kuznetsov1-0/+3
I noticed that apic test from kvm-unit-tests always hangs on my EPYC 7401P, the hanging test nmi-after-sti is trying to deliver 30000 NMIs and tracing shows that we're sometimes able to deliver a few but never all. When we're trying to inject an NMI we may fail to do so immediately for various reasons, however, we still need to inject it so enable_nmi_window() arms nmi_singlestep mode. #DB occurs as expected, but we're not checking for pending NMIs before entering the guest and unless there's a different event to process, the NMI will never get delivered. Make KVM_REQ_EVENT request on the vCPU from db_interception() to make sure pending NMIs are checked and possibly injected. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16svm/avic: Fix invalidate logical APIC id entrySuthikulpanit, Suravee1-1/+2
Only clear the valid bit when invalidate logical APIC id entry. The current logic clear the valid bit, but also set the rest of the bits (including reserved bits) to 1. Fixes: 98d90582be2e ('svm: Fix AVIC DFR and LDR handling') Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16Revert "svm: Fix AVIC incomplete IPI emulation"Suthikulpanit, Suravee1-4/+15
This reverts commit bb218fbcfaaa3b115d4cd7a43c0ca164f3a96e57. As Oren Twaig pointed out the old discussion: https://patchwork.kernel.org/patch/8292231/ that the change coud potentially cause an extra IPI to be sent to the destination vcpu because the AVIC hardware already set the IRR bit before the incomplete IPI #VMEXIT with id=1 (target vcpu is not running). Since writting to ICR and ICR2 will also set the IRR. If something triggers the destination vcpu to get scheduled before the emulation finishes, then this could result in an additional IPI. Also, the issue mentioned in the commit bb218fbcfaaa was misdiagnosed. Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Oren Twaig <oren@scalemp.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16kvm: mmu: Fix overflow on kvm mmu page limit calculationBen Gardon4-16/+15
KVM bases its memory usage limits on the total number of guest pages across all memslots. However, those limits, and the calculations to produce them, use 32 bit unsigned integers. This can result in overflow if a VM has more guest pages that can be represented by a u32. As a result of this overflow, KVM can use a low limit on the number of MMU pages it will allocate. This makes KVM unable to map all of guest memory at once, prompting spurious faults. Tested: Ran all kvm-unit-tests on an Intel Haswell machine. This patch introduced no new failures. Signed-off-by: Ben Gardon <bgardon@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: nVMX: always use early vmcs check when EPT is disabledPaolo Bonzini2-2/+21
The remaining failures of vmx.flat when EPT is disabled are caused by incorrectly reflecting VMfails to the L1 hypervisor. What happens is that nested_vmx_restore_host_state corrupts the guest CR3, reloading it with the host's shadow CR3 instead, because it blindly loads GUEST_CR3 from the vmcs01. For simplicity let's just always use hardware VMCS checks when EPT is disabled. This way, nested_vmx_restore_host_state is not reached at all (or at least shouldn't be reached). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16KVM: nVMX: allow tests to use bad virtual-APIC page addressPaolo Bonzini3-10/+19
As mentioned in the comment, there are some special cases where we can simply clear the TPR shadow bit from the CPU-based execution controls in the vmcs02. Handle them so that we can remove some XFAILs from vmx.flat. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-15bnx2x: fix spelling mistake "dicline" -> "decline"Colin Ian King1-1/+1
There is a spelling mistake in a BNX2X_ERR message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15Merge tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds7-71/+117
Pull libnvdimm fixes from Dan Williams: "I debated holding this back for the v5.2 merge window due to the size of the "zero-key" changes, but affected users would benefit from having the fixes sooner. It did not make sense to change the zero-key semantic in isolation for the "secure-erase" command, but instead include it for all security commands. The short background on the need for these changes is that some NVDIMM platforms enable security with a default zero-key rather than let the OS specify the initial key. This makes the security enabling that landed in v5.0 unusable for some users. Summary: - Compatibility fix for nvdimm-security implementations with a default zero-key. - Miscellaneous small fixes for out-of-bound accesses, cleanup after initialization failures, and missing debug messages" * tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: tools/testing/nvdimm: Retain security state after overwrite libnvdimm/pmem: fix a possible OOB access when read and write pmem libnvdimm/security, acpi/nfit: unify zero-key for all security commands libnvdimm/security: provide fix for secure-erase to use zero-key libnvdimm/btt: Fix a kmemdup failure check libnvdimm/namespace: Fix a potential NULL pointer dereference acpi/nfit: Always dump _DSM output payload
2019-04-15Merge tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds1-0/+15
Pull fsdax fix from Dan Williams: "A single filesystem-dax fix. It has been lingering in -next for a long while and there are no other fsdax fixes on the horizon: - Avoid a crash scenario with architectures like powerpc that require 'pgtable_deposit' for the zero page" * tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: fs/dax: Deposit pagetable even when installing zero page