aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-01-27KVM: Use vcpu-specific gva->hva translation when querying host page sizeSean Christopherson4-7/+7
Use kvm_vcpu_gfn_to_hva() when retrieving the host page size so that the correct set of memslots is used when handling x86 page faults in SMM. Fixes: 54bf36aac520 ("KVM: x86: use vcpu-specific functions to read/write/translate GFNs") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27mm: thp: KVM: Explicitly check for THP when populating secondary MMUSean Christopherson6-9/+31
Add a helper, is_transparent_hugepage(), to explicitly check whether a compound page is a THP and use it when populating KVM's secondary MMU. The explicit check fixes a bug where a remapped compound page, e.g. for an XDP Rx socket, is mapped into a KVM guest and is mistaken for a THP, which results in KVM incorrectly creating a huge page in its secondary MMU. Fixes: 936a5fe6e6148 ("thp: kvm mmu transparent hugepage support") Reported-by: syzbot+c9d1fb51ac9d0d10c39d@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86/mmu: Enforce max_level on HugeTLB mappingsSean Christopherson1-2/+3
Limit KVM's mapping level for HugeTLB based on its calculated max_level. The max_level check prior to invoking host_mapping_level() only filters out the case where KVM cannot create a 2mb mapping, it doesn't handle the scenario where KVM can create a 2mb but not 1gb mapping, and the host is using a 1gb HugeTLB mapping. Fixes: 2f57b7051fe8 ("KVM: x86/mmu: Persist gfn_lpage_is_disallowed() to max_level") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Return immediately if __kvm_gfn_to_hva_cache_init() failsSean Christopherson1-4/+8
Check the result of __kvm_gfn_to_hva_cache_init() and return immediately instead of relying on the kvm_is_error_hva() check to detect errors so that it's abundantly clear KVM intends to immediately bail on an error. Note, the hva check is still mandatory to handle errors on subqeuesnt calls with the same generation. Similarly, always return -EFAULT on error so that multiple (bad) calls for a given generation will get the same result, e.g. on an illegal gfn wrap, propagating the return from __kvm_gfn_to_hva_cache_init() would cause the initial call to return -EINVAL and subsequent calls to return -EFAULT. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Clean up __kvm_gfn_to_hva_cache_init() and its callersSean Christopherson1-9/+12
Barret reported a (technically benign) bug where nr_pages_avail can be accessed without being initialized if gfn_to_hva_many() fails. virt/kvm/kvm_main.c:2193:13: warning: 'nr_pages_avail' may be used uninitialized in this function [-Wmaybe-uninitialized] Rather than simply squashing the warning by initializing nr_pages_avail, fix the underlying issues by reworking __kvm_gfn_to_hva_cache_init() to return immediately instead of continuing on. Now that all callers check the result and/or bail immediately on a bad hva, there's no need to explicitly nullify the memslot on error. Reported-by: Barret Rhoden <brho@google.com> Fixes: f1b9dd5eb86c ("kvm: Disallow wraparound in kvm_gfn_to_hva_cache_init") Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Check for a bad hva before dropping into the ghc slow pathSean Christopherson1-6/+6
When reading/writing using the guest/host cache, check for a bad hva before checking for a NULL memslot, which triggers the slow path for handing cross-page accesses. Because the memslot is nullified on error by __kvm_gfn_to_hva_cache_init(), if the bad hva is encountered after crossing into a new page, then the kvm_{read,write}_guest() slow path could potentially write/access the first chunk prior to detecting the bad hva. Arguably, performing a partial access is semantically correct from an architectural perspective, but that behavior is certainly not intended. In the original implementation, memslot was not explicitly nullified and therefore the partial access behavior varied based on whether the memslot itself was null, or if the hva was simply bad. The current behavior was introduced as a seemingly unintentional side effect in commit f1b9dd5eb86c ("kvm: Disallow wraparound in kvm_gfn_to_hva_cache_init"), which justified the change with "since some callers don't check the return code from this function, it sit seems prudent to clear ghc->memslot in the event of an error". Regardless of intent, the partial access is dependent on _not_ checking the result of the cache initialization, which is arguably a bug in its own right, at best simply weird. Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.") Cc: Jim Mattson <jmattson@google.com> Cc: Andrew Honig <ahonig@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27kvm/x86: export kvm_vector_hashing_enabled() is unnecessaryPeng Hao1-1/+0
kvm_vector_hashing_enabled() is just called in kvm.ko module. Signed-off-by: Peng Hao <richard.peng@oppo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: VMX: remove duplicated segment cache clearMiaohe Lin1-2/+0
vmx_set_segment() clears segment cache unconditionally, so we should not clear it again by calling vmx_segment_cache_clear(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27Adding 'else' to reduce checking.Haiwei Li1-2/+2
These two conditions are in conflict, adding 'else' to reduce checking. Signed-off-by: Haiwei Li <lihaiwei@tencent.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: nVMX: Check GUEST_DR7 on vmentry of nested guestsKrish Sadhukhan3-1/+11
According to section "Checks on Guest Control Registers, Debug Registers, and and MSRs" in Intel SDM vol 3C, the following checks are performed on vmentry of nested guests: If the "load debug controls" VM-entry control is 1, bits 63:32 in the DR7 field must be 0. In KVM, GUEST_DR7 is set prior to the vmcs02 VM-entry by kvm_set_dr() and the latter synthesizes a #GP if any bit in the high dword in the former is set. Hence this field needs to be checked in software. Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: remove unused guest_enterAlex Shi1-9/+0
After commit 61bd0f66ff92 ("KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN"), no one use this function anymore, So better to remove it. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Move running VCPU from ARM to common codePaolo Bonzini8-50/+34
For ring-based dirty log tracking, it will be more efficient to account writes during schedule-out or schedule-in to the currently running VCPU. We would like to do it even if the write doesn't use the current VCPU's address space, as is the case for cached writes (see commit 4e335d9e7ddb, "Revert "KVM: Support vCPU-based gfn->hva cache"", 2017-05-02). Therefore, add a mechanism to track the currently-loaded kvm_vcpu struct. There is already something similar in KVM/ARM; one important difference is that kvm_arch_vcpu_{load,put} have two callers in virt/kvm/kvm_main.c: we have to update both the architecture-independent vcpu_{load,put} and the preempt notifiers. Another change made in the process is to allow using kvm_get_running_vcpu() in preemptible code. This is allowed because preempt notifiers ensure that the value does not change even after the VCPU thread is migrated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: X86: Drop x86_set_memory_region()Peter Xu3-18/+12
The helper x86_set_memory_region() is only used in vmx_set_tss_addr() and kvm_arch_destroy_vm(). Push the lock upper in both cases. With that, drop x86_set_memory_region(). This prepares to allow __x86_set_memory_region() to return a HVA mapped, because the HVA will need to be protected by the lock too even after __x86_set_memory_region() returns. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: X86: Don't take srcu lock in init_rmode_identity_map()Peter Xu1-7/+3
We've already got the slots_lock, so we should be safe. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Add build-time error check on kvm_run sizePeter Xu1-0/+1
It's already going to reach 2400 Bytes (which is over half of page size on 4K page archs), so maybe it's good to have this build-time check in case it overflows when adding new fields. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Remove kvm_read_guest_atomic()Peter Xu2-13/+0
Remove kvm_read_guest_atomic() because it's not used anywhere. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27x86/kvm/hyper-v: remove stale evmcs_already_enabled check from nested_enable_evmcs()Vitaly Kuznetsov1-5/+0
In nested_enable_evmcs() evmcs_already_enabled check doesn't really do anything: controls are already sanitized and we return '0' regardless. Just drop the check. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Perform non-canonical checks in 32-bit KVMSean Christopherson2-10/+0
Remove the CONFIG_X86_64 condition from the low level non-canonical helpers to effectively enable non-canonical checks on 32-bit KVM. Non-canonical checks are performed by hardware if the CPU *supports* 64-bit mode, whether or not the CPU is actually in 64-bit mode is irrelevant. For the most part, skipping non-canonical checks on 32-bit KVM is ok-ish because 32-bit KVM always (hopefully) drops bits 63:32 of whatever value it's checking before propagating it to hardware, and architecturally, the expected behavior for the guest is a bit of a grey area since the vCPU itself doesn't support 64-bit mode. I.e. a 32-bit KVM guest can observe the missed checks in several paths, e.g. INVVPID and VM-Enter, but it's debatable whether or not the missed checks constitute a bug because technically the vCPU doesn't support 64-bit mode. The primary motivation for enabling the non-canonical checks is defense in depth. As mentioned above, a guest can trigger a missed check via INVVPID or VM-Enter. INVVPID is straightforward as it takes a 64-bit virtual address as part of its 128-bit INVVPID descriptor and fails if the address is non-canonical, even if INVVPID is executed in 32-bit PM. Nested VM-Enter is a bit more convoluted as it requires the guest to write natural width VMCS fields via memory accesses and then VMPTRLD the VMCS, but it's still possible. In both cases, KVM is saved from a true bug only because its flows that propagate values to hardware (correctly) take "unsigned long" parameters and so drop bits 63:32 of the bad value. Explicitly performing the non-canonical checks makes it less likely that a bad value will be propagated to hardware, e.g. in the INVVPID case, if __invvpid() didn't implicitly drop bits 63:32 then KVM would BUG() on the resulting unexpected INVVPID failure due to hardware rejecting the non-canonical address. The only downside to enabling the non-canonical checks is that it adds a relatively small amount of overhead, but the affected flows are not hot paths, i.e. the overhead is negligible. Note, KVM technically could gate the non-canonical checks on 32-bit KVM with static_cpu_has(X86_FEATURE_LM), but on bare metal that's an even bigger waste of code for everyone except the 0.00000000000001% of the population running on Yonah, and nested 32-bit on 64-bit already fudges things with respect to 64-bit CPU behavior. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> [Also do so in nested_vmx_check_host_state as reported by Krish. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: nVMX: WARN on failure to set IA32_PERF_GLOBAL_CTRLOliver Upton1-14/+4
Writes to MSR_CORE_PERF_GLOBAL_CONTROL should never fail if the VM-exit and VM-entry controls are exposed to L1. Promote the checks to perform a full WARN if kvm_set_msr() fails and remove the now unused macro SET_MSR_OR_WARN(). Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Remove unused ctxt param from emulator's FPU accessorsSean Christopherson1-15/+13
Remove an unused struct x86_emulate_ctxt * param from low level helpers used to access guest FPU state. The unused param was left behind by commit 6ab0b9feb82a ("x86,kvm: remove KVM emulator get_fpu / put_fpu"). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Revert "KVM: X86: Fix fpu state crash in kvm guest"Sean Christopherson1-6/+3
Reload the current thread's FPU state, which contains the guest's FPU state, to the CPU registers if necessary during vcpu_enter_guest(). TIF_NEED_FPU_LOAD can be set any time control is transferred out of KVM, e.g. if I/O is triggered during a KVM call to get_user_pages() or if a softirq occurs while KVM is scheduled in. Moving the handling of TIF_NEED_FPU_LOAD from vcpu_enter_guest() to kvm_arch_vcpu_load(), effectively kvm_sched_in(), papered over a bug where kvm_put_guest_fpu() failed to account for TIF_NEED_FPU_LOAD. The easiest way to the kvm_put_guest_fpu() bug was to run with involuntary preemption enable, thus handling TIF_NEED_FPU_LOAD during kvm_sched_in() made the bug go away. But, removing the handling in vcpu_enter_guest() exposed KVM to the rare case of a softirq triggering kernel_fpu_begin() between vcpu_load() and vcpu_enter_guest(). Now that kvm_{load,put}_guest_fpu() correctly handle TIF_NEED_FPU_LOAD, revert the commit to both restore the vcpu_enter_guest() behavior and eliminate the superfluous switch_fpu_return() in kvm_arch_vcpu_load(). Note, leaving the handling in kvm_arch_vcpu_load() isn't wrong per se, but it is unnecessary, and most critically, makes it extremely difficult to find bugs such as the kvm_put_guest_fpu() issue due to shrinking the window where a softirq can corrupt state. A sample trace triggered by warning if TIF_NEED_FPU_LOAD is set while vcpu state is loaded: <IRQ> gcmaes_crypt_by_sg.constprop.12+0x26e/0x660 ? 0xffffffffc024547d ? __qdisc_run+0x83/0x510 ? __dev_queue_xmit+0x45e/0x990 ? ip_finish_output2+0x1a8/0x570 ? fib4_rule_action+0x61/0x70 ? fib4_rule_action+0x70/0x70 ? fib_rules_lookup+0x13f/0x1c0 ? helper_rfc4106_decrypt+0x82/0xa0 ? crypto_aead_decrypt+0x40/0x70 ? crypto_aead_decrypt+0x40/0x70 ? crypto_aead_decrypt+0x40/0x70 ? esp_output_tail+0x8f4/0xa5a [esp4] ? skb_ext_add+0xd3/0x170 ? xfrm_input+0x7a6/0x12c0 ? xfrm4_rcv_encap+0xae/0xd0 ? xfrm4_transport_finish+0x200/0x200 ? udp_queue_rcv_one_skb+0x1ba/0x460 ? udp_unicast_rcv_skb.isra.63+0x72/0x90 ? __udp4_lib_rcv+0x51b/0xb00 ? ip_protocol_deliver_rcu+0xd2/0x1c0 ? ip_local_deliver_finish+0x44/0x50 ? ip_local_deliver+0xe0/0xf0 ? ip_protocol_deliver_rcu+0x1c0/0x1c0 ? ip_rcv+0xbc/0xd0 ? ip_rcv_finish_core.isra.19+0x380/0x380 ? __netif_receive_skb_one_core+0x7e/0x90 ? netif_receive_skb_internal+0x3d/0xb0 ? napi_gro_receive+0xed/0x150 ? 0xffffffffc0243c77 ? net_rx_action+0x149/0x3b0 ? __do_softirq+0xe4/0x2f8 ? handle_irq_event_percpu+0x6a/0x80 ? irq_exit+0xe6/0xf0 ? do_IRQ+0x7f/0xd0 ? common_interrupt+0xf/0xf </IRQ> ? irq_entries_start+0x20/0x660 ? vmx_get_interrupt_shadow+0x2f0/0x710 [kvm_intel] ? kvm_set_msr_common+0xfc7/0x2380 [kvm] ? recalibrate_cpu_khz+0x10/0x10 ? ktime_get+0x3a/0xa0 ? kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm] ? kvm_init+0x6bf/0xd00 [kvm] ? __seccomp_filter+0x7a/0x680 ? do_vfs_ioctl+0xa4/0x630 ? security_file_ioctl+0x32/0x50 ? ksys_ioctl+0x60/0x90 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x5f/0x1a0 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 ---[ end trace 9564a1ccad733a90 ]--- This reverts commit e751732486eb3f159089a64d1901992b1357e7cc. Fixes: e751732486eb3 ("KVM: X86: Fix fpu state crash in kvm guest") Reported-by: Derek Yerger <derek@djy.llc> Reported-by: kernel@najdan.com Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Thomas Lambertz <mail@thomaslambertz.de> Cc: Rik van Riel <riel@surriel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Ensure guest's FPU state is loaded when accessing for emulationSean Christopherson1-0/+39
Lock the FPU regs and reload the current thread's FPU state, which holds the guest's FPU state, to the CPU registers if necessary prior to accessing guest FPU state as part of emulation. kernel_fpu_begin() can be called from softirq context, therefore KVM must ensure softirqs are disabled (locking the FPU regs disables softirqs) when touching CPU FPU state. Note, for all intents and purposes this reverts commit 6ab0b9feb82a7 ("x86,kvm: remove KVM emulator get_fpu / put_fpu"), but at the time it was applied, removing get/put_fpu() was correct. The re-introduction of {get,put}_fpu() is necessitated by the deferring of FPU state load. Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Handle TIF_NEED_FPU_LOAD in kvm_{load,put}_guest_fpu()Sean Christopherson1-2/+17
Handle TIF_NEED_FPU_LOAD similar to how fpu__copy() handles the flag when duplicating FPU state to a new task struct. TIF_NEED_FPU_LOAD can be set any time control is transferred out of KVM, be it voluntarily, e.g. if I/O is triggered during a KVM call to get_user_pages, or involuntarily, e.g. if softirq runs after an IRQ occurs. Therefore, KVM must account for TIF_NEED_FPU_LOAD whenever it is (potentially) accessing CPU FPU state. Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27Revert "KVM: x86: Add a WARN on TIF_NEED_FPU_LOAD in kvm_load_guest_fpu()"Paolo Bonzini1-7/+0
This reverts commit 95145c25a78cc0a9d3cbc75708abde432310c5a1. The next few patches will fix the issue so the warning is not needed anymore; revert it separately to simplify application to stable kernels. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: apic: short-circuit kvm_apic_accept_pic_intr() when pic intr is acceptedMiaohe Lin1-4/+3
Short-circuit kvm_apic_accept_pic_intr() when pic intr is accepted, there is no need to proceed further. Also remove unnecessary var r. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: avoid clearing pending exception event twiceMiaohe Lin1-1/+0
The exception pending event is cleared by kvm_clear_exception_queue(). We shouldn't clear it again. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect pmu_intel.c from Spectre-v1/L1TF attacksMarios Pomonis1-8/+16
This fixes Spectre-v1/L1TF vulnerabilities in intel_find_fixed_event() and intel_rdpmc_ecx_to_pmc(). kvm_rdpmc() (ancestor of intel_find_fixed_event()) and reprogram_fixed_counter() (ancestor of intel_rdpmc_ecx_to_pmc()) are exported symbols so KVM should treat them conservatively from a security perspective. Fixes: 25462f7f5295 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect DR-based index computations from Spectre-v1/L1TF attacksMarios Pomonis1-2/+6
This fixes a Spectre-v1/L1TF vulnerability in __kvm_set_dr() and kvm_get_dr(). Both kvm_get_dr() and kvm_set_dr() (a wrapper of __kvm_set_dr()) are exported symbols so KVM should tream them conservatively from a security perspective. Fixes: 020df0794f57 ("KVM: move DR register access handling into generic code") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect exit_reason from being used in Spectre-v1/L1TF attacksMarios Pomonis1-25/+32
This fixes a Spectre-v1/L1TF vulnerability in vmx_handle_exit(). While exit_reason is set by the hardware and therefore should not be attacker-influenced, an unknown exit_reason could potentially be used to perform such an attack. Fixes: 55d2375e58a6 ("KVM: nVMX: Move nested code to dedicated files") Signed-off-by: Marios Pomonis <pomonis@google.com> Signed-off-by: Nick Finco <nifi@google.com> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Refactor prefix decoding to prevent Spectre-v1/L1TF attacksMarios Pomonis1-2/+14
This fixes Spectre-v1/L1TF vulnerabilities in vmx_read_guest_seg_selector(), vmx_read_guest_seg_base(), vmx_read_guest_seg_limit() and vmx_read_guest_seg_ar(). When invoked from emulation, these functions contain index computations based on the (attacker-influenced) segment value. Using constants prevents the attack. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect MSR-based index computations from Spectre-v1/L1TF attacks in x86.cMarios Pomonis1-2/+8
This fixes a Spectre-v1/L1TF vulnerability in set_msr_mce() and get_msr_mce(). Both functions contain index computations based on the (attacker-controlled) MSR number. Fixes: 890ca9aefa78 ("KVM: Add MCE support") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect MSR-based index computations in pmu.h from Spectre-v1/L1TF attacksMarios Pomonis1-4/+14
This fixes a Spectre-v1/L1TF vulnerability in the get_gp_pmc() and get_fixed_pmc() functions. They both contain index computations based on the (attacker-controlled) MSR number. Fixes: 25462f7f5295 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect MSR-based index computations in fixed_msr_to_seg_unit() from Spectre-v1/L1TF attacksMarios Pomonis1-2/+6
This fixes a Spectre-v1/L1TF vulnerability in fixed_msr_to_seg_unit(). This function contains index computations based on the (attacker-controlled) MSR number. Fixes: de9aef5e1ad6 ("KVM: MTRR: introduce fixed_mtrr_segment table") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect kvm_lapic_reg_write() from Spectre-v1/L1TF attacksMarios Pomonis1-4/+9
This fixes a Spectre-v1/L1TF vulnerability in kvm_lapic_reg_write(). This function contains index computations based on the (attacker-controlled) MSR number. Fixes: 0105d1a52640 ("KVM: x2apic interface to lapic") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect ioapic_write_indirect() from Spectre-v1/L1TF attacksMarios Pomonis1-0/+1
This fixes a Spectre-v1/L1TF vulnerability in ioapic_write_indirect(). This function contains index computations based on the (attacker-controlled) IOREGSEL register. This patch depends on patch "KVM: x86: Protect ioapic_read_indirect() from Spectre-v1/L1TF attacks". Fixes: 70f93dae32ac ("KVM: Use temporary variable to shorten lines.") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect ioapic_read_indirect() from Spectre-v1/L1TF attacksMarios Pomonis1-6/+8
This fixes a Spectre-v1/L1TF vulnerability in ioapic_read_indirect(). This function contains index computations based on the (attacker-controlled) IOREGSEL register. Fixes: a2c118bfab8b ("KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Refactor picdev_write() to prevent Spectre-v1/L1TF attacksMarios Pomonis1-1/+5
This fixes a Spectre-v1/L1TF vulnerability in picdev_write(). It replaces index computations based on the (attacked-controlled) port number with constants through a minor refactoring. Fixes: 85f455f7ddbe ("KVM: Add support for in-kernel PIC emulation") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect kvm_hv_msr_[get|set]_crash_data() from Spectre-v1/L1TF attacksMarios Pomonis1-4/+6
This fixes Spectre-v1/L1TF vulnerabilities in kvm_hv_msr_get_crash_data() and kvm_hv_msr_set_crash_data(). These functions contain index computations that use the (attacker-controlled) MSR number. Fixes: e7d9513b60e8 ("kvm/x86: added hyper-v crash msrs into kvm hyperv context") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Protect x86_decode_insn from Spectre-v1/L1TF attacksMarios Pomonis1-3/+8
This fixes a Spectre-v1/L1TF vulnerability in x86_decode_insn(). kvm_emulate_instruction() (an ancestor of x86_decode_insn()) is an exported symbol, so KVM should treat it conservatively from a security perspective. Fixes: 045a282ca415 ("KVM: emulator: implement fninit, fnstsw, fnstcw") Signed-off-by: Nick Finco <nifi@google.com> Signed-off-by: Marios Pomonis <pomonis@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27kvm/svm: PKU not currently supportedJohn Allen5-1/+16
Current SVM implementation does not have support for handling PKU. Guests running on a host with future AMD cpus that support the feature will read garbage from the PKRU register and will hit segmentation faults on boot as memory is getting marked as protected that should not be. Ensure that cpuid from SVM does not advertise the feature. Signed-off-by: John Allen <john.allen@amd.com> Cc: stable@vger.kernel.org Fixes: 0556cbdc2fbc ("x86/pkeys: Don't check if PKRU is zero before writing it") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Move vcpu->run page allocation out of kvm_vcpu_init()Sean Christopherson1-21/+13
Open code the allocation and freeing of the vcpu->run page in kvm_vm_ioctl_create_vcpu() and kvm_vcpu_destroy() respectively. Doing so allows kvm_vcpu_init() to be a pure init function and eliminates kvm_vcpu_uninit() entirely. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Move putting of vcpu->pid to kvm_vcpu_destroy()Sean Christopherson1-6/+7
Move the putting of vcpu->pid to kvm_vcpu_destroy(). vcpu->pid is guaranteed to be NULL when kvm_vcpu_uninit() is called in the error path of kvm_vm_ioctl_create_vcpu(), e.g. it is explicitly nullified by kvm_vcpu_init() and is only changed by KVM_RUN. No functional change intended. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Drop kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit()Sean Christopherson11-65/+2
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: arm64: Free sve_state via arm specific hookSean Christopherson4-0/+9
Add an arm specific hook to free the arm64-only sve_state. Doing so eliminates the last functional code from kvm_arch_vcpu_uninit() across all architectures and paves the way for removing kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() entirely. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: PPC: Move all vcpu init code into kvm_arch_vcpu_create()Sean Christopherson1-24/+32
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). Rinse and repeat for kvm_arch_vcpu_uninit() and kvm_arch_vcpu_destroy(). This paves the way for removing kvm_arch_vcpu_{un}init() entirely. Note, calling kvmppc_mmu_destroy() if kvmppc_core_vcpu_create() fails may or may not be necessary. Move it along with the more obvious call to kvmppc_subarch_vcpu_uninit() so as not to inadvertantly introduce a functional change and/or bug. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: ARM: Move all vcpu init code into kvm_arch_vcpu_create()Sean Christopherson1-14/+20
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). This paves the way for removing kvm_arch_vcpu_{un}init() entirely. Note, there is no associated unwinding in kvm_arch_vcpu_uninit() that needs to be relocated (to kvm_arch_vcpu_destroy()). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: MIPS: Move all vcpu init code into kvm_arch_vcpu_create()Sean Christopherson1-12/+14
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). Rinse and repeat for kvm_arch_vcpu_uninit() and kvm_arch_vcpu_destroy(). This paves the way for removing kvm_arch_vcpu_{un}init() entirely. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: x86: Move all vcpu init code into kvm_arch_vcpu_create()Sean Christopherson1-98/+100
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). This paves the way for removing kvm_arch_vcpu_init() entirely. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Drop kvm_arch_vcpu_setup()Sean Christopherson9-41/+0
Remove kvm_arch_vcpu_setup() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: PPC: BookE: Setup vcpu during kvmppc_core_vcpu_create()Sean Christopherson1-27/+33
Fold setup() into create() now that the two are called back-to-back by common KVM code. This paves the way for removing kvm_arch_vcpu_setup(). Note, BookE directly implements kvm_arch_vcpu_setup() and PPC's common kvm_arch_vcpu_create() is responsible for its own cleanup, thus the only cleanup required when directly invoking kvmppc_core_vcpu_setup() is to call .vcpu_free(), which is the BookE specific portion of PPC's kvm_arch_vcpu_destroy() by way of kvmppc_core_vcpu_free(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>