aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-02-20Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-2/+3
Pull scheduler updates from Ingo Molnar: "The main changes in this (fairly busy) cycle were: - There was a class of scheduler bugs related to forgetting to update the rq-clock timestamp which can cause weird and hard to debug problems, so there's a new debug facility for this: which uncovered a whole lot of bugs which convinced us that we want to keep the debug facility. (Peter Zijlstra, Matt Fleming) - Various cputime related updates: eliminate cputime and use u64 nanoseconds directly, simplify and improve the arch interfaces, implement delayed accounting more widely, etc. - (Frederic Weisbecker) - Move code around for better structure plus cleanups (Ingo Molnar) - Move IO schedule accounting deeper into the scheduler plus related changes to improve the situation (Tejun Heo) - ... plus a round of sched/rt and sched/deadline fixes, plus other fixes, updats and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (85 commits) sched/core: Remove unlikely() annotation from sched_move_task() sched/autogroup: Rename auto_group.[ch] to autogroup.[ch] sched/topology: Split out scheduler topology code from core.c into topology.c sched/core: Remove unnecessary #include headers sched/rq_clock: Consolidate the ordering of the rq_clock methods delayacct: Include <uapi/linux/taskstats.h> sched/core: Clean up comments sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds sched/clock: Add dummy clear_sched_clock_stable() stub function sched/cputime: Remove generic asm headers sched/cputime: Remove unused nsec_to_cputime() s390, sched/cputime: Remove unused cputime definitions powerpc, sched/cputime: Remove unused cputime definitions s390, sched/cputime: Make arch_cpu_idle_time() to return nsecs ia64, sched/cputime: Remove unused cputime definitions ia64: Convert vtime to use nsec units directly ia64, sched/cputime: Move the nsecs based cputime headers to the last arch using it sched/cputime: Remove jiffies based cputime sched/cputime, vtime: Return nsecs instead of cputime_t to account sched/cputime: Complete nsec conversion of tick based accounting ...
2017-02-03KVM: x86: do not save guest-unsupported XSAVE stateRadim Krčmář1-0/+1
Saving unsupported state prevents migration when the new host does not support a XSAVE feature of the original host, even if the feature is not exposed to the guest. We've masked host features with guest-visible features before, with 4344ee981e21 ("KVM: x86: only copy XSAVE state for the supported features") and dropped it when implementing XSAVES. Do it again. Fixes: df1daba7d1cb ("KVM: x86: support XSAVES usage in the host") Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-02-01sched/cputime: Convert task/group cputime to nsecsFrederic Weisbecker1-2/+3
Now that most cputime readers use the transition API which return the task cputime in old style cputime_t, we can safely store the cputime in nsecs. This will eventually make cputime statistics less opaque and more granular. Back and forth convertions between cputime_t and nsecs in order to deal with cputime_t random granularity won't be needed anymore. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1485832191-26889-8-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-17KVM: x86: fix fixing of hypercallsDmitry Vyukov1-1/+2
emulator_fix_hypercall() replaces hypercall with vmcall instruction, but it does not handle GP exception properly when writes the new instruction. It can return X86EMUL_PROPAGATE_FAULT without setting exception information. This leads to incorrect emulation and triggers WARN_ON(ctxt->exception.vector > 0x1f) in x86_emulate_insn() as discovered by syzkaller fuzzer: WARNING: CPU: 2 PID: 18646 at arch/x86/kvm/emulate.c:5558 Call Trace: warn_slowpath_null+0x2c/0x40 kernel/panic.c:582 x86_emulate_insn+0x16a5/0x4090 arch/x86/kvm/emulate.c:5572 x86_emulate_instruction+0x403/0x1cc0 arch/x86/kvm/x86.c:5618 emulate_instruction arch/x86/include/asm/kvm_host.h:1127 [inline] handle_exception+0x594/0xfd0 arch/x86/kvm/vmx.c:5762 vmx_handle_exit+0x2b7/0x38b0 arch/x86/kvm/vmx.c:8625 vcpu_enter_guest arch/x86/kvm/x86.c:6888 [inline] vcpu_run arch/x86/kvm/x86.c:6947 [inline] Set exception information when write in emulator_fix_hypercall() fails. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: kvm@vger.kernel.org Cc: syzkaller@googlegroups.com Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-01-12KVM: x86: fix emulation of "MOV SS, null selector"Paolo Bonzini1-10/+38
This is CVE-2017-2583. On Intel this causes a failed vmentry because SS's type is neither 3 nor 7 (even though the manual says this check is only done for usable SS, and the dmesg splat says that SS is unusable!). On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb. The fix fabricates a data segment descriptor when SS is set to a null selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb. Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3; this in turn ensures CPL < 3 because RPL must be equal to CPL. Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing the bug and deciphering the manuals. Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com> Fixes: 79d5b4c3cd809c770d4bf9812635647016c56011 Cc: stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: fix NULL deref in vcpu_scan_ioapicWanpeng Li1-0/+2
Reported by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0 IP: _raw_spin_lock+0xc/0x30 PGD 3e28eb067 PUD 3f0ac6067 PMD 0 Oops: 0002 [#1] SMP CPU: 0 PID: 2431 Comm: test Tainted: G OE 4.10.0-rc1+ #3 Call Trace: ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm] kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm] ? pick_next_task_fair+0xe1/0x4e0 ? kvm_arch_vcpu_load+0xea/0x260 [kvm] kvm_vcpu_ioctl+0x33a/0x600 [kvm] ? hrtimer_try_to_cancel+0x29/0x130 ? do_nanosleep+0x97/0xf0 do_vfs_ioctl+0xa1/0x5d0 ? __hrtimer_init+0x90/0x90 ? do_nanosleep+0x5b/0xf0 SyS_ioctl+0x79/0x90 do_syscall_64+0x6e/0x180 entry_SYSCALL64_slow_path+0x25/0x25 RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0 The syzkaller folks reported a NULL pointer dereference due to ENABLE_CAP succeeding even without an irqchip. The Hyper-V synthetic interrupt controller is activated, resulting in a wrong request to rescan the ioapic and a NULL pointer dereference. #include <sys/ioctl.h> #include <sys/mman.h> #include <sys/types.h> #include <linux/kvm.h> #include <pthread.h> #include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #ifndef KVM_CAP_HYPERV_SYNIC #define KVM_CAP_HYPERV_SYNIC 123 #endif void* thr(void* arg) { struct kvm_enable_cap cap; cap.flags = 0; cap.cap = KVM_CAP_HYPERV_SYNIC; ioctl((long)arg, KVM_ENABLE_CAP, &cap); return 0; } int main() { void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); int kvmfd = open("/dev/kvm", 0); int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0); struct kvm_userspace_memory_region memreg; memreg.slot = 0; memreg.flags = 0; memreg.guest_phys_addr = 0; memreg.memory_size = 0x1000; memreg.userspace_addr = (unsigned long)host_mem; host_mem[0] = 0xf4; ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg); int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0); struct kvm_sregs sregs; ioctl(cpufd, KVM_GET_SREGS, &sregs); sregs.cr0 = 0; sregs.cr4 = 0; sregs.efer = 0; sregs.cs.selector = 0; sregs.cs.base = 0; ioctl(cpufd, KVM_SET_SREGS, &sregs); struct kvm_regs regs = { .rflags = 2 }; ioctl(cpufd, KVM_SET_REGS, &regs); ioctl(vmfd, KVM_CREATE_IRQCHIP, 0); pthread_t th; pthread_create(&th, 0, thr, (void*)(long)cpufd); usleep(rand() % 10000); ioctl(cpufd, KVM_RUN, 0); pthread_join(th, 0); return 0; } This patch fixes it by failing ENABLE_CAP if without an irqchip. Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: 5c919412fe61 (kvm/x86: Hyper-V synthetic interrupt controller) Cc: stable@vger.kernel.org # 4.5+ Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: Introduce segmented_write_stdSteve Rutherford1-4/+18
Introduces segemented_write_std. Switches from emulated reads/writes to standard read/writes in fxsave, fxrstor, sgdt, and sidt. This fixes CVE-2017-2584, a longstanding kernel memory leak. Since commit 283c95d0e389 ("KVM: x86: emulate FXSAVE and FXRSTOR", 2016-11-09), which is luckily not yet in any final release, this would also be an exploitable kernel memory *write*! Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: 96051572c819194c37a8367624b285be10297eca Fixes: 283c95d0e3891b64087706b344a4b545d04a6e62 Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: flush pending lapic jump label updates on module unloadDavid Matlack3-0/+8
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled). These are implemented with delayed_work structs which can still be pending when the KVM module is unloaded. We've seen this cause kernel panics when the kvm_intel module is quickly reloaded. Use the new static_key_deferred_flush() API to flush pending updates on module unload. Signed-off-by: David Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-12/+11
Pull KVM fixes from Radim Krčmář: "MIPS: - fix host kernel crashes when receiving a signal with 64-bit userspace - flush instruction cache on all vcpus after generating entry code (both for stable) x86: - fix NULL dereference in MMU caused by SMM transitions (for stable) - correct guest instruction pointer after emulating some VMX errors - minor cleanup" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: remove duplicated declaration KVM: MIPS: Flush KVM entry code from icache globally KVM: MIPS: Don't clobber CP0_Status.UX KVM: x86: reset MMU on KVM_SET_VCPU_EVENTS KVM: nVMX: fix instruction skipping during emulated vm-entry
2017-01-05KVM: VMX: remove duplicated declarationJan Dakinevich1-6/+0
Declaration of VMX_VPID_EXTENT_SUPPORTED_MASK occures twice in the code. Probably, it was happened after unsuccessful merge. Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-25Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-9/+9
Pull timer type cleanups from Thomas Gleixner: "This series does a tree wide cleanup of types related to timers/timekeeping. - Get rid of cycles_t and use a plain u64. The type is not really helpful and caused more confusion than clarity - Get rid of the ktime union. The union has become useless as we use the scalar nanoseconds storage unconditionally now. The 32bit timespec alike storage got removed due to the Y2038 limitations some time ago. That leaves the odd union access around for no reason. Clean it up. Both changes have been done with coccinelle and a small amount of manual mopping up" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ktime: Get rid of ktime_equal() ktime: Cleanup ktime_set() usage ktime: Get rid of the union clocksource: Use a plain u64 instead of cycle_t
2016-12-25ktime: Cleanup ktime_set() usageThomas Gleixner1-2/+2
ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a Seconds and Nanoseconds part of a time value needs to be converted. For anything where the Seconds argument is 0, this is pointless and can be replaced with a simple assignment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25clocksource: Use a plain u64 instead of cycle_tThomas Gleixner1-7/+7
There is no point in having an extra type for extra confusion. u64 is unambiguous. Conversion was done with the following coccinelle script: @rem@ @@ -typedef u64 cycle_t; @fix@ typedef cycle_t; @@ -cycle_t +u64 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org>
2016-12-25cpu/hotplug: Cleanup state namesThomas Gleixner1-1/+1
When the state names got added a script was used to add the extra argument to the calls. The script basically converted the state constant to a string, but the cleanup to convert these strings into meaningful ones did not happen. Replace all the useless strings with 'subsys/xxx/yyy:state' strings which are used in all the other places already. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-24KVM: x86: reset MMU on KVM_SET_VCPU_EVENTSXiao Guangrong1-2/+7
Otherwise, mismatch between the smm bit in hflags and the MMU role can cause a NULL pointer dereference. Cc: stable@vger.kernel.org Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-21KVM: nVMX: fix instruction skipping during emulated vm-entryDavid Matlack1-4/+4
kvm_skip_emulated_instruction() should not be called after emulating a VM-entry failure during or after loading guest state (nested_vmx_entry_failure()). Otherwise the L1 hypervisor is resumed some number of bytes past vmcs->host_rip. Fixes: eb2775621701e6ee3ea2a474437d04e93ccdcb2f Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-19kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)Jim Mattson1-6/+5
When L2 exits to L0 due to "exception or NMI", software exceptions (#BP and #OF) for which L1 has requested an intercept should be handled by L1 rather than L0. Previously, only hardware exceptions were forwarded to L1. Signed-off-by: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-19kvm: take srcu lock around kvm_steal_time_set_preempted()Andrea Arcangeli1-0/+7
kvm_memslots() will be called by kvm_write_guest_offset_cached() so take the srcu lock. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-19kvm: fix schedule in atomic in kvm_steal_time_set_preempted()Andrea Arcangeli1-0/+10
kvm_steal_time_set_preempted() isn't disabling the pagefaults before calling __copy_to_user and the kernel debug notices. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-16KVM: hyperv: fix locking of struct kvm_hv fieldsPaolo Bonzini2-9/+16
Introduce a new mutex to avoid an AB-BA deadlock between kvm->lock and vcpu->mutex. Protect accesses in kvm_hv_setup_tsc_page too, as suggested by Roman. Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-15KVM: x86: Expose Intel AVX512IFMA/AVX512VBMI/SHA features to guest.Yi Sun1-4/+5
Expose AVX512IFMA/AVX512VBMI/SHA features to guest. AVX512 spec can be found at: https://software.intel.com/sites/default/files/managed/26/40/319433-026.pdf SHA spec can be found at: https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf This patch depends on below patch. http://marc.info/?l=linux-kernel&m=147932800828178&w=2 Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-15kvm: nVMX: Correct a VMX instruction error code for VMPTRLDGanShun1-1/+1
When the operand passed to VMPTRLD matches the address of the VMXON region, the VMX instruction error code should be VMXERR_VMPTRLD_VMXON_POINTER rather than VMXERR_VMCLEAR_VMXON_POINTER. Signed-off-by: GanShun <ganshun@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds11-560/+1137
Pull KVM updates from Paolo Bonzini: "Small release, the most interesting stuff is x86 nested virt improvements. x86: - userspace can now hide nested VMX features from guests - nested VMX can now run Hyper-V in a guest - support for AVX512_4VNNIW and AVX512_FMAPS in KVM - infrastructure support for virtual Intel GPUs. PPC: - support for KVM guests on POWER9 - improved support for interrupt polling - optimizations and cleanups. s390: - two small optimizations, more stuff is in flight and will be in 4.11. ARM: - support for the GICv3 ITS on 32bit platforms" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits) arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest KVM: arm/arm64: timer: Check for properly initialized timer on init KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs KVM: x86: Handle the kthread worker using the new API KVM: nVMX: invvpid handling improvements KVM: nVMX: check host CR3 on vmentry and vmexit KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry KVM: nVMX: propagate errors from prepare_vmcs02 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation KVM: nVMX: support restore of VMX capability MSRs KVM: nVMX: generate non-true VMX MSRs based on true versions KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs. KVM: x86: Add kvm_skip_emulated_instruction and use it. KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12 KVM: VMX: Reorder some skip_emulated_instruction calls KVM: x86: Add a return value to kvm_emulate_cpuid KVM: PPC: Book3S: Move prototypes for KVM functions into kvm_ppc.h ...
2016-12-13Merge tag 'drm-for-v4.10' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds3-3/+41
Pull drm updates from Dave Airlie: "This is the main pull request for drm for 4.10 kernel. New drivers: - ZTE VOU display driver (zxdrm) - Amlogic Meson Graphic Controller GXBB/GXL/GXM SoCs (meson) - MXSFB support (mxsfb) Core: - Format handling has been reworked - Better atomic state debugging - drm_mm leak debugging - Atomic explicit fencing support - fbdev helper ops - Documentation updates - MST fbcon fixes Bridge: - Silicon Image SiI8620 driver Panel: - Add support for new simple panels i915: - GVT Device model - Better HDMI2.0 support on skylake - More watermark fixes - GPU idling rework for suspend/resume - DP Audio workarounds - Scheduler prep-work - Opregion CADL handling - GPU scheduler and priority boosting amdgfx/radeon: - Support for virtual devices - New VM manager for non-contig VRAM buffers - UVD powergating - SI register header cleanup - Cursor fixes - Powermanagement fixes nouveau: - Powermangement reworks for better voltage/clock changes - Atomic modesetting support - Displayport Multistream (MST) support. - GP102/104 hang and cursor fixes - GP106 support hisilicon: - hibmc support (BMC chip for aarch64 servers) armada: - add tracing support for overlay change - refactor plane support - de-midlayer the driver omapdrm: - Timing code cleanups rcar-du: - R8A7792/R8A7796 support - Misc fixes. sunxi: - A31 SoC display engine support imx-drm: - YUV format support - Cleanup plane atomic update mali-dp: - Misc fixes dw-hdmi: - Add support for HDMI i2c master controller tegra: - IOMMU support fixes - Error handling fixes tda998x: - Fix connector registration - Improved robustness - Fix infoframe/audio compliance virtio: - fix busid issues - allocate more vbufs qxl: - misc fixes and cleanups. vc4: - Fragment shader threading - ETC1 support - VEC (tv-out) support msm: - A5XX GPU support - Lots of atomic changes tilcdc: - Misc fixes and cleanups. etnaviv: - Fix dma-buf export path - DRAW_INSTANCED support - fix driver on i.MX6SX exynos: - HDMI refactoring fsl-dcu: - fbdev changes" * tag 'drm-for-v4.10' of git://people.freedesktop.org/~airlied/linux: (1343 commits) drm/nouveau/kms/nv50: fix atomic regression on original G80 drm/nouveau/bl: Do not register interface if Apple GMUX detected drm/nouveau/bl: Assign different names to interfaces drm/nouveau/bios/dp: fix handling of LevelEntryTableIndex on DP table 4.2 drm/nouveau/ltc: protect clearing of comptags with mutex drm/nouveau/gr/gf100-: handle GPC/TPC/MPC trap drm/nouveau/core: recognise GP106 chipset drm/nouveau/ttm: wait for bo fence to signal before unmapping vmas drm/nouveau/gr/gf100-: FECS intr handling is not relevant on proprietary ucode drm/nouveau/gr/gf100-: properly ack all FECS error interrupts drm/nouveau/fifo/gf100-: recover from host mmu faults drm: Add fake controlD* symlinks for backwards compat drm/vc4: Don't use drm_put_dev drm/vc4: Document VEC DT binding drm/vc4: Add support for the VEC (Video Encoder) IP drm: Add TV connector states to drm_connector_state drm: Turn DRM_MODE_SUBCONNECTOR_xx definitions into an enum drm/vc4: Fix ->clock_select setting for the VEC encoder drm/amdgpu/dce6: Set MASTER_UPDATE_MODE to 0 in resume_mc_access as well drm/amdgpu: use pin rather than pin_restricted in a few cases ...
2016-12-12Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds3-29/+6
Pull x86 FPU updates from Ingo Molnar: "The main changes in this cycle were: - do a large round of simplifications after all CPUs do 'eager' FPU context switching in v4.9: remove CR0 twiddling, remove leftover eager/lazy bts, etc (Andy Lutomirski) - more FPU code simplifications: remove struct fpu::counter, clarify nomenclature, remove unnecessary arguments/functions and better structure the code (Rik van Riel)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Remove clts() x86/fpu: Remove stts() x86/fpu: Handle #NM without FPU emulation as an error x86/fpu, lguest: Remove CR0.TS support x86/fpu, kvm: Remove host CR0.TS manipulation x86/fpu: Remove irq_ts_save() and irq_ts_restore() x86/fpu: Stop saving and restoring CR0.TS in fpu__init_check_bugs() x86/fpu: Get rid of two redundant clts() calls x86/fpu: Finish excising 'eagerfpu' x86/fpu: Split old_fpu & new_fpu handling into separate functions x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state() x86/fpu: Split old & new FPU code paths x86/fpu: Remove __fpregs_(de)activate() x86/fpu: Rename lazy restore functions to "register state valid" x86/fpu, kvm: Remove KVM vcpu->fpu_counter x86/fpu: Remove struct fpu::counter x86/fpu: Remove use_eager_fpu() x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction x86/fpu: Hard-disable lazy FPU mode x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code
2016-12-11Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar5-29/+28
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-08KVM: x86: Handle the kthread worker using the new APIPetr Mladek2-11/+7
Use the new API to create and destroy the "kvm-pit" kthread worker. The API hides some implementation details. In particular, kthread_create_worker() allocates and initializes struct kthread_worker. It runs the kthread the right way and stores task_struct into the worker structure. kthread_destroy_worker() flushes all pending works, stops the kthread and frees the structure. This patch does not change the existing behavior except for dynamically allocating struct kthread_worker and storing only the pointer of this structure. It is compile tested only because I did not find an easy way how to run the code. Well, it should be pretty safe given the nature of the change. Signed-off-by: Petr Mladek <pmladek@suse.com> Message-Id: <1476877847-11217-1-git-send-email-pmladek@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-08KVM: nVMX: invvpid handling improvementsJan Dakinevich1-0/+10
- Expose all invalidation types to the L1 - Reject invvpid instruction, if L1 passed zero vpid value to single context invalidations Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-08KVM: nVMX: check host CR3 on vmentry and vmexitLadi Prosek1-7/+18
This commit adds missing host CR3 checks. Before entering guest mode, the value of CR3 is checked for reserved bits. After returning, nested_vmx_load_cr3 is called to set the new CR3 value and check and load PDPTRs. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentryLadi Prosek2-16/+44
Loading CR3 as part of emulating vmentry is different from regular CR3 loads, as implemented in kvm_set_cr3, in several ways. * different rules are followed to check CR3 and it is desirable for the caller to distinguish between the possible failures * PDPTRs are not loaded if PAE paging and nested EPT are both enabled * many MMU operations are not necessary This patch introduces nested_vmx_load_cr3 suitable for CR3 loads as part of nested vmentry and vmexit, and makes use of it on the nested vmentry path. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: propagate errors from prepare_vmcs02Ladi Prosek1-4/+19
It is possible that prepare_vmcs02 fails to load the guest state. This patch adds the proper error handling for such a case. L1 will receive an INVALID_STATE vmexit with the appropriate exit qualification if it happens. A failure to set guest CR3 is the only error propagated from prepare_vmcs02 at the moment. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPTLadi Prosek1-2/+14
KVM does not correctly handle L1 hypervisors that emulate L2 real mode with PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used (see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two related issues: 1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are overwritten in ept_load_pdptrs because the registers are believed to have been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is running with PAE enabled but PDPTRs have been set up by L1. 2) When L2 is about to enable paging and loads its CR3, we, again, attempt to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees that this will succeed (it's just a CR3 load, paging is not enabled yet) and if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is then lost and L2 crashes right after it enables paging. This patch replaces the kvm_set_cr3 call with a simple register write if PAE and EPT are both on. CR3 is not to be interpreted in this case. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entryDavid Matlack1-9/+9
vmx_set_cr0() modifies GUEST_EFER and "IA-32e mode guest" in the current VMCS. Call vmx_set_efer() after vmx_set_cr0() so that emulated VM-entry is more faithful to VMCS12. This patch correctly causes VM-entry to fail when "IA-32e mode guest" is 1 and GUEST_CR0.PG is 0. Previously this configuration would succeed and "IA-32e mode guest" would silently be disabled by KVM. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUIDDavid Matlack1-3/+52
MSR_IA32_CR{0,4}_FIXED1 define which bits in CR0 and CR4 are allowed to be 1 during VMX operation. Since the set of allowed-1 bits is the same in and out of VMX operation, we can generate these MSRs entirely from the guest's CPUID. This lets userspace avoiding having to save/restore these MSRs. This patch also initializes MSR_IA32_CR{0,4}_FIXED1 from the CPU's MSRs by default. This is a saner than the current default of -1ull, which includes bits that the host CPU does not support. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: fix checks on CR{0,4} during virtual VMX operationDavid Matlack1-24/+53
KVM emulates MSR_IA32_VMX_CR{0,4}_FIXED1 with the value -1ULL, meaning all CR0 and CR4 bits are allowed to be 1 during VMX operation. This does not match real hardware, which disallows the high 32 bits of CR0 to be 1, and disallows reserved bits of CR4 to be 1 (including bits which are defined in the SDM but missing according to CPUID). A guest can induce a VM-entry failure by setting these bits in GUEST_CR0 and GUEST_CR4, despite MSR_IA32_VMX_CR{0,4}_FIXED1 indicating they are valid. Since KVM has allowed all bits to be 1 in CR0 and CR4, the existing checks on these registers do not verify must-be-0 bits. Fix these checks to identify must-be-0 bits according to MSR_IA32_VMX_CR{0,4}_FIXED1. This patch should introduce no change in behavior in KVM, since these MSRs are still -1ULL. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: support restore of VMX capability MSRsDavid Matlack1-24/+266
The VMX capability MSRs advertise the set of features the KVM virtual CPU can support. This set of features varies across different host CPUs and KVM versions. This patch aims to addresses both sources of differences, allowing VMs to be migrated across CPUs and KVM versions without guest-visible changes to these MSRs. Note that cross-KVM- version migration is only supported from this point forward. When the VMX capability MSRs are restored, they are audited to check that the set of features advertised are a subset of what KVM and the CPU support. Since the VMX capability MSRs are read-only, they do not need to be on the default MSR save/restore lists. The userspace hypervisor can set the values of these MSRs or read them from KVM at VCPU creation time, and restore the same value after every save/restore. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: nVMX: generate non-true VMX MSRs based on true versionsDavid Matlack1-26/+19
The "non-true" VMX capability MSRs can be generated from their "true" counterparts, by OR-ing the default1 bits. The default1 bits are fixed and defined in the SDM. Since we can generate the non-true VMX MSRs from the true versions, there's no need to store both in struct nested_vmx. This also lets userspace avoid having to restore the non-true MSRs. Note this does not preclude emulating MSR_IA32_VMX_BASIC[55]=0. To do so, we simply need to set all the default1 bits in the true MSRs (such that the true MSRs and the generated non-true MSRs are equal). Signed-off-by: David Matlack <dmatlack@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.Kyle Huey1-1/+0
The trap flag stays set until software clears it. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: x86: Add kvm_skip_emulated_instruction and use it.Kyle Huey4-123/+101
kvm_skip_emulated_instruction calls both kvm_x86_ops->skip_emulated_instruction and kvm_vcpu_check_singlestep, skipping the emulated instruction and generating a trap if necessary. Replacing skip_emulated_instruction calls with kvm_skip_emulated_instruction is straightforward, except for: - ICEBP, which is already inside a trap, so avoid triggering another trap. - Instructions that can trigger exits to userspace, such as the IO insns, MOVs to CR8, and HALT. If kvm_skip_emulated_instruction does trigger a KVM_GUESTDBG_SINGLESTEP exit, and the handling code for IN/OUT/MOV CR8/HALT also triggers an exit to userspace, the latter will take precedence. The singlestep will be triggered again on the next instruction, which is the current behavior. - Task switch instructions which would require additional handling (e.g. the task switch bit) and are instead left alone. - Cases where VMLAUNCH/VMRESUME do not proceed to the next instruction, which do not trigger singlestep traps as mentioned previously. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12Kyle Huey1-20/+33
We can't return both the pass/fail boolean for the vmcs and the upcoming continue/exit-to-userspace boolean for skip_emulated_instruction out of nested_vmx_check_vmcs, so move skip_emulated_instruction out of it instead. Additionally, VMENTER/VMRESUME only trigger singlestep exceptions when they advance the IP to the following instruction, not when they a) succeed, b) fail MSR validation or c) throw an exception. Add a separate call to skip_emulated_instruction that will later not be converted to the variant that checks the singlestep flag. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: VMX: Reorder some skip_emulated_instruction callsKyle Huey1-6/+6
The functions being moved ahead of skip_emulated_instruction here don't need updated IPs, and skipping the emulated instruction at the end will make it easier to return its value. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08KVM: x86: Add a return value to kvm_emulate_cpuidKyle Huey3-5/+4
Once skipping the emulated instruction can potentially trigger an exit to userspace (via KVM_GUESTDBG_SINGLESTEP) kvm_emulate_cpuid will need to propagate a return value. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-05Backmerge tag 'v4.9-rc8' into drm-nextDave Airlie6-73/+89
Linux 4.9-rc8 Daniel requested this so we could apply some follow on fixes cleanly to -next.
2016-11-24KVM: x86: check for pic and ioapic presence before useRadim Krčmář1-0/+13
Split irqchip allows pic and ioapic routes to be used without them being created, which results in NULL access. Check for NULL and avoid it. (The setup is too racy for a nicer solutions.) Found by syzkaller: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 3 PID: 11923 Comm: kworker/3:2 Not tainted 4.9.0-rc5+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: events irqfd_inject task: ffff88006a06c7c0 task.stack: ffff880068638000 RIP: 0010:[...] [...] __lock_acquire+0xb35/0x3380 kernel/locking/lockdep.c:3221 RSP: 0000:ffff88006863ea20 EFLAGS: 00010006 RAX: dffffc0000000000 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000039 RSI: 0000000000000000 RDI: 1ffff1000d0c7d9e RBP: ffff88006863ef58 R08: 0000000000000001 R09: 0000000000000000 R10: 00000000000001c8 R11: 0000000000000000 R12: ffff88006a06c7c0 R13: 0000000000000001 R14: ffffffff8baab1a0 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88006d100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004abdd0 CR3: 000000003e2f2000 CR4: 00000000000026e0 Stack: ffffffff894d0098 1ffff1000d0c7d56 ffff88006863ecd0 dffffc0000000000 ffff88006a06c7c0 0000000000000000 ffff88006863ecf8 0000000000000082 0000000000000000 ffffffff815dd7c1 ffffffff00000000 ffffffff00000000 Call Trace: [...] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3746 [...] __raw_spin_lock include/linux/spinlock_api_smp.h:144 [...] _raw_spin_lock+0x38/0x50 kernel/locking/spinlock.c:151 [...] spin_lock include/linux/spinlock.h:302 [...] kvm_ioapic_set_irq+0x4c/0x100 arch/x86/kvm/ioapic.c:379 [...] kvm_set_ioapic_irq+0x8f/0xc0 arch/x86/kvm/irq_comm.c:52 [...] kvm_set_irq+0x239/0x640 arch/x86/kvm/../../../virt/kvm/irqchip.c:101 [...] irqfd_inject+0xb4/0x150 arch/x86/kvm/../../../virt/kvm/eventfd.c:60 [...] process_one_work+0xb40/0x1ba0 kernel/workqueue.c:2096 [...] worker_thread+0x214/0x18a0 kernel/workqueue.c:2230 [...] kthread+0x328/0x3e0 kernel/kthread.c:209 [...] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433 Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: 49df6397edfc ("KVM: x86: Split the APIC from the rest of IRQCHIP.") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-24KVM: x86: fix out-of-bounds accesses of rtc_eoi mapRadim Krčmář2-3/+3
KVM was using arrays of size KVM_MAX_VCPUS with vcpu_id, but ID can be bigger that the maximal number of VCPUs, resulting in out-of-bounds access. Found by syzkaller: BUG: KASAN: slab-out-of-bounds in __apic_accept_irq+0xb33/0xb50 at addr [...] Write of size 1 by task a.out/27101 CPU: 1 PID: 27101 Comm: a.out Not tainted 4.9.0-rc5+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __apic_accept_irq+0xb33/0xb50 arch/x86/kvm/lapic.c:905 [...] kvm_apic_set_irq+0x10e/0x180 arch/x86/kvm/lapic.c:495 [...] kvm_irq_delivery_to_apic+0x732/0xc10 arch/x86/kvm/irq_comm.c:86 [...] ioapic_service+0x41d/0x760 arch/x86/kvm/ioapic.c:360 [...] ioapic_set_irq+0x275/0x6c0 arch/x86/kvm/ioapic.c:222 [...] kvm_ioapic_inject_all arch/x86/kvm/ioapic.c:235 [...] kvm_set_ioapic+0x223/0x310 arch/x86/kvm/ioapic.c:670 [...] kvm_vm_ioctl_set_irqchip arch/x86/kvm/x86.c:3668 [...] kvm_arch_vm_ioctl+0x1a08/0x23c0 arch/x86/kvm/x86.c:3999 [...] kvm_vm_ioctl+0x1fa/0x1a70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099 Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: af1bae5497b9 ("KVM: x86: bump KVM_MAX_VCPU_ID to 1023") Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-24KVM: x86: drop error recovery in em_jmp_far and em_ret_farRadim Krčmář1-25/+11
em_jmp_far and em_ret_far assumed that setting IP can only fail in 64 bit mode, but syzkaller proved otherwise (and SDM agrees). Code segment was restored upon failure, but it was left uninitialized outside of long mode, which could lead to a leak of host kernel stack. We could have fixed that by always saving and restoring the CS, but we take a simpler approach and just break any guest that manages to fail as the error recovery is error-prone and modern CPUs don't need emulator for this. Found by syzkaller: WARNING: CPU: 2 PID: 3668 at arch/x86/kvm/emulate.c:2217 em_ret_far+0x428/0x480 Kernel panic - not syncing: panic_on_warn set ... CPU: 2 PID: 3668 Comm: syz-executor Not tainted 4.9.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __dump_stack lib/dump_stack.c:15 [...] dump_stack+0xb3/0x118 lib/dump_stack.c:51 [...] panic+0x1b7/0x3a3 kernel/panic.c:179 [...] __warn+0x1c4/0x1e0 kernel/panic.c:542 [...] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585 [...] em_ret_far+0x428/0x480 arch/x86/kvm/emulate.c:2217 [...] em_ret_far_imm+0x17/0x70 arch/x86/kvm/emulate.c:2227 [...] x86_emulate_insn+0x87a/0x3730 arch/x86/kvm/emulate.c:5294 [...] x86_emulate_instruction+0x520/0x1ba0 arch/x86/kvm/x86.c:5545 [...] emulate_instruction arch/x86/include/asm/kvm_host.h:1116 [...] complete_emulated_io arch/x86/kvm/x86.c:6870 [...] complete_emulated_mmio+0x4e9/0x710 arch/x86/kvm/x86.c:6934 [...] kvm_arch_vcpu_ioctl_run+0x3b7a/0x5a90 arch/x86/kvm/x86.c:6978 [...] kvm_vcpu_ioctl+0x61e/0xdd0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2557 [...] vfs_ioctl fs/ioctl.c:43 [...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679 [...] SYSC_ioctl fs/ioctl.c:694 [...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685 [...] entry_SYSCALL_64_fastpath+0x1f/0xc2 Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: d1442d85cc30 ("KVM: x86: Handle errors when RIP is set during far jumps") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-24KVM: x86: fix out-of-bounds access in lapicRadim Krčmář1-1/+1
Cluster xAPIC delivery incorrectly assumed that dest_id <= 0xff. With enabled KVM_X2APIC_API_USE_32BIT_IDS in KVM_CAP_X2APIC_API, a userspace can send an interrupt with dest_id that results in out-of-bounds access. Found by syzkaller: BUG: KASAN: slab-out-of-bounds in kvm_irq_delivery_to_apic_fast+0x11fa/0x1210 at addr ffff88003d9ca750 Read of size 8 by task syz-executor/22923 CPU: 0 PID: 22923 Comm: syz-executor Not tainted 4.9.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __dump_stack lib/dump_stack.c:15 [...] dump_stack+0xb3/0x118 lib/dump_stack.c:51 [...] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 [...] print_address_description mm/kasan/report.c:194 [...] kasan_report_error mm/kasan/report.c:283 [...] kasan_report+0x231/0x500 mm/kasan/report.c:303 [...] __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:329 [...] kvm_irq_delivery_to_apic_fast+0x11fa/0x1210 arch/x86/kvm/lapic.c:824 [...] kvm_irq_delivery_to_apic+0x132/0x9a0 arch/x86/kvm/irq_comm.c:72 [...] kvm_set_msi+0x111/0x160 arch/x86/kvm/irq_comm.c:157 [...] kvm_send_userspace_msi+0x201/0x280 arch/x86/kvm/../../../virt/kvm/irqchip.c:74 [...] kvm_vm_ioctl+0xba5/0x1670 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3015 [...] vfs_ioctl fs/ioctl.c:43 [...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679 [...] SYSC_ioctl fs/ioctl.c:694 [...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685 [...] entry_SYSCALL_64_fastpath+0x1f/0xc2 Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: e45115b62f9a ("KVM: x86: use physical LAPIC array for logical x2APIC") Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-24kvm: svm: Add kvm_fast_pio_in supportTom Lendacky2-2/+46
Update the I/O interception support to add the kvm_fast_pio_in function to speed up the in instruction similar to the out instruction. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-24kvm: svm: Add support for additional SVM NPF error codesTom Lendacky2-3/+19
AMD hardware adds two additional bits to aid in nested page fault handling. Bit 32 - NPF occurred while translating the guest's final physical address Bit 33 - NPF occurred while translating the guest page tables The guest page tables fault indicator can be used as an aid for nested virtualization. Using V0 for the host, V1 for the first level guest and V2 for the second level guest, when both V1 and V2 are using nested paging there are currently a number of unnecessary instruction emulations. When V2 is launched shadow paging is used in V1 for the nested tables of V2. As a result, KVM marks these pages as RO in the host nested page tables. When V2 exits and we resume V1, these pages are still marked RO. Every nested walk for a guest page table is treated as a user-level write access and this causes a lot of NPFs because the V1 page tables are marked RO in the V0 nested tables. While executing V1, when these NPFs occur KVM sees a write to a read-only page, emulates the V1 instruction and unprotects the page (marking it RW). This patch looks for cases where we get a NPF due to a guest page table walk where the page was marked RO. It immediately unprotects the page and resumes the guest, leading to far fewer instruction emulations when nested virtualization is used. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-23Merge branch 'linus' into x86/fpu, to resolve conflictsIngo Molnar5-120/+91
Conflicts: arch/x86/kernel/fpu/core.c Signed-off-by: Ingo Molnar <mingo@kernel.org>