aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm (follow)
AgeCommit message (Collapse)AuthorFilesLines
2009-11-21Merge branch 'tracing/hw-breakpoints' into perf/coreIngo Molnar1-8/+10
Conflicts: arch/x86/kernel/kprobes.c kernel/trace/Makefile Merge reason: hw-breakpoints perf integration is looking good in testing and in reviews, plus conflicts are mounting up - so merge & resolve. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10hw-breakpoints: Wrap in the KVM breakpoint active state checkFrederic Weisbecker1-1/+1
Wrap in the cpu dr7 check that tells if we have active breakpoints that need to be restored in the cpu. This wrapper makes the check more self-explainable and also reusable for any further other uses. Reported-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
2009-11-08hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf eventsFrederic Weisbecker1-8/+10
This patch rebase the implementation of the breakpoints API on top of perf events instances. Each breakpoints are now perf events that handle the register scheduling, thread/cpu attachment, etc.. The new layering is now made as follows: ptrace kgdb ftrace perf syscall \ | / / \ | / / / Core breakpoint API / / | / | / Breakpoints perf events | | Breakpoints PMU ---- Debug Register constraints handling (Part of core breakpoint API) | | Hardware debug registers Reasons of this rewrite: - Use the centralized/optimized pmu registers scheduling, implying an easier arch integration - More powerful register handling: perf attributes (pinned/flexible events, exclusive/non-exclusive, tunable period, etc...) Impact: - New perf ABI: the hardware breakpoints counters - Ptrace breakpoints setting remains tricky and still needs some per thread breakpoints references. Todo (in the order): - Support breakpoints perf counter events for perf tools (ie: implement perf_bpcounter_event()) - Support from perf tools Changes in v2: - Follow the perf "event " rename - The ptrace regression have been fixed (ptrace breakpoint perf events weren't released when a task ended) - Drop the struct hw_breakpoint and store generic fields in perf_event_attr. - Separate core and arch specific headers, drop asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h - Use new generic len/type for breakpoint - Handle off case: when breakpoints api is not supported by an arch Changes in v3: - Fix broken CONFIG_KVM, we need to propagate the breakpoint api changes to kvm when we exit the guest and restore the bp registers to the host. Changes in v4: - Drop the hw_breakpoint_restore() stub as it is only used by KVM - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a module - Restore the breakpoints unconditionally on kvm guest exit: TIF_DEBUG_THREAD doesn't anymore cover every cases of running breakpoints and vcpu->arch.switch_db_regs might not always be set when the guest used debug registers. (Waiting for a reliable optimization) Changes in v5: - Split-up the asm-generic/hw-breakpoint.h moving to linux/hw_breakpoint.h into a separate patch - Optimize the breakpoints restoring while switching from kvm guest to host. We only want to restore the state if we have active breakpoints to the host, otherwise we don't care about messed-up address registers. - Add asm/hw_breakpoint.h to Kbuild - Fix bad breakpoint type in trace_selftest.c Changes in v6: - Fix wrong header inclusion in trace.h (triggered a build error with CONFIG_FTRACE_SELFTEST Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jan Kiszka <jan.kiszka@web.de> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Avi Kivity <avi@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org>
2009-11-04KVM: get_tss_base_addr() should return a gpa_tGleb Natapov1-1/+1
If TSS we are switching to resides in high memory task switch will fail since address will be truncated. Windows2k3 does this sometimes when running with more then 4G Cc: stable@kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-11-04KVM: x86: Catch potential overrun in MCE setupJan Kiszka1-1/+1
We only allocate memory for 32 MCE banks (KVM_MAX_MCE_BANKS) but we allow user space to fill up to 255 on setup (mcg_cap & 0xff), corrupting kernel memory. Catch these overflows. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-18Merge commit 'perf/core' into perf/hw-breakpointFrederic Weisbecker6-29/+104
Conflicts: kernel/Makefile kernel/trace/Makefile kernel/trace/trace.h samples/Makefile Merge reason: We need to be uptodate with the perf events development branch because we plan to rewrite the breakpoints API on top of perf events.
2009-10-16KVM: MMU: fix pointer castFrederik Deweerdt1-6/+10
On a 32 bits compile, commit 3da0dd433dc399a8c0124d0614d82a09b6a49bce introduced the following warnings: arch/x86/kvm/mmu.c: In function ‘kvm_set_pte_rmapp’: arch/x86/kvm/mmu.c:770: warning: cast to pointer from integer of different size arch/x86/kvm/mmu.c: In function ‘kvm_set_spte_hva’: arch/x86/kvm/mmu.c:849: warning: cast from pointer to integer of different size The following patch uses 'unsigned long' instead of u64 to match the pointer size on both arches. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@xprog.eu> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-16KVM: use proper hrtimer function to retrieve expiration timeMarcelo Tosatti2-2/+2
hrtimer->base can be temporarily NULL due to racing hrtimer_start. See switch_hrtimer_base/lock_hrtimer_base. Use hrtimer_get_remaining which is robust against it. CC: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-10-04KVM: add support for change_pte mmu notifiersIzik Eidus1-9/+53
this is needed for kvm if it want ksm to directly map pages into its shadow page tables. [marcelo: cast pfn assignment to u64] Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: MMU: add SPTE_HOST_WRITEABLE flag to the shadow ptesIzik Eidus2-7/+26
this flag notify that the host physical page we are pointing to from the spte is write protected, and therefore we cant change its access to be write unless we run get_user_pages(write = 1). (this is needed for change_pte support in kvm) Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: MMU: dont hold pagecount reference for mapped sptes pagesIzik Eidus1-5/+2
When using mmu notifiers, we are allowed to remove the page count reference tooken by get_user_pages to a specific page that is mapped inside the shadow page tables. This is needed so we can balance the pagecount against mapcount checking. (Right now kvm increase the pagecount and does not increase the mapcount when mapping page into shadow page table entry, so when comparing pagecount against mapcount, you have no reliable result.) Signed-off-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: Prevent overflow in KVM_GET_SUPPORTED_CPUIDAvi Kivity1-0/+2
The number of entries is multiplied by the entry size, which can overflow on 32-bit hosts. Bound the entry count instead. Reported-by: David Wagner <daw@cs.berkeley.edu> Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-10-04KVM: VMX: flush TLB with INVEPT on cpu migrationMarcelo Tosatti1-1/+1
It is possible that stale EPTP-tagged mappings are used, if a vcpu migrates to a different pcpu. Set KVM_REQ_TLB_FLUSH in vmx_vcpu_load, when switching pcpus, which will invalidate both VPID and EPT mappings on the next vm-entry. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: fix LAPIC timer period overflowAurelien Jarno1-1/+1
Don't overflow when computing the 64-bit period from 32-bit registers. Fixes sourceforge bug #2826486. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: SVM: Handle tsc in svm_get_msr/svm_set_msr correctlyJoerg Roedel1-6/+17
When running nested we need to touch the l1 guests tsc_offset. Otherwise changes will be lost or a wrong value be read. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04KVM: SVM: Fix tsc offset adjustment when running nestedJoerg Roedel1-0/+2
When svm_vcpu_load is called while the vcpu is running in guest mode the tsc adjustment made there is lost on the next emulated #vmexit. This causes the tsc running backwards in the guest. This patch fixes the issue by also adjusting the tsc_offset in the emulated hsave area so that it will not get lost. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-15Merge branch 'linus' into tracing/hw-breakpointsIngo Molnar22-1327/+3211
Conflicts: arch/x86/kernel/process_64.c Semantic conflict fixed in: arch/x86/kvm/x86.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-14Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds22-1321/+3210
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (202 commits) MAINTAINERS: update KVM entry KVM: correct error-handling code KVM: fix compile warnings on s390 KVM: VMX: Check cpl before emulating debug register access KVM: fix misreporting of coalesced interrupts by kvm tracer KVM: x86: drop duplicate kvm_flush_remote_tlb calls KVM: VMX: call vmx_load_host_state() only if msr is cached KVM: VMX: Conditionally reload debug register 6 KVM: Use thread debug register storage instead of kvm specific data KVM guest: do not batch pte updates from interrupt context KVM: Fix coalesced interrupt reporting in IOAPIC KVM guest: fix bogus wallclock physical address calculation KVM: VMX: Fix cr8 exiting control clobbering by EPT KVM: Optimize kvm_mmu_unprotect_page_virt() for tdp KVM: Document KVM_CAP_IRQCHIP KVM: Protect update_cr8_intercept() when running without an apic KVM: VMX: Fix EPT with WP bit change during paging KVM: Use kvm_{read,write}_guest_virt() to read and write segment descriptors KVM: x86 emulator: Add adc and sbb missing decoder flags KVM: Add missing #include ...
2009-09-10KVM: VMX: Check cpl before emulating debug register accessAvi Kivity2-0/+15
Debug registers may only be accessed from cpl 0. Unfortunately, vmx will code to emulate the instruction even though it was issued from guest userspace, possibly leading to an unexpected trap later. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10KVM: fix misreporting of coalesced interrupts by kvm tracerGleb Natapov1-1/+1
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10KVM: x86: drop duplicate kvm_flush_remote_tlb callsMarcelo Tosatti1-2/+0
kvm_mmu_slot_remove_write_access already calls it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: call vmx_load_host_state() only if msr is cachedGleb Natapov1-2/+2
No need to call it before each kvm_(set|get)_msr_common() Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10KVM: VMX: Conditionally reload debug register 6Avi Kivity1-5/+9
Only reload debug register 6 if we're running with the guest's debug registers. Saves around 150 cycles from the guest lightweight exit path. dr6 contains a couple of bits that are updated on #DB, so intercept that unconditionally and update those bits then. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10KVM: Use thread debug register storage instead of kvm specific dataAvi Kivity1-15/+7
Instead of saving the debug registers from the processor to a kvm data structure, rely in the debug registers stored in the thread structure. This allows us not to save dr6 and dr7. Reduces lightweight vmexit cost by 350 cycles, or 11 percent. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Fix cr8 exiting control clobbering by EPTGleb Natapov1-6/+3
Don't call adjust_vmx_controls() two times for the same control. It restores options that were dropped earlier. This loses us the cr8 exit control, which causes a massive performance regression Windows x64. Cc: stable@kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Optimize kvm_mmu_unprotect_page_virt() for tdpAvi Kivity1-0/+3
We know no pages are protected, so we can short-circuit the whole thing (including fairly nasty guest memory accesses). Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Protect update_cr8_intercept() when running without an apicAvi Kivity1-0/+3
update_cr8_intercept() can be triggered from userspace while there is no apic present. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Fix EPT with WP bit change during pagingSheng Yang1-3/+3
QNX update WP bit when paging enabled, which is not covered yet. This one fix QNX boot with EPT. Cc: stable@kernel.org Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use kvm_{read,write}_guest_virt() to read and write segment descriptorsMikhail Ershov1-8/+2
Segment descriptors tables can be placed on two non-contiguous pages. This patch makes reading segment descriptors by linear address. Signed-off-by: Mikhail Ershov <Mike.Ershov@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: x86 emulator: Add adc and sbb missing decoder flagsMohammed Gamal1-2/+2
Add missing decoder flags for adc and sbb instructions (opcodes 0x14-0x15, 0x1c-0x1d) Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Rename x86_emulate.c to emulate.cAvi Kivity3-4/+4
We're in arch/x86, what could we possibly be emulating? Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: When switching to a vm8086 task, load segments as 16-bitAnthony Liguori1-1/+8
According to 16.2.5 in the SDM, eflags.vm in the tss is consulted before loading and new segments. If eflags.vm == 1, then the segments are treated as 16-bit segments. The LDTR and TR are not normally available in vm86 mode so if they happen to somehow get loaded, they need to be treated as 32-bit segments. This fixes an invalid vmentry failure in a custom OS that was happening after a task switch into vm8086 mode. Since the segments were being mistakenly treated as 32-bit, we loaded garbage state. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Adjust rflags if in real mode emulationAvi Kivity1-1/+6
We set rflags.vm86 when virtualizing real mode to do through vm8086 mode; so we need to take it out again when reading rflags. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: Drop tlb flush workaround in nptAvi Kivity1-11/+2
It is no longer possible to reproduce the problem any more, so presumably it has been fixed. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Update cr8 intercept when APIC TPR is changed by userspaceGleb Natapov1-0/+2
Since on vcpu entry we do it only if apic is enabled we should do it when TPR is changed while apic is disabled. This happens when windows resets HW without setting TPR to zero. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: enable nested svm by defaultJoerg Roedel1-1/+1
Nested SVM is (in my experience) stable enough to be enabled by default. So omit the requirement to pass a module parameter. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: check for nested VINTR flag in svm_interrupt_allowedJoerg Roedel1-1/+1
Not checking for this flag breaks any nested hypervisor that does not set VINTR. So fix it with this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: move nested_svm_intr main logic out of if-clauseJoerg Roedel1-10/+11
This patch removes one indentation level from nested_svm_intr and makes the logic more readable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: remove unnecessary is_nested check from svm_cpu_runJoerg Roedel1-2/+1
This check is not necessary. We have to sync the vcpu->arch.cr2 always back to the VMCB. This patch remove the is_nested check. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: move special nested exit handling to separate functionJoerg Roedel1-30/+50
This patch moves the handling for special nested vmexits like #pf to a separate function. This makes the kvm_override parameter obsolete and makes the code more readable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: handle errors in vmrun emulation path appropriatlyJoerg Roedel1-1/+13
If nested svm fails to load the msrpm the vmrun succeeds with the old msrpm which is not correct. This patch changes the logic to roll back to host mode in case the msrpm cannot be loaded. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: remove nested_svm_do and helper functionsJoerg Roedel1-60/+0
This function is not longer required. So remove it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: clean up nested vmrun pathJoerg Roedel1-12/+22
This patch removes the usage of nested_svm_do from the vmrun emulation path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: clean up nestec vmload/vmsave pathsJoerg Roedel1-19/+17
This patch removes the usage of nested_svm_do from the vmload and vmsave emulation code paths. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: clean up nested_svm_exit_handled_msrJoerg Roedel1-17/+21
This patch changes nested svm to call nested_svm_exit_handled_msr directly and not through nested_svm_do. [alex: fix oops due to nested kmap_atomics] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: get rid of nested_svm_vmexit_realJoerg Roedel1-12/+40
This patch is the starting point of removing nested_svm_do from the nested svm code. The nested_svm_do function basically maps two guest physical pages to host virtual addresses and calls a passed function on it. This function pointer code flow is hard to read and not the best technical solution here. As a side effect this patch indroduces the nested_svm_[un]map helper functions. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: simplify nested_svm_check_exceptionJoerg Roedel1-11/+8
Makes the code of this function more readable by removing on indentation level for the core logic. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: do nested vmexit in nested_svm_exit_handledJoerg Roedel1-23/+19
If this function returns true a nested vmexit is required. Move that vmexit into the nested_svm_exit_handled function. This also simplifies the handling of nested #pf intercepts in this function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: consolidate nested_svm_exit_handledJoerg Roedel1-60/+49
When caching guest intercepts there is no need anymore for the nested_svm_exit_handled_real function. So move its code into nested_svm_exit_handled. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: cache nested interceptsJoerg Roedel1-7/+23
When the nested intercepts are cached we don't need to call get_user_pages and/or map the nested vmcb on every nested #vmexit to check who will handle the intercept. Further this patch aligns the emulated svm behavior better to real hardware. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>