aboutsummaryrefslogtreecommitdiffstats
path: root/arch (follow)
AgeCommit message (Collapse)AuthorFilesLines
2010-05-17KVM: use the correct RCU API for PROVE_RCU=yLai Jiangshan6-8/+16
The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17Merge branch 'perf'Avi Kivity40-3272/+3189
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: MMU: cleanup for hlist walk restartXiao Guangrong1-5/+10
Quote from Avi: |Just change the assignment to a 'goto restart;' please, |I don't like playing with list_for_each internals. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: prevent spurious exit to userspace during task switch emulation.Gleb Natapov2-4/+14
If kvm_task_switch() fails code exits to userspace without specifying exit reason, so the previous exit reason is reused by userspace. Fix this by specifying exit reason correctly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: remove unused parameter in mmu_parent_walk()Xiao Guangrong1-13/+11
'vcpu' is unused, remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: reduce 'struct kvm_mmu_page' sizeXiao Guangrong1-2/+2
Define 'multimapped' as 'bool'. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: remove unused struct kvm_unsync_walkXiao Guangrong1-5/+0
Remove 'struct kvm_unsync_walk' since it's not used. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: fix emulator_task_switch() return value.Gleb Natapov2-4/+5
emulator_task_switch() should return -1 for failure and 0 for success to the caller, just like x86_emulate_insn() does. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: Replace role.glevels with role.cr4_paeAvi Kivity3-9/+10
There is no real distinction between glevels=3 and glevels=4; both have exactly the same format and the code is treated exactly the same way. Drop role.glevels and replace is with role.cr4_pae (which is meaningful). This simplifies the code a bit. As a side effect, it allows sharing shadow page tables between pae and longmode guest page tables at the same guest page. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86: Push potential exception error code on task switchesJan Kiszka7-10/+48
When a fault triggers a task switch, the error code, if existent, has to be pushed on the new task's stack. Implement the missing bits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86: Terminate early if task_switch_16/32 failedJan Kiszka1-0/+2
Stop the switch immediately if task_switch_16/32 returned an error. Only if that step succeeded, the switch should actually take place and update any register states. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86: get rid of mmu_only parameter in emulator_write_emulated()Gleb Natapov1-25/+11
We can call kvm_mmu_pte_write() directly from emulator_cmpxchg_emulated() instead of passing mmu_only down to emulator_write_emulated_onepage() and call it there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: move DR register access handling into generic codeGleb Natapov4-134/+93
Currently both SVM and VMX have their own DR handling code. Move it to x86.c. Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: SVM: implement NEXTRIPsave SVM featureAndre Przywara2-6/+11
On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB containing the desired information. Since it is cheap to do so, we use this field to override the guessed value on newer processors. A fix for older CPUs would be rather expensive, as it would require to fetch and partially decode the instruction. As the problem is not a security issue and needs special, handcrafted code to trigger (no compiler will ever generate such code), I omit a fix for older CPUs. If someone is interested, I have both a patch for these CPUs as well as demo code triggering this issue: It segfaults under KVM, but runs perfectly on native Linux. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Fix MAXPHYADDR calculation when cpuid does not support itAvi Kivity1-0/+4
MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Trace emulated instructionsAvi Kivity2-0/+90
Log emulated instructions in ftrace, especially if they failed. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86 emulator: Don't overwrite decode cacheAvi Kivity1-9/+10
Currently if we an instruction spans a page boundary, when we fetch the second half we overwrite the first half. This prevents us from tracing the full instruction opcodes. Fix by appending the second half to the first. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: PPC: Add dequeue for external on BookEAlexander Graf1-0/+6
Commit a0abee86af2d1f048dbe99d2bcc4a2cefe685617 introduced unsetting of the IRQ line from userspace. This added a new core specific callback that I apparently forgot to add for BookE. So let's add the callback for BookE as well, making it build again. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: remove unnecessary NX check in walk_addrXiao Guangrong1-1/+1
After is_rsvd_bits_set() checks, EFER.NXE must be enabled if NX bit is seted Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: remove unused fieldXiao Guangrong2-3/+0
kvm_mmu_page.oos_link is not used, so remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: MMU: cleanup/fix mmu audit codeXiao Guangrong1-7/+8
This patch does: - 'sp' parameter in inspect_spte_fn() is not used, so remove it - fix 'kvm' and 'slots' is not defined in count_rmaps() - fix a bug in inspect_spte_has_rmap() Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: PPC: Don't export Book3S symbols on BookEAlexander Graf1-0/+2
Book3S knows how to convert floats to doubles and vice versa. BookE doesn't. So let's make sure we don't export them on BookE. This fixes a link error on BookE with CONFIG_KVM=y. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Only use QPRs when availableAlexander Graf1-0/+2
BookE KVM doesn't know about QPRs, so let's not try to access then. This fixes a build error on BookE KVM. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Disable MSR_FEx for Cell hostsAlexander Graf1-0/+4
Cell can't handle MSR_FE0 and MSR_FE1 too well. It gets dog slow. So let's just override the guest whenever we see one of the two and mask them out. See commit ddf5f75a16b3e7460ffee881795aa168dffcd0cf for reference. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Make bools bitfieldsAlexander Graf2-17/+17
Bool defaults to at least byte width. We usually only want to waste a single bit on this. So let's move all the bool values to bitfields, potentially saving memory. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Use ULL for big numbersAlexander Graf1-6/+6
Some constants were bigger than ints. Let's mark them as such so we don't accidently truncate them. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Add check if pte was mapped secondaryAlexander Graf1-0/+7
Some HTAB providers (namely the PS3) ignore the SECONDARY flag. They just put an entry in the htab as secondary when they see fit. So we need to check the return value of htab_insert to remember the correct slot id so we can actually invalidate the entry again. Fixes KVM on the PS3. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Add emulation for dcbaAlexander Graf1-0/+4
Mac OS X uses the dcba instruction. According to the specification it doesn't guarantee any functionality, so let's just emulate it as nop. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Fix dcbz emulationAlexander Graf2-38/+37
On most systems we need to emulate dcbz when running 32 bit guests. So far we've been rather slack, not giving correct DSISR values to the guest. This patch makes the emulation more accurate, introducing a difference between "page not mapped" and "write protection fault". While at it, it also speeds up dcbz emulation by an order of magnitude by using kmap. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Make build work without CONFIG_VSX/ALTIVECAlexander Graf1-0/+8
The FPU/Altivec/VSX enablement also brought access to some structure elements that are only defined when the respective config options are enabled. Unfortuately I forgot to check for the config options at some places, so let's do that now. Unbreaks the build when CONFIG_VSX is not set. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Add OSI hypercall interfaceAlexander Graf4-6/+37
MOL uses its own hypercall interface to call back into userspace when the guest wants to do something. So let's implement that as an exit reason, specify it with a CAP and only really use it when userspace wants us to. The only user of it so far is MOL. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: Add support for enabling capabilities per-vcpuAlexander Graf1-0/+27
Some times we don't want all capabilities to be available to all our vcpus. One example for that is the OSI interface, implemented in the next patch. In order to have a generic mechanism in how to enable capabilities individually, this patch introduces a new ioctl that can be used for this purpose. That way features we don't want in all guests or userspace configurations can just not be enabled and we're good. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Implement alignment interruptAlexander Graf3-0/+87
Mac OS X has some applications - namely the Finder - that require alignment interrupts to work properly. So we need to implement them. But the spec for 970 and 750 also looks different. While 750 requires the DSISR and DAR fields to reflect some instruction bits (DSISR) and the fault address (DAR), the 970 declares this as an optional feature. So we need to reconstruct DSISR and DAR manually. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Implement emulation for lbzux and lhaxAlexander Graf1-0/+20
We get MMIOs with the weirdest instructions. But every time we do, we need to improve our emulator to implement them. So let's do that - this time it's lbzux and lhax's round. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Make XER load 32 bitAlexander Graf1-1/+1
We have a 32 bit value in the PACA to store XER in. We also do an stw when storing XER in there. But then we load it with ld, completely screwing it up on every entry. Welcome to the Big Endian world. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Implement BAT readsAlexander Graf1-0/+35
BATs can't only be written to, you can also read them out! So let's implement emulation for reading BAT values again. While at it, I also made BAT setting flush the segment cache, so we're absolutely sure there's no MMU state left when writing BATs. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Implement mfsr emulationAlexander Graf1-0/+13
We emulate the mfsrin instruction already, that passes the SR number in a register value. But we lacked support for mfsr that encoded the SR number in the opcode. So let's implement it. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Load VCPU for register fetchingAlexander Graf1-0/+8
When trying to read or store vcpu register data, we should also make sure the vcpu is actually loaded, so we're 100% sure we get the correct values. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Don't reload FPU with invalid valuesAlexander Graf1-0/+5
When the guest activates the FPU, we load it up. That's fine when it wasn't activated before on the host, but if it was we end up reloading FPU values from last time the FPU was deactivated on the host without writing the proper values back to the vcpu struct. This patch checks if the FPU is enabled already and if so just doesn't bother activating it, making FPU operations survive guest context switches. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Split instruction reading outAlexander Graf1-8/+16
The current check_ext function reads the instruction and then does the checking. Let's split the reading out so we can reuse it for different functions. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Book3S_32 guest MMU fixesAlexander Graf2-7/+24
This patch makes the VSID of mapped pages always reflecting all special cases we have, like split mode. It also changes the tlbie mask to 0x0ffff000 according to the spec. The mask we used before was incorrect. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Make DSISR 32 bits wideAlexander Graf3-3/+3
DSISR is only defined as 32 bits wide. So let's reflect that in the structs too. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Allow userspace to unset the IRQ lineAlexander Graf4-1/+16
Userspace can tell us that it wants to trigger an interrupt. But so far it can't tell us that it wants to stop triggering one. So let's interpret the parameter to the ioctl that we have anyways to tell us if we want to raise or lower the interrupt line. Signed-off-by: Alexander Graf <agraf@suse.de> v2 -> v3: - Add CAP for unset irq Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: PPC: Ensure split mode worksAlexander Graf2-26/+29
On PowerPC we can go into MMU Split Mode. That means that either data relocation is on but instruction relocation is off or vice versa. That mode didn't work properly, as we weren't always flushing entries when going into a new split mode, potentially mapping different code or data that we're supposed to. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: MMU: Disassociate direct maps from guest levelsAvi Kivity1-0/+2
Direct maps are linear translations for a section of memory, used for real mode or with large pages. As such, they are independent of the guest levels. Teach the mmu about this by making page->role.glevels = 0 for direct maps. This allows direct maps to be shared among real mode and the various paging modes. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: MMU: check reserved bits only if CR4.PSE=1 or CR4.PAE=1Xiao Guangrong1-3/+9
- Check reserved bits only if CR4.PAE=1 or CR4.PSE=1 when guest #PF occurs - Fix a typo in reset_rsvds_bits_mask() Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: x86: document KVM_REQ_PENDING_TIMER usageMarcelo Tosatti1-1/+2
Document that KVM_REQ_PENDING_TIMER is implicitly used during guest entry. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: x86 emulator: fix unlocked CMPXCHG8B emulationGleb Natapov1-1/+0
When CMPXCHG8B is executed without LOCK prefix it is racy. Preserve this behaviour in emulator too. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: x86 emulator: add decoding of CMPXCHG8B dst operandGleb Natapov1-14/+10
Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: x86 emulator: commit rflags as part of registers commitGleb Natapov3-2/+8
Make sure that rflags is committed only after successful instruction emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>