aboutsummaryrefslogtreecommitdiffstats
path: root/arch/arm/include/asm/kvm_mmu.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2015-04-16Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds1-0/+10
Pull arm64 updates from Will Deacon: "Here are the core arm64 updates for 4.1. Highlights include a significant rework to head.S (allowing us to boot on machines with physical memory at a really high address), an AES performance boost on Cortex-A57 and the ability to run a 32-bit userspace with 64k pages (although this requires said userspace to be built with a recent binutils). The head.S rework spilt over into KVM, so there are some changes under arch/arm/ which have been acked by Marc Zyngier (KVM co-maintainer). In particular, the linker script changes caused us some issues in -next, so there are a few merge commits where we had to apply fixes on top of a stable branch. Other changes include: - AES performance boost for Cortex-A57 - AArch32 (compat) userspace with 64k pages - Cortex-A53 erratum workaround for #845719 - defconfig updates (new platforms, PCI, ...)" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (39 commits) arm64: fix midr range for Cortex-A57 erratum 832075 arm64: errata: add workaround for cortex-a53 erratum #845719 arm64: Use bool function return values of true/false not 1/0 arm64: defconfig: updates for 4.1 arm64: Extract feature parsing code from cpu_errata.c arm64: alternative: Allow immediate branch as alternative instruction arm64: insn: Add aarch64_insn_decode_immediate ARM: kvm: round HYP section to page size instead of log2 upper bound ARM: kvm: assert on HYP section boundaries not actual code size arm64: head.S: ensure idmap_t0sz is visible arm64: pmu: add support for interrupt-affinity property dt: pmu: extend ARM PMU binding to allow for explicit interrupt affinity arm64: head.S: ensure visibility of page tables arm64: KVM: use ID map with increased VA range if required arm64: mm: increase VA range of identity map ARM: kvm: implement replacement for ld's LOG2CEIL() arm64: proc: remove unused cpu_get_pgd macro arm64: enforce x1|x2|x3 == 0 upon kernel entry as per boot protocol arm64: remove __calc_phys_offset arm64: merge __enable_mmu and __turn_mmu_on ...
2015-03-23arm64: KVM: use ID map with increased VA range if requiredArd Biesheuvel1-0/+10
This patch modifies the HYP init code so it can deal with system RAM residing at an offset which exceeds the reach of VA_BITS. Like for EL1, this involves configuring an additional level of translation for the ID map. However, in case of EL2, this implies that all translations use the extra level, as we cannot seamlessly switch between translation tables with different numbers of translation levels. So add an extra translation table at the root level. Since the ID map and the runtime HYP map are guaranteed not to overlap, they can share this root level, and we can essentially merge these two tables into one. Tested-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-03-11arm64: KVM: Do not use pgd_index to index stage-2 pgdMarc Zyngier1-1/+2
The kernel's pgd_index macro is designed to index a normal, page sized array. KVM is a bit diffferent, as we can use concatenated pages to have a bigger address space (for example 40bit IPA with 4kB pages gives us an 8kB PGD. In the above case, the use of pgd_index will always return an index inside the first 4kB, which makes a guest that has memory above 0x8000000000 rather unhappy, as it spins forever in a page fault, whist the host happilly corrupts the lower pgd. The obvious fix is to get our own kvm_pgd_index that does the right thing(tm). Tested on X-Gene with a hacked kvmtool that put memory at a stupidly high address. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-11arm64: KVM: Fix stage-2 PGD allocation to have per-page refcountingMarc Zyngier1-6/+4
We're using __get_free_pages with to allocate the guest's stage-2 PGD. The standard behaviour of this function is to return a set of pages where only the head page has a valid refcount. This behaviour gets us into trouble when we're trying to increment the refcount on a non-head page: page:ffff7c00cfb693c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x4000000000000000() page dumped because: VM_BUG_ON_PAGE((*({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) *)&((&page->_count)->counter); })) <= 0) BUG: failure at include/linux/mm.h:548/get_page()! Kernel panic - not syncing: BUG! CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825 Hardware name: APM X-Gene Mustang board (DT) Call trace: [<ffff80000008a09c>] dump_backtrace+0x0/0x13c [<ffff80000008a1e8>] show_stack+0x10/0x1c [<ffff800000691da8>] dump_stack+0x74/0x94 [<ffff800000690d78>] panic+0x100/0x240 [<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc [<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0 [<ffff8000000a420c>] handle_exit+0x58/0x180 [<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c [<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754 [<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8 [<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78 CPU0: stopping A possible approach for this is to split the compound page using split_page() at allocation time, and change the teardown path to free one page at a time. It turns out that alloc_pages_exact() and free_pages_exact() does exactly that. While we're at it, the PGD allocation code is reworked to reduce duplication. This has been tested on an X-Gene platform with a 4kB/48bit-VA host kernel, and kvmtool hacked to place memory in the second page of the hardware PGD (PUD for the host kernel). Also regression-tested on a Cubietruck (Cortex-A7). [ Reworked to use alloc_pages_exact() and free_pages_exact() and to return pointers directly instead of by reference as arguments - Christoffer ] Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-02-23ARM: KVM: Fix size check in __coherent_cache_guest_pageJan Kiszka1-1/+1
The check is supposed to catch page-unaligned sizes, not the inverse. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-02-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+21
Pull KVM update from Paolo Bonzini: "Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: The highlights are support for GICv3 emulation and dirty page tracking s390: Several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. Powerpc: Nothing yet. The KVM/PPC changes will come in through the PPC maintainers, because I haven't received them yet and I might end up being offline for some part of next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: ia64: drop kvm.h from installed user headers KVM: x86: fix build with !CONFIG_SMP KVM: x86: emulate: correct page fault error code for NoWrite instructions KVM: Disable compat ioctl for s390 KVM: s390: add cpu model support KVM: s390: use facilities and cpu_id per KVM KVM: s390/CPACF: Choose crypto control block format s390/kernel: Update /proc/sysinfo file with Extended Name and UUID KVM: s390: reenable LPP facility KVM: s390: floating irqs: fix user triggerable endless loop kvm: add halt_poll_ns module parameter kvm: remove KVM_MMIO_SIZE KVM: MIPS: Don't leak FPU/DSP to guest KVM: MIPS: Disable HTW while in guest KVM: nVMX: Enable nested posted interrupt processing KVM: nVMX: Enable nested virtual interrupt delivery KVM: nVMX: Enable nested apic register virtualization KVM: nVMX: Make nested control MSRs per-cpu KVM: nVMX: Enable nested virtualize x2apic mode KVM: nVMX: Prepare for using hardware MSR bitmap ...
2015-01-29arm/arm64: KVM: Use kernel mapping to perform invalidation on page faultMarc Zyngier1-9/+34
When handling a fault in stage-2, we need to resync I$ and D$, just to be sure we don't leave any old cache line behind. That's very good, except that we do so using the *user* address. Under heavy load (swapping like crazy), we may end up in a situation where the page gets mapped in stage-2 while being unmapped from userspace by another CPU. At that point, the DC/IC instructions can generate a fault, which we handle with kvm->mmu_lock held. The box quickly deadlocks, user is unhappy. Instead, perform this invalidation through the kernel mapping, which is guaranteed to be present. The box is much happier, and so am I. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-29arm/arm64: KVM: Invalidate data cache on unmapMarc Zyngier1-0/+31
Let's assume a guest has created an uncached mapping, and written to that page. Let's also assume that the host uses a cache-coherent IO subsystem. Let's finally assume that the host is under memory pressure and starts to swap things out. Before this "uncached" page is evicted, we need to make sure we invalidate potential speculated, clean cache lines that are sitting there, or the IO subsystem is going to swap out the cached view, loosing the data that has been written directly into memory. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-29arm/arm64: KVM: Use set/way op trapping to track the state of the cachesMarc Zyngier1-1/+2
Trying to emulate the behaviour of set/way cache ops is fairly pointless, as there are too many ways we can end-up missing stuff. Also, there is some system caches out there that simply ignore set/way operations. So instead of trying to implement them, let's convert it to VA ops, and use them as a way to re-enable the trapping of VM ops. That way, we can detect the point when the MMU/caches are turned off, and do a full VM flush (which is what the guest was trying to do anyway). This allows a 32bit zImage to boot on the APM thingy, and will probably help bootloaders in general. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-16KVM: arm: Add initial dirty page locking supportMario Smarduch1-0/+21
Add support for initial write protection of VM memslots. This patch series assumes that huge PUDs will not be used in 2nd stage tables, which is always valid on ARMv7 Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
2014-12-13arm/arm64: KVM: Introduce stage2_unmap_vmChristoffer Dall1-0/+1
Introduce a new function to unmap user RAM regions in the stage2 page tables. This is needed on reboot (or when the guest turns off the MMU) to ensure we fault in pages again and make the dcache, RAM, and icache coherent. Using unmap_stage2_range for the whole guest physical range does not work, because that unmaps IO regions (such as the GIC) which will not be recreated or in the best case faulted in on a page-by-page basis. Call this function on secondary and subsequent calls to the KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest Stage-1 MMU is off when faulting in pages and make the caches coherent. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-11-25arm, arm64: KVM: allow forced dcache flush on page faultsLaszlo Ersek1-2/+3
To allow handling of incoherent memslots in a subsequent patch, this patch adds a paramater 'ipa_uncached' to cache_coherent_guest_page() so that we can instruct it to flush the page's contents to DRAM even if the guest has caching globally enabled. Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-10-14arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2Christoffer Dall1-3/+26
This patch adds the necessary support for all host kernel PGSIZE and VA_SPACE configuration options for both EL2 and the Stage-2 page tables. However, for 40bit and 42bit PARange systems, the architecture mandates that VTCR_EL2.SL0 is maximum 1, resulting in fewer levels of stage-2 pagge tables than levels of host kernel page tables. At the same time, systems with a PARange > 42bit, we limit the IPA range by always setting VTCR_EL2.T0SZ to 24. To solve the situation with different levels of page tables for Stage-2 translation than the host kernel page tables, we allocate a dummy PGD with pointers to our actual inital level Stage-2 page table, in order for us to reuse the kernel pgtable manipulation primitives. Reproducing all these in KVM does not look pretty and unnecessarily complicates the 32-bit side. Systems with a PARange < 40bits are not yet supported. [ I have reworked this patch from its original form submitted by Jungseok to take the architecture constraints into consideration. There were too many changes from the original patch for me to preserve the authorship. Thanks to Catalin Marinas for his help in figuring out a good solution to this challenge. I have also fixed various bugs and missing error code handling from the original patch. - Christoffer ] Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-10arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremapArd Biesheuvel1-1/+1
Add support for read-only MMIO passthrough mappings by adding a 'writable' parameter to kvm_phys_addr_ioremap. For the moment, mappings will be read-write even if 'writable' is false, but once the definition of PAGE_S2_DEVICE gets changed, those mappings will be created read-only. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-09-11ARM/arm64: KVM: fix use of WnR bit in kvm_is_write_fault()Ard Biesheuvel1-11/+0
The ISS encoding for an exception from a Data Abort has a WnR bit[6] that indicates whether the Data Abort was caused by a read or a write instruction. While there are several fields in the encoding that are only valid if the ISV bit[24] is set, WnR is not one of them, so we can read it unconditionally. Instead of fixing both implementations of kvm_is_write_fault() in place, reimplement it just once using kvm_vcpu_dabt_iswrite(), which already does the right thing with respect to the WnR bit. Also fix up the callers to pass 'vcpu' Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-07-11arm/arm64: KVM: Fix and refactor unmap_rangeChristoffer Dall1-0/+12
unmap_range() was utterly broken, to quote Marc, and broke in all sorts of situations. It was also quite complicated to follow and didn't follow the usual scheme of having a separate iterating function for each level of page tables. Address this by refactoring the code and introduce a pgd_clear() function. Reviewed-by: Jungseok Lee <jays.lee@samsung.com> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03ARM: KVM: force cache clean on page fault when caches are offMarc Zyngier1-1/+10
In order for a guest with caches disabled to observe data written contained in a given page, we need to make sure that page is committed to memory, and not just hanging in the cache (as guest accesses are completely bypassing the cache until it decides to enable it). For this purpose, hook into the coherent_cache_guest_page function and flush the region if the guest SCTLR register doesn't show the MMU and caches as being enabled. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-03arm64: KVM: flush VM pages before letting the guest enable cachesMarc Zyngier1-0/+2
When the guest runs with caches disabled (like in an early boot sequence, for example), all the writes are diectly going to RAM, bypassing the caches altogether. Once the MMU and caches are enabled, whatever sits in the cache becomes suddenly visible, which isn't what the guest expects. A way to avoid this potential disaster is to invalidate the cache when the MMU is being turned on. For this, we hook into the SCTLR_EL1 trapping code, and scan the stage-2 page tables, invalidating the pages/sections that have already been mapped in. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03ARM: KVM: introduce kvm_p*d_addr_endMarc Zyngier1-0/+13
The use of p*d_addr_end with stage-2 translation is slightly dodgy, as the IPA is 40bits, while all the p*d_addr_end helpers are taking an unsigned long (arm64 is fine with that as unligned long is 64bit). The fix is to introduce 64bit clean versions of the same helpers, and use them in the stage-2 page table code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03arm64: KVM: force cache clean on page fault when caches are offMarc Zyngier1-2/+2
In order for the guest with caches off to observe data written contained in a given page, we need to make sure that page is committed to memory, and not just hanging in the cache (as guest accesses are completely bypassing the cache until it decides to enable it). For this purpose, hook into the coherent_icache_guest_page function and flush the region if the guest SCTLR_EL1 register doesn't show the MMU and caches as being enabled. The function also get renamed to coherent_cache_guest_page. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-11arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappingsSantosh Shilimkar1-0/+1
KVM initialisation fails on architectures implementing virt_to_idmap() because virt_to_phys() on such architectures won't fetch you the correct idmap page. So update the KVM ARM code to use the virt_to_idmap() to fix the issue. Since the KVM code is shared between arm and arm64, we create kvm_virt_to_phys() and handle the redirection in respective headers. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17KVM: ARM: Support hugetlbfs backed huge pagesChristoffer Dall1-3/+14
Support huge pages in KVM/ARM and KVM/ARM64. The pud_huge checking on the unmap path may feel a bit silly as the pud_huge check is always defined to false, but the compiler should be smart about this. Note: This deals only with VMAs marked as huge which are allocated by users through hugetlbfs only. Transparent huge pages can only be detected by looking at the underlying pages (or the page tables themselves) and this patch so far simply maps these on a page-by-page level in the Stage-2 page tables. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-30ARM: KVM: Fix kvm_set_pte assignmentChristoffer Dall1-1/+1
THe kvm_set_pte function was actually assigning the entire struct to the structure member, which should work because the structure only has that one member, but it is still not very nice. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-04-28ARM: KVM: perform HYP initilization for hotplugged CPUsMarc Zyngier1-0/+1
Now that we have the necessary infrastructure to boot a hotplugged CPU at any point in time, wire a CPU notifier that will perform the HYP init for the incoming CPU. Note that this depends on the platform code and/or firmware to boot the incoming CPU with HYP mode enabled and return to the kernel by following the normal boot path (HYP stub installed). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28ARM: KVM: switch to a dual-step HYP init codeMarc Zyngier1-3/+21
Our HYP init code suffers from two major design issues: - it cannot support CPU hotplug, as we tear down the idmap very early - it cannot perform a TLB invalidation when switching from init to runtime mappings, as pages are manipulated from PL1 exclusively The hotplug problem mandates that we keep two sets of page tables (boot and runtime). The TLB problem mandates that we're able to transition from one PGD to another while in HYP, invalidating the TLBs in the process. To be able to do this, we need to share a page between the two page tables. A page that will have the same VA in both configurations. All we need is a VA that has the following properties: - This VA can't be used to represent a kernel mapping. - This VA will not conflict with the physical address of the kernel text The vectors page seems to satisfy this requirement: - The kernel never maps anything else there - The kernel text being copied at the beginning of the physical memory, it is unlikely to use the last 64kB (I doubt we'll ever support KVM on a system with something like 4MB of RAM, but patches are very welcome). Let's call this VA the trampoline VA. Now, we map our init page at 3 locations: - idmap in the boot pgd - trampoline VA in the boot pgd - trampoline VA in the runtime pgd The init scenario is now the following: - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd, runtime stack, runtime vectors - Enable the MMU with the boot pgd - Jump to a target into the trampoline page (remember, this is the same physical page!) - Now switch to the runtime pgd (same VA, and still the same physical page!) - Invalidate TLBs - Set stack and vectors - Profit! (or eret, if you only care about the code). Note that we keep the boot mapping permanently (it is not strictly an idmap anymore) to allow for CPU hotplug in later patches. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28ARM: KVM: rework HYP page table freeingMarc Zyngier1-1/+1
There is no point in freeing HYP page tables differently from Stage-2. They now have the same requirements, and should be dealt with the same way. Promote unmap_stage2_range to be The One True Way, and get rid of a number of nasty bugs in the process (good thing we never actually called free_hyp_pmds before...). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28ARM: KVM: move to a KVM provided HYP idmapMarc Zyngier1-1/+0
After the HYP page table rework, it is pretty easy to let the KVM code provide its own idmap, rather than expecting the kernel to provide it. It takes actually less code to do so. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06ARM: KVM: move include of asm/idmap.h to kvm_mmu.hMarc Zyngier1-0/+1
Since the arm64 code doesn't have a global asm/idmap.h file, move the inclusion to asm/kvm_mmu.h. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06ARM: KVM: allow HYP mappings to be at an offset from kernel mappingsMarc Zyngier1-0/+8
arm64 cannot represent the kernel VAs in HYP mode, because of the lack of TTBR1 at EL2. A way to cope with this situation is to have HYP VAs to be an offset from the kernel VAs. Introduce macros to convert a kernel VA to a HYP VA, make the HYP mapping functions use these conversion macros. Also change the documentation to reflect the existence of the offset. On ARM, where we can have an identity mapping between kernel and HYP, the macros are without any effect. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06ARM: KVM: abstract most MMU operationsMarc Zyngier1-0/+58
Move low level MMU-related operations to kvm_mmu.h. This makes the MMU code reusable by the arm64 port. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-01-23KVM: ARM: Handle guest faults in KVMChristoffer Dall1-0/+12
Handles the guest faults in KVM by mapping in corresponding user pages in the 2nd stage page tables. We invalidate the instruction cache by MVA whenever we map a page to the guest (no, we cannot only do it when we have an iabt because the guest may happily read/write a page before hitting the icache) if the hardware uses VIPT or PIPT. In the latter case, we can invalidate only that physical page. In the first case, all bets are off and we simply must invalidate the whole affair. Not that VIVT icaches are tagged with vmids, and we are out of the woods on that one. Alexander Graf was nice enough to remind us of this massive pain. Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23KVM: ARM: Memory virtualization setupChristoffer Dall1-0/+9
This commit introduces the framework for guest memory management through the use of 2nd stage translation. Each VM has a pointer to a level-1 table (the pgd field in struct kvm_arch) which is used for the 2nd stage translations. Entries are added when handling guest faults (later patch) and the table itself can be allocated and freed through the following functions implemented in arch/arm/kvm/arm_mmu.c: - kvm_alloc_stage2_pgd(struct kvm *kvm); - kvm_free_stage2_pgd(struct kvm *kvm); Each entry in TLBs and caches are tagged with a VMID identifier in addition to ASIDs. The VMIDs are assigned consecutively to VMs in the order that VMs are executed, and caches and tlbs are invalidated when the VMID space has been used to allow for more than 255 simultaenously running guests. The 2nd stage pgd is allocated in kvm_arch_init_vm(). The table is freed in kvm_arch_destroy_vm(). Both functions are called from the main KVM code. We pre-allocate page table memory to be able to synchronize using a spinlock and be called under rcu_read_lock from the MMU notifiers. We steal the mmu_memory_cache implementation from x86 and adapt for our specific usage. We support MMU notifiers (thanks to Marc Zyngier) through kvm_unmap_hva and kvm_set_spte_hva. Finally, define kvm_phys_addr_ioremap() to map a device at a guest IPA, which is used by VGIC support to map the virtual CPU interface registers to the guest. This support is added by Marc Zyngier. Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23KVM: ARM: Hypervisor initializationChristoffer Dall1-0/+29
Sets up KVM code to handle all exceptions taken to Hyp mode. When the kernel is booted in Hyp mode, calling an hvc instruction with r0 pointing to the new vectors, the HVBAR is changed to the the vector pointers. This allows subsystems (like KVM here) to execute code in Hyp-mode with the MMU disabled. We initialize other Hyp-mode registers and enables the MMU for Hyp-mode from the id-mapped hyp initialization code. Afterwards, the HVBAR is changed to point to KVM Hyp vectors used to catch guest faults and to switch to Hyp mode to perform a world-switch into a KVM guest. Also provides memory mapping code to map required code pages, data structures, and I/O regions accessed in Hyp mode at the same virtual address as the host kernel virtual addresses, but which conforms to the architectural requirements for translations in Hyp mode. This interface is added in arch/arm/kvm/arm_mmu.c and comprises: - create_hyp_mappings(from, to); - create_hyp_io_mappings(from, to, phys_addr); - free_hyp_pmds(); Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>