path: root/Documentation
diff options
authorLinus Torvalds <torvalds@linux-foundation.org>2013-02-24 13:07:18 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2013-02-24 13:07:18 -0800
commit89f883372fa60f604d136924baf3e89ff1870e9e (patch)
treecb69b0a14957945ba00d3d392bf9ccbbef56f3b8 /Documentation
parentMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal (diff)
parentRevert "KVM: MMU: lazily drop large spte" (diff)
Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Marcelo Tosatti: "KVM updates for the 3.9 merge window, including x86 real mode emulation fixes, stronger memory slot interface restrictions, mmu_lock spinlock hold time reduction, improved handling of large page faults on shadow, initial APICv HW acceleration support, s390 channel IO based virtio, amongst others" * tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits) Revert "KVM: MMU: lazily drop large spte" x86: pvclock kvm: align allocation size to page size KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr x86 emulator: fix parity calculation for AAD instruction KVM: PPC: BookE: Handle alignment interrupts booke: Added DBCR4 SPR number KVM: PPC: booke: Allow multiple exception types KVM: PPC: booke: use vcpu reference from thread_struct KVM: Remove user_alloc from struct kvm_memory_slot KVM: VMX: disable apicv by default KVM: s390: Fix handling of iscs. KVM: MMU: cleanup __direct_map KVM: MMU: remove pt_access in mmu_set_spte KVM: MMU: cleanup mapping-level KVM: MMU: lazily drop large spte KVM: VMX: cleanup vmx_set_cr0(). KVM: VMX: add missing exit names to VMX_EXIT_REASONS array KVM: VMX: disable SMEP feature when guest is in non-paging mode KVM: Remove duplicate text in api.txt Revert "KVM: MMU: split kvm_mmu_free_page" ...
Diffstat (limited to 'Documentation')
2 files changed, 85 insertions, 30 deletions
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index e0fa0ea2b187..119358dfb742 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -219,19 +219,6 @@ allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.
-On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
-threads in one or more virtual CPU cores. (This is because the
-hardware requires all the hardware threads in a CPU core to be in the
-same partition.) The KVM_CAP_PPC_SMT capability indicates the number
-of vcpus per virtual core (vcore). The vcore id is obtained by
-dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
-given vcore will always be in the same physical core as each other
-(though that might be a different physical core from time to time).
-Userspace can control the threading (SMT) mode of the guest by its
-allocation of vcpu ids. For example, if userspace wants
-single-threaded guest vcpus, it should make all vcpu ids be a multiple
-of the number of vcpus per vcore.
For virtual cpus that have been created with S390 user controlled virtual
machines, the resulting vcpu fd can be memory mapped at page offset
KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
@@ -345,7 +332,7 @@ struct kvm_sregs {
__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
-/* ppc -- see arch/powerpc/include/asm/kvm.h */
+/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
@@ -892,12 +879,12 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.
-The flags field supports two flag, KVM_MEM_LOG_DIRTY_PAGES, which instructs
-kvm to keep track of writes to memory within the slot. See KVM_GET_DIRTY_LOG
-ioctl. The KVM_CAP_READONLY_MEM capability indicates the availability of the
-KVM_MEM_READONLY flag. When this flag is set for a memory region, KVM only
-allows read accesses. Writes will be posted to userspace as KVM_EXIT_MMIO
+The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
+KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
+writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to
+use it. The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
+to make a new slot read-only. In this case, writes to this memory will be
+posted to userspace as KVM_EXIT_MMIO exits.
When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
@@ -931,7 +918,7 @@ documentation when it pops into existence).
-Architectures: ppc
+Architectures: ppc, s390
Type: vcpu ioctl
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error
@@ -1792,6 +1779,7 @@ registers, find a list below:
ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:
@@ -2108,6 +2096,14 @@ KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
+KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
+ I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
+ I/O interruption parameters in parm (subchannel) and parm64 (intparm,
+ interruption subclass)
+KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
+ machine check interrupt code in parm64 (note that
+ machine checks needing further payload are not
+ supported by this ioctl)
Note that the vcpu ioctl is asynchronous to vcpu execution.
@@ -2359,8 +2355,8 @@ executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.
- and KVM_EXIT_PAPR the corresponding
+ KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN. The kernel side will first finish
incomplete operations and then check for pending signals. Userspace
@@ -2463,6 +2459,41 @@ The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).
+ /* KVM_EXIT_S390_TSCH */
+ struct {
+ __u16 subchannel_id;
+ __u16 subchannel_nr;
+ __u32 io_int_parm;
+ __u32 io_int_word;
+ __u32 ipb;
+ __u8 dequeued;
+ } s390_tsch;
+s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
+and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
+interrupt for the target subchannel has been dequeued and subchannel_id,
+subchannel_nr, io_int_parm and io_int_word contain the parameters for that
+interrupt. ipb is needed for instruction parameter decoding.
+ /* KVM_EXIT_EPR */
+ struct {
+ __u32 epr;
+ } epr;
+On FSL BookE PowerPC chips, the interrupt controller has a fast patch
+interrupt acknowledge path to the core. When the core successfully
+delivers an interrupt, it automatically populates the EPR register with
+the interrupt vector number and acknowledges the interrupt inside
+the interrupt controller.
+In case the interrupt controller lives in user space, we need to do
+the interrupt acknowledge cycle through it to fetch the next to be
+delivered interrupt vector using this exit.
+It gets triggered whenever both KVM_CAP_PPC_EPR are enabled and an
+external interrupt has just been delivered into the guest. User space
+should put the acknowledged interrupt vector into the 'epr' field.
/* Fix the size of the union. */
char padding[256];
@@ -2584,3 +2615,34 @@ For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
- The tsize field of mas1 shall be set to 4K on TLB0, even though the
hardware ignores this value for TLB0.
+Architectures: s390
+Parameters: none
+Returns: 0 on success; -1 on error
+This capability enables support for handling of channel I/O instructions.
+TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
+handled in-kernel, while the other I/O instructions are passed to userspace.
+When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
+SUBCHANNEL intercepts.
+Architectures: ppc
+Parameters: args[0] defines whether the proxy facility is active
+Returns: 0 on success; -1 on error
+This capability enables or disables the delivery of interrupts through the
+external proxy facility.
+When enabled (args[0] != 0), every time the guest gets an external interrupt
+delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
+to receive the topmost interrupt vector.
+When disabled (args[0] == 0), behavior is as if this facility is unsupported.
+When this capability is enabled, KVM_EXIT_EPR can occur.
diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index fa5f1dbc6b23..43fcb761ed16 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -187,13 +187,6 @@ Shadow pages contain the following information:
perform a reverse map from a pte to a gfn. When role.direct is set, any
element of this array can be calculated from the gfn field when used, in
this case, the array of gfns is not allocated. See role.direct and gfn.
- slot_bitmap:
- A bitmap containing one bit per memory slot. If the page contains a pte
- mapping a page from memory slot n, then bit n of slot_bitmap will be set
- (if a page is aliased among several slots, then it is not guaranteed that
- all slots will be marked).
- Used during dirty logging to avoid scanning a shadow page if none if its
- pages need tracking.
A counter keeping track of how many hardware registers (guest cr3 or
pdptrs) are now pointing at the page. While this counter is nonzero, the