<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/arch/x86/kvm, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/arch/x86/kvm?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/arch/x86/kvm?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-11-11T12:19:46Z</updated>
<entry>
<title>KVM: x86/mmu: Block all page faults during kvm_zap_gfn_range()</title>
<updated>2022-11-11T12:19:46Z</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2022-11-11T00:18:41Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6d3085e4d89ad7e6c7f1c6cf929d903393565861'/>
<id>urn:sha1:6d3085e4d89ad7e6c7f1c6cf929d903393565861</id>
<content type='text'>
When zapping a GFN range, pass 0 =&gt; ALL_ONES for the to-be-invalidated
range to effectively block all page faults while the zap is in-progress.
The invalidation helpers take a host virtual address, whereas zapping a
GFN obviously provides a guest physical address, and in the wrong unit
of measurement (frame vs. byte).

Alternatively, KVM could walk all memslots to get the associated HVAs,
but thanks to SMM, that would require multiple lookups.  And practically
speaking, kvm_zap_gfn_range() usage is quite rare and not a hot path,
e.g. MTRR and CR0.CD are almost guaranteed to be done only on vCPU0
during boot, and APICv inhibits are similarly infrequent operations.

Fixes: edb298c663fc ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range")
Reported-by: Chao Peng &lt;chao.p.peng@linux.intel.com&gt;
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky &lt;mlevitsk@redhat.com&gt;
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Message-Id: &lt;20221111001841.2412598-1-seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: x86/pmu: Limit the maximum number of supported AMD GP counters</title>
<updated>2022-11-09T17:26:54Z</updated>
<author>
<name>Like Xu</name>
<email>likexu@tencent.com</email>
</author>
<published>2022-09-19T09:10:08Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=556f3c9ad7c101aa16a43ef4539f3aabc1d7b32e'/>
<id>urn:sha1:556f3c9ad7c101aa16a43ef4539f3aabc1d7b32e</id>
<content type='text'>
The AMD PerfMonV2 specification allows for a maximum of 16 GP counters,
but currently only 6 pairs of MSRs are accepted by KVM.

While AMD64_NUM_COUNTERS_CORE is already equal to 6, increasing it without
adjusting msrs_to_save_all[] could result in out-of-bounds accesses.
Therefore introduce a macro (named KVM_AMD_PMC_MAX_GENERIC) to
refer to the number of counters supported by KVM.

Signed-off-by: Like Xu &lt;likexu@tencent.com&gt;
Reviewed-by: Jim Mattson &lt;jmattson@google.com&gt;
Message-Id: &lt;20220919091008.60695-3-likexu@tencent.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: x86/pmu: Limit the maximum number of supported Intel GP counters</title>
<updated>2022-11-09T17:26:53Z</updated>
<author>
<name>Like Xu</name>
<email>likexu@tencent.com</email>
</author>
<published>2022-09-19T09:10:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=4f1fa2a1bbeb2feca436d2c86bf6f78dc4e5e4c4'/>
<id>urn:sha1:4f1fa2a1bbeb2feca436d2c86bf6f78dc4e5e4c4</id>
<content type='text'>
The address range of the Intel architectural IA32_PMCx MSRs allows for a
maximum of 8 GP counters, and KVM cannot address any more.  Introduce a
local macro (named KVM_INTEL_PMC_MAX_GENERIC) and use it consistently to
refer to the number of counters supported by KVM, thus avoiding possible
out-of-bounds accesses.

Suggested-by: Jim Mattson &lt;jmattson@google.com&gt;
Signed-off-by: Like Xu &lt;likexu@tencent.com&gt;
Reviewed-by: Jim Mattson &lt;jmattson@google.com&gt;
Message-Id: &lt;20220919091008.60695-2-likexu@tencent.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: x86/pmu: Do not speculatively query Intel GP PMCs that don't exist yet</title>
<updated>2022-11-09T17:26:53Z</updated>
<author>
<name>Like Xu</name>
<email>likexu@tencent.com</email>
</author>
<published>2022-09-19T09:10:06Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=8631ef59b62290c7d88e7209e35dfb47f33f4902'/>
<id>urn:sha1:8631ef59b62290c7d88e7209e35dfb47f33f4902</id>
<content type='text'>
The SDM lists an architectural MSR IA32_CORE_CAPABILITIES (0xCF)
that limits the theoretical maximum value of the Intel GP PMC MSRs
allocated at 0xC1 to 14; likewise the Intel April 2022 SDM adds
IA32_OVERCLOCKING_STATUS at 0x195 which limits the number of event
selection MSRs to 15 (0x186-0x194).

Limiting the maximum number of counters to 14 or 18 based on the currently
allocated MSRs is clearly fragile, and it seems likely that Intel will
even place PMCs 8-15 at a completely different range of MSR indices.
So stop at the maximum number of GP PMCs supported today on Intel
processors.

There are some machines, like Intel P4 with a non-architectural PMU, that
may indeed have 18 counters, but those counters are in a completely
different MSR address range and are not supported by KVM.

Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: stable@vger.kernel.org
Fixes: cf05a67b68b8 ("KVM: x86: omit "impossible" pmu MSRs from MSR list")
Suggested-by: Jim Mattson &lt;jmattson@google.com&gt;
Signed-off-by: Like Xu &lt;likexu@tencent.com&gt;
Reviewed-by: Jim Mattson &lt;jmattson@google.com&gt;
Message-Id: &lt;20220919091008.60695-1-likexu@tencent.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: SVM: Only dump VMSA to klog at KERN_DEBUG level</title>
<updated>2022-11-09T17:26:53Z</updated>
<author>
<name>Peter Gonda</name>
<email>pgonda@google.com</email>
</author>
<published>2022-11-04T14:22:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=0bd8bd2f7a789fe1dcb21ad148199d2f62d79873'/>
<id>urn:sha1:0bd8bd2f7a789fe1dcb21ad148199d2f62d79873</id>
<content type='text'>
Explicitly print the VMSA dump at KERN_DEBUG log level.  KERN_CONT uses
KERN_DEFAULT if the previous log line has a newline, i.e. if there's
nothing to continue, and as a result the VMSA gets dumped when it
shouldn't.

The KERN_CONT documentation says it defaults back to KERN_DEFAULT if the
previous log line has a newline. So switch from KERN_CONT to
print_hex_dump_debug().

Jarkko pointed this out in reference to the original patch. See:
https://lore.kernel.org/all/YuPMeWX4uuR1Tz3M@kernel.org/
print_hex_dump(KERN_DEBUG, ...) was pointed out there, but
print_hex_dump_debug() should behave similarly.

Fixes: 6fac42f127b8 ("KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog")
Signed-off-by: Peter Gonda &lt;pgonda@google.com&gt;
Reviewed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Cc: Jarkko Sakkinen &lt;jarkko@kernel.org&gt;
Cc: Harald Hoyer &lt;harald@profian.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: x86@kernel.org
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Message-Id: &lt;20221104142220.469452-1-pgonda@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers</title>
<updated>2022-11-09T17:26:51Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2022-09-30T18:48:24Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=bd3d394e367e66e773a6cb25a82c29b04464230b'/>
<id>urn:sha1:bd3d394e367e66e773a6cb25a82c29b04464230b</id>
<content type='text'>
x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly</title>
<updated>2022-11-09T17:25:53Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2022-09-30T18:24:40Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=9f2febf3f04daebdaaa5a43cfa20e3844905c0f9'/>
<id>urn:sha1:9f2febf3f04daebdaaa5a43cfa20e3844905c0f9</id>
<content type='text'>
Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson &lt;jmattson@google.com&gt;
Reviewed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: SVM: restore host save area from assembly</title>
<updated>2022-11-09T17:25:33Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2022-11-07T08:49:59Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e287bd005ad9d85dd6271dd795d3ecfb6bca46ad'/>
<id>urn:sha1:e287bd005ad9d85dd6271dd795d3ecfb6bca46ad</id>
<content type='text'>
Allow access to the percpu area via the GS segment base, which is
needed in order to access the saved host spec_ctrl value.  In linux-next
FILL_RETURN_BUFFER also needs to access percpu data.

For simplicity, the physical address of the save area is added to struct
svm_cpu_data.

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reported-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Analyzed-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Tested-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Reviewed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: SVM: move guest vmsave/vmload back to assembly</title>
<updated>2022-11-09T17:25:06Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2022-11-07T10:14:27Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e61ab42de874c5af8c5d98b327c77a374d9e7da1'/>
<id>urn:sha1:e61ab42de874c5af8c5d98b327c77a374d9e7da1</id>
<content type='text'>
It is error-prone that code after vmexit cannot access percpu data
because GSBASE has not been restored yet.  It forces MSR_IA32_SPEC_CTRL
save/restore to happen very late, after the predictor untraining
sequence, and it gets in the way of return stack depth tracking
(a retbleed mitigation that is in linux-next as of 2022-11-09).

As a first step towards fixing that, move the VMCB VMSAVE/VMLOAD to
assembly, essentially undoing commit fb0c4a4fee5a ("KVM: SVM: move
VMLOAD/VMSAVE to C code", 2021-03-15).  The reason for that commit was
that it made it simpler to use a different VMCB for VMLOAD/VMSAVE versus
VMRUN; but that is not a big hassle anymore thanks to the kvm-asm-offsets
machinery and other related cleanups.

The idea on how to number the exception tables is stolen from
a prototype patch by Peter Zijlstra.

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Link: &lt;https://lore.kernel.org/all/f571e404-e625-bae1-10e9-449b2eb4cbd8@citrix.com/&gt;
Reviewed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: SVM: do not allocate struct svm_cpu_data dynamically</title>
<updated>2022-11-09T17:23:59Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2022-11-09T14:07:55Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=73412dfeea724e6bd775ba64d21157ff322eac9a'/>
<id>urn:sha1:73412dfeea724e6bd775ba64d21157ff322eac9a</id>
<content type='text'>
The svm_data percpu variable is a pointer, but it is allocated via
svm_hardware_setup() when KVM is loaded.  Unlike hardware_enable(),
this means that it is never NULL for the whole lifetime of KVM, and
static allocation does not waste any memory compared to the status quo.
It is also more efficient and more easily handled from assembly code,
so do it and don't look back.

Reviewed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
</feed>
