linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-07-03	lockd: Show pid of lockd for remote locks	Benjamin Coddington	5	-6/+4
	Use the pid of lockd instead of the remote lock's svid for the fl_pid for local POSIX locks. This allows proper enumeration of which local process owns which lock. The svid is meaningless to local lock readers. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	lockd: Remove lm_compare_owner and lm_owner_key	Benjamin Coddington	1	-18/+0
	Now that the NLM server allocates an nlm_lockowner for fl_owner, there's no need for special hashing or comparison. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	lockd: Convert NLM service fl_owner to nlm_lockowner	Benjamin Coddington	7	-5/+123
	Do as the NLM client: allocate and track a struct nlm_lockowner for use as the fl_owner for locks created by the NLM sever. This allows us to keep the svid within this structure for matching locks, and will allow us to track the pid of lockd in a future patch. It should also allow easier reference of the nlm_host in conflicting locks, and simplify lock hashing and comparison. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> [bfields@redhat.com: fix type of some error returns] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	lockd: prepare nlm_lockowner for use by the server	Benjamin Coddington	1	-10/+11
	The nlm_lockowner structure that the client uses to track locks is generally useful to the server as well. Very similar functions to handle allocation and tracking of the nlm_lockowner will follow. Rename the client functions for clarity. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	nfsd: note inadequate stats locking	J. Bruce Fields	1	-2/+5
	After 89a26b3d295d "nfsd: split DRC global spinlock into per-bucket locks", there is no longer a single global spinlock to protect these stats. So, really we need to fix that. For now, at least fix the comment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	nfsd4: drc containerization	J. Bruce Fields	4	-125/+154
	The nfsd duplicate reply cache should not be shared between network namespaces. The most straightforward way to fix this is just to move every global in the code to per-net-namespace memory, so that's what we do. Still todo: sort out which members of nfsd_stats should be global and which per-net-namespace. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	nfsd: don't call nfsd_reply_cache_shutdown twice	J. Bruce Fields	1	-1/+0
	The caller is cleaning up on ENOMEM, don't try to do it here too. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03	nfsd: Fix overflow causing non-working mounts on 1 TB machines	Paul Menzel	1	-1/+1
	Since commit 10a68cdf10 (nfsd: fix performance-limiting session calculation) (Linux 5.1-rc1 and 4.19.31), shares from NFS servers with 1 TB of memory cannot be mounted anymore. The mount just hangs on the client. The gist of commit 10a68cdf10 is the change below. -avail = clamp_t(int, avail, slotsize, avail/3); +avail = clamp_t(int, avail, slotsize, total_avail/3); Here are the macros. #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <) #define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi) `total_avail` is 8,434,659,328 on the 1 TB machine. `clamp_t()` casts the values to `int`, which for 32-bit integers can only hold values −2,147,483,648 (−2^31) through 2,147,483,647 (2^31 − 1). `avail` (in the function signature) is just 65536, so that no overflow was happening. Before the commit the assignment would result in 21845, and `num = 4`. When using `total_avail`, it is causing the assignment to be 18446744072226137429 (printed as %lu), and `num` is then 4164608182. My next guess is, that `nfsd_drc_mem_used` is then exceeded, and the server thinks there is no memory available any more for this client. Updating the arguments of `clamp_t()` and `min_t()` to `unsigned long` fixes the issue. Now, `avail = 65536` (before commit 10a68cdf10 `avail = 21845`), but `num = 4` remains the same. Fixes: c54f24e338ed (nfsd: fix performance-limiting session calculation) Cc: stable@vger.kernel.org Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-06-19	svcrdma: Ignore source port when computing DRC hash	Chuck Lever	1	-1/+6
	The DRC appears to be effectively empty after an RPC/RDMA transport reconnect. The problem is that each connection uses a different source port, which defeats the DRC hash. Clients always have to disconnect before they send retransmissions to reset the connection's credit accounting, thus every retransmit on NFS/RDMA will miss the DRC. An NFS/RDMA client's IP source port is meaningless for RDMA transports. The transport layer typically sets the source port value on the connection to a random ephemeral port. The server already ignores it for the "secure port" check. See commit 16e4d93f6de7 ("NFSD: Ignore client's source port on RDMA transports"). The Linux NFS server's DRC resolves XID collisions from the same source IP address by using the checksum of the first 200 bytes of the RPC call header. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-05-31	Revert "lockd: Show pid of lockd for remote locks"	Benjamin Coddington	2	-4/+4
	This reverts most of commit b8eee0e90f97 ("lockd: Show pid of lockd for remote locks"), which caused remote locks to not be differentiated between remote processes for NLM. We retain the fixup for setting the client's fl_pid to a negative value. Fixes: b8eee0e90f97 ("lockd: Show pid of lockd for remote locks") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: XueWei Zhang <xueweiz@google.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-05-26	Linux 5.2-rc2	Linus Torvalds	1	-2/+2

2019-05-26	random: fix soft lockup when trying to read from an uninitialized blocking pool	Theodore Ts'o	1	-3/+13
	Fixes: eb9d1bf079bb: "random: only read from /dev/random after its pool has received 128 bits" Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-05-25	tracing: Silence GCC 9 array bounds warning	Miguel Ojeda	3	-10/+20
	Starting with GCC 9, -Warray-bounds detects cases when memset is called starting on a member of a struct but the size to be cleared ends up writing over further members. Such a call happens in the trace code to clear, at once, all members after and including `seq` on struct trace_iterator: In function 'memset', inlined from 'ftrace_dump' at kernel/trace/trace.c:8914:3: ./include/linux/string.h:344:9: warning: '__builtin_memset' offset [8505, 8560] from the object at 'iter' is out of the bounds of referenced subobject 'seq' with type 'struct trace_seq' at offset 4368 [-Warray-bounds] 344 \| return __builtin_memset(p, c, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to avoid GCC complaining about it, we compute the address ourselves by adding the offsetof distance instead of referring directly to the member. Since there are two places doing this clear (trace.c and trace_kdb.c), take the chance to move the workaround into a single place in the internal header. Link: http://lkml.kernel.org/r/20190523124535.GA12931@gmail.com Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [ Removed unnecessary parenthesis around "iter" ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-24	ext4: fix dcache lookup of !casefolded directories	Gabriel Krisman Bertazi	1	-1/+1
	Found by visual inspection, this wasn't caught by my xfstest, since it's effect is ignoring positive dentries in the cache the fallback just goes to the disk. it was introduced in the last iteration of the case-insensitive patch. d_compare should return 0 when the entries match, so make sure we are correctly comparing the entire string if the encoding feature is set and we are on a case-INsensitive directory. Fixes: b886ee3e778e ("ext4: Support case-insensitive file name lookups") Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-05-24	locking/lock_events: Use this_cpu_add() when necessary	Waiman Long	1	-2/+40
	The kernel test robot has reported that the use of __this_cpu_add() causes bug messages like: BUG: using __this_cpu_add() in preemptible [00000000] code: ... Given the imprecise nature of the count and the possibility of resetting the count and doing the measurement again, this is not really a big problem to use the unprotected __this_cpu_() functions. To make the preemption checking code happy, the this_cpu_() functions will be used if CONFIG_DEBUG_PREEMPT is defined. The imprecise nature of the locking counts are also documented with the suggestion that we should run the measurement a few times with the counts reset in between to get a better picture of what is going on under the hood. Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-24	KVM: x86: fix return value for reserved EFER	Paolo Bonzini	1	-1/+1
	Commit 11988499e62b ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes", 2019-04-02) introduced a "return false" in a function returning int, and anyway set_efer has a "nonzero on error" conventon so it should be returning 1. Reported-by: Pavel Machek <pavel@denx.de> Fixes: 11988499e62b ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes") Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	tools/kvm_stat: fix fields filter for child events	Stefan Raspl	2	-4/+14
	The fields filter would not work with child fields, as the respective parents would not be included. No parents displayed == no childs displayed. To reproduce, run on s390 (would work on other platforms, too, but would require a different filter name): - Run 'kvm_stat -d' - Press 'f' - Enter 'instruct' Notice that events like instruction_diag_44 or instruction_diag_500 are not displayed - the output remains empty. With this patch, we will filter by matching events and their parents. However, consider the following example where we filter by instruction_diag_44: kvm statistics - summary regex filter: instruction_diag_44 Event Total %Total CurAvg/s exit_instruction 276 100.0 12 instruction_diag_44 256 92.8 11 Total 276 12 Note that the parent ('exit_instruction') displays the total events, but the childs listed do not match its total (256 instead of 276). This is intended (since we're filtering all but one child), but might be confusing on first sight. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: selftests: Wrap vcpu_nested_state_get/set functions with x86 guard	Thomas Huth	2	-0/+4
	struct kvm_nested_state is only available on x86 so far. To be able to compile the code on other architectures as well, we need to wrap the related code with #ifdefs. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: selftests: aarch64: compile with warnings on	Andrew Jones	1	-4/+5
	aarch64 fixups needed to compile with warnings as errors. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: selftests: aarch64: fix default vm mode	Andrew Jones	1	-1/+1
	VM_MODE_P52V48_4K is not a valid mode for AArch64. Replace its use in vm_create_default() with a mode that works and represents a good AArch64 default. (We didn't ever see a problem with this because we don't have any unit tests using vm_create_default(), but it's good to get it fixed in advance.) Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: selftests: aarch64: dirty_log_test: fix unaligned memslot size	Andrew Jones	1	-1/+1
	The memory slot size must be aligned to the host's page size. When testing a guest with a 4k page size on a host with a 64k page size, then 3 guest pages are not host page size aligned. Since we just need a nearly arbitrary number of extra pages to ensure the memslot is not aligned to a 64 host-page boundary for this test, then we can use 16, as that's 64k aligned, but not 64 * 64k aligned. Fixes: 76d58e0f07ec ("KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size", 2019-04-17) Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION	Christian Borntraeger	1	-14/+21
	kselftests exposed a problem in the s390 handling for memory slots. Right now we only do proper memory slot handling for creation of new memory slots. Neither MOVE, nor DELETION are handled properly. Let us implement those. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: x86/pmu: do not mask the value that is written to fixed PMUs	Paolo Bonzini	1	-5/+8
	According to the SDM, for MSR_IA32_PERFCTR0/1 "the lower-order 32 bits of each MSR may be written with any value, and the high-order 8 bits are sign-extended according to the value of bit 31", but the fixed counters in real hardware are limited to the width of the fixed counters ("bits beyond the width of the fixed-function counter are reserved and must be written as zeros"). Fix KVM to do the same. Reported-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: x86/pmu: mask the result of rdpmc according to the width of the counters	Paolo Bonzini	4	-13/+15
	This patch will simplify the changes in the next, by enforcing the masking of the counters to RDPMC and RDMSR. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	x86/kvm/pmu: Set AMD's virt PMU version to 1	Borislav Petkov	1	-1/+1
	After commit: 672ff6cff80c ("KVM: x86: Raise #GP when guest vCPU do not support PMU") my AMD guests started #GPing like this: general protection fault: 0000 [#1] PREEMPT SMP CPU: 1 PID: 4355 Comm: bash Not tainted 5.1.0-rc6+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:x86_perf_event_update+0x3b/0xa0 with Code: pointing to RDPMC. It is RDPMC because the guest has the hardware watchdog CONFIG_HARDLOCKUP_DETECTOR_PERF enabled which uses perf. Instrumenting kvm_pmu_rdpmc() some, showed that it fails due to: if (!pmu->version) return 1; which the above commit added. Since AMD's PMU leaves the version at 0, that causes the #GP injection into the guest. Set pmu->version arbitrarily to 1 and move it above the non-applicable struct kvm_pmu members. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: kvm@vger.kernel.org Cc: Liran Alon <liran.alon@oracle.com> Cc: Mihai Carabas <mihai.carabas@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: stable@vger.kernel.org Fixes: 672ff6cff80c ("KVM: x86: Raise #GP when guest vCPU do not support PMU") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: x86: do not spam dmesg with VMCS/VMCB dumps	Paolo Bonzini	2	-8/+27
	Userspace can easily set up invalid processor state in such a way that dmesg will be filled with VMCS or VMCB dumps. Disable this by default using a module parameter. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: Check irqchip mode before assign irqfd	Peter Xu	3	-0/+17
	When assigning kvm irqfd we didn't check the irqchip mode but we allow KVM_IRQFD to succeed with all the irqchip modes. However it does not make much sense to create irqfd even without the kernel chips. Let's provide a arch-dependent helper to check whether a specific irqfd is allowed by the arch. At least for x86, it should make sense to check: - when irqchip mode is NONE, all irqfds should be disallowed, and, - when irqchip mode is SPLIT, irqfds that are with resamplefd should be disallowed. For either of the case, previously we'll silently ignore the irq or the irq ack event if the irqchip mode is incorrect. However that can cause misterious guest behaviors and it can be hard to triage. Let's fail KVM_IRQFD even earlier to detect these incorrect configurations. CC: Paolo Bonzini <pbonzini@redhat.com> CC: Radim Krčmář <rkrcmar@redhat.com> CC: Alex Williamson <alex.williamson@redhat.com> CC: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: svm/avic: fix off-by-one in checking host APIC ID	Suthikulpanit, Suravee	1	-1/+5
	Current logic does not allow VCPU to be loaded onto CPU with APIC ID 255. This should be allowed since the host physical APIC ID field in the AVIC Physical APIC table entry is an 8-bit value, and APIC ID 255 is valid in system with x2APIC enabled. Instead, do not allow VCPU load if the host APIC ID cannot be represented by an 8-bit value. Also, use the more appropriate AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK instead of AVIC_MAX_PHYSICAL_ID_COUNT. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: selftests: do not blindly clobber registers in guest asm	Paolo Bonzini	1	-24/+30
	The guest_code of sync_regs_test is assuming that the compiler will not touch %r11 outside the asm that increments it, which is a bit brittle. Instead, we can increment a variable and use a dummy asm to ensure the increment is not optimized away. However, we also need to use a callee-save register or the compiler will insert a save/restore around the vmexit, breaking the whole idea behind the test. (Yes, "if it ain't broken...", but I would like the test to be clean before it is copied into the upcoming s390 selftests). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: selftests: Remove duplicated TEST_ASSERT in hyperv_cpuid.c	Thomas Huth	1	-3/+0
	The check for entry->index == 0 is done twice. One time should be sufficient. Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: LAPIC: Expose per-vCPU timer_advance_ns to userspace	Wanpeng Li	1	-0/+18
	Expose per-vCPU timer_advance_ns to userspace, so it is able to query the auto-adjusted value. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: LAPIC: Fix lapic_timer_advance_ns parameter overflow	Wanpeng Li	1	-1/+1
	After commit c3941d9e0 (KVM: lapic: Allow user to disable adaptive tuning of timer advancement), '-1' enables adaptive tuning starting from default advancment of 1000ns. However, we should expose an int instead of an overflow uint module parameter. Before patch: /sys/module/kvm/parameters/lapic_timer_advance_ns:4294967295 After patch: /sys/module/kvm/parameters/lapic_timer_advance_ns:-1 Fixes: c3941d9e0 (KVM: lapic: Allow user to disable adaptive tuning of timer advancement) Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Liran Alon <liran.alon@oracle.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: vmx: Fix -Wmissing-prototypes warnings	Yi Wang	1	-0/+1
	We get a warning when build kernel W=1: arch/x86/kvm/vmx/vmx.c:6365:6: warning: no previous prototype for ‘vmx_update_host_rsp’ [-Wmissing-prototypes] void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp) Add the missing declaration to fix this. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: nVMX: Fix using __this_cpu_read() in preemptible context	Wanpeng Li	1	-2/+2
	BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/4590 caller is nested_vmx_enter_non_root_mode+0xebd/0x1790 [kvm_intel] CPU: 4 PID: 4590 Comm: qemu-system-x86 Tainted: G OE 5.1.0-rc4+ #1 Call Trace: dump_stack+0x67/0x95 __this_cpu_preempt_check+0xd2/0xe0 nested_vmx_enter_non_root_mode+0xebd/0x1790 [kvm_intel] nested_vmx_run+0xda/0x2b0 [kvm_intel] handle_vmlaunch+0x13/0x20 [kvm_intel] vmx_handle_exit+0xbd/0x660 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xa2c/0x1e50 [kvm] kvm_vcpu_ioctl+0x3ad/0x6d0 [kvm] do_vfs_ioctl+0xa5/0x6e0 ksys_ioctl+0x6d/0x80 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x6f/0x6c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Accessing per-cpu variable should disable preemption, this patch extends the preemption disable region for __this_cpu_read(). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Fixes: 52017608da33 ("KVM: nVMX: add option to perform early consistency checks via H/W") Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: fix compilation on s390	Paolo Bonzini	1	-0/+2
	s390 does not have memremap, even though in this particular case it would be useful. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: x86: Include CPUID leaf 0x8000001e in kvm's supported CPUID	Jim Mattson	1	-0/+1
	Kvm now supports extended CPUID functions through 0x8000001f. CPUID leaf 0x8000001e is AMD's Processor Topology Information leaf. This contains similar information to CPUID leaf 0xb (Intel's Extended Topology Enumeration leaf), and should be included in the output of KVM_GET_SUPPORTED_CPUID, even though userspace is likely to override some of this information based upon the configuration of the particular VM. Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Borislav Petkov <bp@suse.de> Fixes: 8765d75329a38 ("KVM: X86: Extend CPUID range to include new leaf") Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: x86: Include multiple indices with CPUID leaf 0x8000001d	Jim Mattson	1	-4/+3
	Per the APM, "CPUID Fn8000_001D_E[D,C,B,A]X reports cache topology information for the cache enumerated by the value passed to the instruction in ECX, referred to as Cache n in the following description. To gather information for all cache levels, software must repeatedly execute CPUID with 8000_001Dh in EAX and ECX set to increasing values beginning with 0 until a value of 00h is returned in the field CacheType (EAX[4:0]) indicating no more cache descriptions are available for this processor." The termination condition is the same as leaf 4, so we can reuse that code block for leaf 0x8000001d. Fixes: 8765d75329a38 ("KVM: X86: Extend CPUID range to include new leaf") Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: selftests: Compile code with warnings enabled	Thomas Huth	12	-31/+16
	So far the KVM selftests are compiled without any compiler warnings enabled. That's quite bad, since we miss a lot of possible bugs this way. Let's enable at least "-Wall" and some other useful warning flags now, and fix at least the trivial problems in the code (like unused variables). Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	kvm: selftests: avoid type punning	Paolo Bonzini	2	-2/+2
	Avoid warnings from -Wstrict-aliasing by using memcpy. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: selftests: Fix a condition in test_hv_cpuid()	Dan Carpenter	1	-3/+2
	The code is trying to check that all the padding is zeroed out and it does this: entry->padding[0] == entry->padding[1] == entry->padding[2] == 0 Assume everything is zeroed correctly, then the first comparison is true, the next comparison is false and false is equal to zero so the overall condition is true. This bug doesn't affect run time very badly, but the code should instead just check that all three paddings are zero individually. Also the error message was copy and pasted from an earlier error and it wasn't correct. Fixes: 7edcb7343327 ("KVM: selftests: Add hyperv_cpuid test") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: Fix spinlock taken warning during host resume	Wanpeng Li	1	-1/+4
	WARNING: CPU: 0 PID: 13554 at kvm/arch/x86/kvm//../../../virt/kvm/kvm_main.c:4183 kvm_resume+0x3c/0x40 [kvm] CPU: 0 PID: 13554 Comm: step_after_susp Tainted: G OE 5.1.0-rc4+ #1 RIP: 0010:kvm_resume+0x3c/0x40 [kvm] Call Trace: syscore_resume+0x63/0x2d0 suspend_devices_and_enter+0x9d1/0xa40 pm_suspend+0x33a/0x3b0 state_store+0x82/0xf0 kobj_attr_store+0x12/0x20 sysfs_kf_write+0x4b/0x60 kernfs_fop_write+0x120/0x1a0 __vfs_write+0x1b/0x40 vfs_write+0xcd/0x1d0 ksys_write+0x5f/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x6f/0x6c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Commit ca84d1a24 (KVM: x86: Add clock sync request to hardware enable) mentioned that "we always hold kvm_lock when hardware_enable is called. The one place that doesn't need to worry about it is resume, as resuming a frozen CPU, the spinlock won't be taken." However, commit 6706dae9 (virt/kvm: Replace spin_is_locked() with lockdep) introduces a bug, it asserts when the lock is not held which is contrary to the original goal. This patch fixes it by WARN_ON when the lock is held. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Fixes: 6706dae9 ("virt/kvm: Replace spin_is_locked() with lockdep") [Wrap with #ifdef CONFIG_LOCKDEP - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: nVMX: Clear nested_run_pending if setting nested state fails	Sean Christopherson	1	-12/+17
	VMX's nested_run_pending flag is subtly consumed when stuffing state to enter guest mode, i.e. needs to be set according before KVM knows if setting guest state is successful. If setting guest state fails, clear the flag as a nested run is obviously not pending. Reported-by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	KVM: nVMX: really fix the size checks on KVM_SET_NESTED_STATE	Paolo Bonzini	1	-1/+1
	The offset for reading the shadow VMCS is sizeof(kvm_state)+VMCS12_SIZE, so the correct size must be that plus sizeof(vmcs12). This could lead to KVM reading garbage data from userspace and not reporting an error, but is otherwise not sensitive. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125	Thomas Gleixner	1	-17/+1
	Based on 1 normalized pattern(s): osl gpl code release authorized by [jalil] [fadavi] this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program see the file copying if not write to the free software foundation 675 mass ave cambridge ma 02139 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.689335553@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123	Thomas Gleixner	7	-84/+7
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 or later as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 7 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.504392586@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122	Thomas Gleixner	2	-18/+2
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 2 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.414247666@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121	Thomas Gleixner	1	-1/+1
	Based on 1 normalized pattern(s): licensed under the gplv2 or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.323272658@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120	Thomas Gleixner	12	-163/+12
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see the file copying or write to the free software foundation inc extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 12 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.231300438@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119	Thomas Gleixner	2	-20/+2
	Based on 1 normalized pattern(s): released under the gpl this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 2 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.124582774@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118	Thomas Gleixner	44	-449/+44
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 44 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190523091651.032047323@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>