wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2020-09-25	nfsd: Cache R, RW, and W opens separately	J. Bruce Fields	1	-1/+1
	The nfsd open code has always kept separate read-only, read-write, and write-only opens as necessary to ensure that when a client closes or downgrades, we don't retain more access than necessary. Also, I didn't realize the cache behaved this way when I wrote 94415b06eb8a "nfsd4: a client's own opens needn't prevent delegations". There I assumed fi_fds[O_WRONLY] and fi_fds[O_RDWR] would always be distinct. The violation of that assumption is triggering a WARN_ON_ONCE() and could also cause the server to give out a delegation when it shouldn't. Fixes: 94415b06eb8a ("nfsd4: a client's own opens needn't prevent delegations") Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	sunrpc: simplify do_cache_clean	J. Bruce Fields	1	-7/+8
	Is it just me, or is the logic written in a slightly convoluted way? I find it a little easier to read this way. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	sunrpc: cache : Replace seq_printf with seq_puts	Xu Wang	1	-2/+2
	seq_puts is a lot cheaper than seq_printf, so use that to print literal strings. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	silence nfscache allocation warnings with kvzalloc	Rik van Riel	1	-8/+4
	silence nfscache allocation warnings with kvzalloc Currently nfsd_reply_cache_init attempts hash table allocation through kmalloc, and manually falls back to vzalloc if that fails. This makes the code a little larger than needed, and creates a significant amount of serial console spam if you have enough systems. Switching to kvzalloc gets rid of the allocation warnings, and makes the code a little cleaner too as a side effect. Freeing of nn->drc_hashtbl is already done using kvfree currently. Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	nfsd: fix comparison to bool warning	Zheng Bin	1	-1/+1
	Fixes coccicheck warning: fs/nfsd/nfs4proc.c:3234:5-29: WARNING: Comparison to bool Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	NFSD: Correct type annotations in COPY XDR functions	Chuck Lever	1	-1/+1
	Squelch some sparse warnings: /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: expected int status /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: got restricted __be32 /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: warning: incorrect type in return expression (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: expected restricted __be32 /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: got int status Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	NFSD: Correct type annotations in user xattr XDR functions	Chuck Lever	1	-4/+4
	Squelch some sparse warnings: /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: warning: incorrect type in return expression (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: expected int /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: got restricted __be32 [usertype] /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: warning: incorrect type in return expression (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: expected int /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: got restricted __be32 [usertype] /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: expected restricted __be32 [usertype] err /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: got int /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: expected unsigned int [assigned] [usertype] count /home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: got restricted __be32 [usertype] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	NFSD: Correct type annotations in user xattr helpers	Chuck Lever	1	-2/+4
	Squelch some sparse warnings: /home/cel/src/linux/linux/fs/nfsd/vfs.c:2264:13: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/vfs.c:2264:13: expected int err /home/cel/src/linux/linux/fs/nfsd/vfs.c:2264:13: got restricted __be32 /home/cel/src/linux/linux/fs/nfsd/vfs.c:2266:24: warning: incorrect type in return expression (different base types) /home/cel/src/linux/linux/fs/nfsd/vfs.c:2266:24: expected restricted __be32 /home/cel/src/linux/linux/fs/nfsd/vfs.c:2266:24: got int err /home/cel/src/linux/linux/fs/nfsd/vfs.c:2288:13: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/vfs.c:2288:13: expected int err /home/cel/src/linux/linux/fs/nfsd/vfs.c:2288:13: got restricted __be32 /home/cel/src/linux/linux/fs/nfsd/vfs.c:2290:24: warning: incorrect type in return expression (different base types) /home/cel/src/linux/linux/fs/nfsd/vfs.c:2290:24: expected restricted __be32 /home/cel/src/linux/linux/fs/nfsd/vfs.c:2290:24: got int err Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	SUNRPC/NFSD: Implement xdr_reserve_space_vec()	Anna Schumaker	3	-25/+50
	Reserving space for a large READ payload requires special handling when reserving space in the xdr buffer pages. One problem we can have is use of the scratch buffer, which is used to get a pointer to a contiguous region of data up to PAGE_SIZE. When using the scratch buffer, calls to xdr_commit_encode() shift the data to it's proper alignment in the xdr buffer. If we've reserved several pages in a vector, then this could potentially invalidate earlier pointers and result in incorrect READ data being sent to the client. I get around this by looking at the amount of space left in the current page, and never reserve more than that for each entry in the read vector. This lets us place data directly where it needs to go in the buffer pages. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	nfsd: rename delegation related tracepoints to make them less confusing	Hou Tao	2	-4/+4
	Now when a read delegation is given, two delegation related traces will be printed: nfsd_deleg_open: client 5f45b854:e6058001 stateid 00000030:00000001 nfsd_deleg_none: client 5f45b854:e6058001 stateid 0000002f:00000001 Although the intention is to let developers know two stateid are returned, the traces are confusing about whether or not a read delegation is handled out. So renaming trace_nfsd_deleg_none() to trace_nfsd_open() and trace_nfsd_deleg_open() to trace_nfsd_deleg_read() to make the intension clearer. The patched traces will be: nfsd_deleg_read: client 5f48a967:b55b21cd stateid 00000003:00000001 nfsd_open: client 5f48a967:b55b21cd stateid 00000002:00000001 Suggested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	Documentation: update RPCSEC_GSSv3 RFC link	J. Bruce Fields	1	-3/+2
	This draft is an official RFC now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	MAINTAINERS: Note NFS docs under Documentation/	J. Bruce Fields	1	-0/+2
	It struck me while watching Jon Corbet ask how to keep kernel Documentation up to date, that it might help if we were actually cc'd on Documentation/filesystems/nfs/ changes. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	nfsd: Remove unnecessary assignment in nfs4xdr.c	Alex Dewar	1	-1/+1
	In nfsd4_encode_listxattrs(), the variable p is assigned to at one point but this value is never used before p is reassigned. Fix this. Addresses-Coverity: ("Unused value") Signed-off-by: Alex Dewar <alex.dewar90@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	net: sunrpc: delete repeated words	Randy Dunlap	3	-3/+3
	Drop duplicate words in net/sunrpc/. Also fix "Anyone" to be "Any one". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	nfsd: Fix typo in comment	Alex Dewar	1	-1/+1
	Missing "is". Signed-off-by: Alex Dewar <alex.dewar90@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	nfsd: give up callbacks on revoked delegations	J. Bruce Fields	1	-1/+2
	The delegation is no longer returnable, so I don't think there's much point retrying the recall. (I think it's worth asking why we even need separate CLOSED_DELEG and REVOKED_DELEG states. But treating them the same would currently cause nfsd4_free_stateid to call list_del_init(&dp->dl_recall_lru) on a delegation that the laundromat had unhashed but not revoked, incorrectly removing it from the laundromat's reaplist or a client's dl_recall_lru.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	nfsd: remove fault injection code	J. Bruce Fields	8	-755/+0
	It was an interesting idea but nobody seems to be using it, it's buggy at this point, and nfs4state.c is already complicated enough without it. The new nfsd/clients/ code provides some of the same functionality, and could probably do more if desired. This feature has been deprecated since 9d60d93198c6 ("Deprecate nfsd fault injection"). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-13	Linux 5.9-rc5	Linus Torvalds	1	-1/+1

2020-09-12	KVM: emulator: more strict rsm checks.	Maxim Levitsky	1	-5/+17
	Don't ignore return values in rsm_load_state_64/32 to avoid loading invalid state from SMM state area if it was tampered with by the guest. This is primarly intended to avoid letting guest set bits in EFER (like EFER.SVME when nesting is disabled) by manipulating SMM save area. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200827171145.374620-8-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	KVM: nSVM: more strict SMM checks when returning to nested guest	Maxim Levitsky	1	-11/+18
	* check that guest is 64 bit guest, otherwise the SVM related fields in the smm state area are not defined * If the SMM area indicates that SMM interrupted a running guest, check that EFER.SVME which is also saved in this area is set, otherwise the guest might have tampered with SMM save area, and so indicate emulation failure which should triple fault the guest. * Check that that guest CPUID supports SVM (due to the same issue as above) Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200827162720.278690-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	SVM: nSVM: setup nested msr permission bitmap on nested state load	Maxim Levitsky	1	-0/+3
	This code was missing and was forcing the L2 run with L1's msr permission bitmap Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200827162720.278690-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	SVM: nSVM: correctly restore GIF on vmexit from nesting after migration	Maxim Levitsky	1	-1/+3
	Currently code in svm_set_nested_state copies the current vmcb control area to L1 control area (hsave->control), under assumption that it mostly reflects the defaults that kvm choose, and later qemu overrides these defaults with L2 state using standard KVM interfaces, like KVM_SET_REGS. However nested GIF (which is AMD specific thing) is by default is true, and it is copied to hsave area as such. This alone is not a big deal since on VMexit, GIF is always set to false, regardless of what it was on VM entry. However in nested_svm_vmexit we were first were setting GIF to false, but then we overwrite the control fields with value from the hsave area. (including the nested GIF field itself if GIF virtualization is enabled). Now on normal vm entry this is not a problem, since GIF is usually false prior to normal vm entry, and this is the value that copied to hsave, and then restored, but this is not always the case when the nested state is loaded as explained above. To fix this issue, move svm_set_gif after we restore the L1 control state in nested_svm_vmexit, so that even with wrong GIF in the saved L1 control area, we still clear GIF as the spec says. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200827162720.278690-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	openrisc: Fix issue with get_user for 64-bit values	Stafford Horne	1	-12/+21
	A build failure was raised by kbuild with the following error. drivers/android/binder.c: Assembler messages: drivers/android/binder.c:3861: Error: unrecognized keyword/register name `l.lwz ?ap,4(r24)' drivers/android/binder.c:3866: Error: unrecognized keyword/register name `l.addi ?ap,r0,0' The issue is with 64-bit get_user() calls on openrisc. I traced this to a problem where in the internally in the get_user macros there is a cast to long __gu_val this causes GCC to think the get_user call is 32-bit. This binder code is really long and GCC allocates register r30, which triggers the issue. The 64-bit get_user asm tries to get the 64-bit pair register, which for r30 overflows the general register names and returns the dummy register ?ap. The fix here is to move the temporary variables into the asm macros. We use a 32-bit __gu_tmp for 32-bit and smaller macro and a 64-bit tmp in the 64-bit macro. The cast in the 64-bit macro has a trick of casting through __typeof__((x)-(x)) which avoids the below warning. This was barrowed from riscv. arch/openrisc/include/asm/uaccess.h:240:8: warning: cast to pointer from integer of different size I tested this in a small unit test to check reading between 64-bit and 32-bit pointers to 64-bit and 32-bit values in all combinations. Also I ran make C=1 to confirm no new sparse warnings came up. It all looks clean to me. Link: https://lore.kernel.org/lkml/202008200453.ohnhqkjQ%25lkp@intel.com/ Signed-off-by: Stafford Horne <shorne@gmail.com> Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-09-12	x86/kvm: don't forget to ACK async PF IRQ	Vitaly Kuznetsov	1	-0/+2
	Merge commit 26d05b368a5c0 ("Merge branch 'kvm-async-pf-int' into HEAD") tried to adapt the new interrupt based async PF mechanism to the newly introduced IDTENTRY magic but unfortunately it missed the fact that DEFINE_IDTENTRY_SYSVEC() doesn't call ack_APIC_irq() on its own and all DEFINE_IDTENTRY_SYSVEC() users have to call it manually. As the result all multi-CPU KVM guest hang on boot when KVM_FEATURE_ASYNC_PF_INT is present. The breakage went unnoticed because no KVM userspace (e.g. QEMU) currently set it (and thus async PF mechanism is currently disabled) but we're about to change that. Fixes: 26d05b368a5c0 ("Merge branch 'kvm-async-pf-int' into HEAD") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200908135350.355053-3-vkuznets@redhat.com> Tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro	Vitaly Kuznetsov	1	-4/+0
	DEFINE_IDTENTRY_SYSVEC() already contains irqentry_enter()/ irqentry_exit(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200908135350.355053-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit	Wanpeng Li	1	-0/+1
	According to SDM 27.2.4, Event delivery causes an APIC-access VM exit. Don't report internal error and freeze guest when event delivery causes an APIC-access exit, it is handleable and the event will be re-injected during the next vmentry. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1597827327-25055-2-git-send-email-wanpengli@tencent.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12	KVM: SVM: avoid emulation with stale next_rip	Wanpeng Li	1	-4/+3
	svm->next_rip is reset in svm_vcpu_run() only after calling svm_exit_handlers_fastpath(), which will cause SVM's skip_emulated_instruction() to write a stale RIP. We can move svm_exit_handlers_fastpath towards the end of svm_vcpu_run(). To align VMX with SVM, keep svm_complete_interrupts() close as well. Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Paul K. <kronenpj@kronenpj.dyndns.org> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> [Also move vmcb_mark_all_clean before any possible write to the VMCB. - Paolo]
2020-09-11	KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN	Vitaly Kuznetsov	1	-1/+1
	Even without in-kernel LAPIC we should allow writing '0' to MSR_KVM_ASYNC_PF_EN as we're not enabling the mechanism. In particular, QEMU with 'kernel-irqchip=off' fails to start a guest with qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0 Fixes: 9d3c447c72fb2 ("KVM: X86: Fix async pf caused null-ptr-deref") Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200911093147.484565-1-vkuznets@redhat.com> [Actually commit the version proposed by Sean Christopherson. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	KVM: SVM: Periodically schedule when unregistering regions on destroy	David Rientjes	1	-0/+1
	There may be many encrypted regions that need to be unregistered when a SEV VM is destroyed. This can lead to soft lockups. For example, on a host running 4.15: watchdog: BUG: soft lockup - CPU#206 stuck for 11s! [t_virtual_machi:194348] CPU: 206 PID: 194348 Comm: t_virtual_machi RIP: 0010:free_unref_page_list+0x105/0x170 ... Call Trace: [<0>] release_pages+0x159/0x3d0 [<0>] sev_unpin_memory+0x2c/0x50 [kvm_amd] [<0>] __unregister_enc_region_locked+0x2f/0x70 [kvm_amd] [<0>] svm_vm_destroy+0xa9/0x200 [kvm_amd] [<0>] kvm_arch_destroy_vm+0x47/0x200 [<0>] kvm_put_kvm+0x1a8/0x2f0 [<0>] kvm_vm_release+0x25/0x30 [<0>] do_exit+0x335/0xc10 [<0>] do_group_exit+0x3f/0xa0 [<0>] get_signal+0x1bc/0x670 [<0>] do_signal+0x31/0x130 Although the CLFLUSH is no longer issued on every encrypted region to be unregistered, there are no other changes that can prevent soft lockups for very large SEV VMs in the latest kernel. Periodically schedule if necessary. This still holds kvm->lock across the resched, but since this only happens when the VM is destroyed this is assumed to be acceptable. Signed-off-by: David Rientjes <rientjes@google.com> Message-Id: <alpine.DEB.2.23.453.2008251255240.2987727@chino.kir.corp.google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	KVM: MIPS: Change the definition of kvm type	Huacai Chen	2	-2/+5
	MIPS defines two kvm types: #define KVM_VM_MIPS_TE 0 #define KVM_VM_MIPS_VZ 1 In Documentation/virt/kvm/api.rst it is said that "You probably want to use 0 as machine type", which implies that type 0 be the "automatic" or "default" type. And, in user-space libvirt use the null-machine (with type 0) to detect the kvm capability, which returns "KVM not supported" on a VZ platform. I try to fix it in QEMU but it is ugly: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html And Thomas Huth suggests me to change the definition of kvm type: https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html So I define like this: #define KVM_VM_MIPS_AUTO 0 #define KVM_VM_MIPS_VZ 1 #define KVM_VM_MIPS_TE 2 Since VZ and TE cannot co-exists, using type 0 on a TE platform will still return success (so old user-space tools have no problems on new kernels); the advantage is that using type 0 on a VZ platform will not return failure. So, the only problem is "new user-space tools use type 2 on old kernels", but if we treat this as a kernel bug, we can backport this patch to old stable kernels. Signed-off-by: Huacai Chen <chenhc@lemote.com> Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed	Lai Jiangshan	1	-1/+1
	When kvm_mmu_get_page() gets a page with unsynced children, the spt pagetable is unsynchronized with the guest pagetable. But the guest might not issue a "flush" operation on it when the pagetable entry is changed from zero or other cases. The hypervisor has the responsibility to synchronize the pagetables. KVM behaved as above for many years, But commit 8c8560b83390 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes") inadvertently included a line of code to change it without giving any reason in the changelog. It is clear that the commit's intention was to change KVM_REQ_TLB_FLUSH -> KVM_REQ_TLB_FLUSH_CURRENT, so we don't needlessly flush other contexts; however, one of the hunks changed a nearby KVM_REQ_MMU_SYNC instead. This patch changes it back. Link: https://lore.kernel.org/lkml/20200320212833.3507-26-sean.j.christopherson@intel.com/ Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20200902135421.31158-1-jiangshanlai@gmail.com> fixes: 8c8560b83390 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control	Chenyi Qiang	1	-1/+1
	A minor fix for the update of VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL field in exit_ctls_high. Fixes: 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control") Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-Id: <20200828085622.8365-5-chenyi.qiang@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	KVM: fix memory leak in kvm_io_bus_unregister_dev()	Rustam Kovhaev	1	-9/+12
	when kmalloc() fails in kvm_io_bus_unregister_dev(), before removing the bus, we should iterate over all other devices linked to it and call kvm_iodevice_destructor() for them Fixes: 90db10434b16 ("KVM: kvm_io_bus_unregister_dev() should never fail") Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+f196caa45793d6374707@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=f196caa45793d6374707 Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200907185535.233114-1-rkovhaev@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	KVM: Check the allocation of pv cpu mask	Haiwei Li	1	-3/+19
	check the allocation of per-cpu __pv_cpu_mask. Initialize ops only when successful. Signed-off-by: Haiwei Li <lihaiwei@tencent.com> Message-Id: <d59f05df-e6d3-3d31-a036-cc25a2b2f33f@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected	Peter Shier	3	-2/+11
	When L2 uses PAE, L0 intercepts of L2 writes to CR0/CR3/CR4 call load_pdptrs to read the possibly updated PDPTEs from the guest physical address referenced by CR3. It loads them into vcpu->arch.walk_mmu->pdptrs and sets VCPU_EXREG_PDPTR in vcpu->arch.regs_dirty. At the subsequent assumed reentry into L2, the mmu will call vmx_load_mmu_pgd which calls ept_load_pdptrs. ept_load_pdptrs sees VCPU_EXREG_PDPTR set in vcpu->arch.regs_dirty and loads VMCS02.GUEST_PDPTRn from vcpu->arch.walk_mmu->pdptrs[]. This all works if the L2 CRn write intercept always resumes L2. The resume path calls vmx_check_nested_events which checks for exceptions, MTF, and expired VMX preemption timers. If vmx_check_nested_events finds any of these conditions pending it will reflect the corresponding exit into L1. Live migration at this point would also cause a missed immediate reentry into L2. After L1 exits, vmx_vcpu_run calls vmx_register_cache_reset which clears VCPU_EXREG_PDPTR in vcpu->arch.regs_dirty. When L2 next resumes, ept_load_pdptrs finds VCPU_EXREG_PDPTR clear in vcpu->arch.regs_dirty and does not load VMCS02.GUEST_PDPTRn from vcpu->arch.walk_mmu->pdptrs[]. prepare_vmcs02 will then load VMCS02.GUEST_PDPTRn from vmcs12->pdptr0/1/2/3 which contain the stale values stored at last L2 exit. A repro of this bug showed L2 entering triple fault immediately due to the bad VMCS02.GUEST_PDPTRn values. When L2 is in PAE paging mode add a call to ept_load_pdptrs before leaving L2. This will update VMCS02.GUEST_PDPTRn if they are dirty in vcpu->arch.walk_mmu->pdptrs[]. Tested: kvm-unit-tests with new directed test: vmx_mtf_pdpte_test. Verified that test fails without the fix. Also ran Google internal VMM with an Ubuntu 16.04 4.4.0-83 guest running a custom hypervisor with a 32-bit Windows XP L2 guest using PAE. Prior to fix would repro readily. Ran 14 simultaneous L2s for 140 iterations with no failures. Signed-off-by: Peter Shier <pshier@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200820230545.2411347-1-pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-11	gcov: add support for GCC 10.1	Peter Oberparleiter	2	-2/+3
	Using gcov to collect coverage data for kernels compiled with GCC 10.1 causes random malfunctions and kernel crashes. This is the result of a changed GCOV_COUNTERS value in GCC 10.1 that causes a mismatch between the layout of the gcov_info structure created by GCC profiling code and the related structure used by the kernel. Fix this by updating the in-kernel GCOV_COUNTERS value. Also re-enable config GCOV_KERNEL for use with GCC 10. Reported-by: Colin Ian King <colin.king@canonical.com> Reported-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Tested-by: Leon Romanovsky <leonro@nvidia.com> Tested-and-Acked-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-10	powercap: make documentation reflect code	Amit Kucheria	1	-6/+5
	Fix up the documentation of the struct powercap_control_type members to match the code. Also fixup stray whitespace. Signed-off-by: Amit Kucheria <amitk@kernel.org> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-10	PM: <linux/device.h>: fix @em_pd kernel-doc warning	Randy Dunlap	1	-0/+1
	Fix kernel-doc warning in <linux/device.h>: ../include/linux/device.h:613: warning: Function parameter or member 'em_pd' not described in 'device' Fixes: 1bc138c62295 ("PM / EM: add support for other devices than CPUs in Energy Model") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-10	powercap/intel_rapl: add support for AlderLake	Zhang Rui	1	-0/+1
	Add intel_rapl support for the AlderLake platform. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-10	powercap/intel_rapl: add support for RocketLake	Zhang Rui	1	-0/+1
	Add intel_rapl support for the RocketLake platform. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-10	powercap/intel_rapl: add support for TigerLake Desktop	Zhang Rui	1	-0/+1
	Add intel_rapl support for the TigerLake desktop platform. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-10	Revert "dyndbg: accept query terms like file=bar and module=foo"	Greg Kroah-Hartman	2	-34/+20
	This reverts commit 14775b04964264189caa4a0862eac05dab8c0502 as there were still some parsing problems with it, and the follow-on patch for it. Let's revisit it later, just drop it for now. Cc: <jbaron@akamai.com> Cc: Jim Cromie <jim.cromie@gmail.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 14775b049642 ("dyndbg: accept query terms like file=bar and module=foo") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-10	Revert "dyndbg: fix problem parsing format="foo bar""	Greg Kroah-Hartman	1	-17/+21
	This reverts commit 42f07816ac0cc797928119cc039c414ae2b95d34 as it still causes problems. It will be resolved later, let's revert it so we can also revert the original patch this was supposed to be helping with. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 42f07816ac0c ("dyndbg: fix problem parsing format="foo bar"") Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-10	test_firmware: Test platform fw loading on non-EFI systems	Kees Cook	3	-9/+16
	On non-EFI systems, it wasn't possible to test the platform firmware loader because it will have never set "checked_fw" during __init. Instead, allow the test code to override this check. Additionally split the declarations into a private symbol namespace so there is greater enforcement of the symbol visibility. Fixes: 548193cba2a7 ("test_firmware: add support for firmware_request_platform") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200909225354.3118328-1-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09	arm64: dts: ns2: Fixed QSPI compatible string	Florian Fainelli	1	-1/+1
	The string was incorrectly defined before from least to most specific, swap the compatible strings accordingly. Fixes: ff73917d38a6 ("ARM64: dts: Add QSPI Device Tree node for NS2") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2020-09-09	ARM: dts: BCM5301X: Fixed QSPI compatible string	Florian Fainelli	1	-1/+1
	The string was incorrectly defined before from least to most specific, swap the compatible strings accordingly. Fixes: 1c8f40650723 ("ARM: dts: BCM5301X: convert to iProc QSPI") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2020-09-09	ARM: dts: NSP: Fixed QSPI compatible string	Florian Fainelli	1	-1/+1
	The string was incorrectly defined before from least to most specific, swap the compatible strings accordingly. Fixes: 329f98c1974e ("ARM: dts: NSP: Add QSPI nodes to NSPI and bcm958625k DTSes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2020-09-09	ARM: dts: bcm: HR2: Fixed QSPI compatible string	Florian Fainelli	1	-1/+1
	The string was incorrectly defined before from least to most specific, swap the compatible strings accordingly. Fixes: b9099ec754b5 ("ARM: dts: Add Broadcom Hurricane 2 DTS include file") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2020-09-09	dt-bindings: spi: Fix spi-bcm-qspi compatible ordering	Florian Fainelli	1	-8/+8
	The binding is currently incorrectly defining the compatible strings from least specifice to most specific instead of the converse. Re-order them from most specific (left) to least specific (right) and fix the examples as well. Fixes: 5fc78f4c842a ("spi: Broadcom BRCMSTB, NSP, NS2 SoC bindings") Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2020-09-09	IB/isert: Fix unaligned immediate-data handling	Sagi Grimberg	2	-56/+78
	Currently we allocate rx buffers in a single contiguous buffers for headers (iser and iscsi) and data trailer. This means that most likely the data starting offset is aligned to 76 bytes (size of both headers). This worked fine for years, but at some point this broke, resulting in data corruptions in isert when a command comes with immediate data and the underlying backend device assumes 512 bytes buffer alignment. We assume a hard-requirement for all direct I/O buffers to be 512 bytes aligned. To fix this, we should avoid passing unaligned buffers for I/O. Instead, we allocate our recv buffers with some extra space such that we can have the data portion align to 512 byte boundary. This also means that we cannot reference headers or data using structure but rather accessors (as they may move based on alignment). Also, get rid of the wrong __packed annotation from iser_rx_desc as this has only harmful effects (not aligned to anything). This affects the rx descriptors for iscsi login and data plane. Fixes: 3d75ca0adef4 ("block: introduce multi-page bvec helpers") Link: https://lore.kernel.org/r/20200904195039.31687-1-sagi@grimberg.me Reported-by: Stephen Rust <srust@blockbridge.com> Tested-by: Doug Dumitru <doug@dumitru.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>