wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2023-09-05	gfs2: Use gfs2_qd_dispose in gfs2_quota_cleanup	Andreas Gruenbacher	1	-22/+4
	Change gfs2_quota_cleanup() to move the quota data objects to dispose of on a dispose list and call gfs2_qd_dispose() on that list, like gfs2_qd_shrink_scan() does, instead of disposing of the quota data objects directly. This may look a bit pointless by itself, but it will make more sense in combination with a fix that follows. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Fix wrong quota shrinker return value	Andreas Gruenbacher	1	-2/+6
	Function gfs2_qd_isolate must only return LRU_REMOVED when removing the item from the lru list; otherwise, the number of items on the list will go wrong. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Rename SDF_DEACTIVATING to SDF_KILL	Andreas Gruenbacher	6	-8/+8
	Rename the SDF_DEACTIVATING flag to SDF_KILL to make it more obvious that this relates to the kill_sb filesystem operation. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Rename sd_{ glock => kill }_wait	Andreas Gruenbacher	3	-5/+5
	Rename sd_glock_wait to sd_kill_wait: we'll use it for other things related to "killing" a filesystem on unmount soon (kill_sb). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Use qd_sbd more consequently	Bob Peterson	1	-11/+11
	Before this patch many of the functions in quota.c got their superblock pointer, sdp, from the quota_data's glock pointer. That's silly because the qd already has its own pointer to the superblock (qd_sbd). This patch changes references to use that instead, eliminating a level of indirection. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: journal flush threshold fixes and cleanup	Andreas Gruenbacher	1	-18/+16
	Commit f07b35202148 ("GFS2: Made logd daemon take into account log demand") changed gfs2_ail_flush_reqd() and gfs2_jrnl_flush_reqd() to take sd_log_blks_needed into account, but the checks in gfs2_log_commit() were not updated correspondingly. Once that is fixed, gfs2_jrnl_flush_reqd() and gfs2_ail_flush_reqd() can be used in gfs2_log_commit(). Make those two helpers available to gfs2_log_commit() by defining them above gfs2_log_commit(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Fix logd wakeup on I/O error	Andreas Gruenbacher	1	-0/+1
	When quotad detects an I/O error, it sets sd_log_error and then it wakes up logd to withdraw the filesystem. However, logd doesn't wake up when sd_log_error is set. Fix that. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: low-memory forced flush fixes	Andreas Gruenbacher	2	-6/+6
	First, function gfs2_ail_flush_reqd checks the SDF_FORCE_AIL_FLUSH flag to determine if an AIL flush should be forced in low-memory situations. However, it also immediately clears the flag, and when called repeatedly as in function gfs2_logd, the flag will be lost. Fix that by pulling the SDF_FORCE_AIL_FLUSH flag check out of gfs2_ail_flush_reqd. Second, function gfs2_writepages sets the SDF_FORCE_AIL_FLUSH flag whether or not enough pages were written. If enough pages could be written, flushing the AIL is unnecessary, though. Third, gfs2_writepages doesn't wake up logd after setting the SDF_FORCE_AIL_FLUSH flag, so it can take a long time for logd to react. It would be preferable to wake up logd, but that hurts the performance of some workloads and we don't quite understand why so far, so don't wake up logd so far. Fixes: b066a4eebd4f ("gfs2: forcibly flush ail to relieve memory pressure") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Switch to wait_event in gfs2_logd	Andreas Gruenbacher	1	-12/+5
	In gfs2_logd(), switch from an open-coded wait loop to wait_event_interruptible_timeout(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: conversion deadlock do_promote bypass	Bob Peterson	1	-0/+2
	Consider the following case: 1. A glock is held in shared mode. 2. A process requests the glock in exclusive mode (rename). 3. Before the lock is granted, more processes (read / ls) request the glock in shared mode again. 4. gfs2 sends a request to dlm for the lock in exclusive mode because that holder is at the head of the queue. 5. Somehow the dlm request gets canceled, so dlm sends us back a response with state == LM_ST_SHARED and LM_OUT_CANCELED. So at that point, the glock is still held in shared mode. 6. finish_xmote gets called to process the response from dlm. It detects that the glock is not in the requested mode and no demote is in progress, so it moves the canceled holder to the tail of the queue and finds the new holder at the head of the queue. That holder is requesting the glock in shared mode. 7. finish_xmote calls do_xmote to transition the glock into shared mode, but the glock is already in shared mode and so do_xmote complains about that with: GLOCK_BUG_ON(gl, gl->gl_state == gl->gl_target); Instead, in finish_xmote, after moving the canceled holder to the tail of the queue, check if any new holders can be granted. Only call do_xmote to repeat the dlm request if the holder at the head of the queue is requesting the glock in a mode that is incompatible with the mode the glock is currently held in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Remove LM_FLAG_PRIORITY flag	Andreas Gruenbacher	4	-33/+7
	The last user of this flag was removed in commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: do_promote cleanup	Andreas Gruenbacher	1	-6/+6
	Change function do_promote to return true on success, and false otherwise. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs: Don't use GFP_NOFS in gfs2_unstuff_dinode	Andreas Gruenbacher	1	-1/+1
	Revert the rest of commit 220cca2a4f58 ("GFS2: Change truncate page allocation to be GFP_NOFS"): In gfs2_unstuff_dinode(), there is no need to carry out the page cache allocation under GFP_NOFS because inodes on the "regular" filesystem are never un-inlined under memory pressure, so switch back from find_or_create_page() to grab_cache_page() here as well. Inodes on the "metadata" filesystem can theoretically be un-inlined under memory pressure, but any page cache allocations in that context would happen in GFP_NOFS context because those inodes have inode->i_mapping->gfp_mask set to GFP_NOFS (see the previous patch). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: Use mapping->gfp_mask for metadata inodes	Andreas Gruenbacher	3	-9/+14
	Set mapping->gfp mask to GFP_NOFS for all metadata inodes so that allocating pages in the address space of those inodes won't call back into the filesystem. This allows to switch back from find_or_create_page() to grab_cache_page() in two places. Partially reverts commit 220cca2a4f58 ("GFS2: Change truncate page allocation to be GFP_NOFS"). Thanks to Dan Carpenter <dan.carpenter@linaro.org> for pointing out a Smatch static checker warning. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05	gfs2: increase usage of folio_next_index() helper	Minjie Du	1	-2/+1
	Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using the existing helper folio_next_index(). Signed-off-by: Minjie Du <duminjie@vivo.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-08-07	tpm/tpm_tis: Disable interrupts for Lenovo P620 devices	Jonathan McDowell	1	-0/+8
	The Lenovo ThinkStation P620 suffers from an irq storm issue like various other Lenovo machines, so add an entry for it to tpm_tis_dmi_table and force polling. It is worth noting that 481c2d14627d (tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs) does not seem to fix the problem on this machine, but setting 'tpm_tis.interrupts=0' on the kernel command line does. [jarkko@kernel.org: truncated the commit ID in the description to 12 characters] Cc: stable@vger.kernel.org # v6.4+ Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Signed-off-by: Jonathan McDowell <noodles@meta.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07	tpm: Disable RNG for all AMD fTPMs	Mario Limonciello	3	-66/+33
	The TPM RNG functionality is not necessary for entropy when the CPU already supports the RDRAND instruction. The TPM RNG functionality was previously disabled on a subset of AMD fTPM series, but reports continue to show problems on some systems causing stutter root caused to TPM RNG functionality. Expand disabling TPM RNG use for all AMD fTPMs whether they have versions that claim to have fixed or not. To accomplish this, move the detection into part of the TPM CRB registration and add a flag indicating that the TPM should opt-out of registration to hwrng. Cc: stable@vger.kernel.org # 6.1.y+ Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources") Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs") Reported-by: daniil.stas@posteo.net Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719 Reported-by: bitlord0xff@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07	sysctl: set variable key_sysctls storage-class-specifier to static	Tom Rix	1	-1/+1
	smatch reports security/keys/sysctl.c:12:18: warning: symbol 'key_sysctls' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07	tpm/tpm_tis: Disable interrupts for TUXEDO InfinityBook S 15/17 Gen7	Takashi Iwai	1	-0/+8
	TUXEDO InfinityBook S 15/17 Gen7 suffers from an IRQ problem on tpm_tis like a few other laptops. Add an entry for the workaround. Cc: stable@vger.kernel.org Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Link: https://bugzilla.suse.com/show_bug.cgi?id=1213645 Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07	gfs2: Don't use filemap_splice_read	Bob Peterson	1	-2/+2
	Starting with patch 2cb1e08985, gfs2 started using the new function filemap_splice_read rather than the old (and subsequently deleted) function generic_file_splice_read. filemap_splice_read works by taking references to a number of folios in the page cache and splicing those folios into a pipe. The folios are then read from the pipe and the folio references are dropped. This can take an arbitrary amount of time. We cannot allow that in gfs2 because those folio references will pin the inode glock to the node and prevent it from being demoted, which can lead to cluster-wide deadlocks. Instead, use copy_splice_read. (In addition, the old generic_file_splice_read called into ->read_iter, which called gfs2_file_read_iter, which took the inode glock during the operation. The new filemap_splice_read interface does not take the inode glock anymore. This is fixable, but it still wouldn't prevent cluster-wide deadlocks.) Fixes: 2cb1e08985e3 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-08-07	gfs2: Fix freeze consistency check in gfs2_trans_add_meta	Andreas Gruenbacher	1	-4/+10
	Function gfs2_trans_add_meta() checks for the SDF_FROZEN flag to make sure that no buffers are added to a transaction while the filesystem is frozen. With the recent freeze/thaw rework, the SDF_FROZEN flag is cleared after thaw_super() is called, which is sufficient for serializing freeze/thaw. However, other filesystem operations started after thaw_super() may now be calling gfs2_trans_add_meta() before the SDF_FROZEN flag is cleared, which will trigger the SDF_FROZEN check in gfs2_trans_add_meta(). Fix that by checking the s_writers.frozen state instead. In addition, make sure not to call gfs2_assert_withdraw() with the sd_log_lock spin lock held. Check for a withdrawn filesystem before checking for a frozen filesystem, and don't pin/add buffers to the current transaction in case of a failure in either case. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2023-08-07	x86/srso: Tie SBPB bit setting to microcode patch detection	Borislav Petkov (AMD)	2	-11/+15
	The SBPB bit in MSR_IA32_PRED_CMD is supported only after a microcode patch has been applied so set X86_FEATURE_SBPB only then. Otherwise, guests would attempt to set that bit and #GP on the MSR write. While at it, make SMT detection more robust as some guests - depending on how and what CPUID leafs their report - lead to cpu_smt_control getting set to CPU_SMT_NOT_SUPPORTED but SRSO_NO should be set for any guest incarnation where one simply cannot do SMT, for whatever reason. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-08-06	Linux 6.5-rc5	Linus Torvalds	1	-1/+1

2023-08-06	fs: rely on ->iterate_shared to determine f_pos locking	Christian Brauner	1	-1/+1
	Now that we removed ->iterate we don't need to check for either ->iterate or ->iterate_shared in file_needs_f_pos_lock(). Simply check for ->iterate_shared instead. This will tell us whether we need to unconditionally take the lock. Not just does it allow us to avoid checking f_inode's mode it also actually clearly shows that we're locking because of readdir. Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06	vfs: get rid of old '->iterate' directory operation	Linus Torvalds	13	-58/+95
	All users now just use '->iterate_shared()', which only takes the directory inode lock for reading. Filesystems that never got convered to shared mode now instead use a wrapper that drops the lock, re-takes it in write mode, calls the old function, and then downgrades the lock back to read mode. This way the VFS layer and other callers no longer need to care about filesystems that never got converted to the modern era. The filesystems that use the new wrapper are ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf. Honestly, several of them look like they really could just iterate their directories in shared mode and skip the wrapper entirely, but the point of this change is to not change semantics or fix filesystems that haven't been fixed in the last 7+ years, but to finally get rid of the dual iterators. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06	proc: fix missing conversion to 'iterate_shared'	Linus Torvalds	1	-1/+1
	I'm looking at the directory handling due to the discussion about f_pos locking (see commit 797964253d35: "file: reinstate f_pos locking optimization for regular files"), and wanting to clean that up. And one source of ugliness is how we were supposed to move filesystems over to the '->iterate_shared()' function that only takes the inode lock for reading many many years ago, but several filesystems still use the bad old '->iterate()' that takes the inode lock for exclusive access. See commit 6192269444eb ("introduce a parallel variant of ->iterate()") that also added some documentation stating Old method is only used if the new one is absent; eventually it will be removed. Switch while you still can; the old one won't stay. and that was back in April 2016. Here we are, many years later, and the old version is still clearly sadly alive and well. Now, some of those old style iterators are probably just because the filesystem may end up having per-inode mutable data that it uses for iterating a directory, but at least one case is just a mistake. Al switched over most filesystems to use '->iterate_shared()' back when it was introduced. In particular, the /proc filesystem was converted as one of the first ones in commit f50752eaa0b0 ("switch all procfs directories ->iterate_shared()"). But then later one new user of '->iterate()' was then re-introduced by commit 6d9c939dbe4d ("procfs: add smack subdir to attrs"). And that's clearly not what we wanted, since that new case just uses the same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions that other /proc pident directories use, and they are most definitely safe to use with the inode lock held shared. So just fix it. This still leaves a fair number of oddball filesystems using the old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf), but at least we don't have any remaining in the core filesystems. I'm going to add a wrapper function that just drops the read-lock and takes it as a write lock, so that we can clean up the core vfs layer and make all the ugly 'this filesystem needs exclusive inode locking' be just filesystem-internal warts. I just didn't want to make that conversion when we still had a core user left. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06	open: make RESOLVE_CACHED correctly test for O_TMPFILE	Aleksa Sarai	1	-1/+1
	O_TMPFILE is actually __O_TMPFILE\|O_DIRECTORY. This means that the old fast-path check for RESOLVE_CACHED would reject all users passing O_DIRECTORY with -EAGAIN, when in fact the intended test was to check for __O_TMPFILE. Cc: stable@vger.kernel.org # v5.12+ Fixes: 99668f618062 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-04	selftests/rseq: Fix build with undefined __weak	Mark Brown	2	-1/+5
	Commit 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") which is now in Linus' tree introduced uses of __weak but did nothing to ensure that a definition is provided for it resulting in build failures for the rseq tests: rseq.c:41:1: error: unknown type name '__weak' __weak ptrdiff_t __rseq_offset; ^ rseq.c:41:17: error: expected ';' after top level declarator __weak ptrdiff_t __rseq_offset; ^ ; rseq.c:42:1: error: unknown type name '__weak' __weak unsigned int __rseq_size; ^ rseq.c:43:1: error: unknown type name '__weak' __weak unsigned int __rseq_flags; Fix this by using the definition from tools/include compiler.h. Fixes: 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") Signed-off-by: Mark Brown <broonie@kernel.org> Message-Id: <20230804-kselftest-rseq-build-v1-1-015830b66aa9@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-04	file: reinstate f_pos locking optimization for regular files	Linus Torvalds	1	-1/+17
	In commit 20ea1e7d13c1 ("file: always lock position for FMODE_ATOMIC_POS") we ended up always taking the file pos lock, because pidfd_getfd() could get a reference to the file even when it didn't have an elevated file count due to threading of other sharing cases. But Mateusz Guzik reports that the extra locking is actually measurable, so let's re-introduce the optimization, and only force the locking for directory traversal. Directories need the lock for correctness reasons, while regular files only need it for "POSIX semantics". Since pidfd_getfd() is about debuggers etc special things that are _way_ outside of POSIX, we can relax the rules for that case. Reported-by: Mateusz Guzik <mjguzik@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/linux-fsdevel/20230803095311.ijpvhx3fyrbkasul@f/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-04	KVM: SEV: remove ghcb variable declarations	Paolo Bonzini	1	-18/+12
	To avoid possible time-of-check/time-of-use issues, the GHCB should almost never be accessed outside dump_ghcb, sev_es_sync_to_ghcb and sev_es_sync_from_ghcb. The only legitimate uses are to set the exitinfo fields and to find the address of the scratch area embedded in the ghcb. Accessing ghcb_usage also goes through svm->sev_es.ghcb in sev_es_validate_vmgexit(), but that is because anyway the value is not used. Removing a shortcut variable that contains the value of svm->sev_es.ghcb makes these cases a bit more verbose, but it limits the chance of someone reading the ghcb by mistake. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-04	KVM: SEV: only access GHCB fields once	Paolo Bonzini	1	-11/+14
	A KVM guest using SEV-ES or SEV-SNP with multiple vCPUs can trigger a double fetch race condition vulnerability and invoke the VMGEXIT handler recursively. sev_handle_vmgexit() maps the GHCB page using kvm_vcpu_map() and then fetches the exit code using ghcb_get_sw_exit_code(). Soon after, sev_es_validate_vmgexit() fetches the exit code again. Since the GHCB page is shared with the guest, the guest is able to quickly swap the values with another vCPU and hence bypass the validation. One vmexit code that can be rejected by sev_es_validate_vmgexit() is SVM_EXIT_VMGEXIT; if sev_handle_vmgexit() observes it in the second fetch, the call to svm_invoke_exit_handler() will invoke sev_handle_vmgexit() again recursively. To avoid the race, always fetch the GHCB data from the places where sev_es_sync_from_ghcb stores it. Exploiting recursions on linux kernel has been proven feasible in the past, but the impact is mitigated by stack guard pages (CONFIG_VMAP_STACK). Still, if an attacker manages to call the handler multiple times, they can theoretically trigger a stack overflow and cause a denial-of-service, or potentially guest-to-host escape in kernel configurations without stack guard pages. Note that winning the race reliably in every iteration is very tricky due to the very tight window of the fetches; depending on the compiler settings, they are often consecutive because of optimization and inlining. Tested by booting an SEV-ES RHEL9 guest. Fixes: CVE-2023-4155 Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT") Cc: stable@vger.kernel.org Reported-by: Andy Nguyen <theflow@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-04	KVM: SEV: snapshot the GHCB before accessing it	Paolo Bonzini	2	-34/+61
	Validation of the GHCB is susceptible to time-of-check/time-of-use vulnerabilities. To avoid them, we would like to always snapshot the fields that are read in sev_es_validate_vmgexit(), and not use the GHCB anymore after it returns. This means: - invoking sev_es_sync_from_ghcb() before any GHCB access, including before sev_es_validate_vmgexit() - snapshotting all fields including the valid bitmap and the sw_scratch field, which are currently not caching anywhere. The valid bitmap is the first thing to be copied out of the GHCB; then, further accesses will use the copy in svm->sev_es. Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-04	arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE	Mark Brown	1	-1/+2
	We have a function sve_sync_from_fpsimd_zeropad() which is used by the ptrace code to update the SVE state when the user writes to the the FPSIMD register set. Currently this checks that the task has SVE enabled but this will miss updates for tasks which have streaming SVE enabled if SVE has not been enabled for the thread, also do the conversion if the task has streaming SVE enabled. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-3-49df214bfb3e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-04	arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems	Mark Brown	1	-2/+2
	Currently we guard FPSIMD/SVE state conversions with a check for the system supporting SVE but SME only systems may need to sync streaming mode SVE state so add a check for SME support too. These functions are only used by the ptrace code. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-2-49df214bfb3e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-04	arm64/ptrace: Don't enable SVE when setting streaming SVE	Mark Brown	1	-3/+5
	Systems which implement SME without also implementing SVE are architecturally valid but were not initially supported by the kernel, unfortunately we missed one issue in the ptrace code. The SVE register setting code is shared between SVE and streaming mode SVE. When we set full SVE register state we currently enable TIF_SVE unconditionally, in the case where streaming SVE is being configured on a system that supports vanilla SVE this is not an issue since we always initialise enough state for both vector lengths but on a system which only support SME it will result in us attempting to restore the SVE vector length after having set streaming SVE registers. Fix this by making the enabling of SVE conditional on setting SVE vector state. If we set streaming SVE state and SVE was not already enabled this will result in a SVE access trap on next use of normal SVE, this will cause us to flush our register state but this is fine since the only way to trigger a SVE access trap would be to exit streaming mode which will cause the in register state to be flushed anyway. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-1-49df214bfb3e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-04	rust: fix bindgen build error with UBSAN_BOUNDS_STRICT	Andrea Righi	1	-1/+1
	With commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC") if CONFIG_UBSAN is enabled and gcc supports -fsanitize=bounds-strict, we can trigger the following build error due to bindgen lacking support for this additional build option: BINDGEN rust/bindings/bindings_generated.rs error: unsupported argument 'bounds-strict' to option '-fsanitize=' Fix by adding -fsanitize=bounds-strict to the list of skipped gcc flags for bindgen. Fixes: 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC") Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Link: https://lore.kernel.org/r/20230711071914.133946-1-andrea.righi@canonical.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-08-04	rust: delete `ForeignOwnable::borrow_mut`	Alice Ryhl	2	-22/+3
	We discovered that the current design of `borrow_mut` is problematic. This patch removes it until a better solution can be found. Specifically, the current design gives you access to a `&mut T`, which lets you change where the `ForeignOwnable` points (e.g., with `core::mem::swap`). No upcoming user of this API intended to make that possible, making all of them unsound. Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Fixes: 0fc4424d24a2 ("rust: types: introduce `ForeignOwnable`") Link: https://lore.kernel.org/r/20230706094615.3080784-1-aliceryhl@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-08-04	rust: allocator: Prevent mis-aligned allocation	Boqun Feng	2	-15/+60
	Currently the rust allocator simply passes the size of the type Layout to krealloc(), and in theory the alignment requirement from the type Layout may be larger than the guarantee provided by SLAB, which means the allocated object is mis-aligned. Fix this by adjusting the allocation size to the nearest power of two, which SLAB always guarantees a size-aligned allocation. And because Rust guarantees that the original size must be a multiple of alignment and the alignment must be a power of two, then the alignment requirement is satisfied. Suggested-by: Vlastimil Babka <vbabka@suse.cz> Co-developed-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk> Signed-off-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Cc: stable@vger.kernel.org # v6.1+ Acked-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 247b365dc8dc ("rust: add `kernel` crate") Link: https://github.com/Rust-for-Linux/linux/issues/974 Link: https://lore.kernel.org/r/20230730012905.643822-2-boqun.feng@gmail.com [ Applied rewording of comment as discussed in the mailing list. ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-08-03	MAINTAINERS: update TUN/TAP maintainers	Jakub Kicinski	1	-1/+4
	Willem and Jason have agreed to take over the maintainer duties for TUN/TAP, thank you! There's an existing entry for TUN/TAP which only covers the user mode Linux implementation. Since we haven't heard from Maxim on the list for almost a decade, extend that entry and take it over, rather than adding a new one. Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20230802182843.4193099-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	test/vsock: remove vsock_perf executable on `make clean`	Stefano Garzarella	1	-1/+1
	We forgot to add vsock_perf to the rm command in the `clean` target, so now we have a left over after `make clean` in tools/testing/vsock. Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility") Cc: AVKrasnov@sberdevices.ru Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Link: https://lore.kernel.org/r/20230803085454.30897-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen	Eric Dumazet	1	-4/+5
	Whenever tcpm_new() reclaims an old entry, tcpm_suck_dst() would overwrite data that could be read from tcp_fastopen_cache_get() or tcp_metrics_fill_info(). We need to acquire fastopen_seqlock to maintain consistency. For newly allocated objects, tcpm_new() can switch to kzalloc() to avoid an extra fastopen_seqlock acquisition. Fixes: 1fe4c481ba63 ("net-tcp: Fast Open client - cookie cache") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230802131500.1478140-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	tcp_metrics: annotate data-races around tm->tcpm_net	Eric Dumazet	1	-4/+7
	tm->tcpm_net can be read or written locklessly. Instead of changing write_pnet() and read_pnet() and potentially hurt performance, add the needed READ_ONCE()/WRITE_ONCE() in tm_net() and tcpm_new(). Fixes: 849e8a0ca8d5 ("tcp_metrics: Add a field tcpm_net and verify it matches on lookup") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230802131500.1478140-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	tcp_metrics: annotate data-races around tm->tcpm_vals[]	Eric Dumazet	1	-9/+14
	tm->tcpm_vals[] values can be read or written locklessly. Add needed READ_ONCE()/WRITE_ONCE() to document this, and force use of tcp_metric_get() and tcp_metric_set() Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	tcp_metrics: annotate data-races around tm->tcpm_lock	Eric Dumazet	1	-2/+4
	tm->tcpm_lock can be read or written locklessly. Add needed READ_ONCE()/WRITE_ONCE() to document this. Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230802131500.1478140-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	tcp_metrics: annotate data-races around tm->tcpm_stamp	Eric Dumazet	1	-6/+13
	tm->tcpm_stamp can be read or written locklessly. Add needed READ_ONCE()/WRITE_ONCE() to document this. Also constify tcpm_check_stamp() dst argument. Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230802131500.1478140-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	tcp_metrics: fix addr_same() helper	Eric Dumazet	1	-1/+1
	Because v4 and v6 families use separate inetpeer trees (respectively net->ipv4.peers and net->ipv6.peers), inetpeer_addr_cmp(a, b) assumes a & b share the same family. tcp_metrics use a common hash table, where entries can have different families. We must therefore make sure to not call inetpeer_addr_cmp() if the families do not match. Fixes: d39d14ffa24c ("net: Add helper function to compare inetpeer addresses") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230802131500.1478140-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	prestera: fix fallback to previous version on same major version	Jonas Gorski	1	-1/+2
	When both supported and previous version have the same major version, and the firmwares are missing, the driver ends in a loop requesting the same (previous) version over and over again: [ 76.327413] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version [ 76.339802] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.352162] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.364502] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.376848] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.389183] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.401522] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.413860] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.426199] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version ... Fix this by inverting the check to that we aren't yet at the previous version, and also check the minor version. This also catches the case where both versions are the same, as it was after commit bb5dbf2cc64d ("net: marvell: prestera: add firmware v4.0 support"). With this fix applied: [ 88.499622] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version [ 88.511995] Prestera DX 0000:01:00.0: failed to request previous firmware: mrvl/prestera/mvsw_prestera_fw-v4.0.img [ 88.522403] Prestera DX: probe of 0000:01:00.0 failed with error -2 Fixes: 47f26018a414 ("net: marvell: prestera: try to load previous fw version") Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de> Acked-by: Elad Nachman <enachman@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Taras Chornyi <taras.chornyi@plvision.eu> Link: https://lore.kernel.org/r/20230802092357.163944-1-jonas.gorski@bisdn.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03	arm64/ptrace: Flush FP state when setting ZT0	Mark Brown	1	-0/+2
	When setting ZT0 via ptrace we do not currently force a reload of the floating point register state from memory, do that to ensure that the newly set value gets loaded into the registers on next task execution. The function was templated off the function for FPSIMD which due to our providing the option of embedding a FPSIMD regset within the SVE regset does not directly include the flush. Fixes: f90b529bcbe5 ("arm64/sme: Implement ZT0 ptrace support") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-zt0-flush-v1-1-72e854eaf96e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-03	arm64/fpsimd: Clear SME state in the target task when setting the VL	Mark Brown	1	-1/+1
	When setting SME vector lengths we clear TIF_SME to reenable SME traps, doing a reallocation of the backing storage on next use. We do this using clear_thread_flag() which operates on the current thread, meaning that when setting the vector length via ptrace we may both not force traps for the target task and force a spurious flush of any SME state that the tracing task may have. Clear the flag in the target task. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Reported-by: David Spickett <David.Spickett@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-tif-sme-v1-1-88312fd6fbfd@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-03	parisc: unaligned: Add required spaces after ','	hanyu001@208suo.com	1	-9/+9
	Fix checkpatch warnings: unaligned.c:475: ERROR: space required after that ',' Signed-off-by: Yu Han <hanyu001@208suo.com> Signed-off-by: Helge Deller <deller@gmx.de>