IMA measures contents of a given file/buffer/critical-data record,
and properly re-measures it on change. However, IMA does not measure
the duplicate value for a given record, since TPM extend is a very
expensive operation. For example, if the record changes from value
'v#1' to 'v#2', and then back to 'v#1', IMA will not measure and log
the last change to 'v#1', since the hash of 'v#1' for that record is
already present in the IMA htable. This limits the ability of an
external attestation service to accurately determine the current state
of the system. The service would incorrectly conclude that the latest
value of the given record on the system is 'v#2', and act accordingly.
Define and use a new Kconfig option IMA_DISABLE_HTABLE to permit
duplicate records in the IMA measurement list.
In addition to the duplicate measurement records described above,
other duplicate file measurement records may be included in the log,
when CONFIG_IMA_DISABLE_HTABLE is enabled. For example, when:
- i_version is not enabled,
- i_generation changed,
- the same file is present on different filesystems,
- an inode is evicted from the dcache.
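A sketch of how the new option gates the duplicate check in
ima_add_template_entry() (surrounding error handling omitted):
  if (!violation && !IS_ENABLED(CONFIG_IMA_DISABLE_HTABLE)) {
          if (ima_lookup_digest_entry(digest, entry->pcr)) {
                  audit_cause = "hash_exists";
                  result = -EEXIST;
                  goto out;
          }
  }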
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
[zohar@linux.ibm.com: updated list of duplicate measurement records]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The function prototype for ima_add_kexec_buffer() is present
in 'linux/ima.h', but this header file is not included in
ima_kexec.c, where the function is implemented. This results
in the following compiler warning when "-Wmissing-prototypes" flag
is turned on:
security/integrity/ima/ima_kexec.c:81:6: warning: no previous prototype
for function 'ima_add_kexec_buffer' [-Wmissing-prototypes]
Include the header file 'linux/ima.h' in ima_kexec.c to fix
the compiler warning.
Fixes: dce92f6b11c3 ("arm64: Enable passing IMA log to next kernel on kexec")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Differentiate an invalid EVM portable signature failure
from other EVM HMAC/signature failures.
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
In preparation to enable -Wimplicit-fallthrough for Clang, fix a
fall-through warning by explicitly adding a break statement instead
of just letting the code fall through to the next case.
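An illustrative sketch only (the identifiers below are made up, not the
patched code):
  switch (event) {
  case EVENT_A:
          handle_a();
          break;  /* explicit break instead of an implicit fall-through */
  case EVENT_B:
          handle_b();
          break;
  }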
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch fixes the sparse warning:
sparse: warning: Using plain integer as NULL pointer
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch fixes the sparse warning for ima_post_key_create_or_update() by
adding the header file that defines the prototype (linux/ima.h).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The endianness of a variable written to the measurement list cannot be
determined at compile time, as it depends on the value of the
ima_canonical_fmt global variable (set through a kernel option with the
same name if the machine is big endian).
If ima_canonical_fmt is false, the endianness of a variable is the same as
the machine; if ima_canonical_fmt is true, the endianness is little endian.
The warning arises due to this type of instruction:
var = cpu_to_leXX(var)
which tries to assign a value in little endian to a variable with native
endianness (little or big endian).
Given that the variables set with this instruction are not used in any
operation but just written to a buffer, it is safe to force the type of the
value being set to be the same as the type of the variable with:
var = (__force <var type>)cpu_to_leXX(var)
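A hedged example of the pattern for a 32-bit value written to the binary
measurement list (e, m and ima_putc() stand in for the surrounding
measurement-list code):
  u32 pcr = e->pcr;
  if (ima_canonical_fmt)
          pcr = (__force u32)cpu_to_le32(pcr);
  ima_putc(m, &pcr, sizeof(pcr));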
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The code expects that the values being parsed from a buffer when the
ima_canonical_fmt global variable is true are in little endian. Thus, this
patch sets the casting types accordingly.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch fixes the warning:
Documentation/security/IMA-templates.rst:81: WARNING: Inline
substitution_reference start-string without end-string.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch prevents evm_write_xattrs() from returning an error when audit
is not enabled. The ab variable can be NULL and still be passed to the
audit_log_*() functions, as those functions do nothing when the audit
buffer is NULL.
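A minimal sketch of the pattern (not the complete function):
  ab = audit_log_start(audit_context(), GFP_KERNEL,
                       AUDIT_INTEGRITY_EVM_XATTR);
  /* ab may be NULL when audit is not enabled; the audit_log_*()
   * helpers accept a NULL buffer and simply return
   */
  audit_log_format(ab, "xattr=%s", xattr->name);
  audit_log_end(ab);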
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
With the recent introduction of the evmsig template field, remote verifiers
can obtain the EVM portable signature instead of the IMA signature, to
verify file metadata.
After introducing the new fields to include file metadata in the
measurement list, this patch finally defines the evm-sig template, whose
format is:
d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode
xattrnames, xattrlengths and xattrvalues are populated only from defined
EVM protected xattrs, i.e. the ones that EVM considers to verify the
portable signature. xattrnames and xattrlengths are populated only if the
xattr is present.
xattrnames and xattrlengths are not necessary for verifying the EVM
portable signature, but they are included for completeness of information,
if a remote verifier wants to infer more from file metadata.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch defines the new template fields xattrnames, xattrlengths and
xattrvalues, which contain respectively a list of xattr names (strings,
separated by |), lengths (u32, hex) and values (hex). If an xattr is not
present, the name and length are not displayed in the measurement list.
Reported-by: kernel test robot <lkp@intel.com> (Missing prototype def)
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Currently, the evm_config_default_xattrnames array contains xattr names
only related to LSMs which are enabled in the kernel configuration.
However, EVM portable signatures do not depend on local information and a
vendor might include in the signature calculation xattrs that are not
enabled in the target platform.
Just including all xattr names in evm_config_default_xattrnames is not a
safe approach, because a target system might have already calculated
signatures or HMACs based only on the enabled xattrs. After applying this
patch, EVM would verify those signatures and HMACs with all xattrs instead.
The non-enabled ones, which could possibly exist, would cause a
verification error.
Thus, this patch adds a new field named enabled to the xattr_list
structure, which is set to true if the LSM associated with a given xattr name
is enabled in the kernel configuration. The non-enabled xattrs are taken
into account only in evm_calc_hmac_or_hash(), if the passed security.evm
type is EVM_XATTR_PORTABLE_DIGSIG.
The new function evm_protected_xattr_if_enabled() has been defined so that
IMA can include all protected xattrs and not only the enabled ones in the
measurement list, if the new template fields xattrnames, xattrlengths or
xattrvalues have been included in the template format.
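Based on the description above, the extended structure looks roughly like:
  struct xattr_list {
          struct list_head list;
          char *name;
          bool enabled;   /* LSM for this xattr is built into the kernel */
  };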
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch defines the new template field imode, which includes the
inode mode. It can be used by a remote verifier to verify the EVM portable
signature, if it was included with the template fields sig or evmsig.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch defines the new template fields iuid and igid, which include
respectively the inode UID and GID. For idmapped mounts, the original
UID and GID are still provided.
These fields can be used to verify the EVM portable signature, if it was
included with the template fields sig or evmsig.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch introduces the new function ima_show_template_uint(). This can
be used for showing integers of different sizes in ASCII format. The
function ima_show_template_data_ascii() automatically determines how to
print a stored integer by checking the integer size.
If integers have been written in canonical format,
ima_show_template_data_ascii() calls the appropriate leXX_to_cpu() function
to correctly display the value.
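A sketch of the size dispatch inside ima_show_template_data_ascii()
(abbreviated to two sizes):
  switch (field_data->len) {
  case sizeof(u16):
          if (ima_canonical_fmt)
                  seq_printf(m, "%u", le16_to_cpu(*(__le16 *)buf_ptr));
          else
                  seq_printf(m, "%u", *(u16 *)buf_ptr);
          break;
  case sizeof(u32):
          if (ima_canonical_fmt)
                  seq_printf(m, "%u", le32_to_cpu(*(__le32 *)buf_ptr));
          else
                  seq_printf(m, "%u", *(u32 *)buf_ptr);
          break;
  }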
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Files might come from a remote source and might have xattrs, including
security.ima. It should not be IMA's task to decide whether security.ima
should be kept or not. This patch removes the xattr removal
(removexattr) call from ima_inode_post_setattr().
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
With the patch to accept EVM portable signatures when the
appraise_type=imasig requirement is specified in the policy, appraisal can
be successfully done even if the file does not have an IMA signature.
However, remote attestation would not see that a different signature type
was used, as only IMA signatures can be included in the measurement list.
This patch solves the issue by introducing the new template field 'evmsig'
to show EVM portable signatures and by including its value in the existing
field 'sig' if the IMA signature is not found.
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
System administrators can require that all accessed files have a signature
by specifying appraise_type=imasig in a policy rule.
Currently, IMA signatures satisfy this requirement. Appended signatures may
also satisfy this requirement, but are not applicable as IMA signatures.
IMA/appended signatures ensure data source authentication for file content
and prevent any change. EVM signatures instead ensure data source
authentication for file metadata. Given that the digest or signature of the
file content must be included in the metadata, EVM signatures provide the
same file data guarantees as IMA signatures, as well as providing file
metadata guarantees.
This patch lets systems protected with EVM signatures pass appraisal
verification if the appraise_type=imasig requirement is specified in the
policy. This facilitates deployment in the scenarios where only EVM
signatures are available.
The patch makes the following changes:
file xattr types:
security.ima: IMA_XATTR_DIGEST/IMA_XATTR_DIGEST_NG
security.evm: EVM_XATTR_PORTABLE_DIGSIG
execve(), mmap(), open() behavior (with appraise_type=imasig):
before: denied (file without IMA signature, imasig requirement not met)
after: allowed (file with EVM portable signature, imasig requirement met)
open(O_WRONLY) behavior (without appraise_type=imasig):
before: allowed (file without IMA signature, not immutable)
after: denied (file with EVM portable signature, immutable)
In addition, similarly to IMA signatures, this patch temporarily allows
new files without or with incomplete metadata to be opened so that content
can be written.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
This patch deprecates the usage of EVM_ALLOW_METADATA_WRITES, as it is no
longer necessary. All the issues that prevent the usage of EVM portable
signatures just with a public key loaded have been solved.
This flag will remain available for a short time to ensure that users are
able to use EVM without it.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
With the patch to allow xattr/attr operations if a portable signature
verification fails, cp and tar can copy all xattrs/attrs so that at the
end of the process verification succeeds.
However, it might happen that the xattrs/attrs are already set to the
correct value (taken at signing time) and signature verification succeeds
before the copy has completed. For example, an archive might contain files
owned by root and the archive is extracted by root.
Then, since portable signatures are immutable, all subsequent operations
fail (e.g. fchown()), even if the operation is legitimate (does not alter
the current value).
This patch avoids this problem by reporting successful operation to user
space when that operation does not alter the current value of xattrs/attrs.
With this patch, the one that introduces evm_hmac_disabled() and the one
that allows a metadata operation on the INTEGRITY_FAIL_IMMUTABLE error, EVM
portable signatures can be used without disabling metadata verification
(by setting EVM_ALLOW_METADATA_WRITES). Due to keeping metadata
verification enabled, altering immutable metadata protected with a portable
signature that was successfully verified will be denied (existing
behavior).
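For the setattr side, a hedged sketch of the no-op detection (the helper
name is an assumption): return 0 when nothing would change, non-zero
otherwise:
  static int evm_attr_change(struct dentry *dentry, struct iattr *attr)
  {
          struct inode *inode = d_backing_inode(dentry);
          unsigned int ia_valid = attr->ia_valid;

          if ((ia_valid & ATTR_UID) && !uid_eq(attr->ia_uid, inode->i_uid))
                  return 1;
          if ((ia_valid & ATTR_GID) && !gid_eq(attr->ia_gid, inode->i_gid))
                  return 1;
          if ((ia_valid & ATTR_MODE) && inode->i_mode != attr->ia_mode)
                  return 1;

          return 0;       /* nothing would change */
  }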
Reported-by: kernel test robot <lkp@intel.com> [implicit declaration of function]
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
In preparation for 'evm: Allow setxattr() and setattr() for unmodified
metadata', this patch passes mnt_userns to the inode set/remove xattr hooks
so that the GID of the inode on an idmapped mount is correctly determined
by posix_acl_update_mode().
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
If files with portable signatures are copied from one location to another
or are extracted from an archive, verification can temporarily fail until
all xattrs/attrs are set in the destination. Only portable signatures may
be moved or copied from one file to another, as they don't depend on
system-specific information such as the inode generation. Instead,
portable signatures must include security.ima.
Unlike other security.evm types, EVM portable signatures are also
immutable. Thus, it wouldn't be a problem to allow xattr/attr operations
when verification fails, as portable signatures will never be replaced with
the HMAC on possibly corrupted xattrs/attrs.
This patch first introduces a new integrity status called
INTEGRITY_FAIL_IMMUTABLE, which allows callers of
evm_verify_current_integrity() to detect that a portable signature didn't
pass verification and then adds an exception in evm_protect_xattr() and
evm_inode_setattr() for this status and returns 0 instead of -EPERM.
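A sketch of the new exception (other statuses abbreviated):
  evm_status = evm_verify_current_integrity(dentry);
  /* a failed immutable (portable) signature cannot be made worse by
   * metadata writes, so allow the operation
   */
  if (evm_status == INTEGRITY_FAIL_IMMUTABLE)
          return 0;
  if (evm_status != INTEGRITY_PASS)
          return -EPERM;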
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
When a file is being created, LSMs can set the initial label with the
inode_init_security hook. If no HMAC key is loaded, the new file will have
LSM xattrs but not the HMAC. It is also possible that the file remains
without protected xattrs after creation if no active LSM provided it, or
because the filesystem does not support them.
Unfortunately, EVM will deny any further metadata operation on new files,
as evm_protect_xattr() will return the INTEGRITY_NOLABEL error if protected
xattrs exist without security.evm, INTEGRITY_NOXATTRS if no protected
xattrs exist or INTEGRITY_UNKNOWN if xattrs are not supported. This would
limit the usability of EVM when only a public key is loaded, as commands
such as cp or tar with the option to preserve xattrs won't work.
This patch introduces the evm_hmac_disabled() function to determine whether
or not it is safe to ignore verification errors, based on the ability of
EVM to calculate HMACs. If the HMAC key is not loaded, and it cannot be
loaded in the future due to the EVM_SETUP_COMPLETE initialization flag,
allowing an operation despite the attrs/xattrs being found invalid will not
make them valid.
Since the post hooks can be executed even when the HMAC key is not loaded,
this patch also ensures that the EVM_INIT_HMAC initialization flag is set
before the post hooks call evm_update_evmxattr().
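A sketch of the helper, following the rule described above:
  static bool evm_hmac_disabled(void)
  {
          /* with an HMAC key loaded, errors must not be ignored */
          if (evm_initialized & EVM_INIT_HMAC)
                  return false;

          /* an HMAC key may still be loaded later */
          if (!(evm_initialized & EVM_SETUP_COMPLETE))
                  return false;

          return true;
  }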
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Suggested-by: Mimi Zohar <zohar@linux.ibm.com> (for ensuring EVM_INIT_HMAC is set)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
When EVM_ALLOW_METADATA_WRITES is set, EVM allows any operation on
metadata. Its main purpose is to allow users to freely set metadata when it
is protected by a portable signature, until an HMAC key is loaded.
However, callers of evm_verifyxattr() are not notified about metadata
changes and continue to rely on the last status returned by the function.
For example IMA, since it caches the appraisal result, will not call
evm_verifyxattr() again until the appraisal flags are cleared, and will grant
access to the file even if there was a metadata operation that made the
portable signature invalid.
This patch introduces evm_revalidate_status(), which callers of
evm_verifyxattr() can use in their xattr hooks to determine whether
re-validation is necessary and to do the proper actions. IMA calls it in
its xattr hooks to reset the appraisal flags, so that the EVM status is
re-evaluated after a metadata operation.
Lastly, this patch also adds a call to evm_reset_status() in
evm_inode_post_setattr() to invalidate the cached EVM status after a
setattr operation.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
EVM_ALLOW_METADATA_WRITES is an EVM initialization flag that can be set to
temporarily disable metadata verification until all xattrs/attrs necessary
to verify an EVM portable signature are copied to the file. This flag is
cleared when EVM is initialized with an HMAC key, to prevent the HMAC from
being calculated on unverified xattrs/attrs.
Currently EVM unnecessarily denies setting this flag if EVM is initialized
with a public key, which is not a concern, as a public key cannot be used
to trust xattr/attr updates. This patch removes this limitation.
Fixes: ae1ba1676b88e ("EVM: Allow userland to permit modification of EVM-protected metadata")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: stable@vger.kernel.org # 4.16.x
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The public builtin keys do not need to be appraised by IMA as the
restriction on the IMA/EVM trusted keyrings ensures that a key can be
loaded only if it is signed with a key on the builtin or secondary
keyrings.
However, when evm_load_x509() is called, appraisal is already enabled and
a valid IMA signature must be added to the EVM key to pass verification.
Since the restriction is applied on both IMA and EVM trusted keyrings, it
is safe to disable appraisal also when the EVM key is loaded. This patch
calls evm_load_x509() inside ima_load_x509() if CONFIG_IMA_LOAD_X509 is
enabled, which crosses the normal IMA and EVM boundary.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
evm_inode_init_security() requires an HMAC key to calculate the HMAC on
initial xattrs provided by LSMs. However, it checks generically whether a
key has been loaded, including also public keys, which is not correct as
public keys are not suitable to calculate the HMAC.
Originally, support for signature verification was introduced to verify a
possibly immutable initial ram disk, when no new files are created, and to
switch to HMAC for the root filesystem. By that time, an HMAC key should
have been loaded and usable to calculate HMACs for new files.
More recently support for requiring an HMAC key was removed from the
kernel, so that signature verification can be used alone. Since this is a
legitimate use case, evm_inode_init_security() should not return an error
when no HMAC key has been loaded.
This patch fixes this problem by replacing the evm_key_loaded() check with
a check of the EVM_INIT_HMAC flag in evm_initialized.
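A sketch of the changed check:
  /* before: if (!evm_key_loaded() || ...) -- also true for public keys */
  if (!(evm_initialized & EVM_INIT_HMAC) ||
      !evm_protected_xattr(lsm_xattr->name))
          return 0;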
Fixes: 26ddabfe96b ("evm: enable EVM when X509 certificate is loaded")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: stable@vger.kernel.org # 4.5.x
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
EVM_SETUP_COMPLETE is defined as 0x80000000, which is larger than INT_MAX.
The "-fno-strict-overflow" compiler option properly prevents signaling
EVM that the EVM policy setup is complete. Define and read an unsigned
int instead.
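A sketch of the change (variable and parsing-call names are assumptions):
  unsigned int i;
  int ret;

  /* with a signed int, kstrtoint() rejects 0x80000000 (> INT_MAX) */
  ret = kstrtouint(buf, 0, &i);
  if (ret)
          return ret;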
Fixes: f00d79750712 ("EVM: Allow userspace to signal an RSA key has been loaded")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
syzbot is reporting an OOB write at vga16fb_imageblit() [1], because
resize_screen(), called from ioctl(VT_RESIZE), returns 0 without checking
whether the requested rows/columns fit the amount of memory reserved for
the graphical screen when the current mode is KD_GRAPHICS.
----------
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/kd.h>
#include <linux/vt.h>
int main(int argc, char *argv[])
{
        const int fd = open("/dev/char/4:1", O_RDWR);
        struct vt_sizes vt = { 0x4100, 2 };

        ioctl(fd, KDSETMODE, KD_GRAPHICS);
        ioctl(fd, VT_RESIZE, &vt);
        ioctl(fd, KDSETMODE, KD_TEXT);
        return 0;
}
----------
Allow framebuffer drivers to return -EINVAL, by moving the vc->vc_mode !=
KD_GRAPHICS check from resize_screen() to fbcon_resize().
Link: https://syzkaller.appspot.com/bug?extid=1f29e126cf461c4de3b3 [1]
Reported-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
iomap_max_page_shift is expected to contain a page shift, so it can't be a
'bool'; it has to be an 'unsigned int'.
Also fix the default values: P4D_SHIFT is the default when huge iomap is
allowed. However, on some architectures (e.g. powerpc book3s/64), P4D_SHIFT
is not a constant, so it can't be used to initialise a static variable.
So, initialise iomap_max_page_shift with the maximum shift supported by
the architecture; it is gated by P4D_SHIFT in vmap_try_huge_p4d() anyway.
Link: https://lkml.kernel.org/r/ad2d366015794a9f21320dcbdd0a8eb98979e9df.1620898113.git.christophe.leroy@csgroup.eu
Fixes: bbc180a5adb0 ("mm: HUGE_VMAP arch support cleanup")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When I added CONFIG_MODPROBE_PATH, I neglected to update Documentation/.
It's still true that this defaults to /sbin/modprobe, but now via a level
of indirection. So document that the kernel might have been built with
something other than /sbin/modprobe as the initial value.
Link: https://lkml.kernel.org/r/20210420125324.1246826-1-linux@rasmusvillemoes.dk
Fixes: 17652f4240f7a ("modules: add CONFIG_MODPROBE_PATH")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I believe there are some issues introduced by commit 31651c607151
("hfsplus: avoid deadlock on file truncation").
HFS+ has extent records which always contain 8 extents. In case the
first extent record in the catalog file gets full, new ones are allocated
from the extents overflow file.
In case a shrinking truncate happens in the middle of an extent record
located in the extents overflow file, the logic in hfsplus_file_truncate()
was changed so that the call to hfs_brec_remove() is not guarded any more.
The right action would be to just free the extents that exceed the new
size inside the extent record by calling hfsplus_free_extents(), and then
check if the whole extent record should be removed. However, since the
guard (blk_cnt > start) is now after the call to hfs_brec_remove(), this
has the unfortunate effect that the last matching extent record is
removed unconditionally.
To reproduce this issue, create a file which has at least 10 extents, and
then perform a shrinking truncate into the middle of the last extent
record, so that the number of remaining extents is not under or divisible
by 8. This causes the last extent record (8 extents) to be removed totally
instead of truncated into the middle of it. Thus this causes corruption
and lost data.
The fix is simply to check whether the new truncated end is below the
start of this extent record, making it safe to remove the full extent
record. However, the call to hfs_brec_remove() can't be moved to its
previous place, since we're dropping ->tree_lock; that could cause a race
condition and the cached info being invalidated, possibly corrupting the
node data.
Another issue is related to this one. When entering the (blk_cnt > start)
block we are not holding ->tree_lock. We break out of the loop without
holding the lock, but hfs_find_exit() does unlock it. It is not certain
whether someone else can take the lock under our feet, but it can cause
hard-to-debug errors and premature unlocking. Even if there is no real
risk of it, the locking should still always be kept in balance. Thus,
take the lock now just before the check.
Link: https://lkml.kernel.org/r/20210429165139.3082828-1-jouni.roivas@tuxera.com
Fixes: 31651c607151f ("hfsplus: avoid deadlock on file truncation")
Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A readahead request will not allocate more memory than can be represented
by a size_t, even on systems that have HIGHMEM available. Change the
length functions from returning an loff_t to a size_t.
Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org
Fixes: 32c0a6bcaa1f57 ("btrfs: add and use readahead_batch_length")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
These tests deliberately access these arrays out of bounds, which will
cause the dynamic local bounds checks inserted by
CONFIG_UBSAN_LOCAL_BOUNDS to fail and panic the kernel. To avoid this
problem, access the arrays via volatile pointers, which will prevent the
compiler from being able to determine the array bounds.
These accesses use volatile pointers to char (char *volatile) rather than
the more conventional pointers to volatile char (volatile char *) because
we want to prevent the compiler from making inferences about the pointer
itself (i.e. its array bounds), not the data that it refers to.
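A hedged sketch of the pattern (array name and sizes are illustrative):
  static char global_array[10];

  static void oob_write_test(void)
  {
          /* "char *volatile": the pointer itself is volatile, so the
           * compiler cannot infer the bounds of the array behind it
           */
          char *volatile array = global_array;

          array[10] = 'x';        /* deliberate out-of-bounds write */
  }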
Link: https://lkml.kernel.org/r/20210507025915.1464056-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I90b1713fbfa1bf68ff895aef099ea77b98a7c3b9
Signed-off-by: Peter Collingbourne <pcc@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: George Popescu <georgepope@android.com>
Cc: Elena Petrova <lenaptr@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
32-bit architectures which expect 8-byte alignment for 8-byte integers and
need 64-bit DMA addresses (arm, mips, ppc) had their struct page
inadvertently expanded in 2019. When the dma_addr_t was added, it forced
the alignment of the union to 8 bytes, which inserted a 4 byte gap between
'flags' and the union.
Fix this by storing the dma_addr_t in one or two adjacent unsigned longs.
This restores the alignment to that of an unsigned long. We always
store the low bits in the first word to prevent the PageTail bit from
being inadvertently set on a big endian platform. If that happened,
get_user_pages_fast() racing against a page which was freed and
reallocated to the page_pool could dereference a bogus compound_head(),
which would be hard to trace back to this cause.
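A sketch of the accessor reassembling the address, low word first (the
double shift by 16 keeps the build legal when dma_addr_t is no wider than
unsigned long):
  static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
  {
          dma_addr_t ret = page->dma_addr[0];

          if (sizeof(dma_addr_t) > sizeof(unsigned long))
                  ret |= (dma_addr_t)page->dma_addr[1] << 16 << 16;
          return ret;
  }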
Link: https://lkml.kernel.org/r/20210510153211.1504886-1-willy@infradead.org
Fixes: c25fff7171be ("mm: add dma_addr_t to struct page")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Matteo Croce <mcroce@linux.microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit 3e96b6a2e9ad929a3230a22f4d64a74671a0720b. General
Protection Fault in rmap_walk_ksm() under memory pressure:
remove_rmap_item_from_tree() needs to take page lock, of course.
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2105092253500.1127@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Consider the following sequence of events:
1. Userspace issues a UFFD ioctl, which ends up calling into
shmem_mfill_atomic_pte(). We successfully account the blocks, we
shmem_alloc_page(), but then the copy_from_user() fails. We return
-ENOENT. We don't release the page we allocated.
2. Our caller detects this error code, tries the copy_from_user() after
dropping the mmap_lock, and retries, calling back into
shmem_mfill_atomic_pte().
3. Meanwhile, let's say another process filled up the tmpfs being used.
4. So shmem_mfill_atomic_pte() fails to account blocks this time, and
immediately returns - without releasing the page.
This triggers a BUG_ON in our caller, which asserts that the page
should always be consumed, unless -ENOENT is returned.
To fix this, detect if we have such a "dangling" page when accounting
fails, and if so, release it before returning.
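A sketch of the fix in shmem_mfill_atomic_pte() (error path abbreviated):
  if (!shmem_inode_acct_block(inode, 1)) {
          /* a retry after -ENOENT may arrive with the page allocated
           * on the previous round; release it to keep the caller's
           * BUG_ON happy
           */
          if (unlikely(*pagep)) {
                  put_page(*pagep);
                  *pagep = NULL;
          }
          return -ENOMEM;
  }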
Link: https://lkml.kernel.org/r/20210428230858.348400-1-axelrasmussen@google.com
Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reported-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Syzbot has reported a "divide error" which has been identified as being
caused by a corrupted file_size value within the file inode. This value
has been corrupted to a much larger value than expected.
calculate_skip() is passed i_size_read(inode) >> msblk->block_log. Due to
the file_size value corruption this overflows the int argument/variable in
that function, leading to the divide error.
This patch changes the function to use u64. This will accommodate any
unexpectedly large values due to corruption.
The value returned from calculate_skip() is clamped to be never more than
SQUASHFS_CACHED_BLKS - 1, or 7. So file_size corruption does not lead to
an unexpectedly large return result here.
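A sketch of the reworked helper (u64 arithmetic plus the existing clamp):
  static int calculate_skip(u64 blocks)
  {
          u64 skip = blocks / ((SQUASHFS_META_ENTRIES + 1)
                  * SQUASHFS_META_INDEXES);
          return min((u64)SQUASHFS_CACHED_BLKS - 1, skip + 1);
  }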
Link: https://lkml.kernel.org/r/20210507152618.9447-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: <syzbot+e8f781243ce16ac2f962@syzkaller.appspotmail.com>
Reported-by: <syzbot+7b98870d4fec9447b951@syzkaller.appspotmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Splitting an earlier version of a patch that allowed calling
__request_region() while holding the resource lock into a series of
patches required changing the return code for the newly introduced
__request_region_locked().
Unfortunately this change was not carried through to a subsequent commit
56fd94919b8b ("kernel/resource: fix locking in request_free_mem_region")
in the series. This resulted in a use-after-free due to freeing the
struct resource without properly releasing it. Fix this by correcting the
return code check so that the struct is not freed if the request to add it
was successful.
Link: https://lkml.kernel.org/r/20210512073528.22334-1-apopple@nvidia.com
Fixes: 56fd94919b8b ("kernel/resource: fix locking in request_free_mem_region")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Muchun Song <smuchun@gmail.com>
Cc: Oliver Sang <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Paul E. McKenney reported [1] that commit 1f0723a4c0df ("mm, slub: enable
slub_debug static key when creating cache with explicit debug flags")
results in the lockdep complaint:
======================================================
WARNING: possible circular locking dependency detected
5.12.0+ #15 Not tainted
------------------------------------------------------
rcu_torture_sta/109 is trying to acquire lock:
ffffffff96063cd0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0x9/0x20
but task is already holding lock:
ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (slab_mutex){+.+.}-{3:3}:
lock_acquire+0xb9/0x3a0
__mutex_lock+0x8d/0x920
slub_cpu_dead+0x15/0xf0
cpuhp_invoke_callback+0x17a/0x7c0
cpuhp_invoke_callback_range+0x3b/0x80
_cpu_down+0xdf/0x2a0
cpu_down+0x2c/0x50
device_offline+0x82/0xb0
remove_cpu+0x1a/0x30
torture_offline+0x80/0x140
torture_onoff+0x147/0x260
kthread+0x10a/0x140
ret_from_fork+0x22/0x30
-> #0 (cpu_hotplug_lock){++++}-{0:0}:
check_prev_add+0x8f/0xbf0
__lock_acquire+0x13f0/0x1d80
lock_acquire+0xb9/0x3a0
cpus_read_lock+0x21/0xa0
static_key_enable+0x9/0x20
__kmem_cache_create+0x38d/0x430
kmem_cache_create_usercopy+0x146/0x250
kmem_cache_create+0xd/0x10
rcu_torture_stats+0x79/0x280
kthread+0x10a/0x140
ret_from_fork+0x22/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(slab_mutex);
                               lock(cpu_hotplug_lock);
                               lock(slab_mutex);
  lock(cpu_hotplug_lock);
*** DEADLOCK ***
1 lock held by rcu_torture_sta/109:
#0: ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250
stack backtrace:
CPU: 3 PID: 109 Comm: rcu_torture_sta Not tainted 5.12.0+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
dump_stack+0x6d/0x89
check_noncircular+0xfe/0x110
? lock_is_held_type+0x98/0x110
check_prev_add+0x8f/0xbf0
__lock_acquire+0x13f0/0x1d80
lock_acquire+0xb9/0x3a0
? static_key_enable+0x9/0x20
? mark_held_locks+0x49/0x70
cpus_read_lock+0x21/0xa0
? static_key_enable+0x9/0x20
static_key_enable+0x9/0x20
__kmem_cache_create+0x38d/0x430
kmem_cache_create_usercopy+0x146/0x250
? rcu_torture_stats_print+0xd0/0xd0
kmem_cache_create+0xd/0x10
rcu_torture_stats+0x79/0x280
? rcu_torture_stats_print+0xd0/0xd0
kthread+0x10a/0x140
? kthread_park+0x80/0x80
ret_from_fork+0x22/0x30
This is because there's one order of locking from the hotplug callbacks:
lock(cpu_hotplug_lock); // from hotplug machinery itself
lock(slab_mutex); // in e.g. slab_mem_going_offline_callback()
And commit 1f0723a4c0df made the reverse sequence possible:
lock(slab_mutex); // in kmem_cache_create_usercopy()
lock(cpu_hotplug_lock); // kmem_cache_open() -> static_key_enable()
The simplest fix is to move static_key_enable() to a place before slab_mutex is
taken. That means kmem_cache_create_usercopy() in mm/slab_common.c, which is not
ideal for SLUB-specific code, but the #ifdef CONFIG_SLUB_DEBUG makes it
at least self-contained and obvious.
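A sketch of the moved enablement, placed before mutex_lock(&slab_mutex) in
kmem_cache_create_usercopy():
  #ifdef CONFIG_SLUB_DEBUG
          /* enable the static key before slab_mutex is taken, if the
           * cache is created with explicit debug flags
           */
          if (flags & SLAB_DEBUG_FLAGS)
                  static_branch_enable(&slub_debug_enabled);
  #endif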
[1] https://lore.kernel.org/lkml/20210502171827.GA3670492@paulmck-ThinkPad-P17-Gen-1/
Link: https://lkml.kernel.org/r/20210504120019.26791-1-vbabka@suse.cz
Fixes: 1f0723a4c0df ("mm, slub: enable slub_debug static key when creating cache with explicit debug flags")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When reworking the early cow of pinned hugetlb pages, we moved
huge_ptep_get() upwards but overlooked a side effect: huge_ptep_get() used
to fetch the pte after wr-protection. After moving it upwards, we need to
explicitly wr-protect the child pte, or we will keep the write bit set in
the child process, which could cause data corruption where the child can
write to the original page directly.
This issue can also be exposed by the "memfd_test hugetlbfs" kselftest.
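A sketch of the fix in copy_hugetlb_page_range():
  if (cow) {
          huge_ptep_set_wrprotect(src, addr, src_pte);
          /* the entry fetched before wr-protection must be
           * wr-protected as well before it reaches the child
           */
          entry = huge_pte_wrprotect(entry);
  }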
Link: https://lkml.kernel.org/r/20210503234356.9097-3-peterx@redhat.com
Fixes: 4eae4efa2c299 ("hugetlb: do early cow when page pinned on src mm")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2.
Hugh reported an issue with F_SEAL_FUTURE_WRITE not being applied correctly
to hugetlbfs, which I can easily verify using the memfd_test program; it
seems the program is rarely run with hugetlbfs pages (by default it uses
shmem).
Meanwhile I found another, probably even more severe, issue: hugetlb fork
won't wr-protect child cow pages, so the child can potentially write to
parent private pages. Patch 2 addresses that.
After this series applied, "memfd_test hugetlbfs" should start to pass.
This patch (of 2):
F_SEAL_FUTURE_WRITE has been missing for hugetlb since day one.
There is a test program for it, and it fails constantly.
$ ./memfd_test hugetlbfs
memfd-hugetlb: CREATE
memfd-hugetlb: BASIC
memfd-hugetlb: SEAL-WRITE
memfd-hugetlb: SEAL-FUTURE-WRITE
mmap() didn't fail as expected
Aborted (core dumped)
I think it's probably because no one is really running the hugetlbfs test.
Fix it by also checking F_SEAL_FUTURE_WRITE in hugetlbfs_file_mmap(), as we
do in shmem_mmap(), and generalize a helper for that.
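A hedged sketch of such a helper, shared by shmem_mmap() and
hugetlbfs_file_mmap():
  static inline int seal_check_future_write(int seals,
                                            struct vm_area_struct *vma)
  {
          if (seals & F_SEAL_FUTURE_WRITE) {
                  /* deny new shared writable mappings */
                  if ((vma->vm_flags & VM_SHARED) &&
                      (vma->vm_flags & VM_WRITE))
                          return -EPERM;

                  /* keep mprotect() from upgrading read-only shared
                   * mappings later
                   */
                  if (vma->vm_flags & VM_SHARED)
                          vma->vm_flags &= ~(VM_MAYWRITE);
          }
          return 0;
  }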
Link: https://lkml.kernel.org/r/20210503234356.9097-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210503234356.9097-2-peterx@redhat.com
Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
To ensure that instructions are observable in a new mapping, the arm64
set_pte_at() implementation cleans the D-cache and invalidates the
I-cache to the PoU. As an optimisation, this is only done on executable
mappings and the PG_dcache_clean page flag is set to avoid future cache
maintenance on the same page.
When two different processes map the same page (e.g. private executable
file or shared mapping) there's a potential race on checking and setting
PG_dcache_clean via set_pte_at() -> __sync_icache_dcache(). While on the
fault paths the page is locked (PG_locked), mprotect() does not take the
page lock. The result is that one process may see the PG_dcache_clean
flag set but the I/D cache maintenance not yet performed.
Avoid test_and_set_bit(PG_dcache_clean) in favour of separate test_bit()
and set_bit(). In the rare event of a race, the cache maintenance is
done twice.
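A sketch of the change in __sync_icache_dcache(); the flag is now set only
after the maintenance has run:
  if (!test_bit(PG_dcache_clean, &page->flags)) {
          sync_icache_aliases(page_address(page), page_size(page));
          set_bit(PG_dcache_clean, &page->flags);
  }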
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210514095001.13236-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Fix the following kernel-doc warning:
block/partitions/efi.c:685: warning: wrong kernel-doc identifier on line:
* efi_partition(struct parsed_partitions *state)
Cc: Alexander Viro <viro@math.psu.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210513171708.8391-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If a tag set is shared across request queues (e.g. SCSI LUNs) then the
block layer core keeps track of the number of active request queues in
tags->active_queues. blk_mq_tag_busy() and blk_mq_tag_idle() update that
atomic counter if the hctx flag BLK_MQ_F_TAG_QUEUE_SHARED is set. Make
sure that blk_mq_exit_queue() calls blk_mq_tag_idle() before that flag is
cleared by blk_mq_del_queue_tag_set().
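A sketch of the resulting ordering in blk_mq_exit_queue():
  /* checks hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED, so run it first */
  blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);

  /* may clear BLK_MQ_F_TAG_QUEUE_SHARED in hctx->flags */
  blk_mq_del_queue_tag_set(q);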
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Fixes: 0d2602ca30e4 ("blk-mq: improve support for shared tags maps")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210513171529.7977-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In case of a shared sbitmap, requests are no longer held in the plug list
since commit 32bc15afed04 ("blk-mq: Facilitate a shared sbitmap per
tagset"). This makes request merging from the flush plug list and batched
submission impossible, causing a performance regression.
Yanhui reports a performance regression when running a sequential IO test
(libaio, 16 jobs, 8 depth for each job) in a VM, where the VM disk is
emulated with an image stored on xfs/megaraid_sas.
Fix the issue by restoring the original behavior of allowing requests to
be held in the plug list.
Cc: Yanhui Ma <yama@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: kashyap.desai@broadcom.com
Fixes: 32bc15afed04 ("blk-mq: Facilitate a shared sbitmap per tagset")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210514022052.1047665-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
xen_swiotlb_init calls swiotlb_late_init_with_tbl, which fails with
-ENOMEM if the swiotlb has already been initialized.
Add an explicit io_tlb_default_mem != NULL check at the beginning of
xen_swiotlb_init. If the swiotlb is already initialized, print a warning
and return -EEXIST.
On x86, the error propagates.
On ARM, we don't actually need a special swiotlb buffer (yet), any
buffer would do. So ignore the error and continue.
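A sketch of the new early check (the message text is an assumption):
  if (io_tlb_default_mem != NULL) {
          pr_warn("swiotlb buffer already initialized\n");
          return -EEXIST;
  }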
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210512201823.1963-3-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Although SWIOTLB_NO_FORCE is meant to allow later calls to swiotlb_init,
today dma_direct_map_page returns an error if SWIOTLB_NO_FORCE is set.
For now, without a larger overhaul of SWIOTLB_NO_FORCE, the best we can
do is to avoid setting SWIOTLB_NO_FORCE in mem_init when we know that it
is going to be required later (e.g. Xen requires it).
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: catalin.marinas@arm.com
CC: will@kernel.org
CC: linux-arm-kernel@lists.infradead.org
Fixes: 2726bf3ff252 ("swiotlb: Make SWIOTLB_NO_FORCE perform no allocation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210512201823.1963-2-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
|