linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-06-29	xfs: don't hold xattr leaf buffers across transaction rolls	Darrick J. Wong	5	-65/+12
	Now that we've established (again!) that empty xattr leaf buffers are ok, we no longer need to bhold them to transactions when we're creating new leaf blocks. Get rid of the entire mechanism, which should simplify the xattr code quite a bit. The original justification for using bhold here was to prevent the AIL from trying to write the empty leaf block into the fs during the brief time that we release the buffer lock. The reason for /that/ was to prevent recovery from tripping over the empty ondisk block. Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-29	xfs: empty xattr leaf header blocks are not corruption	Darrick J. Wong	1	-9/+17
	TLDR: Revert commit 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify") because it was wrong. Every now and then we get a corruption report from the kernel or xfs_repair about empty leaf blocks in the extended attribute structure. We've long thought that these shouldn't be possible, but prior to 5.18 one would shake loose in the recoveryloop fstests about once a month. A new addition to the xattr leaf block verifier in 5.19-rc1 makes this happen every 7 minutes on my testing cloud. I added a ton of logging to detect any time we set the header count on an xattr leaf block to zero. This produced the following dmesg output on generic/388: XFS (sda4): ino 0x21fcbaf leaf 0x129bf78 hdcount==0! Call Trace: <TASK> dump_stack_lvl+0x34/0x44 xfs_attr3_leaf_create+0x187/0x230 xfs_attr_shortform_to_leaf+0xd1/0x2f0 xfs_attr_set_iter+0x73e/0xa90 xfs_xattri_finish_update+0x45/0x80 xfs_attr_finish_item+0x1b/0xd0 xfs_defer_finish_noroll+0x19c/0x770 __xfs_trans_commit+0x153/0x3e0 xfs_attr_set+0x36b/0x740 xfs_xattr_set+0x89/0xd0 __vfs_setxattr+0x67/0x80 __vfs_setxattr_noperm+0x6e/0x120 vfs_setxattr+0x97/0x180 setxattr+0x88/0xa0 path_setxattr+0xc3/0xe0 __x64_sys_setxattr+0x27/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 So now we know that someone is creating empty xattr leaf blocks as part of converting a sf xattr structure into a leaf xattr structure. The conversion routine logs any existing sf attributes in the same transaction that creates the leaf block, so we know this is a setxattr to a file that has no attributes at all. Next, g/388 calls the shutdown ioctl and cycles the mount to trigger log recovery. I also augmented buffer item recovery to call ->verify_struct on any attr leaf blocks and complain if it finds a failure: XFS (sda4): Unmounting Filesystem XFS (sda4): Mounting V5 Filesystem XFS (sda4): Starting recovery (logdev: internal) XFS (sda4): xattr leaf daddr 0x129bf78 hdrcount == 0! Call Trace: <TASK> dump_stack_lvl+0x34/0x44 xfs_attr3_leaf_verify+0x3b8/0x420 xlog_recover_buf_commit_pass2+0x60a/0x6c0 xlog_recover_items_pass2+0x4e/0xc0 xlog_recover_commit_trans+0x33c/0x350 xlog_recovery_process_trans+0xa5/0xe0 xlog_recover_process_data+0x8d/0x140 xlog_do_recovery_pass+0x19b/0x720 xlog_do_log_recovery+0x62/0xc0 xlog_do_recover+0x33/0x1d0 xlog_recover+0xda/0x190 xfs_log_mount+0x14c/0x360 xfs_mountfs+0x517/0xa60 xfs_fs_fill_super+0x6bc/0x950 get_tree_bdev+0x175/0x280 vfs_get_tree+0x1a/0x80 path_mount+0x6f5/0xaa0 __x64_sys_mount+0x103/0x140 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fc61e241eae And a moment later, the _delwri_submit of the recovered buffers trips the same verifier and recovery fails: XFS (sda4): Metadata corruption detected at xfs_attr3_leaf_verify+0x393/0x420 [xfs], xfs_attr3_leaf block 0x129bf78 XFS (sda4): Unmount and run xfs_repair XFS (sda4): First 128 bytes of corrupted metadata buffer: 00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;....... 00000010: 00 00 00 00 01 29 bf 78 00 00 00 00 00 00 00 00 .....).x........ 00000020: a5 1b d0 02 b2 9a 49 df 8e 9c fb 8d f8 31 3e 9d ......I......1>. 00000030: 00 00 00 00 02 1f cb af 00 00 00 00 10 00 00 00 ................ 00000040: 00 50 0f b0 00 00 00 00 00 00 00 00 00 00 00 00 .P.............. 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ XFS (sda4): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x37f/0x3b0 [xfs] (fs/xfs/xfs_buf.c:1518). Shutting down filesystem. XFS (sda4): Please unmount the filesystem and rectify the problem(s) XFS (sda4): log mount/recovery failed: error -117 XFS (sda4): log mount failed I think I see what's going on here -- setxattr is racing with something that shuts down the filesystem: Thread 1 Thread 2 -------- -------- xfs_attr_sf_addname xfs_attr_shortform_to_leaf <create empty leaf> xfs_trans_bhold(leaf) xattri_dela_state = XFS_DAS_LEAF_ADD <roll transaction> <flush log> <shut down filesystem> xfs_trans_bhold_release(leaf) <discover fs is dead, bail> Thread 3 -------- <cycle mount, start recovery> xlog_recover_buf_commit_pass2 xlog_recover_do_reg_buffer <replay empty leaf buffer from recovered buf item> xfs_buf_delwri_queue(leaf) xfs_buf_delwri_submit _xfs_buf_ioapply(leaf) xfs_attr3_leaf_write_verify <trip over empty leaf buffer> <fail recovery> As you can see, the bhold keeps the leaf buffer locked and thus prevents the AIL from tripping over the ichdr.count==0 check in the write verifier. Unfortunately, it doesn't prevent the log from getting flushed to disk, which sets up log recovery to fail. So. It's clear that the kernel has always had the ability to persist attr leaf blocks with ichdr.count==0, which means that it's part of the ondisk format now. Unfortunately, this check has been added and removed multiple times throughout history. It first appeared in[1] kernel 3.10 as part of the early V5 format patches. The check was later discovered to break log recovery and hence disabled[2] during log recovery in kernel 4.10. Simultaneously, the check was added[3] to xfs_repair 4.9.0 to try to weed out the empty leaf blocks. This was still not correct because log recovery would recover an empty attr leaf block successfully only for regular xattr operations to trip over the empty block during of the block during regular operation. Therefore, the check was removed entirely[4] in kernel 5.7 but removal of the xfs_repair check was forgotten. The continued complaints from xfs_repair lead to us mistakenly re-adding[5] the verifier check for kernel 5.19. Remove it once again. [1] 517c22207b04 ("xfs: add CRCs to attr leaf blocks") [2] 2e1d23370e75 ("xfs: ignore leaf attr ichdr.count in verifier during log replay") [3] f7140161 ("xfs_repair: junk leaf attribute if count == 0") [4] f28cef9e4dac ("xfs: don't fail verifier on empty attr3 leaf block") [5] 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify") Looking at the rest of the xattr code, it seems that files with empty leaf blocks behave as expected -- listxattr reports no attributes; getxattr on any xattr returns nothing as expected; removexattr does nothing; and setxattr can add attributes just fine. Original-bug: 517c22207b04 ("xfs: add CRCs to attr leaf blocks") Still-not-fixed-by: 2e1d23370e75 ("xfs: ignore leaf attr ichdr.count in verifier during log replay") Removed-in: f28cef9e4dac ("xfs: don't fail verifier on empty attr3 leaf block") Fixes: 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26	xfs: clean up the end of xfs_attri_item_recover	Darrick J. Wong	1	-19/+26
	The end of this function could use some cleanup -- the EAGAIN conditionals make it harder to figure out what's going on with the disposal of xattri_leaf_bp, and the dual error/ret variables aren't needed. Turn the EAGAIN case into a separate block documenting all the subtleties of recovering in the middle of an xattr update chain, which makes the rest of the prologue much simpler. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26	xfs: always free xattri_leaf_bp when cancelling a deferred op	Darrick J. Wong	1	-1/+19
	While running the following fstest with logged xattrs DISabled, I noticed the following: # FSSTRESS_AVOID="-z -f unlink=1 -f rmdir=1 -f creat=2 -f mkdir=2 -f getfattr=3 -f listfattr=3 -f attr_remove=4 -f removefattr=4 -f setfattr=20 -f attr_set=60" ./check generic/475 INFO: task u9:1:40 blocked for more than 61 seconds. Tainted: G O 5.19.0-rc2-djwx #rc2 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:u9:1 state:D stack:12872 pid: 40 ppid: 2 flags:0x00004000 Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs] Call Trace: <TASK> __schedule+0x2db/0x1110 schedule+0x58/0xc0 schedule_timeout+0x115/0x160 __down_common+0x126/0x210 down+0x54/0x70 xfs_buf_lock+0x2d/0xe0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xfs_buf_item_unpin+0x227/0x3a0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xfs_trans_committed_bulk+0x18e/0x320 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xlog_cil_committed+0x2ea/0x360 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xlog_cil_push_work+0x60f/0x690 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] process_one_work+0x1df/0x3c0 worker_thread+0x53/0x3b0 kthread+0xea/0x110 ret_from_fork+0x1f/0x30 </TASK> This appears to be the result of shortform_to_leaf creating a new leaf buffer as part of adding an xattr to a file. The new leaf buffer is held and attached to the xfs_attr_intent structure, but then the filesystem shuts down. Instead of the usual path (which adds the attr to the held leaf buffer which releases the hold), we instead cancel the entire deferred operation. Unfortunately, xfs_attr_cancel_item doesn't release any attached leaf buffers, so we leak the locked buffer. The CIL cannot do anything about that, and hangs. Fix this by teaching it to release leaf buffers, and make XFS a little more careful about not leaving a dangling reference. The prologue of xfs_attri_item_recover is (in this author's opinion) a little hard to figure out, so I'll clean that up in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26	xfs: use invalidate_lock to check the state of mmap_lock	Kaixu Xia	1	-2/+2
	We should use invalidate_lock and XFS_MMAPLOCK_SHARED to check the state of mmap_lock rw_semaphore in xfs_isilocked(), rather than i_rwsem and XFS_IOLOCK_SHARED. Fixes: 2433480a7e1d ("xfs: Convert to use invalidate_lock") Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-26	xfs: factor out the common lock flags assert	Kaixu Xia	1	-37/+23
	There are similar lock flags assert in xfs_ilock(), xfs_ilock_nowait(), xfs_iunlock(), thus we can factor it out into a helper that is clear. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23	xfs: introduce xfs_inodegc_push()	Dave Chinner	5	-10/+28
	The current blocking mechanism for pushing the inodegc queue out to disk can result in systems becoming unusable when there is a long running inodegc operation. This is because the statfs() implementation currently issues a blocking flush of the inodegc queue and a significant number of common system utilities will call statfs() to discover something about the underlying filesystem. This can result in userspace operations getting stuck on inodegc progress, and when trying to remove a heavily reflinked file on slow storage with a full journal, this can result in delays measuring in hours. Avoid this problem by adding "push" function that expedites the flushing of the inodegc queue, but doesn't wait for it to complete. Convert xfs_fs_statfs() and xfs_qm_scall_getquota() to use this mechanism so they don't block but still ensure that queued operations are expedited. Fixes: ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues") Reported-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Dave Chinner <dchinner@redhat.com> [djwong: fix _getquota_next to use _inodegc_push too] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23	xfs: bound maximum wait time for inodegc work	Dave Chinner	3	-16/+24
	Currently inodegc work can sit queued on the per-cpu queue until the workqueue is either flushed of the queue reaches a depth that triggers work queuing (and later throttling). This means that we could queue work that waits for a long time for some other event to trigger flushing. Hence instead of just queueing work at a specific depth, use a delayed work that queues the work at a bound time. We can still schedule the work immediately at a given depth, but we no long need to worry about leaving a number of items on the list that won't get processed until external events prevail. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-15	xfs: preserve DIFLAG2_NREXT64 when setting other inode attributes	Darrick J. Wong	1	-1/+2
	It is vitally important that we preserve the state of the NREXT64 inode flag when we're changing the other flags2 fields. Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15	xfs: fix variable state usage	Darrick J. Wong	1	-2/+1
	The variable @args is fed to a tracepoint, and that's the only place it's used. This is fine for the kernel, but for userspace, tracepoints are #define'd out of existence, which results in this warning on gcc 11.2: xfs_attr.c: In function ‘xfs_attr_node_try_addname’: xfs_attr.c:1440:42: warning: unused variable ‘args’ [-Wunused-variable] 1440 \| struct xfs_da_args *args = attr->xattri_da_args; \| ^~~~ Clean this up. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15	xfs: fix TOCTOU race involving the new logged xattrs control knob	Darrick J. Wong	6	-22/+34
	I found a race involving the larp control knob, aka the debugging knob that lets developers enable logging of extended attribute updates: Thread 1 Thread 2 echo 0 > /sys/fs/xfs/debug/larp setxattr(REPLACE) xfs_has_larp (returns false) xfs_attr_set echo 1 > /sys/fs/xfs/debug/larp xfs_attr_defer_replace xfs_attr_init_replace_state xfs_has_larp (returns true) xfs_attr_init_remove_state <oops, wrong DAS state!> This isn't a particularly severe problem right now because xattr logging is only enabled when CONFIG_XFS_DEBUG=y, and developers should know what they're doing. However, the eventual intent is that callers should be able to ask for the assistance of the log in persisting xattr updates. This capability might not be required for /all/ callers, which means that dynamic control must work correctly. Once an xattr update has decided whether or not to use logged xattrs, it needs to stay in that mode until the end of the operation regardless of what subsequent parallel operations might do. Therefore, it is an error to continue sampling xfs_globals.larp once xfs_attr_change has made a decision about larp, and it was not correct for me to have told Allison that ->create_intent functions can sample the global log incompat feature bitfield to decide to elide a log item. Instead, create a new op flag for the xfs_da_args structure, and convert all other callers of xfs_has_larp and xfs_sb_version_haslogxattrs within the attr update state machine to look for the operations flag. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-12	Linux 5.19-rc2	Linus Torvalds	1	-1/+1

2022-06-12	platform/x86/intel: hid: Add Surface Go to VGBS allow list	Duke Lee	1	-0/+6
	The Surface Go reports Chassis Type 9 (Laptop,) so the device needs to be added to dmi_vgbs_allow_list to enable tablet mode when an attached Type Cover is folded back. BugLink: https://github.com/linux-surface/linux-surface/issues/837 Signed-off-by: Duke Lee <krnhotwings@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220607213654.5567-1-krnhotwings@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-12	platform/x86: hp-wmi: Use zero insize parameter only when supported	Bedant Patnaik	1	-8/+15
	commit be9d73e64957 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls") and commit 12b19f14a21a ("platform/x86: hp-wmi: Fix hp_wmi_read_int() reporting error (0x05)") cause ACPI BIOS Error (bug): Attempt to CreateField of length zero (20211217/dsopcode-133) because of the ACPI method HWMC, which unconditionally creates a Field of size (insize8) bits: CreateField (Arg1, 0x80, (Local5 0x08), DAIN) In cases where args->insize = 0, the Field size is 0, resulting in an error. Fix this by using zero insize only if 0x5 error code is returned Tested on Omen 15 AMD (2020) board ID: 8786. Fixes: be9d73e64957 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls") Signed-off-by: Bedant Patnaik <bedant.patnaik@gmail.com> Tested-by: Jorge Lopez <jorge.lopez2@hp.com> Link: https://lore.kernel.org/r/41be46743d21c78741232a47bbb5f1cdbcc3d21e.camel@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-12	platform/x86: hp-wmi: Resolve WMI query failures on some devices	Jorge Lopez	1	-2/+4
	WMI queries fail on some devices where the ACPI method HWMC unconditionally attempts to create Fields beyond the buffer if the buffer is too small, this breaks essential features such as power profiles: CreateByteField (Arg1, 0x10, D008) CreateByteField (Arg1, 0x11, D009) CreateByteField (Arg1, 0x12, D010) CreateDWordField (Arg1, 0x10, D032) CreateField (Arg1, 0x80, 0x0400, D128) In cases where args->data had zero length, ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D008] at bit offset/length 128/8 exceeds size of target Buffer (128 bits) (20211217/dsopcode-198) was obtained. ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D009] at bit offset/length 136/8 exceeds size of target Buffer (136bits) (20211217/dsopcode-198) The original code created a buffer size of 128 bytes regardless if the WMI call required a smaller buffer or not. This particular behavior occurs in older BIOS and reproduced in OMEN laptops. Newer BIOS handles buffer sizes properly and meets the latest specification requirements. This is the reason why testing with a dynamically allocated buffer did not uncover any failures with the test systems at hand. This patch was tested on several OMEN, Elite, and Zbooks. It was confirmed the patch resolves HPWMI_FAN GET/SET calls in an OMEN Laptop 15-ek0xxx. No problems were reported when testing on several Elite and Zbooks notebooks. Fixes: 4b4967cbd268 ("platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated") Signed-off-by: Jorge Lopez <jorge.lopez2@hp.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220608212923.8585-2-jorge.lopez2@hp.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-11	workqueue: Switch to new kerneldoc syntax for named variable macro argument	Jonathan Neuschäfer	1	-1/+1
	The syntax without dots is available since commit 43756e347f21 ("scripts/kernel-doc: Add support for named variable macro arguments"). The same HTML output is produced with and without this patch. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-06-11	iov_iter: fix build issue due to possible type mis-match	Linus Torvalds	1	-2/+2
	Commit 6c77676645ad ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()") introduced a problem on some 32-bit architectures (at least arm, xtensa, csky,sparc and mips), that have a 'size_t' that is 'unsigned int'. The reason is that we now do min(nr * PAGE_SIZE - offset, maxsize); where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is 'unsigned long'. As a result, the normal C type rules means that the first argument to 'min()' ends up being 'unsigned long'. In contrast, 'maxsize' is of type 'size_t'. Now, 'size_t' and 'unsigned long' are always the same physical type in the kernel, so you'd think this doesn't matter, and from an actual arithmetic standpoint it doesn't. But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if it could also be 'unsigned long'. In that situation, both are unsigned 32-bit types, but they are not the same type. And as a result 'min()' will complain about the distinct types (ignore the "pointer types" part of the error message: that's an artifact of the way we have made 'min()' check types for being the same): lib/iov_iter.c: In function 'iter_xarray_get_pages': include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror] 20 \| (!!(sizeof((typeof(x) )1 == (typeof(y) )1))) \| ^~ lib/iov_iter.c:1464:16: note: in expansion of macro 'min' 1464 \| return min(nr * PAGE_SIZE - offset, maxsize); \| ^~~ This was not visible on 64-bit architectures (where we always define 'size_t' to be 'unsigned long'). Force these cases to use 'min_t(size_t, x, y)' to make the type explicit and avoid the issue. [ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned long' arithmetically. We've certainly historically seen environments with 16-bit address spaces and 32-bit 'unsigned long'. Similarly, even in 64-bit modern environments, 'size_t' could be its own type distinct from 'unsigned long', even if it were arithmetically identical. So the above type commentary is only really descriptive of the kernel environment, not some kind of universal truth for the kinds of wild and crazy situations that are allowed by the C standard ] Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/ Cc: Jeff Layton <jlayton@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-11	wireguard: selftests: use maximum cpu features and allow rng seeding	Jason A. Donenfeld	3	-15/+19
	By forcing the maximum CPU that QEMU has available, we expose additional capabilities, such as the RNDR instruction, which increases test coverage. This then allows the CI to skip the fake seeding step in some cases. Also enable STRICT_KERNEL_RWX to catch issues related to early jump labels when the RNG is initialized at boot. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-11	scripts/gdb: change kernel config dumping method	Kuan-Ying Lee	1	-3/+3
	MAGIC_START("IKCFG_ST") and MAGIC_END("IKCFG_ED") are moved out from the kernel_config_data variable. Thus, we parse kernel_config_data directly instead of considering offset of MAGIC_START and MAGIC_END. Fixes: 13610aa908dc ("kernel/configs: use .incbin directive to embed config_data.gz") Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-06-10	um: virt-pci: set device ready in probe()	Vincent Whitchurch	1	-1/+6
	Call virtio_device_ready() to make this driver work after commit b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the virtqueues in the probe function. (The virtio core sets the device ready when probe returns.) Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ") Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Johannes Berg <johannes@sipsolutions.net>
2022-06-10	cifs: populate empty hostnames for extra channels	Shyam Prasad N	2	-1/+8
	Currently, the secondary channels of a multichannel session also get hostname populated based on the info in primary channel. However, this will end up with a wrong resolution of hostname to IP address during reconnect. This change fixes this by not populating hostname info for all secondary channels. Fixes: 5112d80c162f ("cifs: populate server_hostname for extra channels") Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-10	platform/x86: gigabyte-wmi: Add support for B450M DS3H-CF	August Wikerfors	1	-0/+1
	Tested and works on my system. Signed-off-by: August Wikerfors <git@augustwikerfors.se> Link: https://lore.kernel.org/r/20220608212028.28307-1-git@augustwikerfors.se Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86: gigabyte-wmi: Add Z690M AORUS ELITE AX DDR4 support	Piotr Chmura	1	-0/+1
	Add dmi_system_id of Gigabyte Z690M AORUS ELITE AX DDR4 board. Tested on my PC. Signed-off-by: Piotr Chmura <chmooreck@gmail.com> Link: https://lore.kernel.org/r/bd83567e-ebf5-0b31-074b-5f6dc7f7c147@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86: barco-p50-gpio: Add check for platform_driver_register	Jiasheng Jiang	1	-1/+4
	As platform_driver_register() could fail, it should be better to deal with the return value in order to maintain the code consisitency. Fixes: 86af1d02d458 ("platform/x86: Support for EC-connected GPIOs for identify LED/button on Barco P50 board") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Peter Korsgaard <peter.korsgaard@barco.com> Link: https://lore.kernel.org/r/20220526090345.1444172-1-jiasheng@iscas.ac.cn Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86/intel: pmc: Support Intel Raptorlake P	George D Sworo	1	-0/+1
	Add Raptorlake P to the list of the platforms that intel_pmc_core driver supports for pmc_core device. Raptorlake P PCH is based on Alderlake P PCH. Signed-off-by: George D Sworo <george.d.sworo@intel.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220602012617.20100-1-george.d.sworo@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86/intel: Fix pmt_crashlog array reference	David Arcari	1	-1/+1
	The probe function pmt_crashlog_probe() may incorrectly reference the 'priv->entry array' as it uses 'i' to reference the array instead of 'priv->num_entries' as it should. This is similar to the problem that was addressed in pmt_telemetry_probe via commit 2cdfa0c20d58 ("platform/x86/intel: Fix 'rmmod pmt_telemetry' panic"). Cc: "David E. Box" <david.e.box@linux.intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mark Gross <markgross@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: David Arcari <darcari@redhat.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220526203140.339120-1-darcari@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/mellanox: Add static in struct declaration.	Michael Shych	1	-1/+1
	Fix problem of missing static in struct declaration. Fixes: 662f24826f954 ("platform/mellanox: Add support for new SN2201 system") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20220602145103.11859-1-michaelsh@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	iov_iter: Fix iter_xarray_get_pages{,_alloc}()	David Howells	1	-16/+4
	The maths at the end of iter_xarray_get_pages() to calculate the actual size doesn't work under some circumstances, such as when it's been asked to extract a partial single page. Various terms of the equation cancel out and you end up with actual == offset. The same issue exists in iter_xarray_get_pages_alloc(). Fix these to just use min() to select the lesser amount from between the amount of page content transcribed into the buffer, minus the offset, and the size limit specified. This doesn't appear to have caused a problem yet upstream because network filesystems aren't getting the pages from an xarray iterator, but rather passing it directly to the socket, which just iterates over it. Cachefiles does do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for whole pages to be written or read. Fixes: 7ff5062079ef ("iov_iter: Add ITER_XARRAY") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Mike Marshall <hubcap@omnibond.com> cc: Gao Xiang <xiang@kernel.org> cc: linux-afs@lists.infradead.org cc: v9fs-developer@lists.sourceforge.net cc: devel@lists.orangefs.org cc: linux-erofs@lists.ozlabs.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-10	netfs: Rename the netfs_io_request cleanup op and give it an op pointer	David Howells	6	-30/+29
	The netfs_io_request cleanup op is now always in a position to be given a pointer to a netfs_io_request struct, so this can be passed in instead of the mapping and private data arguments (both of which are included in the struct). So rename the ->cleanup op to ->free_request (to match ->init_request) and pass in the I/O pointer. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com
2022-06-10	netfs: Further cleanups after struct netfs_inode wrapper introduced	Linus Torvalds	14	-30/+30
	Change the signature of netfs helper functions to take a struct netfs_inode pointer rather than a struct inode pointer where appropriate, thereby relieving the need for the network filesystem to convert its internal inode format down to the VFS inode only for netfslib to bounce it back up. For type safety, it's better not to do that (and it's less typing too). Give netfs_write_begin() an extra argument to pass in a pointer to the netfs_inode struct rather than deriving it internally from the file pointer. Note that the ->write_begin() and ->write_end() ops are intended to be replaced in the future by netfslib code that manages this without the need to call in twice for each page. netfs_readpage() and similar are intended to be pointed at directly by the address_space_operations table, so must stick to the signature dictated by the function pointers there. Changes ======= - Updated the kerneldoc comments and documentation [DH]. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
2022-06-10	afs: Fix some checker issues	David Howells	1	-2/+1
	Remove an unused global variable and make another static as reported by make C=1. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org
2022-06-10	dm: fix zoned locking imbalance due to needless check in clone_endio	Mike Snitzer	1	-15/+11
	After the commit ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone"), clone_endio() only calls dm_zone_endio() when DM targets remap the clone bio's bdev to something other than the md->disk->part0 default. However, if a DM target (e.g. dm-crypt) stacked ontop of a dm-zoned does not remap the clone bio using bio_set_dev() then dm_zone_endio() is not called at completion of the bios and zone locks are not properly unlocked. This triggers a hang, in dm_zone_map_bio(), when blktests block/004 is run for dm-crypt on zoned block devices. To avoid the hang, simply remove the clone_endio() check that verifies the target remapped the clone bio to a device other than the default. Fixes: ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone") Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-10	platform/mellanox: Spelling s/platfom/platform/	Geert Uytterhoeven	1	-1/+1
	Fix a misspelling of the word "platform". Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/9c8edde31e271311b7832d7677fe84aba917da8d.1653376503.git.geert@linux-m68k.org Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	certs: Convert spaces in certs/Makefile to a tab	David Howells	1	-1/+1
	There's a rule in certs/Makefile for which the command begins with eight spaces. This results in: ../certs/Makefile:21: FORCE prerequisite is missing ../certs/Makefile:21: *** missing separator. Stop. Fix this by turning the spaces into a tab. Fixes: addf466389d9 ("certs: Check that builtin blacklist hashes are valid") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com> cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/486b1b80-9932-aab6-138d-434c541c934a@digikod.net/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-10	dt-bindings: display: arm,malidp: remove bogus RQOS property	Andre Przywara	1	-6/+1
	As Liviu pointed out, the arm,malidp-arqos-high-level property mentioned in the original .txt binding was a mistake, and arm,malidp-arqos-value needs to take its place. The binding commit ce6eb0253cba ("dt/bindings: display: Add optional property node define for Mali DP500") mentions the right name in the commit message, but has the wrong name in the diff. Commit d298e6a27a81 ("drm/arm/mali-dp: Add display QoS interface configuration for Mali DP500") uses the property in the driver, but uses the shorter name. Remove the wrong property from the binding, and use the proper name in the example. The actual property was already documented properly. Fixes: 2c8b082a3ab1 ("dt-bindings: display: convert Arm Mali-DP to DT schema") Link: https://lore.kernel.org/linux-arm-kernel/YnumGEilUblhBx8E@e110455-lin.cambridge.arm.com/ Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reported-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220609162729.1441760-1-andre.przywara@arm.com
2022-06-10	dt-bindings: pinctrl: ralink: Fix 'enum' lists with duplicate entries	Rob Herring	2	-25/+28
	There's no reason to list the same value twice in an 'enum'. This was fixed treewide in commit c3b006819426 ("dt-bindings: Fix 'enum' lists with duplicate entries"), but this one got added in the merge window. A meta-schema change will catch future cases. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Link: https://lore.kernel.org/r/20220606212239.1360877-1-robh@kernel.org
2022-06-10	arm64: Add kasan_hw_tags_enable() prototype to silence sparse	Catalin Marinas	1	-0/+6
	This function is only called from assembly, no need for a prototype declaration in a header file. In addition, add #ifdef around the function since it is only used when CONFIG_KASAN_HW_TAGS. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: kernel test robot <lkp@intel.com>
2022-06-10	arm64/sme: Fix EFI save/restore	Mark Brown	1	-4/+14
	The EFI save/restore code is confused. When saving the check for saving FFR is inverted due to confusion with the streaming mode check, and when restoring we check if we need to restore FFR by checking the percpu efi_sm_state without the required wrapper rather than based on the combination of FA64 support and streaming mode. Fixes: e0838f6373e5 ("arm64/sme: Save and restore streaming mode over EFI runtime calls") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220602124132.3528951-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-10	arm64/fpsimd: Fix typo in comment	Xiang wangx	1	-1/+1
	Delete the redundant word 'in'. Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com> Link: https://lore.kernel.org/r/20220610070543.59338-1-wangxiang@cdjrlc.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-10	arm64/sysreg: Fix typo in Enum element regex	Alejandro Tafalla	1	-1/+1
	In the awk script, there was a typo with the comparison operator when checking if the matched pattern is inside an Enum block. This prevented the generation of the whole sysreg-defs.h header. Fixes: 66847e0618d7 ("arm64: Add sysreg header generation scripting") Signed-off-by: Alejandro Tafalla <atafalla@dnyon.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220609204220.12112-1-atafalla@dnyon.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-10	gpio: dwapb: Don't print error on -EPROBE_DEFER	Serge Semin	1	-4/+3
	Currently if the APB or Debounce clocks aren't yet ready to be requested the DW GPIO driver will correctly handle that by deferring the probe procedure, but the error is still printed to the system log. It needlessly pollutes the log since there was no real error but a request to postpone the clock request procedure since the clocks subsystem hasn't been fully initialized yet. Let's fix that by using the dev_err_probe method to print the APB/clock request error status. It will correctly handle the deferred probe situation and print the error if it actually happens. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-10	random: remove rng_has_arch_random()	Jason A. Donenfeld	3	-16/+1
	With arch randomness being used by every distro and enabled in defconfigs, the distinction between rng_has_arch_random() and rng_is_initialized() is now rather small. In fact, the places where they differ are now places where paranoid users and system builders really don't want arch randomness to be used, in which case we should respect that choice, or places where arch randomness is known to be broken, in which case that choice is all the more important. So this commit just removes the function and its one user. Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-10	random: credit cpu and bootloader seeds by default	Jason A. Donenfeld	1	-19/+31
	This commit changes the default Kconfig values of RANDOM_TRUST_CPU and RANDOM_TRUST_BOOTLOADER to be Y by default. It does not change any existing configs or change any kernel behavior. The reason for this is several fold. As background, I recently had an email thread with the kernel maintainers of Fedora/RHEL, Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void as recipients. I noted that some distros trust RDRAND, some trust EFI, and some trust both, and I asked why or why not. There wasn't really much of a "debate" but rather an interesting discussion of what the historical reasons have been for this, and it came up that some distros just missed the introduction of the bootloader Kconfig knob, while another didn't want to enable it until there was a boot time switch to turn it off for more concerned users (which has since been added). The result of the rather uneventful discussion is that every major Linux distro enables these two options by default. While I didn't have really too strong of an opinion going into this thread -- and I mostly wanted to learn what the distros' thinking was one way or another -- ultimately I think their choice was a decent enough one for a default option (which can be disabled at boot time). I'll try to summarize the pros and cons: Pros: - The RNG machinery gets initialized super quickly, and there's no messing around with subsequent blocking behavior. - The bootloader mechanism is used by kexec in order for the prior kernel to initialize the RNG of the next kernel, which increases the entropy available to early boot daemons of the next kernel. - Previous objections related to backdoors centered around Dual_EC_DRBG-like kleptographic systems, in which observing some amount of the output stream enables an adversary holding the right key to determine the entire output stream. This used to be a partially justified concern, because RDRAND output was mixed into the output stream in varying ways, some of which may have lacked pre-image resistance (e.g. XOR or an LFSR). But this is no longer the case. Now, all usage of RDRAND and bootloader seeds go through a cryptographic hash function. This means that the CPU would have to compute a hash pre-image, which is not considered to be feasible (otherwise the hash function would be terribly broken). - More generally, if the CPU is backdoored, the RNG is probably not the realistic vector of choice for an attacker. - These CPU or bootloader seeds are far from being the only source of entropy. Rather, there is generally a pretty huge amount of entropy, not all of which is credited, especially on CPUs that support instructions like RDRAND. In other words, assuming RDRAND outputs all zeros, an attacker would still have to accurately model every single other entropy source also in use. - The RNG now reseeds itself quite rapidly during boot, starting at 2 seconds, then 4, then 8, then 16, and so forth, so that other sources of entropy get used without much delay. - Paranoid users can set random.trust_{cpu,bootloader}=no in the kernel command line, and paranoid system builders can set the Kconfig options to N, so there's no reduction or restriction of optionality. - It's a practical default. - All the distros have it set this way. Microsoft and Apple trust it too. Bandwagon. Cons: - RDRAND could still be backdoored with something like a fixed key or limited space serial number seed or another indexable scheme like that. (However, it's hard to imagine threat models where the CPU is backdoored like this, yet people are still okay making any computations with it or connecting it to networks, etc.) - RDRAND could be defective, rather than backdoored, and produce garbage that is in one way or another insufficient for crypto. - Suggesting a reduction in paranoia, as this commit effectively does, may cause some to question my personal integrity as a "security person". - Bootloader seeds and RDRAND are generally very difficult if not all together impossible to audit. Keep in mind that this doesn't actually change any behavior. This is just a change in the default Kconfig value. The distros already are shipping kernels that set things this way. Ard made an additional argument in [1]: We're at the mercy of firmware and micro-architecture anyway, given that we are also relying on it to ensure that every instruction in the kernel's executable image has been faithfully copied to memory, and that the CPU implements those instructions as documented. So I don't think firmware or ISA bugs related to RNGs deserve special treatment - if they are broken, we should quirk around them like we usually do. So enabling these by default is a step in the right direction IMHO. In [2], Phil pointed out that having this disabled masked a bug that CI otherwise would have caught: A clean 5.15.45 boots cleanly, whereas a downstream kernel shows the static key warning (but it does go on to boot). The significant difference is that our defconfigs set CONFIG_RANDOM_TRUST_BOOTLOADER=y defining that on top of multi_v7_defconfig demonstrates the issue on a clean 5.15.45. Conversely, not setting that option in a downstream kernel build avoids the warning [1] https://lore.kernel.org/lkml/CAMj1kXGi+ieviFjXv9zQBSaGyyzeGW_VpMpTLJK8PJb2QHEQ-w@mail.gmail.com/ [2] https://lore.kernel.org/lkml/c47c42e3-1d56-5859-a6ad-976a1a3381c6@raspberrypi.com/ Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-10	random: do not use jump labels before they are initialized	Jason A. Donenfeld	1	-1/+10
	Stephen reported that a static key warning splat appears during early boot on systems that credit randomness from device trees that contain an "rng-seed" property, because because setup_machine_fdt() is called before jump_label_init() during setup_arch(): static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init() WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : static_key_enable_cpuslocked+0xb0/0xb8 lr : static_key_enable_cpuslocked+0xb0/0xb8 sp : ffffffe51c393cf0 x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10 x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000 x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000 x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020 x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708 x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000 x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000 x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027 x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05 x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065 Call trace: static_key_enable_cpuslocked+0xb0/0xb8 static_key_enable+0x2c/0x40 crng_set_ready+0x24/0x30 execute_in_process_context+0x80/0x90 _credit_init_bits+0x100/0x154 add_bootloader_randomness+0x64/0x78 early_init_dt_scan_chosen+0x140/0x184 early_init_dt_scan_nodes+0x28/0x4c early_init_dt_scan+0x40/0x44 setup_machine_fdt+0x7c/0x120 setup_arch+0x74/0x1d8 start_kernel+0x84/0x44c __primary_switched+0xc0/0xc8 ---[ end trace 0000000000000000 ]--- random: crng init done Machine model: Google Lazor (rev1 - 2) with LTE A trivial fix went in to address this on arm64, 73e2d827a501 ("arm64: Initialize jump labels before setup_machine_fdt()"). I wrote patches as well for arm32 and risc-v. But still patches are needed on xtensa, powerpc, arc, and mips. So that's 7 platforms where things aren't quite right. This sort of points to larger issues that might need a larger solution. Instead, this commit just defers setting the static branch until later in the boot process. random_init() is called after jump_label_init() has been called, and so is always a safe place from which to adjust the static branch. Fixes: f5bda35fba61 ("random: use static branch for crng_ready()") Reported-by: Stephen Boyd <swboyd@chromium.org> Reported-by: Phil Elwell <phil@raspberrypi.com> Tested-by: Phil Elwell <phil@raspberrypi.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-10	random: account for arch randomness in bits	Jason A. Donenfeld	1	-5/+5
	Rather than accounting in bytes and multiplying (shifting), we can just account in bits and avoid the shift. The main motivation for this is there are other patches in flux that expand this code a bit, and avoiding the duplication of "* 8" everywhere makes things a bit clearer. Cc: stable@vger.kernel.org Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-10	random: mark bootloader randomness code as __init	Jason A. Donenfeld	2	-5/+4
	add_bootloader_randomness() and the variables it touches are only used during __init and not after, so mark these as __init. At the same time, unexport this, since it's only called by other __init code that's built-in. Cc: stable@vger.kernel.org Fixes: 428826f5358c ("fdt: add support for rng-seed") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-10	random: avoid checking crng_ready() twice in random_init()	Jason A. Donenfeld	1	-1/+1
	The current flow expands to: if (crng_ready()) ... else if (...) if (!crng_ready()) ... The second crng_ready() call is redundant, but can't so easily be optimized out by the compiler. This commit simplifies that to: if (crng_ready() ... else if (...) ... Fixes: 560181c27b58 ("random: move initialization functions out of hot pages") Cc: stable@vger.kernel.org Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-09	net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev	Andrea Mayer	1	-0/+1
	Commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") adds a new entry (flowi_l3mdev) in the common flow struct used for indicating the l3mdev index for later rule and table matching. The l3mdev_update_flow() has been adapted to properly set the flowi_l3mdev based on the flowi_oif/flowi_iif. In fact, when a valid flowi_iif is supplied to the l3mdev_update_flow(), this function can update the flowi_l3mdev entry only if it has not yet been set (i.e., the flowi_l3mdev entry is equal to 0). The SRv6 End.DT6 behavior in VRF mode leverages a VRF device in order to force the routing lookup into the associated routing table. This routing operation is performed by seg6_lookup_any_nextop() preparing a flowi6 data structure used by ip6_route_input_lookup() which, in turn, (indirectly) invokes l3mdev_update_flow(). However, seg6_lookup_any_nexthop() does not initialize the new flowi_l3mdev entry which is filled with random garbage data. This prevents l3mdev_update_flow() from properly updating the flowi_l3mdev with the VRF index, and thus SRv6 End.DT6 (VRF mode)/DT46 behaviors are broken. This patch correctly initializes the flowi6 instance allocated and used by seg6_lookup_any_nexhtop(). Specifically, the entire flowi6 instance is wiped out: in case new entries are added to flowi/flowi6 (as happened with the flowi_l3mdev entry), we should no longer have incorrectly initialized values. As a result of this operation, the value of flowi_l3mdev is also set to 0. The proposed fix can be tested easily. Starting from the commit referenced in the Fixes, selftests [1],[2] indicate that the SRv6 End.DT6 (VRF mode)/DT46 behaviors no longer work correctly. By applying this patch, those behaviors are back to work properly again. [1] - tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh [2] - tools/testing/selftests/net/srv6_end_dt6_l3vpn_test.sh Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") Reported-by: Anton Makarov <am@3a-alliance.com> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220608091917.20345-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09	nfp: flower: restructure flow-key for gre+vlan combination	Etienne van der Linde	2	-24/+24
	Swap around the GRE and VLAN parts in the flow-key offloaded by the driver to fit in with other tunnel types and the firmware. Without this change used cases with GRE+VLAN on the outer header does not get offloaded as the flow-key mismatches what the firmware expect. Fixes: 0d630f58989a ("nfp: flower: add support to offload QinQ match") Fixes: 5a2b93041646 ("nfp: flower-ct: compile match sections of flow_payload") Signed-off-by: Etienne van der Linde <etienne.vanderlinde@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09	nfp: avoid unnecessary check warnings in nfp_app_get_vf_config	Fei Qin	1	-13/+15
	nfp_net_sriov_check is added in nfp_app_get_vf_config which intends to ensure ivi->vlan_proto and ivi->max_tx_rate/min_tx_rate can be read from VF config table only when firmware supports corresponding capability. However, "nfp_app_get_vf_config" can be called by commands like "ip a", "ip link set $DEV up" and "ip link set $DEV vf $NUM vlan $param" (with VF). When using commands above, many warnings "ndo_set_vf_<cap_x> not supported" would appear if firmware doesn't support VF rate limit and 802.1ad VLAN assingment. If more VFs are created, things could get worse. Thus, this patch add an extra bool parameter for nfp_net_sriov_check to enable/disable the cap check warning report. Unnecessary warnings in nfp_app_get_vf_config can be avoided. Valid warnings in kinds of vf setting function can be reserved. Fixes: e0d0e1fdf1ed ("nfp: VF rate limit support") Fixes: 59359597b010 ("nfp: support 802.1ad VLAN assingment to VF") Signed-off-by: Fei Qin <fei.qin@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>