wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2022-07-07	xfs: move CIL ordering to the logvec chain	Dave Chinner	2	-5/+12
	Adding a list_sort() call to the CIL push work while the xc_ctx_lock is held exclusively has resulted in fairly long lock hold times and that stops all front end transaction commits from making progress. We can move the sorting out of the xc_ctx_lock if we can transfer the ordering information to the log vectors as they are detached from the log items and then we can sort the log vectors. With these changes, we can move the list_sort() call to just before we call xlog_write() when we aren't holding any locks at all. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-07	xfs: convert log vector chain to use list heads	Dave Chinner	6	-35/+43
	Because the next change is going to require sorting log vectors, and that requires arbitrary rearrangement of the list which cannot be done easily with a single linked list. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-07	xfs: convert CIL to unordered per cpu lists	Dave Chinner	2	-21/+17
	So that we can remove the cil_lock which is a global serialisation point. We've already got ordering sorted, so all we need to do is treat the CIL list like the busy extent list and reconstruct it before the push starts. This is what we're trying to avoid: - 75.35% 1.83% [kernel] [k] xfs_log_commit_cil - 46.35% xfs_log_commit_cil - 41.54% _raw_spin_lock - 67.30% do_raw_spin_lock 66.96% __pv_queued_spin_lock_slowpath Which happens on a 32p system when running a 32-way 'rm -rf' workload. After this patch: - 20.90% 3.23% [kernel] [k] xfs_log_commit_cil - 17.67% xfs_log_commit_cil - 6.51% xfs_log_ticket_ungrant 1.40% xfs_log_space_wake 2.32% memcpy_erms - 2.18% xfs_buf_item_committing - 2.12% xfs_buf_item_release - 1.03% xfs_buf_unlock 0.96% up 0.72% xfs_buf_rele 1.33% xfs_inode_item_format 1.19% down_read 0.91% up_read 0.76% xfs_buf_item_format - 0.68% kmem_alloc_large - 0.67% kmem_alloc 0.64% __kmalloc 0.50% xfs_buf_item_size It kinda looks like the workload is running out of log space all the time. But all the spinlock contention is gone and the transaction commit rate has gone from 800k/s to 1.3M/s so the amount of real work being done has gone up a lot. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-07	xfs: Add order IDs to log items in CIL	Dave Chinner	3	-8/+33
	Before we split the ordered CIL up into per cpu lists, we need a mechanism to track the order of the items in the CIL. We need to do this because there are rules around the order in which related items must physically appear in the log even inside a single checkpoint transaction. An example of this is intents - an intent must appear in the log before it's intent done record so that log recovery can cancel the intent correctly. If we have these two records misordered in the CIL, then they will not be recovered correctly by journal replay. We also will not be able to move items to the tail of the CIL list when they are relogged, hence the log items will need some mechanism to allow the correct log item order to be recreated before we write log items to the hournal. Hence we need to have a mechanism for recording global order of transactions in the log items so that we can recover that order from un-ordered per-cpu lists. Do this with a simple monotonic increasing commit counter in the CIL context. Each log item in the transaction gets stamped with the current commit order ID before it is added to the CIL. If the item is already in the CIL, leave it where it is instead of moving it to the tail of the list and instead sort the list before we start the push work. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-07	xfs: convert CIL busy extents to per-cpu	Dave Chinner	1	-6/+20
	To get them out from under the CIL lock. This is an unordered list, so we can simply punt it to per-cpu lists during transaction commits and reaggregate it back into a single list during the CIL push work. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-07	xfs: track CIL ticket reservation in percpu structure	Dave Chinner	2	-4/+13
	To get it out from under the cil spinlock. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-07	xfs: implement percpu cil space used calculation	Dave Chinner	2	-31/+149
	Now that we have the CIL percpu structures in place, implement the space used counter as a per-cpu counter. We have to be really careful now about ensuring that the checks and updates run without arbitrary delays, which means they need to run with pre-emption disabled. We do this by careful placement of the get_cpu_ptr/put_cpu_ptr calls to access the per-cpu structures for that CPU. We need to be able to reliably detect that the CIL has reached the hard limit threshold so we can take extra reservations for the iclog headers when the space used overruns the original reservation. hence we factor out xlog_cil_over_hard_limit() from xlog_cil_push_background(). The global CIL space used is an atomic variable that is backed by per-cpu aggregation to minimise the number of atomic updates we do to the global state in the fast path. While we are under the soft limit, we aggregate only when the per-cpu aggregation is over the proportion of the soft limit assigned to that CPU. This means that all CPUs can use all but one byte of their aggregation threshold and we will not go over the soft limit. Hence once we detect that we've gone over both a per-cpu aggregation threshold and the soft limit, we know that we have only exceeded the soft limit by one per-cpu aggregation threshold. Even if all CPUs hit this at the same time, we can't be over the hard limit, so we can run an aggregation back into the atomic counter at this point and still be under the hard limit. At this point, we will be over the soft limit and hence we'll aggregate into the global atomic used space directly rather than the per-cpu counters, hence providing accurate detection of hard limit excursion for accounting and reservation purposes. Hence we get the best of both worlds - lockless, scalable per-cpu fast path plus accurate, atomic detection of hard limit excursion. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-02	xfs: introduce per-cpu CIL tracking structure	Dave Chinner	3	-2/+47
	The CIL push lock is highly contended on larger machines, becoming a hard bottleneck that about 700,000 transaction commits/s on >16p machines. To address this, start moving the CIL tracking infrastructure to utilise per-CPU structures. We need to track the space used, the amount of log reservation space reserved to write the CIL, the log items in the CIL and the busy extents that need to be completed by the CIL commit. This requires a couple of per-cpu counters, an unordered per-cpu list and a globally ordered per-cpu list. Create a per-cpu structure to hold these and all the management interfaces needed, as well as the hooks to handle hotplug CPUs. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-02	xfs: rework per-iclog header CIL reservation	Dave Chinner	3	-25/+59
	For every iclog that a CIL push will use up, we need to ensure we have space reserved for the iclog header in each iclog. It is extremely difficult to do this accurately with a per-cpu counter without expensive summing of the counter in every commit. However, we know what the maximum CIL size is going to be because of the hard space limit we have, and hence we know exactly how many iclogs we are going to need to write out the CIL. We are constrained by the requirement that small transactions only have reservation space for a single iclog header built into them. At commit time we don't know how much of the current transaction reservation is made up of iclog header reservations as calculated by xfs_log_calc_unit_res() when the ticket was reserved. As larger reservations have multiple header spaces reserved, we can steal more than one iclog header reservation at a time, but we only steal the exact number needed for the given log vector size delta. As a result, we don't know exactly when we are going to steal iclog header reservations, nor do we know exactly how many we are going to need for a given CIL. To make things simple, start by calculating the worst case number of iclog headers a full CIL push will require. Record this into an atomic variable in the CIL. Then add a byte counter to the log ticket that records exactly how much iclog header space has been reserved in this ticket by xfs_log_calc_unit_res(). This tells us exactly how much space we can steal from the ticket at transaction commit time. Now, at transaction commit time, we can check if the CIL has a full iclog header reservation and, if not, steal the entire reservation the current ticket holds for iclog headers. This minimises the number of times we need to do atomic operations in the fast path, but still guarantees we get all the reservations we need. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-02	xfs: lift init CIL reservation out of xc_cil_lock	Dave Chinner	1	-16/+14
	The xc_cil_lock is the most highly contended lock in XFS now. To start the process of getting rid of it, lift the initial reservation of the CIL log space out from under the xc_cil_lock. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-02	xfs: use the CIL space used counter for emptiness checks	Dave Chinner	2	-19/+28
	In the next patches we are going to make the CIL list itself per-cpu, and so we cannot use list_empty() to check is the list is empty. Replace the list_empty() checks with a flag in the CIL to indicate we have committed at least one transaction to the CIL and hence the CIL is not empty. We need this flag to be an atomic so that we can clear it without holding any locks in the commit fast path, but we also need to be careful to avoid atomic operations in the fast path. Hence we use the fact that test_bit() is not an atomic op to first check if the flag is set and then run the atomic test_and_clear_bit() operation to clear it and steal the initial unit reservation for the CIL context checkpoint. When we are switching to a new context in a push, we place the setting of the XLOG_CIL_EMPTY flag under the xc_push_lock. THis allows all the other places that need to check whether the CIL is empty to use test_bit() and still be serialised correctly with the CIL context swaps that set the bit. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2022-07-01	xfs: prevent a UAF when log IO errors race with unmount	Darrick J. Wong	1	-2/+7
	KASAN reported the following use after free bug when running generic/475: XFS (dm-0): Mounting V5 Filesystem XFS (dm-0): Starting recovery (logdev: internal) XFS (dm-0): Ending recovery (logdev: internal) Buffer I/O error on dev dm-0, logical block 20639616, async page read Buffer I/O error on dev dm-0, logical block 20639617, async page read XFS (dm-0): log I/O error -5 XFS (dm-0): Filesystem has been shut down due to log error (0x2). XFS (dm-0): Unmounting Filesystem XFS (dm-0): Please unmount the filesystem and rectify the problem(s). ================================================================== BUG: KASAN: use-after-free in do_raw_spin_lock+0x246/0x270 Read of size 4 at addr ffff888109dd84c4 by task 3:1H/136 CPU: 3 PID: 136 Comm: 3:1H Not tainted 5.19.0-rc4-xfsx #rc4 8e53ab5ad0fddeb31cee5e7063ff9c361915a9c4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs] Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x2b8/0x661 ? do_raw_spin_lock+0x246/0x270 kasan_report+0xab/0x120 ? do_raw_spin_lock+0x246/0x270 do_raw_spin_lock+0x246/0x270 ? rwlock_bug.part.0+0x90/0x90 xlog_force_shutdown+0xf6/0x370 [xfs 4ad76ae0d6add7e8183a553e624c31e9ed567318] xlog_ioend_work+0x100/0x190 [xfs 4ad76ae0d6add7e8183a553e624c31e9ed567318] process_one_work+0x672/0x1040 worker_thread+0x59b/0xec0 ? __kthread_parkme+0xc6/0x1f0 ? process_one_work+0x1040/0x1040 ? process_one_work+0x1040/0x1040 kthread+0x29e/0x340 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 154099: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 kmem_alloc+0x8d/0x2e0 [xfs] xlog_cil_init+0x1f/0x540 [xfs] xlog_alloc_log+0xd1e/0x1260 [xfs] xfs_log_mount+0xba/0x640 [xfs] xfs_mountfs+0xf2b/0x1d00 [xfs] xfs_fs_fill_super+0x10af/0x1910 [xfs] get_tree_bdev+0x383/0x670 vfs_get_tree+0x7d/0x240 path_mount+0xdb7/0x1890 __x64_sys_mount+0x1fa/0x270 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 154151: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x110/0x190 slab_free_freelist_hook+0xab/0x180 kfree+0xbc/0x310 xlog_dealloc_log+0x1b/0x2b0 [xfs] xfs_unmountfs+0x119/0x200 [xfs] xfs_fs_put_super+0x6e/0x2e0 [xfs] generic_shutdown_super+0x12b/0x3a0 kill_block_super+0x95/0xd0 deactivate_locked_super+0x80/0x130 cleanup_mnt+0x329/0x4d0 task_work_run+0xc5/0x160 exit_to_user_mode_prepare+0xd4/0xe0 syscall_exit_to_user_mode+0x1d/0x40 entry_SYSCALL_64_after_hwframe+0x46/0xb0 This appears to be a race between the unmount process, which frees the CIL and waits for in-flight iclog IO; and the iclog IO completion. When generic/475 runs, it starts fsstress in the background, waits a few seconds, and substitutes a dm-error device to simulate a disk falling out of a machine. If the fsstress encounters EIO on a pure data write, it will exit but the filesystem will still be online. The next thing the test does is unmount the filesystem, which tries to clean the log, free the CIL, and wait for iclog IO completion. If an iclog was being written when the dm-error switch occurred, it can race with log unmounting as follows: Thread 1 Thread 2 xfs_log_unmount xfs_log_clean xfs_log_quiesce xlog_ioend_work <observe error> xlog_force_shutdown test_and_set_bit(XLOG_IOERROR) xfs_log_force <log is shut down, nop> xfs_log_umount_write <log is shut down, nop> xlog_dealloc_log xlog_cil_destroy <wait for iclogs> spin_lock(&log->l_cilp->xc_push_lock) <KABOOM> Therefore, free the CIL after waiting for the iclogs to complete. I /think/ this race has existed for quite a few years now, though I don't remember the ~2014 era logging code well enough to know if it was a real threat then or if the actual race was exposed only more recently. Fixes: ac983517ec59 ("xfs: don't sleep in xlog_cil_force_lsn on shutdown") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-29	xfs: dont treat rt extents beyond EOF as eofblocks to be cleared	Darrick J. Wong	1	-0/+2
	On a system with a realtime volume and a 28k realtime extent, generic/491 fails because the test opens a file on a frozen filesystem and closing it causes xfs_release -> xfs_can_free_eofblocks to mistakenly think that the the blocks of the realtime extent beyond EOF are posteof blocks to be freed. Realtime extents cannot be partially unmapped, so this is pointless. Worse yet, this triggers posteof cleanup, which stalls on a transaction allocation, which is why the test fails. Teach the predicate to account for realtime extents properly. Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-06-29	xfs: don't hold xattr leaf buffers across transaction rolls	Darrick J. Wong	5	-65/+12
	Now that we've established (again!) that empty xattr leaf buffers are ok, we no longer need to bhold them to transactions when we're creating new leaf blocks. Get rid of the entire mechanism, which should simplify the xattr code quite a bit. The original justification for using bhold here was to prevent the AIL from trying to write the empty leaf block into the fs during the brief time that we release the buffer lock. The reason for /that/ was to prevent recovery from tripping over the empty ondisk block. Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-29	xfs: empty xattr leaf header blocks are not corruption	Darrick J. Wong	1	-9/+17
	TLDR: Revert commit 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify") because it was wrong. Every now and then we get a corruption report from the kernel or xfs_repair about empty leaf blocks in the extended attribute structure. We've long thought that these shouldn't be possible, but prior to 5.18 one would shake loose in the recoveryloop fstests about once a month. A new addition to the xattr leaf block verifier in 5.19-rc1 makes this happen every 7 minutes on my testing cloud. I added a ton of logging to detect any time we set the header count on an xattr leaf block to zero. This produced the following dmesg output on generic/388: XFS (sda4): ino 0x21fcbaf leaf 0x129bf78 hdcount==0! Call Trace: <TASK> dump_stack_lvl+0x34/0x44 xfs_attr3_leaf_create+0x187/0x230 xfs_attr_shortform_to_leaf+0xd1/0x2f0 xfs_attr_set_iter+0x73e/0xa90 xfs_xattri_finish_update+0x45/0x80 xfs_attr_finish_item+0x1b/0xd0 xfs_defer_finish_noroll+0x19c/0x770 __xfs_trans_commit+0x153/0x3e0 xfs_attr_set+0x36b/0x740 xfs_xattr_set+0x89/0xd0 __vfs_setxattr+0x67/0x80 __vfs_setxattr_noperm+0x6e/0x120 vfs_setxattr+0x97/0x180 setxattr+0x88/0xa0 path_setxattr+0xc3/0xe0 __x64_sys_setxattr+0x27/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 So now we know that someone is creating empty xattr leaf blocks as part of converting a sf xattr structure into a leaf xattr structure. The conversion routine logs any existing sf attributes in the same transaction that creates the leaf block, so we know this is a setxattr to a file that has no attributes at all. Next, g/388 calls the shutdown ioctl and cycles the mount to trigger log recovery. I also augmented buffer item recovery to call ->verify_struct on any attr leaf blocks and complain if it finds a failure: XFS (sda4): Unmounting Filesystem XFS (sda4): Mounting V5 Filesystem XFS (sda4): Starting recovery (logdev: internal) XFS (sda4): xattr leaf daddr 0x129bf78 hdrcount == 0! Call Trace: <TASK> dump_stack_lvl+0x34/0x44 xfs_attr3_leaf_verify+0x3b8/0x420 xlog_recover_buf_commit_pass2+0x60a/0x6c0 xlog_recover_items_pass2+0x4e/0xc0 xlog_recover_commit_trans+0x33c/0x350 xlog_recovery_process_trans+0xa5/0xe0 xlog_recover_process_data+0x8d/0x140 xlog_do_recovery_pass+0x19b/0x720 xlog_do_log_recovery+0x62/0xc0 xlog_do_recover+0x33/0x1d0 xlog_recover+0xda/0x190 xfs_log_mount+0x14c/0x360 xfs_mountfs+0x517/0xa60 xfs_fs_fill_super+0x6bc/0x950 get_tree_bdev+0x175/0x280 vfs_get_tree+0x1a/0x80 path_mount+0x6f5/0xaa0 __x64_sys_mount+0x103/0x140 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fc61e241eae And a moment later, the _delwri_submit of the recovered buffers trips the same verifier and recovery fails: XFS (sda4): Metadata corruption detected at xfs_attr3_leaf_verify+0x393/0x420 [xfs], xfs_attr3_leaf block 0x129bf78 XFS (sda4): Unmount and run xfs_repair XFS (sda4): First 128 bytes of corrupted metadata buffer: 00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;....... 00000010: 00 00 00 00 01 29 bf 78 00 00 00 00 00 00 00 00 .....).x........ 00000020: a5 1b d0 02 b2 9a 49 df 8e 9c fb 8d f8 31 3e 9d ......I......1>. 00000030: 00 00 00 00 02 1f cb af 00 00 00 00 10 00 00 00 ................ 00000040: 00 50 0f b0 00 00 00 00 00 00 00 00 00 00 00 00 .P.............. 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ XFS (sda4): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x37f/0x3b0 [xfs] (fs/xfs/xfs_buf.c:1518). Shutting down filesystem. XFS (sda4): Please unmount the filesystem and rectify the problem(s) XFS (sda4): log mount/recovery failed: error -117 XFS (sda4): log mount failed I think I see what's going on here -- setxattr is racing with something that shuts down the filesystem: Thread 1 Thread 2 -------- -------- xfs_attr_sf_addname xfs_attr_shortform_to_leaf <create empty leaf> xfs_trans_bhold(leaf) xattri_dela_state = XFS_DAS_LEAF_ADD <roll transaction> <flush log> <shut down filesystem> xfs_trans_bhold_release(leaf) <discover fs is dead, bail> Thread 3 -------- <cycle mount, start recovery> xlog_recover_buf_commit_pass2 xlog_recover_do_reg_buffer <replay empty leaf buffer from recovered buf item> xfs_buf_delwri_queue(leaf) xfs_buf_delwri_submit _xfs_buf_ioapply(leaf) xfs_attr3_leaf_write_verify <trip over empty leaf buffer> <fail recovery> As you can see, the bhold keeps the leaf buffer locked and thus prevents the AIL from tripping over the ichdr.count==0 check in the write verifier. Unfortunately, it doesn't prevent the log from getting flushed to disk, which sets up log recovery to fail. So. It's clear that the kernel has always had the ability to persist attr leaf blocks with ichdr.count==0, which means that it's part of the ondisk format now. Unfortunately, this check has been added and removed multiple times throughout history. It first appeared in[1] kernel 3.10 as part of the early V5 format patches. The check was later discovered to break log recovery and hence disabled[2] during log recovery in kernel 4.10. Simultaneously, the check was added[3] to xfs_repair 4.9.0 to try to weed out the empty leaf blocks. This was still not correct because log recovery would recover an empty attr leaf block successfully only for regular xattr operations to trip over the empty block during of the block during regular operation. Therefore, the check was removed entirely[4] in kernel 5.7 but removal of the xfs_repair check was forgotten. The continued complaints from xfs_repair lead to us mistakenly re-adding[5] the verifier check for kernel 5.19. Remove it once again. [1] 517c22207b04 ("xfs: add CRCs to attr leaf blocks") [2] 2e1d23370e75 ("xfs: ignore leaf attr ichdr.count in verifier during log replay") [3] f7140161 ("xfs_repair: junk leaf attribute if count == 0") [4] f28cef9e4dac ("xfs: don't fail verifier on empty attr3 leaf block") [5] 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify") Looking at the rest of the xattr code, it seems that files with empty leaf blocks behave as expected -- listxattr reports no attributes; getxattr on any xattr returns nothing as expected; removexattr does nothing; and setxattr can add attributes just fine. Original-bug: 517c22207b04 ("xfs: add CRCs to attr leaf blocks") Still-not-fixed-by: 2e1d23370e75 ("xfs: ignore leaf attr ichdr.count in verifier during log replay") Removed-in: f28cef9e4dac ("xfs: don't fail verifier on empty attr3 leaf block") Fixes: 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26	xfs: clean up the end of xfs_attri_item_recover	Darrick J. Wong	1	-19/+26
	The end of this function could use some cleanup -- the EAGAIN conditionals make it harder to figure out what's going on with the disposal of xattri_leaf_bp, and the dual error/ret variables aren't needed. Turn the EAGAIN case into a separate block documenting all the subtleties of recovering in the middle of an xattr update chain, which makes the rest of the prologue much simpler. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26	xfs: always free xattri_leaf_bp when cancelling a deferred op	Darrick J. Wong	1	-1/+19
	While running the following fstest with logged xattrs DISabled, I noticed the following: # FSSTRESS_AVOID="-z -f unlink=1 -f rmdir=1 -f creat=2 -f mkdir=2 -f getfattr=3 -f listfattr=3 -f attr_remove=4 -f removefattr=4 -f setfattr=20 -f attr_set=60" ./check generic/475 INFO: task u9:1:40 blocked for more than 61 seconds. Tainted: G O 5.19.0-rc2-djwx #rc2 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:u9:1 state:D stack:12872 pid: 40 ppid: 2 flags:0x00004000 Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs] Call Trace: <TASK> __schedule+0x2db/0x1110 schedule+0x58/0xc0 schedule_timeout+0x115/0x160 __down_common+0x126/0x210 down+0x54/0x70 xfs_buf_lock+0x2d/0xe0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xfs_buf_item_unpin+0x227/0x3a0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xfs_trans_committed_bulk+0x18e/0x320 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xlog_cil_committed+0x2ea/0x360 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] xlog_cil_push_work+0x60f/0x690 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73] process_one_work+0x1df/0x3c0 worker_thread+0x53/0x3b0 kthread+0xea/0x110 ret_from_fork+0x1f/0x30 </TASK> This appears to be the result of shortform_to_leaf creating a new leaf buffer as part of adding an xattr to a file. The new leaf buffer is held and attached to the xfs_attr_intent structure, but then the filesystem shuts down. Instead of the usual path (which adds the attr to the held leaf buffer which releases the hold), we instead cancel the entire deferred operation. Unfortunately, xfs_attr_cancel_item doesn't release any attached leaf buffers, so we leak the locked buffer. The CIL cannot do anything about that, and hangs. Fix this by teaching it to release leaf buffers, and make XFS a little more careful about not leaving a dangling reference. The prologue of xfs_attri_item_recover is (in this author's opinion) a little hard to figure out, so I'll clean that up in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26	xfs: use invalidate_lock to check the state of mmap_lock	Kaixu Xia	1	-2/+2
	We should use invalidate_lock and XFS_MMAPLOCK_SHARED to check the state of mmap_lock rw_semaphore in xfs_isilocked(), rather than i_rwsem and XFS_IOLOCK_SHARED. Fixes: 2433480a7e1d ("xfs: Convert to use invalidate_lock") Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-26	xfs: factor out the common lock flags assert	Kaixu Xia	1	-37/+23
	There are similar lock flags assert in xfs_ilock(), xfs_ilock_nowait(), xfs_iunlock(), thus we can factor it out into a helper that is clear. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23	xfs: introduce xfs_inodegc_push()	Dave Chinner	5	-10/+28
	The current blocking mechanism for pushing the inodegc queue out to disk can result in systems becoming unusable when there is a long running inodegc operation. This is because the statfs() implementation currently issues a blocking flush of the inodegc queue and a significant number of common system utilities will call statfs() to discover something about the underlying filesystem. This can result in userspace operations getting stuck on inodegc progress, and when trying to remove a heavily reflinked file on slow storage with a full journal, this can result in delays measuring in hours. Avoid this problem by adding "push" function that expedites the flushing of the inodegc queue, but doesn't wait for it to complete. Convert xfs_fs_statfs() and xfs_qm_scall_getquota() to use this mechanism so they don't block but still ensure that queued operations are expedited. Fixes: ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues") Reported-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Dave Chinner <dchinner@redhat.com> [djwong: fix _getquota_next to use _inodegc_push too] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23	xfs: bound maximum wait time for inodegc work	Dave Chinner	3	-16/+24
	Currently inodegc work can sit queued on the per-cpu queue until the workqueue is either flushed of the queue reaches a depth that triggers work queuing (and later throttling). This means that we could queue work that waits for a long time for some other event to trigger flushing. Hence instead of just queueing work at a specific depth, use a delayed work that queues the work at a bound time. We can still schedule the work immediately at a given depth, but we no long need to worry about leaving a number of items on the list that won't get processed until external events prevail. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-15	xfs: preserve DIFLAG2_NREXT64 when setting other inode attributes	Darrick J. Wong	1	-1/+2
	It is vitally important that we preserve the state of the NREXT64 inode flag when we're changing the other flags2 fields. Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15	xfs: fix variable state usage	Darrick J. Wong	1	-2/+1
	The variable @args is fed to a tracepoint, and that's the only place it's used. This is fine for the kernel, but for userspace, tracepoints are #define'd out of existence, which results in this warning on gcc 11.2: xfs_attr.c: In function ‘xfs_attr_node_try_addname’: xfs_attr.c:1440:42: warning: unused variable ‘args’ [-Wunused-variable] 1440 \| struct xfs_da_args *args = attr->xattri_da_args; \| ^~~~ Clean this up. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15	xfs: fix TOCTOU race involving the new logged xattrs control knob	Darrick J. Wong	6	-22/+34
	I found a race involving the larp control knob, aka the debugging knob that lets developers enable logging of extended attribute updates: Thread 1 Thread 2 echo 0 > /sys/fs/xfs/debug/larp setxattr(REPLACE) xfs_has_larp (returns false) xfs_attr_set echo 1 > /sys/fs/xfs/debug/larp xfs_attr_defer_replace xfs_attr_init_replace_state xfs_has_larp (returns true) xfs_attr_init_remove_state <oops, wrong DAS state!> This isn't a particularly severe problem right now because xattr logging is only enabled when CONFIG_XFS_DEBUG=y, and developers should know what they're doing. However, the eventual intent is that callers should be able to ask for the assistance of the log in persisting xattr updates. This capability might not be required for /all/ callers, which means that dynamic control must work correctly. Once an xattr update has decided whether or not to use logged xattrs, it needs to stay in that mode until the end of the operation regardless of what subsequent parallel operations might do. Therefore, it is an error to continue sampling xfs_globals.larp once xfs_attr_change has made a decision about larp, and it was not correct for me to have told Allison that ->create_intent functions can sample the global log incompat feature bitfield to decide to elide a log item. Instead, create a new op flag for the xfs_da_args structure, and convert all other callers of xfs_has_larp and xfs_sb_version_haslogxattrs within the attr update state machine to look for the operations flag. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-12	Linux 5.19-rc2	Linus Torvalds	1	-1/+1

2022-06-12	platform/x86/intel: hid: Add Surface Go to VGBS allow list	Duke Lee	1	-0/+6
	The Surface Go reports Chassis Type 9 (Laptop,) so the device needs to be added to dmi_vgbs_allow_list to enable tablet mode when an attached Type Cover is folded back. BugLink: https://github.com/linux-surface/linux-surface/issues/837 Signed-off-by: Duke Lee <krnhotwings@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220607213654.5567-1-krnhotwings@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-12	platform/x86: hp-wmi: Use zero insize parameter only when supported	Bedant Patnaik	1	-8/+15
	commit be9d73e64957 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls") and commit 12b19f14a21a ("platform/x86: hp-wmi: Fix hp_wmi_read_int() reporting error (0x05)") cause ACPI BIOS Error (bug): Attempt to CreateField of length zero (20211217/dsopcode-133) because of the ACPI method HWMC, which unconditionally creates a Field of size (insize8) bits: CreateField (Arg1, 0x80, (Local5 0x08), DAIN) In cases where args->insize = 0, the Field size is 0, resulting in an error. Fix this by using zero insize only if 0x5 error code is returned Tested on Omen 15 AMD (2020) board ID: 8786. Fixes: be9d73e64957 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls") Signed-off-by: Bedant Patnaik <bedant.patnaik@gmail.com> Tested-by: Jorge Lopez <jorge.lopez2@hp.com> Link: https://lore.kernel.org/r/41be46743d21c78741232a47bbb5f1cdbcc3d21e.camel@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-12	platform/x86: hp-wmi: Resolve WMI query failures on some devices	Jorge Lopez	1	-2/+4
	WMI queries fail on some devices where the ACPI method HWMC unconditionally attempts to create Fields beyond the buffer if the buffer is too small, this breaks essential features such as power profiles: CreateByteField (Arg1, 0x10, D008) CreateByteField (Arg1, 0x11, D009) CreateByteField (Arg1, 0x12, D010) CreateDWordField (Arg1, 0x10, D032) CreateField (Arg1, 0x80, 0x0400, D128) In cases where args->data had zero length, ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D008] at bit offset/length 128/8 exceeds size of target Buffer (128 bits) (20211217/dsopcode-198) was obtained. ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D009] at bit offset/length 136/8 exceeds size of target Buffer (136bits) (20211217/dsopcode-198) The original code created a buffer size of 128 bytes regardless if the WMI call required a smaller buffer or not. This particular behavior occurs in older BIOS and reproduced in OMEN laptops. Newer BIOS handles buffer sizes properly and meets the latest specification requirements. This is the reason why testing with a dynamically allocated buffer did not uncover any failures with the test systems at hand. This patch was tested on several OMEN, Elite, and Zbooks. It was confirmed the patch resolves HPWMI_FAN GET/SET calls in an OMEN Laptop 15-ek0xxx. No problems were reported when testing on several Elite and Zbooks notebooks. Fixes: 4b4967cbd268 ("platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated") Signed-off-by: Jorge Lopez <jorge.lopez2@hp.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220608212923.8585-2-jorge.lopez2@hp.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-11	workqueue: Switch to new kerneldoc syntax for named variable macro argument	Jonathan Neuschäfer	1	-1/+1
	The syntax without dots is available since commit 43756e347f21 ("scripts/kernel-doc: Add support for named variable macro arguments"). The same HTML output is produced with and without this patch. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-06-11	iov_iter: fix build issue due to possible type mis-match	Linus Torvalds	1	-2/+2
	Commit 6c77676645ad ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()") introduced a problem on some 32-bit architectures (at least arm, xtensa, csky,sparc and mips), that have a 'size_t' that is 'unsigned int'. The reason is that we now do min(nr * PAGE_SIZE - offset, maxsize); where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is 'unsigned long'. As a result, the normal C type rules means that the first argument to 'min()' ends up being 'unsigned long'. In contrast, 'maxsize' is of type 'size_t'. Now, 'size_t' and 'unsigned long' are always the same physical type in the kernel, so you'd think this doesn't matter, and from an actual arithmetic standpoint it doesn't. But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if it could also be 'unsigned long'. In that situation, both are unsigned 32-bit types, but they are not the same type. And as a result 'min()' will complain about the distinct types (ignore the "pointer types" part of the error message: that's an artifact of the way we have made 'min()' check types for being the same): lib/iov_iter.c: In function 'iter_xarray_get_pages': include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror] 20 \| (!!(sizeof((typeof(x) )1 == (typeof(y) )1))) \| ^~ lib/iov_iter.c:1464:16: note: in expansion of macro 'min' 1464 \| return min(nr * PAGE_SIZE - offset, maxsize); \| ^~~ This was not visible on 64-bit architectures (where we always define 'size_t' to be 'unsigned long'). Force these cases to use 'min_t(size_t, x, y)' to make the type explicit and avoid the issue. [ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned long' arithmetically. We've certainly historically seen environments with 16-bit address spaces and 32-bit 'unsigned long'. Similarly, even in 64-bit modern environments, 'size_t' could be its own type distinct from 'unsigned long', even if it were arithmetically identical. So the above type commentary is only really descriptive of the kernel environment, not some kind of universal truth for the kinds of wild and crazy situations that are allowed by the C standard ] Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/ Cc: Jeff Layton <jlayton@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-11	wireguard: selftests: use maximum cpu features and allow rng seeding	Jason A. Donenfeld	3	-15/+19
	By forcing the maximum CPU that QEMU has available, we expose additional capabilities, such as the RNDR instruction, which increases test coverage. This then allows the CI to skip the fake seeding step in some cases. Also enable STRICT_KERNEL_RWX to catch issues related to early jump labels when the RNG is initialized at boot. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-11	scripts/gdb: change kernel config dumping method	Kuan-Ying Lee	1	-3/+3
	MAGIC_START("IKCFG_ST") and MAGIC_END("IKCFG_ED") are moved out from the kernel_config_data variable. Thus, we parse kernel_config_data directly instead of considering offset of MAGIC_START and MAGIC_END. Fixes: 13610aa908dc ("kernel/configs: use .incbin directive to embed config_data.gz") Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-06-10	um: virt-pci: set device ready in probe()	Vincent Whitchurch	1	-1/+6
	Call virtio_device_ready() to make this driver work after commit b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the virtqueues in the probe function. (The virtio core sets the device ready when probe returns.) Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ") Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Johannes Berg <johannes@sipsolutions.net>
2022-06-10	cifs: populate empty hostnames for extra channels	Shyam Prasad N	2	-1/+8
	Currently, the secondary channels of a multichannel session also get hostname populated based on the info in primary channel. However, this will end up with a wrong resolution of hostname to IP address during reconnect. This change fixes this by not populating hostname info for all secondary channels. Fixes: 5112d80c162f ("cifs: populate server_hostname for extra channels") Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-10	platform/x86: gigabyte-wmi: Add support for B450M DS3H-CF	August Wikerfors	1	-0/+1
	Tested and works on my system. Signed-off-by: August Wikerfors <git@augustwikerfors.se> Link: https://lore.kernel.org/r/20220608212028.28307-1-git@augustwikerfors.se Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86: gigabyte-wmi: Add Z690M AORUS ELITE AX DDR4 support	Piotr Chmura	1	-0/+1
	Add dmi_system_id of Gigabyte Z690M AORUS ELITE AX DDR4 board. Tested on my PC. Signed-off-by: Piotr Chmura <chmooreck@gmail.com> Link: https://lore.kernel.org/r/bd83567e-ebf5-0b31-074b-5f6dc7f7c147@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86: barco-p50-gpio: Add check for platform_driver_register	Jiasheng Jiang	1	-1/+4
	As platform_driver_register() could fail, it should be better to deal with the return value in order to maintain the code consisitency. Fixes: 86af1d02d458 ("platform/x86: Support for EC-connected GPIOs for identify LED/button on Barco P50 board") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Peter Korsgaard <peter.korsgaard@barco.com> Link: https://lore.kernel.org/r/20220526090345.1444172-1-jiasheng@iscas.ac.cn Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86/intel: pmc: Support Intel Raptorlake P	George D Sworo	1	-0/+1
	Add Raptorlake P to the list of the platforms that intel_pmc_core driver supports for pmc_core device. Raptorlake P PCH is based on Alderlake P PCH. Signed-off-by: George D Sworo <george.d.sworo@intel.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220602012617.20100-1-george.d.sworo@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/x86/intel: Fix pmt_crashlog array reference	David Arcari	1	-1/+1
	The probe function pmt_crashlog_probe() may incorrectly reference the 'priv->entry array' as it uses 'i' to reference the array instead of 'priv->num_entries' as it should. This is similar to the problem that was addressed in pmt_telemetry_probe via commit 2cdfa0c20d58 ("platform/x86/intel: Fix 'rmmod pmt_telemetry' panic"). Cc: "David E. Box" <david.e.box@linux.intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mark Gross <markgross@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: David Arcari <darcari@redhat.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220526203140.339120-1-darcari@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	platform/mellanox: Add static in struct declaration.	Michael Shych	1	-1/+1
	Fix problem of missing static in struct declaration. Fixes: 662f24826f954 ("platform/mellanox: Add support for new SN2201 system") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20220602145103.11859-1-michaelsh@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	iov_iter: Fix iter_xarray_get_pages{,_alloc}()	David Howells	1	-16/+4
	The maths at the end of iter_xarray_get_pages() to calculate the actual size doesn't work under some circumstances, such as when it's been asked to extract a partial single page. Various terms of the equation cancel out and you end up with actual == offset. The same issue exists in iter_xarray_get_pages_alloc(). Fix these to just use min() to select the lesser amount from between the amount of page content transcribed into the buffer, minus the offset, and the size limit specified. This doesn't appear to have caused a problem yet upstream because network filesystems aren't getting the pages from an xarray iterator, but rather passing it directly to the socket, which just iterates over it. Cachefiles does do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for whole pages to be written or read. Fixes: 7ff5062079ef ("iov_iter: Add ITER_XARRAY") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Mike Marshall <hubcap@omnibond.com> cc: Gao Xiang <xiang@kernel.org> cc: linux-afs@lists.infradead.org cc: v9fs-developer@lists.sourceforge.net cc: devel@lists.orangefs.org cc: linux-erofs@lists.ozlabs.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-10	netfs: Rename the netfs_io_request cleanup op and give it an op pointer	David Howells	6	-30/+29
	The netfs_io_request cleanup op is now always in a position to be given a pointer to a netfs_io_request struct, so this can be passed in instead of the mapping and private data arguments (both of which are included in the struct). So rename the ->cleanup op to ->free_request (to match ->init_request) and pass in the I/O pointer. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com
2022-06-10	netfs: Further cleanups after struct netfs_inode wrapper introduced	Linus Torvalds	14	-30/+30
	Change the signature of netfs helper functions to take a struct netfs_inode pointer rather than a struct inode pointer where appropriate, thereby relieving the need for the network filesystem to convert its internal inode format down to the VFS inode only for netfslib to bounce it back up. For type safety, it's better not to do that (and it's less typing too). Give netfs_write_begin() an extra argument to pass in a pointer to the netfs_inode struct rather than deriving it internally from the file pointer. Note that the ->write_begin() and ->write_end() ops are intended to be replaced in the future by netfslib code that manages this without the need to call in twice for each page. netfs_readpage() and similar are intended to be pointed at directly by the address_space_operations table, so must stick to the signature dictated by the function pointers there. Changes ======= - Updated the kerneldoc comments and documentation [DH]. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
2022-06-10	afs: Fix some checker issues	David Howells	1	-2/+1
	Remove an unused global variable and make another static as reported by make C=1. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org
2022-06-10	dm: fix zoned locking imbalance due to needless check in clone_endio	Mike Snitzer	1	-15/+11
	After the commit ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone"), clone_endio() only calls dm_zone_endio() when DM targets remap the clone bio's bdev to something other than the md->disk->part0 default. However, if a DM target (e.g. dm-crypt) stacked ontop of a dm-zoned does not remap the clone bio using bio_set_dev() then dm_zone_endio() is not called at completion of the bios and zone locks are not properly unlocked. This triggers a hang, in dm_zone_map_bio(), when blktests block/004 is run for dm-crypt on zoned block devices. To avoid the hang, simply remove the clone_endio() check that verifies the target remapped the clone bio to a device other than the default. Fixes: ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone") Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-10	platform/mellanox: Spelling s/platfom/platform/	Geert Uytterhoeven	1	-1/+1
	Fix a misspelling of the word "platform". Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/9c8edde31e271311b7832d7677fe84aba917da8d.1653376503.git.geert@linux-m68k.org Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-10	certs: Convert spaces in certs/Makefile to a tab	David Howells	1	-1/+1
	There's a rule in certs/Makefile for which the command begins with eight spaces. This results in: ../certs/Makefile:21: FORCE prerequisite is missing ../certs/Makefile:21: *** missing separator. Stop. Fix this by turning the spaces into a tab. Fixes: addf466389d9 ("certs: Check that builtin blacklist hashes are valid") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com> cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/486b1b80-9932-aab6-138d-434c541c934a@digikod.net/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-10	dt-bindings: display: arm,malidp: remove bogus RQOS property	Andre Przywara	1	-6/+1
	As Liviu pointed out, the arm,malidp-arqos-high-level property mentioned in the original .txt binding was a mistake, and arm,malidp-arqos-value needs to take its place. The binding commit ce6eb0253cba ("dt/bindings: display: Add optional property node define for Mali DP500") mentions the right name in the commit message, but has the wrong name in the diff. Commit d298e6a27a81 ("drm/arm/mali-dp: Add display QoS interface configuration for Mali DP500") uses the property in the driver, but uses the shorter name. Remove the wrong property from the binding, and use the proper name in the example. The actual property was already documented properly. Fixes: 2c8b082a3ab1 ("dt-bindings: display: convert Arm Mali-DP to DT schema") Link: https://lore.kernel.org/linux-arm-kernel/YnumGEilUblhBx8E@e110455-lin.cambridge.arm.com/ Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reported-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220609162729.1441760-1-andre.przywara@arm.com
2022-06-10	dt-bindings: pinctrl: ralink: Fix 'enum' lists with duplicate entries	Rob Herring	2	-25/+28
	There's no reason to list the same value twice in an 'enum'. This was fixed treewide in commit c3b006819426 ("dt-bindings: Fix 'enum' lists with duplicate entries"), but this one got added in the merge window. A meta-schema change will catch future cases. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Link: https://lore.kernel.org/r/20220606212239.1360877-1-robh@kernel.org
2022-06-10	arm64: Add kasan_hw_tags_enable() prototype to silence sparse	Catalin Marinas	1	-0/+6
	This function is only called from assembly, no need for a prototype declaration in a header file. In addition, add #ifdef around the function since it is only used when CONFIG_KASAN_HW_TAGS. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: kernel test robot <lkp@intel.com>