linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-09-24	xfs: log proper length of superblock	Eric Sandeen	1	-1/+1
	xfs_trans_log_buf takes first byte, last byte as args. In this case, it should be from 0 to sizeof() - 1. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-23	xfs: revert 1baa2800e62d ("xfs: remove the unused XFS_ALLOC_USERDATA flag")	Darrick J. Wong	2	-5/+10
	Revert this commit, as it caused periodic regressions in xfs/173 w/ 1k blocks. [1] https://lore.kernel.org/lkml/20190919014602.GN15734@shao2-debian/ Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-09-23	xfs: convert inode to extent format after extent merge due to shift	Brian Foster	1	-0/+5
	The collapse range operation can merge extents if two newly adjacent extents are physically contiguous. If the extent count is reduced on a btree format inode, a change to extent format might be necessary. This format change currently occurs as a side effect of the file size update after extents have been shifted for the collapse. This codepath ultimately calls xfs_bunmapi(), which happens to check for and execute the format conversion even if there were no blocks removed from the mapping. While this ultimately puts the inode into the correct state, the fact the format conversion occurs in a separate transaction from the change that called for it is a problem. If an extent shift transaction commits and the filesystem happens to crash before the format conversion, the inode fork is left in a corrupted state after log recovery. The inode fork verifier fails and xfs_repair ultimately nukes the inode. This problem was originally reproduced by generic/388. Similar to how the insert range extent split code handles extent to btree conversion, update the collapse range extent merge code to handle btree to extent format conversion in the same transaction that merges the extents. This ensures that the inode fork format remains consistent if the filesystem happens to crash in the middle of a collapse range operation that changes the inode fork format. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-03	xfs: define a flags field for the AG geometry ioctl structure	Darrick J. Wong	1	-1/+1
	Define a flags field for the AG geometry ioctl structure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-09-03	xfs: add a xfs_valid_startblock helper	Christoph Hellwig	2	-1/+8
	Add a helper that validates the startblock is valid. This checks for a non-zero block on the main device, but skips that check for blocks on the realtime device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: remove the unused XFS_ALLOC_USERDATA flag	Christoph Hellwig	2	-10/+5
	Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: allocate xattr buffer on demand	Dave Chinner	4	-9/+49
	When doing file lookups and checking for permissions, we end up in xfs_get_acl() to see if there are any ACLs on the inode. This requires and xattr lookup, and to do that we have to supply a buffer large enough to hold an maximum sized xattr. On workloads were we are accessing a wide range of cache cold files under memory pressure (e.g. NFS fileservers) we end up spending a lot of time allocating the buffer. The buffer is 64k in length, so is a contiguous multi-page allocation, and if that then fails we fall back to vmalloc(). Hence the allocation here is /expensive/ when we are looking up hundreds of thousands of files a second. Initial numbers from a bpf trace show average time in xfs_get_acl() is ~32us, with ~19us of that in the memory allocation. Note these are average times, so there are going to be affected by the worst case allocations more than the common fast case... To avoid this, we could just do a "null" lookup to see if the ACL xattr exists and then only do the allocation if it exists. This, however, optimises the path for the "no ACL present" case at the expense of the "acl present" case. i.e. we can halve the time in xfs_get_acl() for the no acl case (i.e down to ~10-15us), but that then increases the ACL case by 30% (i.e. up to 40-45us). To solve this and speed up both cases, drive the xattr buffer allocation into the attribute code once we know what the actual xattr length is. For the no-xattr case, we avoid the allocation completely, speeding up that case. For the common ACL case, we'll end up with a fast heap allocation (because it'll be smaller than a page), and only for the rarer "we have a remote xattr" will we have a multi-page allocation occur. Hence the common ACL case will be much faster, too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: consolidate attribute value copying	Dave Chinner	1	-39/+49
	The same code is used to copy do the attribute copying in three different places. Consolidate them into a single function in preparation from on-demand buffer allocation. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: move remote attr retrieval into xfs_attr3_leaf_getvalue	Dave Chinner	2	-16/+2
	Because we repeat exactly the same code to get the remote attribute value after both calls to xfs_attr3_leaf_getvalue() if it's a remote attr. Just do it in xfs_attr3_leaf_getvalue() so the callers don't have to care about it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: remove unnecessary indenting from xfs_attr3_leaf_getvalue	Dave Chinner	1	-16/+17
	Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: make attr lookup returns consistent	Dave Chinner	3	-24/+48
	Shortform, leaf and remote value attr value retrieval return different values for success. This makes it more complex to handle actual errors xfs_attr_get() as some errors mean success and some mean failure. Make the return values consistent for success and failure consistent for all attribute formats. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: reverse search directory freespace indexes	Dave Chinner	1	-8/+5
	When a directory is growing rapidly, new blocks tend to get added at the end of the directory. These end up at the end of the freespace index, and when the directory gets large finding these new freespaces gets expensive. The code does a linear search across the frespace index from the first block in the directory to the last, hence meaning the newly added space is the last index searched. Instead, do a reverse order index search, starting from the last block and index in the freespace index. This makes most lookups for free space on rapidly growing directories O(1) instead of O(N), but should not have any impact on random insert workloads because the average search length is the same regardless of which end of the array we start at. The result is a major improvement in large directory grow rates: create time(sec) / rate (files/s) File count vanilla Prev commit Patched 10k 0.41 / 24.3k 0.42 / 23.8k 0.41 / 24.3k 20k 0.74 / 27.0k 0.76 / 26.3k 0.75 / 26.7k 100k 3.81 / 26.4k 3.47 / 28.8k 3.27 / 30.6k 200k 8.58 / 23.3k 7.19 / 27.8k 6.71 / 29.8k 1M 85.69 / 11.7k 48.53 / 20.6k 37.67 / 26.5k 2M 280.31 / 7.1k 130.14 / 15.3k 79.55 / 25.2k 10M 3913.26 / 2.5k 552.89 / 18.1k Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: speed up directory bestfree block scanning	Dave Chinner	1	-63/+34
	When running a "create millions inodes in a directory" test recently, I noticed we were spending a huge amount of time converting freespace block headers from disk format to in-memory format: 31.47% [kernel] [k] xfs_dir2_node_addname 17.86% [kernel] [k] xfs_dir3_free_hdr_from_disk 3.55% [kernel] [k] xfs_dir3_free_bests_p We shouldn't be hitting the best free block scanning code so hard when doing sequential directory creates, and it turns out there's a highly suboptimal loop searching the the best free array in the freespace block - it decodes the block header before checking each entry inside a loop, instead of decoding the header once before running the entry search loop. This makes a massive difference to create rates. Profile now looks like this: 13.15% [kernel] [k] xfs_dir2_node_addname 3.52% [kernel] [k] xfs_dir3_leaf_check_int 3.11% [kernel] [k] xfs_log_commit_cil And the wall time/average file create rate differences are just as stark: create time(sec) / rate (files/s) File count vanilla patched 10k 0.41 / 24.3k 0.42 / 23.8k 20k 0.74 / 27.0k 0.76 / 26.3k 100k 3.81 / 26.4k 3.47 / 28.8k 200k 8.58 / 23.3k 7.19 / 27.8k 1M 85.69 / 11.7k 48.53 / 20.6k 2M 280.31 / 7.1k 130.14 / 15.3k The larger the directory, the bigger the performance improvement. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: factor free block index lookup from xfs_dir2_node_addname_int()	Dave Chinner	1	-92/+102
	Simplify the logic in xfs_dir2_node_addname_int() by factoring out the free block index lookup code that finds a block with enough free space for the entry to be added. The code that is moved gets a major cleanup at the same time, but there is no algorithm change here. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: factor data block addition from xfs_dir2_node_addname_int()	Dave Chinner	1	-166/+158
	Factor out the code that adds a data block to a directory from xfs_dir2_node_addname_int(). This makes the code flow cleaner and more obvious and provides clear isolation of upcoming optimsations. Signed-off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: move xfs_dir2_addname()	Dave Chinner	1	-71/+69
	This gets rid of the need for a forward declaration of the static function xfs_dir2_addname_int() and readies the code for factoring of xfs_dir2_addname_int(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30	xfs: remove all *_ITER_CONTINUE values	Darrick J. Wong	3	-8/+4
	Iterator functions already use 0 to signal "continue iterating", so get rid of the #defines and just do it directly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-29	xfs: remove all *_ITER_ABORT values	Darrick J. Wong	4	-17/+17
	Use -ECANCELED to signal "stop iterating" instead of these magical *_ITER_ABORT values, since it's duplicative. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: reinitialize rm_flags when unpacking an offset into an rmap irec	Darrick J. Wong	2	-1/+1
	In xfs_rmap_irec_offset_unpack, we should always clear the contents of rm_flags before we begin unpacking the encoded (ondisk) offset into the incore rm_offset and incore rm_flags fields. Remove the open-coded field zeroing as this encourages api misuse. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: remove unnecessary int returns from deferred bmap functions	Darrick J. Wong	2	-8/+8
	Remove the return value from the functions that schedule deferred bmap operations since they never fail and do not return status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: remove unnecessary int returns from deferred refcount functions	Darrick J. Wong	3	-44/+28
	Remove the return value from the functions that schedule deferred refcount operations since they never fail and do not return status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: remove unnecessary int returns from deferred rmap functions	Darrick J. Wong	4	-52/+36
	Remove the return value from the functions that schedule deferred rmap operations since they never fail and do not return status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: remove unnecessary parameter from xfs_iext_inc_seq	Darrick J. Wong	1	-4/+4
	This function doesn't use the @state parameter, so get rid of it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: fix sign handling problem in xfs_bmbt_diff_two_keys	Darrick J. Wong	1	-2/+14
	In xfs_bmbt_diff_two_keys, we perform a signed int64_t subtraction with two unsigned 64-bit quantities. If the second quantity is actually the "maximum" key (all ones) as used in _query_all, the subtraction effectively becomes addition of two positive numbers and the function returns incorrect results. Fix this with explicit comparisons of the unsigned values. Nobody needs this now, but the online repair patches will need this to work properly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: don't return _QUERY_ABORT from xfs_rmap_has_other_keys	Darrick J. Wong	1	-1/+4
	The xfs_rmap_has_other_keys helper aborts the iteration as soon as it has an answer. Don't let this abort leak out to callers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28	xfs: fix maxicount division by zero error	Darrick J. Wong	1	-2/+7
	In xfs_ialloc_setup_geometry, it's possible for a malicious/corrupt fs image to set an unreasonably large value for sb_inopblog which will cause ialloc_blks to be zero. If sb_imax_pct is also set, this results in a division by zero error in the second do_div call. Therefore, force maxicount to zero if ialloc_blks is zero. Note that the kernel metadata verifiers will catch the garbage inopblog value and abort the fs mount long before it tries to set up the inode geometry; this is needed to avoid a crash in xfs_db while setting up the xfs_mount structure. Found by fuzzing sb_inopblog to 122 in xfs/350. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2019-08-26	xfs: remove excess function parameter description in 'xfs_btree_sblock_v5hdr_verify'	zhengbin	1	-2/+0
	Fixes gcc warning: fs/xfs/libxfs/xfs_btree.c:4475: warning: Excess function parameter 'max_recs' description in 'xfs_btree_sblock_v5hdr_verify' fs/xfs/libxfs/xfs_btree.c:4475: warning: Excess function parameter 'pag_max_level' description in 'xfs_btree_sblock_v5hdr_verify' Fixes: c5ab131ba0df ("libxfs: refactor short btree block verification") Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26	xfs: add kmem allocation trace points	Dave Chinner	1	-0/+8
	When trying to correlate XFS kernel allocations to memory reclaim behaviour, it is useful to know what allocations XFS is actually attempting. This information is not directly available from tracepoints in the generic memory allocation and reclaim tracepoints, so these new trace points provide a high level indication of what the XFS memory demand actually is. There is no per-filesystem context in this code, so we just trace the type of allocation, the size and the allocation constraints. The kmem code also doesn't include much of the common XFS headers, so there are a few definitions that need to be added to the trace headers and a couple of types that need to be made common to avoid needing to include the whole world in the kmem code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26	fs: xfs: Remove KM_NOSLEEP and KM_SLEEP.	Tetsuo Handa	11	-35/+35
	Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP, we can remove KM_NOSLEEP and replace KM_SLEEP with 0. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-12	xfs: don't crash on null attr fork xfs_bmapi_read	Darrick J. Wong	1	-8/+21
	Zorro Lang reported a crash in generic/475 if we try to inactivate a corrupt inode with a NULL attr fork (stack trace shortened somewhat): RIP: 0010:xfs_bmapi_read+0x311/0xb00 [xfs] RSP: 0018:ffff888047f9ed68 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888047f9f038 RCX: 1ffffffff5f99f51 RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000012 RBP: ffff888002a41f00 R08: ffffed10005483f0 R09: ffffed10005483ef R10: ffffed10005483ef R11: ffff888002a41f7f R12: 0000000000000004 R13: ffffe8fff53b5768 R14: 0000000000000005 R15: 0000000000000001 FS: 00007f11d44b5b80(0000) GS:ffff888114200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000ef6000 CR3: 000000002e176003 CR4: 00000000001606e0 Call Trace: xfs_dabuf_map.constprop.18+0x696/0xe50 [xfs] xfs_da_read_buf+0xf5/0x2c0 [xfs] xfs_da3_node_read+0x1d/0x230 [xfs] xfs_attr_inactive+0x3cc/0x5e0 [xfs] xfs_inactive+0x4c8/0x5b0 [xfs] xfs_fs_destroy_inode+0x31b/0x8e0 [xfs] destroy_inode+0xbc/0x190 xfs_bulkstat_one_int+0xa8c/0x1200 [xfs] xfs_bulkstat_one+0x16/0x20 [xfs] xfs_bulkstat+0x6fa/0xf20 [xfs] xfs_ioc_bulkstat+0x182/0x2b0 [xfs] xfs_file_ioctl+0xee0/0x12a0 [xfs] do_vfs_ioctl+0x193/0x1000 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x6f/0xb0 do_syscall_64+0x9f/0x4d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f11d39a3e5b The "obvious" cause is that the attr ifork is null despite the inode claiming an attr fork having at least one extent, but it's not so obvious why we ended up with an inode in that state. Reported-by: Zorro Lang <zlang@redhat.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204031 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-08-12	xfs: remove more ondisk directory corruption asserts	Darrick J. Wong	2	-8/+14
	Continue our game of replacing ASSERTs for corrupt ondisk metadata with EFSCORRUPTED returns. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-07-15	xfs: sync up xfs_trans_inode with userspace	Eric Sandeen	1	-0/+4
	Add an XFS_ICHGTIME_CREATE case to xfs_trans_ichgtime() to keep in sync with userspace. (Currently no kernel caller sends this flag.) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-15	xfs: move xfs_trans_inode.c to libxfs/	Eric Sandeen	1	-0/+152
	Userspace now has an identical xfs_trans_inode.c which it has already moved to libxfs/ so do the same move for kernelspace. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-05	xfs: attribute scrub should use seen_enough to pass error values	Darrick J. Wong	1	-1/+7
	When we're iterating all the attributes using the built-in xattr iterator, we can use the seen_enough variable to pass error codes back to the main scrub function instead of flattening them into 0/1. This will be used in a more exciting fashion in upcoming patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-04	xfs: allow single bulkstat of special inodes	Darrick J. Wong	1	-1/+12
	Create a new bulk ireq flag that enables userspace to ask us for a special inode number instead of interpreting @ino as a literal inode number. This enables us to query the root inode easily. The reason for adding the ability to query specifically the root directory inode is that certain programs (xfsdump and xfsrestore) want to confirm when they've been pointed to the root directory. The userspace code assumes the root directory is always the first result from calling bulkstat with lastino == 0, but this isn't true if the (initial btree roots + initial AGFL + inode alignment padding) is itself long enough to be allocated to new inodes if all of those blocks should happen to be free at the same time. Rather than make userspace guess at internal filesystem state, we provide a direct query. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-04	xfs: specify AG in bulk req	Darrick J. Wong	1	-2/+8
	Add a new xfs_bulk_ireq flag to constrain the iteration to a single AG. If the passed-in startino value is zero then we start with the first inode in the AG that the user passes in; otherwise, we iterate only within the same AG as the passed-in inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03	xfs: wire up the v5 inumbers ioctl	Darrick J. Wong	1	-0/+8
	Wire up the v5 INUMBERS ioctl. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03	xfs: wire up new v5 bulkstat ioctls	Darrick J. Wong	1	-1/+23
	Wire up the new v5 BULKSTAT ioctl. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03	xfs: introduce v5 inode group structure	Darrick J. Wong	1	-0/+11
	Introduce a new "v5" inode group structure that fixes the alignment and padding problems of the existing structure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03	xfs: introduce new v5 bulkstat structure	Darrick J. Wong	2	-2/+48
	Introduce a new version of the in-core bulkstat structure that supports our new v5 format features. This structure also fills the gaps in the previous structure. We leave wiring up the ioctls for the next patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03	xfs: remove various bulk request typedef usage	Darrick J. Wong	1	-8/+8
	Remove xfs_bstat_t, xfs_fsop_bulkreq_t, xfs_inogrp_t, and similarly named compat typedefs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02	xfs: create simplified inode walk function	Darrick J. Wong	2	-4/+36
	Create a new iterator function to simplify walking inodes in an XFS filesystem. This new iterator will replace the existing open-coded walking that goes on in various places. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02	xfs: create iterator error codes	Darrick J. Wong	3	-3/+9
	Currently, xfs doesn't have generic error codes defined for "stop iterating"; we just reuse the XFS_BTREE_QUERY_* return values. This looks a little weird if we're not actually iterating a btree index. Before we start adding more iterators, we should create general XFS_ITER_{CONTINUE,ABORT} return values and define the XFS_BTREE_QUERY_* ones from that. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-06-30	xfs: remove XFS_TRANS_NOFS	Christoph Hellwig	1	-1/+0
	Instead of a magic flag for xfs_trans_alloc, just ensure all callers that can't relclaim through the file system use memalloc_nofs_save to set the per-task nofs flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28	xfs: remove unused header files	Eric Sandeen	35	-138/+0
	There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28	xfs: account for log space when formatting new AGs	Darrick J. Wong	1	-0/+67
	When we're writing out a fresh new AG, make sure that we don't list an internal log as free and that we create the rmap for the region. growfs never does this, but we will need it when we hook up mkfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28	xfs: refactor free space btree record initialization	Darrick J. Wong	1	-11/+16
	Refactor the code that populates the free space btrees of a new AG so that we can avoid code duplication once things start getting complicated. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28	xfs: always update params on small allocation	Brian Foster	1	-2/+2
	xfs_alloc_ag_vextent_small() doesn't update the output parameters in the event of an AGFL allocation. Instead, it updates the xfs_alloc_arg structure directly to complete the allocation. Update both args and the output params to provide consistent behavior for future callers. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28	xfs: skip small alloc cntbt logic on NULL cursor	Brian Foster	1	-2/+9
	The small allocation helper is implemented in a way that is fairly tightly integrated to the existing allocation algorithms. It expects a cntbt cursor beyond the end of the tree, attempts to locate the last record in the tree and only attempts an AGFL allocation if the cntbt is empty. The upcoming generic algorithm doesn't rely on the cntbt processing of this function. It will only call this function when the cntbt doesn't have a big enough extent or is empty and thus AGFL allocation is the only remaining option. Tweak xfs_alloc_ag_vextent_small() to handle a NULL cntbt cursor and skip the cntbt logic. This facilitates use by the existing allocation code and new code that only requires an AGFL allocation attempt. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28	xfs: move small allocation helper	Brian Foster	1	-96/+94
	Move the small allocation helper further up in the file to avoid the need for a function declaration. The remaining declarations will be removed by followup patches. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>