aboutsummaryrefslogtreecommitdiffstats
path: root/fs/xfs (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2020-02-07fs_parse: fold fs_parameter_desc/fs_parameter_specAl Viro1-7/+3
The former contains nothing but a pointer to an array of the latter... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07fs_parser: remove fs_parameter_description name fieldEric Sandeen1-1/+0
Unused now. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-26xfs: fix xfs_buf_ioerror_alert location reportingDarrick J. Wong5-13/+17
Instead of passing __func__ to the error reporting function, let's use the return address builtins so that the messages actually tell you which higher level function called the buffer functions. This was previously true for the xfs_buf_read callers, but not for the xfs_trans_read_buf callers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-26xfs: remove unnecessary null pointer checks from _read_agf callersDarrick J. Wong6-18/+1
Drop the null buffer pointer checks in all code that calls xfs_alloc_read_agf and doesn't pass XFS_ALLOC_FLAG_TRYLOCK because they're no longer necessary. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callersDarrick J. Wong3-33/+25
Refactor xfs_read_agf and xfs_alloc_read_agf to return EAGAIN if the caller passed TRYLOCK and we weren't able to get the lock; and change the callers to recognize this. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: remove the xfs_btree_get_buf[ls] functionsDarrick J. Wong4-79/+18
Remove the xfs_btree_get_bufs and xfs_btree_get_bufl functions, since they're pretty trivial oneliners. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_trans_get_buf return an error codeDarrick J. Wong10-62/+67
Convert xfs_trans_get_buf() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_trans_get_buf_map return an error codeDarrick J. Wong3-22/+23
Convert xfs_trans_get_buf_map() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_read return an error codeDarrick J. Wong4-24/+18
Convert xfs_buf_read() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_get_uncached return an error codeDarrick J. Wong3-20/+30
Convert xfs_buf_get_uncached() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_get return an error codeDarrick J. Wong3-15/+11
Convert xfs_buf_get() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_read_map return an error codeDarrick J. Wong7-83/+72
Convert xfs_buf_read_map() to return numeric error codes like most everywhere else in xfs. This involves moving the open-coded logic that reports metadata IO read / corruption errors and stales the buffer into xfs_buf_read_map so that the logic is all in one place. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_get_map return an error codeDarrick J. Wong3-37/+34
Convert xfs_buf_get_map() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_alloc return an error codeDarrick J. Wong1-9/+12
Convert _xfs_buf_alloc() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-23xfs: remove unused variable 'done'YueHaibing1-1/+0
fs/xfs/xfs_inode.c: In function 'xfs_itruncate_extents_flags': fs/xfs/xfs_inode.c:1523:8: warning: unused variable 'done' [-Wunused-variable] commit 4bbb04abb4ee ("xfs: truncate should remove all blocks, not just to the end of the page cache") left behind this, so remove it. Fixes: 4bbb04abb4ee ("xfs: truncate should remove all blocks, not just to the end of the page cache") Reported-by: Hulk Robot <hulkci@huawei.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-23xfs: fix uninitialized variable in xfs_attr3_leaf_inactiveDarrick J. Wong1-1/+1
Dan Carpenter pointed out that error is uninitialized. While there never should be an attr leaf block with zero entries, let's not leave that logic bomb there. Fixes: 0bb9d159bd01 ("xfs: streamline xfs_attr3_leaf_inactive") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2020-01-20xfs: change return value of xfs_inode_need_cow to intzhengbin3-5/+5
Fixes coccicheck warning: fs/xfs/xfs_reflink.c:236:9-10: WARNING: return of 0/1 in function 'xfs_inode_need_cow' with return type bool Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> [darrick: rename the function so it doesn't sound like a predicate] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-16xfs: check log iovec size to make sure it's plausibly a buffer log formatDarrick J. Wong3-0/+24
When log recovery is processing buffer log items, we should check that the incoming iovec actually describes a region of memory large enough to contain the log format and the dirty map. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: make struct xfs_buf_log_format have a consistent sizeDarrick J. Wong2-5/+15
Increase XFS_BLF_DATAMAP_SIZE by 1 to fill in the implied padding at the end of struct xfs_buf_log_format. This makes the size consistent so that we can check it in xfs_ondisk.h, and will be needed once we start logging attribute values. On amd64 we get the following pahole: struct xfs_buf_log_format { short unsigned int blf_type; /* 0 2 */ short unsigned int blf_size; /* 2 2 */ short unsigned int blf_flags; /* 4 2 */ short unsigned int blf_len; /* 6 2 */ long long int blf_blkno; /* 8 8 */ unsigned int blf_map_size; /* 16 4 */ unsigned int blf_data_map[16]; /* 20 64 */ /* --- cacheline 1 boundary (64 bytes) was 20 bytes ago --- */ /* size: 88, cachelines: 2, members: 7 */ /* padding: 4 */ /* last cacheline: 24 bytes */ }; But on i386 we get the following: struct xfs_buf_log_format { short unsigned int blf_type; /* 0 2 */ short unsigned int blf_size; /* 2 2 */ short unsigned int blf_flags; /* 4 2 */ short unsigned int blf_len; /* 6 2 */ long long int blf_blkno; /* 8 8 */ unsigned int blf_map_size; /* 16 4 */ unsigned int blf_data_map[16]; /* 20 64 */ /* --- cacheline 1 boundary (64 bytes) was 20 bytes ago --- */ /* size: 84, cachelines: 2, members: 7 */ /* last cacheline: 20 bytes */ }; Notice how the amd64 compiler inserts 4 bytes of padding to the end of the structure to ensure 8-byte alignment. Prior to "xfs: fix memory corruption during remote attr value buffer invalidation" we would try to write to blf_data_map[17], which is harmless on amd64 but really bad on i386. This shouldn't cause any changes in the ondisk logging formats because the log code writes out the log vectors with the appropriate size for the log item's map_size, and log recovery treats the data_map array as a VLA. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: complain if anyone tries to create a too-large buffer log itemDarrick J. Wong1-0/+12
Complain if someone calls xfs_buf_item_init on a buffer that is larger than the dirty bitmap can handle, or tries to log a region that's past the end of the dirty bitmap. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: clean up xfs_buf_item_get_format return valueDarrick J. Wong1-13/+3
The only thing that can cause a nonzero return from xfs_buf_item_get_format is if the kmem_alloc fails, which it can't. Get rid of all the unnecessary error handling. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: streamline xfs_attr3_leaf_inactiveDarrick J. Wong2-81/+29
Now that we know we don't have to take a transaction to stale the incore buffers for a remote value, get rid of the unnecessary memory allocation in the leaf walker and call the rmt_stale function directly. Flatten the loop while we're at it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: fix memory corruption during remote attr value buffer invalidationDarrick J. Wong2-41/+46
While running generic/103, I observed what looks like memory corruption and (with slub debugging turned on) a slub redzone warning on i386 when inactivating an inode with a 64k remote attr value. On a v5 filesystem, maximally sized remote attr values require one block more than 64k worth of space to hold both the remote attribute value header (64 bytes). On a 4k block filesystem this results in a 68k buffer; on a 64k block filesystem, this would be a 128k buffer. Note that even though we'll never use more than 65,600 bytes of this buffer, XFS_MAX_BLOCKSIZE is 64k. This is a problem because the definition of struct xfs_buf_log_format allows for XFS_MAX_BLOCKSIZE worth of dirty bitmap (64k). On i386 when we invalidate a remote attribute, xfs_trans_binval zeroes all 68k worth of the dirty map, writing right off the end of the log item and corrupting memory. We've gotten away with this on x86_64 for years because the compiler inserts a u32 padding on the end of struct xfs_buf_log_format. Fortunately for us, remote attribute values are written to disk with xfs_bwrite(), which is to say that they are not logged. Fix the problem by removing all places where we could end up creating a buffer log item for a remote attribute value and leave a note explaining why. Next, replace the open-coded buffer invalidation with a call to the helper we created in the previous patch that does better checking for bad metadata before marking the buffer stale. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: refactor remote attr value buffer invalidationDarrick J. Wong2-20/+34
Hoist the code that invalidates remote extended attribute value buffers into a separate helper function. This prepares us for a memory corruption fix in the next patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-15xfs: fix IOCB_NOWAIT handling in xfs_file_dio_aio_readChristoph Hellwig1-1/+6
Direct I/O reads can also be used with RWF_NOWAIT & co. Fix the inode locking in xfs_file_dio_aio_read to take IOCB_NOWAIT into account. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-15xfs: Add __packed to xfs_dir2_sf_entry_t definitionVincenzo Frascino1-1/+1
xfs_check_ondisk_structs() verifies that the sizes of the data types used by xfs are correct via the XFS_CHECK_STRUCT_SIZE() macro. Since the structures padding can vary depending on the ABI (e.g. on ARM OABI structures are padded to multiple of 32 bits), it may happen that xfs_dir2_sf_entry_t size check breaks the compilation with the assertion below: In file included from linux/include/linux/string.h:6, from linux/include/linux/uuid.h:12, from linux/fs/xfs/xfs_linux.h:10, from linux/fs/xfs/xfs.h:22, from linux/fs/xfs/xfs_super.c:7: In function ‘xfs_check_ondisk_structs’, inlined from ‘init_xfs_fs’ at linux/fs/xfs/xfs_super.c:2025:2: linux/include/linux/compiler.h:350:38: error: call to ‘__compiletime_assert_107’ declared with attribute error: XFS: sizeof(xfs_dir2_sf_entry_t) is wrong, expected 3 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) Restore the correct behavior adding __packed to the structure definition. Cc: Darrick J. Wong <darrick.wong@oracle.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-14xfs: fix s_maxbytes computation on 32-bit kernelsDarrick J. Wong1-27/+21
I observed a hang in generic/308 while running fstests on a i686 kernel. The hang occurred when trying to purge the pagecache on a large sparse file that had a page created past MAX_LFS_FILESIZE, which caused an integer overflow in the pagecache xarray and resulted in an infinite loop. I then noticed that Linus changed the definition of MAX_LFS_FILESIZE in commit 0cc3b0ec23ce ("Clarify (and fix) MAX_LFS_FILESIZE macros") so that it is now one page short of the maximum page index on 32-bit kernels. Because the XFS function to compute max offset open-codes the 2005-era MAX_LFS_FILESIZE computation and neither the vfs nor mm perform any sanity checking of s_maxbytes, the code in generic/308 can create a page above the pagecache's limit and kaboom. Fix all this by setting s_maxbytes to MAX_LFS_FILESIZE directly and aborting the mount with a warning if our assumptions ever break. I have no answer for why this seems to have been broken for years and nobody noticed. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-14xfs: truncate should remove all blocks, not just to the end of the page cacheDarrick J. Wong1-12/+12
xfs_itruncate_extents_flags() is supposed to unmap every block in a file from EOF onwards. Oddly, it uses s_maxbytes as the upper limit to the bunmapi range, even though s_maxbytes reflects the highest offset the pagecache can support, not the highest offset that XFS supports. The result of this confusion is that if you create a 20T file on a 64-bit machine, mount the filesystem on a 32-bit machine, and remove the file, we leak everything above 16T. Fix this by capping the bunmapi request at the maximum possible block offset, not s_maxbytes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-14xfs: introduce XFS_MAX_FILEOFFDarrick J. Wong2-1/+9
Introduce a new #define for the maximum supported file block offset. We'll use this in the next patch to make it more obvious that we're doing some operation for all possible inode fork mappings after a given offset. We can't use ULLONG_MAX here because bunmapi uses that to detect when it's done. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-09xfs: remove bogus assertion when online repair isn't enabledDarrick J. Wong1-1/+0
We don't need to assert on !REPAIR in the stub version of xrep_calc_ag_resblks that is called when online repair hasn't been compiled into the kernel because none of the repair code will ever run. Reported-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-09xfs: Remove all strlen in all xfs_attr_* functions for attr names.Allison Henderson6-20/+41
This helps to pre-simplify the extra handling of the null terminator in delayed operations which use memcpy rather than strlen. Later when we introduce parent pointers, attribute names will become binary, so strlen will not work at all. Removing uses of strlen now will help reduce complexities later Signed-off-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09xfs: fix misuse of the XFS_ATTR_INCOMPLETE flagChristoph Hellwig4-6/+6
XFS_ATTR_INCOMPLETE is a flag in the on-disk attribute format, and thus in a different namespace as the ATTR_* flags in xfs_da_args.flags. Switch to using a XFS_DA_OP_INCOMPLETE flag in op_flags instead. Without this users might be able to inject this flag into operations using the attr by handle ioctl. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09xfs: also remove cached ACLs when removing the underlying attrChristoph Hellwig1-3/+4
We should not just invalidate the ACL when setting the underlying attribute, but also when removing it. The ioctl interface gets that right, but the normal xattr inteface skipped the xfs_forget_acl due to an early return. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09xfs: reject invalid flags combinations in XFS_IOC_ATTRMULTI_BY_HANDLEChristoph Hellwig2-0/+10
While the flags field in the ABI and the on-disk format allows for multiple namespace flags, that is a logically invalid combination that scrub complains about. Reject it at the ioctl level, as all other interface already get this right at higher levels. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09xfs: clear kernel only flags in XFS_IOC_ATTRMULTI_BY_HANDLEChristoph Hellwig3-2/+9
Don't allow passing arbitrary flags as they change behavior including memory allocation that the call stack is not prepared for. Fixes: ddbca70cc45c ("xfs: allocate xattr buffer on demand") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-07xfs: remove shadow variable in xfs_btree_lshiftEric Sandeen1-2/+0
Sparse warns about a shadow variable in this function after the Fixed: commit added another int i; with larger scope. It's safe to remove the one with the smaller scope to fix this shadow, although the shadow itself is harmless. Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-06xfs: quota: move to time64_t interfacesArnd Bergmann4-12/+14
As a preparation for removing the 32-bit time_t type and all associated interfaces, change xfs to use time64_t and ktime_get_real_seconds() for the quota housekeeping. This avoids one difference between 32-bit and 64-bit kernels, raising the theoretical limit for the quota grace period to year 2106 on 32-bit instead of year 2038. Note that common user space tools using the XFS quotactl interface instead of the generic one still use the y2038 dates. To fix quotas properly, both the on-disk format and user space still need to be changed. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-06xfs: rename compat_time_t to old_time32_tArnd Bergmann2-2/+2
The compat_time_t type has been removed everywhere else, as most users rely on old_time32_t for both native and compat mode handling of 32-bit time_t. Remove the last one in xfs. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-03dax: Pass dax_dev instead of bdev to dax_writeback_mapping_range()Vivek Goyal1-1/+1
As of now dax_writeback_mapping_range() takes "struct block_device" as a parameter and dax_dev is searched from bdev name. This also involves taking a fresh reference on dax_dev and putting that reference at the end of function. We are developing a new filesystem virtio-fs and using dax to access host page cache directly. But there is no block device. IOW, we want to make use of dax but want to get rid of this assumption that there is always a block device associated with dax_dev. So pass in "struct dax_device" as parameter instead of bdev. ext2/ext4/xfs are current users and they already have a reference on dax_device. So there is no need to take reference and drop reference to dax_device on each call of this function. Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20200103183307.GB13350@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-12-20xfs: Make the symbol 'xfs_rtalloc_log_count' staticChen Wandun1-1/+1
Fix the following sparse warning: fs/xfs/libxfs/xfs_trans_resv.c:206:1: warning: symbol 'xfs_rtalloc_log_count' was not declared. Should it be static? Fixes: b1de6fc7520f ("xfs: fix log reservation overflows when allocating large rt extents") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-19xfs: don't commit sunit/swidth updates to disk if that would cause repair failuresDarrick J. Wong4-1/+130
Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit and swidth values could cause xfs_repair to fail loudly. The problem here is that repair calculates the where mkfs should have allocated the root inode, based on the superblock geometry. The allocation decisions depend on sunit, which means that we really can't go updating sunit if it would lead to a subsequent repair failure on an otherwise correct filesystem. Port from xfs_repair some code that computes the location of the root inode and teach mount to skip the ondisk update if it would cause problems for repair. Along the way we'll update the documentation, provide a function for computing the minimum AGFL size instead of open-coding it, and cut down some indenting in the mount code. Note that we allow the mount to proceed (and new allocations will reflect this new geometry) because we've never screened this kind of thing before. We'll have to wait for a new future incompat feature to enforce correct behavior, alas. Note that the geometry reporting always uses the superblock values, not the incore ones, so that is what xfs_info and xfs_growfs will report. [1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19xfs: split the sunit parameter update into two partsDarrick J. Wong1-51/+72
If the administrator provided a sunit= mount option, we need to validate the raw parameter, convert the mount option units (512b blocks) into the internal unit (fs blocks), and then validate that the (now cooked) parameter doesn't screw anything up on disk. The incore inode geometry computation can depend on the new sunit option, but a subsequent patch will make validating the cooked value depends on the computed inode geometry, so break the sunit update into two steps. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19xfs: refactor agfl length computation functionDarrick J. Wong1-5/+13
Refactor xfs_alloc_min_freelist to accept a NULL @pag argument, in which case it returns the largest possible minimum length. This will be used in an upcoming patch to compute the length of the AGFL at mkfs time. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19libxfs: resync with the userspace libxfsDarrick J. Wong4-24/+34
Prepare to resync the userspace libxfs with the kernel libxfs. There were a few things I missed -- a couple of static inline directory functions that have to be exported for xfs_repair; a couple of directory naming functions that make porting much easier if they're /not/ static inline; and a u16 usage that should have been uint16_t. None of these things are bugs in their own right; this just makes porting xfsprogs easier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-12-19xfs: use bitops interface for buf log item AIL flag checkBrian Foster1-1/+1
The xfs_log_item flags were converted to atomic bitops as of commit 22525c17ed ("xfs: log item flags are racy"). The assert check for AIL presence in xfs_buf_item_relse() still uses the old value based check. This likely went unnoticed as XFS_LI_IN_AIL evaluates to 0 and causes the assert to unconditionally pass. Fix up the check. Signed-off-by: Brian Foster <bfoster@redhat.com> Fixes: 22525c17ed ("xfs: log item flags are racy") Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-17xfs: fix log reservation overflows when allocating large rt extentsDarrick J. Wong1-19/+77
Omar Sandoval reported that a 4G fallocate on the realtime device causes filesystem shutdowns due to a log reservation overflow that happens when we log the rtbitmap updates. Factor rtbitmap/rtsummary updates into the the tr_write and tr_itruncate log reservation calculation. "The following reproducer results in a transaction log overrun warning for me: mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb mount -o rtdev=/dev/vdc /dev/vdb /mnt fallocate -l 4G /mnt/foo Reported-by: Omar Sandoval <osandov@osandov.com> Tested-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-11xfs: stabilize insert range start boundary to avoid COW writeback raceBrian Foster2-2/+13
generic/522 (fsx) occasionally fails with a file corruption due to an insert range operation. The primary characteristic of the corruption is a misplaced insert range operation that differs from the requested target offset. The reason for this behavior is a race between the extent shift sequence of an insert range and a COW writeback completion that causes a front merge with the first extent in the shift. The shift preparation function flushes and unmaps from the target offset of the operation to the end of the file to ensure no modifications can be made and page cache is invalidated before file data is shifted. An insert range operation then splits the extent at the target offset, if necessary, and begins to shift the start offset of each extent starting from the end of the file to the start offset. The shift sequence operates at extent level and so depends on the preparation sequence to guarantee no changes can be made to the target range during the shift. If the block immediately prior to the target offset was dirty and shared, however, it can undergo writeback and move from the COW fork to the data fork at any point during the shift. If the block is contiguous with the block at the start offset of the insert range, it can front merge and alter the start offset of the extent. Once the shift sequence reaches the target offset, it shifts based on the latest start offset and silently changes the target offset of the operation and corrupts the file. To address this problem, update the shift preparation code to stabilize the start boundary along with the full range of the insert. Also update the existing corruption check to fail if any extent is shifted with a start offset behind the target offset of the insert range. This prevents insert from racing with COW writeback completion and fails loudly in the event of an unexpected extent shift. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-03xfs: fix mount failure crash on invalid iclog memory accessBrian Foster1-0/+2
syzbot (via KASAN) reports a use-after-free in the error path of xlog_alloc_log(). Specifically, the iclog freeing loop doesn't handle the case of a fully initialized ->l_iclog linked list. Instead, it assumes that the list is partially constructed and NULL terminated. This bug manifested because there was no possible error scenario after iclog list setup when the original code was added. Subsequent code and associated error conditions were added some time later, while the original error handling code was never updated. Fix up the error loop to terminate either on a NULL iclog or reaching the end of the list. Reported-by: syzbot+c732f8644185de340492@syzkaller.appspotmail.com Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02xfs: don't check for AG deadlock for realtime files in bunmapiOmar Sandoval1-1/+1
Commit 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") added a check in __xfs_bunmapi() to stop early if we would touch multiple AGs in the wrong order. However, this check isn't applicable for realtime files. In most cases, it just makes us do unnecessary commits. However, without the fix from the previous commit ("xfs: fix realtime file data space leak"), if the last and second-to-last extents also happen to have different "AG numbers", then the break actually causes __xfs_bunmapi() to return without making any progress, which sends xfs_itruncate_extents_flags() into an infinite loop. Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02xfs: fix realtime file data space leakOmar Sandoval1-11/+14
Realtime files in XFS allocate extents in rextsize units. However, the written/unwritten state of those extents is still tracked in blocksize units. Therefore, a realtime file can be split up into written and unwritten extents that are not necessarily aligned to the realtime extent size. __xfs_bunmapi() has some logic to handle these various corner cases. Consider how it handles the following case: 1. The last extent is unwritten. 2. The last extent is smaller than the realtime extent size. 3. startblock of the last extent is not aligned to the realtime extent size, but startblock + blockcount is. In this case, __xfs_bunmapi() calls xfs_bmap_add_extent_unwritten_real() to set the second-to-last extent to unwritten. This should merge the last and second-to-last extents, so __xfs_bunmapi() moves on to the second-to-last extent. However, if the size of the last and second-to-last extents combined is greater than MAXEXTLEN, xfs_bmap_add_extent_unwritten_real() does not merge the two extents. When that happens, __xfs_bunmapi() skips past the last extent without unmapping it, thus leaking the space. Fix it by only unwriting the minimum amount needed to align the last extent to the realtime extent size, which is guaranteed to merge with the last extent. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>