aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/fs/xfs/libxfs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-12-20xfs: Make the symbol 'xfs_rtalloc_log_count' staticChen Wandun1-1/+1
Fix the following sparse warning: fs/xfs/libxfs/xfs_trans_resv.c:206:1: warning: symbol 'xfs_rtalloc_log_count' was not declared. Should it be static? Fixes: b1de6fc7520f ("xfs: fix log reservation overflows when allocating large rt extents") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-19xfs: don't commit sunit/swidth updates to disk if that would cause repair failuresDarrick J. Wong2-0/+65
Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit and swidth values could cause xfs_repair to fail loudly. The problem here is that repair calculates the where mkfs should have allocated the root inode, based on the superblock geometry. The allocation decisions depend on sunit, which means that we really can't go updating sunit if it would lead to a subsequent repair failure on an otherwise correct filesystem. Port from xfs_repair some code that computes the location of the root inode and teach mount to skip the ondisk update if it would cause problems for repair. Along the way we'll update the documentation, provide a function for computing the minimum AGFL size instead of open-coding it, and cut down some indenting in the mount code. Note that we allow the mount to proceed (and new allocations will reflect this new geometry) because we've never screened this kind of thing before. We'll have to wait for a new future incompat feature to enforce correct behavior, alas. Note that the geometry reporting always uses the superblock values, not the incore ones, so that is what xfs_info and xfs_growfs will report. [1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19xfs: refactor agfl length computation functionDarrick J. Wong1-5/+13
Refactor xfs_alloc_min_freelist to accept a NULL @pag argument, in which case it returns the largest possible minimum length. This will be used in an upcoming patch to compute the length of the AGFL at mkfs time. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19libxfs: resync with the userspace libxfsDarrick J. Wong4-24/+34
Prepare to resync the userspace libxfs with the kernel libxfs. There were a few things I missed -- a couple of static inline directory functions that have to be exported for xfs_repair; a couple of directory naming functions that make porting much easier if they're /not/ static inline; and a u16 usage that should have been uint16_t. None of these things are bugs in their own right; this just makes porting xfsprogs easier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-12-17xfs: fix log reservation overflows when allocating large rt extentsDarrick J. Wong1-19/+77
Omar Sandoval reported that a 4G fallocate on the realtime device causes filesystem shutdowns due to a log reservation overflow that happens when we log the rtbitmap updates. Factor rtbitmap/rtsummary updates into the the tr_write and tr_itruncate log reservation calculation. "The following reproducer results in a transaction log overrun warning for me: mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb mount -o rtdev=/dev/vdc /dev/vdb /mnt fallocate -l 4G /mnt/foo Reported-by: Omar Sandoval <osandov@osandov.com> Tested-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-11xfs: stabilize insert range start boundary to avoid COW writeback raceBrian Foster1-2/+1
generic/522 (fsx) occasionally fails with a file corruption due to an insert range operation. The primary characteristic of the corruption is a misplaced insert range operation that differs from the requested target offset. The reason for this behavior is a race between the extent shift sequence of an insert range and a COW writeback completion that causes a front merge with the first extent in the shift. The shift preparation function flushes and unmaps from the target offset of the operation to the end of the file to ensure no modifications can be made and page cache is invalidated before file data is shifted. An insert range operation then splits the extent at the target offset, if necessary, and begins to shift the start offset of each extent starting from the end of the file to the start offset. The shift sequence operates at extent level and so depends on the preparation sequence to guarantee no changes can be made to the target range during the shift. If the block immediately prior to the target offset was dirty and shared, however, it can undergo writeback and move from the COW fork to the data fork at any point during the shift. If the block is contiguous with the block at the start offset of the insert range, it can front merge and alter the start offset of the extent. Once the shift sequence reaches the target offset, it shifts based on the latest start offset and silently changes the target offset of the operation and corrupts the file. To address this problem, update the shift preparation code to stabilize the start boundary along with the full range of the insert. Also update the existing corruption check to fail if any extent is shifted with a start offset behind the target offset of the insert range. This prevents insert from racing with COW writeback completion and fails loudly in the event of an unexpected extent shift. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-07Merge tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-12/+15
Pull xfs fixes from Darrick Wong: "Fix a couple of resource management errors and a hang: - fix a crash in the log setup code when log mounting fails - fix a hang when allocating space on the realtime device - fix a block leak when freeing space on the realtime device" * tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix mount failure crash on invalid iclog memory access xfs: don't check for AG deadlock for realtime files in bunmapi xfs: fix realtime file data space leak
2019-12-02xfs: don't check for AG deadlock for realtime files in bunmapiOmar Sandoval1-1/+1
Commit 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") added a check in __xfs_bunmapi() to stop early if we would touch multiple AGs in the wrong order. However, this check isn't applicable for realtime files. In most cases, it just makes us do unnecessary commits. However, without the fix from the previous commit ("xfs: fix realtime file data space leak"), if the last and second-to-last extents also happen to have different "AG numbers", then the break actually causes __xfs_bunmapi() to return without making any progress, which sends xfs_itruncate_extents_flags() into an infinite loop. Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02xfs: fix realtime file data space leakOmar Sandoval1-11/+14
Realtime files in XFS allocate extents in rextsize units. However, the written/unwritten state of those extents is still tracked in blocksize units. Therefore, a realtime file can be split up into written and unwritten extents that are not necessarily aligned to the realtime extent size. __xfs_bunmapi() has some logic to handle these various corner cases. Consider how it handles the following case: 1. The last extent is unwritten. 2. The last extent is smaller than the realtime extent size. 3. startblock of the last extent is not aligned to the realtime extent size, but startblock + blockcount is. In this case, __xfs_bunmapi() calls xfs_bmap_add_extent_unwritten_real() to set the second-to-last extent to unwritten. This should merge the last and second-to-last extents, so __xfs_bunmapi() moves on to the second-to-last extent. However, if the size of the last and second-to-last extents combined is greater than MAXEXTLEN, xfs_bmap_add_extent_unwritten_real() does not merge the two extents. When that happens, __xfs_bunmapi() skips past the last extent without unmapping it, thus leaking the space. Fix it by only unwriting the minimum amount needed to align the last extent to the realtime extent size, which is guaranteed to merge with the last extent. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02Merge tag 'xfs-5.5-merge-16' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds42-3271/+3324
Pull XFS updates from Darrick Wong: "For this release, we changed quite a few things. Highlights: - Fixed some long tail latency problems in the block allocator - Removed some long deprecated (and for the past several years no-op) mount options and ioctls - Strengthened the extended attribute and directory verifiers - Audited and fixed all the places where we could return EFSCORRUPTED without logging anything - Refactored the old SGI space allocation ioctls to make the equivalent fallocate calls - Fixed a race between fallocate and directio - Fixed an integer overflow when files have more than a few billion(!) extents - Fixed a longstanding bug where quota accounting could be incorrect when performing unwritten extent conversion on a freshly mounted fs - Fixed various complaints in scrub about soft lockups and unresponsiveness to signals - De-vtable'd the directory handling code, which should make it faster - Converted to the new mount api, for better or for worse - Cleaned up some memory leaks and quite a lot of other smaller fixes and cleanups. A more detailed summary: - Fill out the build string - Prevent inode fork extent count overflows - Refactor the allocator to reduce long tail latency - Rework incore log locking a little to reduce spinning - Break up the xfs_iomap_begin functions into smaller more cohesive parts - Fix allocation alignment being dropped too early when the allocation request is for more blocks than an AG is large - Other small cleanups - Clean up file buftarg retrieval helpers - Hoist the resvsp and unresvsp ioctls to the vfs - Remove the undocumented biosize mount option, since it has never been mentioned as existing or supported on linux - Clean up some of the mount option printing and parsing - Enhance attr leaf verifier to check block structure - Check dirent and attr names for invalid characters before passing them to the vfs - Refactor open-coded bmbt walking - Fix a few places where we return EIO instead of EFSCORRUPTED after failing metadata sanity checks - Fix a synchronization problem between fallocate and aio dio corrupting the file length - Clean up various loose ends in the iomap and bmap code - Convert to the new mount api - Make sure we always log something when returning EFSCORRUPTED - Fix some problems where long running scrub loops could trigger soft lockup warnings and/or fail to exit due to fatal signals pending - Fix various Coverity complaints - Remove most of the function pointers from the directory code to reduce indirection penalties - Ensure that dquots are attached to the inode when performing unwritten extent conversion after io - Deuglify incore projid and crtime types - Fix another AGI/AGF locking order deadlock when renaming - Clean up some quota typedefs - Remove the FSSETDM ioctls which haven't done anything in 20 years - Fix some memory leaks when mounting the log fails - Fix an underflow when updating an xattr leaf freemap - Remove some trivial wrappers - Report metadata corruption as an error, not a (potentially) fatal assertion - Clean up the dir/attr buffer mapping code - Allow fatal signals to kill scrub during parent pointer checks" * tag 'xfs-5.5-merge-16' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (198 commits) xfs: allow parent directory scans to be interrupted with fatal signals xfs: remove the mappedbno argument to xfs_da_get_buf xfs: remove the mappedbno argument to xfs_da_read_buf xfs: split xfs_da3_node_read xfs: remove the mappedbno argument to xfs_dir3_leafn_read xfs: remove the mappedbno argument to xfs_dir3_leaf_read xfs: remove the mappedbno argument to xfs_attr3_leaf_read xfs: remove the mappedbno argument to xfs_da_reada_buf xfs: improve the xfs_dabuf_map calling conventions xfs: refactor xfs_dabuf_map xfs: simplify mappedbno handling in xfs_da_{get,read}_buf xfs: report corruption only as a regular error xfs: Remove kmem_zone_free() wrapper xfs: Remove kmem_zone_destroy() wrapper xfs: Remove slab init wrappers xfs: fix attr leaf header freemap.size underflow xfs: fix some memory leaks in log recovery xfs: fix another missing include xfs: remove XFS_IOC_FSSETDM and XFS_IOC_FSSETDM_BY_HANDLE xfs: remove duplicated include from xfs_dir2_data.c ...
2019-11-30Merge tag 'iomap-5.5-merge-11' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-6/+11
Pull iomap updates from Darrick Wong: "In this release, we hoisted as much of XFS' writeback code into iomap as was practicable, refactored the unshare file data function, added the ability to perform buffered io copy on write, and tweaked various parts of the directio implementation as needed to port ext4's directio code (that will be a separate pull). Summary: - Make iomap_dio_rw callers explicitly tell us if they want us to wait - Port the xfs writeback code to iomap to complete the buffered io library functions - Refactor the unshare code to share common pieces - Add support for performing copy on write with buffered writes - Other minor fixes - Fix unchecked return in iomap_bmap - Fix a type casting bug in a ternary statement in iomap_dio_bio_actor - Improve tracepoints for easier diagnostic ability - Fix pipe page leakage in directio reads" * tag 'iomap-5.5-merge-11' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (31 commits) iomap: Fix pipe page leakage during splicing iomap: trace iomap_appply results iomap: fix return value of iomap_dio_bio_actor on 32bit systems iomap: iomap_bmap should check iomap_apply return value iomap: Fix overflow in iomap_page_mkwrite fs/iomap: remove redundant check in iomap_dio_rw() iomap: use a srcmap for a read-modify-write I/O iomap: renumber IOMAP_HOLE to 0 iomap: use write_begin to read pages to unshare iomap: move the zeroing case out of iomap_read_page_sync iomap: ignore non-shared or non-data blocks in xfs_file_dirty iomap: always use AOP_FLAG_NOFS in iomap_write_begin iomap: remove the unused iomap argument to __iomap_write_end iomap: better document the IOMAP_F_* flags iomap: enhance writeback error message iomap: pass a struct page to iomap_finish_page_writeback iomap: cleanup iomap_ioend_compare iomap: move struct iomap_page out of iomap.h iomap: warn on inline maps in iomap_writepage_map iomap: lift the xfs writeback code to iomap ...
2019-11-22xfs: remove the mappedbno argument to xfs_da_get_bufChristoph Hellwig6-22/+9
Use the xfs_da_get_buf_daddr function directly for the two callers that pass a mapped disk address, and then remove the mappedbno argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_da_read_bufChristoph Hellwig8-43/+39
Move the code for reading an already mapped block into xfs_da3_node_read_mapped, which is the only caller ever passing a block number in the mappedbno argument and replace the mappedbno argument with the simple xfs_dabuf_get flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: split xfs_da3_node_readChristoph Hellwig3-56/+75
Split xfs_da3_node_read into one variant that always looks up the daddr and doesn't accept holes, and one that already has a daddr at hand. This is in preparation of splitting up xfs_da_read_buf in a similar way. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_dir3_leafn_readChristoph Hellwig3-7/+5
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_dir3_leaf_readChristoph Hellwig2-7/+6
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_attr3_leaf_readChristoph Hellwig3-16/+14
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_da_reada_bufChristoph Hellwig4-15/+9
Replace the mappedbno argument with the simple flags for xfs_da_reada_buf and xfs_dir3_data_readahead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: improve the xfs_dabuf_map calling conventionsChristoph Hellwig2-29/+17
Use a flags argument with the XFS_DABUF_MAP_HOLE_OK flag to signal that a hole is okay and not corruption, and return 0 with *nmap set to 0 to signal that case in the return value instead of a nameless -1 return code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: refactor xfs_dabuf_mapChristoph Hellwig1-102/+54
Merge xfs_buf_map_from_irec and xfs_da_map_covers_blocks into a single loop in the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: simplify mappedbno handling in xfs_da_{get,read}_bufChristoph Hellwig1-52/+51
Shortcut the creation of xfs_bmbt_irec and xfs_buf_map for the case where the callers passed an already mapped xfs_daddr_t. This is in preparation for splitting these cases out entirely later. Also reject the mappedbno case for xfs_da_reada_buf as no callers currently uses it and it will be removed soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-18xfs: Remove kmem_zone_free() wrapperCarlos Maiolino3-6/+6
We can remove it now, without needing to rework the KM_ flags. Use kmem_cache_free() directly. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-15xfs: fix attr leaf header freemap.size underflowBrian Foster1-1/+3
The leaf format xattr addition helper xfs_attr3_leaf_add_work() adjusts the block freemap in a couple places. The first update drops the size of the freemap that the caller had already selected to place the xattr name/value data. Before the function returns, it also checks whether the entries array has encroached on a freemap range by virtue of the new entry addition. This is necessary because the entries array grows from the start of the block (but end of the block header) towards the end of the block while the name/value data grows from the end of the block in the opposite direction. If the associated freemap is already empty, however, size is zero and the subtraction underflows the field and causes corruption. This is reproduced rarely by generic/070. The observed behavior is that a smaller sized freemap is aligned to the end of the entries list, several subsequent xattr additions land in larger freemaps and the entries list expands into the smaller freemap until it is fully consumed and then underflows. Note that it is not otherwise a corruption for the entries array to consume an empty freemap because the nameval list (i.e. the firstused pointer in the xattr header) starts beyond the end of the corrupted freemap. Update the freemap size modification to account for the fact that the freemap entry can be empty and thus stale. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove duplicated include from xfs_dir2_data.cYueHaibing1-1/+0
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove unused structure members & simple typedefsEric Sandeen1-2/+0
Remove some unused typedef'd simple types, and some unused structure members. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove unused typedef definitionsEric Sandeen2-4/+4
Remove some typdefs for type_t's that are no longer referred to by their typedef'd types. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove the xfs_qoff_logitem_t typedefPavel Reichl1-2/+2
Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix a comment] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove the xfs_disk_dquot_t and xfs_dquot_tPavel Reichl3-10/+10
Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix some of the comments] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: avoid time_t in user apiArnd Bergmann1-1/+1
The ioctl definitions for XFS_IOC_SWAPEXT, XFS_IOC_FSBULKSTAT and XFS_IOC_FSBULKSTAT_SINGLE are part of libxfs and based on time_t. The definition for time_t differs between current kernels and coming 32-bit libc variants that define it as 64-bit. For most ioctls, that means the kernel has to be able to handle two different command codes based on the different structure sizes. The same solution could be applied for XFS_IOC_SWAPEXT, but it would not work for XFS_IOC_FSBULKSTAT and XFS_IOC_FSBULKSTAT_SINGLE because the structure with the time_t is passed through an indirect pointer, and the command number itself is based on struct xfs_fsop_bulkreq, which does not differ based on time_t. This means any solution that can be applied requires a change of the ABI definition in the xfs_fs.h header file, as well as doing the same change in any user application that contains a copy of this header. The usual solution would be to define a replacement structure and use conditional compilation for the ioctl command codes to use one or the other, such as #define XFS_IOC_FSBULKSTAT_OLD _IOWR('X', 101, struct xfs_fsop_bulkreq) #define XFS_IOC_FSBULKSTAT_NEW _IOWR('X', 129, struct xfs_fsop_bulkreq) #define XFS_IOC_FSBULKSTAT ((sizeof(time_t) == sizeof(__kernel_long_t)) ? \ XFS_IOC_FSBULKSTAT_OLD : XFS_IOC_FSBULKSTAT_NEW) After this, the kernel would be able to implement both XFS_IOC_FSBULKSTAT_OLD and XFS_IOC_FSBULKSTAT_NEW handlers on 32-bit architectures with the correct ABI for either definition of time_t. However, as long as two observations are true, a much simpler solution can be used: 1. xfsprogs is the only user space project that has a copy of this header 2. xfsprogs already has a replacement for all three affected ioctl commands, based on the xfs_bulkstat structure to pass 64-bit timestamps regardless of the architecture Based on those assumptions, changing xfs_bstime to use __kernel_long_t instead of time_t in both the kernel and in xfsprogs preserves the current ABI for any libc definition of time_t and solves the problem of passing 64-bit timestamps to 32-bit user space. If either of the two assumptions is invalid, more discussion is needed for coming up with a way to fix as much of the affected user space code as possible. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()kaixuxia2-5/+25
When target_ip exists in xfs_rename(), the xfs_dir_replace() call may need to hold the AGF lock to allocate more blocks, and then invoking the xfs_droplink() call to hold AGI lock to drop target_ip onto the unlinked list, so we get the lock order AGF->AGI. This would break the ordering constraint on AGI and AGF locking - inode allocation locks the AGI, then can allocate a new extent for new inodes, locking the AGF after the AGI. In this patch we check whether the replace operation need more blocks firstly. If so, acquire the agi lock firstly to preserve locking order(AGI/AGF). Actually, the locking order problem only occurs when we are locking the AGI/AGF of the same AG. For multiple AGs the AGI lock will be released after the transaction committed. Signed-off-by: kaixuxia <kaixuxia@tencent.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: reword the comment] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: don't reset the "inode core" in xfs_ireadChristoph Hellwig1-2/+0
We have the exact same memset in xfs_inode_alloc, which is always called just before xfs_iread. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: merge the projid fields in struct xfs_icdinodeChristoph Hellwig2-8/+6
There is no point in splitting the fields like this in an purely in-memory structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: use a struct timespec64 for the in-core crtimeChristoph Hellwig3-10/+8
struct xfs_icdinode is purely an in-memory data structure, so don't use a log on-disk structure for it. This simplifies the code a bit, and also reduces our include hell slightly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix a minor indenting problem in xfs_trans_ichgtime] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: devirtualize ->m_dirnameopsChristoph Hellwig9-56/+42
Instead of causing a relatively expensive indirect call for each hashing and comparism of a file name in a directory just use an inline function and a simple branch on the ASCII CI bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix unused variable warning] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: convert open coded corruption check to use XFS_IS_CORRUPTDarrick J. Wong9-113/+63
Convert the last of the open coded corruption check and report idioms to use the XFS_IS_CORRUPT macro. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-12xfs: kill the XFS_WANT_CORRUPT_* macrosDarrick J. Wong6-301/+950
The XFS_WANT_CORRUPT_* macros conceal subtle side effects such as the creation of local variables and redirections of the code flow. This is pretty ugly, so replace them with explicit XFS_IS_CORRUPT tests that remove both of those ugly points. The change was performed with the following coccinelle script: @@ expression mp, test; identifier label; @@ - XFS_WANT_CORRUPTED_GOTO(mp, test, label); + if (XFS_IS_CORRUPT(mp, !test)) { error = -EFSCORRUPTED; goto label; } @@ expression mp, test; @@ - XFS_WANT_CORRUPTED_RETURN(mp, test); + if (XFS_IS_CORRUPT(mp, !test)) return -EFSCORRUPTED; @@ expression mp, lval, rval; @@ - XFS_IS_CORRUPT(mp, !(lval == rval)) + XFS_IS_CORRUPT(mp, lval != rval) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 && e2)) + XFS_IS_CORRUPT(mp, !e1 || !e2) @@ expression e1, e2; @@ - !(e1 == e2) + e1 != e2 @@ expression e1, e2, e3, e4, e5, e6; @@ - !(e1 == e2 && e3 == e4) || e5 != e6 + e1 != e2 || e3 != e4 || e5 != e6 @@ expression e1, e2, e3, e4, e5, e6; @@ - !(e1 == e2 || (e3 <= e4 && e5 <= e6)) + e1 != e2 && (e3 > e4 || e5 > e6) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 <= e2)) + XFS_IS_CORRUPT(mp, e1 > e2) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 < e2)) + XFS_IS_CORRUPT(mp, e1 >= e2) @@ expression mp, e1; @@ - XFS_IS_CORRUPT(mp, !!e1) + XFS_IS_CORRUPT(mp, e1) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 || e2)) + XFS_IS_CORRUPT(mp, !e1 && !e2) @@ expression mp, e1, e2, e3, e4; @@ - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 == e4)) + XFS_IS_CORRUPT(mp, e1 != e2 && e3 != e4) @@ expression mp, e1, e2, e3, e4; @@ - XFS_IS_CORRUPT(mp, !(e1 <= e2) || !(e3 >= e4)) + XFS_IS_CORRUPT(mp, e1 > e2 || e3 < e4) @@ expression mp, e1, e2, e3, e4; @@ - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 <= e4)) + XFS_IS_CORRUPT(mp, e1 != e2 && e3 > e4) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-11xfs: actually check xfs_btree_check_block return in xfs_btree_islastblockDarrick J. Wong2-27/+17
Coverity points out that xfs_btree_islastblock doesn't check the return value of xfs_btree_check_block. Since the question "Does the cursor point to the last block in this level?" only makes sense if the caller previously performed a lookup or seek operation, the block should already have been checked. Therefore, check the return value in an ASSERT and turn the whole thing into a static inline predicate. Coverity-id: 114069 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-10xfs: always pass a valid hdr to xfs_dir3_leaf_check_intChristoph Hellwig1-18/+13
Move the code for extracting the incore header to the only caller that didn't already do that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: merge xfs_dir2_data_freescan and xfs_dir2_data_freescan_intChristoph Hellwig5-23/+12
There is no real need for xfs_dir2_data_freescan wrapper, so rename xfs_dir2_data_freescan_int to xfs_dir2_data_freescan and let the callers dereference the mount pointer from the inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: remove the now unused dir ops infrastructureChristoph Hellwig4-58/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: devirtualize ->data_get_ftype and ->data_put_ftypeChristoph Hellwig8-75/+49
Replace the ->data_get_ftype and ->data_put_ftype dir ops methods with directly called xfs_dir2_data_get_ftype and xfs_dir2_data_put_ftype helpers that takes care of the differences between the directory format with and without the file type field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: devirtualize ->data_bestfree_pChristoph Hellwig7-36/+30
Replace the ->data_bestfree_p dir ops method with a directly called xfs_dir2_data_bestfree_p helper that takes care of the differences between the v4 and v5 on-disk format. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: cleanup xfs_dir2_data_entsizeChristoph Hellwig2-34/+14
Remove the XFS_DIR2_DATA_ENTSIZE and XFS_DIR3_DATA_ENTSIZE and open code them in their only caller, which now becomes so simple that we can turn it into an inline function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: move the dir2 data block fixed offsets to struct xfs_da_geometryChristoph Hellwig9-67/+54
Move the data block fixed offsets towards our structure for dir/attr geometry parameters. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: devirtualize ->data_entry_tag_pChristoph Hellwig7-32/+26
Replace the ->data_entry_tag_p dir ops method with a directly called xfs_dir2_data_entry_tag_p helper that takes care of the differences between the directory format with and without the file type field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: devirtualize ->data_entsizeChristoph Hellwig8-39/+37
Replace the ->data_entsize dir ops method with a directly called xfs_dir2_data_entsize helper that takes care of the differences between the directory format with and without the file type field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: replace xfs_dir3_data_endp with xfs_dir3_data_end_offsetChristoph Hellwig3-16/+17
All the callers really want an offset into the buffer, so adopt the helper to return that instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: remove the now unused ->data_entry_p methodChristoph Hellwig2-23/+0
Now that all users use the data_entry_offset field this method is unused and can be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: cleanup __xfs_dir3_data_checkChristoph Hellwig1-26/+33
Use an offset as the main means for iteration, and only do pointer arithmetics to find the data/unused entries. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-10xfs: cleanup xfs_dir2_data_freescan_intChristoph Hellwig1-28/+20
Use an offset as the main means for iteration, and only do pointer arithmetics to find the data/unused entries. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>