From 4547f4d8ffd63ba4ac129f9136027bd14b729101 Mon Sep 17 00:00:00 2001 From: Liu Bo Date: Fri, 23 Sep 2016 14:05:04 -0700 Subject: Btrfs: kill BUG_ON in do_relocation While updating btree, we try to push items between sibling nodes/leaves in order to keep height as low as possible. But we don't memset the original places with zero when pushing items so that we could end up leaving stale content in nodes/leaves. One may read the above stale content by increasing btree blocks' @nritems. One case I've come across is that in fs tree, a leaf has two parent nodes, hence running balance ends up with processing this leaf with two parent nodes, but it can only reach the valid parent node through btrfs_search_slot, so it'd be like, do_relocation for P in all parent nodes of block A: if !P->eb: btrfs_search_slot(key); --> get path from P to A. if lowest: BUG_ON(A->bytenr != bytenr of A recorded in P); btrfs_cow_block(P, A); --> change A's bytenr in P. After btrfs_cow_block, P has the new bytenr of A, but with the same @key, we get the same path again, and get panic by BUG_ON. Note that this is only happening in a corrupted fs, for a regular fs in which we have correct @nritems so that we won't read stale content in any case. Reviewed-by: Josef Bacik Signed-off-by: Liu Bo Signed-off-by: David Sterba --- fs/btrfs/relocation.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) (limited to 'fs') diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 0ec8ffa37ab0..c4af0cdb783d 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2728,7 +2728,14 @@ static int do_relocation(struct btrfs_trans_handle *trans, bytenr = btrfs_node_blockptr(upper->eb, slot); if (lowest) { - BUG_ON(bytenr != node->bytenr); + if (bytenr != node->bytenr) { + btrfs_err(root->fs_info, + "lowest leaf/node mismatch: bytenr %llu node->bytenr %llu slot %d upper %llu", + bytenr, node->bytenr, slot, + upper->eb->start); + err = -EIO; + goto next; + } } else { if (node->eb->start == bytenr) goto next; -- cgit v1.2.3-59-g8ed1b From 0b34c261e235a5c74dcf78bd305845bd15fe2b42 Mon Sep 17 00:00:00 2001 From: Goldwyn Rodrigues Date: Fri, 30 Sep 2016 10:40:52 -0500 Subject: btrfs: qgroup: Prevent qgroup->reserved from going subzero While free'ing qgroup->reserved resources, we much check if the page has not been invalidated by a truncate operation by checking if the page is still dirty before reducing the qgroup resources. Resources in such a case are free'd when the entire extent is released by delayed_ref. This fixes a double accounting while releasing resources in case of truncating a file, reproduced by the following testcase. SCRATCH_DEV=/dev/vdb SCRATCH_MNT=/mnt mkfs.btrfs -f $SCRATCH_DEV mount -t btrfs $SCRATCH_DEV $SCRATCH_MNT cd $SCRATCH_MNT btrfs quota enable $SCRATCH_MNT btrfs subvolume create a btrfs qgroup limit 500m a $SCRATCH_MNT sync for c in {1..15}; do dd if=/dev/zero bs=1M count=40 of=$SCRATCH_MNT/a/file; done sleep 10 sync sleep 5 touch $SCRATCH_MNT/a/newfile echo "Removing file" rm $SCRATCH_MNT/a/file Fixes: b9d0b38928 ("btrfs: Add handler for invalidate page") Signed-off-by: Goldwyn Rodrigues Reviewed-by: Qu Wenruo Signed-off-by: David Sterba --- fs/btrfs/inode.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) (limited to 'fs') diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 50ba4ca167e7..6677674d0505 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8930,9 +8930,14 @@ again: * So even we call qgroup_free_data(), it won't decrease reserved * space. * 2) Not written to disk - * This means the reserved space should be freed here. + * This means the reserved space should be freed here. However, + * if a truncate invalidates the page (by clearing PageDirty) + * and the page is accounted for while allocating extent + * in btrfs_check_data_free_space() we let delayed_ref to + * free the entire extent. */ - btrfs_qgroup_free_data(inode, page_start, PAGE_SIZE); + if (PageDirty(page)) + btrfs_qgroup_free_data(inode, page_start, PAGE_SIZE); if (!inode_evicting) { clear_extent_bit(tree, page_start, page_end, EXTENT_LOCKED | EXTENT_DIRTY | -- cgit v1.2.3-59-g8ed1b From 69ae5e4459e43e56f03d0987e865fbac2b05af2a Mon Sep 17 00:00:00 2001 From: Wang Xiaoguang Date: Thu, 13 Oct 2016 09:23:39 +0800 Subject: btrfs: make file clone aware of fatal signals Indeed this just make the behavior similar to xfs when process has fatal signals pending, and it'll make fstests/generic/298 happy. Signed-off-by: Wang Xiaoguang Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/ioctl.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'fs') diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index af69129d7e0e..71634570943c 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3814,6 +3814,11 @@ process_slot: } btrfs_release_path(path); key.offset = next_key_min_offset; + + if (fatal_signal_pending(current)) { + ret = -EINTR; + goto out; + } } ret = 0; -- cgit v1.2.3-59-g8ed1b From dd4b857aabbce57518f2d32d9805365d3ff34fc6 Mon Sep 17 00:00:00 2001 From: Wang Xiaoguang Date: Tue, 18 Oct 2016 15:56:13 +0800 Subject: btrfs: pass correct args to btrfs_async_run_delayed_refs() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In btrfs_truncate_inode_items()->btrfs_async_run_delayed_refs(), we swap the arg2 and arg3 wrongly, fix this. This bug just impacts asynchronous delayed refs handle when we truncate inodes. In delayed_ref_async_start(), there is such codes: trans = btrfs_join_transaction(async->root); if (trans->transid > async->transid) goto end; ret = btrfs_run_delayed_refs(trans, async->root, async->count); From this codes, we can see that this just influence whether can we handle delayed refs or the number of delayed refs to handle, this may impact performance, but will not result in missing delayed refs, all delayed refs will be handled in btrfs_commit_transaction(). Signed-off-by: Wang Xiaoguang Reviewed-by: Holger Hoffstätte Signed-off-by: David Sterba --- fs/btrfs/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'fs') diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 6677674d0505..1f980efbb2e8 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4605,8 +4605,8 @@ delete: BUG_ON(ret); if (btrfs_should_throttle_delayed_refs(trans, root)) btrfs_async_run_delayed_refs(root, - trans->transid, - trans->delayed_ref_updates * 2, 0); + trans->delayed_ref_updates * 2, + trans->transid, 0); if (be_nice) { if (truncate_space_check(trans, root, extent_num_bytes)) { -- cgit v1.2.3-59-g8ed1b From 9c894696f56f5d84fb5766af81bcda4a7cb9a842 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 12 Oct 2016 11:33:21 +0300 Subject: Btrfs: remove some no-op casts We cast 0 to a u8 but then because of type promotion, it's immediately cast to int back to int before we do a bitwise negate. The cast doesn't matter in this case, the code works as intended. It causes a static checker warning though so let's remove it. Signed-off-by: Dan Carpenter Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/extent_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'fs') diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 66a755150056..8ed05d95584a 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5569,7 +5569,7 @@ void le_bitmap_set(u8 *map, unsigned int start, int len) *p |= mask_to_set; len -= bits_to_set; bits_to_set = BITS_PER_BYTE; - mask_to_set = ~(u8)0; + mask_to_set = ~0; p++; } if (len) { @@ -5589,7 +5589,7 @@ void le_bitmap_clear(u8 *map, unsigned int start, int len) *p &= ~mask_to_clear; len -= bits_to_clear; bits_to_clear = BITS_PER_BYTE; - mask_to_clear = ~(u8)0; + mask_to_clear = ~0; p++; } if (len) { @@ -5679,7 +5679,7 @@ void extent_buffer_bitmap_set(struct extent_buffer *eb, unsigned long start, kaddr[offset] |= mask_to_set; len -= bits_to_set; bits_to_set = BITS_PER_BYTE; - mask_to_set = ~(u8)0; + mask_to_set = ~0; if (++offset >= PAGE_SIZE && len > 0) { offset = 0; page = eb->pages[++i]; @@ -5721,7 +5721,7 @@ void extent_buffer_bitmap_clear(struct extent_buffer *eb, unsigned long start, kaddr[offset] &= ~mask_to_clear; len -= bits_to_clear; bits_to_clear = BITS_PER_BYTE; - mask_to_clear = ~(u8)0; + mask_to_clear = ~0; if (++offset >= PAGE_SIZE && len > 0) { offset = 0; page = eb->pages[++i]; -- cgit v1.2.3-59-g8ed1b From 9d1032cc49a8a1065e79ee323de66bcb4fdbd535 Mon Sep 17 00:00:00 2001 From: Wang Xiaoguang Date: Mon, 20 Jun 2016 09:18:52 +0800 Subject: btrfs: fix WARNING in btrfs_select_ref_head() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This issue was found when testing in-band dedupe enospc behaviour, sometimes run_one_delayed_ref() may fail for enospc reason, then __btrfs_run_delayed_refs()will return, but forget to add num_heads_read back, which will trigger "WARN_ON(delayed_refs->num_heads_ready == 0)" in btrfs_select_ref_head(). Signed-off-by: Wang Xiaoguang Signed-off-by: David Sterba --- fs/btrfs/extent-tree.c | 3 +++ 1 file changed, 3 insertions(+) (limited to 'fs') diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 210c94ac8818..4607af38c72e 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2647,7 +2647,10 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans, btrfs_free_delayed_extent_op(extent_op); if (ret) { + spin_lock(&delayed_refs->lock); locked_ref->processing = 0; + delayed_refs->num_heads_ready++; + spin_unlock(&delayed_refs->lock); btrfs_delayed_ref_unlock(locked_ref); btrfs_put_delayed_ref(ref); btrfs_debug(fs_info, "run_one_delayed_ref returned %d", -- cgit v1.2.3-59-g8ed1b