Age | Commit message (Collapse) | Author | Files | Lines |
|
Allow btree_trans to be used with CLASS().
Automatic cleanup, instead of manually calling bch2_trans_put().
We don't use DEFINE_CLASS because using a static inline for the
constructor breaks bch2_trans_get()'s use of __func__, so we have to
open code it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a DEFINE_CLASS() for printbufs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
New helpers to avoid open coded loops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we don't finish journal replay we need to keep journal keys around
until the filesystem shuts down - otherwise e.g. -o norecovery, various
tools (dump, list) break, and eventually we'll be doing journal replay
in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We had a bug report where the errors from btree_node_check_topology()
don't seem to be getting printed; log_fsck_err() does some fancy
ratelimiting-type stuff that we don't want here.
Instead, just use bch2_count_fsck_err(); this is simpler, and modelled
after how we're currently handling bucket ref update errors in
buckets.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More self healing code: readdir will now notice if there are dirents
hashed incorrectly, and it'll repair them if errors=fix_safe.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We don't track snapshot overwrites outside of fsck, so for this to be
called at runtime outside of fsck we need to create it on demand, when
we have repair to do.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Now uses bch2_get_snapshot_overwrites(), and much shorter.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
New helper for getting a list of snapshot IDs that have overwritten a
given key.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Recover from "journal and btree in same bucket".
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If snapshot deletion incorrectly missing some keys and leaves keys for
deleted snapshots, that causes a bit of a problem for data move - we
can't move an extent for a nonexistent snapshot, because the extent
might have to be fragmented, and maintaining correct visibility in child
snapshots doesn't work if it doesn't have a snapshot.
Previously we'd just skip these keys, but it turns out that causes
copygc to spin.
So we need runtime self healing, i.e. calling check_key_has_snapshot()
from the data move path.
Snapshot deletion v2 included sentinal values for deleted snapshot
nodes, so this is quite safe.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
data_update_init() does need to do btree operations, delay doing the
unlock-before-io.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We'll be adding subtypes of these errors, and new error code tracing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
These don't access global memory or defer pointer arguments - this
enables CSE optimizations.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We were accidentally including the contents from the previous
fsck_err().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Make the superblock error counters available in sysfs; the only other
way they can be seen is 'show-super', but we don't write the superblock
every time the error count gets incremented.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This is straightforward enough: check_fix_ptrs() currently only runs
before we go RW, so updating the btree root pointer in c->btree_roots
suffices - it'll be written out in the first journal write we do.
For that, do_bch2_trans_commit_to_journal_replay() now handles
JSET_ENTRY_btree_root entries.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We have a bug report that looks like we might be leaking open buckets -
let's check if they got left attached to the cached btree node.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More stack usage work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
with typical config options, variables in different inline functions
aren't sharing stack space - and these are slowpaths.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Specialize the .to_text() for alloc_v4, to avoid the temporary on the
stack for conversion from old versions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
- Separate out a slowpath for bkey_nocow_lock()
- Don't call bch2_bkey_ptrs_c() or loop over pointers more than
necessary
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More stack usage work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out an error path for a small stack usage improvement.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Allocate some (smaller) temporary storage in btree_trans for this -
btree_path_down() is in our max-stack call stack.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an assertion pop in the tiering_misaligned test: rounding down to
bucket size at the end of the journal space calculations leaves
cur_entry_sectors == 0, which is incorrect with !cur_entry_err.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's uncomon to have multiple devices with journalling only on a subset,
but can be specified with the 'data_allowed' option. We need to know if
we're doing data/metadata writes to multiple devices, as that requires
issuing flushes before the journal writes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an infinite loop when bkey_i->k.u64s is 0.
This only happens in userspace, where 'bcachefs list_journal' can print
the entire contents of the journal, and non-dirty entries aren't
validated.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
- Don't print a checksum error when we first read a journal entry: we
print a checksum error later if we'll be using the journal entry.
- Continuing with the theme of of improving error messages and grouping
errors into a single log message per error, print a single 'checksum
error' message per journal entry, and use bch2_journal_ptr_to_text()
to print out where on the device it was.
- Factor out checksum error messages and checking for missing journal
entries into helpers, bch2_journal_read() has gotten obnoxiously big.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix a small regression from the "run recovery passes" rewrite, which
enabled async recovery passes.
This fixes getting stuck in a loop in recovery.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Other repair code seems to be doing commits themselves, but
check_key_has_snapshot() does not.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix a missing wakeup in
'bcachefs set-file-option' -> xattr option update -> inode_write
this was missing because the wakeup needs to happen after transaction
commit. Also, add a 'kick' counter, to make sure we don't miss a wakeup
that occured right after we finished checking the rebalance_work btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a version of bch2_kthread_io_clock_wait() that only schedules once -
behaving more like schedule_timeout().
This will be used for fixing rebalance wakeups.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Also, don't error out in bucket_ref_update_err(): we don't want to
return -BCH_ERR_cannot_rewind_recovery if it's not an insert, if it's an
overwrite we continue.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Repair code will do updates on older snapshot versions, so needs the
correct annotation.
Reported-by: syzbot+42581416dba62b364750@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we're doing a reflink copy of existing reflinked data, we may only
set REFLINK_P_MAY_UPDATE_OPTIONS if it was set on the reflink pointer
we're copying from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Large folios aren't supported without TRANSPARENT_HUGEPAGE
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can't unlock a should_be_locked path unless we're in a transaction
restart.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Different versions differ on the size of the blacklist range; it is
theoretically possible that we could end up with blacklisted journal
sequence numbers newer than the newest seq we find in the journal, and
pick a new start seq that's blacklisted.
Explicitly check for this in bch2_fs_journal_start().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We don't want to change the bucket gen, on gen mismatch: it's possible
to have multiple btree nodes with different gens in the same bucket that
we want to keep, if we have to recover from btree node scan.
It's also not necessary to set g->gen_valid; add a comment to that
effect.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|