|
If the final line of the message to be printed is blank, don't print
it.
This happens with indented printbufs - after a newline we emit spaces up
to the indent level.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
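A minimal userspace sketch of the check described above, not the bcachefs
code: printable_len() is a hypothetical helper that returns how many bytes of
a message to print once a whitespace-only final line (such as the indent
emitted after a trailing newline) has been dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (assumption, not a bcachefs function): length of msg
 * to print after dropping a final line that contains only spaces or tabs.
 */
size_t printable_len(const char *msg)
{
	assert(msg != NULL);

	size_t len = strlen(msg);
	size_t i = len;

	/* Walk back over trailing indentation. */
	while (i > 0 && (msg[i - 1] == ' ' || msg[i - 1] == '\t'))
		i--;

	/*
	 * Trim only if the whitespace is an entire final line: either it
	 * follows a newline, or the whole message is whitespace.
	 */
	if (i == 0 || msg[i - 1] == '\n')
		return i;

	return len;
}
```

A caller would then print only the first printable_len(msg) bytes, for
example with fwrite(msg, 1, printable_len(msg), stdout).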
|
|
Add async objs list for
- promote_op
- bch_read_bio
- btree_read_bio
- btree_write_bio
This gets us introspection on in-flight async ops, and because under the
hood it uses fast_lists (percpu slot buffer on top of a radix tree),
it'll be fast enough to enable in production.
This will be very helpful for debugging "something got stuck" issues,
which have been cropping up from time to time (in the CI, especially
with folio writeback).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Debugging infrastructure for async objs: this lets us easily create
fast_lists for various object types so they'll be visible in debugfs.
Add new object types to the BCH_ASYNC_OBJS_TYPES() enum, and drop a
pretty-printer wrapper in async_objs.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
A fast "list" data structure, which is actually a radix tree, with an
IDA for slot allocation and a percpu buffer on top of that.
Items cannot be added or moved to the head or tail, only added at some
(arbitrary) position and removed. The advantage is that adding, removing
and iterating are generally lockless, only taking the IDA lock when
the percpu buffer is full or empty.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
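A rough single-threaded sketch of the slot-allocation idea, under heavy
assumptions: a fixed array stands in for the radix tree, a free-index stack
stands in for the IDA, and one small cache stands in for a CPU's percpu
buffer. All names, sizes and the batch refill/flush policy are hypothetical
and not taken from the bcachefs code.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS   64	/* fixed table standing in for the radix tree */
#define CACHE_SIZE 4	/* stand-in for one CPU's slot buffer */

struct fast_list_sketch {
	void	 *slots[NR_SLOTS];	/* slot index -> object, NULL if empty */
	unsigned  free_idx[NR_SLOTS];	/* shared free-index stack (the "IDA") */
	unsigned  nr_free;
	unsigned  cache[CACHE_SIZE];	/* locally cached free indices */
	unsigned  nr_cached;
	unsigned  slow_path_hits;	/* times the shared allocator was used */
};

void fl_init(struct fast_list_sketch *l)
{
	for (unsigned i = 0; i < NR_SLOTS; i++) {
		l->slots[i] = NULL;
		l->free_idx[i] = NR_SLOTS - 1 - i;
	}
	l->nr_free = NR_SLOTS;
	l->nr_cached = 0;
	l->slow_path_hits = 0;
}

/* Add obj at an arbitrary free slot; returns the slot index or -1 if full. */
int fl_add(struct fast_list_sketch *l, void *obj)
{
	assert(obj != NULL);

	if (!l->nr_cached) {
		/* Cache empty: refill from the shared allocator (lock goes here). */
		l->slow_path_hits++;
		while (l->nr_cached < CACHE_SIZE && l->nr_free)
			l->cache[l->nr_cached++] = l->free_idx[--l->nr_free];
		if (!l->nr_cached)
			return -1;
	}

	unsigned idx = l->cache[--l->nr_cached];
	l->slots[idx] = obj;
	return (int) idx;
}

void fl_remove(struct fast_list_sketch *l, unsigned idx)
{
	assert(idx < NR_SLOTS && l->slots[idx] != NULL);
	l->slots[idx] = NULL;

	if (l->nr_cached == CACHE_SIZE) {
		/* Cache full: flush to the shared allocator (lock goes here). */
		l->slow_path_hits++;
		while (l->nr_cached)
			l->free_idx[l->nr_free++] = l->cache[--l->nr_cached];
	}
	l->cache[l->nr_cached++] = idx;
}

/* Iteration is a walk over the slot table, skipping empty slots. */
unsigned fl_count(const struct fast_list_sketch *l)
{
	unsigned n = 0;

	for (unsigned i = 0; i < NR_SLOTS; i++)
		n += l->slots[i] != NULL;
	return n;
}
```

The slow_path_hits counter only exists to show how rarely the shared
allocator is touched; the sketch does not attempt to model concurrency.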
|
|
Pretty printer for struct bch_read_bio.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Pretty printer for struct bio, to be used for async object debugging.
This is pretty minimal, we'll add more to it as we discover what we
need.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Convert device IO refs to enumerated_refs, for easier debugging of
refcount issues.
Simple conversion: enumerate all users and convert to the new helpers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Drop the single-purpose write ref code in bcachefs.h, and convert to
enumerated refs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out the debug code for rw filesystem refs into a small library.
In release mode an enumerated ref is a normal percpu refcount, but in
debug mode each enumerated user of the ref gets its own atomic_long_t
ref, making it much easier to chase down refcount bugs when a
refcount has many users.
For debugging, we have enumerated_ref_to_text(), which prints the
current value of each different user.
Additionally, in debug mode enumerated_ref_stop() has a 10 second
timeout, after which it will dump outstanding refcounts.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
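A userspace sketch of the debug-mode idea, using C11 atomics in place of
atomic_long_t. The user names, struct layout and function names are all
hypothetical; only the concept (one counter per enumerated user, plus a
to_text-style dump) comes from the description above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical users of the ref; the real lists are defined in bcachefs. */
enum ref_user { REF_USER_READ, REF_USER_WRITE, REF_USER_NR };

static const char * const ref_user_names[REF_USER_NR] = { "read", "write" };

struct enumerated_ref_sketch {
	atomic_long refs[REF_USER_NR];	/* one counter per enumerated user */
};

void ref_sketch_init(struct enumerated_ref_sketch *r)
{
	for (int i = 0; i < REF_USER_NR; i++)
		atomic_init(&r->refs[i], 0);
}

void ref_sketch_get(struct enumerated_ref_sketch *r, enum ref_user u)
{
	assert(u < REF_USER_NR);
	atomic_fetch_add(&r->refs[u], 1);
}

void ref_sketch_put(struct enumerated_ref_sketch *r, enum ref_user u)
{
	assert(u < REF_USER_NR);
	atomic_fetch_sub(&r->refs[u], 1);
}

/* Sum over all users: what a single shared refcount would have held. */
long ref_sketch_total(struct enumerated_ref_sketch *r)
{
	long sum = 0;

	for (int i = 0; i < REF_USER_NR; i++)
		sum += atomic_load(&r->refs[i]);
	return sum;
}

/* Write "name: count" per user; returns bytes written, or -1 if buf is too small. */
int ref_sketch_to_text(struct enumerated_ref_sketch *r, char *buf, size_t len)
{
	size_t pos = 0;

	for (int i = 0; i < REF_USER_NR; i++) {
		int n = snprintf(buf + pos, len - pos, "%s: %ld\n",
				 ref_user_names[i], atomic_load(&r->refs[i]));
		if (n < 0 || (size_t) n >= len - pos)
			return -1;
		pos += (size_t) n;
	}
	return (int) pos;
}
```

With per-user counters, a leaked ref shows up as a nonzero count against
one named user instead of an unexplained nonzero total.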
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a pass for checking the rebalance_work btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This pops up when building in userspace.
The structs aren't actually variable length, but no way to tell the
compiler that...
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
People have been asking to see the plan for this, so -
bcachefs has various background tasks that need to be scheduled to
balance efficiency, predictability of performance, etc.
The design and philosophy haven't changed much since bcache, which
was primarily designed for server usage, with sustained load in mind.
These days we're seeing more desktop usage, where we really want to let
the system idle effectively to reduce total power usage. The previous
concerns still apply, though: we still want to let work accumulate to a
degree.
This lays out all the requirements and starts to sketch out the
algorithm I have in mind.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can't go RW if it's an image file that hasn't been resized.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If a filesystem is going to only be used read-only, and will be a
deployable image, we can strip out alloc info for a substantial
reduction in metadata size - around half, due to backpointers.
Alloc info will be regenerated on first read-write mount.
Remounting RW is disallowed for now, since we don't yet have
check_allocations running in RW mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Don't generate entries for versions that won't be able to mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If the root inode/subvolume is unreadable we can repair automatically -
but only if we're still in recovery, so that we can rewind to the
appropriate recovery pass.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Instead of going emergency read-only with a bch2_fs_inconsistent() call,
log the error and flag the appropriate recovery pass.
If we're still in recovery it'll be repaired immediately, otherwise
it'll be repaired on the next mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_print_string_as_lines() is a low level helper that allows messages
longer than 1k to be printed without truncation.
But we should always be printing with the helpers that take a filesystem
object: if we're in fsck, they direct output to the userspace process
controlling fsck instead of the dmesg log.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Part of the ongoing project to kill off bch2_(fs|trans)_inconsistent
calls - they generally need to be replaced with either
- a fsck_err() call that can repair the error, or
- logging an error of the appropriate type in the superblock, and
flagging the appropriate recovery pass to repair the error
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We prefer helpers that emit log messages to printbufs rather than
printing them directly; that way, we can ensure that different log
messages from the same event are grouped together and formatted
appropriately in the dmesg log.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
version_upgrade is now a runtime option.
In the future we'll want to add compatible upgrades at runtime, and call
the full check_version_upgrade() when the option changes, but we don't
have compatible optional upgrades just yet.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The helpers are now:
- bch2_opt_hook_pre_set()
- bch2_opts_hooks_pre_set()
- bch2_opt_hook_post_set()
Fix a bug where the filesystem discard option would incorrectly be
changed when setting the device option, and don't trigger rebalance
scans unnecessarily (when options aren't changing).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Single device filesystems are now identified by the block device name,
not the UUID - and single device filesystems with the same UUID can be
mounted simultaneously, without any special options.
This allocates a new bit in the superblock, BCH_SB_MULTI_DEVICE, which
indicates whether a filesystem has ever been multi device.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
On single device filesystems, c->name contains the block device name,
not the UUID.
Initialize this earlier, so that single device mode can use it for
initializing sysfs/debugfs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's now a wrapper around bch2_journal_halt_locked().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a helper that lets us change bch_member.data_allowed at runtime.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Make the output slightly clearer, and include a counter for "nodes we
couldn't free because we would have gone under our reserve".
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out a helper so we're not duplicating checks after locking the
btree node.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Small cleanup, just always increment the counters.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Kill 'opts.very_degraded', and make 'opts.degraded' a persistent option,
stored in the superblock.
It's now an enum, with available choices ask/yes/very/no.
"ask" mode will be handled by the mount helper, for prompting the user
(on a machine used interactively) for whether to do a degraded mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
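A small sketch of how an enum option with these four choices might be
parsed and printed. The enum constants, their numeric order and the function
names are assumptions for illustration; only the choice strings
ask/yes/very/no come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants; the on-disk values used by bcachefs may differ. */
enum degraded_sketch {
	DEGRADED_ASK,
	DEGRADED_YES,
	DEGRADED_VERY,
	DEGRADED_NO,
	DEGRADED_NR,
};

static const char * const degraded_names[DEGRADED_NR] = {
	"ask", "yes", "very", "no",
};

/* Map an option string to its enum value; returns -1 for an unknown choice. */
int degraded_parse(const char *s)
{
	assert(s != NULL);

	for (int i = 0; i < DEGRADED_NR; i++)
		if (!strcmp(s, degraded_names[i]))
			return i;
	return -1;
}

/* Map an enum value back to its string, or NULL if out of range. */
const char *degraded_name(int v)
{
	return v >= 0 && v < DEGRADED_NR ? degraded_names[v] : NULL;
}
```

In this sketch, a mount helper seeing DEGRADED_ASK would be the one to
prompt the user; the parser itself only validates the choice.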
|