bcachefs: bcachefs_metadata_version_casefolding

This patch implements support for case-insensitive file name lookups in bcachefs. The implementation uses the same UTF-8 lowering and normalization that ext4 and f2fs is using. More information is provided in Documentation/bcachefs/casefolding.rst Compatibility notes: This uses the new versioning scheme for incompatible features where an incompatible feature is tied to a version number: the superblock says "we may use incompat features up to x" and "incompat features up to x are in use", disallowing mounting by previous versions. Additionally, and old style incompat feature bit is used, so that kernels without utf8 casefolding support know if casefolding specifically is in use and they're allowed to mount. Signed-off-by: Joshua Ashton <joshua@froggi.es> Cc: André Almeida <andrealmeid@igalia.com> Cc: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
author: Joshua Ashton <joshua@froggi.es> 2023-08-13 18:34:17 +0100
committer: Kent Overstreet <kent.overstreet@linux.dev> 2025-03-14 21:02:15 -0400
commit: d37c14ac6f05ec98db9b3d9db424dc73a0f5b1cd (patch)
tree: ec86a78663c9222afcfaec3b85ed968ba4d205cf /Documentation/filesystems
parent: bcachefs: Split out dirent alloc and name initialization (diff)
download: wireguard-linux-d37c14ac6f05ec98db9b3d9db424dc73a0f5b1cd.tar.xz
wireguard-linux-d37c14ac6f05ec98db9b3d9db424dc73a0f5b1cd.zip
1 files changed, 87 insertions, 0 deletions
diff --git a/Documentation/filesystems/bcachefs/casefolding.rst b/Documentation/filesystems/bcachefs/casefolding.rst
new file mode 100644
index 000000000000..6546aa4f7a86
--- /dev/null
+++ b/Documentation/filesystems/bcachefs/casefolding.rst
@@ -0,0 +1,87 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Casefolding
+===========
+
+bcachefs has support for case-insensitive file and directory
+lookups using the regular `chattr +F` (`S_CASEFOLD`, `FS_CASEFOLD_FL`)
+casefolding attributes.
+
+The main usecase for casefolding is compatibility with software written
+against other filesystems that rely on casefolded lookups
+(eg. NTFS and Wine/Proton).
+Taking advantage of file-system level casefolding can lead to great
+loading time gains in many applications and games.
+
+Casefolding support requires a kernel with the `CONFIG_UNICODE` enabled.
+Once a directory has been flagged for casefolding, a feature bit
+is enabled on the superblock which marks the filesystem as using
+casefolding.
+When the feature bit for casefolding is enabled, it is no longer possible
+to mount that filesystem on kernels without `CONFIG_UNICODE` enabled.
+
+On the lookup/query side: casefolding is implemented by allocating a new
+string of `BCH_NAME_MAX` length using the `utf8_casefold` function to
+casefold the query string.
+
+On the dirent side: casefolding is implemented by ensuring the `bkey`'s
+hash is made from the casefolded string and storing the cached casefolded
+name with the regular name in the dirent.
+
+The structure looks like this:
+
+Regular:    [dirent data][regular name][nul][nul]...
+Casefolded: [dirent data][reg len][cf len][regular name][casefolded name][nul][nul]...
+
+(Do note, the number of `NUL`s here is merely for illustration, they count can vary
+ per-key, and they may not even be present if the key is aligned to `sizeof(u64)`.)
+
+This is efficient as it means that for all file lookups that require casefolding,
+it has identical performance to a regular lookup:
+a hash comparison and a `memcmp` of the name.
+
+Rationale
+---------
+
+Several designs were considered for this system:
+One was to introduce a dirent_v2, however that would be painful especially as
+the hash system only has support for a single key type. This would also need
+`BCH_NAME_MAX` to change between versions, and a new feature bit.
+
+Another option was to store without the two lengths, and just take the length of
+the regular name and casefolded name contiguously / 2 as the length. This would
+assume that the regular length == casefolded length, but that could potentially
+not be true, if the uppercase unicode glyph had a different UTF-8 encoding than
+the lowercase unicode glyph.
+It would be possible to disregard the casefold cache for those cases, but it was
+decided to simply encode the two string lengths in the key to avoid random
+performance issues if this edgecase was ever hit.
+
+The option settled on was to use a free-bit in d_type to mark a dirent as having
+a casefold cache, and then treat the first 4 bytes the name block as lengths.
+You can see this in the `d_cf_name_block` member of union in `bch_dirent`.
+
+The feature bit was used to allow casefolding support to be enabled for the majority
+of users, but some allow users who have no need for the feature to still use bcachefs as
+`CONFIG_UNICODE` can increase the kernel side a significant amount due to the tables used,
+which may be decider between using bcachefs for eg. embedded platforms.
+
+Other filesystems like ext4 and f2fs have a super-block level option for casefolding
+encoding, but bcachefs currently does not provide this. ext4 and f2fs do not expose
+any encodings than a single UTF-8 version. When future encodings are desirable,
+they will be added trivially using the opts mechanism.
+
+dentry/dcache considerations
+---------
+
+Currently, in casefolded directories, bcachefs (like other filesystems) will not cache
+negative dentry's.
+
+This is because currently doing so presents a problem in the following scenario:
+ - Lookup file "blAH" in a casefolded directory
+ - Creation of file "BLAH" in a casefolded directory
+ - Lookup file "blAH" in a casefolded directory
+This would fail if negative dentry's were cached.
+
+This is slightly suboptimal, but could be fixed in future with some vfs work.
+
author	Joshua Ashton <joshua@froggi.es>	2023-08-13 18:34:17 +0100
committer	Kent Overstreet <kent.overstreet@linux.dev>	2025-03-14 21:02:15 -0400
commit	d37c14ac6f05ec98db9b3d9db424dc73a0f5b1cd (patch)
tree	ec86a78663c9222afcfaec3b85ed968ba4d205cf /Documentation/filesystems
parent	bcachefs: Split out dirent alloc and name initialization (diff)
download	wireguard-linux-d37c14ac6f05ec98db9b3d9db424dc73a0f5b1cd.tar.xz wireguard-linux-d37c14ac6f05ec98db9b3d9db424dc73a0f5b1cd.zip