aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--Documentation/filesystems/index.rst1
-rw-r--r--Documentation/filesystems/ntfs3.rst106
-rw-r--r--MAINTAINERS9
-rw-r--r--fs/Kconfig1
-rw-r--r--fs/Makefile1
-rw-r--r--fs/ntfs3/Kconfig46
-rw-r--r--fs/ntfs3/Makefile36
-rw-r--r--fs/ntfs3/attrib.c2093
-rw-r--r--fs/ntfs3/attrlist.c460
-rw-r--r--fs/ntfs3/bitfunc.c134
-rw-r--r--fs/ntfs3/bitmap.c1493
-rw-r--r--fs/ntfs3/debug.h52
-rw-r--r--fs/ntfs3/dir.c599
-rw-r--r--fs/ntfs3/file.c1251
-rw-r--r--fs/ntfs3/frecord.c3257
-rw-r--r--fs/ntfs3/fslog.c5217
-rw-r--r--fs/ntfs3/fsntfs.c2509
-rw-r--r--fs/ntfs3/index.c2650
-rw-r--r--fs/ntfs3/inode.c1957
-rw-r--r--fs/ntfs3/lib/decompress_common.c319
-rw-r--r--fs/ntfs3/lib/decompress_common.h338
-rw-r--r--fs/ntfs3/lib/lib.h26
-rw-r--r--fs/ntfs3/lib/lzx_decompress.c670
-rw-r--r--fs/ntfs3/lib/xpress_decompress.c142
-rw-r--r--fs/ntfs3/lznt.c453
-rw-r--r--fs/ntfs3/namei.c411
-rw-r--r--fs/ntfs3/ntfs.h1216
-rw-r--r--fs/ntfs3/ntfs_fs.h1111
-rw-r--r--fs/ntfs3/record.c605
-rw-r--r--fs/ntfs3/run.c1113
-rw-r--r--fs/ntfs3/super.c1512
-rw-r--r--fs/ntfs3/upcase.c108
-rw-r--r--fs/ntfs3/xattr.c1122
33 files changed, 31018 insertions, 0 deletions
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 1a2dd4d35717..c0ad233963ae 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -101,6 +101,7 @@ Documentation for filesystem implementations.
nilfs2
nfs/index
ntfs
+ ntfs3
ocfs2
ocfs2-online-filecheck
omfs
diff --git a/Documentation/filesystems/ntfs3.rst b/Documentation/filesystems/ntfs3.rst
new file mode 100644
index 000000000000..ffe9ea0c1499
--- /dev/null
+++ b/Documentation/filesystems/ntfs3.rst
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====
+NTFS3
+=====
+
+
+Summary and Features
+====================
+
+NTFS3 is fully functional NTFS Read-Write driver. The driver works with
+NTFS versions up to 3.1, normal/compressed/sparse files
+and journal replaying. File system type to use on mount is 'ntfs3'.
+
+- This driver implements NTFS read/write support for normal, sparse and
+ compressed files.
+- Supports native journal replaying;
+- Supports extended attributes
+ Predefined extended attributes:
+ - 'system.ntfs_security' gets/sets security
+ descriptor (SECURITY_DESCRIPTOR_RELATIVE)
+ - 'system.ntfs_attrib' gets/sets ntfs file/dir attributes.
+ Note: applied to empty files, this allows to switch type between
+ sparse(0x200), compressed(0x800) and normal;
+- Supports NFS export of mounted NTFS volumes.
+
+Mount Options
+=============
+
+The list below describes mount options supported by NTFS3 driver in addition to
+generic ones.
+
+===============================================================================
+
+nls=name This option informs the driver how to interpret path
+ strings and translate them to Unicode and back. If
+ this option is not set, the default codepage will be
+ used (CONFIG_NLS_DEFAULT).
+ Examples:
+ 'nls=utf8'
+
+uid=
+gid=
+umask= Controls the default permissions for files/directories created
+ after the NTFS volume is mounted.
+
+fmask=
+dmask= Instead of specifying umask which applies both to
+ files and directories, fmask applies only to files and
+ dmask only to directories.
+
+nohidden Files with the Windows-specific HIDDEN (FILE_ATTRIBUTE_HIDDEN)
+ attribute will not be shown under Linux.
+
+sys_immutable Files with the Windows-specific SYSTEM
+ (FILE_ATTRIBUTE_SYSTEM) attribute will be marked as system
+ immutable files.
+
+discard Enable support of the TRIM command for improved performance
+ on delete operations, which is recommended for use with the
+ solid-state drives (SSD).
+
+force Forces the driver to mount partitions even if 'dirty' flag
+ (volume dirty) is set. Not recommended for use.
+
+sparse Create new files as "sparse".
+
+showmeta Use this parameter to show all meta-files (System Files) on
+ a mounted NTFS partition.
+ By default, all meta-files are hidden.
+
+prealloc Preallocate space for files excessively when file size is
+ increasing on writes. Decreases fragmentation in case of
+ parallel write operations to different files.
+
+no_acs_rules "No access rules" mount option sets access rights for
+ files/folders to 777 and owner/group to root. This mount
+ option absorbs all other permissions:
+ - permissions change for files/folders will be reported
+ as successful, but they will remain 777;
+ - owner/group change will be reported as successful, but
+ they will stay as root
+
+acl Support POSIX ACLs (Access Control Lists). Effective if
+ supported by Kernel. Not to be confused with NTFS ACLs.
+ The option specified as acl enables support for POSIX ACLs.
+
+noatime All files and directories will not update their last access
+ time attribute if a partition is mounted with this parameter.
+ This option can speed up file system operation.
+
+===============================================================================
+
+ToDo list
+=========
+
+- Full journaling support (currently journal replaying is supported) over JBD.
+
+
+References
+==========
+https://www.paragon-software.com/home/ntfs-linux-professional/
+ - Commercial version of the NTFS driver for Linux.
+
+almaz.alexandrovich@paragon-software.com
+ - Direct e-mail address for feedback and requests on the NTFS3 implementation.
diff --git a/MAINTAINERS b/MAINTAINERS
index 5ffe43730e43..e4bdea045e85 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13340,6 +13340,15 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs.git
F: Documentation/filesystems/ntfs.rst
F: fs/ntfs/
+NTFS3 FILESYSTEM
+M: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
+L: ntfs3@lists.linux.dev
+S: Supported
+W: http://www.paragon-software.com/
+T: git https://github.com/Paragon-Software-Group/linux-ntfs3.git
+F: Documentation/filesystems/ntfs3.rst
+F: fs/ntfs3/
+
NUBUS SUBSYSTEM
M: Finn Thain <fthain@linux-m68k.org>
L: linux-m68k@lists.linux-m68k.org
diff --git a/fs/Kconfig b/fs/Kconfig
index b11bd4b387e1..d8207a1b8c44 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -136,6 +136,7 @@ menu "DOS/FAT/EXFAT/NT Filesystems"
source "fs/fat/Kconfig"
source "fs/exfat/Kconfig"
source "fs/ntfs/Kconfig"
+source "fs/ntfs3/Kconfig"
endmenu
endif # BLOCK
diff --git a/fs/Makefile b/fs/Makefile
index 354e2ba3ee67..2f21300851ae 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_CIFS) += cifs/
obj-$(CONFIG_SMB_SERVER) += ksmbd/
obj-$(CONFIG_HPFS_FS) += hpfs/
obj-$(CONFIG_NTFS_FS) += ntfs/
+obj-$(CONFIG_NTFS3_FS) += ntfs3/
obj-$(CONFIG_UFS_FS) += ufs/
obj-$(CONFIG_EFS_FS) += efs/
obj-$(CONFIG_JFFS2_FS) += jffs2/
diff --git a/fs/ntfs3/Kconfig b/fs/ntfs3/Kconfig
new file mode 100644
index 000000000000..6e4cbc48ab8e
--- /dev/null
+++ b/fs/ntfs3/Kconfig
@@ -0,0 +1,46 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config NTFS3_FS
+ tristate "NTFS Read-Write file system support"
+ select NLS
+ help
+ Windows OS native file system (NTFS) support up to NTFS version 3.1.
+
+ Y or M enables the NTFS3 driver with full features enabled (read,
+ write, journal replaying, sparse/compressed files support).
+ File system type to use on mount is "ntfs3". Module name (M option)
+ is also "ntfs3".
+
+ Documentation: <file:Documentation/filesystems/ntfs3.rst>
+
+config NTFS3_64BIT_CLUSTER
+ bool "64 bits per NTFS clusters"
+ depends on NTFS3_FS && 64BIT
+ help
+ Windows implementation of ntfs.sys uses 32 bits per clusters.
+ If activated 64 bits per clusters you will be able to use 4k cluster
+ for 16T+ volumes. Windows will not be able to mount such volumes.
+
+ It is recommended to say N here.
+
+config NTFS3_LZX_XPRESS
+ bool "activate support of external compressions lzx/xpress"
+ depends on NTFS3_FS
+ help
+ In Windows 10 one can use command "compact" to compress any files.
+ 4 possible variants of compression are: xpress4k, xpress8k, xpress16k and lzx.
+ If activated you will be able to read such files correctly.
+
+ It is recommended to say Y here.
+
+config NTFS3_FS_POSIX_ACL
+ bool "NTFS POSIX Access Control Lists"
+ depends on NTFS3_FS
+ select FS_POSIX_ACL
+ help
+ POSIX Access Control Lists (ACLs) support additional access rights
+ for users and groups beyond the standard owner/group/world scheme,
+ and this option selects support for ACLs specifically for ntfs
+ filesystems.
+ NOTE: this is linux only feature. Windows will ignore these ACLs.
+
+ If you don't know what Access Control Lists are, say N.
diff --git a/fs/ntfs3/Makefile b/fs/ntfs3/Makefile
new file mode 100644
index 000000000000..279701b62bbe
--- /dev/null
+++ b/fs/ntfs3/Makefile
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the ntfs3 filesystem support.
+#
+
+# to check robot warnings
+ccflags-y += -Wint-to-pointer-cast \
+ $(call cc-option,-Wunused-but-set-variable,-Wunused-const-variable) \
+ $(call cc-option,-Wold-style-declaration,-Wout-of-line-declaration)
+
+obj-$(CONFIG_NTFS3_FS) += ntfs3.o
+
+ntfs3-y := attrib.o \
+ attrlist.o \
+ bitfunc.o \
+ bitmap.o \
+ dir.o \
+ fsntfs.o \
+ frecord.o \
+ file.o \
+ fslog.o \
+ inode.o \
+ index.o \
+ lznt.o \
+ namei.o \
+ record.o \
+ run.o \
+ super.o \
+ upcase.o \
+ xattr.o
+
+ntfs3-$(CONFIG_NTFS3_LZX_XPRESS) += $(addprefix lib/,\
+ decompress_common.o \
+ lzx_decompress.o \
+ xpress_decompress.o \
+ ) \ No newline at end of file
diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c
new file mode 100644
index 000000000000..34c4cbf7e29b
--- /dev/null
+++ b/fs/ntfs3/attrib.c
@@ -0,0 +1,2093 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * TODO: Merge attr_set_size/attr_data_get_block/attr_allocate_frame?
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/nls.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * You can set external NTFS_MIN_LOG2_OF_CLUMP/NTFS_MAX_LOG2_OF_CLUMP to manage
+ * preallocate algorithm.
+ */
+#ifndef NTFS_MIN_LOG2_OF_CLUMP
+#define NTFS_MIN_LOG2_OF_CLUMP 16
+#endif
+
+#ifndef NTFS_MAX_LOG2_OF_CLUMP
+#define NTFS_MAX_LOG2_OF_CLUMP 26
+#endif
+
+// 16M
+#define NTFS_CLUMP_MIN (1 << (NTFS_MIN_LOG2_OF_CLUMP + 8))
+// 16G
+#define NTFS_CLUMP_MAX (1ull << (NTFS_MAX_LOG2_OF_CLUMP + 8))
+
+static inline u64 get_pre_allocated(u64 size)
+{
+ u32 clump;
+ u8 align_shift;
+ u64 ret;
+
+ if (size <= NTFS_CLUMP_MIN) {
+ clump = 1 << NTFS_MIN_LOG2_OF_CLUMP;
+ align_shift = NTFS_MIN_LOG2_OF_CLUMP;
+ } else if (size >= NTFS_CLUMP_MAX) {
+ clump = 1 << NTFS_MAX_LOG2_OF_CLUMP;
+ align_shift = NTFS_MAX_LOG2_OF_CLUMP;
+ } else {
+ align_shift = NTFS_MIN_LOG2_OF_CLUMP - 1 +
+ __ffs(size >> (8 + NTFS_MIN_LOG2_OF_CLUMP));
+ clump = 1u << align_shift;
+ }
+
+ ret = (((size + clump - 1) >> align_shift)) << align_shift;
+
+ return ret;
+}
+
+/*
+ * attr_must_be_resident
+ *
+ * Return: True if attribute must be resident.
+ */
+static inline bool attr_must_be_resident(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE type)
+{
+ const struct ATTR_DEF_ENTRY *de;
+
+ switch (type) {
+ case ATTR_STD:
+ case ATTR_NAME:
+ case ATTR_ID:
+ case ATTR_LABEL:
+ case ATTR_VOL_INFO:
+ case ATTR_ROOT:
+ case ATTR_EA_INFO:
+ return true;
+ default:
+ de = ntfs_query_def(sbi, type);
+ if (de && (de->flags & NTFS_ATTR_MUST_BE_RESIDENT))
+ return true;
+ return false;
+ }
+}
+
+/*
+ * attr_load_runs - Load all runs stored in @attr.
+ */
+int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni,
+ struct runs_tree *run, const CLST *vcn)
+{
+ int err;
+ CLST svcn = le64_to_cpu(attr->nres.svcn);
+ CLST evcn = le64_to_cpu(attr->nres.evcn);
+ u32 asize;
+ u16 run_off;
+
+ if (svcn >= evcn + 1 || run_is_mapped_full(run, svcn, evcn))
+ return 0;
+
+ if (vcn && (evcn < *vcn || *vcn < svcn))
+ return -EINVAL;
+
+ asize = le32_to_cpu(attr->size);
+ run_off = le16_to_cpu(attr->nres.run_off);
+ err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn,
+ vcn ? *vcn : svcn, Add2Ptr(attr, run_off),
+ asize - run_off);
+ if (err < 0)
+ return err;
+
+ return 0;
+}
+
+/*
+ * run_deallocate_ex - Deallocate clusters.
+ */
+static int run_deallocate_ex(struct ntfs_sb_info *sbi, struct runs_tree *run,
+ CLST vcn, CLST len, CLST *done, bool trim)
+{
+ int err = 0;
+ CLST vcn_next, vcn0 = vcn, lcn, clen, dn = 0;
+ size_t idx;
+
+ if (!len)
+ goto out;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+failed:
+ run_truncate(run, vcn0);
+ err = -EINVAL;
+ goto out;
+ }
+
+ for (;;) {
+ if (clen > len)
+ clen = len;
+
+ if (!clen) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (lcn != SPARSE_LCN) {
+ mark_as_free_ex(sbi, lcn, clen, trim);
+ dn += clen;
+ }
+
+ len -= clen;
+ if (!len)
+ break;
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ /* Save memory - don't load entire run. */
+ goto failed;
+ }
+ }
+
+out:
+ if (done)
+ *done += dn;
+
+ return err;
+}
+
+/*
+ * attr_allocate_clusters - Find free space, mark it as used and store in @run.
+ */
+int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run,
+ CLST vcn, CLST lcn, CLST len, CLST *pre_alloc,
+ enum ALLOCATE_OPT opt, CLST *alen, const size_t fr,
+ CLST *new_lcn)
+{
+ int err;
+ CLST flen, vcn0 = vcn, pre = pre_alloc ? *pre_alloc : 0;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+ size_t cnt = run->count;
+
+ for (;;) {
+ err = ntfs_look_for_free_space(sbi, lcn, len + pre, &lcn, &flen,
+ opt);
+
+ if (err == -ENOSPC && pre) {
+ pre = 0;
+ if (*pre_alloc)
+ *pre_alloc = 0;
+ continue;
+ }
+
+ if (err)
+ goto out;
+
+ if (new_lcn && vcn == vcn0)
+ *new_lcn = lcn;
+
+ /* Add new fragment into run storage. */
+ if (!run_add_entry(run, vcn, lcn, flen, opt == ALLOCATE_MFT)) {
+ /* Undo last 'ntfs_look_for_free_space' */
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+ wnd_set_free(wnd, lcn, flen);
+ up_write(&wnd->rw_lock);
+ err = -ENOMEM;
+ goto out;
+ }
+
+ vcn += flen;
+
+ if (flen >= len || opt == ALLOCATE_MFT ||
+ (fr && run->count - cnt >= fr)) {
+ *alen = vcn - vcn0;
+ return 0;
+ }
+
+ len -= flen;
+ }
+
+out:
+ /* Undo 'ntfs_look_for_free_space' */
+ if (vcn - vcn0) {
+ run_deallocate_ex(sbi, run, vcn0, vcn - vcn0, NULL, false);
+ run_truncate(run, vcn0);
+ }
+
+ return err;
+}
+
+/*
+ * attr_make_nonresident
+ *
+ * If page is not NULL - it is already contains resident data
+ * and locked (called from ni_write_frame()).
+ */
+int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+ u64 new_size, struct runs_tree *run,
+ struct ATTRIB **ins_attr, struct page *page)
+{
+ struct ntfs_sb_info *sbi;
+ struct ATTRIB *attr_s;
+ struct MFT_REC *rec;
+ u32 used, asize, rsize, aoff, align;
+ bool is_data;
+ CLST len, alen;
+ char *next;
+ int err;
+
+ if (attr->non_res) {
+ *ins_attr = attr;
+ return 0;
+ }
+
+ sbi = mi->sbi;
+ rec = mi->mrec;
+ attr_s = NULL;
+ used = le32_to_cpu(rec->used);
+ asize = le32_to_cpu(attr->size);
+ next = Add2Ptr(attr, asize);
+ aoff = PtrOffset(rec, attr);
+ rsize = le32_to_cpu(attr->res.data_size);
+ is_data = attr->type == ATTR_DATA && !attr->name_len;
+
+ align = sbi->cluster_size;
+ if (is_attr_compressed(attr))
+ align <<= COMPRESSION_UNIT;
+ len = (rsize + align - 1) >> sbi->cluster_bits;
+
+ run_init(run);
+
+ /* Make a copy of original attribute. */
+ attr_s = kmemdup(attr, asize, GFP_NOFS);
+ if (!attr_s) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (!len) {
+ /* Empty resident -> Empty nonresident. */
+ alen = 0;
+ } else {
+ const char *data = resident_data(attr);
+
+ err = attr_allocate_clusters(sbi, run, 0, 0, len, NULL,
+ ALLOCATE_DEF, &alen, 0, NULL);
+ if (err)
+ goto out1;
+
+ if (!rsize) {
+ /* Empty resident -> Non empty nonresident. */
+ } else if (!is_data) {
+ err = ntfs_sb_write_run(sbi, run, 0, data, rsize);
+ if (err)
+ goto out2;
+ } else if (!page) {
+ char *kaddr;
+
+ page = grab_cache_page(ni->vfs_inode.i_mapping, 0);
+ if (!page) {
+ err = -ENOMEM;
+ goto out2;
+ }
+ kaddr = kmap_atomic(page);
+ memcpy(kaddr, data, rsize);
+ memset(kaddr + rsize, 0, PAGE_SIZE - rsize);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ set_page_dirty(page);
+ unlock_page(page);
+ put_page(page);
+ }
+ }
+
+ /* Remove original attribute. */
+ used -= asize;
+ memmove(attr, Add2Ptr(attr, asize), used - aoff);
+ rec->used = cpu_to_le32(used);
+ mi->dirty = true;
+ if (le)
+ al_remove_le(ni, le);
+
+ err = ni_insert_nonresident(ni, attr_s->type, attr_name(attr_s),
+ attr_s->name_len, run, 0, alen,
+ attr_s->flags, &attr, NULL);
+ if (err)
+ goto out3;
+
+ kfree(attr_s);
+ attr->nres.data_size = cpu_to_le64(rsize);
+ attr->nres.valid_size = attr->nres.data_size;
+
+ *ins_attr = attr;
+
+ if (is_data)
+ ni->ni_flags &= ~NI_FLAG_RESIDENT;
+
+ /* Resident attribute becomes non resident. */
+ return 0;
+
+out3:
+ attr = Add2Ptr(rec, aoff);
+ memmove(next, attr, used - aoff);
+ memcpy(attr, attr_s, asize);
+ rec->used = cpu_to_le32(used + asize);
+ mi->dirty = true;
+out2:
+ /* Undo: do not trim new allocated clusters. */
+ run_deallocate(sbi, run, false);
+ run_close(run);
+out1:
+ kfree(attr_s);
+out:
+ return err;
+}
+
+/*
+ * attr_set_size_res - Helper for attr_set_size().
+ */
+static int attr_set_size_res(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+ u64 new_size, struct runs_tree *run,
+ struct ATTRIB **ins_attr)
+{
+ struct ntfs_sb_info *sbi = mi->sbi;
+ struct MFT_REC *rec = mi->mrec;
+ u32 used = le32_to_cpu(rec->used);
+ u32 asize = le32_to_cpu(attr->size);
+ u32 aoff = PtrOffset(rec, attr);
+ u32 rsize = le32_to_cpu(attr->res.data_size);
+ u32 tail = used - aoff - asize;
+ char *next = Add2Ptr(attr, asize);
+ s64 dsize = ALIGN(new_size, 8) - ALIGN(rsize, 8);
+
+ if (dsize < 0) {
+ memmove(next + dsize, next, tail);
+ } else if (dsize > 0) {
+ if (used + dsize > sbi->max_bytes_per_attr)
+ return attr_make_nonresident(ni, attr, le, mi, new_size,
+ run, ins_attr, NULL);
+
+ memmove(next + dsize, next, tail);
+ memset(next, 0, dsize);
+ }
+
+ if (new_size > rsize)
+ memset(Add2Ptr(resident_data(attr), rsize), 0,
+ new_size - rsize);
+
+ rec->used = cpu_to_le32(used + dsize);
+ attr->size = cpu_to_le32(asize + dsize);
+ attr->res.data_size = cpu_to_le32(new_size);
+ mi->dirty = true;
+ *ins_attr = attr;
+
+ return 0;
+}
+
+/*
+ * attr_set_size - Change the size of attribute.
+ *
+ * Extend:
+ * - Sparse/compressed: No allocated clusters.
+ * - Normal: Append allocated and preallocated new clusters.
+ * Shrink:
+ * - No deallocate if @keep_prealloc is set.
+ */
+int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ u64 new_size, const u64 *new_valid, bool keep_prealloc,
+ struct ATTRIB **ret)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 cluster_bits = sbi->cluster_bits;
+ bool is_mft =
+ ni->mi.rno == MFT_REC_MFT && type == ATTR_DATA && !name_len;
+ u64 old_valid, old_size, old_alloc, new_alloc, new_alloc_tmp;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST alen, vcn, lcn, new_alen, old_alen, svcn, evcn;
+ CLST next_svcn, pre_alloc = -1, done = 0;
+ bool is_ext;
+ u32 align;
+ struct MFT_REC *rec;
+
+again:
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len, NULL,
+ &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (!attr_b->non_res) {
+ err = attr_set_size_res(ni, attr_b, le_b, mi_b, new_size, run,
+ &attr_b);
+ if (err || !attr_b->non_res)
+ goto out;
+
+ /* Layout of records may be changed, so do a full search. */
+ goto again;
+ }
+
+ is_ext = is_attr_ext(attr_b);
+
+again_1:
+ align = sbi->cluster_size;
+
+ if (is_ext) {
+ align <<= attr_b->nres.c_unit;
+ if (is_attr_sparsed(attr_b))
+ keep_prealloc = false;
+ }
+
+ old_valid = le64_to_cpu(attr_b->nres.valid_size);
+ old_size = le64_to_cpu(attr_b->nres.data_size);
+ old_alloc = le64_to_cpu(attr_b->nres.alloc_size);
+ old_alen = old_alloc >> cluster_bits;
+
+ new_alloc = (new_size + align - 1) & ~(u64)(align - 1);
+ new_alen = new_alloc >> cluster_bits;
+
+ if (keep_prealloc && is_ext)
+ keep_prealloc = false;
+
+ if (keep_prealloc && new_size < old_size) {
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ mi_b->dirty = true;
+ goto ok;
+ }
+
+ vcn = old_alen - 1;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn = le64_to_cpu(attr_b->nres.evcn);
+
+ if (svcn <= vcn && vcn <= evcn) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, type, name, name_len, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+next_le_1:
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ }
+
+next_le:
+ rec = mi->mrec;
+
+ err = attr_load_runs(attr, ni, run, NULL);
+ if (err)
+ goto out;
+
+ if (new_size > old_size) {
+ CLST to_allocate;
+ size_t free;
+
+ if (new_alloc <= old_alloc) {
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ mi_b->dirty = true;
+ goto ok;
+ }
+
+ to_allocate = new_alen - old_alen;
+add_alloc_in_same_attr_seg:
+ lcn = 0;
+ if (is_mft) {
+ /* MFT allocates clusters from MFT zone. */
+ pre_alloc = 0;
+ } else if (is_ext) {
+ /* No preallocate for sparse/compress. */
+ pre_alloc = 0;
+ } else if (pre_alloc == -1) {
+ pre_alloc = 0;
+ if (type == ATTR_DATA && !name_len &&
+ sbi->options.prealloc) {
+ CLST new_alen2 = bytes_to_cluster(
+ sbi, get_pre_allocated(new_size));
+ pre_alloc = new_alen2 - new_alen;
+ }
+
+ /* Get the last LCN to allocate from. */
+ if (old_alen &&
+ !run_lookup_entry(run, vcn, &lcn, NULL, NULL)) {
+ lcn = SPARSE_LCN;
+ }
+
+ if (lcn == SPARSE_LCN)
+ lcn = 0;
+ else if (lcn)
+ lcn += 1;
+
+ free = wnd_zeroes(&sbi->used.bitmap);
+ if (to_allocate > free) {
+ err = -ENOSPC;
+ goto out;
+ }
+
+ if (pre_alloc && to_allocate + pre_alloc > free)
+ pre_alloc = 0;
+ }
+
+ vcn = old_alen;
+
+ if (is_ext) {
+ if (!run_add_entry(run, vcn, SPARSE_LCN, to_allocate,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ alen = to_allocate;
+ } else {
+ /* ~3 bytes per fragment. */
+ err = attr_allocate_clusters(
+ sbi, run, vcn, lcn, to_allocate, &pre_alloc,
+ is_mft ? ALLOCATE_MFT : 0, &alen,
+ is_mft ? 0
+ : (sbi->record_size -
+ le32_to_cpu(rec->used) + 8) /
+ 3 +
+ 1,
+ NULL);
+ if (err)
+ goto out;
+ }
+
+ done += alen;
+ vcn += alen;
+ if (to_allocate > alen)
+ to_allocate -= alen;
+ else
+ to_allocate = 0;
+
+pack_runs:
+ err = mi_pack_runs(mi, attr, run, vcn - svcn);
+ if (err)
+ goto out;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ new_alloc_tmp = (u64)next_svcn << cluster_bits;
+ attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp);
+ mi_b->dirty = true;
+
+ if (next_svcn >= vcn && !to_allocate) {
+ /* Normal way. Update attribute and exit. */
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ goto ok;
+ }
+
+ /* At least two MFT to avoid recursive loop. */
+ if (is_mft && next_svcn == vcn &&
+ ((u64)done << sbi->cluster_bits) >= 2 * sbi->record_size) {
+ new_size = new_alloc_tmp;
+ attr_b->nres.data_size = attr_b->nres.alloc_size;
+ goto ok;
+ }
+
+ if (le32_to_cpu(rec->used) < sbi->record_size) {
+ old_alen = next_svcn;
+ evcn = old_alen - 1;
+ goto add_alloc_in_same_attr_seg;
+ }
+
+ attr_b->nres.data_size = attr_b->nres.alloc_size;
+ if (new_alloc_tmp < old_valid)
+ attr_b->nres.valid_size = attr_b->nres.data_size;
+
+ if (type == ATTR_LIST) {
+ err = ni_expand_list(ni);
+ if (err)
+ goto out;
+ if (next_svcn < vcn)
+ goto pack_runs;
+
+ /* Layout of records is changed. */
+ goto again;
+ }
+
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ /* Layout of records is changed. */
+ }
+
+ if (next_svcn >= vcn) {
+ /* This is MFT data, repeat. */
+ goto again;
+ }
+
+ /* Insert new attribute segment. */
+ err = ni_insert_nonresident(ni, type, name, name_len, run,
+ next_svcn, vcn - next_svcn,
+ attr_b->flags, &attr, &mi);
+ if (err)
+ goto out;
+
+ if (!is_mft)
+ run_truncate_head(run, evcn + 1);
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ le_b = NULL;
+ /*
+ * Layout of records maybe changed.
+ * Find base attribute to update.
+ */
+ attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len,
+ NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ attr_b->nres.alloc_size = cpu_to_le64((u64)vcn << cluster_bits);
+ attr_b->nres.data_size = attr_b->nres.alloc_size;
+ attr_b->nres.valid_size = attr_b->nres.alloc_size;
+ mi_b->dirty = true;
+ goto again_1;
+ }
+
+ if (new_size != old_size ||
+ (new_alloc != old_alloc && !keep_prealloc)) {
+ vcn = max(svcn, new_alen);
+ new_alloc_tmp = (u64)vcn << cluster_bits;
+
+ alen = 0;
+ err = run_deallocate_ex(sbi, run, vcn, evcn - vcn + 1, &alen,
+ true);
+ if (err)
+ goto out;
+
+ run_truncate(run, vcn);
+
+ if (vcn > svcn) {
+ err = mi_pack_runs(mi, attr, run, vcn - svcn);
+ if (err)
+ goto out;
+ } else if (le && le->vcn) {
+ u16 le_sz = le16_to_cpu(le->size);
+
+ /*
+ * NOTE: List entries for one attribute are always
+ * the same size. We deal with last entry (vcn==0)
+ * and it is not first in entries array
+ * (list entry for std attribute always first).
+ * So it is safe to step back.
+ */
+ mi_remove_attr(NULL, mi, attr);
+
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz);
+ } else {
+ attr->nres.evcn = cpu_to_le64((u64)vcn - 1);
+ mi->dirty = true;
+ }
+
+ attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp);
+
+ if (vcn == new_alen) {
+ attr_b->nres.data_size = cpu_to_le64(new_size);
+ if (new_size < old_valid)
+ attr_b->nres.valid_size =
+ attr_b->nres.data_size;
+ } else {
+ if (new_alloc_tmp <=
+ le64_to_cpu(attr_b->nres.data_size))
+ attr_b->nres.data_size =
+ attr_b->nres.alloc_size;
+ if (new_alloc_tmp <
+ le64_to_cpu(attr_b->nres.valid_size))
+ attr_b->nres.valid_size =
+ attr_b->nres.alloc_size;
+ }
+
+ if (is_ext)
+ le64_sub_cpu(&attr_b->nres.total_size,
+ ((u64)alen << cluster_bits));
+
+ mi_b->dirty = true;
+
+ if (new_alloc_tmp <= new_alloc)
+ goto ok;
+
+ old_size = new_alloc_tmp;
+ vcn = svcn - 1;
+
+ if (le == le_b) {
+ attr = attr_b;
+ mi = mi_b;
+ evcn = svcn - 1;
+ svcn = 0;
+ goto next_le;
+ }
+
+ if (le->type != type || le->name_len != name_len ||
+ memcmp(le_name(le), name, name_len * sizeof(short))) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = ni_load_mi(ni, le, &mi);
+ if (err)
+ goto out;
+
+ attr = mi_find_attr(mi, NULL, type, name, name_len, &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ goto next_le_1;
+ }
+
+ok:
+ if (new_valid) {
+ __le64 valid = cpu_to_le64(min(*new_valid, new_size));
+
+ if (attr_b->nres.valid_size != valid) {
+ attr_b->nres.valid_size = valid;
+ mi_b->dirty = true;
+ }
+ }
+
+out:
+ if (!err && attr_b && ret)
+ *ret = attr_b;
+
+ /* Update inode_set_bytes. */
+ if (!err && ((type == ATTR_DATA && !name_len) ||
+ (type == ATTR_ALLOC && name == I30_NAME))) {
+ bool dirty = false;
+
+ if (ni->vfs_inode.i_size != new_size) {
+ ni->vfs_inode.i_size = new_size;
+ dirty = true;
+ }
+
+ if (attr_b && attr_b->non_res) {
+ new_alloc = le64_to_cpu(attr_b->nres.alloc_size);
+ if (inode_get_bytes(&ni->vfs_inode) != new_alloc) {
+ inode_set_bytes(&ni->vfs_inode, new_alloc);
+ dirty = true;
+ }
+ }
+
+ if (dirty) {
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+ }
+ }
+
+ return err;
+}
+
+int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
+ CLST *len, bool *new)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi;
+ u8 cluster_bits;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST hint, svcn, to_alloc, evcn1, next_svcn, asize, end;
+ u64 total_size;
+ u32 clst_per_frame;
+ bool ok;
+
+ if (new)
+ *new = false;
+
+ down_read(&ni->file.run_lock);
+ ok = run_lookup_entry(run, vcn, lcn, len, NULL);
+ up_read(&ni->file.run_lock);
+
+ if (ok && (*lcn != SPARSE_LCN || !new)) {
+ /* Normal way. */
+ return 0;
+ }
+
+ if (!clen)
+ clen = 1;
+
+ if (ok && clen > *len)
+ clen = *len;
+
+ sbi = ni->mi.sbi;
+ cluster_bits = sbi->cluster_bits;
+
+ ni_lock(ni);
+ down_write(&ni->file.run_lock);
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (!attr_b->non_res) {
+ *lcn = RESIDENT_LCN;
+ *len = 1;
+ goto out;
+ }
+
+ asize = le64_to_cpu(attr_b->nres.alloc_size) >> sbi->cluster_bits;
+ if (vcn >= asize) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ clst_per_frame = 1u << attr_b->nres.c_unit;
+ to_alloc = (clen + clst_per_frame - 1) & ~(clst_per_frame - 1);
+
+ if (vcn + to_alloc > asize)
+ to_alloc = asize - vcn;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+
+ if (le_b && (vcn < svcn || evcn1 <= vcn)) {
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ err = attr_load_runs(attr, ni, run, NULL);
+ if (err)
+ goto out;
+
+ if (!ok) {
+ ok = run_lookup_entry(run, vcn, lcn, len, NULL);
+ if (ok && (*lcn != SPARSE_LCN || !new)) {
+ /* Normal way. */
+ err = 0;
+ goto ok;
+ }
+
+ if (!ok && !new) {
+ *len = 0;
+ err = 0;
+ goto ok;
+ }
+
+ if (ok && clen > *len) {
+ clen = *len;
+ to_alloc = (clen + clst_per_frame - 1) &
+ ~(clst_per_frame - 1);
+ }
+ }
+
+ if (!is_attr_ext(attr_b)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Get the last LCN to allocate from. */
+ hint = 0;
+
+ if (vcn > evcn1) {
+ if (!run_add_entry(run, evcn1, SPARSE_LCN, vcn - evcn1,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ } else if (vcn && !run_lookup_entry(run, vcn - 1, &hint, NULL, NULL)) {
+ hint = -1;
+ }
+
+ err = attr_allocate_clusters(
+ sbi, run, vcn, hint + 1, to_alloc, NULL, 0, len,
+ (sbi->record_size - le32_to_cpu(mi->mrec->used) + 8) / 3 + 1,
+ lcn);
+ if (err)
+ goto out;
+ *new = true;
+
+ end = vcn + *len;
+
+ total_size = le64_to_cpu(attr_b->nres.total_size) +
+ ((u64)*len << cluster_bits);
+
+repack:
+ err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn);
+ if (err)
+ goto out;
+
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ inode_set_bytes(&ni->vfs_inode, total_size);
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+
+ mi_b->dirty = true;
+ mark_inode_dirty(&ni->vfs_inode);
+
+ /* Stored [vcn : next_svcn) from [vcn : end). */
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+
+ if (end <= evcn1) {
+ if (next_svcn == evcn1) {
+ /* Normal way. Update attribute and exit. */
+ goto ok;
+ }
+ /* Add new segment [next_svcn : evcn1 - next_svcn). */
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ /* Layout of records is changed. */
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL,
+ 0, NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ goto repack;
+ }
+ }
+
+ svcn = evcn1;
+
+ /* Estimate next attribute. */
+ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi);
+
+ if (attr) {
+ CLST alloc = bytes_to_cluster(
+ sbi, le64_to_cpu(attr_b->nres.alloc_size));
+ CLST evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (end < next_svcn)
+ end = next_svcn;
+ while (end > evcn) {
+ /* Remove segment [svcn : evcn). */
+ mi_remove_attr(NULL, mi, attr);
+
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (evcn + 1 >= alloc) {
+ /* Last attribute segment. */
+ evcn1 = evcn + 1;
+ goto ins_ext;
+ }
+
+ if (ni_load_mi(ni, le, &mi)) {
+ attr = NULL;
+ goto out;
+ }
+
+ attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0,
+ &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ }
+
+ if (end < svcn)
+ end = svcn;
+
+ err = attr_load_runs(attr, ni, run, &end);
+ if (err)
+ goto out;
+
+ evcn1 = evcn + 1;
+ attr->nres.svcn = cpu_to_le64(next_svcn);
+ err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn);
+ if (err)
+ goto out;
+
+ le->vcn = cpu_to_le64(next_svcn);
+ ni->attr_list.dirty = true;
+ mi->dirty = true;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+ins_ext:
+ if (evcn1 > next_svcn) {
+ err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run,
+ next_svcn, evcn1 - next_svcn,
+ attr_b->flags, &attr, &mi);
+ if (err)
+ goto out;
+ }
+ok:
+ run_truncate_around(run, vcn);
+out:
+ up_write(&ni->file.run_lock);
+ ni_unlock(ni);
+
+ return err;
+}
+
+int attr_data_read_resident(struct ntfs_inode *ni, struct page *page)
+{
+ u64 vbo;
+ struct ATTRIB *attr;
+ u32 data_size;
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, NULL);
+ if (!attr)
+ return -EINVAL;
+
+ if (attr->non_res)
+ return E_NTFS_NONRESIDENT;
+
+ vbo = page->index << PAGE_SHIFT;
+ data_size = le32_to_cpu(attr->res.data_size);
+ if (vbo < data_size) {
+ const char *data = resident_data(attr);
+ char *kaddr = kmap_atomic(page);
+ u32 use = data_size - vbo;
+
+ if (use > PAGE_SIZE)
+ use = PAGE_SIZE;
+
+ memcpy(kaddr, data + vbo, use);
+ memset(kaddr + use, 0, PAGE_SIZE - use);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ } else if (!PageUptodate(page)) {
+ zero_user_segment(page, 0, PAGE_SIZE);
+ SetPageUptodate(page);
+ }
+
+ return 0;
+}
+
+int attr_data_write_resident(struct ntfs_inode *ni, struct page *page)
+{
+ u64 vbo;
+ struct mft_inode *mi;
+ struct ATTRIB *attr;
+ u32 data_size;
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi);
+ if (!attr)
+ return -EINVAL;
+
+ if (attr->non_res) {
+ /* Return special error code to check this case. */
+ return E_NTFS_NONRESIDENT;
+ }
+
+ vbo = page->index << PAGE_SHIFT;
+ data_size = le32_to_cpu(attr->res.data_size);
+ if (vbo < data_size) {
+ char *data = resident_data(attr);
+ char *kaddr = kmap_atomic(page);
+ u32 use = data_size - vbo;
+
+ if (use > PAGE_SIZE)
+ use = PAGE_SIZE;
+ memcpy(data + vbo, kaddr, use);
+ kunmap_atomic(kaddr);
+ mi->dirty = true;
+ }
+ ni->i_valid = data_size;
+
+ return 0;
+}
+
+/*
+ * attr_load_runs_vcn - Load runs with VCN.
+ */
+int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ CLST vcn)
+{
+ struct ATTRIB *attr;
+ int err;
+ CLST svcn, evcn;
+ u16 ro;
+
+ attr = ni_find_attr(ni, NULL, NULL, type, name, name_len, &vcn, NULL);
+ if (!attr) {
+ /* Is record corrupted? */
+ return -ENOENT;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (evcn < vcn || vcn < svcn) {
+ /* Is record corrupted? */
+ return -EINVAL;
+ }
+
+ ro = le16_to_cpu(attr->nres.run_off);
+ err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn, svcn,
+ Add2Ptr(attr, ro), le32_to_cpu(attr->size) - ro);
+ if (err < 0)
+ return err;
+ return 0;
+}
+
+/*
+ * attr_load_runs_range - Load runs for given range [from to).
+ */
+int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ u64 from, u64 to)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST vcn = from >> cluster_bits;
+ CLST vcn_last = (to - 1) >> cluster_bits;
+ CLST lcn, clen;
+ int err;
+
+ for (vcn = from >> cluster_bits; vcn <= vcn_last; vcn += clen) {
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, NULL)) {
+ err = attr_load_runs_vcn(ni, type, name, name_len, run,
+ vcn);
+ if (err)
+ return err;
+ clen = 0; /* Next run_lookup_entry(vcn) must be success. */
+ }
+ }
+
+ return 0;
+}
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+/*
+ * attr_wof_frame_info
+ *
+ * Read header of Xpress/LZX file to get info about frame.
+ */
+int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct runs_tree *run, u64 frame, u64 frames,
+ u8 frame_bits, u32 *ondisk_size, u64 *vbo_data)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u64 vbo[2], off[2], wof_size;
+ u32 voff;
+ u8 bytes_per_off;
+ char *addr;
+ struct page *page;
+ int i, err;
+ __le32 *off32;
+ __le64 *off64;
+
+ if (ni->vfs_inode.i_size < 0x100000000ull) {
+ /* File starts with array of 32 bit offsets. */
+ bytes_per_off = sizeof(__le32);
+ vbo[1] = frame << 2;
+ *vbo_data = frames << 2;
+ } else {
+ /* File starts with array of 64 bit offsets. */
+ bytes_per_off = sizeof(__le64);
+ vbo[1] = frame << 3;
+ *vbo_data = frames << 3;
+ }
+
+ /*
+ * Read 4/8 bytes at [vbo - 4(8)] == offset where compressed frame starts.
+ * Read 4/8 bytes at [vbo] == offset where compressed frame ends.
+ */
+ if (!attr->non_res) {
+ if (vbo[1] + bytes_per_off > le32_to_cpu(attr->res.data_size)) {
+ ntfs_inode_err(&ni->vfs_inode, "is corrupted");
+ return -EINVAL;
+ }
+ addr = resident_data(attr);
+
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, vbo[1]);
+ off[0] = vbo[1] ? le32_to_cpu(off32[-1]) : 0;
+ off[1] = le32_to_cpu(off32[0]);
+ } else {
+ off64 = Add2Ptr(addr, vbo[1]);
+ off[0] = vbo[1] ? le64_to_cpu(off64[-1]) : 0;
+ off[1] = le64_to_cpu(off64[0]);
+ }
+
+ *vbo_data += off[0];
+ *ondisk_size = off[1] - off[0];
+ return 0;
+ }
+
+ wof_size = le64_to_cpu(attr->nres.data_size);
+ down_write(&ni->file.run_lock);
+ page = ni->file.offs_page;
+ if (!page) {
+ page = alloc_page(GFP_KERNEL);
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+ page->index = -1;
+ ni->file.offs_page = page;
+ }
+ lock_page(page);
+ addr = page_address(page);
+
+ if (vbo[1]) {
+ voff = vbo[1] & (PAGE_SIZE - 1);
+ vbo[0] = vbo[1] - bytes_per_off;
+ i = 0;
+ } else {
+ voff = 0;
+ vbo[0] = 0;
+ off[0] = 0;
+ i = 1;
+ }
+
+ do {
+ pgoff_t index = vbo[i] >> PAGE_SHIFT;
+
+ if (index != page->index) {
+ u64 from = vbo[i] & ~(u64)(PAGE_SIZE - 1);
+ u64 to = min(from + PAGE_SIZE, wof_size);
+
+ err = attr_load_runs_range(ni, ATTR_DATA, WOF_NAME,
+ ARRAY_SIZE(WOF_NAME), run,
+ from, to);
+ if (err)
+ goto out1;
+
+ err = ntfs_bio_pages(sbi, run, &page, 1, from,
+ to - from, REQ_OP_READ);
+ if (err) {
+ page->index = -1;
+ goto out1;
+ }
+ page->index = index;
+ }
+
+ if (i) {
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, voff);
+ off[1] = le32_to_cpu(*off32);
+ } else {
+ off64 = Add2Ptr(addr, voff);
+ off[1] = le64_to_cpu(*off64);
+ }
+ } else if (!voff) {
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, PAGE_SIZE - sizeof(u32));
+ off[0] = le32_to_cpu(*off32);
+ } else {
+ off64 = Add2Ptr(addr, PAGE_SIZE - sizeof(u64));
+ off[0] = le64_to_cpu(*off64);
+ }
+ } else {
+ /* Two values in one page. */
+ if (bytes_per_off == sizeof(__le32)) {
+ off32 = Add2Ptr(addr, voff);
+ off[0] = le32_to_cpu(off32[-1]);
+ off[1] = le32_to_cpu(off32[0]);
+ } else {
+ off64 = Add2Ptr(addr, voff);
+ off[0] = le64_to_cpu(off64[-1]);
+ off[1] = le64_to_cpu(off64[0]);
+ }
+ break;
+ }
+ } while (++i < 2);
+
+ *vbo_data += off[0];
+ *ondisk_size = off[1] - off[0];
+
+out1:
+ unlock_page(page);
+out:
+ up_write(&ni->file.run_lock);
+ return err;
+}
+#endif
+
+/*
+ * attr_is_frame_compressed - Used to detect compressed frame.
+ */
+int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr,
+ CLST frame, CLST *clst_data)
+{
+ int err;
+ u32 clst_frame;
+ CLST clen, lcn, vcn, alen, slen, vcn_next;
+ size_t idx;
+ struct runs_tree *run;
+
+ *clst_data = 0;
+
+ if (!is_attr_compressed(attr))
+ return 0;
+
+ if (!attr->non_res)
+ return 0;
+
+ clst_frame = 1u << attr->nres.c_unit;
+ vcn = frame * clst_frame;
+ run = &ni->file.run;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+ err = attr_load_runs_vcn(ni, attr->type, attr_name(attr),
+ attr->name_len, run, vcn);
+ if (err)
+ return err;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+ return -EINVAL;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ /* Sparsed frame. */
+ return 0;
+ }
+
+ if (clen >= clst_frame) {
+ /*
+ * The frame is not compressed 'cause
+ * it does not contain any sparse clusters.
+ */
+ *clst_data = clst_frame;
+ return 0;
+ }
+
+ alen = bytes_to_cluster(ni->mi.sbi, le64_to_cpu(attr->nres.alloc_size));
+ slen = 0;
+ *clst_data = clen;
+
+ /*
+ * The frame is compressed if *clst_data + slen >= clst_frame.
+ * Check next fragments.
+ */
+ while ((vcn += clen) < alen) {
+ vcn_next = vcn;
+
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn_next != vcn) {
+ err = attr_load_runs_vcn(ni, attr->type,
+ attr_name(attr),
+ attr->name_len, run, vcn_next);
+ if (err)
+ return err;
+ vcn = vcn_next;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+ return -EINVAL;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ slen += clen;
+ } else {
+ if (slen) {
+ /*
+ * Data_clusters + sparse_clusters =
+ * not enough for frame.
+ */
+ return -EINVAL;
+ }
+ *clst_data += clen;
+ }
+
+ if (*clst_data + slen >= clst_frame) {
+ if (!slen) {
+ /*
+ * There is no sparsed clusters in this frame
+ * so it is not compressed.
+ */
+ *clst_data = clst_frame;
+ } else {
+ /* Frame is compressed. */
+ }
+ break;
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * attr_allocate_frame - Allocate/free clusters for @frame.
+ *
+ * Assumed: down_write(&ni->file.run_lock);
+ */
+int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size,
+ u64 new_valid)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST svcn, evcn1, next_svcn, lcn, len;
+ CLST vcn, end, clst_data;
+ u64 total_size, valid_size, data_size;
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b)
+ return -ENOENT;
+
+ if (!is_attr_ext(attr_b))
+ return -EINVAL;
+
+ vcn = frame << NTFS_LZNT_CUNIT;
+ total_size = le64_to_cpu(attr_b->nres.total_size);
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+ data_size = le64_to_cpu(attr_b->nres.data_size);
+
+ if (svcn <= vcn && vcn < evcn1) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ err = attr_load_runs(attr, ni, run, NULL);
+ if (err)
+ goto out;
+
+ err = attr_is_frame_compressed(ni, attr_b, frame, &clst_data);
+ if (err)
+ goto out;
+
+ total_size -= (u64)clst_data << sbi->cluster_bits;
+
+ len = bytes_to_cluster(sbi, compr_size);
+
+ if (len == clst_data)
+ goto out;
+
+ if (len < clst_data) {
+ err = run_deallocate_ex(sbi, run, vcn + len, clst_data - len,
+ NULL, true);
+ if (err)
+ goto out;
+
+ if (!run_add_entry(run, vcn + len, SPARSE_LCN, clst_data - len,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ end = vcn + clst_data;
+ /* Run contains updated range [vcn + len : end). */
+ } else {
+ CLST alen, hint = 0;
+ /* Get the last LCN to allocate from. */
+ if (vcn + clst_data &&
+ !run_lookup_entry(run, vcn + clst_data - 1, &hint, NULL,
+ NULL)) {
+ hint = -1;
+ }
+
+ err = attr_allocate_clusters(sbi, run, vcn + clst_data,
+ hint + 1, len - clst_data, NULL, 0,
+ &alen, 0, &lcn);
+ if (err)
+ goto out;
+
+ end = vcn + len;
+ /* Run contains updated range [vcn + clst_data : end). */
+ }
+
+ total_size += (u64)len << sbi->cluster_bits;
+
+repack:
+ err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn);
+ if (err)
+ goto out;
+
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ inode_set_bytes(&ni->vfs_inode, total_size);
+
+ mi_b->dirty = true;
+ mark_inode_dirty(&ni->vfs_inode);
+
+ /* Stored [vcn : next_svcn) from [vcn : end). */
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+
+ if (end <= evcn1) {
+ if (next_svcn == evcn1) {
+ /* Normal way. Update attribute and exit. */
+ goto ok;
+ }
+ /* Add new segment [next_svcn : evcn1 - next_svcn). */
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ /* Layout of records is changed. */
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL,
+ 0, NULL, &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ goto repack;
+ }
+ }
+
+ svcn = evcn1;
+
+ /* Estimate next attribute. */
+ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi);
+
+ if (attr) {
+ CLST alloc = bytes_to_cluster(
+ sbi, le64_to_cpu(attr_b->nres.alloc_size));
+ CLST evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (end < next_svcn)
+ end = next_svcn;
+ while (end > evcn) {
+ /* Remove segment [svcn : evcn). */
+ mi_remove_attr(NULL, mi, attr);
+
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (evcn + 1 >= alloc) {
+ /* Last attribute segment. */
+ evcn1 = evcn + 1;
+ goto ins_ext;
+ }
+
+ if (ni_load_mi(ni, le, &mi)) {
+ attr = NULL;
+ goto out;
+ }
+
+ attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0,
+ &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ }
+
+ if (end < svcn)
+ end = svcn;
+
+ err = attr_load_runs(attr, ni, run, &end);
+ if (err)
+ goto out;
+
+ evcn1 = evcn + 1;
+ attr->nres.svcn = cpu_to_le64(next_svcn);
+ err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn);
+ if (err)
+ goto out;
+
+ le->vcn = cpu_to_le64(next_svcn);
+ ni->attr_list.dirty = true;
+ mi->dirty = true;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+ins_ext:
+ if (evcn1 > next_svcn) {
+ err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run,
+ next_svcn, evcn1 - next_svcn,
+ attr_b->flags, &attr, &mi);
+ if (err)
+ goto out;
+ }
+ok:
+ run_truncate_around(run, vcn);
+out:
+ if (new_valid > data_size)
+ new_valid = data_size;
+
+ valid_size = le64_to_cpu(attr_b->nres.valid_size);
+ if (new_valid != valid_size) {
+ attr_b->nres.valid_size = cpu_to_le64(valid_size);
+ mi_b->dirty = true;
+ }
+
+ return err;
+}
+
+/*
+ * attr_collapse_range - Collapse range in file.
+ */
+int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST svcn, evcn1, len, dealloc, alen;
+ CLST vcn, end;
+ u64 valid_size, data_size, alloc_size, total_size;
+ u32 mask;
+ __le16 a_flags;
+
+ if (!bytes)
+ return 0;
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b)
+ return -ENOENT;
+
+ if (!attr_b->non_res) {
+ /* Attribute is resident. Nothing to do? */
+ return 0;
+ }
+
+ data_size = le64_to_cpu(attr_b->nres.data_size);
+ alloc_size = le64_to_cpu(attr_b->nres.alloc_size);
+ a_flags = attr_b->flags;
+
+ if (is_attr_ext(attr_b)) {
+ total_size = le64_to_cpu(attr_b->nres.total_size);
+ mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1;
+ } else {
+ total_size = alloc_size;
+ mask = sbi->cluster_mask;
+ }
+
+ if ((vbo & mask) || (bytes & mask)) {
+ /* Allow to collapse only cluster aligned ranges. */
+ return -EINVAL;
+ }
+
+ if (vbo > data_size)
+ return -EINVAL;
+
+ down_write(&ni->file.run_lock);
+
+ if (vbo + bytes >= data_size) {
+ u64 new_valid = min(ni->i_valid, vbo);
+
+ /* Simple truncate file at 'vbo'. */
+ truncate_setsize(&ni->vfs_inode, vbo);
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, vbo,
+ &new_valid, true, NULL);
+
+ if (!err && new_valid < ni->i_valid)
+ ni->i_valid = new_valid;
+
+ goto out;
+ }
+
+ /*
+ * Enumerate all attribute segments and collapse.
+ */
+ alen = alloc_size >> sbi->cluster_bits;
+ vcn = vbo >> sbi->cluster_bits;
+ len = bytes >> sbi->cluster_bits;
+ end = vcn + len;
+ dealloc = 0;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+ if (svcn <= vcn && vcn < evcn1) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ for (;;) {
+ if (svcn >= end) {
+ /* Shift VCN- */
+ attr->nres.svcn = cpu_to_le64(svcn - len);
+ attr->nres.evcn = cpu_to_le64(evcn1 - 1 - len);
+ if (le) {
+ le->vcn = attr->nres.svcn;
+ ni->attr_list.dirty = true;
+ }
+ mi->dirty = true;
+ } else if (svcn < vcn || end < evcn1) {
+ CLST vcn1, eat, next_svcn;
+
+ /* Collapse a part of this attribute segment. */
+ err = attr_load_runs(attr, ni, run, &svcn);
+ if (err)
+ goto out;
+ vcn1 = max(vcn, svcn);
+ eat = min(end, evcn1) - vcn1;
+
+ err = run_deallocate_ex(sbi, run, vcn1, eat, &dealloc,
+ true);
+ if (err)
+ goto out;
+
+ if (!run_collapse_range(run, vcn1, eat)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (svcn >= vcn) {
+ /* Shift VCN */
+ attr->nres.svcn = cpu_to_le64(vcn);
+ if (le) {
+ le->vcn = attr->nres.svcn;
+ ni->attr_list.dirty = true;
+ }
+ }
+
+ err = mi_pack_runs(mi, attr, run, evcn1 - svcn - eat);
+ if (err)
+ goto out;
+
+ next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+ if (next_svcn + eat < evcn1) {
+ err = ni_insert_nonresident(
+ ni, ATTR_DATA, NULL, 0, run, next_svcn,
+ evcn1 - eat - next_svcn, a_flags, &attr,
+ &mi);
+ if (err)
+ goto out;
+
+ /* Layout of records maybe changed. */
+ attr_b = NULL;
+ le = al_find_ex(ni, NULL, ATTR_DATA, NULL, 0,
+ &next_svcn);
+ if (!le) {
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ /* Free all allocated memory. */
+ run_truncate(run, 0);
+ } else {
+ u16 le_sz;
+ u16 roff = le16_to_cpu(attr->nres.run_off);
+
+ run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn,
+ evcn1 - 1, svcn, Add2Ptr(attr, roff),
+ le32_to_cpu(attr->size) - roff);
+
+ /* Delete this attribute segment. */
+ mi_remove_attr(NULL, mi, attr);
+ if (!le)
+ break;
+
+ le_sz = le16_to_cpu(le->size);
+ if (!al_remove_le(ni, le)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (evcn1 >= alen)
+ break;
+
+ if (!svcn) {
+ /* Load next record that contains this attribute. */
+ if (ni_load_mi(ni, le, &mi)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Look for required attribute. */
+ attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL,
+ 0, &le->id);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ goto next_attr;
+ }
+ le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz);
+ }
+
+ if (evcn1 >= alen)
+ break;
+
+ attr = ni_enum_attr_ex(ni, attr, &le, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+next_attr:
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ if (!attr_b) {
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL,
+ &mi_b);
+ if (!attr_b) {
+ err = -ENOENT;
+ goto out;
+ }
+ }
+
+ data_size -= bytes;
+ valid_size = ni->i_valid;
+ if (vbo + bytes <= valid_size)
+ valid_size -= bytes;
+ else if (vbo < valid_size)
+ valid_size = vbo;
+
+ attr_b->nres.alloc_size = cpu_to_le64(alloc_size - bytes);
+ attr_b->nres.data_size = cpu_to_le64(data_size);
+ attr_b->nres.valid_size = cpu_to_le64(min(valid_size, data_size));
+ total_size -= (u64)dealloc << sbi->cluster_bits;
+ if (is_attr_ext(attr_b))
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ mi_b->dirty = true;
+
+ /* Update inode size. */
+ ni->i_valid = valid_size;
+ ni->vfs_inode.i_size = data_size;
+ inode_set_bytes(&ni->vfs_inode, total_size);
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+
+out:
+ up_write(&ni->file.run_lock);
+ if (err)
+ make_bad_inode(&ni->vfs_inode);
+
+ return err;
+}
+
+/*
+ * attr_punch_hole
+ *
+ * Not for normal files.
+ */
+int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes, u32 *frame_size)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr = NULL, *attr_b;
+ struct ATTR_LIST_ENTRY *le, *le_b;
+ struct mft_inode *mi, *mi_b;
+ CLST svcn, evcn1, vcn, len, end, alen, dealloc;
+ u64 total_size, alloc_size;
+ u32 mask;
+
+ if (!bytes)
+ return 0;
+
+ le_b = NULL;
+ attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+ if (!attr_b)
+ return -ENOENT;
+
+ if (!attr_b->non_res) {
+ u32 data_size = le32_to_cpu(attr->res.data_size);
+ u32 from, to;
+
+ if (vbo > data_size)
+ return 0;
+
+ from = vbo;
+ to = (vbo + bytes) < data_size ? (vbo + bytes) : data_size;
+ memset(Add2Ptr(resident_data(attr_b), from), 0, to - from);
+ return 0;
+ }
+
+ if (!is_attr_ext(attr_b))
+ return -EOPNOTSUPP;
+
+ alloc_size = le64_to_cpu(attr_b->nres.alloc_size);
+ total_size = le64_to_cpu(attr_b->nres.total_size);
+
+ if (vbo >= alloc_size) {
+ /* NOTE: It is allowed. */
+ return 0;
+ }
+
+ mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1;
+
+ bytes += vbo;
+ if (bytes > alloc_size)
+ bytes = alloc_size;
+ bytes -= vbo;
+
+ if ((vbo & mask) || (bytes & mask)) {
+ /* We have to zero a range(s). */
+ if (frame_size == NULL) {
+ /* Caller insists range is aligned. */
+ return -EINVAL;
+ }
+ *frame_size = mask + 1;
+ return E_NTFS_NOTALIGNED;
+ }
+
+ down_write(&ni->file.run_lock);
+ /*
+ * Enumerate all attribute segments and punch hole where necessary.
+ */
+ alen = alloc_size >> sbi->cluster_bits;
+ vcn = vbo >> sbi->cluster_bits;
+ len = bytes >> sbi->cluster_bits;
+ end = vcn + len;
+ dealloc = 0;
+
+ svcn = le64_to_cpu(attr_b->nres.svcn);
+ evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+ if (svcn <= vcn && vcn < evcn1) {
+ attr = attr_b;
+ le = le_b;
+ mi = mi_b;
+ } else if (!le_b) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ le = le_b;
+ attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+ &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ while (svcn < end) {
+ CLST vcn1, zero, dealloc2;
+
+ err = attr_load_runs(attr, ni, run, &svcn);
+ if (err)
+ goto out;
+ vcn1 = max(vcn, svcn);
+ zero = min(end, evcn1) - vcn1;
+
+ dealloc2 = dealloc;
+ err = run_deallocate_ex(sbi, run, vcn1, zero, &dealloc, true);
+ if (err)
+ goto out;
+
+ if (dealloc2 == dealloc) {
+ /* Looks like the required range is already sparsed. */
+ } else {
+ if (!run_add_entry(run, vcn1, SPARSE_LCN, zero,
+ false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = mi_pack_runs(mi, attr, run, evcn1 - svcn);
+ if (err)
+ goto out;
+ }
+ /* Free all allocated memory. */
+ run_truncate(run, 0);
+
+ if (evcn1 >= alen)
+ break;
+
+ attr = ni_enum_attr_ex(ni, attr, &le, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+ }
+
+ total_size -= (u64)dealloc << sbi->cluster_bits;
+ attr_b->nres.total_size = cpu_to_le64(total_size);
+ mi_b->dirty = true;
+
+ /* Update inode size. */
+ inode_set_bytes(&ni->vfs_inode, total_size);
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ mark_inode_dirty(&ni->vfs_inode);
+
+out:
+ up_write(&ni->file.run_lock);
+ if (err)
+ make_bad_inode(&ni->vfs_inode);
+
+ return err;
+}
diff --git a/fs/ntfs3/attrlist.c b/fs/ntfs3/attrlist.c
new file mode 100644
index 000000000000..fa32399eb517
--- /dev/null
+++ b/fs/ntfs3/attrlist.c
@@ -0,0 +1,460 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * al_is_valid_le
+ *
+ * Return: True if @le is valid.
+ */
+static inline bool al_is_valid_le(const struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le)
+{
+ if (!le || !ni->attr_list.le || !ni->attr_list.size)
+ return false;
+
+ return PtrOffset(ni->attr_list.le, le) + le16_to_cpu(le->size) <=
+ ni->attr_list.size;
+}
+
+void al_destroy(struct ntfs_inode *ni)
+{
+ run_close(&ni->attr_list.run);
+ kfree(ni->attr_list.le);
+ ni->attr_list.le = NULL;
+ ni->attr_list.size = 0;
+ ni->attr_list.dirty = false;
+}
+
+/*
+ * ntfs_load_attr_list
+ *
+ * This method makes sure that the ATTRIB list, if present,
+ * has been properly set up.
+ */
+int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr)
+{
+ int err;
+ size_t lsize;
+ void *le = NULL;
+
+ if (ni->attr_list.size)
+ return 0;
+
+ if (!attr->non_res) {
+ lsize = le32_to_cpu(attr->res.data_size);
+ le = kmalloc(al_aligned(lsize), GFP_NOFS);
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+ }
+ memcpy(le, resident_data(attr), lsize);
+ } else if (attr->nres.svcn) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ u16 run_off = le16_to_cpu(attr->nres.run_off);
+
+ lsize = le64_to_cpu(attr->nres.data_size);
+
+ run_init(&ni->attr_list.run);
+
+ err = run_unpack_ex(&ni->attr_list.run, ni->mi.sbi, ni->mi.rno,
+ 0, le64_to_cpu(attr->nres.evcn), 0,
+ Add2Ptr(attr, run_off),
+ le32_to_cpu(attr->size) - run_off);
+ if (err < 0)
+ goto out;
+
+ le = kmalloc(al_aligned(lsize), GFP_NOFS);
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = ntfs_read_run_nb(ni->mi.sbi, &ni->attr_list.run, 0, le,
+ lsize, NULL);
+ if (err)
+ goto out;
+ }
+
+ ni->attr_list.size = lsize;
+ ni->attr_list.le = le;
+
+ return 0;
+
+out:
+ ni->attr_list.le = le;
+ al_destroy(ni);
+
+ return err;
+}
+
+/*
+ * al_enumerate
+ *
+ * Return:
+ * * The next list le.
+ * * If @le is NULL then return the first le.
+ */
+struct ATTR_LIST_ENTRY *al_enumerate(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le)
+{
+ size_t off;
+ u16 sz;
+
+ if (!le) {
+ le = ni->attr_list.le;
+ } else {
+ sz = le16_to_cpu(le->size);
+ if (sz < sizeof(struct ATTR_LIST_ENTRY)) {
+ /* Impossible 'cause we should not return such le. */
+ return NULL;
+ }
+ le = Add2Ptr(le, sz);
+ }
+
+ /* Check boundary. */
+ off = PtrOffset(ni->attr_list.le, le);
+ if (off + sizeof(struct ATTR_LIST_ENTRY) > ni->attr_list.size) {
+ /* The regular end of list. */
+ return NULL;
+ }
+
+ sz = le16_to_cpu(le->size);
+
+ /* Check le for errors. */
+ if (sz < sizeof(struct ATTR_LIST_ENTRY) ||
+ off + sz > ni->attr_list.size ||
+ sz < le->name_off + le->name_len * sizeof(short)) {
+ return NULL;
+ }
+
+ return le;
+}
+
+/*
+ * al_find_le
+ *
+ * Find the first le in the list which matches type, name and VCN.
+ *
+ * Return: NULL if not found.
+ */
+struct ATTR_LIST_ENTRY *al_find_le(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le,
+ const struct ATTRIB *attr)
+{
+ CLST svcn = attr_svcn(attr);
+
+ return al_find_ex(ni, le, attr->type, attr_name(attr), attr->name_len,
+ &svcn);
+}
+
+/*
+ * al_find_ex
+ *
+ * Find the first le in the list which matches type, name and VCN.
+ *
+ * Return: NULL if not found.
+ */
+struct ATTR_LIST_ENTRY *al_find_ex(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le,
+ enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, const CLST *vcn)
+{
+ struct ATTR_LIST_ENTRY *ret = NULL;
+ u32 type_in = le32_to_cpu(type);
+
+ while ((le = al_enumerate(ni, le))) {
+ u64 le_vcn;
+ int diff = le32_to_cpu(le->type) - type_in;
+
+ /* List entries are sorted by type, name and VCN. */
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return ret;
+
+ if (le->name_len != name_len)
+ continue;
+
+ le_vcn = le64_to_cpu(le->vcn);
+ if (!le_vcn) {
+ /*
+ * Compare entry names only for entry with vcn == 0.
+ */
+ diff = ntfs_cmp_names(le_name(le), name_len, name,
+ name_len, ni->mi.sbi->upcase,
+ true);
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return ret;
+ }
+
+ if (!vcn)
+ return le;
+
+ if (*vcn == le_vcn)
+ return le;
+
+ if (*vcn < le_vcn)
+ return ret;
+
+ ret = le;
+ }
+
+ return ret;
+}
+
+/*
+ * al_find_le_to_insert
+ *
+ * Find the first list entry which matches type, name and VCN.
+ */
+static struct ATTR_LIST_ENTRY *al_find_le_to_insert(struct ntfs_inode *ni,
+ enum ATTR_TYPE type,
+ const __le16 *name,
+ u8 name_len, CLST vcn)
+{
+ struct ATTR_LIST_ENTRY *le = NULL, *prev;
+ u32 type_in = le32_to_cpu(type);
+
+ /* List entries are sorted by type, name and VCN. */
+ while ((le = al_enumerate(ni, prev = le))) {
+ int diff = le32_to_cpu(le->type) - type_in;
+
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return le;
+
+ if (!le->vcn) {
+ /*
+ * Compare entry names only for entry with vcn == 0.
+ */
+ diff = ntfs_cmp_names(le_name(le), le->name_len, name,
+ name_len, ni->mi.sbi->upcase,
+ true);
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return le;
+ }
+
+ if (le64_to_cpu(le->vcn) >= vcn)
+ return le;
+ }
+
+ return prev ? Add2Ptr(prev, le16_to_cpu(prev->size)) : ni->attr_list.le;
+}
+
+/*
+ * al_add_le
+ *
+ * Add an "attribute list entry" to the list.
+ */
+int al_add_le(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, CLST svcn, __le16 id, const struct MFT_REF *ref,
+ struct ATTR_LIST_ENTRY **new_le)
+{
+ int err;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ size_t off;
+ u16 sz;
+ size_t asize, new_asize, old_size;
+ u64 new_size;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ /*
+ * Compute the size of the new 'le'
+ */
+ sz = le_size(name_len);
+ old_size = al->size;
+ new_size = old_size + sz;
+ asize = al_aligned(old_size);
+ new_asize = al_aligned(new_size);
+
+ /* Scan forward to the point at which the new 'le' should be inserted. */
+ le = al_find_le_to_insert(ni, type, name, name_len, svcn);
+ off = PtrOffset(al->le, le);
+
+ if (new_size > asize) {
+ void *ptr = kmalloc(new_asize, GFP_NOFS);
+
+ if (!ptr)
+ return -ENOMEM;
+
+ memcpy(ptr, al->le, off);
+ memcpy(Add2Ptr(ptr, off + sz), le, old_size - off);
+ le = Add2Ptr(ptr, off);
+ kfree(al->le);
+ al->le = ptr;
+ } else {
+ memmove(Add2Ptr(le, sz), le, old_size - off);
+ }
+ *new_le = le;
+
+ al->size = new_size;
+
+ le->type = type;
+ le->size = cpu_to_le16(sz);
+ le->name_len = name_len;
+ le->name_off = offsetof(struct ATTR_LIST_ENTRY, name);
+ le->vcn = cpu_to_le64(svcn);
+ le->ref = *ref;
+ le->id = id;
+ memcpy(le->name, name, sizeof(short) * name_len);
+
+ err = attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, new_size,
+ &new_size, true, &attr);
+ if (err) {
+ /* Undo memmove above. */
+ memmove(le, Add2Ptr(le, sz), old_size - off);
+ al->size = old_size;
+ return err;
+ }
+
+ al->dirty = true;
+
+ if (attr && attr->non_res) {
+ err = ntfs_sb_write_run(ni->mi.sbi, &al->run, 0, al->le,
+ al->size);
+ if (err)
+ return err;
+ al->dirty = false;
+ }
+
+ return 0;
+}
+
+/*
+ * al_remove_le - Remove @le from attribute list.
+ */
+bool al_remove_le(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le)
+{
+ u16 size;
+ size_t off;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ if (!al_is_valid_le(ni, le))
+ return false;
+
+ /* Save on stack the size of 'le' */
+ size = le16_to_cpu(le->size);
+ off = PtrOffset(al->le, le);
+
+ memmove(le, Add2Ptr(le, size), al->size - (off + size));
+
+ al->size -= size;
+ al->dirty = true;
+
+ return true;
+}
+
+/*
+ * al_delete_le - Delete first le from the list which matches its parameters.
+ */
+bool al_delete_le(struct ntfs_inode *ni, enum ATTR_TYPE type, CLST vcn,
+ const __le16 *name, size_t name_len,
+ const struct MFT_REF *ref)
+{
+ u16 size;
+ struct ATTR_LIST_ENTRY *le;
+ size_t off;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ /* Scan forward to the first le that matches the input. */
+ le = al_find_ex(ni, NULL, type, name, name_len, &vcn);
+ if (!le)
+ return false;
+
+ off = PtrOffset(al->le, le);
+
+next:
+ if (off >= al->size)
+ return false;
+ if (le->type != type)
+ return false;
+ if (le->name_len != name_len)
+ return false;
+ if (name_len && ntfs_cmp_names(le_name(le), name_len, name, name_len,
+ ni->mi.sbi->upcase, true))
+ return false;
+ if (le64_to_cpu(le->vcn) != vcn)
+ return false;
+
+ /*
+ * The caller specified a segment reference, so we have to
+ * scan through the matching entries until we find that segment
+ * reference or we run of matching entries.
+ */
+ if (ref && memcmp(ref, &le->ref, sizeof(*ref))) {
+ off += le16_to_cpu(le->size);
+ le = Add2Ptr(al->le, off);
+ goto next;
+ }
+
+ /* Save on stack the size of 'le'. */
+ size = le16_to_cpu(le->size);
+ /* Delete the le. */
+ memmove(le, Add2Ptr(le, size), al->size - (off + size));
+
+ al->size -= size;
+ al->dirty = true;
+
+ return true;
+}
+
+int al_update(struct ntfs_inode *ni)
+{
+ int err;
+ struct ATTRIB *attr;
+ typeof(ni->attr_list) *al = &ni->attr_list;
+
+ if (!al->dirty || !al->size)
+ return 0;
+
+ /*
+ * Attribute list increased on demand in al_add_le.
+ * Attribute list decreased here.
+ */
+ err = attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, al->size, NULL,
+ false, &attr);
+ if (err)
+ goto out;
+
+ if (!attr->non_res) {
+ memcpy(resident_data(attr), al->le, al->size);
+ } else {
+ err = ntfs_sb_write_run(ni->mi.sbi, &al->run, 0, al->le,
+ al->size);
+ if (err)
+ goto out;
+
+ attr->nres.valid_size = attr->nres.data_size;
+ }
+
+ ni->mi.dirty = true;
+ al->dirty = false;
+
+out:
+ return err;
+}
diff --git a/fs/ntfs3/bitfunc.c b/fs/ntfs3/bitfunc.c
new file mode 100644
index 000000000000..ce304d40b5e1
--- /dev/null
+++ b/fs/ntfs3/bitfunc.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+#define BITS_IN_SIZE_T (sizeof(size_t) * 8)
+
+/*
+ * fill_mask[i] - first i bits are '1' , i = 0,1,2,3,4,5,6,7,8
+ * fill_mask[i] = 0xFF >> (8-i)
+ */
+static const u8 fill_mask[] = { 0x00, 0x01, 0x03, 0x07, 0x0F,
+ 0x1F, 0x3F, 0x7F, 0xFF };
+
+/*
+ * zero_mask[i] - first i bits are '0' , i = 0,1,2,3,4,5,6,7,8
+ * zero_mask[i] = 0xFF << i
+ */
+static const u8 zero_mask[] = { 0xFF, 0xFE, 0xFC, 0xF8, 0xF0,
+ 0xE0, 0xC0, 0x80, 0x00 };
+
+/*
+ * are_bits_clear
+ *
+ * Return: True if all bits [bit, bit+nbits) are zeros "0".
+ */
+bool are_bits_clear(const ulong *lmap, size_t bit, size_t nbits)
+{
+ size_t pos = bit & 7;
+ const u8 *map = (u8 *)lmap + (bit >> 3);
+
+ if (pos) {
+ if (8 - pos >= nbits)
+ return !nbits || !(*map & fill_mask[pos + nbits] &
+ zero_mask[pos]);
+
+ if (*map++ & zero_mask[pos])
+ return false;
+ nbits -= 8 - pos;
+ }
+
+ pos = ((size_t)map) & (sizeof(size_t) - 1);
+ if (pos) {
+ pos = sizeof(size_t) - pos;
+ if (nbits >= pos * 8) {
+ for (nbits -= pos * 8; pos; pos--, map++) {
+ if (*map)
+ return false;
+ }
+ }
+ }
+
+ for (pos = nbits / BITS_IN_SIZE_T; pos; pos--, map += sizeof(size_t)) {
+ if (*((size_t *)map))
+ return false;
+ }
+
+ for (pos = (nbits % BITS_IN_SIZE_T) >> 3; pos; pos--, map++) {
+ if (*map)
+ return false;
+ }
+
+ pos = nbits & 7;
+ if (pos && (*map & fill_mask[pos]))
+ return false;
+
+ return true;
+}
+
+/*
+ * are_bits_set
+ *
+ * Return: True if all bits [bit, bit+nbits) are ones "1".
+ */
+bool are_bits_set(const ulong *lmap, size_t bit, size_t nbits)
+{
+ u8 mask;
+ size_t pos = bit & 7;
+ const u8 *map = (u8 *)lmap + (bit >> 3);
+
+ if (pos) {
+ if (8 - pos >= nbits) {
+ mask = fill_mask[pos + nbits] & zero_mask[pos];
+ return !nbits || (*map & mask) == mask;
+ }
+
+ mask = zero_mask[pos];
+ if ((*map++ & mask) != mask)
+ return false;
+ nbits -= 8 - pos;
+ }
+
+ pos = ((size_t)map) & (sizeof(size_t) - 1);
+ if (pos) {
+ pos = sizeof(size_t) - pos;
+ if (nbits >= pos * 8) {
+ for (nbits -= pos * 8; pos; pos--, map++) {
+ if (*map != 0xFF)
+ return false;
+ }
+ }
+ }
+
+ for (pos = nbits / BITS_IN_SIZE_T; pos; pos--, map += sizeof(size_t)) {
+ if (*((size_t *)map) != MINUS_ONE_T)
+ return false;
+ }
+
+ for (pos = (nbits % BITS_IN_SIZE_T) >> 3; pos; pos--, map++) {
+ if (*map != 0xFF)
+ return false;
+ }
+
+ pos = nbits & 7;
+ if (pos) {
+ u8 mask = fill_mask[pos];
+
+ if ((*map & mask) != mask)
+ return false;
+ }
+
+ return true;
+}
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
new file mode 100644
index 000000000000..831501555009
--- /dev/null
+++ b/fs/ntfs3/bitmap.c
@@ -0,0 +1,1493 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * This code builds two trees of free clusters extents.
+ * Trees are sorted by start of extent and by length of extent.
+ * NTFS_MAX_WND_EXTENTS defines the maximum number of elements in trees.
+ * In extreme case code reads on-disk bitmap to find free clusters.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * Maximum number of extents in tree.
+ */
+#define NTFS_MAX_WND_EXTENTS (32u * 1024u)
+
+struct rb_node_key {
+ struct rb_node node;
+ size_t key;
+};
+
+struct e_node {
+ struct rb_node_key start; /* Tree sorted by start. */
+ struct rb_node_key count; /* Tree sorted by len. */
+};
+
+static int wnd_rescan(struct wnd_bitmap *wnd);
+static struct buffer_head *wnd_map(struct wnd_bitmap *wnd, size_t iw);
+static bool wnd_is_free_hlp(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+
+static struct kmem_cache *ntfs_enode_cachep;
+
+int __init ntfs3_init_bitmap(void)
+{
+ ntfs_enode_cachep =
+ kmem_cache_create("ntfs3_enode_cache", sizeof(struct e_node), 0,
+ SLAB_RECLAIM_ACCOUNT, NULL);
+ return ntfs_enode_cachep ? 0 : -ENOMEM;
+}
+
+void ntfs3_exit_bitmap(void)
+{
+ kmem_cache_destroy(ntfs_enode_cachep);
+}
+
+static inline u32 wnd_bits(const struct wnd_bitmap *wnd, size_t i)
+{
+ return i + 1 == wnd->nwnd ? wnd->bits_last : wnd->sb->s_blocksize * 8;
+}
+
+/*
+ * wnd_scan
+ *
+ * b_pos + b_len - biggest fragment.
+ * Scan range [wpos wbits) window @buf.
+ *
+ * Return: -1 if not found.
+ */
+static size_t wnd_scan(const ulong *buf, size_t wbit, u32 wpos, u32 wend,
+ size_t to_alloc, size_t *prev_tail, size_t *b_pos,
+ size_t *b_len)
+{
+ while (wpos < wend) {
+ size_t free_len;
+ u32 free_bits, end;
+ u32 used = find_next_zero_bit(buf, wend, wpos);
+
+ if (used >= wend) {
+ if (*b_len < *prev_tail) {
+ *b_pos = wbit - *prev_tail;
+ *b_len = *prev_tail;
+ }
+
+ *prev_tail = 0;
+ return -1;
+ }
+
+ if (used > wpos) {
+ wpos = used;
+ if (*b_len < *prev_tail) {
+ *b_pos = wbit - *prev_tail;
+ *b_len = *prev_tail;
+ }
+
+ *prev_tail = 0;
+ }
+
+ /*
+ * Now we have a fragment [wpos, wend) staring with 0.
+ */
+ end = wpos + to_alloc - *prev_tail;
+ free_bits = find_next_bit(buf, min(end, wend), wpos);
+
+ free_len = *prev_tail + free_bits - wpos;
+
+ if (*b_len < free_len) {
+ *b_pos = wbit + wpos - *prev_tail;
+ *b_len = free_len;
+ }
+
+ if (free_len >= to_alloc)
+ return wbit + wpos - *prev_tail;
+
+ if (free_bits >= wend) {
+ *prev_tail += free_bits - wpos;
+ return -1;
+ }
+
+ wpos = free_bits + 1;
+
+ *prev_tail = 0;
+ }
+
+ return -1;
+}
+
+/*
+ * wnd_close - Frees all resources.
+ */
+void wnd_close(struct wnd_bitmap *wnd)
+{
+ struct rb_node *node, *next;
+
+ kfree(wnd->free_bits);
+ run_close(&wnd->run);
+
+ node = rb_first(&wnd->start_tree);
+
+ while (node) {
+ next = rb_next(node);
+ rb_erase(node, &wnd->start_tree);
+ kmem_cache_free(ntfs_enode_cachep,
+ rb_entry(node, struct e_node, start.node));
+ node = next;
+ }
+}
+
+static struct rb_node *rb_lookup(struct rb_root *root, size_t v)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *r = NULL;
+
+ while (*p) {
+ struct rb_node_key *k;
+
+ k = rb_entry(*p, struct rb_node_key, node);
+ if (v < k->key) {
+ p = &(*p)->rb_left;
+ } else if (v > k->key) {
+ r = &k->node;
+ p = &(*p)->rb_right;
+ } else {
+ return &k->node;
+ }
+ }
+
+ return r;
+}
+
+/*
+ * rb_insert_count - Helper function to insert special kind of 'count' tree.
+ */
+static inline bool rb_insert_count(struct rb_root *root, struct e_node *e)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *parent = NULL;
+ size_t e_ckey = e->count.key;
+ size_t e_skey = e->start.key;
+
+ while (*p) {
+ struct e_node *k =
+ rb_entry(parent = *p, struct e_node, count.node);
+
+ if (e_ckey > k->count.key) {
+ p = &(*p)->rb_left;
+ } else if (e_ckey < k->count.key) {
+ p = &(*p)->rb_right;
+ } else if (e_skey < k->start.key) {
+ p = &(*p)->rb_left;
+ } else if (e_skey > k->start.key) {
+ p = &(*p)->rb_right;
+ } else {
+ WARN_ON(1);
+ return false;
+ }
+ }
+
+ rb_link_node(&e->count.node, parent, p);
+ rb_insert_color(&e->count.node, root);
+ return true;
+}
+
+/*
+ * rb_insert_start - Helper function to insert special kind of 'count' tree.
+ */
+static inline bool rb_insert_start(struct rb_root *root, struct e_node *e)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *parent = NULL;
+ size_t e_skey = e->start.key;
+
+ while (*p) {
+ struct e_node *k;
+
+ parent = *p;
+
+ k = rb_entry(parent, struct e_node, start.node);
+ if (e_skey < k->start.key) {
+ p = &(*p)->rb_left;
+ } else if (e_skey > k->start.key) {
+ p = &(*p)->rb_right;
+ } else {
+ WARN_ON(1);
+ return false;
+ }
+ }
+
+ rb_link_node(&e->start.node, parent, p);
+ rb_insert_color(&e->start.node, root);
+ return true;
+}
+
+/*
+ * wnd_add_free_ext - Adds a new extent of free space.
+ * @build: 1 when building tree.
+ */
+static void wnd_add_free_ext(struct wnd_bitmap *wnd, size_t bit, size_t len,
+ bool build)
+{
+ struct e_node *e, *e0 = NULL;
+ size_t ib, end_in = bit + len;
+ struct rb_node *n;
+
+ if (build) {
+ /* Use extent_min to filter too short extents. */
+ if (wnd->count >= NTFS_MAX_WND_EXTENTS &&
+ len <= wnd->extent_min) {
+ wnd->uptodated = -1;
+ return;
+ }
+ } else {
+ /* Try to find extent before 'bit'. */
+ n = rb_lookup(&wnd->start_tree, bit);
+
+ if (!n) {
+ n = rb_first(&wnd->start_tree);
+ } else {
+ e = rb_entry(n, struct e_node, start.node);
+ n = rb_next(n);
+ if (e->start.key + e->count.key == bit) {
+ /* Remove left. */
+ bit = e->start.key;
+ len += e->count.key;
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ e0 = e;
+ }
+ }
+
+ while (n) {
+ size_t next_end;
+
+ e = rb_entry(n, struct e_node, start.node);
+ next_end = e->start.key + e->count.key;
+ if (e->start.key > end_in)
+ break;
+
+ /* Remove right. */
+ n = rb_next(n);
+ len += next_end - end_in;
+ end_in = next_end;
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+
+ if (!e0)
+ e0 = e;
+ else
+ kmem_cache_free(ntfs_enode_cachep, e);
+ }
+
+ if (wnd->uptodated != 1) {
+ /* Check bits before 'bit'. */
+ ib = wnd->zone_bit == wnd->zone_end ||
+ bit < wnd->zone_end
+ ? 0
+ : wnd->zone_end;
+
+ while (bit > ib && wnd_is_free_hlp(wnd, bit - 1, 1)) {
+ bit -= 1;
+ len += 1;
+ }
+
+ /* Check bits after 'end_in'. */
+ ib = wnd->zone_bit == wnd->zone_end ||
+ end_in > wnd->zone_bit
+ ? wnd->nbits
+ : wnd->zone_bit;
+
+ while (end_in < ib && wnd_is_free_hlp(wnd, end_in, 1)) {
+ end_in += 1;
+ len += 1;
+ }
+ }
+ }
+ /* Insert new fragment. */
+ if (wnd->count >= NTFS_MAX_WND_EXTENTS) {
+ if (e0)
+ kmem_cache_free(ntfs_enode_cachep, e0);
+
+ wnd->uptodated = -1;
+
+ /* Compare with smallest fragment. */
+ n = rb_last(&wnd->count_tree);
+ e = rb_entry(n, struct e_node, count.node);
+ if (len <= e->count.key)
+ goto out; /* Do not insert small fragments. */
+
+ if (build) {
+ struct e_node *e2;
+
+ n = rb_prev(n);
+ e2 = rb_entry(n, struct e_node, count.node);
+ /* Smallest fragment will be 'e2->count.key'. */
+ wnd->extent_min = e2->count.key;
+ }
+
+ /* Replace smallest fragment by new one. */
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ } else {
+ e = e0 ? e0 : kmem_cache_alloc(ntfs_enode_cachep, GFP_ATOMIC);
+ if (!e) {
+ wnd->uptodated = -1;
+ goto out;
+ }
+
+ if (build && len <= wnd->extent_min)
+ wnd->extent_min = len;
+ }
+ e->start.key = bit;
+ e->count.key = len;
+ if (len > wnd->extent_max)
+ wnd->extent_max = len;
+
+ rb_insert_start(&wnd->start_tree, e);
+ rb_insert_count(&wnd->count_tree, e);
+ wnd->count += 1;
+
+out:;
+}
+
+/*
+ * wnd_remove_free_ext - Remove a run from the cached free space.
+ */
+static void wnd_remove_free_ext(struct wnd_bitmap *wnd, size_t bit, size_t len)
+{
+ struct rb_node *n, *n3;
+ struct e_node *e, *e3;
+ size_t end_in = bit + len;
+ size_t end3, end, new_key, new_len, max_new_len;
+
+ /* Try to find extent before 'bit'. */
+ n = rb_lookup(&wnd->start_tree, bit);
+
+ if (!n)
+ return;
+
+ e = rb_entry(n, struct e_node, start.node);
+ end = e->start.key + e->count.key;
+
+ new_key = new_len = 0;
+ len = e->count.key;
+
+ /* Range [bit,end_in) must be inside 'e' or outside 'e' and 'n'. */
+ if (e->start.key > bit)
+ ;
+ else if (end_in <= end) {
+ /* Range [bit,end_in) inside 'e'. */
+ new_key = end_in;
+ new_len = end - end_in;
+ len = bit - e->start.key;
+ } else if (bit > end) {
+ bool bmax = false;
+
+ n3 = rb_next(n);
+
+ while (n3) {
+ e3 = rb_entry(n3, struct e_node, start.node);
+ if (e3->start.key >= end_in)
+ break;
+
+ if (e3->count.key == wnd->extent_max)
+ bmax = true;
+
+ end3 = e3->start.key + e3->count.key;
+ if (end3 > end_in) {
+ e3->start.key = end_in;
+ rb_erase(&e3->count.node, &wnd->count_tree);
+ e3->count.key = end3 - end_in;
+ rb_insert_count(&wnd->count_tree, e3);
+ break;
+ }
+
+ n3 = rb_next(n3);
+ rb_erase(&e3->start.node, &wnd->start_tree);
+ rb_erase(&e3->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ kmem_cache_free(ntfs_enode_cachep, e3);
+ }
+ if (!bmax)
+ return;
+ n3 = rb_first(&wnd->count_tree);
+ wnd->extent_max =
+ n3 ? rb_entry(n3, struct e_node, count.node)->count.key
+ : 0;
+ return;
+ }
+
+ if (e->count.key != wnd->extent_max) {
+ ;
+ } else if (rb_prev(&e->count.node)) {
+ ;
+ } else {
+ n3 = rb_next(&e->count.node);
+ max_new_len = len > new_len ? len : new_len;
+ if (!n3) {
+ wnd->extent_max = max_new_len;
+ } else {
+ e3 = rb_entry(n3, struct e_node, count.node);
+ wnd->extent_max = max(e3->count.key, max_new_len);
+ }
+ }
+
+ if (!len) {
+ if (new_len) {
+ e->start.key = new_key;
+ rb_erase(&e->count.node, &wnd->count_tree);
+ e->count.key = new_len;
+ rb_insert_count(&wnd->count_tree, e);
+ } else {
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ kmem_cache_free(ntfs_enode_cachep, e);
+ }
+ goto out;
+ }
+ rb_erase(&e->count.node, &wnd->count_tree);
+ e->count.key = len;
+ rb_insert_count(&wnd->count_tree, e);
+
+ if (!new_len)
+ goto out;
+
+ if (wnd->count >= NTFS_MAX_WND_EXTENTS) {
+ wnd->uptodated = -1;
+
+ /* Get minimal extent. */
+ e = rb_entry(rb_last(&wnd->count_tree), struct e_node,
+ count.node);
+ if (e->count.key > new_len)
+ goto out;
+
+ /* Replace minimum. */
+ rb_erase(&e->start.node, &wnd->start_tree);
+ rb_erase(&e->count.node, &wnd->count_tree);
+ wnd->count -= 1;
+ } else {
+ e = kmem_cache_alloc(ntfs_enode_cachep, GFP_ATOMIC);
+ if (!e)
+ wnd->uptodated = -1;
+ }
+
+ if (e) {
+ e->start.key = new_key;
+ e->count.key = new_len;
+ rb_insert_start(&wnd->start_tree, e);
+ rb_insert_count(&wnd->count_tree, e);
+ wnd->count += 1;
+ }
+
+out:
+ if (!wnd->count && 1 != wnd->uptodated)
+ wnd_rescan(wnd);
+}
+
+/*
+ * wnd_rescan - Scan all bitmap. Used while initialization.
+ */
+static int wnd_rescan(struct wnd_bitmap *wnd)
+{
+ int err = 0;
+ size_t prev_tail = 0;
+ struct super_block *sb = wnd->sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ u64 lbo, len = 0;
+ u32 blocksize = sb->s_blocksize;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 used, frb;
+ const ulong *buf;
+ size_t wpos, wbit, iw, vbo;
+ struct buffer_head *bh = NULL;
+ CLST lcn, clen;
+
+ wnd->uptodated = 0;
+ wnd->extent_max = 0;
+ wnd->extent_min = MINUS_ONE_T;
+ wnd->total_zeroes = 0;
+
+ vbo = 0;
+
+ for (iw = 0; iw < wnd->nwnd; iw++) {
+ if (iw + 1 == wnd->nwnd)
+ wbits = wnd->bits_last;
+
+ if (wnd->inited) {
+ if (!wnd->free_bits[iw]) {
+ /* All ones. */
+ if (prev_tail) {
+ wnd_add_free_ext(wnd,
+ vbo * 8 - prev_tail,
+ prev_tail, true);
+ prev_tail = 0;
+ }
+ goto next_wnd;
+ }
+ if (wbits == wnd->free_bits[iw]) {
+ /* All zeroes. */
+ prev_tail += wbits;
+ wnd->total_zeroes += wbits;
+ goto next_wnd;
+ }
+ }
+
+ if (!len) {
+ u32 off = vbo & sbi->cluster_mask;
+
+ if (!run_lookup_entry(&wnd->run, vbo >> cluster_bits,
+ &lcn, &clen, NULL)) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+ }
+
+ bh = ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+ if (!bh) {
+ err = -EIO;
+ goto out;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ used = __bitmap_weight(buf, wbits);
+ if (used < wbits) {
+ frb = wbits - used;
+ wnd->free_bits[iw] = frb;
+ wnd->total_zeroes += frb;
+ }
+
+ wpos = 0;
+ wbit = vbo * 8;
+
+ if (wbit + wbits > wnd->nbits)
+ wbits = wnd->nbits - wbit;
+
+ do {
+ used = find_next_zero_bit(buf, wbits, wpos);
+
+ if (used > wpos && prev_tail) {
+ wnd_add_free_ext(wnd, wbit + wpos - prev_tail,
+ prev_tail, true);
+ prev_tail = 0;
+ }
+
+ wpos = used;
+
+ if (wpos >= wbits) {
+ /* No free blocks. */
+ prev_tail = 0;
+ break;
+ }
+
+ frb = find_next_bit(buf, wbits, wpos);
+ if (frb >= wbits) {
+ /* Keep last free block. */
+ prev_tail += frb - wpos;
+ break;
+ }
+
+ wnd_add_free_ext(wnd, wbit + wpos - prev_tail,
+ frb + prev_tail - wpos, true);
+
+ /* Skip free block and first '1'. */
+ wpos = frb + 1;
+ /* Reset previous tail. */
+ prev_tail = 0;
+ } while (wpos < wbits);
+
+next_wnd:
+
+ if (bh)
+ put_bh(bh);
+ bh = NULL;
+
+ vbo += blocksize;
+ if (len) {
+ len -= blocksize;
+ lbo += blocksize;
+ }
+ }
+
+ /* Add last block. */
+ if (prev_tail)
+ wnd_add_free_ext(wnd, wnd->nbits - prev_tail, prev_tail, true);
+
+ /*
+ * Before init cycle wnd->uptodated was 0.
+ * If any errors or limits occurs while initialization then
+ * wnd->uptodated will be -1.
+ * If 'uptodated' is still 0 then Tree is really updated.
+ */
+ if (!wnd->uptodated)
+ wnd->uptodated = 1;
+
+ if (wnd->zone_bit != wnd->zone_end) {
+ size_t zlen = wnd->zone_end - wnd->zone_bit;
+
+ wnd->zone_end = wnd->zone_bit;
+ wnd_zone_set(wnd, wnd->zone_bit, zlen);
+ }
+
+out:
+ return err;
+}
+
+int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
+{
+ int err;
+ u32 blocksize = sb->s_blocksize;
+ u32 wbits = blocksize * 8;
+
+ init_rwsem(&wnd->rw_lock);
+
+ wnd->sb = sb;
+ wnd->nbits = nbits;
+ wnd->total_zeroes = nbits;
+ wnd->extent_max = MINUS_ONE_T;
+ wnd->zone_bit = wnd->zone_end = 0;
+ wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
+ wnd->bits_last = nbits & (wbits - 1);
+ if (!wnd->bits_last)
+ wnd->bits_last = wbits;
+
+ wnd->free_bits = kcalloc(wnd->nwnd, sizeof(u16), GFP_NOFS);
+ if (!wnd->free_bits)
+ return -ENOMEM;
+
+ err = wnd_rescan(wnd);
+ if (err)
+ return err;
+
+ wnd->inited = true;
+
+ return 0;
+}
+
+/*
+ * wnd_map - Call sb_bread for requested window.
+ */
+static struct buffer_head *wnd_map(struct wnd_bitmap *wnd, size_t iw)
+{
+ size_t vbo;
+ CLST lcn, clen;
+ struct super_block *sb = wnd->sb;
+ struct ntfs_sb_info *sbi;
+ struct buffer_head *bh;
+ u64 lbo;
+
+ sbi = sb->s_fs_info;
+ vbo = (u64)iw << sb->s_blocksize_bits;
+
+ if (!run_lookup_entry(&wnd->run, vbo >> sbi->cluster_bits, &lcn, &clen,
+ NULL)) {
+ return ERR_PTR(-ENOENT);
+ }
+
+ lbo = ((u64)lcn << sbi->cluster_bits) + (vbo & sbi->cluster_mask);
+
+ bh = ntfs_bread(wnd->sb, lbo >> sb->s_blocksize_bits);
+ if (!bh)
+ return ERR_PTR(-EIO);
+
+ return bh;
+}
+
+/*
+ * wnd_set_free - Mark the bits range from bit to bit + bits as free.
+ */
+int wnd_set_free(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ int err = 0;
+ struct super_block *sb = wnd->sb;
+ size_t bits0 = bits;
+ u32 wbits = 8 * sb->s_blocksize;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbit = bit & (wbits - 1);
+ struct buffer_head *bh;
+
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+ ulong *buf;
+
+ if (iw + 1 == wnd->nwnd)
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ err = PTR_ERR(bh);
+ break;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ lock_buffer(bh);
+
+ __bitmap_clear(buf, wbit, op);
+
+ wnd->free_bits[iw] += op;
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ put_bh(bh);
+
+ wnd->total_zeroes += op;
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+
+ wnd_add_free_ext(wnd, bit, bits0, false);
+
+ return err;
+}
+
+/*
+ * wnd_set_used - Mark the bits range from bit to bit + bits as used.
+ */
+int wnd_set_used(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ int err = 0;
+ struct super_block *sb = wnd->sb;
+ size_t bits0 = bits;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 wbit = bit & (wbits - 1);
+ struct buffer_head *bh;
+
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+ ulong *buf;
+
+ if (unlikely(iw + 1 == wnd->nwnd))
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ err = PTR_ERR(bh);
+ break;
+ }
+ buf = (ulong *)bh->b_data;
+
+ lock_buffer(bh);
+
+ __bitmap_set(buf, wbit, op);
+ wnd->free_bits[iw] -= op;
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ put_bh(bh);
+
+ wnd->total_zeroes -= op;
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+
+ if (!RB_EMPTY_ROOT(&wnd->start_tree))
+ wnd_remove_free_ext(wnd, bit, bits0);
+
+ return err;
+}
+
+/*
+ * wnd_is_free_hlp
+ *
+ * Return: True if all clusters [bit, bit+bits) are free (bitmap only).
+ */
+static bool wnd_is_free_hlp(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ struct super_block *sb = wnd->sb;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 wbit = bit & (wbits - 1);
+
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+
+ if (unlikely(iw + 1 == wnd->nwnd))
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ if (wbits != wnd->free_bits[iw]) {
+ bool ret;
+ struct buffer_head *bh = wnd_map(wnd, iw);
+
+ if (IS_ERR(bh))
+ return false;
+
+ ret = are_bits_clear((ulong *)bh->b_data, wbit, op);
+
+ put_bh(bh);
+ if (!ret)
+ return false;
+ }
+
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+
+ return true;
+}
+
+/*
+ * wnd_is_free
+ *
+ * Return: True if all clusters [bit, bit+bits) are free.
+ */
+bool wnd_is_free(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ bool ret;
+ struct rb_node *n;
+ size_t end;
+ struct e_node *e;
+
+ if (RB_EMPTY_ROOT(&wnd->start_tree))
+ goto use_wnd;
+
+ n = rb_lookup(&wnd->start_tree, bit);
+ if (!n)
+ goto use_wnd;
+
+ e = rb_entry(n, struct e_node, start.node);
+
+ end = e->start.key + e->count.key;
+
+ if (bit < end && bit + bits <= end)
+ return true;
+
+use_wnd:
+ ret = wnd_is_free_hlp(wnd, bit, bits);
+
+ return ret;
+}
+
+/*
+ * wnd_is_used
+ *
+ * Return: True if all clusters [bit, bit+bits) are used.
+ */
+bool wnd_is_used(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+ bool ret = false;
+ struct super_block *sb = wnd->sb;
+ size_t iw = bit >> (sb->s_blocksize_bits + 3);
+ u32 wbits = 8 * sb->s_blocksize;
+ u32 wbit = bit & (wbits - 1);
+ size_t end;
+ struct rb_node *n;
+ struct e_node *e;
+
+ if (RB_EMPTY_ROOT(&wnd->start_tree))
+ goto use_wnd;
+
+ end = bit + bits;
+ n = rb_lookup(&wnd->start_tree, end - 1);
+ if (!n)
+ goto use_wnd;
+
+ e = rb_entry(n, struct e_node, start.node);
+ if (e->start.key + e->count.key > bit)
+ return false;
+
+use_wnd:
+ while (iw < wnd->nwnd && bits) {
+ u32 tail, op;
+
+ if (unlikely(iw + 1 == wnd->nwnd))
+ wbits = wnd->bits_last;
+
+ tail = wbits - wbit;
+ op = tail < bits ? tail : bits;
+
+ if (wnd->free_bits[iw]) {
+ bool ret;
+ struct buffer_head *bh = wnd_map(wnd, iw);
+
+ if (IS_ERR(bh))
+ goto out;
+
+ ret = are_bits_set((ulong *)bh->b_data, wbit, op);
+ put_bh(bh);
+ if (!ret)
+ goto out;
+ }
+
+ bits -= op;
+ wbit = 0;
+ iw += 1;
+ }
+ ret = true;
+
+out:
+ return ret;
+}
+
+/*
+ * wnd_find - Look for free space.
+ *
+ * - flags - BITMAP_FIND_XXX flags
+ *
+ * Return: 0 if not found.
+ */
+size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
+ size_t flags, size_t *allocated)
+{
+ struct super_block *sb;
+ u32 wbits, wpos, wzbit, wzend;
+ size_t fnd, max_alloc, b_len, b_pos;
+ size_t iw, prev_tail, nwnd, wbit, ebit, zbit, zend;
+ size_t to_alloc0 = to_alloc;
+ const ulong *buf;
+ const struct e_node *e;
+ const struct rb_node *pr, *cr;
+ u8 log2_bits;
+ bool fbits_valid;
+ struct buffer_head *bh;
+
+ /* Fast checking for available free space. */
+ if (flags & BITMAP_FIND_FULL) {
+ size_t zeroes = wnd_zeroes(wnd);
+
+ zeroes -= wnd->zone_end - wnd->zone_bit;
+ if (zeroes < to_alloc0)
+ goto no_space;
+
+ if (to_alloc0 > wnd->extent_max)
+ goto no_space;
+ } else {
+ if (to_alloc > wnd->extent_max)
+ to_alloc = wnd->extent_max;
+ }
+
+ if (wnd->zone_bit <= hint && hint < wnd->zone_end)
+ hint = wnd->zone_end;
+
+ max_alloc = wnd->nbits;
+ b_len = b_pos = 0;
+
+ if (hint >= max_alloc)
+ hint = 0;
+
+ if (RB_EMPTY_ROOT(&wnd->start_tree)) {
+ if (wnd->uptodated == 1) {
+ /* Extents tree is updated -> No free space. */
+ goto no_space;
+ }
+ goto scan_bitmap;
+ }
+
+ e = NULL;
+ if (!hint)
+ goto allocate_biggest;
+
+ /* Use hint: Enumerate extents by start >= hint. */
+ pr = NULL;
+ cr = wnd->start_tree.rb_node;
+
+ for (;;) {
+ e = rb_entry(cr, struct e_node, start.node);
+
+ if (e->start.key == hint)
+ break;
+
+ if (e->start.key < hint) {
+ pr = cr;
+ cr = cr->rb_right;
+ if (!cr)
+ break;
+ continue;
+ }
+
+ cr = cr->rb_left;
+ if (!cr) {
+ e = pr ? rb_entry(pr, struct e_node, start.node) : NULL;
+ break;
+ }
+ }
+
+ if (!e)
+ goto allocate_biggest;
+
+ if (e->start.key + e->count.key > hint) {
+ /* We have found extension with 'hint' inside. */
+ size_t len = e->start.key + e->count.key - hint;
+
+ if (len >= to_alloc && hint + to_alloc <= max_alloc) {
+ fnd = hint;
+ goto found;
+ }
+
+ if (!(flags & BITMAP_FIND_FULL)) {
+ if (len > to_alloc)
+ len = to_alloc;
+
+ if (hint + len <= max_alloc) {
+ fnd = hint;
+ to_alloc = len;
+ goto found;
+ }
+ }
+ }
+
+allocate_biggest:
+ /* Allocate from biggest free extent. */
+ e = rb_entry(rb_first(&wnd->count_tree), struct e_node, count.node);
+ if (e->count.key != wnd->extent_max)
+ wnd->extent_max = e->count.key;
+
+ if (e->count.key < max_alloc) {
+ if (e->count.key >= to_alloc) {
+ ;
+ } else if (flags & BITMAP_FIND_FULL) {
+ if (e->count.key < to_alloc0) {
+ /* Biggest free block is less then requested. */
+ goto no_space;
+ }
+ to_alloc = e->count.key;
+ } else if (-1 != wnd->uptodated) {
+ to_alloc = e->count.key;
+ } else {
+ /* Check if we can use more bits. */
+ size_t op, max_check;
+ struct rb_root start_tree;
+
+ memcpy(&start_tree, &wnd->start_tree,
+ sizeof(struct rb_root));
+ memset(&wnd->start_tree, 0, sizeof(struct rb_root));
+
+ max_check = e->start.key + to_alloc;
+ if (max_check > max_alloc)
+ max_check = max_alloc;
+ for (op = e->start.key + e->count.key; op < max_check;
+ op++) {
+ if (!wnd_is_free(wnd, op, 1))
+ break;
+ }
+ memcpy(&wnd->start_tree, &start_tree,
+ sizeof(struct rb_root));
+ to_alloc = op - e->start.key;
+ }
+
+ /* Prepare to return. */
+ fnd = e->start.key;
+ if (e->start.key + to_alloc > max_alloc)
+ to_alloc = max_alloc - e->start.key;
+ goto found;
+ }
+
+ if (wnd->uptodated == 1) {
+ /* Extents tree is updated -> no free space. */
+ goto no_space;
+ }
+
+ b_len = e->count.key;
+ b_pos = e->start.key;
+
+scan_bitmap:
+ sb = wnd->sb;
+ log2_bits = sb->s_blocksize_bits + 3;
+
+ /* At most two ranges [hint, max_alloc) + [0, hint). */
+Again:
+
+ /* TODO: Optimize request for case nbits > wbits. */
+ iw = hint >> log2_bits;
+ wbits = sb->s_blocksize * 8;
+ wpos = hint & (wbits - 1);
+ prev_tail = 0;
+ fbits_valid = true;
+
+ if (max_alloc == wnd->nbits) {
+ nwnd = wnd->nwnd;
+ } else {
+ size_t t = max_alloc + wbits - 1;
+
+ nwnd = likely(t > max_alloc) ? (t >> log2_bits) : wnd->nwnd;
+ }
+
+ /* Enumerate all windows. */
+ for (; iw < nwnd; iw++) {
+ wbit = iw << log2_bits;
+
+ if (!wnd->free_bits[iw]) {
+ if (prev_tail > b_len) {
+ b_pos = wbit - prev_tail;
+ b_len = prev_tail;
+ }
+
+ /* Skip full used window. */
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ if (unlikely(iw + 1 == nwnd)) {
+ if (max_alloc == wnd->nbits) {
+ wbits = wnd->bits_last;
+ } else {
+ size_t t = max_alloc & (wbits - 1);
+
+ if (t) {
+ wbits = t;
+ fbits_valid = false;
+ }
+ }
+ }
+
+ if (wnd->zone_end > wnd->zone_bit) {
+ ebit = wbit + wbits;
+ zbit = max(wnd->zone_bit, wbit);
+ zend = min(wnd->zone_end, ebit);
+
+ /* Here we have a window [wbit, ebit) and zone [zbit, zend). */
+ if (zend <= zbit) {
+ /* Zone does not overlap window. */
+ } else {
+ wzbit = zbit - wbit;
+ wzend = zend - wbit;
+
+ /* Zone overlaps window. */
+ if (wnd->free_bits[iw] == wzend - wzbit) {
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ /* Scan two ranges window: [wbit, zbit) and [zend, ebit). */
+ bh = wnd_map(wnd, iw);
+
+ if (IS_ERR(bh)) {
+ /* TODO: Error */
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ /* Scan range [wbit, zbit). */
+ if (wpos < wzbit) {
+ /* Scan range [wpos, zbit). */
+ fnd = wnd_scan(buf, wbit, wpos, wzbit,
+ to_alloc, &prev_tail,
+ &b_pos, &b_len);
+ if (fnd != MINUS_ONE_T) {
+ put_bh(bh);
+ goto found;
+ }
+ }
+
+ prev_tail = 0;
+
+ /* Scan range [zend, ebit). */
+ if (wzend < wbits) {
+ fnd = wnd_scan(buf, wbit,
+ max(wzend, wpos), wbits,
+ to_alloc, &prev_tail,
+ &b_pos, &b_len);
+ if (fnd != MINUS_ONE_T) {
+ put_bh(bh);
+ goto found;
+ }
+ }
+
+ wpos = 0;
+ put_bh(bh);
+ continue;
+ }
+ }
+
+ /* Current window does not overlap zone. */
+ if (!wpos && fbits_valid && wnd->free_bits[iw] == wbits) {
+ /* Window is empty. */
+ if (prev_tail + wbits >= to_alloc) {
+ fnd = wbit + wpos - prev_tail;
+ goto found;
+ }
+
+ /* Increase 'prev_tail' and process next window. */
+ prev_tail += wbits;
+ wpos = 0;
+ continue;
+ }
+
+ /* Read window. */
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ // TODO: Error.
+ prev_tail = 0;
+ wpos = 0;
+ continue;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ /* Scan range [wpos, eBits). */
+ fnd = wnd_scan(buf, wbit, wpos, wbits, to_alloc, &prev_tail,
+ &b_pos, &b_len);
+ put_bh(bh);
+ if (fnd != MINUS_ONE_T)
+ goto found;
+ }
+
+ if (b_len < prev_tail) {
+ /* The last fragment. */
+ b_len = prev_tail;
+ b_pos = max_alloc - prev_tail;
+ }
+
+ if (hint) {
+ /*
+ * We have scanned range [hint max_alloc).
+ * Prepare to scan range [0 hint + to_alloc).
+ */
+ size_t nextmax = hint + to_alloc;
+
+ if (likely(nextmax >= hint) && nextmax < max_alloc)
+ max_alloc = nextmax;
+ hint = 0;
+ goto Again;
+ }
+
+ if (!b_len)
+ goto no_space;
+
+ wnd->extent_max = b_len;
+
+ if (flags & BITMAP_FIND_FULL)
+ goto no_space;
+
+ fnd = b_pos;
+ to_alloc = b_len;
+
+found:
+ if (flags & BITMAP_FIND_MARK_AS_USED) {
+ /* TODO: Optimize remove extent (pass 'e'?). */
+ if (wnd_set_used(wnd, fnd, to_alloc))
+ goto no_space;
+ } else if (wnd->extent_max != MINUS_ONE_T &&
+ to_alloc > wnd->extent_max) {
+ wnd->extent_max = to_alloc;
+ }
+
+ *allocated = fnd;
+ return to_alloc;
+
+no_space:
+ return 0;
+}
+
+/*
+ * wnd_extend - Extend bitmap ($MFT bitmap).
+ */
+int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
+{
+ int err;
+ struct super_block *sb = wnd->sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ u32 blocksize = sb->s_blocksize;
+ u32 wbits = blocksize * 8;
+ u32 b0, new_last;
+ size_t bits, iw, new_wnd;
+ size_t old_bits = wnd->nbits;
+ u16 *new_free;
+
+ if (new_bits <= old_bits)
+ return -EINVAL;
+
+ /* Align to 8 byte boundary. */
+ new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
+ new_last = new_bits & (wbits - 1);
+ if (!new_last)
+ new_last = wbits;
+
+ if (new_wnd != wnd->nwnd) {
+ new_free = kmalloc(new_wnd * sizeof(u16), GFP_NOFS);
+ if (!new_free)
+ return -ENOMEM;
+
+ if (new_free != wnd->free_bits)
+ memcpy(new_free, wnd->free_bits,
+ wnd->nwnd * sizeof(short));
+ memset(new_free + wnd->nwnd, 0,
+ (new_wnd - wnd->nwnd) * sizeof(short));
+ kfree(wnd->free_bits);
+ wnd->free_bits = new_free;
+ }
+
+ /* Zero bits [old_bits,new_bits). */
+ bits = new_bits - old_bits;
+ b0 = old_bits & (wbits - 1);
+
+ for (iw = old_bits >> (sb->s_blocksize_bits + 3); bits; iw += 1) {
+ u32 op;
+ size_t frb;
+ u64 vbo, lbo, bytes;
+ struct buffer_head *bh;
+ ulong *buf;
+
+ if (iw + 1 == new_wnd)
+ wbits = new_last;
+
+ op = b0 + bits > wbits ? wbits - b0 : bits;
+ vbo = (u64)iw * blocksize;
+
+ err = ntfs_vbo_to_lbo(sbi, &wnd->run, vbo, &lbo, &bytes);
+ if (err)
+ break;
+
+ bh = ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+ if (!bh)
+ return -EIO;
+
+ lock_buffer(bh);
+ buf = (ulong *)bh->b_data;
+
+ __bitmap_clear(buf, b0, blocksize * 8 - b0);
+ frb = wbits - __bitmap_weight(buf, wbits);
+ wnd->total_zeroes += frb - wnd->free_bits[iw];
+ wnd->free_bits[iw] = frb;
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+ /* err = sync_dirty_buffer(bh); */
+
+ b0 = 0;
+ bits -= op;
+ }
+
+ wnd->nbits = new_bits;
+ wnd->nwnd = new_wnd;
+ wnd->bits_last = new_last;
+
+ wnd_add_free_ext(wnd, old_bits, new_bits - old_bits, false);
+
+ return 0;
+}
+
+void wnd_zone_set(struct wnd_bitmap *wnd, size_t lcn, size_t len)
+{
+ size_t zlen;
+
+ zlen = wnd->zone_end - wnd->zone_bit;
+ if (zlen)
+ wnd_add_free_ext(wnd, wnd->zone_bit, zlen, false);
+
+ if (!RB_EMPTY_ROOT(&wnd->start_tree) && len)
+ wnd_remove_free_ext(wnd, lcn, len);
+
+ wnd->zone_bit = lcn;
+ wnd->zone_end = lcn + len;
+}
+
+int ntfs_trim_fs(struct ntfs_sb_info *sbi, struct fstrim_range *range)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+ u32 wbits = 8 * sb->s_blocksize;
+ CLST len = 0, lcn = 0, done = 0;
+ CLST minlen = bytes_to_cluster(sbi, range->minlen);
+ CLST lcn_from = bytes_to_cluster(sbi, range->start);
+ size_t iw = lcn_from >> (sb->s_blocksize_bits + 3);
+ u32 wbit = lcn_from & (wbits - 1);
+ const ulong *buf;
+ CLST lcn_to;
+
+ if (!minlen)
+ minlen = 1;
+
+ if (range->len == (u64)-1)
+ lcn_to = wnd->nbits;
+ else
+ lcn_to = bytes_to_cluster(sbi, range->start + range->len);
+
+ down_read_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+
+ for (; iw < wnd->nbits; iw++, wbit = 0) {
+ CLST lcn_wnd = iw * wbits;
+ struct buffer_head *bh;
+
+ if (lcn_wnd > lcn_to)
+ break;
+
+ if (!wnd->free_bits[iw])
+ continue;
+
+ if (iw + 1 == wnd->nwnd)
+ wbits = wnd->bits_last;
+
+ if (lcn_wnd + wbits > lcn_to)
+ wbits = lcn_to - lcn_wnd;
+
+ bh = wnd_map(wnd, iw);
+ if (IS_ERR(bh)) {
+ err = PTR_ERR(bh);
+ break;
+ }
+
+ buf = (ulong *)bh->b_data;
+
+ for (; wbit < wbits; wbit++) {
+ if (!test_bit(wbit, buf)) {
+ if (!len)
+ lcn = lcn_wnd + wbit;
+ len += 1;
+ continue;
+ }
+ if (len >= minlen) {
+ err = ntfs_discard(sbi, lcn, len);
+ if (err)
+ goto out;
+ done += len;
+ }
+ len = 0;
+ }
+ put_bh(bh);
+ }
+
+ /* Process the last fragment. */
+ if (len >= minlen) {
+ err = ntfs_discard(sbi, lcn, len);
+ if (err)
+ goto out;
+ done += len;
+ }
+
+out:
+ range->len = (u64)done << sbi->cluster_bits;
+
+ up_read(&wnd->rw_lock);
+
+ return err;
+}
diff --git a/fs/ntfs3/debug.h b/fs/ntfs3/debug.h
new file mode 100644
index 000000000000..31120569a87b
--- /dev/null
+++ b/fs/ntfs3/debug.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * Useful functions for debugging.
+ *
+ */
+
+// clang-format off
+#ifndef _LINUX_NTFS3_DEBUG_H
+#define _LINUX_NTFS3_DEBUG_H
+
+#ifndef Add2Ptr
+#define Add2Ptr(P, I) ((void *)((u8 *)(P) + (I)))
+#define PtrOffset(B, O) ((size_t)((size_t)(O) - (size_t)(B)))
+#endif
+
+#ifdef CONFIG_PRINTK
+__printf(2, 3)
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...);
+__printf(2, 3)
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...);
+#else
+static inline __printf(2, 3)
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...)
+{
+}
+
+static inline __printf(2, 3)
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
+{
+}
+#endif
+
+/*
+ * Logging macros. Thanks Joe Perches <joe@perches.com> for implementation.
+ */
+
+#define ntfs_err(sb, fmt, ...) ntfs_printk(sb, KERN_ERR fmt, ##__VA_ARGS__)
+#define ntfs_warn(sb, fmt, ...) ntfs_printk(sb, KERN_WARNING fmt, ##__VA_ARGS__)
+#define ntfs_info(sb, fmt, ...) ntfs_printk(sb, KERN_INFO fmt, ##__VA_ARGS__)
+#define ntfs_notice(sb, fmt, ...) \
+ ntfs_printk(sb, KERN_NOTICE fmt, ##__VA_ARGS__)
+
+#define ntfs_inode_err(inode, fmt, ...) \
+ ntfs_inode_printk(inode, KERN_ERR fmt, ##__VA_ARGS__)
+#define ntfs_inode_warn(inode, fmt, ...) \
+ ntfs_inode_printk(inode, KERN_WARNING fmt, ##__VA_ARGS__)
+
+#endif /* _LINUX_NTFS3_DEBUG_H */
+// clang-format on
diff --git a/fs/ntfs3/dir.c b/fs/ntfs3/dir.c
new file mode 100644
index 000000000000..93f6d485564e
--- /dev/null
+++ b/fs/ntfs3/dir.c
@@ -0,0 +1,599 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * Directory handling functions for NTFS-based filesystems.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/iversion.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/* Convert little endian UTF-16 to NLS string. */
+int ntfs_utf16_to_nls(struct ntfs_sb_info *sbi, const struct le_str *uni,
+ u8 *buf, int buf_len)
+{
+ int ret, uni_len, warn;
+ const __le16 *ip;
+ u8 *op;
+ struct nls_table *nls = sbi->options.nls;
+
+ static_assert(sizeof(wchar_t) == sizeof(__le16));
+
+ if (!nls) {
+ /* UTF-16 -> UTF-8 */
+ ret = utf16s_to_utf8s((wchar_t *)uni->name, uni->len,
+ UTF16_LITTLE_ENDIAN, buf, buf_len);
+ buf[ret] = '\0';
+ return ret;
+ }
+
+ ip = uni->name;
+ op = buf;
+ uni_len = uni->len;
+ warn = 0;
+
+ while (uni_len--) {
+ u16 ec;
+ int charlen;
+ char dump[5];
+
+ if (buf_len < NLS_MAX_CHARSET_SIZE) {
+ ntfs_warn(sbi->sb,
+ "filename was truncated while converting.");
+ break;
+ }
+
+ ec = le16_to_cpu(*ip++);
+ charlen = nls->uni2char(ec, op, buf_len);
+
+ if (charlen > 0) {
+ op += charlen;
+ buf_len -= charlen;
+ continue;
+ }
+
+ *op++ = '_';
+ buf_len -= 1;
+ if (warn)
+ continue;
+
+ warn = 1;
+ hex_byte_pack(&dump[0], ec >> 8);
+ hex_byte_pack(&dump[2], ec);
+ dump[4] = 0;
+
+ ntfs_err(sbi->sb, "failed to convert \"%s\" to %s", dump,
+ nls->charset);
+ }
+
+ *op = '\0';
+ return op - buf;
+}
+
+// clang-format off
+#define PLANE_SIZE 0x00010000
+
+#define SURROGATE_PAIR 0x0000d800
+#define SURROGATE_LOW 0x00000400
+#define SURROGATE_BITS 0x000003ff
+// clang-format on
+
+/*
+ * put_utf16 - Modified version of put_utf16 from fs/nls/nls_base.c
+ *
+ * Function is sparse warnings free.
+ */
+static inline void put_utf16(wchar_t *s, unsigned int c,
+ enum utf16_endian endian)
+{
+ static_assert(sizeof(wchar_t) == sizeof(__le16));
+ static_assert(sizeof(wchar_t) == sizeof(__be16));
+
+ switch (endian) {
+ default:
+ *s = (wchar_t)c;
+ break;
+ case UTF16_LITTLE_ENDIAN:
+ *(__le16 *)s = __cpu_to_le16(c);
+ break;
+ case UTF16_BIG_ENDIAN:
+ *(__be16 *)s = __cpu_to_be16(c);
+ break;
+ }
+}
+
+/*
+ * _utf8s_to_utf16s
+ *
+ * Modified version of 'utf8s_to_utf16s' allows to
+ * detect -ENAMETOOLONG without writing out of expected maximum.
+ */
+static int _utf8s_to_utf16s(const u8 *s, int inlen, enum utf16_endian endian,
+ wchar_t *pwcs, int maxout)
+{
+ u16 *op;
+ int size;
+ unicode_t u;
+
+ op = pwcs;
+ while (inlen > 0 && *s) {
+ if (*s & 0x80) {
+ size = utf8_to_utf32(s, inlen, &u);
+ if (size < 0)
+ return -EINVAL;
+ s += size;
+ inlen -= size;
+
+ if (u >= PLANE_SIZE) {
+ if (maxout < 2)
+ return -ENAMETOOLONG;
+
+ u -= PLANE_SIZE;
+ put_utf16(op++,
+ SURROGATE_PAIR |
+ ((u >> 10) & SURROGATE_BITS),
+ endian);
+ put_utf16(op++,
+ SURROGATE_PAIR | SURROGATE_LOW |
+ (u & SURROGATE_BITS),
+ endian);
+ maxout -= 2;
+ } else {
+ if (maxout < 1)
+ return -ENAMETOOLONG;
+
+ put_utf16(op++, u, endian);
+ maxout--;
+ }
+ } else {
+ if (maxout < 1)
+ return -ENAMETOOLONG;
+
+ put_utf16(op++, *s++, endian);
+ inlen--;
+ maxout--;
+ }
+ }
+ return op - pwcs;
+}
+
+/*
+ * ntfs_nls_to_utf16 - Convert input string to UTF-16.
+ * @name: Input name.
+ * @name_len: Input name length.
+ * @uni: Destination memory.
+ * @max_ulen: Destination memory.
+ * @endian: Endian of target UTF-16 string.
+ *
+ * This function is called:
+ * - to create NTFS name
+ * - to create symlink
+ *
+ * Return: UTF-16 string length or error (if negative).
+ */
+int ntfs_nls_to_utf16(struct ntfs_sb_info *sbi, const u8 *name, u32 name_len,
+ struct cpu_str *uni, u32 max_ulen,
+ enum utf16_endian endian)
+{
+ int ret, slen;
+ const u8 *end;
+ struct nls_table *nls = sbi->options.nls;
+ u16 *uname = uni->name;
+
+ static_assert(sizeof(wchar_t) == sizeof(u16));
+
+ if (!nls) {
+ /* utf8 -> utf16 */
+ ret = _utf8s_to_utf16s(name, name_len, endian, uname, max_ulen);
+ uni->len = ret;
+ return ret;
+ }
+
+ for (ret = 0, end = name + name_len; name < end; ret++, name += slen) {
+ if (ret >= max_ulen)
+ return -ENAMETOOLONG;
+
+ slen = nls->char2uni(name, end - name, uname + ret);
+ if (!slen)
+ return -EINVAL;
+ if (slen < 0)
+ return slen;
+ }
+
+#ifdef __BIG_ENDIAN
+ if (endian == UTF16_LITTLE_ENDIAN) {
+ int i = ret;
+
+ while (i--) {
+ __cpu_to_le16s(uname);
+ uname++;
+ }
+ }
+#else
+ if (endian == UTF16_BIG_ENDIAN) {
+ int i = ret;
+
+ while (i--) {
+ __cpu_to_be16s(uname);
+ uname++;
+ }
+ }
+#endif
+
+ uni->len = ret;
+ return ret;
+}
+
+/*
+ * dir_search_u - Helper function.
+ */
+struct inode *dir_search_u(struct inode *dir, const struct cpu_str *uni,
+ struct ntfs_fnd *fnd)
+{
+ int err = 0;
+ struct super_block *sb = dir->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni = ntfs_i(dir);
+ struct NTFS_DE *e;
+ int diff;
+ struct inode *inode = NULL;
+ struct ntfs_fnd *fnd_a = NULL;
+
+ if (!fnd) {
+ fnd_a = fnd_get();
+ if (!fnd_a) {
+ err = -ENOMEM;
+ goto out;
+ }
+ fnd = fnd_a;
+ }
+
+ err = indx_find(&ni->dir, ni, NULL, uni, 0, sbi, &diff, &e, fnd);
+
+ if (err)
+ goto out;
+
+ if (diff) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ inode = ntfs_iget5(sb, &e->ref, uni);
+ if (!IS_ERR(inode) && is_bad_inode(inode)) {
+ iput(inode);
+ err = -EINVAL;
+ }
+out:
+ fnd_put(fnd_a);
+
+ return err == -ENOENT ? NULL : err ? ERR_PTR(err) : inode;
+}
+
+static inline int ntfs_filldir(struct ntfs_sb_info *sbi, struct ntfs_inode *ni,
+ const struct NTFS_DE *e, u8 *name,
+ struct dir_context *ctx)
+{
+ const struct ATTR_FILE_NAME *fname;
+ unsigned long ino;
+ int name_len;
+ u32 dt_type;
+
+ fname = Add2Ptr(e, sizeof(struct NTFS_DE));
+
+ if (fname->type == FILE_NAME_DOS)
+ return 0;
+
+ if (!mi_is_ref(&ni->mi, &fname->home))
+ return 0;
+
+ ino = ino_get(&e->ref);
+
+ if (ino == MFT_REC_ROOT)
+ return 0;
+
+ /* Skip meta files. Unless option to show metafiles is set. */
+ if (!sbi->options.showmeta && ntfs_is_meta_file(sbi, ino))
+ return 0;
+
+ if (sbi->options.nohidden && (fname->dup.fa & FILE_ATTRIBUTE_HIDDEN))
+ return 0;
+
+ name_len = ntfs_utf16_to_nls(sbi, (struct le_str *)&fname->name_len,
+ name, PATH_MAX);
+ if (name_len <= 0) {
+ ntfs_warn(sbi->sb, "failed to convert name for inode %lx.",
+ ino);
+ return 0;
+ }
+
+ dt_type = (fname->dup.fa & FILE_ATTRIBUTE_DIRECTORY) ? DT_DIR : DT_REG;
+
+ return !dir_emit(ctx, (s8 *)name, name_len, ino, dt_type);
+}
+
+/*
+ * ntfs_read_hdr - Helper function for ntfs_readdir().
+ */
+static int ntfs_read_hdr(struct ntfs_sb_info *sbi, struct ntfs_inode *ni,
+ const struct INDEX_HDR *hdr, u64 vbo, u64 pos,
+ u8 *name, struct dir_context *ctx)
+{
+ int err;
+ const struct NTFS_DE *e;
+ u32 e_size;
+ u32 end = le32_to_cpu(hdr->used);
+ u32 off = le32_to_cpu(hdr->de_off);
+
+ for (;; off += e_size) {
+ if (off + sizeof(struct NTFS_DE) > end)
+ return -1;
+
+ e = Add2Ptr(hdr, off);
+ e_size = le16_to_cpu(e->size);
+ if (e_size < sizeof(struct NTFS_DE) || off + e_size > end)
+ return -1;
+
+ if (de_is_last(e))
+ return 0;
+
+ /* Skip already enumerated. */
+ if (vbo + off < pos)
+ continue;
+
+ if (le16_to_cpu(e->key_size) < SIZEOF_ATTRIBUTE_FILENAME)
+ return -1;
+
+ ctx->pos = vbo + off;
+
+ /* Submit the name to the filldir callback. */
+ err = ntfs_filldir(sbi, ni, e, name, ctx);
+ if (err)
+ return err;
+ }
+}
+
+/*
+ * ntfs_readdir - file_operations::iterate_shared
+ *
+ * Use non sorted enumeration.
+ * We have an example of broken volume where sorted enumeration
+ * counts each name twice.
+ */
+static int ntfs_readdir(struct file *file, struct dir_context *ctx)
+{
+ const struct INDEX_ROOT *root;
+ u64 vbo;
+ size_t bit;
+ loff_t eod;
+ int err = 0;
+ struct inode *dir = file_inode(file);
+ struct ntfs_inode *ni = ntfs_i(dir);
+ struct super_block *sb = dir->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ loff_t i_size = i_size_read(dir);
+ u32 pos = ctx->pos;
+ u8 *name = NULL;
+ struct indx_node *node = NULL;
+ u8 index_bits = ni->dir.index_bits;
+
+ /* Name is a buffer of PATH_MAX length. */
+ static_assert(NTFS_NAME_LEN * 4 < PATH_MAX);
+
+ eod = i_size + sbi->record_size;
+
+ if (pos >= eod)
+ return 0;
+
+ if (!dir_emit_dots(file, ctx))
+ return 0;
+
+ /* Allocate PATH_MAX bytes. */
+ name = __getname();
+ if (!name)
+ return -ENOMEM;
+
+ if (!ni->mi_loaded && ni->attr_list.size) {
+ /*
+ * Directory inode is locked for read.
+ * Load all subrecords to avoid 'write' access to 'ni' during
+ * directory reading.
+ */
+ ni_lock(ni);
+ if (!ni->mi_loaded && ni->attr_list.size) {
+ err = ni_load_all_mi(ni);
+ if (!err)
+ ni->mi_loaded = true;
+ }
+ ni_unlock(ni);
+ if (err)
+ goto out;
+ }
+
+ root = indx_get_root(&ni->dir, ni, NULL, NULL);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (pos >= sbi->record_size) {
+ bit = (pos - sbi->record_size) >> index_bits;
+ } else {
+ err = ntfs_read_hdr(sbi, ni, &root->ihdr, 0, pos, name, ctx);
+ if (err)
+ goto out;
+ bit = 0;
+ }
+
+ if (!i_size) {
+ ctx->pos = eod;
+ goto out;
+ }
+
+ for (;;) {
+ vbo = (u64)bit << index_bits;
+ if (vbo >= i_size) {
+ ctx->pos = eod;
+ goto out;
+ }
+
+ err = indx_used_bit(&ni->dir, ni, &bit);
+ if (err)
+ goto out;
+
+ if (bit == MINUS_ONE_T) {
+ ctx->pos = eod;
+ goto out;
+ }
+
+ vbo = (u64)bit << index_bits;
+ if (vbo >= i_size) {
+ ntfs_inode_err(dir, "Looks like your dir is corrupt");
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_read(&ni->dir, ni, bit << ni->dir.idx2vbn_bits,
+ &node);
+ if (err)
+ goto out;
+
+ err = ntfs_read_hdr(sbi, ni, &node->index->ihdr,
+ vbo + sbi->record_size, pos, name, ctx);
+ if (err)
+ goto out;
+
+ bit += 1;
+ }
+
+out:
+
+ __putname(name);
+ put_indx_node(node);
+
+ if (err == -ENOENT) {
+ err = 0;
+ ctx->pos = pos;
+ }
+
+ return err;
+}
+
+static int ntfs_dir_count(struct inode *dir, bool *is_empty, size_t *dirs,
+ size_t *files)
+{
+ int err = 0;
+ struct ntfs_inode *ni = ntfs_i(dir);
+ struct NTFS_DE *e = NULL;
+ struct INDEX_ROOT *root;
+ struct INDEX_HDR *hdr;
+ const struct ATTR_FILE_NAME *fname;
+ u32 e_size, off, end;
+ u64 vbo = 0;
+ size_t drs = 0, fles = 0, bit = 0;
+ loff_t i_size = ni->vfs_inode.i_size;
+ struct indx_node *node = NULL;
+ u8 index_bits = ni->dir.index_bits;
+
+ if (is_empty)
+ *is_empty = true;
+
+ root = indx_get_root(&ni->dir, ni, NULL, NULL);
+ if (!root)
+ return -EINVAL;
+
+ hdr = &root->ihdr;
+
+ for (;;) {
+ end = le32_to_cpu(hdr->used);
+ off = le32_to_cpu(hdr->de_off);
+
+ for (; off + sizeof(struct NTFS_DE) <= end; off += e_size) {
+ e = Add2Ptr(hdr, off);
+ e_size = le16_to_cpu(e->size);
+ if (e_size < sizeof(struct NTFS_DE) ||
+ off + e_size > end)
+ break;
+
+ if (de_is_last(e))
+ break;
+
+ fname = de_get_fname(e);
+ if (!fname)
+ continue;
+
+ if (fname->type == FILE_NAME_DOS)
+ continue;
+
+ if (is_empty) {
+ *is_empty = false;
+ if (!dirs && !files)
+ goto out;
+ }
+
+ if (fname->dup.fa & FILE_ATTRIBUTE_DIRECTORY)
+ drs += 1;
+ else
+ fles += 1;
+ }
+
+ if (vbo >= i_size)
+ goto out;
+
+ err = indx_used_bit(&ni->dir, ni, &bit);
+ if (err)
+ goto out;
+
+ if (bit == MINUS_ONE_T)
+ goto out;
+
+ vbo = (u64)bit << index_bits;
+ if (vbo >= i_size)
+ goto out;
+
+ err = indx_read(&ni->dir, ni, bit << ni->dir.idx2vbn_bits,
+ &node);
+ if (err)
+ goto out;
+
+ hdr = &node->index->ihdr;
+ bit += 1;
+ vbo = (u64)bit << ni->dir.idx2vbn_bits;
+ }
+
+out:
+ put_indx_node(node);
+ if (dirs)
+ *dirs = drs;
+ if (files)
+ *files = fles;
+
+ return err;
+}
+
+bool dir_is_empty(struct inode *dir)
+{
+ bool is_empty = false;
+
+ ntfs_dir_count(dir, &is_empty, NULL, NULL);
+
+ return is_empty;
+}
+
+// clang-format off
+const struct file_operations ntfs_dir_operations = {
+ .llseek = generic_file_llseek,
+ .read = generic_read_dir,
+ .iterate_shared = ntfs_readdir,
+ .fsync = generic_file_fsync,
+ .open = ntfs_file_open,
+};
+// clang-format on
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
new file mode 100644
index 000000000000..424450e77ad5
--- /dev/null
+++ b/fs/ntfs3/file.c
@@ -0,0 +1,1251 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * Regular file handling primitives for NTFS-based filesystems.
+ *
+ */
+
+#include <linux/backing-dev.h>
+#include <linux/buffer_head.h>
+#include <linux/compat.h>
+#include <linux/falloc.h>
+#include <linux/fiemap.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+static int ntfs_ioctl_fitrim(struct ntfs_sb_info *sbi, unsigned long arg)
+{
+ struct fstrim_range __user *user_range;
+ struct fstrim_range range;
+ struct request_queue *q = bdev_get_queue(sbi->sb->s_bdev);
+ int err;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (!blk_queue_discard(q))
+ return -EOPNOTSUPP;
+
+ user_range = (struct fstrim_range __user *)arg;
+ if (copy_from_user(&range, user_range, sizeof(range)))
+ return -EFAULT;
+
+ range.minlen = max_t(u32, range.minlen, q->limits.discard_granularity);
+
+ err = ntfs_trim_fs(sbi, &range);
+ if (err < 0)
+ return err;
+
+ if (copy_to_user(user_range, &range, sizeof(range)))
+ return -EFAULT;
+
+ return 0;
+}
+
+static long ntfs_ioctl(struct file *filp, u32 cmd, unsigned long arg)
+{
+ struct inode *inode = file_inode(filp);
+ struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+
+ switch (cmd) {
+ case FITRIM:
+ return ntfs_ioctl_fitrim(sbi, arg);
+ }
+ return -ENOTTY; /* Inappropriate ioctl for device. */
+}
+
+#ifdef CONFIG_COMPAT
+static long ntfs_compat_ioctl(struct file *filp, u32 cmd, unsigned long arg)
+
+{
+ return ntfs_ioctl(filp, cmd, (unsigned long)compat_ptr(arg));
+}
+#endif
+
+/*
+ * ntfs_getattr - inode_operations::getattr
+ */
+int ntfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
+ struct kstat *stat, u32 request_mask, u32 flags)
+{
+ struct inode *inode = d_inode(path->dentry);
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ if (is_compressed(ni))
+ stat->attributes |= STATX_ATTR_COMPRESSED;
+
+ if (is_encrypted(ni))
+ stat->attributes |= STATX_ATTR_ENCRYPTED;
+
+ stat->attributes_mask |= STATX_ATTR_COMPRESSED | STATX_ATTR_ENCRYPTED;
+
+ generic_fillattr(mnt_userns, inode, stat);
+
+ stat->result_mask |= STATX_BTIME;
+ stat->btime = ni->i_crtime;
+ stat->blksize = ni->mi.sbi->cluster_size; /* 512, 1K, ..., 2M */
+
+ return 0;
+}
+
+static int ntfs_extend_initialized_size(struct file *file,
+ struct ntfs_inode *ni,
+ const loff_t valid,
+ const loff_t new_valid)
+{
+ struct inode *inode = &ni->vfs_inode;
+ struct address_space *mapping = inode->i_mapping;
+ struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+ loff_t pos = valid;
+ int err;
+
+ if (is_resident(ni)) {
+ ni->i_valid = new_valid;
+ return 0;
+ }
+
+ WARN_ON(is_compressed(ni));
+ WARN_ON(valid >= new_valid);
+
+ for (;;) {
+ u32 zerofrom, len;
+ struct page *page;
+ void *fsdata;
+ u8 bits;
+ CLST vcn, lcn, clen;
+
+ if (is_sparsed(ni)) {
+ bits = sbi->cluster_bits;
+ vcn = pos >> bits;
+
+ err = attr_data_get_block(ni, vcn, 0, &lcn, &clen,
+ NULL);
+ if (err)
+ goto out;
+
+ if (lcn == SPARSE_LCN) {
+ loff_t vbo = (loff_t)vcn << bits;
+ loff_t to = vbo + ((loff_t)clen << bits);
+
+ if (to <= new_valid) {
+ ni->i_valid = to;
+ pos = to;
+ goto next;
+ }
+
+ if (vbo < pos) {
+ pos = vbo;
+ } else {
+ to = (new_valid >> bits) << bits;
+ if (pos < to) {
+ ni->i_valid = to;
+ pos = to;
+ goto next;
+ }
+ }
+ }
+ }
+
+ zerofrom = pos & (PAGE_SIZE - 1);
+ len = PAGE_SIZE - zerofrom;
+
+ if (pos + len > new_valid)
+ len = new_valid - pos;
+
+ err = pagecache_write_begin(file, mapping, pos, len, 0, &page,
+ &fsdata);
+ if (err)
+ goto out;
+
+ zero_user_segment(page, zerofrom, PAGE_SIZE);
+
+ /* This function in any case puts page. */
+ err = pagecache_write_end(file, mapping, pos, len, len, page,
+ fsdata);
+ if (err < 0)
+ goto out;
+ pos += len;
+
+next:
+ if (pos >= new_valid)
+ break;
+
+ balance_dirty_pages_ratelimited(mapping);
+ cond_resched();
+ }
+
+ return 0;
+
+out:
+ ni->i_valid = valid;
+ ntfs_inode_warn(inode, "failed to extend initialized size to %llx.",
+ new_valid);
+ return err;
+}
+
+/*
+ * ntfs_zero_range - Helper function for punch_hole.
+ *
+ * It zeroes a range [vbo, vbo_to).
+ */
+static int ntfs_zero_range(struct inode *inode, u64 vbo, u64 vbo_to)
+{
+ int err = 0;
+ struct address_space *mapping = inode->i_mapping;
+ u32 blocksize = 1 << inode->i_blkbits;
+ pgoff_t idx = vbo >> PAGE_SHIFT;
+ u32 z_start = vbo & (PAGE_SIZE - 1);
+ pgoff_t idx_end = (vbo_to + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ loff_t page_off;
+ struct buffer_head *head, *bh;
+ u32 bh_next, bh_off, z_end;
+ sector_t iblock;
+ struct page *page;
+
+ for (; idx < idx_end; idx += 1, z_start = 0) {
+ page_off = (loff_t)idx << PAGE_SHIFT;
+ z_end = (page_off + PAGE_SIZE) > vbo_to ? (vbo_to - page_off)
+ : PAGE_SIZE;
+ iblock = page_off >> inode->i_blkbits;
+
+ page = find_or_create_page(mapping, idx,
+ mapping_gfp_constraint(mapping,
+ ~__GFP_FS));
+ if (!page)
+ return -ENOMEM;
+
+ if (!page_has_buffers(page))
+ create_empty_buffers(page, blocksize, 0);
+
+ bh = head = page_buffers(page);
+ bh_off = 0;
+ do {
+ bh_next = bh_off + blocksize;
+
+ if (bh_next <= z_start || bh_off >= z_end)
+ continue;
+
+ if (!buffer_mapped(bh)) {
+ ntfs_get_block(inode, iblock, bh, 0);
+ /* Unmapped? It's a hole - nothing to do. */
+ if (!buffer_mapped(bh))
+ continue;
+ }
+
+ /* Ok, it's mapped. Make sure it's up-to-date. */
+ if (PageUptodate(page))
+ set_buffer_uptodate(bh);
+
+ if (!buffer_uptodate(bh)) {
+ lock_buffer(bh);
+ bh->b_end_io = end_buffer_read_sync;
+ get_bh(bh);
+ submit_bh(REQ_OP_READ, 0, bh);
+
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh)) {
+ unlock_page(page);
+ put_page(page);
+ err = -EIO;
+ goto out;
+ }
+ }
+
+ mark_buffer_dirty(bh);
+
+ } while (bh_off = bh_next, iblock += 1,
+ head != (bh = bh->b_this_page));
+
+ zero_user_segment(page, z_start, z_end);
+
+ unlock_page(page);
+ put_page(page);
+ cond_resched();
+ }
+out:
+ mark_inode_dirty(inode);
+ return err;
+}
+
+/*
+ * ntfs_sparse_cluster - Helper function to zero a new allocated clusters.
+ *
+ * NOTE: 512 <= cluster size <= 2M
+ */
+void ntfs_sparse_cluster(struct inode *inode, struct page *page0, CLST vcn,
+ CLST len)
+{
+ struct address_space *mapping = inode->i_mapping;
+ struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+ u64 vbo = (u64)vcn << sbi->cluster_bits;
+ u64 bytes = (u64)len << sbi->cluster_bits;
+ u32 blocksize = 1 << inode->i_blkbits;
+ pgoff_t idx0 = page0 ? page0->index : -1;
+ loff_t vbo_clst = vbo & sbi->cluster_mask_inv;
+ loff_t end = ntfs_up_cluster(sbi, vbo + bytes);
+ pgoff_t idx = vbo_clst >> PAGE_SHIFT;
+ u32 from = vbo_clst & (PAGE_SIZE - 1);
+ pgoff_t idx_end = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ loff_t page_off;
+ u32 to;
+ bool partial;
+ struct page *page;
+
+ for (; idx < idx_end; idx += 1, from = 0) {
+ page = idx == idx0 ? page0 : grab_cache_page(mapping, idx);
+
+ if (!page)
+ continue;
+
+ page_off = (loff_t)idx << PAGE_SHIFT;
+ to = (page_off + PAGE_SIZE) > end ? (end - page_off)
+ : PAGE_SIZE;
+ partial = false;
+
+ if ((from || PAGE_SIZE != to) &&
+ likely(!page_has_buffers(page))) {
+ create_empty_buffers(page, blocksize, 0);
+ }
+
+ if (page_has_buffers(page)) {
+ struct buffer_head *head, *bh;
+ u32 bh_off = 0;
+
+ bh = head = page_buffers(page);
+ do {
+ u32 bh_next = bh_off + blocksize;
+
+ if (from <= bh_off && bh_next <= to) {
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ } else if (!buffer_uptodate(bh)) {
+ partial = true;
+ }
+ bh_off = bh_next;
+ } while (head != (bh = bh->b_this_page));
+ }
+
+ zero_user_segment(page, from, to);
+
+ if (!partial) {
+ if (!PageUptodate(page))
+ SetPageUptodate(page);
+ set_page_dirty(page);
+ }
+
+ if (idx != idx0) {
+ unlock_page(page);
+ put_page(page);
+ }
+ cond_resched();
+ }
+ mark_inode_dirty(inode);
+}
+
+/*
+ * ntfs_file_mmap - file_operations::mmap
+ */
+static int ntfs_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ struct address_space *mapping = file->f_mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u64 from = ((u64)vma->vm_pgoff << PAGE_SHIFT);
+ bool rw = vma->vm_flags & VM_WRITE;
+ int err;
+
+ if (is_encrypted(ni)) {
+ ntfs_inode_warn(inode, "mmap encrypted not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (is_dedup(ni)) {
+ ntfs_inode_warn(inode, "mmap deduplicated not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (is_compressed(ni) && rw) {
+ ntfs_inode_warn(inode, "mmap(write) compressed not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (rw) {
+ u64 to = min_t(loff_t, i_size_read(inode),
+ from + vma->vm_end - vma->vm_start);
+
+ if (is_sparsed(ni)) {
+ /* Allocate clusters for rw map. */
+ struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+ CLST lcn, len;
+ CLST vcn = from >> sbi->cluster_bits;
+ CLST end = bytes_to_cluster(sbi, to);
+ bool new;
+
+ for (; vcn < end; vcn += len) {
+ err = attr_data_get_block(ni, vcn, 1, &lcn,
+ &len, &new);
+ if (err)
+ goto out;
+
+ if (!new)
+ continue;
+ ntfs_sparse_cluster(inode, NULL, vcn, 1);
+ }
+ }
+
+ if (ni->i_valid < to) {
+ if (!inode_trylock(inode)) {
+ err = -EAGAIN;
+ goto out;
+ }
+ err = ntfs_extend_initialized_size(file, ni,
+ ni->i_valid, to);
+ inode_unlock(inode);
+ if (err)
+ goto out;
+ }
+ }
+
+ err = generic_file_mmap(file, vma);
+out:
+ return err;
+}
+
+static int ntfs_extend(struct inode *inode, loff_t pos, size_t count,
+ struct file *file)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct address_space *mapping = inode->i_mapping;
+ loff_t end = pos + count;
+ bool extend_init = file && pos > ni->i_valid;
+ int err;
+
+ if (end <= inode->i_size && !extend_init)
+ return 0;
+
+ /* Mark rw ntfs as dirty. It will be cleared at umount. */
+ ntfs_set_state(ni->mi.sbi, NTFS_DIRTY_DIRTY);
+
+ if (end > inode->i_size) {
+ err = ntfs_set_size(inode, end);
+ if (err)
+ goto out;
+ inode->i_size = end;
+ }
+
+ if (extend_init && !is_compressed(ni)) {
+ err = ntfs_extend_initialized_size(file, ni, ni->i_valid, pos);
+ if (err)
+ goto out;
+ } else {
+ err = 0;
+ }
+
+ inode->i_ctime = inode->i_mtime = current_time(inode);
+ mark_inode_dirty(inode);
+
+ if (IS_SYNC(inode)) {
+ int err2;
+
+ err = filemap_fdatawrite_range(mapping, pos, end - 1);
+ err2 = sync_mapping_buffers(mapping);
+ if (!err)
+ err = err2;
+ err2 = write_inode_now(inode, 1);
+ if (!err)
+ err = err2;
+ if (!err)
+ err = filemap_fdatawait_range(mapping, pos, end - 1);
+ }
+
+out:
+ return err;
+}
+
+static int ntfs_truncate(struct inode *inode, loff_t new_size)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ int err, dirty = 0;
+ u64 new_valid;
+
+ if (!S_ISREG(inode->i_mode))
+ return 0;
+
+ if (is_compressed(ni)) {
+ if (ni->i_valid > new_size)
+ ni->i_valid = new_size;
+ } else {
+ err = block_truncate_page(inode->i_mapping, new_size,
+ ntfs_get_block);
+ if (err)
+ return err;
+ }
+
+ new_valid = ntfs_up_block(sb, min_t(u64, ni->i_valid, new_size));
+
+ ni_lock(ni);
+
+ truncate_setsize(inode, new_size);
+
+ down_write(&ni->file.run_lock);
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size,
+ &new_valid, true, NULL);
+ up_write(&ni->file.run_lock);
+
+ if (new_valid < ni->i_valid)
+ ni->i_valid = new_valid;
+
+ ni_unlock(ni);
+
+ ni->std_fa |= FILE_ATTRIBUTE_ARCHIVE;
+ inode->i_ctime = inode->i_mtime = current_time(inode);
+ if (!IS_DIRSYNC(inode)) {
+ dirty = 1;
+ } else {
+ err = ntfs_sync_inode(inode);
+ if (err)
+ return err;
+ }
+
+ if (dirty)
+ mark_inode_dirty(inode);
+
+ /*ntfs_flush_inodes(inode->i_sb, inode, NULL);*/
+
+ return 0;
+}
+
+/*
+ * ntfs_fallocate
+ *
+ * Preallocate space for a file. This implements ntfs's fallocate file
+ * operation, which gets called from sys_fallocate system call. User
+ * space requests 'len' bytes at 'vbo'. If FALLOC_FL_KEEP_SIZE is set
+ * we just allocate clusters without zeroing them out. Otherwise we
+ * allocate and zero out clusters via an expanding truncate.
+ */
+static long ntfs_fallocate(struct file *file, int mode, loff_t vbo, loff_t len)
+{
+ struct inode *inode = file->f_mapping->host;
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ loff_t end = vbo + len;
+ loff_t vbo_down = round_down(vbo, PAGE_SIZE);
+ loff_t i_size;
+ int err;
+
+ /* No support for dir. */
+ if (!S_ISREG(inode->i_mode))
+ return -EOPNOTSUPP;
+
+ /* Return error if mode is not supported. */
+ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
+ FALLOC_FL_COLLAPSE_RANGE)) {
+ ntfs_inode_warn(inode, "fallocate(0x%x) is not supported",
+ mode);
+ return -EOPNOTSUPP;
+ }
+
+ ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+ inode_lock(inode);
+ i_size = inode->i_size;
+
+ if (WARN_ON(ni->ni_flags & NI_FLAG_COMPRESSED_MASK)) {
+ /* Should never be here, see ntfs_file_open. */
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+
+ if (mode & FALLOC_FL_PUNCH_HOLE) {
+ u32 frame_size;
+ loff_t mask, vbo_a, end_a, tmp;
+
+ if (!(mode & FALLOC_FL_KEEP_SIZE)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = filemap_write_and_wait_range(inode->i_mapping, vbo,
+ end - 1);
+ if (err)
+ goto out;
+
+ err = filemap_write_and_wait_range(inode->i_mapping, end,
+ LLONG_MAX);
+ if (err)
+ goto out;
+
+ inode_dio_wait(inode);
+
+ truncate_pagecache(inode, vbo_down);
+
+ if (!is_sparsed(ni) && !is_compressed(ni)) {
+ /* Normal file. */
+ err = ntfs_zero_range(inode, vbo, end);
+ goto out;
+ }
+
+ ni_lock(ni);
+ err = attr_punch_hole(ni, vbo, len, &frame_size);
+ ni_unlock(ni);
+ if (err != E_NTFS_NOTALIGNED)
+ goto out;
+
+ /* Process not aligned punch. */
+ mask = frame_size - 1;
+ vbo_a = (vbo + mask) & ~mask;
+ end_a = end & ~mask;
+
+ tmp = min(vbo_a, end);
+ if (tmp > vbo) {
+ err = ntfs_zero_range(inode, vbo, tmp);
+ if (err)
+ goto out;
+ }
+
+ if (vbo < end_a && end_a < end) {
+ err = ntfs_zero_range(inode, end_a, end);
+ if (err)
+ goto out;
+ }
+
+ /* Aligned punch_hole */
+ if (end_a > vbo_a) {
+ ni_lock(ni);
+ err = attr_punch_hole(ni, vbo_a, end_a - vbo_a, NULL);
+ ni_unlock(ni);
+ }
+ } else if (mode & FALLOC_FL_COLLAPSE_RANGE) {
+ if (mode & ~FALLOC_FL_COLLAPSE_RANGE) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Write tail of the last page before removed range since
+ * it will get removed from the page cache below.
+ */
+ err = filemap_write_and_wait_range(inode->i_mapping, vbo_down,
+ vbo);
+ if (err)
+ goto out;
+
+ /*
+ * Write data that will be shifted to preserve them
+ * when discarding page cache below.
+ */
+ err = filemap_write_and_wait_range(inode->i_mapping, end,
+ LLONG_MAX);
+ if (err)
+ goto out;
+
+ /* Wait for existing dio to complete. */
+ inode_dio_wait(inode);
+
+ truncate_pagecache(inode, vbo_down);
+
+ ni_lock(ni);
+ err = attr_collapse_range(ni, vbo, len);
+ ni_unlock(ni);
+ } else {
+ /*
+ * Normal file: Allocate clusters, do not change 'valid' size.
+ */
+ err = ntfs_set_size(inode, max(end, i_size));
+ if (err)
+ goto out;
+
+ if (is_sparsed(ni) || is_compressed(ni)) {
+ CLST vcn_v = ni->i_valid >> sbi->cluster_bits;
+ CLST vcn = vbo >> sbi->cluster_bits;
+ CLST cend = bytes_to_cluster(sbi, end);
+ CLST lcn, clen;
+ bool new;
+
+ /*
+ * Allocate but do not zero new clusters. (see below comments)
+ * This breaks security: One can read unused on-disk areas.
+ * Zeroing these clusters may be too long.
+ * Maybe we should check here for root rights?
+ */
+ for (; vcn < cend; vcn += clen) {
+ err = attr_data_get_block(ni, vcn, cend - vcn,
+ &lcn, &clen, &new);
+ if (err)
+ goto out;
+ if (!new || vcn >= vcn_v)
+ continue;
+
+ /*
+ * Unwritten area.
+ * NTFS is not able to store several unwritten areas.
+ * Activate 'ntfs_sparse_cluster' to zero new allocated clusters.
+ *
+ * Dangerous in case:
+ * 1G of sparsed clusters + 1 cluster of data =>
+ * valid_size == 1G + 1 cluster
+ * fallocate(1G) will zero 1G and this can be very long
+ * xfstest 016/086 will fail without 'ntfs_sparse_cluster'.
+ */
+ ntfs_sparse_cluster(inode, NULL, vcn,
+ min(vcn_v - vcn, clen));
+ }
+ }
+
+ if (mode & FALLOC_FL_KEEP_SIZE) {
+ ni_lock(ni);
+ /* True - Keep preallocated. */
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0,
+ &ni->file.run, i_size, &ni->i_valid,
+ true, NULL);
+ ni_unlock(ni);
+ }
+ }
+
+out:
+ if (err == -EFBIG)
+ err = -ENOSPC;
+
+ if (!err) {
+ inode->i_ctime = inode->i_mtime = current_time(inode);
+ mark_inode_dirty(inode);
+ }
+
+ inode_unlock(inode);
+ return err;
+}
+
+/*
+ * ntfs3_setattr - inode_operations::setattr
+ */
+int ntfs3_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ struct iattr *attr)
+{
+ struct super_block *sb = dentry->d_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct inode *inode = d_inode(dentry);
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u32 ia_valid = attr->ia_valid;
+ umode_t mode = inode->i_mode;
+ int err;
+
+ if (sbi->options.no_acs_rules) {
+ /* "No access rules" - Force any changes of time etc. */
+ attr->ia_valid |= ATTR_FORCE;
+ /* and disable for editing some attributes. */
+ attr->ia_valid &= ~(ATTR_UID | ATTR_GID | ATTR_MODE);
+ ia_valid = attr->ia_valid;
+ }
+
+ err = setattr_prepare(mnt_userns, dentry, attr);
+ if (err)
+ goto out;
+
+ if (ia_valid & ATTR_SIZE) {
+ loff_t oldsize = inode->i_size;
+
+ if (WARN_ON(ni->ni_flags & NI_FLAG_COMPRESSED_MASK)) {
+ /* Should never be here, see ntfs_file_open(). */
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+ inode_dio_wait(inode);
+
+ if (attr->ia_size < oldsize)
+ err = ntfs_truncate(inode, attr->ia_size);
+ else if (attr->ia_size > oldsize)
+ err = ntfs_extend(inode, attr->ia_size, 0, NULL);
+
+ if (err)
+ goto out;
+
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ }
+
+ setattr_copy(mnt_userns, inode, attr);
+
+ if (mode != inode->i_mode) {
+ err = ntfs_acl_chmod(mnt_userns, inode);
+ if (err)
+ goto out;
+
+ /* Linux 'w' -> Windows 'ro'. */
+ if (0222 & inode->i_mode)
+ ni->std_fa &= ~FILE_ATTRIBUTE_READONLY;
+ else
+ ni->std_fa |= FILE_ATTRIBUTE_READONLY;
+ }
+
+ if (ia_valid & (ATTR_UID | ATTR_GID | ATTR_MODE))
+ ntfs_save_wsl_perm(inode);
+ mark_inode_dirty(inode);
+out:
+ return err;
+}
+
+static ssize_t ntfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
+{
+ struct file *file = iocb->ki_filp;
+ struct inode *inode = file->f_mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ if (is_encrypted(ni)) {
+ ntfs_inode_warn(inode, "encrypted i/o not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (is_compressed(ni) && (iocb->ki_flags & IOCB_DIRECT)) {
+ ntfs_inode_warn(inode, "direct i/o + compressed not supported");
+ return -EOPNOTSUPP;
+ }
+
+#ifndef CONFIG_NTFS3_LZX_XPRESS
+ if (ni->ni_flags & NI_FLAG_COMPRESSED_MASK) {
+ ntfs_inode_warn(
+ inode,
+ "activate CONFIG_NTFS3_LZX_XPRESS to read external compressed files");
+ return -EOPNOTSUPP;
+ }
+#endif
+
+ if (is_dedup(ni)) {
+ ntfs_inode_warn(inode, "read deduplicated not supported");
+ return -EOPNOTSUPP;
+ }
+
+ return generic_file_read_iter(iocb, iter);
+}
+
+/*
+ * ntfs_get_frame_pages
+ *
+ * Return: Array of locked pages.
+ */
+static int ntfs_get_frame_pages(struct address_space *mapping, pgoff_t index,
+ struct page **pages, u32 pages_per_frame,
+ bool *frame_uptodate)
+{
+ gfp_t gfp_mask = mapping_gfp_mask(mapping);
+ u32 npages;
+
+ *frame_uptodate = true;
+
+ for (npages = 0; npages < pages_per_frame; npages++, index++) {
+ struct page *page;
+
+ page = find_or_create_page(mapping, index, gfp_mask);
+ if (!page) {
+ while (npages--) {
+ page = pages[npages];
+ unlock_page(page);
+ put_page(page);
+ }
+
+ return -ENOMEM;
+ }
+
+ if (!PageUptodate(page))
+ *frame_uptodate = false;
+
+ pages[npages] = page;
+ }
+
+ return 0;
+}
+
+/*
+ * ntfs_compress_write - Helper for ntfs_file_write_iter() (compressed files).
+ */
+static ssize_t ntfs_compress_write(struct kiocb *iocb, struct iov_iter *from)
+{
+ int err;
+ struct file *file = iocb->ki_filp;
+ size_t count = iov_iter_count(from);
+ loff_t pos = iocb->ki_pos;
+ struct inode *inode = file_inode(file);
+ loff_t i_size = inode->i_size;
+ struct address_space *mapping = inode->i_mapping;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u64 valid = ni->i_valid;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct page *page, **pages = NULL;
+ size_t written = 0;
+ u8 frame_bits = NTFS_LZNT_CUNIT + sbi->cluster_bits;
+ u32 frame_size = 1u << frame_bits;
+ u32 pages_per_frame = frame_size >> PAGE_SHIFT;
+ u32 ip, off;
+ CLST frame;
+ u64 frame_vbo;
+ pgoff_t index;
+ bool frame_uptodate;
+
+ if (frame_size < PAGE_SIZE) {
+ /*
+ * frame_size == 8K if cluster 512
+ * frame_size == 64K if cluster 4096
+ */
+ ntfs_inode_warn(inode, "page size is bigger than frame size");
+ return -EOPNOTSUPP;
+ }
+
+ pages = kmalloc_array(pages_per_frame, sizeof(struct page *), GFP_NOFS);
+ if (!pages)
+ return -ENOMEM;
+
+ current->backing_dev_info = inode_to_bdi(inode);
+ err = file_remove_privs(file);
+ if (err)
+ goto out;
+
+ err = file_update_time(file);
+ if (err)
+ goto out;
+
+ /* Zero range [valid : pos). */
+ while (valid < pos) {
+ CLST lcn, clen;
+
+ frame = valid >> frame_bits;
+ frame_vbo = valid & ~(frame_size - 1);
+ off = valid & (frame_size - 1);
+
+ err = attr_data_get_block(ni, frame << NTFS_LZNT_CUNIT, 0, &lcn,
+ &clen, NULL);
+ if (err)
+ goto out;
+
+ if (lcn == SPARSE_LCN) {
+ ni->i_valid = valid =
+ frame_vbo + ((u64)clen << sbi->cluster_bits);
+ continue;
+ }
+
+ /* Load full frame. */
+ err = ntfs_get_frame_pages(mapping, frame_vbo >> PAGE_SHIFT,
+ pages, pages_per_frame,
+ &frame_uptodate);
+ if (err)
+ goto out;
+
+ if (!frame_uptodate && off) {
+ err = ni_read_frame(ni, frame_vbo, pages,
+ pages_per_frame);
+ if (err) {
+ for (ip = 0; ip < pages_per_frame; ip++) {
+ page = pages[ip];
+ unlock_page(page);
+ put_page(page);
+ }
+ goto out;
+ }
+ }
+
+ ip = off >> PAGE_SHIFT;
+ off = offset_in_page(valid);
+ for (; ip < pages_per_frame; ip++, off = 0) {
+ page = pages[ip];
+ zero_user_segment(page, off, PAGE_SIZE);
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ }
+
+ ni_lock(ni);
+ err = ni_write_frame(ni, pages, pages_per_frame);
+ ni_unlock(ni);
+
+ for (ip = 0; ip < pages_per_frame; ip++) {
+ page = pages[ip];
+ SetPageUptodate(page);
+ unlock_page(page);
+ put_page(page);
+ }
+
+ if (err)
+ goto out;
+
+ ni->i_valid = valid = frame_vbo + frame_size;
+ }
+
+ /* Copy user data [pos : pos + count). */
+ while (count) {
+ size_t copied, bytes;
+
+ off = pos & (frame_size - 1);
+ bytes = frame_size - off;
+ if (bytes > count)
+ bytes = count;
+
+ frame = pos >> frame_bits;
+ frame_vbo = pos & ~(frame_size - 1);
+ index = frame_vbo >> PAGE_SHIFT;
+
+ if (unlikely(iov_iter_fault_in_readable(from, bytes))) {
+ err = -EFAULT;
+ goto out;
+ }
+
+ /* Load full frame. */
+ err = ntfs_get_frame_pages(mapping, index, pages,
+ pages_per_frame, &frame_uptodate);
+ if (err)
+ goto out;
+
+ if (!frame_uptodate) {
+ loff_t to = pos + bytes;
+
+ if (off || (to < i_size && (to & (frame_size - 1)))) {
+ err = ni_read_frame(ni, frame_vbo, pages,
+ pages_per_frame);
+ if (err) {
+ for (ip = 0; ip < pages_per_frame;
+ ip++) {
+ page = pages[ip];
+ unlock_page(page);
+ put_page(page);
+ }
+ goto out;
+ }
+ }
+ }
+
+ WARN_ON(!bytes);
+ copied = 0;
+ ip = off >> PAGE_SHIFT;
+ off = offset_in_page(pos);
+
+ /* Copy user data to pages. */
+ for (;;) {
+ size_t cp, tail = PAGE_SIZE - off;
+
+ page = pages[ip];
+ cp = copy_page_from_iter_atomic(page, off,
+ min(tail, bytes), from);
+ flush_dcache_page(page);
+
+ copied += cp;
+ bytes -= cp;
+ if (!bytes || !cp)
+ break;
+
+ if (cp < tail) {
+ off += cp;
+ } else {
+ ip++;
+ off = 0;
+ }
+ }
+
+ ni_lock(ni);
+ err = ni_write_frame(ni, pages, pages_per_frame);
+ ni_unlock(ni);
+
+ for (ip = 0; ip < pages_per_frame; ip++) {
+ page = pages[ip];
+ ClearPageDirty(page);
+ SetPageUptodate(page);
+ unlock_page(page);
+ put_page(page);
+ }
+
+ if (err)
+ goto out;
+
+ /*
+ * We can loop for a long time in here. Be nice and allow
+ * us to schedule out to avoid softlocking if preempt
+ * is disabled.
+ */
+ cond_resched();
+
+ pos += copied;
+ written += copied;
+
+ count = iov_iter_count(from);
+ }
+
+out:
+ kfree(pages);
+
+ current->backing_dev_info = NULL;
+
+ if (err < 0)
+ return err;
+
+ iocb->ki_pos += written;
+ if (iocb->ki_pos > ni->i_valid)
+ ni->i_valid = iocb->ki_pos;
+
+ return written;
+}
+
+/*
+ * ntfs_file_write_iter - file_operations::write_iter
+ */
+static ssize_t ntfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+ struct file *file = iocb->ki_filp;
+ struct address_space *mapping = file->f_mapping;
+ struct inode *inode = mapping->host;
+ ssize_t ret;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ if (is_encrypted(ni)) {
+ ntfs_inode_warn(inode, "encrypted i/o not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (is_compressed(ni) && (iocb->ki_flags & IOCB_DIRECT)) {
+ ntfs_inode_warn(inode, "direct i/o + compressed not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (is_dedup(ni)) {
+ ntfs_inode_warn(inode, "write into deduplicated not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (!inode_trylock(inode)) {
+ if (iocb->ki_flags & IOCB_NOWAIT)
+ return -EAGAIN;
+ inode_lock(inode);
+ }
+
+ ret = generic_write_checks(iocb, from);
+ if (ret <= 0)
+ goto out;
+
+ if (WARN_ON(ni->ni_flags & NI_FLAG_COMPRESSED_MASK)) {
+ /* Should never be here, see ntfs_file_open(). */
+ ret = -EOPNOTSUPP;
+ goto out;
+ }
+
+ ret = ntfs_extend(inode, iocb->ki_pos, ret, file);
+ if (ret)
+ goto out;
+
+ ret = is_compressed(ni) ? ntfs_compress_write(iocb, from)
+ : __generic_file_write_iter(iocb, from);
+
+out:
+ inode_unlock(inode);
+
+ if (ret > 0)
+ ret = generic_write_sync(iocb, ret);
+
+ return ret;
+}
+
+/*
+ * ntfs_file_open - file_operations::open
+ */
+int ntfs_file_open(struct inode *inode, struct file *file)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ if (unlikely((is_compressed(ni) || is_encrypted(ni)) &&
+ (file->f_flags & O_DIRECT))) {
+ return -EOPNOTSUPP;
+ }
+
+ /* Decompress "external compressed" file if opened for rw. */
+ if ((ni->ni_flags & NI_FLAG_COMPRESSED_MASK) &&
+ (file->f_flags & (O_WRONLY | O_RDWR | O_TRUNC))) {
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ int err = ni_decompress_file(ni);
+
+ if (err)
+ return err;
+#else
+ ntfs_inode_warn(
+ inode,
+ "activate CONFIG_NTFS3_LZX_XPRESS to write external compressed files");
+ return -EOPNOTSUPP;
+#endif
+ }
+
+ return generic_file_open(inode, file);
+}
+
+/*
+ * ntfs_file_release - file_operations::release
+ */
+static int ntfs_file_release(struct inode *inode, struct file *file)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ int err = 0;
+
+ /* If we are last writer on the inode, drop the block reservation. */
+ if (sbi->options.prealloc && ((file->f_mode & FMODE_WRITE) &&
+ atomic_read(&inode->i_writecount) == 1)) {
+ ni_lock(ni);
+ down_write(&ni->file.run_lock);
+
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run,
+ inode->i_size, &ni->i_valid, false, NULL);
+
+ up_write(&ni->file.run_lock);
+ ni_unlock(ni);
+ }
+ return err;
+}
+
+/*
+ * ntfs_fiemap - file_operations::fiemap
+ */
+int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+ __u64 start, __u64 len)
+{
+ int err;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ err = fiemap_prep(inode, fieinfo, start, &len, ~FIEMAP_FLAG_XATTR);
+ if (err)
+ return err;
+
+ ni_lock(ni);
+
+ err = ni_fiemap(ni, fieinfo, start, len);
+
+ ni_unlock(ni);
+
+ return err;
+}
+
+// clang-format off
+const struct inode_operations ntfs_file_inode_operations = {
+ .getattr = ntfs_getattr,
+ .setattr = ntfs3_setattr,
+ .listxattr = ntfs_listxattr,
+ .permission = ntfs_permission,
+ .get_acl = ntfs_get_acl,
+ .set_acl = ntfs_set_acl,
+ .fiemap = ntfs_fiemap,
+};
+
+const struct file_operations ntfs_file_operations = {
+ .llseek = generic_file_llseek,
+ .read_iter = ntfs_file_read_iter,
+ .write_iter = ntfs_file_write_iter,
+ .unlocked_ioctl = ntfs_ioctl,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = ntfs_compat_ioctl,
+#endif
+ .splice_read = generic_file_splice_read,
+ .mmap = ntfs_file_mmap,
+ .open = ntfs_file_open,
+ .fsync = generic_file_fsync,
+ .splice_write = iter_file_splice_write,
+ .fallocate = ntfs_fallocate,
+ .release = ntfs_file_release,
+};
+// clang-format on
diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
new file mode 100644
index 000000000000..938b12d56ca6
--- /dev/null
+++ b/fs/ntfs3/frecord.c
@@ -0,0 +1,3257 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fiemap.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+#include <linux/vmalloc.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+#include "lib/lib.h"
+#endif
+
+static struct mft_inode *ni_ins_mi(struct ntfs_inode *ni, struct rb_root *tree,
+ CLST ino, struct rb_node *ins)
+{
+ struct rb_node **p = &tree->rb_node;
+ struct rb_node *pr = NULL;
+
+ while (*p) {
+ struct mft_inode *mi;
+
+ pr = *p;
+ mi = rb_entry(pr, struct mft_inode, node);
+ if (mi->rno > ino)
+ p = &pr->rb_left;
+ else if (mi->rno < ino)
+ p = &pr->rb_right;
+ else
+ return mi;
+ }
+
+ if (!ins)
+ return NULL;
+
+ rb_link_node(ins, pr, p);
+ rb_insert_color(ins, tree);
+ return rb_entry(ins, struct mft_inode, node);
+}
+
+/*
+ * ni_find_mi - Find mft_inode by record number.
+ */
+static struct mft_inode *ni_find_mi(struct ntfs_inode *ni, CLST rno)
+{
+ return ni_ins_mi(ni, &ni->mi_tree, rno, NULL);
+}
+
+/*
+ * ni_add_mi - Add new mft_inode into ntfs_inode.
+ */
+static void ni_add_mi(struct ntfs_inode *ni, struct mft_inode *mi)
+{
+ ni_ins_mi(ni, &ni->mi_tree, mi->rno, &mi->node);
+}
+
+/*
+ * ni_remove_mi - Remove mft_inode from ntfs_inode.
+ */
+void ni_remove_mi(struct ntfs_inode *ni, struct mft_inode *mi)
+{
+ rb_erase(&mi->node, &ni->mi_tree);
+}
+
+/*
+ * ni_std - Return: Pointer into std_info from primary record.
+ */
+struct ATTR_STD_INFO *ni_std(struct ntfs_inode *ni)
+{
+ const struct ATTRIB *attr;
+
+ attr = mi_find_attr(&ni->mi, NULL, ATTR_STD, NULL, 0, NULL);
+ return attr ? resident_data_ex(attr, sizeof(struct ATTR_STD_INFO))
+ : NULL;
+}
+
+/*
+ * ni_std5
+ *
+ * Return: Pointer into std_info from primary record.
+ */
+struct ATTR_STD_INFO5 *ni_std5(struct ntfs_inode *ni)
+{
+ const struct ATTRIB *attr;
+
+ attr = mi_find_attr(&ni->mi, NULL, ATTR_STD, NULL, 0, NULL);
+
+ return attr ? resident_data_ex(attr, sizeof(struct ATTR_STD_INFO5))
+ : NULL;
+}
+
+/*
+ * ni_clear - Clear resources allocated by ntfs_inode.
+ */
+void ni_clear(struct ntfs_inode *ni)
+{
+ struct rb_node *node;
+
+ if (!ni->vfs_inode.i_nlink && is_rec_inuse(ni->mi.mrec))
+ ni_delete_all(ni);
+
+ al_destroy(ni);
+
+ for (node = rb_first(&ni->mi_tree); node;) {
+ struct rb_node *next = rb_next(node);
+ struct mft_inode *mi = rb_entry(node, struct mft_inode, node);
+
+ rb_erase(node, &ni->mi_tree);
+ mi_put(mi);
+ node = next;
+ }
+
+ /* Bad inode always has mode == S_IFREG. */
+ if (ni->ni_flags & NI_FLAG_DIR)
+ indx_clear(&ni->dir);
+ else {
+ run_close(&ni->file.run);
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ if (ni->file.offs_page) {
+ /* On-demand allocated page for offsets. */
+ put_page(ni->file.offs_page);
+ ni->file.offs_page = NULL;
+ }
+#endif
+ }
+
+ mi_clear(&ni->mi);
+}
+
+/*
+ * ni_load_mi_ex - Find mft_inode by record number.
+ */
+int ni_load_mi_ex(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi)
+{
+ int err;
+ struct mft_inode *r;
+
+ r = ni_find_mi(ni, rno);
+ if (r)
+ goto out;
+
+ err = mi_get(ni->mi.sbi, rno, &r);
+ if (err)
+ return err;
+
+ ni_add_mi(ni, r);
+
+out:
+ if (mi)
+ *mi = r;
+ return 0;
+}
+
+/*
+ * ni_load_mi - Load mft_inode corresponded list_entry.
+ */
+int ni_load_mi(struct ntfs_inode *ni, const struct ATTR_LIST_ENTRY *le,
+ struct mft_inode **mi)
+{
+ CLST rno;
+
+ if (!le) {
+ *mi = &ni->mi;
+ return 0;
+ }
+
+ rno = ino_get(&le->ref);
+ if (rno == ni->mi.rno) {
+ *mi = &ni->mi;
+ return 0;
+ }
+ return ni_load_mi_ex(ni, rno, mi);
+}
+
+/*
+ * ni_find_attr
+ *
+ * Return: Attribute and record this attribute belongs to.
+ */
+struct ATTRIB *ni_find_attr(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY **le_o, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, const CLST *vcn,
+ struct mft_inode **mi)
+{
+ struct ATTR_LIST_ENTRY *le;
+ struct mft_inode *m;
+
+ if (!ni->attr_list.size ||
+ (!name_len && (type == ATTR_LIST || type == ATTR_STD))) {
+ if (le_o)
+ *le_o = NULL;
+ if (mi)
+ *mi = &ni->mi;
+
+ /* Look for required attribute in primary record. */
+ return mi_find_attr(&ni->mi, attr, type, name, name_len, NULL);
+ }
+
+ /* First look for list entry of required type. */
+ le = al_find_ex(ni, le_o ? *le_o : NULL, type, name, name_len, vcn);
+ if (!le)
+ return NULL;
+
+ if (le_o)
+ *le_o = le;
+
+ /* Load record that contains this attribute. */
+ if (ni_load_mi(ni, le, &m))
+ return NULL;
+
+ /* Look for required attribute. */
+ attr = mi_find_attr(m, NULL, type, name, name_len, &le->id);
+
+ if (!attr)
+ goto out;
+
+ if (!attr->non_res) {
+ if (vcn && *vcn)
+ goto out;
+ } else if (!vcn) {
+ if (attr->nres.svcn)
+ goto out;
+ } else if (le64_to_cpu(attr->nres.svcn) > *vcn ||
+ *vcn > le64_to_cpu(attr->nres.evcn)) {
+ goto out;
+ }
+
+ if (mi)
+ *mi = m;
+ return attr;
+
+out:
+ ntfs_set_state(ni->mi.sbi, NTFS_DIRTY_ERROR);
+ return NULL;
+}
+
+/*
+ * ni_enum_attr_ex - Enumerates attributes in ntfs_inode.
+ */
+struct ATTRIB *ni_enum_attr_ex(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY **le,
+ struct mft_inode **mi)
+{
+ struct mft_inode *mi2;
+ struct ATTR_LIST_ENTRY *le2;
+
+ /* Do we have an attribute list? */
+ if (!ni->attr_list.size) {
+ *le = NULL;
+ if (mi)
+ *mi = &ni->mi;
+ /* Enum attributes in primary record. */
+ return mi_enum_attr(&ni->mi, attr);
+ }
+
+ /* Get next list entry. */
+ le2 = *le = al_enumerate(ni, attr ? *le : NULL);
+ if (!le2)
+ return NULL;
+
+ /* Load record that contains the required attribute. */
+ if (ni_load_mi(ni, le2, &mi2))
+ return NULL;
+
+ if (mi)
+ *mi = mi2;
+
+ /* Find attribute in loaded record. */
+ return rec_find_attr_le(mi2, le2);
+}
+
+/*
+ * ni_load_attr - Load attribute that contains given VCN.
+ */
+struct ATTRIB *ni_load_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, CLST vcn,
+ struct mft_inode **pmi)
+{
+ struct ATTR_LIST_ENTRY *le;
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ struct ATTR_LIST_ENTRY *next;
+
+ if (!ni->attr_list.size) {
+ if (pmi)
+ *pmi = &ni->mi;
+ return mi_find_attr(&ni->mi, NULL, type, name, name_len, NULL);
+ }
+
+ le = al_find_ex(ni, NULL, type, name, name_len, NULL);
+ if (!le)
+ return NULL;
+
+ /*
+ * Unfortunately ATTR_LIST_ENTRY contains only start VCN.
+ * So to find the ATTRIB segment that contains 'vcn' we should
+ * enumerate some entries.
+ */
+ if (vcn) {
+ for (;; le = next) {
+ next = al_find_ex(ni, le, type, name, name_len, NULL);
+ if (!next || le64_to_cpu(next->vcn) > vcn)
+ break;
+ }
+ }
+
+ if (ni_load_mi(ni, le, &mi))
+ return NULL;
+
+ if (pmi)
+ *pmi = mi;
+
+ attr = mi_find_attr(mi, NULL, type, name, name_len, &le->id);
+ if (!attr)
+ return NULL;
+
+ if (!attr->non_res)
+ return attr;
+
+ if (le64_to_cpu(attr->nres.svcn) <= vcn &&
+ vcn <= le64_to_cpu(attr->nres.evcn))
+ return attr;
+
+ return NULL;
+}
+
+/*
+ * ni_load_all_mi - Load all subrecords.
+ */
+int ni_load_all_mi(struct ntfs_inode *ni)
+{
+ int err;
+ struct ATTR_LIST_ENTRY *le;
+
+ if (!ni->attr_list.size)
+ return 0;
+
+ le = NULL;
+
+ while ((le = al_enumerate(ni, le))) {
+ CLST rno = ino_get(&le->ref);
+
+ if (rno == ni->mi.rno)
+ continue;
+
+ err = ni_load_mi_ex(ni, rno, NULL);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+/*
+ * ni_add_subrecord - Allocate + format + attach a new subrecord.
+ */
+bool ni_add_subrecord(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi)
+{
+ struct mft_inode *m;
+
+ m = kzalloc(sizeof(struct mft_inode), GFP_NOFS);
+ if (!m)
+ return false;
+
+ if (mi_format_new(m, ni->mi.sbi, rno, 0, ni->mi.rno == MFT_REC_MFT)) {
+ mi_put(m);
+ return false;
+ }
+
+ mi_get_ref(&ni->mi, &m->mrec->parent_ref);
+
+ ni_add_mi(ni, m);
+ *mi = m;
+ return true;
+}
+
+/*
+ * ni_remove_attr - Remove all attributes for the given type/name/id.
+ */
+int ni_remove_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, size_t name_len, bool base_only,
+ const __le16 *id)
+{
+ int err;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ struct mft_inode *mi;
+ u32 type_in;
+ int diff;
+
+ if (base_only || type == ATTR_LIST || !ni->attr_list.size) {
+ attr = mi_find_attr(&ni->mi, NULL, type, name, name_len, id);
+ if (!attr)
+ return -ENOENT;
+
+ mi_remove_attr(ni, &ni->mi, attr);
+ return 0;
+ }
+
+ type_in = le32_to_cpu(type);
+ le = NULL;
+
+ for (;;) {
+ le = al_enumerate(ni, le);
+ if (!le)
+ return 0;
+
+next_le2:
+ diff = le32_to_cpu(le->type) - type_in;
+ if (diff < 0)
+ continue;
+
+ if (diff > 0)
+ return 0;
+
+ if (le->name_len != name_len)
+ continue;
+
+ if (name_len &&
+ memcmp(le_name(le), name, name_len * sizeof(short)))
+ continue;
+
+ if (id && le->id != *id)
+ continue;
+ err = ni_load_mi(ni, le, &mi);
+ if (err)
+ return err;
+
+ al_remove_le(ni, le);
+
+ attr = mi_find_attr(mi, NULL, type, name, name_len, id);
+ if (!attr)
+ return -ENOENT;
+
+ mi_remove_attr(ni, mi, attr);
+
+ if (PtrOffset(ni->attr_list.le, le) >= ni->attr_list.size)
+ return 0;
+ goto next_le2;
+ }
+}
+
+/*
+ * ni_ins_new_attr - Insert the attribute into record.
+ *
+ * Return: Not full constructed attribute or NULL if not possible to create.
+ */
+static struct ATTRIB *
+ni_ins_new_attr(struct ntfs_inode *ni, struct mft_inode *mi,
+ struct ATTR_LIST_ENTRY *le, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, u32 asize, u16 name_off,
+ CLST svcn, struct ATTR_LIST_ENTRY **ins_le)
+{
+ int err;
+ struct ATTRIB *attr;
+ bool le_added = false;
+ struct MFT_REF ref;
+
+ mi_get_ref(mi, &ref);
+
+ if (type != ATTR_LIST && !le && ni->attr_list.size) {
+ err = al_add_le(ni, type, name, name_len, svcn, cpu_to_le16(-1),
+ &ref, &le);
+ if (err) {
+ /* No memory or no space. */
+ return NULL;
+ }
+ le_added = true;
+
+ /*
+ * al_add_le -> attr_set_size (list) -> ni_expand_list
+ * which moves some attributes out of primary record
+ * this means that name may point into moved memory
+ * reinit 'name' from le.
+ */
+ name = le->name;
+ }
+
+ attr = mi_insert_attr(mi, type, name, name_len, asize, name_off);
+ if (!attr) {
+ if (le_added)
+ al_remove_le(ni, le);
+ return NULL;
+ }
+
+ if (type == ATTR_LIST) {
+ /* Attr list is not in list entry array. */
+ goto out;
+ }
+
+ if (!le)
+ goto out;
+
+ /* Update ATTRIB Id and record reference. */
+ le->id = attr->id;
+ ni->attr_list.dirty = true;
+ le->ref = ref;
+
+out:
+ if (ins_le)
+ *ins_le = le;
+ return attr;
+}
+
+/*
+ * ni_repack
+ *
+ * Random write access to sparsed or compressed file may result to
+ * not optimized packed runs.
+ * Here is the place to optimize it.
+ */
+static int ni_repack(struct ntfs_inode *ni)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct mft_inode *mi, *mi_p = NULL;
+ struct ATTRIB *attr = NULL, *attr_p;
+ struct ATTR_LIST_ENTRY *le = NULL, *le_p;
+ CLST alloc = 0;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST svcn, evcn = 0, svcn_p, evcn_p, next_svcn;
+ u32 roff, rs = sbi->record_size;
+ struct runs_tree run;
+
+ run_init(&run);
+
+ while ((attr = ni_enum_attr_ex(ni, attr, &le, &mi))) {
+ if (!attr->non_res)
+ continue;
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ if (svcn != le64_to_cpu(le->vcn)) {
+ err = -EINVAL;
+ break;
+ }
+
+ if (!svcn) {
+ alloc = le64_to_cpu(attr->nres.alloc_size) >>
+ cluster_bits;
+ mi_p = NULL;
+ } else if (svcn != evcn + 1) {
+ err = -EINVAL;
+ break;
+ }
+
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (svcn > evcn + 1) {
+ err = -EINVAL;
+ break;
+ }
+
+ if (!mi_p) {
+ /* Do not try if not enogh free space. */
+ if (le32_to_cpu(mi->mrec->used) + 8 >= rs)
+ continue;
+
+ /* Do not try if last attribute segment. */
+ if (evcn + 1 == alloc)
+ continue;
+ run_close(&run);
+ }
+
+ roff = le16_to_cpu(attr->nres.run_off);
+ err = run_unpack(&run, sbi, ni->mi.rno, svcn, evcn, svcn,
+ Add2Ptr(attr, roff),
+ le32_to_cpu(attr->size) - roff);
+ if (err < 0)
+ break;
+
+ if (!mi_p) {
+ mi_p = mi;
+ attr_p = attr;
+ svcn_p = svcn;
+ evcn_p = evcn;
+ le_p = le;
+ err = 0;
+ continue;
+ }
+
+ /*
+ * Run contains data from two records: mi_p and mi
+ * Try to pack in one.
+ */
+ err = mi_pack_runs(mi_p, attr_p, &run, evcn + 1 - svcn_p);
+ if (err)
+ break;
+
+ next_svcn = le64_to_cpu(attr_p->nres.evcn) + 1;
+
+ if (next_svcn >= evcn + 1) {
+ /* We can remove this attribute segment. */
+ al_remove_le(ni, le);
+ mi_remove_attr(NULL, mi, attr);
+ le = le_p;
+ continue;
+ }
+
+ attr->nres.svcn = le->vcn = cpu_to_le64(next_svcn);
+ mi->dirty = true;
+ ni->attr_list.dirty = true;
+
+ if (evcn + 1 == alloc) {
+ err = mi_pack_runs(mi, attr, &run,
+ evcn + 1 - next_svcn);
+ if (err)
+ break;
+ mi_p = NULL;
+ } else {
+ mi_p = mi;
+ attr_p = attr;
+ svcn_p = next_svcn;
+ evcn_p = evcn;
+ le_p = le;
+ run_truncate_head(&run, next_svcn);
+ }
+ }
+
+ if (err) {
+ ntfs_inode_warn(&ni->vfs_inode, "repack problem");
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
+ /* Pack loaded but not packed runs. */
+ if (mi_p)
+ mi_pack_runs(mi_p, attr_p, &run, evcn_p + 1 - svcn_p);
+ }
+
+ run_close(&run);
+ return err;
+}
+
+/*
+ * ni_try_remove_attr_list
+ *
+ * Can we remove attribute list?
+ * Check the case when primary record contains enough space for all attributes.
+ */
+static int ni_try_remove_attr_list(struct ntfs_inode *ni)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr, *attr_list, *attr_ins;
+ struct ATTR_LIST_ENTRY *le;
+ struct mft_inode *mi;
+ u32 asize, free;
+ struct MFT_REF ref;
+ __le16 id;
+
+ if (!ni->attr_list.dirty)
+ return 0;
+
+ err = ni_repack(ni);
+ if (err)
+ return err;
+
+ attr_list = mi_find_attr(&ni->mi, NULL, ATTR_LIST, NULL, 0, NULL);
+ if (!attr_list)
+ return 0;
+
+ asize = le32_to_cpu(attr_list->size);
+
+ /* Free space in primary record without attribute list. */
+ free = sbi->record_size - le32_to_cpu(ni->mi.mrec->used) + asize;
+ mi_get_ref(&ni->mi, &ref);
+
+ le = NULL;
+ while ((le = al_enumerate(ni, le))) {
+ if (!memcmp(&le->ref, &ref, sizeof(ref)))
+ continue;
+
+ if (le->vcn)
+ return 0;
+
+ mi = ni_find_mi(ni, ino_get(&le->ref));
+ if (!mi)
+ return 0;
+
+ attr = mi_find_attr(mi, NULL, le->type, le_name(le),
+ le->name_len, &le->id);
+ if (!attr)
+ return 0;
+
+ asize = le32_to_cpu(attr->size);
+ if (asize > free)
+ return 0;
+
+ free -= asize;
+ }
+
+ /* It seems that attribute list can be removed from primary record. */
+ mi_remove_attr(NULL, &ni->mi, attr_list);
+
+ /*
+ * Repeat the cycle above and move all attributes to primary record.
+ * It should be success!
+ */
+ le = NULL;
+ while ((le = al_enumerate(ni, le))) {
+ if (!memcmp(&le->ref, &ref, sizeof(ref)))
+ continue;
+
+ mi = ni_find_mi(ni, ino_get(&le->ref));
+
+ attr = mi_find_attr(mi, NULL, le->type, le_name(le),
+ le->name_len, &le->id);
+ asize = le32_to_cpu(attr->size);
+
+ /* Insert into primary record. */
+ attr_ins = mi_insert_attr(&ni->mi, le->type, le_name(le),
+ le->name_len, asize,
+ le16_to_cpu(attr->name_off));
+ id = attr_ins->id;
+
+ /* Copy all except id. */
+ memcpy(attr_ins, attr, asize);
+ attr_ins->id = id;
+
+ /* Remove from original record. */
+ mi_remove_attr(NULL, mi, attr);
+ }
+
+ run_deallocate(sbi, &ni->attr_list.run, true);
+ run_close(&ni->attr_list.run);
+ ni->attr_list.size = 0;
+ kfree(ni->attr_list.le);
+ ni->attr_list.le = NULL;
+ ni->attr_list.dirty = false;
+
+ return 0;
+}
+
+/*
+ * ni_create_attr_list - Generates an attribute list for this primary record.
+ */
+int ni_create_attr_list(struct ntfs_inode *ni)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ int err;
+ u32 lsize;
+ struct ATTRIB *attr;
+ struct ATTRIB *arr_move[7];
+ struct ATTR_LIST_ENTRY *le, *le_b[7];
+ struct MFT_REC *rec;
+ bool is_mft;
+ CLST rno = 0;
+ struct mft_inode *mi;
+ u32 free_b, nb, to_free, rs;
+ u16 sz;
+
+ is_mft = ni->mi.rno == MFT_REC_MFT;
+ rec = ni->mi.mrec;
+ rs = sbi->record_size;
+
+ /*
+ * Skip estimating exact memory requirement.
+ * Looks like one record_size is always enough.
+ */
+ le = kmalloc(al_aligned(rs), GFP_NOFS);
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ mi_get_ref(&ni->mi, &le->ref);
+ ni->attr_list.le = le;
+
+ attr = NULL;
+ nb = 0;
+ free_b = 0;
+ attr = NULL;
+
+ for (; (attr = mi_enum_attr(&ni->mi, attr)); le = Add2Ptr(le, sz)) {
+ sz = le_size(attr->name_len);
+ le->type = attr->type;
+ le->size = cpu_to_le16(sz);
+ le->name_len = attr->name_len;
+ le->name_off = offsetof(struct ATTR_LIST_ENTRY, name);
+ le->vcn = 0;
+ if (le != ni->attr_list.le)
+ le->ref = ni->attr_list.le->ref;
+ le->id = attr->id;
+
+ if (attr->name_len)
+ memcpy(le->name, attr_name(attr),
+ sizeof(short) * attr->name_len);
+ else if (attr->type == ATTR_STD)
+ continue;
+ else if (attr->type == ATTR_LIST)
+ continue;
+ else if (is_mft && attr->type == ATTR_DATA)
+ continue;
+
+ if (!nb || nb < ARRAY_SIZE(arr_move)) {
+ le_b[nb] = le;
+ arr_move[nb++] = attr;
+ free_b += le32_to_cpu(attr->size);
+ }
+ }
+
+ lsize = PtrOffset(ni->attr_list.le, le);
+ ni->attr_list.size = lsize;
+
+ to_free = le32_to_cpu(rec->used) + lsize + SIZEOF_RESIDENT;
+ if (to_free <= rs) {
+ to_free = 0;
+ } else {
+ to_free -= rs;
+
+ if (to_free > free_b) {
+ err = -EINVAL;
+ goto out1;
+ }
+ }
+
+ /* Allocate child MFT. */
+ err = ntfs_look_free_mft(sbi, &rno, is_mft, ni, &mi);
+ if (err)
+ goto out1;
+
+ /* Call mi_remove_attr() in reverse order to keep pointers 'arr_move' valid. */
+ while (to_free > 0) {
+ struct ATTRIB *b = arr_move[--nb];
+ u32 asize = le32_to_cpu(b->size);
+ u16 name_off = le16_to_cpu(b->name_off);
+
+ attr = mi_insert_attr(mi, b->type, Add2Ptr(b, name_off),
+ b->name_len, asize, name_off);
+ WARN_ON(!attr);
+
+ mi_get_ref(mi, &le_b[nb]->ref);
+ le_b[nb]->id = attr->id;
+
+ /* Copy all except id. */
+ memcpy(attr, b, asize);
+ attr->id = le_b[nb]->id;
+
+ /* Remove from primary record. */
+ WARN_ON(!mi_remove_attr(NULL, &ni->mi, b));
+
+ if (to_free <= asize)
+ break;
+ to_free -= asize;
+ WARN_ON(!nb);
+ }
+
+ attr = mi_insert_attr(&ni->mi, ATTR_LIST, NULL, 0,
+ lsize + SIZEOF_RESIDENT, SIZEOF_RESIDENT);
+ WARN_ON(!attr);
+
+ attr->non_res = 0;
+ attr->flags = 0;
+ attr->res.data_size = cpu_to_le32(lsize);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.flags = 0;
+ attr->res.res = 0;
+
+ memcpy(resident_data_ex(attr, lsize), ni->attr_list.le, lsize);
+
+ ni->attr_list.dirty = false;
+
+ mark_inode_dirty(&ni->vfs_inode);
+ goto out;
+
+out1:
+ kfree(ni->attr_list.le);
+ ni->attr_list.le = NULL;
+ ni->attr_list.size = 0;
+
+out:
+ return err;
+}
+
+/*
+ * ni_ins_attr_ext - Add an external attribute to the ntfs_inode.
+ */
+static int ni_ins_attr_ext(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le,
+ enum ATTR_TYPE type, const __le16 *name, u8 name_len,
+ u32 asize, CLST svcn, u16 name_off, bool force_ext,
+ struct ATTRIB **ins_attr, struct mft_inode **ins_mi,
+ struct ATTR_LIST_ENTRY **ins_le)
+{
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ CLST rno;
+ u64 vbo;
+ struct rb_node *node;
+ int err;
+ bool is_mft, is_mft_data;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+
+ is_mft = ni->mi.rno == MFT_REC_MFT;
+ is_mft_data = is_mft && type == ATTR_DATA && !name_len;
+
+ if (asize > sbi->max_bytes_per_attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Standard information and attr_list cannot be made external.
+ * The Log File cannot have any external attributes.
+ */
+ if (type == ATTR_STD || type == ATTR_LIST ||
+ ni->mi.rno == MFT_REC_LOG) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Create attribute list if it is not already existed. */
+ if (!ni->attr_list.size) {
+ err = ni_create_attr_list(ni);
+ if (err)
+ goto out;
+ }
+
+ vbo = is_mft_data ? ((u64)svcn << sbi->cluster_bits) : 0;
+
+ if (force_ext)
+ goto insert_ext;
+
+ /* Load all subrecords into memory. */
+ err = ni_load_all_mi(ni);
+ if (err)
+ goto out;
+
+ /* Check each of loaded subrecord. */
+ for (node = rb_first(&ni->mi_tree); node; node = rb_next(node)) {
+ mi = rb_entry(node, struct mft_inode, node);
+
+ if (is_mft_data &&
+ (mi_enum_attr(mi, NULL) ||
+ vbo <= ((u64)mi->rno << sbi->record_bits))) {
+ /* We can't accept this record 'cause MFT's bootstrapping. */
+ continue;
+ }
+ if (is_mft &&
+ mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0, NULL)) {
+ /*
+ * This child record already has a ATTR_DATA.
+ * So it can't accept any other records.
+ */
+ continue;
+ }
+
+ if ((type != ATTR_NAME || name_len) &&
+ mi_find_attr(mi, NULL, type, name, name_len, NULL)) {
+ /* Only indexed attributes can share same record. */
+ continue;
+ }
+
+ /* Try to insert attribute into this subrecord. */
+ attr = ni_ins_new_attr(ni, mi, le, type, name, name_len, asize,
+ name_off, svcn, ins_le);
+ if (!attr)
+ continue;
+
+ if (ins_attr)
+ *ins_attr = attr;
+ if (ins_mi)
+ *ins_mi = mi;
+ return 0;
+ }
+
+insert_ext:
+ /* We have to allocate a new child subrecord. */
+ err = ntfs_look_free_mft(sbi, &rno, is_mft_data, ni, &mi);
+ if (err)
+ goto out;
+
+ if (is_mft_data && vbo <= ((u64)rno << sbi->record_bits)) {
+ err = -EINVAL;
+ goto out1;
+ }
+
+ attr = ni_ins_new_attr(ni, mi, le, type, name, name_len, asize,
+ name_off, svcn, ins_le);
+ if (!attr)
+ goto out2;
+
+ if (ins_attr)
+ *ins_attr = attr;
+ if (ins_mi)
+ *ins_mi = mi;
+
+ return 0;
+
+out2:
+ ni_remove_mi(ni, mi);
+ mi_put(mi);
+ err = -EINVAL;
+
+out1:
+ ntfs_mark_rec_free(sbi, rno);
+
+out:
+ return err;
+}
+
+/*
+ * ni_insert_attr - Insert an attribute into the file.
+ *
+ * If the primary record has room, it will just insert the attribute.
+ * If not, it may make the attribute external.
+ * For $MFT::Data it may make room for the attribute by
+ * making other attributes external.
+ *
+ * NOTE:
+ * The ATTR_LIST and ATTR_STD cannot be made external.
+ * This function does not fill new attribute full.
+ * It only fills 'size'/'type'/'id'/'name_len' fields.
+ */
+static int ni_insert_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, u32 asize,
+ u16 name_off, CLST svcn, struct ATTRIB **ins_attr,
+ struct mft_inode **ins_mi,
+ struct ATTR_LIST_ENTRY **ins_le)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ int err;
+ struct ATTRIB *attr, *eattr;
+ struct MFT_REC *rec;
+ bool is_mft;
+ struct ATTR_LIST_ENTRY *le;
+ u32 list_reserve, max_free, free, used, t32;
+ __le16 id;
+ u16 t16;
+
+ is_mft = ni->mi.rno == MFT_REC_MFT;
+ rec = ni->mi.mrec;
+
+ list_reserve = SIZEOF_NONRESIDENT + 3 * (1 + 2 * sizeof(u32));
+ used = le32_to_cpu(rec->used);
+ free = sbi->record_size - used;
+
+ if (is_mft && type != ATTR_LIST) {
+ /* Reserve space for the ATTRIB list. */
+ if (free < list_reserve)
+ free = 0;
+ else
+ free -= list_reserve;
+ }
+
+ if (asize <= free) {
+ attr = ni_ins_new_attr(ni, &ni->mi, NULL, type, name, name_len,
+ asize, name_off, svcn, ins_le);
+ if (attr) {
+ if (ins_attr)
+ *ins_attr = attr;
+ if (ins_mi)
+ *ins_mi = &ni->mi;
+ err = 0;
+ goto out;
+ }
+ }
+
+ if (!is_mft || type != ATTR_DATA || svcn) {
+ /* This ATTRIB will be external. */
+ err = ni_ins_attr_ext(ni, NULL, type, name, name_len, asize,
+ svcn, name_off, false, ins_attr, ins_mi,
+ ins_le);
+ goto out;
+ }
+
+ /*
+ * Here we have: "is_mft && type == ATTR_DATA && !svcn"
+ *
+ * The first chunk of the $MFT::Data ATTRIB must be the base record.
+ * Evict as many other attributes as possible.
+ */
+ max_free = free;
+
+ /* Estimate the result of moving all possible attributes away. */
+ attr = NULL;
+
+ while ((attr = mi_enum_attr(&ni->mi, attr))) {
+ if (attr->type == ATTR_STD)
+ continue;
+ if (attr->type == ATTR_LIST)
+ continue;
+ max_free += le32_to_cpu(attr->size);
+ }
+
+ if (max_free < asize + list_reserve) {
+ /* Impossible to insert this attribute into primary record. */
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Start real attribute moving. */
+ attr = NULL;
+
+ for (;;) {
+ attr = mi_enum_attr(&ni->mi, attr);
+ if (!attr) {
+ /* We should never be here 'cause we have already check this case. */
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Skip attributes that MUST be primary record. */
+ if (attr->type == ATTR_STD || attr->type == ATTR_LIST)
+ continue;
+
+ le = NULL;
+ if (ni->attr_list.size) {
+ le = al_find_le(ni, NULL, attr);
+ if (!le) {
+ /* Really this is a serious bug. */
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ t32 = le32_to_cpu(attr->size);
+ t16 = le16_to_cpu(attr->name_off);
+ err = ni_ins_attr_ext(ni, le, attr->type, Add2Ptr(attr, t16),
+ attr->name_len, t32, attr_svcn(attr), t16,
+ false, &eattr, NULL, NULL);
+ if (err)
+ return err;
+
+ id = eattr->id;
+ memcpy(eattr, attr, t32);
+ eattr->id = id;
+
+ /* Remove from primary record. */
+ mi_remove_attr(NULL, &ni->mi, attr);
+
+ /* attr now points to next attribute. */
+ if (attr->type == ATTR_END)
+ goto out;
+ }
+ while (asize + list_reserve > sbi->record_size - le32_to_cpu(rec->used))
+ ;
+
+ attr = ni_ins_new_attr(ni, &ni->mi, NULL, type, name, name_len, asize,
+ name_off, svcn, ins_le);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (ins_attr)
+ *ins_attr = attr;
+ if (ins_mi)
+ *ins_mi = &ni->mi;
+
+out:
+ return err;
+}
+
+/* ni_expand_mft_list - Split ATTR_DATA of $MFT. */
+static int ni_expand_mft_list(struct ntfs_inode *ni)
+{
+ int err = 0;
+ struct runs_tree *run = &ni->file.run;
+ u32 asize, run_size, done = 0;
+ struct ATTRIB *attr;
+ struct rb_node *node;
+ CLST mft_min, mft_new, svcn, evcn, plen;
+ struct mft_inode *mi, *mi_min, *mi_new;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+
+ /* Find the nearest MFT. */
+ mft_min = 0;
+ mft_new = 0;
+ mi_min = NULL;
+
+ for (node = rb_first(&ni->mi_tree); node; node = rb_next(node)) {
+ mi = rb_entry(node, struct mft_inode, node);
+
+ attr = mi_enum_attr(mi, NULL);
+
+ if (!attr) {
+ mft_min = mi->rno;
+ mi_min = mi;
+ break;
+ }
+ }
+
+ if (ntfs_look_free_mft(sbi, &mft_new, true, ni, &mi_new)) {
+ mft_new = 0;
+ /* Really this is not critical. */
+ } else if (mft_min > mft_new) {
+ mft_min = mft_new;
+ mi_min = mi_new;
+ } else {
+ ntfs_mark_rec_free(sbi, mft_new);
+ mft_new = 0;
+ ni_remove_mi(ni, mi_new);
+ }
+
+ attr = mi_find_attr(&ni->mi, NULL, ATTR_DATA, NULL, 0, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ asize = le32_to_cpu(attr->size);
+
+ evcn = le64_to_cpu(attr->nres.evcn);
+ svcn = bytes_to_cluster(sbi, (u64)(mft_min + 1) << sbi->record_bits);
+ if (evcn + 1 >= svcn) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Split primary attribute [0 evcn] in two parts [0 svcn) + [svcn evcn].
+ *
+ * Update first part of ATTR_DATA in 'primary MFT.
+ */
+ err = run_pack(run, 0, svcn, Add2Ptr(attr, SIZEOF_NONRESIDENT),
+ asize - SIZEOF_NONRESIDENT, &plen);
+ if (err < 0)
+ goto out;
+
+ run_size = ALIGN(err, 8);
+ err = 0;
+
+ if (plen < svcn) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ attr->nres.evcn = cpu_to_le64(svcn - 1);
+ attr->size = cpu_to_le32(run_size + SIZEOF_NONRESIDENT);
+ /* 'done' - How many bytes of primary MFT becomes free. */
+ done = asize - run_size - SIZEOF_NONRESIDENT;
+ le32_sub_cpu(&ni->mi.mrec->used, done);
+
+ /* Estimate the size of second part: run_buf=NULL. */
+ err = run_pack(run, svcn, evcn + 1 - svcn, NULL, sbi->record_size,
+ &plen);
+ if (err < 0)
+ goto out;
+
+ run_size = ALIGN(err, 8);
+ err = 0;
+
+ if (plen < evcn + 1 - svcn) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * This function may implicitly call expand attr_list.
+ * Insert second part of ATTR_DATA in 'mi_min'.
+ */
+ attr = ni_ins_new_attr(ni, mi_min, NULL, ATTR_DATA, NULL, 0,
+ SIZEOF_NONRESIDENT + run_size,
+ SIZEOF_NONRESIDENT, svcn, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ attr->non_res = 1;
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ attr->flags = 0;
+
+ run_pack(run, svcn, evcn + 1 - svcn, Add2Ptr(attr, SIZEOF_NONRESIDENT),
+ run_size, &plen);
+
+ attr->nres.svcn = cpu_to_le64(svcn);
+ attr->nres.evcn = cpu_to_le64(evcn);
+ attr->nres.run_off = cpu_to_le16(SIZEOF_NONRESIDENT);
+
+out:
+ if (mft_new) {
+ ntfs_mark_rec_free(sbi, mft_new);
+ ni_remove_mi(ni, mi_new);
+ }
+
+ return !err && !done ? -EOPNOTSUPP : err;
+}
+
+/*
+ * ni_expand_list - Move all possible attributes out of primary record.
+ */
+int ni_expand_list(struct ntfs_inode *ni)
+{
+ int err = 0;
+ u32 asize, done = 0;
+ struct ATTRIB *attr, *ins_attr;
+ struct ATTR_LIST_ENTRY *le;
+ bool is_mft = ni->mi.rno == MFT_REC_MFT;
+ struct MFT_REF ref;
+
+ mi_get_ref(&ni->mi, &ref);
+ le = NULL;
+
+ while ((le = al_enumerate(ni, le))) {
+ if (le->type == ATTR_STD)
+ continue;
+
+ if (memcmp(&ref, &le->ref, sizeof(struct MFT_REF)))
+ continue;
+
+ if (is_mft && le->type == ATTR_DATA)
+ continue;
+
+ /* Find attribute in primary record. */
+ attr = rec_find_attr_le(&ni->mi, le);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ asize = le32_to_cpu(attr->size);
+
+ /* Always insert into new record to avoid collisions (deep recursive). */
+ err = ni_ins_attr_ext(ni, le, attr->type, attr_name(attr),
+ attr->name_len, asize, attr_svcn(attr),
+ le16_to_cpu(attr->name_off), true,
+ &ins_attr, NULL, NULL);
+
+ if (err)
+ goto out;
+
+ memcpy(ins_attr, attr, asize);
+ ins_attr->id = le->id;
+ /* Remove from primary record. */
+ mi_remove_attr(NULL, &ni->mi, attr);
+
+ done += asize;
+ goto out;
+ }
+
+ if (!is_mft) {
+ err = -EFBIG; /* Attr list is too big(?) */
+ goto out;
+ }
+
+ /* Split MFT data as much as possible. */
+ err = ni_expand_mft_list(ni);
+ if (err)
+ goto out;
+
+out:
+ return !err && !done ? -EOPNOTSUPP : err;
+}
+
+/*
+ * ni_insert_nonresident - Insert new nonresident attribute.
+ */
+int ni_insert_nonresident(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len,
+ const struct runs_tree *run, CLST svcn, CLST len,
+ __le16 flags, struct ATTRIB **new_attr,
+ struct mft_inode **mi)
+{
+ int err;
+ CLST plen;
+ struct ATTRIB *attr;
+ bool is_ext =
+ (flags & (ATTR_FLAG_SPARSED | ATTR_FLAG_COMPRESSED)) && !svcn;
+ u32 name_size = ALIGN(name_len * sizeof(short), 8);
+ u32 name_off = is_ext ? SIZEOF_NONRESIDENT_EX : SIZEOF_NONRESIDENT;
+ u32 run_off = name_off + name_size;
+ u32 run_size, asize;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+
+ err = run_pack(run, svcn, len, NULL, sbi->max_bytes_per_attr - run_off,
+ &plen);
+ if (err < 0)
+ goto out;
+
+ run_size = ALIGN(err, 8);
+
+ if (plen < len) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ asize = run_off + run_size;
+
+ if (asize > sbi->max_bytes_per_attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = ni_insert_attr(ni, type, name, name_len, asize, name_off, svcn,
+ &attr, mi, NULL);
+
+ if (err)
+ goto out;
+
+ attr->non_res = 1;
+ attr->name_off = cpu_to_le16(name_off);
+ attr->flags = flags;
+
+ run_pack(run, svcn, len, Add2Ptr(attr, run_off), run_size, &plen);
+
+ attr->nres.svcn = cpu_to_le64(svcn);
+ attr->nres.evcn = cpu_to_le64((u64)svcn + len - 1);
+
+ err = 0;
+ if (new_attr)
+ *new_attr = attr;
+
+ *(__le64 *)&attr->nres.run_off = cpu_to_le64(run_off);
+
+ attr->nres.alloc_size =
+ svcn ? 0 : cpu_to_le64((u64)len << ni->mi.sbi->cluster_bits);
+ attr->nres.data_size = attr->nres.alloc_size;
+ attr->nres.valid_size = attr->nres.alloc_size;
+
+ if (is_ext) {
+ if (flags & ATTR_FLAG_COMPRESSED)
+ attr->nres.c_unit = COMPRESSION_UNIT;
+ attr->nres.total_size = attr->nres.alloc_size;
+ }
+
+out:
+ return err;
+}
+
+/*
+ * ni_insert_resident - Inserts new resident attribute.
+ */
+int ni_insert_resident(struct ntfs_inode *ni, u32 data_size,
+ enum ATTR_TYPE type, const __le16 *name, u8 name_len,
+ struct ATTRIB **new_attr, struct mft_inode **mi,
+ struct ATTR_LIST_ENTRY **le)
+{
+ int err;
+ u32 name_size = ALIGN(name_len * sizeof(short), 8);
+ u32 asize = SIZEOF_RESIDENT + name_size + ALIGN(data_size, 8);
+ struct ATTRIB *attr;
+
+ err = ni_insert_attr(ni, type, name, name_len, asize, SIZEOF_RESIDENT,
+ 0, &attr, mi, le);
+ if (err)
+ return err;
+
+ attr->non_res = 0;
+ attr->flags = 0;
+
+ attr->res.data_size = cpu_to_le32(data_size);
+ attr->res.data_off = cpu_to_le16(SIZEOF_RESIDENT + name_size);
+ if (type == ATTR_NAME) {
+ attr->res.flags = RESIDENT_FLAG_INDEXED;
+
+ /* is_attr_indexed(attr)) == true */
+ le16_add_cpu(&ni->mi.mrec->hard_links, +1);
+ ni->mi.dirty = true;
+ }
+ attr->res.res = 0;
+
+ if (new_attr)
+ *new_attr = attr;
+
+ return 0;
+}
+
+/*
+ * ni_remove_attr_le - Remove attribute from record.
+ */
+void ni_remove_attr_le(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct mft_inode *mi, struct ATTR_LIST_ENTRY *le)
+{
+ mi_remove_attr(ni, mi, attr);
+
+ if (le)
+ al_remove_le(ni, le);
+}
+
+/*
+ * ni_delete_all - Remove all attributes and frees allocates space.
+ *
+ * ntfs_evict_inode->ntfs_clear_inode->ni_delete_all (if no links).
+ */
+int ni_delete_all(struct ntfs_inode *ni)
+{
+ int err;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct ATTRIB *attr = NULL;
+ struct rb_node *node;
+ u16 roff;
+ u32 asize;
+ CLST svcn, evcn;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ bool nt3 = is_ntfs3(sbi);
+ struct MFT_REF ref;
+
+ while ((attr = ni_enum_attr_ex(ni, attr, &le, NULL))) {
+ if (!nt3 || attr->name_len) {
+ ;
+ } else if (attr->type == ATTR_REPARSE) {
+ mi_get_ref(&ni->mi, &ref);
+ ntfs_remove_reparse(sbi, 0, &ref);
+ } else if (attr->type == ATTR_ID && !attr->non_res &&
+ le32_to_cpu(attr->res.data_size) >=
+ sizeof(struct GUID)) {
+ ntfs_objid_remove(sbi, resident_data(attr));
+ }
+
+ if (!attr->non_res)
+ continue;
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (evcn + 1 <= svcn)
+ continue;
+
+ asize = le32_to_cpu(attr->size);
+ roff = le16_to_cpu(attr->nres.run_off);
+
+ /* run==1 means unpack and deallocate. */
+ run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn, evcn, svcn,
+ Add2Ptr(attr, roff), asize - roff);
+ }
+
+ if (ni->attr_list.size) {
+ run_deallocate(ni->mi.sbi, &ni->attr_list.run, true);
+ al_destroy(ni);
+ }
+
+ /* Free all subrecords. */
+ for (node = rb_first(&ni->mi_tree); node;) {
+ struct rb_node *next = rb_next(node);
+ struct mft_inode *mi = rb_entry(node, struct mft_inode, node);
+
+ clear_rec_inuse(mi->mrec);
+ mi->dirty = true;
+ mi_write(mi, 0);
+
+ ntfs_mark_rec_free(sbi, mi->rno);
+ ni_remove_mi(ni, mi);
+ mi_put(mi);
+ node = next;
+ }
+
+ /* Free base record. */
+ clear_rec_inuse(ni->mi.mrec);
+ ni->mi.dirty = true;
+ err = mi_write(&ni->mi, 0);
+
+ ntfs_mark_rec_free(sbi, ni->mi.rno);
+
+ return err;
+}
+
+/* ni_fname_name
+ *
+ * Return: File name attribute by its value.
+ */
+struct ATTR_FILE_NAME *ni_fname_name(struct ntfs_inode *ni,
+ const struct cpu_str *uni,
+ const struct MFT_REF *home_dir,
+ struct mft_inode **mi,
+ struct ATTR_LIST_ENTRY **le)
+{
+ struct ATTRIB *attr = NULL;
+ struct ATTR_FILE_NAME *fname;
+
+ *le = NULL;
+
+ /* Enumerate all names. */
+next:
+ attr = ni_find_attr(ni, attr, le, ATTR_NAME, NULL, 0, NULL, mi);
+ if (!attr)
+ return NULL;
+
+ fname = resident_data_ex(attr, SIZEOF_ATTRIBUTE_FILENAME);
+ if (!fname)
+ goto next;
+
+ if (home_dir && memcmp(home_dir, &fname->home, sizeof(*home_dir)))
+ goto next;
+
+ if (!uni)
+ goto next;
+
+ if (uni->len != fname->name_len)
+ goto next;
+
+ if (ntfs_cmp_names_cpu(uni, (struct le_str *)&fname->name_len, NULL,
+ false))
+ goto next;
+
+ return fname;
+}
+
+/*
+ * ni_fname_type
+ *
+ * Return: File name attribute with given type.
+ */
+struct ATTR_FILE_NAME *ni_fname_type(struct ntfs_inode *ni, u8 name_type,
+ struct mft_inode **mi,
+ struct ATTR_LIST_ENTRY **le)
+{
+ struct ATTRIB *attr = NULL;
+ struct ATTR_FILE_NAME *fname;
+
+ *le = NULL;
+
+ if (FILE_NAME_POSIX == name_type)
+ return NULL;
+
+ /* Enumerate all names. */
+ for (;;) {
+ attr = ni_find_attr(ni, attr, le, ATTR_NAME, NULL, 0, NULL, mi);
+ if (!attr)
+ return NULL;
+
+ fname = resident_data_ex(attr, SIZEOF_ATTRIBUTE_FILENAME);
+ if (fname && name_type == fname->type)
+ return fname;
+ }
+}
+
+/*
+ * ni_new_attr_flags
+ *
+ * Process compressed/sparsed in special way.
+ * NOTE: You need to set ni->std_fa = new_fa
+ * after this function to keep internal structures in consistency.
+ */
+int ni_new_attr_flags(struct ntfs_inode *ni, enum FILE_ATTRIBUTE new_fa)
+{
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ __le16 new_aflags;
+ u32 new_asize;
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi);
+ if (!attr)
+ return -EINVAL;
+
+ new_aflags = attr->flags;
+
+ if (new_fa & FILE_ATTRIBUTE_SPARSE_FILE)
+ new_aflags |= ATTR_FLAG_SPARSED;
+ else
+ new_aflags &= ~ATTR_FLAG_SPARSED;
+
+ if (new_fa & FILE_ATTRIBUTE_COMPRESSED)
+ new_aflags |= ATTR_FLAG_COMPRESSED;
+ else
+ new_aflags &= ~ATTR_FLAG_COMPRESSED;
+
+ if (new_aflags == attr->flags)
+ return 0;
+
+ if ((new_aflags & (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED)) ==
+ (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED)) {
+ ntfs_inode_warn(&ni->vfs_inode,
+ "file can't be sparsed and compressed");
+ return -EOPNOTSUPP;
+ }
+
+ if (!attr->non_res)
+ goto out;
+
+ if (attr->nres.data_size) {
+ ntfs_inode_warn(
+ &ni->vfs_inode,
+ "one can change sparsed/compressed only for empty files");
+ return -EOPNOTSUPP;
+ }
+
+ /* Resize nonresident empty attribute in-place only. */
+ new_asize = (new_aflags & (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED))
+ ? (SIZEOF_NONRESIDENT_EX + 8)
+ : (SIZEOF_NONRESIDENT + 8);
+
+ if (!mi_resize_attr(mi, attr, new_asize - le32_to_cpu(attr->size)))
+ return -EOPNOTSUPP;
+
+ if (new_aflags & ATTR_FLAG_SPARSED) {
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ /* Windows uses 16 clusters per frame but supports one cluster per frame too. */
+ attr->nres.c_unit = 0;
+ ni->vfs_inode.i_mapping->a_ops = &ntfs_aops;
+ } else if (new_aflags & ATTR_FLAG_COMPRESSED) {
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ /* The only allowed: 16 clusters per frame. */
+ attr->nres.c_unit = NTFS_LZNT_CUNIT;
+ ni->vfs_inode.i_mapping->a_ops = &ntfs_aops_cmpr;
+ } else {
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ /* Normal files. */
+ attr->nres.c_unit = 0;
+ ni->vfs_inode.i_mapping->a_ops = &ntfs_aops;
+ }
+ attr->nres.run_off = attr->name_off;
+out:
+ attr->flags = new_aflags;
+ mi->dirty = true;
+
+ return 0;
+}
+
+/*
+ * ni_parse_reparse
+ *
+ * Buffer is at least 24 bytes.
+ */
+enum REPARSE_SIGN ni_parse_reparse(struct ntfs_inode *ni, struct ATTRIB *attr,
+ void *buffer)
+{
+ const struct REPARSE_DATA_BUFFER *rp = NULL;
+ u8 bits;
+ u16 len;
+ typeof(rp->CompressReparseBuffer) *cmpr;
+
+ static_assert(sizeof(struct REPARSE_DATA_BUFFER) <= 24);
+
+ /* Try to estimate reparse point. */
+ if (!attr->non_res) {
+ rp = resident_data_ex(attr, sizeof(struct REPARSE_DATA_BUFFER));
+ } else if (le64_to_cpu(attr->nres.data_size) >=
+ sizeof(struct REPARSE_DATA_BUFFER)) {
+ struct runs_tree run;
+
+ run_init(&run);
+
+ if (!attr_load_runs_vcn(ni, ATTR_REPARSE, NULL, 0, &run, 0) &&
+ !ntfs_read_run_nb(ni->mi.sbi, &run, 0, buffer,
+ sizeof(struct REPARSE_DATA_BUFFER),
+ NULL)) {
+ rp = buffer;
+ }
+
+ run_close(&run);
+ }
+
+ if (!rp)
+ return REPARSE_NONE;
+
+ len = le16_to_cpu(rp->ReparseDataLength);
+ switch (rp->ReparseTag) {
+ case (IO_REPARSE_TAG_MICROSOFT | IO_REPARSE_TAG_SYMBOLIC_LINK):
+ break; /* Symbolic link. */
+ case IO_REPARSE_TAG_MOUNT_POINT:
+ break; /* Mount points and junctions. */
+ case IO_REPARSE_TAG_SYMLINK:
+ break;
+ case IO_REPARSE_TAG_COMPRESS:
+ /*
+ * WOF - Windows Overlay Filter - Used to compress files with
+ * LZX/Xpress.
+ *
+ * Unlike native NTFS file compression, the Windows
+ * Overlay Filter supports only read operations. This means
+ * that it doesn't need to sector-align each compressed chunk,
+ * so the compressed data can be packed more tightly together.
+ * If you open the file for writing, the WOF just decompresses
+ * the entire file, turning it back into a plain file.
+ *
+ * Ntfs3 driver decompresses the entire file only on write or
+ * change size requests.
+ */
+
+ cmpr = &rp->CompressReparseBuffer;
+ if (len < sizeof(*cmpr) ||
+ cmpr->WofVersion != WOF_CURRENT_VERSION ||
+ cmpr->WofProvider != WOF_PROVIDER_SYSTEM ||
+ cmpr->ProviderVer != WOF_PROVIDER_CURRENT_VERSION) {
+ return REPARSE_NONE;
+ }
+
+ switch (cmpr->CompressionFormat) {
+ case WOF_COMPRESSION_XPRESS4K:
+ bits = 0xc; // 4k
+ break;
+ case WOF_COMPRESSION_XPRESS8K:
+ bits = 0xd; // 8k
+ break;
+ case WOF_COMPRESSION_XPRESS16K:
+ bits = 0xe; // 16k
+ break;
+ case WOF_COMPRESSION_LZX32K:
+ bits = 0xf; // 32k
+ break;
+ default:
+ bits = 0x10; // 64k
+ break;
+ }
+ ni_set_ext_compress_bits(ni, bits);
+ return REPARSE_COMPRESSED;
+
+ case IO_REPARSE_TAG_DEDUP:
+ ni->ni_flags |= NI_FLAG_DEDUPLICATED;
+ return REPARSE_DEDUPLICATED;
+
+ default:
+ if (rp->ReparseTag & IO_REPARSE_TAG_NAME_SURROGATE)
+ break;
+
+ return REPARSE_NONE;
+ }
+
+ /* Looks like normal symlink. */
+ return REPARSE_LINK;
+}
+
+/*
+ * ni_fiemap - Helper for file_fiemap().
+ *
+ * Assumed ni_lock.
+ * TODO: Less aggressive locks.
+ */
+int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
+ __u64 vbo, __u64 len)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 cluster_bits = sbi->cluster_bits;
+ struct runs_tree *run;
+ struct rw_semaphore *run_lock;
+ struct ATTRIB *attr;
+ CLST vcn = vbo >> cluster_bits;
+ CLST lcn, clen;
+ u64 valid = ni->i_valid;
+ u64 lbo, bytes;
+ u64 end, alloc_size;
+ size_t idx = -1;
+ u32 flags;
+ bool ok;
+
+ if (S_ISDIR(ni->vfs_inode.i_mode)) {
+ run = &ni->dir.alloc_run;
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, I30_NAME,
+ ARRAY_SIZE(I30_NAME), NULL, NULL);
+ run_lock = &ni->dir.run_lock;
+ } else {
+ run = &ni->file.run;
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL,
+ NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+ if (is_attr_compressed(attr)) {
+ /* Unfortunately cp -r incorrectly treats compressed clusters. */
+ err = -EOPNOTSUPP;
+ ntfs_inode_warn(
+ &ni->vfs_inode,
+ "fiemap is not supported for compressed file (cp -r)");
+ goto out;
+ }
+ run_lock = &ni->file.run_lock;
+ }
+
+ if (!attr || !attr->non_res) {
+ err = fiemap_fill_next_extent(
+ fieinfo, 0, 0,
+ attr ? le32_to_cpu(attr->res.data_size) : 0,
+ FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_LAST |
+ FIEMAP_EXTENT_MERGED);
+ goto out;
+ }
+
+ end = vbo + len;
+ alloc_size = le64_to_cpu(attr->nres.alloc_size);
+ if (end > alloc_size)
+ end = alloc_size;
+
+ down_read(run_lock);
+
+ while (vbo < end) {
+ if (idx == -1) {
+ ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+ } else {
+ CLST vcn_next = vcn;
+
+ ok = run_get_entry(run, ++idx, &vcn, &lcn, &clen) &&
+ vcn == vcn_next;
+ if (!ok)
+ vcn = vcn_next;
+ }
+
+ if (!ok) {
+ up_read(run_lock);
+ down_write(run_lock);
+
+ err = attr_load_runs_vcn(ni, attr->type,
+ attr_name(attr),
+ attr->name_len, run, vcn);
+
+ up_write(run_lock);
+ down_read(run_lock);
+
+ if (err)
+ break;
+
+ ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+
+ if (!ok) {
+ err = -EINVAL;
+ break;
+ }
+ }
+
+ if (!clen) {
+ err = -EINVAL; // ?
+ break;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ vcn += clen;
+ vbo = (u64)vcn << cluster_bits;
+ continue;
+ }
+
+ flags = FIEMAP_EXTENT_MERGED;
+ if (S_ISDIR(ni->vfs_inode.i_mode)) {
+ ;
+ } else if (is_attr_compressed(attr)) {
+ CLST clst_data;
+
+ err = attr_is_frame_compressed(
+ ni, attr, vcn >> attr->nres.c_unit, &clst_data);
+ if (err)
+ break;
+ if (clst_data < NTFS_LZNT_CLUSTERS)
+ flags |= FIEMAP_EXTENT_ENCODED;
+ } else if (is_attr_encrypted(attr)) {
+ flags |= FIEMAP_EXTENT_DATA_ENCRYPTED;
+ }
+
+ vbo = (u64)vcn << cluster_bits;
+ bytes = (u64)clen << cluster_bits;
+ lbo = (u64)lcn << cluster_bits;
+
+ vcn += clen;
+
+ if (vbo + bytes >= end) {
+ bytes = end - vbo;
+ flags |= FIEMAP_EXTENT_LAST;
+ }
+
+ if (vbo + bytes <= valid) {
+ ;
+ } else if (vbo >= valid) {
+ flags |= FIEMAP_EXTENT_UNWRITTEN;
+ } else {
+ /* vbo < valid && valid < vbo + bytes */
+ u64 dlen = valid - vbo;
+
+ err = fiemap_fill_next_extent(fieinfo, vbo, lbo, dlen,
+ flags);
+ if (err < 0)
+ break;
+ if (err == 1) {
+ err = 0;
+ break;
+ }
+
+ vbo = valid;
+ bytes -= dlen;
+ if (!bytes)
+ continue;
+
+ lbo += dlen;
+ flags |= FIEMAP_EXTENT_UNWRITTEN;
+ }
+
+ err = fiemap_fill_next_extent(fieinfo, vbo, lbo, bytes, flags);
+ if (err < 0)
+ break;
+ if (err == 1) {
+ err = 0;
+ break;
+ }
+
+ vbo += bytes;
+ }
+
+ up_read(run_lock);
+
+out:
+ return err;
+}
+
+/*
+ * ni_readpage_cmpr
+ *
+ * When decompressing, we typically obtain more than one page per reference.
+ * We inject the additional pages into the page cache.
+ */
+int ni_readpage_cmpr(struct ntfs_inode *ni, struct page *page)
+{
+ int err;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct address_space *mapping = page->mapping;
+ pgoff_t index = page->index;
+ u64 frame_vbo, vbo = (u64)index << PAGE_SHIFT;
+ struct page **pages = NULL; /* Array of at most 16 pages. stack? */
+ u8 frame_bits;
+ CLST frame;
+ u32 i, idx, frame_size, pages_per_frame;
+ gfp_t gfp_mask;
+ struct page *pg;
+
+ if (vbo >= ni->vfs_inode.i_size) {
+ SetPageUptodate(page);
+ err = 0;
+ goto out;
+ }
+
+ if (ni->ni_flags & NI_FLAG_COMPRESSED_MASK) {
+ /* Xpress or LZX. */
+ frame_bits = ni_ext_compress_bits(ni);
+ } else {
+ /* LZNT compression. */
+ frame_bits = NTFS_LZNT_CUNIT + sbi->cluster_bits;
+ }
+ frame_size = 1u << frame_bits;
+ frame = vbo >> frame_bits;
+ frame_vbo = (u64)frame << frame_bits;
+ idx = (vbo - frame_vbo) >> PAGE_SHIFT;
+
+ pages_per_frame = frame_size >> PAGE_SHIFT;
+ pages = kcalloc(pages_per_frame, sizeof(struct page *), GFP_NOFS);
+ if (!pages) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ pages[idx] = page;
+ index = frame_vbo >> PAGE_SHIFT;
+ gfp_mask = mapping_gfp_mask(mapping);
+
+ for (i = 0; i < pages_per_frame; i++, index++) {
+ if (i == idx)
+ continue;
+
+ pg = find_or_create_page(mapping, index, gfp_mask);
+ if (!pg) {
+ err = -ENOMEM;
+ goto out1;
+ }
+ pages[i] = pg;
+ }
+
+ err = ni_read_frame(ni, frame_vbo, pages, pages_per_frame);
+
+out1:
+ if (err)
+ SetPageError(page);
+
+ for (i = 0; i < pages_per_frame; i++) {
+ pg = pages[i];
+ if (i == idx)
+ continue;
+ unlock_page(pg);
+ put_page(pg);
+ }
+
+out:
+ /* At this point, err contains 0 or -EIO depending on the "critical" page. */
+ kfree(pages);
+ unlock_page(page);
+
+ return err;
+}
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+/*
+ * ni_decompress_file - Decompress LZX/Xpress compressed file.
+ *
+ * Remove ATTR_DATA::WofCompressedData.
+ * Remove ATTR_REPARSE.
+ */
+int ni_decompress_file(struct ntfs_inode *ni)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct inode *inode = &ni->vfs_inode;
+ loff_t i_size = inode->i_size;
+ struct address_space *mapping = inode->i_mapping;
+ gfp_t gfp_mask = mapping_gfp_mask(mapping);
+ struct page **pages = NULL;
+ struct ATTR_LIST_ENTRY *le;
+ struct ATTRIB *attr;
+ CLST vcn, cend, lcn, clen, end;
+ pgoff_t index;
+ u64 vbo;
+ u8 frame_bits;
+ u32 i, frame_size, pages_per_frame, bytes;
+ struct mft_inode *mi;
+ int err;
+
+ /* Clusters for decompressed data. */
+ cend = bytes_to_cluster(sbi, i_size);
+
+ if (!i_size)
+ goto remove_wof;
+
+ /* Check in advance. */
+ if (cend > wnd_zeroes(&sbi->used.bitmap)) {
+ err = -ENOSPC;
+ goto out;
+ }
+
+ frame_bits = ni_ext_compress_bits(ni);
+ frame_size = 1u << frame_bits;
+ pages_per_frame = frame_size >> PAGE_SHIFT;
+ pages = kcalloc(pages_per_frame, sizeof(struct page *), GFP_NOFS);
+ if (!pages) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /*
+ * Step 1: Decompress data and copy to new allocated clusters.
+ */
+ index = 0;
+ for (vbo = 0; vbo < i_size; vbo += bytes) {
+ u32 nr_pages;
+ bool new;
+
+ if (vbo + frame_size > i_size) {
+ bytes = i_size - vbo;
+ nr_pages = (bytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ } else {
+ nr_pages = pages_per_frame;
+ bytes = frame_size;
+ }
+
+ end = bytes_to_cluster(sbi, vbo + bytes);
+
+ for (vcn = vbo >> sbi->cluster_bits; vcn < end; vcn += clen) {
+ err = attr_data_get_block(ni, vcn, cend - vcn, &lcn,
+ &clen, &new);
+ if (err)
+ goto out;
+ }
+
+ for (i = 0; i < pages_per_frame; i++, index++) {
+ struct page *pg;
+
+ pg = find_or_create_page(mapping, index, gfp_mask);
+ if (!pg) {
+ while (i--) {
+ unlock_page(pages[i]);
+ put_page(pages[i]);
+ }
+ err = -ENOMEM;
+ goto out;
+ }
+ pages[i] = pg;
+ }
+
+ err = ni_read_frame(ni, vbo, pages, pages_per_frame);
+
+ if (!err) {
+ down_read(&ni->file.run_lock);
+ err = ntfs_bio_pages(sbi, &ni->file.run, pages,
+ nr_pages, vbo, bytes,
+ REQ_OP_WRITE);
+ up_read(&ni->file.run_lock);
+ }
+
+ for (i = 0; i < pages_per_frame; i++) {
+ unlock_page(pages[i]);
+ put_page(pages[i]);
+ }
+
+ if (err)
+ goto out;
+
+ cond_resched();
+ }
+
+remove_wof:
+ /*
+ * Step 2: Deallocate attributes ATTR_DATA::WofCompressedData
+ * and ATTR_REPARSE.
+ */
+ attr = NULL;
+ le = NULL;
+ while ((attr = ni_enum_attr_ex(ni, attr, &le, NULL))) {
+ CLST svcn, evcn;
+ u32 asize, roff;
+
+ if (attr->type == ATTR_REPARSE) {
+ struct MFT_REF ref;
+
+ mi_get_ref(&ni->mi, &ref);
+ ntfs_remove_reparse(sbi, 0, &ref);
+ }
+
+ if (!attr->non_res)
+ continue;
+
+ if (attr->type != ATTR_REPARSE &&
+ (attr->type != ATTR_DATA ||
+ attr->name_len != ARRAY_SIZE(WOF_NAME) ||
+ memcmp(attr_name(attr), WOF_NAME, sizeof(WOF_NAME))))
+ continue;
+
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+
+ if (evcn + 1 <= svcn)
+ continue;
+
+ asize = le32_to_cpu(attr->size);
+ roff = le16_to_cpu(attr->nres.run_off);
+
+ /*run==1 Means unpack and deallocate. */
+ run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn, evcn, svcn,
+ Add2Ptr(attr, roff), asize - roff);
+ }
+
+ /*
+ * Step 3: Remove attribute ATTR_DATA::WofCompressedData.
+ */
+ err = ni_remove_attr(ni, ATTR_DATA, WOF_NAME, ARRAY_SIZE(WOF_NAME),
+ false, NULL);
+ if (err)
+ goto out;
+
+ /*
+ * Step 4: Remove ATTR_REPARSE.
+ */
+ err = ni_remove_attr(ni, ATTR_REPARSE, NULL, 0, false, NULL);
+ if (err)
+ goto out;
+
+ /*
+ * Step 5: Remove sparse flag from data attribute.
+ */
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (attr->non_res && is_attr_sparsed(attr)) {
+ /* Sparsed attribute header is 8 bytes bigger than normal. */
+ struct MFT_REC *rec = mi->mrec;
+ u32 used = le32_to_cpu(rec->used);
+ u32 asize = le32_to_cpu(attr->size);
+ u16 roff = le16_to_cpu(attr->nres.run_off);
+ char *rbuf = Add2Ptr(attr, roff);
+
+ memmove(rbuf - 8, rbuf, used - PtrOffset(rec, rbuf));
+ attr->size = cpu_to_le32(asize - 8);
+ attr->flags &= ~ATTR_FLAG_SPARSED;
+ attr->nres.run_off = cpu_to_le16(roff - 8);
+ attr->nres.c_unit = 0;
+ rec->used = cpu_to_le32(used - 8);
+ mi->dirty = true;
+ ni->std_fa &= ~(FILE_ATTRIBUTE_SPARSE_FILE |
+ FILE_ATTRIBUTE_REPARSE_POINT);
+
+ mark_inode_dirty(inode);
+ }
+
+ /* Clear cached flag. */
+ ni->ni_flags &= ~NI_FLAG_COMPRESSED_MASK;
+ if (ni->file.offs_page) {
+ put_page(ni->file.offs_page);
+ ni->file.offs_page = NULL;
+ }
+ mapping->a_ops = &ntfs_aops;
+
+out:
+ kfree(pages);
+ if (err) {
+ make_bad_inode(inode);
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ }
+
+ return err;
+}
+
+/*
+ * decompress_lzx_xpress - External compression LZX/Xpress.
+ */
+static int decompress_lzx_xpress(struct ntfs_sb_info *sbi, const char *cmpr,
+ size_t cmpr_size, void *unc, size_t unc_size,
+ u32 frame_size)
+{
+ int err;
+ void *ctx;
+
+ if (cmpr_size == unc_size) {
+ /* Frame not compressed. */
+ memcpy(unc, cmpr, unc_size);
+ return 0;
+ }
+
+ err = 0;
+ if (frame_size == 0x8000) {
+ mutex_lock(&sbi->compress.mtx_lzx);
+ /* LZX: Frame compressed. */
+ ctx = sbi->compress.lzx;
+ if (!ctx) {
+ /* Lazy initialize LZX decompress context. */
+ ctx = lzx_allocate_decompressor();
+ if (!ctx) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ sbi->compress.lzx = ctx;
+ }
+
+ if (lzx_decompress(ctx, cmpr, cmpr_size, unc, unc_size)) {
+ /* Treat all errors as "invalid argument". */
+ err = -EINVAL;
+ }
+out1:
+ mutex_unlock(&sbi->compress.mtx_lzx);
+ } else {
+ /* XPRESS: Frame compressed. */
+ mutex_lock(&sbi->compress.mtx_xpress);
+ ctx = sbi->compress.xpress;
+ if (!ctx) {
+ /* Lazy initialize Xpress decompress context. */
+ ctx = xpress_allocate_decompressor();
+ if (!ctx) {
+ err = -ENOMEM;
+ goto out2;
+ }
+
+ sbi->compress.xpress = ctx;
+ }
+
+ if (xpress_decompress(ctx, cmpr, cmpr_size, unc, unc_size)) {
+ /* Treat all errors as "invalid argument". */
+ err = -EINVAL;
+ }
+out2:
+ mutex_unlock(&sbi->compress.mtx_xpress);
+ }
+ return err;
+}
+#endif
+
+/*
+ * ni_read_frame
+ *
+ * Pages - Array of locked pages.
+ */
+int ni_read_frame(struct ntfs_inode *ni, u64 frame_vbo, struct page **pages,
+ u32 pages_per_frame)
+{
+ int err;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 cluster_bits = sbi->cluster_bits;
+ char *frame_ondisk = NULL;
+ char *frame_mem = NULL;
+ struct page **pages_disk = NULL;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct runs_tree *run = &ni->file.run;
+ u64 valid_size = ni->i_valid;
+ u64 vbo_disk;
+ size_t unc_size;
+ u32 frame_size, i, npages_disk, ondisk_size;
+ struct page *pg;
+ struct ATTRIB *attr;
+ CLST frame, clst_data;
+
+ /*
+ * To simplify decompress algorithm do vmap for source
+ * and target pages.
+ */
+ for (i = 0; i < pages_per_frame; i++)
+ kmap(pages[i]);
+
+ frame_size = pages_per_frame << PAGE_SHIFT;
+ frame_mem = vmap(pages, pages_per_frame, VM_MAP, PAGE_KERNEL);
+ if (!frame_mem) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ attr = ni_find_attr(ni, NULL, &le, ATTR_DATA, NULL, 0, NULL, NULL);
+ if (!attr) {
+ err = -ENOENT;
+ goto out1;
+ }
+
+ if (!attr->non_res) {
+ u32 data_size = le32_to_cpu(attr->res.data_size);
+
+ memset(frame_mem, 0, frame_size);
+ if (frame_vbo < data_size) {
+ ondisk_size = data_size - frame_vbo;
+ memcpy(frame_mem, resident_data(attr) + frame_vbo,
+ min(ondisk_size, frame_size));
+ }
+ err = 0;
+ goto out1;
+ }
+
+ if (frame_vbo >= valid_size) {
+ memset(frame_mem, 0, frame_size);
+ err = 0;
+ goto out1;
+ }
+
+ if (ni->ni_flags & NI_FLAG_COMPRESSED_MASK) {
+#ifndef CONFIG_NTFS3_LZX_XPRESS
+ err = -EOPNOTSUPP;
+ goto out1;
+#else
+ u32 frame_bits = ni_ext_compress_bits(ni);
+ u64 frame64 = frame_vbo >> frame_bits;
+ u64 frames, vbo_data;
+
+ if (frame_size != (1u << frame_bits)) {
+ err = -EINVAL;
+ goto out1;
+ }
+ switch (frame_size) {
+ case 0x1000:
+ case 0x2000:
+ case 0x4000:
+ case 0x8000:
+ break;
+ default:
+ /* Unknown compression. */
+ err = -EOPNOTSUPP;
+ goto out1;
+ }
+
+ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, WOF_NAME,
+ ARRAY_SIZE(WOF_NAME), NULL, NULL);
+ if (!attr) {
+ ntfs_inode_err(
+ &ni->vfs_inode,
+ "external compressed file should contains data attribute \"WofCompressedData\"");
+ err = -EINVAL;
+ goto out1;
+ }
+
+ if (!attr->non_res) {
+ run = NULL;
+ } else {
+ run = run_alloc();
+ if (!run) {
+ err = -ENOMEM;
+ goto out1;
+ }
+ }
+
+ frames = (ni->vfs_inode.i_size - 1) >> frame_bits;
+
+ err = attr_wof_frame_info(ni, attr, run, frame64, frames,
+ frame_bits, &ondisk_size, &vbo_data);
+ if (err)
+ goto out2;
+
+ if (frame64 == frames) {
+ unc_size = 1 + ((ni->vfs_inode.i_size - 1) &
+ (frame_size - 1));
+ ondisk_size = attr_size(attr) - vbo_data;
+ } else {
+ unc_size = frame_size;
+ }
+
+ if (ondisk_size > frame_size) {
+ err = -EINVAL;
+ goto out2;
+ }
+
+ if (!attr->non_res) {
+ if (vbo_data + ondisk_size >
+ le32_to_cpu(attr->res.data_size)) {
+ err = -EINVAL;
+ goto out1;
+ }
+
+ err = decompress_lzx_xpress(
+ sbi, Add2Ptr(resident_data(attr), vbo_data),
+ ondisk_size, frame_mem, unc_size, frame_size);
+ goto out1;
+ }
+ vbo_disk = vbo_data;
+ /* Load all runs to read [vbo_disk-vbo_to). */
+ err = attr_load_runs_range(ni, ATTR_DATA, WOF_NAME,
+ ARRAY_SIZE(WOF_NAME), run, vbo_disk,
+ vbo_data + ondisk_size);
+ if (err)
+ goto out2;
+ npages_disk = (ondisk_size + (vbo_disk & (PAGE_SIZE - 1)) +
+ PAGE_SIZE - 1) >>
+ PAGE_SHIFT;
+#endif
+ } else if (is_attr_compressed(attr)) {
+ /* LZNT compression. */
+ if (sbi->cluster_size > NTFS_LZNT_MAX_CLUSTER) {
+ err = -EOPNOTSUPP;
+ goto out1;
+ }
+
+ if (attr->nres.c_unit != NTFS_LZNT_CUNIT) {
+ err = -EOPNOTSUPP;
+ goto out1;
+ }
+
+ down_write(&ni->file.run_lock);
+ run_truncate_around(run, le64_to_cpu(attr->nres.svcn));
+ frame = frame_vbo >> (cluster_bits + NTFS_LZNT_CUNIT);
+ err = attr_is_frame_compressed(ni, attr, frame, &clst_data);
+ up_write(&ni->file.run_lock);
+ if (err)
+ goto out1;
+
+ if (!clst_data) {
+ memset(frame_mem, 0, frame_size);
+ goto out1;
+ }
+
+ frame_size = sbi->cluster_size << NTFS_LZNT_CUNIT;
+ ondisk_size = clst_data << cluster_bits;
+
+ if (clst_data >= NTFS_LZNT_CLUSTERS) {
+ /* Frame is not compressed. */
+ down_read(&ni->file.run_lock);
+ err = ntfs_bio_pages(sbi, run, pages, pages_per_frame,
+ frame_vbo, ondisk_size,
+ REQ_OP_READ);
+ up_read(&ni->file.run_lock);
+ goto out1;
+ }
+ vbo_disk = frame_vbo;
+ npages_disk = (ondisk_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ } else {
+ __builtin_unreachable();
+ err = -EINVAL;
+ goto out1;
+ }
+
+ pages_disk = kzalloc(npages_disk * sizeof(struct page *), GFP_NOFS);
+ if (!pages_disk) {
+ err = -ENOMEM;
+ goto out2;
+ }
+
+ for (i = 0; i < npages_disk; i++) {
+ pg = alloc_page(GFP_KERNEL);
+ if (!pg) {
+ err = -ENOMEM;
+ goto out3;
+ }
+ pages_disk[i] = pg;
+ lock_page(pg);
+ kmap(pg);
+ }
+
+ /* Read 'ondisk_size' bytes from disk. */
+ down_read(&ni->file.run_lock);
+ err = ntfs_bio_pages(sbi, run, pages_disk, npages_disk, vbo_disk,
+ ondisk_size, REQ_OP_READ);
+ up_read(&ni->file.run_lock);
+ if (err)
+ goto out3;
+
+ /*
+ * To simplify decompress algorithm do vmap for source and target pages.
+ */
+ frame_ondisk = vmap(pages_disk, npages_disk, VM_MAP, PAGE_KERNEL_RO);
+ if (!frame_ondisk) {
+ err = -ENOMEM;
+ goto out3;
+ }
+
+ /* Decompress: Frame_ondisk -> frame_mem. */
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ if (run != &ni->file.run) {
+ /* LZX or XPRESS */
+ err = decompress_lzx_xpress(
+ sbi, frame_ondisk + (vbo_disk & (PAGE_SIZE - 1)),
+ ondisk_size, frame_mem, unc_size, frame_size);
+ } else
+#endif
+ {
+ /* LZNT - Native NTFS compression. */
+ unc_size = decompress_lznt(frame_ondisk, ondisk_size, frame_mem,
+ frame_size);
+ if ((ssize_t)unc_size < 0)
+ err = unc_size;
+ else if (!unc_size || unc_size > frame_size)
+ err = -EINVAL;
+ }
+ if (!err && valid_size < frame_vbo + frame_size) {
+ size_t ok = valid_size - frame_vbo;
+
+ memset(frame_mem + ok, 0, frame_size - ok);
+ }
+
+ vunmap(frame_ondisk);
+
+out3:
+ for (i = 0; i < npages_disk; i++) {
+ pg = pages_disk[i];
+ if (pg) {
+ kunmap(pg);
+ unlock_page(pg);
+ put_page(pg);
+ }
+ }
+ kfree(pages_disk);
+
+out2:
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ if (run != &ni->file.run)
+ run_free(run);
+#endif
+out1:
+ vunmap(frame_mem);
+out:
+ for (i = 0; i < pages_per_frame; i++) {
+ pg = pages[i];
+ kunmap(pg);
+ ClearPageError(pg);
+ SetPageUptodate(pg);
+ }
+
+ return err;
+}
+
+/*
+ * ni_write_frame
+ *
+ * Pages - Array of locked pages.
+ */
+int ni_write_frame(struct ntfs_inode *ni, struct page **pages,
+ u32 pages_per_frame)
+{
+ int err;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ u8 frame_bits = NTFS_LZNT_CUNIT + sbi->cluster_bits;
+ u32 frame_size = sbi->cluster_size << NTFS_LZNT_CUNIT;
+ u64 frame_vbo = (u64)pages[0]->index << PAGE_SHIFT;
+ CLST frame = frame_vbo >> frame_bits;
+ char *frame_ondisk = NULL;
+ struct page **pages_disk = NULL;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ char *frame_mem;
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ u32 i;
+ struct page *pg;
+ size_t compr_size, ondisk_size;
+ struct lznt *lznt;
+
+ attr = ni_find_attr(ni, NULL, &le, ATTR_DATA, NULL, 0, NULL, &mi);
+ if (!attr) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (WARN_ON(!is_attr_compressed(attr))) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (sbi->cluster_size > NTFS_LZNT_MAX_CLUSTER) {
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+
+ if (!attr->non_res) {
+ down_write(&ni->file.run_lock);
+ err = attr_make_nonresident(ni, attr, le, mi,
+ le32_to_cpu(attr->res.data_size),
+ &ni->file.run, &attr, pages[0]);
+ up_write(&ni->file.run_lock);
+ if (err)
+ goto out;
+ }
+
+ if (attr->nres.c_unit != NTFS_LZNT_CUNIT) {
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+
+ pages_disk = kcalloc(pages_per_frame, sizeof(struct page *), GFP_NOFS);
+ if (!pages_disk) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ for (i = 0; i < pages_per_frame; i++) {
+ pg = alloc_page(GFP_KERNEL);
+ if (!pg) {
+ err = -ENOMEM;
+ goto out1;
+ }
+ pages_disk[i] = pg;
+ lock_page(pg);
+ kmap(pg);
+ }
+
+ /* To simplify compress algorithm do vmap for source and target pages. */
+ frame_ondisk = vmap(pages_disk, pages_per_frame, VM_MAP, PAGE_KERNEL);
+ if (!frame_ondisk) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ for (i = 0; i < pages_per_frame; i++)
+ kmap(pages[i]);
+
+ /* Map in-memory frame for read-only. */
+ frame_mem = vmap(pages, pages_per_frame, VM_MAP, PAGE_KERNEL_RO);
+ if (!frame_mem) {
+ err = -ENOMEM;
+ goto out2;
+ }
+
+ mutex_lock(&sbi->compress.mtx_lznt);
+ lznt = NULL;
+ if (!sbi->compress.lznt) {
+ /*
+ * LZNT implements two levels of compression:
+ * 0 - Standard compression
+ * 1 - Best compression, requires a lot of cpu
+ * use mount option?
+ */
+ lznt = get_lznt_ctx(0);
+ if (!lznt) {
+ mutex_unlock(&sbi->compress.mtx_lznt);
+ err = -ENOMEM;
+ goto out3;
+ }
+
+ sbi->compress.lznt = lznt;
+ lznt = NULL;
+ }
+
+ /* Compress: frame_mem -> frame_ondisk */
+ compr_size = compress_lznt(frame_mem, frame_size, frame_ondisk,
+ frame_size, sbi->compress.lznt);
+ mutex_unlock(&sbi->compress.mtx_lznt);
+ kfree(lznt);
+
+ if (compr_size + sbi->cluster_size > frame_size) {
+ /* Frame is not compressed. */
+ compr_size = frame_size;
+ ondisk_size = frame_size;
+ } else if (compr_size) {
+ /* Frame is compressed. */
+ ondisk_size = ntfs_up_cluster(sbi, compr_size);
+ memset(frame_ondisk + compr_size, 0, ondisk_size - compr_size);
+ } else {
+ /* Frame is sparsed. */
+ ondisk_size = 0;
+ }
+
+ down_write(&ni->file.run_lock);
+ run_truncate_around(&ni->file.run, le64_to_cpu(attr->nres.svcn));
+ err = attr_allocate_frame(ni, frame, compr_size, ni->i_valid);
+ up_write(&ni->file.run_lock);
+ if (err)
+ goto out2;
+
+ if (!ondisk_size)
+ goto out2;
+
+ down_read(&ni->file.run_lock);
+ err = ntfs_bio_pages(sbi, &ni->file.run,
+ ondisk_size < frame_size ? pages_disk : pages,
+ pages_per_frame, frame_vbo, ondisk_size,
+ REQ_OP_WRITE);
+ up_read(&ni->file.run_lock);
+
+out3:
+ vunmap(frame_mem);
+
+out2:
+ for (i = 0; i < pages_per_frame; i++)
+ kunmap(pages[i]);
+
+ vunmap(frame_ondisk);
+out1:
+ for (i = 0; i < pages_per_frame; i++) {
+ pg = pages_disk[i];
+ if (pg) {
+ kunmap(pg);
+ unlock_page(pg);
+ put_page(pg);
+ }
+ }
+ kfree(pages_disk);
+out:
+ return err;
+}
+
+/*
+ * ni_remove_name - Removes name 'de' from MFT and from directory.
+ * 'de2' and 'undo_step' are used to restore MFT/dir, if error occurs.
+ */
+int ni_remove_name(struct ntfs_inode *dir_ni, struct ntfs_inode *ni,
+ struct NTFS_DE *de, struct NTFS_DE **de2, int *undo_step)
+{
+ int err;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTR_FILE_NAME *de_name = (struct ATTR_FILE_NAME *)(de + 1);
+ struct ATTR_FILE_NAME *fname;
+ struct ATTR_LIST_ENTRY *le;
+ struct mft_inode *mi;
+ u16 de_key_size = le16_to_cpu(de->key_size);
+ u8 name_type;
+
+ *undo_step = 0;
+
+ /* Find name in record. */
+ mi_get_ref(&dir_ni->mi, &de_name->home);
+
+ fname = ni_fname_name(ni, (struct cpu_str *)&de_name->name_len,
+ &de_name->home, &mi, &le);
+ if (!fname)
+ return -ENOENT;
+
+ memcpy(&de_name->dup, &fname->dup, sizeof(struct NTFS_DUP_INFO));
+ name_type = paired_name(fname->type);
+
+ /* Mark ntfs as dirty. It will be cleared at umount. */
+ ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+ /* Step 1: Remove name from directory. */
+ err = indx_delete_entry(&dir_ni->dir, dir_ni, fname, de_key_size, sbi);
+ if (err)
+ return err;
+
+ /* Step 2: Remove name from MFT. */
+ ni_remove_attr_le(ni, attr_from_name(fname), mi, le);
+
+ *undo_step = 2;
+
+ /* Get paired name. */
+ fname = ni_fname_type(ni, name_type, &mi, &le);
+ if (fname) {
+ u16 de2_key_size = fname_full_size(fname);
+
+ *de2 = Add2Ptr(de, 1024);
+ (*de2)->key_size = cpu_to_le16(de2_key_size);
+
+ memcpy(*de2 + 1, fname, de2_key_size);
+
+ /* Step 3: Remove paired name from directory. */
+ err = indx_delete_entry(&dir_ni->dir, dir_ni, fname,
+ de2_key_size, sbi);
+ if (err)
+ return err;
+
+ /* Step 4: Remove paired name from MFT. */
+ ni_remove_attr_le(ni, attr_from_name(fname), mi, le);
+
+ *undo_step = 4;
+ }
+ return 0;
+}
+
+/*
+ * ni_remove_name_undo - Paired function for ni_remove_name.
+ *
+ * Return: True if ok
+ */
+bool ni_remove_name_undo(struct ntfs_inode *dir_ni, struct ntfs_inode *ni,
+ struct NTFS_DE *de, struct NTFS_DE *de2, int undo_step)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *attr;
+ u16 de_key_size = de2 ? le16_to_cpu(de2->key_size) : 0;
+
+ switch (undo_step) {
+ case 4:
+ if (ni_insert_resident(ni, de_key_size, ATTR_NAME, NULL, 0,
+ &attr, NULL, NULL)) {
+ return false;
+ }
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), de2 + 1, de_key_size);
+
+ mi_get_ref(&ni->mi, &de2->ref);
+ de2->size = cpu_to_le16(ALIGN(de_key_size, 8) +
+ sizeof(struct NTFS_DE));
+ de2->flags = 0;
+ de2->res = 0;
+
+ if (indx_insert_entry(&dir_ni->dir, dir_ni, de2, sbi, NULL,
+ 1)) {
+ return false;
+ }
+ fallthrough;
+
+ case 2:
+ de_key_size = le16_to_cpu(de->key_size);
+
+ if (ni_insert_resident(ni, de_key_size, ATTR_NAME, NULL, 0,
+ &attr, NULL, NULL)) {
+ return false;
+ }
+
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), de + 1, de_key_size);
+ mi_get_ref(&ni->mi, &de->ref);
+
+ if (indx_insert_entry(&dir_ni->dir, dir_ni, de, sbi, NULL, 1)) {
+ return false;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * ni_add_name - Add new name in MFT and in directory.
+ */
+int ni_add_name(struct ntfs_inode *dir_ni, struct ntfs_inode *ni,
+ struct NTFS_DE *de)
+{
+ int err;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ struct mft_inode *mi;
+ struct ATTR_FILE_NAME *de_name = (struct ATTR_FILE_NAME *)(de + 1);
+ u16 de_key_size = le16_to_cpu(de->key_size);
+
+ mi_get_ref(&ni->mi, &de->ref);
+ mi_get_ref(&dir_ni->mi, &de_name->home);
+
+ /* Insert new name in MFT. */
+ err = ni_insert_resident(ni, de_key_size, ATTR_NAME, NULL, 0, &attr,
+ &mi, &le);
+ if (err)
+ return err;
+
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), de_name, de_key_size);
+
+ /* Insert new name in directory. */
+ err = indx_insert_entry(&dir_ni->dir, dir_ni, de, ni->mi.sbi, NULL, 0);
+ if (err)
+ ni_remove_attr_le(ni, attr, mi, le);
+
+ return err;
+}
+
+/*
+ * ni_rename - Remove one name and insert new name.
+ */
+int ni_rename(struct ntfs_inode *dir_ni, struct ntfs_inode *new_dir_ni,
+ struct ntfs_inode *ni, struct NTFS_DE *de, struct NTFS_DE *new_de,
+ bool *is_bad)
+{
+ int err;
+ struct NTFS_DE *de2 = NULL;
+ int undo = 0;
+
+ /*
+ * There are two possible ways to rename:
+ * 1) Add new name and remove old name.
+ * 2) Remove old name and add new name.
+ *
+ * In most cases (not all!) adding new name in MFT and in directory can
+ * allocate additional cluster(s).
+ * Second way may result to bad inode if we can't add new name
+ * and then can't restore (add) old name.
+ */
+
+ /*
+ * Way 1 - Add new + remove old.
+ */
+ err = ni_add_name(new_dir_ni, ni, new_de);
+ if (!err) {
+ err = ni_remove_name(dir_ni, ni, de, &de2, &undo);
+ if (err && ni_remove_name(new_dir_ni, ni, new_de, &de2, &undo))
+ *is_bad = true;
+ }
+
+ /*
+ * Way 2 - Remove old + add new.
+ */
+ /*
+ * err = ni_remove_name(dir_ni, ni, de, &de2, &undo);
+ * if (!err) {
+ * err = ni_add_name(new_dir_ni, ni, new_de);
+ * if (err && !ni_remove_name_undo(dir_ni, ni, de, de2, undo))
+ * *is_bad = true;
+ * }
+ */
+
+ return err;
+}
+
+/*
+ * ni_is_dirty - Return: True if 'ni' requires ni_write_inode.
+ */
+bool ni_is_dirty(struct inode *inode)
+{
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct rb_node *node;
+
+ if (ni->mi.dirty || ni->attr_list.dirty ||
+ (ni->ni_flags & NI_FLAG_UPDATE_PARENT))
+ return true;
+
+ for (node = rb_first(&ni->mi_tree); node; node = rb_next(node)) {
+ if (rb_entry(node, struct mft_inode, node)->dirty)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * ni_update_parent
+ *
+ * Update duplicate info of ATTR_FILE_NAME in MFT and in parent directories.
+ */
+static bool ni_update_parent(struct ntfs_inode *ni, struct NTFS_DUP_INFO *dup,
+ int sync)
+{
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct super_block *sb = sbi->sb;
+ bool re_dirty = false;
+
+ if (ni->mi.mrec->flags & RECORD_FLAG_DIR) {
+ dup->fa |= FILE_ATTRIBUTE_DIRECTORY;
+ attr = NULL;
+ dup->alloc_size = 0;
+ dup->data_size = 0;
+ } else {
+ dup->fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+
+ attr = ni_find_attr(ni, NULL, &le, ATTR_DATA, NULL, 0, NULL,
+ &mi);
+ if (!attr) {
+ dup->alloc_size = dup->data_size = 0;
+ } else if (!attr->non_res) {
+ u32 data_size = le32_to_cpu(attr->res.data_size);
+
+ dup->alloc_size = cpu_to_le64(ALIGN(data_size, 8));
+ dup->data_size = cpu_to_le64(data_size);
+ } else {
+ u64 new_valid = ni->i_valid;
+ u64 data_size = le64_to_cpu(attr->nres.data_size);
+ __le64 valid_le;
+
+ dup->alloc_size = is_attr_ext(attr)
+ ? attr->nres.total_size
+ : attr->nres.alloc_size;
+ dup->data_size = attr->nres.data_size;
+
+ if (new_valid > data_size)
+ new_valid = data_size;
+
+ valid_le = cpu_to_le64(new_valid);
+ if (valid_le != attr->nres.valid_size) {
+ attr->nres.valid_size = valid_le;
+ mi->dirty = true;
+ }
+ }
+ }
+
+ /* TODO: Fill reparse info. */
+ dup->reparse = 0;
+ dup->ea_size = 0;
+
+ if (ni->ni_flags & NI_FLAG_EA) {
+ attr = ni_find_attr(ni, attr, &le, ATTR_EA_INFO, NULL, 0, NULL,
+ NULL);
+ if (attr) {
+ const struct EA_INFO *info;
+
+ info = resident_data_ex(attr, sizeof(struct EA_INFO));
+ dup->ea_size = info->size_pack;
+ }
+ }
+
+ attr = NULL;
+ le = NULL;
+
+ while ((attr = ni_find_attr(ni, attr, &le, ATTR_NAME, NULL, 0, NULL,
+ &mi))) {
+ struct inode *dir;
+ struct ATTR_FILE_NAME *fname;
+
+ fname = resident_data_ex(attr, SIZEOF_ATTRIBUTE_FILENAME);
+ if (!fname || !memcmp(&fname->dup, dup, sizeof(fname->dup)))
+ continue;
+
+ /* ntfs_iget5 may sleep. */
+ dir = ntfs_iget5(sb, &fname->home, NULL);
+ if (IS_ERR(dir)) {
+ ntfs_inode_warn(
+ &ni->vfs_inode,
+ "failed to open parent directory r=%lx to update",
+ (long)ino_get(&fname->home));
+ continue;
+ }
+
+ if (!is_bad_inode(dir)) {
+ struct ntfs_inode *dir_ni = ntfs_i(dir);
+
+ if (!ni_trylock(dir_ni)) {
+ re_dirty = true;
+ } else {
+ indx_update_dup(dir_ni, sbi, fname, dup, sync);
+ ni_unlock(dir_ni);
+ memcpy(&fname->dup, dup, sizeof(fname->dup));
+ mi->dirty = true;
+ }
+ }
+ iput(dir);
+ }
+
+ return re_dirty;
+}
+
+/*
+ * ni_write_inode - Write MFT base record and all subrecords to disk.
+ */
+int ni_write_inode(struct inode *inode, int sync, const char *hint)
+{
+ int err = 0, err2;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ bool re_dirty = false;
+ struct ATTR_STD_INFO *std;
+ struct rb_node *node, *next;
+ struct NTFS_DUP_INFO dup;
+
+ if (is_bad_inode(inode) || sb_rdonly(sb))
+ return 0;
+
+ if (!ni_trylock(ni)) {
+ /* 'ni' is under modification, skip for now. */
+ mark_inode_dirty_sync(inode);
+ return 0;
+ }
+
+ if (is_rec_inuse(ni->mi.mrec) &&
+ !(sbi->flags & NTFS_FLAGS_LOG_REPLAYING) && inode->i_nlink) {
+ bool modified = false;
+
+ /* Update times in standard attribute. */
+ std = ni_std(ni);
+ if (!std) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Update the access times if they have changed. */
+ dup.m_time = kernel2nt(&inode->i_mtime);
+ if (std->m_time != dup.m_time) {
+ std->m_time = dup.m_time;
+ modified = true;
+ }
+
+ dup.c_time = kernel2nt(&inode->i_ctime);
+ if (std->c_time != dup.c_time) {
+ std->c_time = dup.c_time;
+ modified = true;
+ }
+
+ dup.a_time = kernel2nt(&inode->i_atime);
+ if (std->a_time != dup.a_time) {
+ std->a_time = dup.a_time;
+ modified = true;
+ }
+
+ dup.fa = ni->std_fa;
+ if (std->fa != dup.fa) {
+ std->fa = dup.fa;
+ modified = true;
+ }
+
+ if (modified)
+ ni->mi.dirty = true;
+
+ if (!ntfs_is_meta_file(sbi, inode->i_ino) &&
+ (modified || (ni->ni_flags & NI_FLAG_UPDATE_PARENT))
+ /* Avoid __wait_on_freeing_inode(inode). */
+ && (sb->s_flags & SB_ACTIVE)) {
+ dup.cr_time = std->cr_time;
+ /* Not critical if this function fail. */
+ re_dirty = ni_update_parent(ni, &dup, sync);
+
+ if (re_dirty)
+ ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+ else
+ ni->ni_flags &= ~NI_FLAG_UPDATE_PARENT;
+ }
+
+ /* Update attribute list. */
+ if (ni->attr_list.size && ni->attr_list.dirty) {
+ if (inode->i_ino != MFT_REC_MFT || sync) {
+ err = ni_try_remove_attr_list(ni);
+ if (err)
+ goto out;
+ }
+
+ err = al_update(ni);
+ if (err)
+ goto out;
+ }
+ }
+
+ for (node = rb_first(&ni->mi_tree); node; node = next) {
+ struct mft_inode *mi = rb_entry(node, struct mft_inode, node);
+ bool is_empty;
+
+ next = rb_next(node);
+
+ if (!mi->dirty)
+ continue;
+
+ is_empty = !mi_enum_attr(mi, NULL);
+
+ if (is_empty)
+ clear_rec_inuse(mi->mrec);
+
+ err2 = mi_write(mi, sync);
+ if (!err && err2)
+ err = err2;
+
+ if (is_empty) {
+ ntfs_mark_rec_free(sbi, mi->rno);
+ rb_erase(node, &ni->mi_tree);
+ mi_put(mi);
+ }
+ }
+
+ if (ni->mi.dirty) {
+ err2 = mi_write(&ni->mi, sync);
+ if (!err && err2)
+ err = err2;
+ }
+out:
+ ni_unlock(ni);
+
+ if (err) {
+ ntfs_err(sb, "%s r=%lx failed, %d.", hint, inode->i_ino, err);
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ return err;
+ }
+
+ if (re_dirty)
+ mark_inode_dirty_sync(inode);
+
+ return 0;
+}
diff --git a/fs/ntfs3/fslog.c b/fs/ntfs3/fslog.c
new file mode 100644
index 000000000000..b5853aed0e25
--- /dev/null
+++ b/fs/ntfs3/fslog.c
@@ -0,0 +1,5217 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/nls.h>
+#include <linux/random.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * LOG FILE structs
+ */
+
+// clang-format off
+
+#define MaxLogFileSize 0x100000000ull
+#define DefaultLogPageSize 4096
+#define MinLogRecordPages 0x30
+
+struct RESTART_HDR {
+ struct NTFS_RECORD_HEADER rhdr; // 'RSTR'
+ __le32 sys_page_size; // 0x10: Page size of the system which initialized the log.
+ __le32 page_size; // 0x14: Log page size used for this log file.
+ __le16 ra_off; // 0x18:
+ __le16 minor_ver; // 0x1A:
+ __le16 major_ver; // 0x1C:
+ __le16 fixups[];
+};
+
+#define LFS_NO_CLIENT 0xffff
+#define LFS_NO_CLIENT_LE cpu_to_le16(0xffff)
+
+struct CLIENT_REC {
+ __le64 oldest_lsn;
+ __le64 restart_lsn; // 0x08:
+ __le16 prev_client; // 0x10:
+ __le16 next_client; // 0x12:
+ __le16 seq_num; // 0x14:
+ u8 align[6]; // 0x16:
+ __le32 name_bytes; // 0x1C: In bytes.
+ __le16 name[32]; // 0x20: Name of client.
+};
+
+static_assert(sizeof(struct CLIENT_REC) == 0x60);
+
+/* Two copies of these will exist at the beginning of the log file */
+struct RESTART_AREA {
+ __le64 current_lsn; // 0x00: Current logical end of log file.
+ __le16 log_clients; // 0x08: Maximum number of clients.
+ __le16 client_idx[2]; // 0x0A: Free/use index into the client record arrays.
+ __le16 flags; // 0x0E: See RESTART_SINGLE_PAGE_IO.
+ __le32 seq_num_bits; // 0x10: The number of bits in sequence number.
+ __le16 ra_len; // 0x14:
+ __le16 client_off; // 0x16:
+ __le64 l_size; // 0x18: Usable log file size.
+ __le32 last_lsn_data_len; // 0x20:
+ __le16 rec_hdr_len; // 0x24: Log page data offset.
+ __le16 data_off; // 0x26: Log page data length.
+ __le32 open_log_count; // 0x28:
+ __le32 align[5]; // 0x2C:
+ struct CLIENT_REC clients[]; // 0x40:
+};
+
+struct LOG_REC_HDR {
+ __le16 redo_op; // 0x00: NTFS_LOG_OPERATION
+ __le16 undo_op; // 0x02: NTFS_LOG_OPERATION
+ __le16 redo_off; // 0x04: Offset to Redo record.
+ __le16 redo_len; // 0x06: Redo length.
+ __le16 undo_off; // 0x08: Offset to Undo record.
+ __le16 undo_len; // 0x0A: Undo length.
+ __le16 target_attr; // 0x0C:
+ __le16 lcns_follow; // 0x0E:
+ __le16 record_off; // 0x10:
+ __le16 attr_off; // 0x12:
+ __le16 cluster_off; // 0x14:
+ __le16 reserved; // 0x16:
+ __le64 target_vcn; // 0x18:
+ __le64 page_lcns[]; // 0x20:
+};
+
+static_assert(sizeof(struct LOG_REC_HDR) == 0x20);
+
+#define RESTART_ENTRY_ALLOCATED 0xFFFFFFFF
+#define RESTART_ENTRY_ALLOCATED_LE cpu_to_le32(0xFFFFFFFF)
+
+struct RESTART_TABLE {
+ __le16 size; // 0x00: In bytes
+ __le16 used; // 0x02: Entries
+ __le16 total; // 0x04: Entries
+ __le16 res[3]; // 0x06:
+ __le32 free_goal; // 0x0C:
+ __le32 first_free; // 0x10:
+ __le32 last_free; // 0x14:
+
+};
+
+static_assert(sizeof(struct RESTART_TABLE) == 0x18);
+
+struct ATTR_NAME_ENTRY {
+ __le16 off; // Offset in the Open attribute Table.
+ __le16 name_bytes;
+ __le16 name[];
+};
+
+struct OPEN_ATTR_ENRTY {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 bytes_per_index; // 0x04:
+ enum ATTR_TYPE type; // 0x08:
+ u8 is_dirty_pages; // 0x0C:
+ u8 is_attr_name; // 0x0B: Faked field to manage 'ptr'
+ u8 name_len; // 0x0C: Faked field to manage 'ptr'
+ u8 res;
+ struct MFT_REF ref; // 0x10: File Reference of file containing attribute
+ __le64 open_record_lsn; // 0x18:
+ void *ptr; // 0x20:
+};
+
+/* 32 bit version of 'struct OPEN_ATTR_ENRTY' */
+struct OPEN_ATTR_ENRTY_32 {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 ptr; // 0x04:
+ struct MFT_REF ref; // 0x08:
+ __le64 open_record_lsn; // 0x10:
+ u8 is_dirty_pages; // 0x18:
+ u8 is_attr_name; // 0x19:
+ u8 res1[2];
+ enum ATTR_TYPE type; // 0x1C:
+ u8 name_len; // 0x20: In wchar
+ u8 res2[3];
+ __le32 AttributeName; // 0x24:
+ __le32 bytes_per_index; // 0x28:
+};
+
+#define SIZEOF_OPENATTRIBUTEENTRY0 0x2c
+// static_assert( 0x2C == sizeof(struct OPEN_ATTR_ENRTY_32) );
+static_assert(sizeof(struct OPEN_ATTR_ENRTY) < SIZEOF_OPENATTRIBUTEENTRY0);
+
+/*
+ * One entry exists in the Dirty Pages Table for each page which is dirty at
+ * the time the Restart Area is written.
+ */
+struct DIR_PAGE_ENTRY {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 target_attr; // 0x04: Index into the Open attribute Table
+ __le32 transfer_len; // 0x08:
+ __le32 lcns_follow; // 0x0C:
+ __le64 vcn; // 0x10: Vcn of dirty page
+ __le64 oldest_lsn; // 0x18:
+ __le64 page_lcns[]; // 0x20:
+};
+
+static_assert(sizeof(struct DIR_PAGE_ENTRY) == 0x20);
+
+/* 32 bit version of 'struct DIR_PAGE_ENTRY' */
+struct DIR_PAGE_ENTRY_32 {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ __le32 target_attr; // 0x04: Index into the Open attribute Table
+ __le32 transfer_len; // 0x08:
+ __le32 lcns_follow; // 0x0C:
+ __le32 reserved; // 0x10:
+ __le32 vcn_low; // 0x14: Vcn of dirty page
+ __le32 vcn_hi; // 0x18: Vcn of dirty page
+ __le32 oldest_lsn_low; // 0x1C:
+ __le32 oldest_lsn_hi; // 0x1C:
+ __le32 page_lcns_low; // 0x24:
+ __le32 page_lcns_hi; // 0x24:
+};
+
+static_assert(offsetof(struct DIR_PAGE_ENTRY_32, vcn_low) == 0x14);
+static_assert(sizeof(struct DIR_PAGE_ENTRY_32) == 0x2c);
+
+enum transact_state {
+ TransactionUninitialized = 0,
+ TransactionActive,
+ TransactionPrepared,
+ TransactionCommitted
+};
+
+struct TRANSACTION_ENTRY {
+ __le32 next; // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+ u8 transact_state; // 0x04:
+ u8 reserved[3]; // 0x05:
+ __le64 first_lsn; // 0x08:
+ __le64 prev_lsn; // 0x10:
+ __le64 undo_next_lsn; // 0x18:
+ __le32 undo_records; // 0x20: Number of undo log records pending abort
+ __le32 undo_len; // 0x24: Total undo size
+};
+
+static_assert(sizeof(struct TRANSACTION_ENTRY) == 0x28);
+
+struct NTFS_RESTART {
+ __le32 major_ver; // 0x00:
+ __le32 minor_ver; // 0x04:
+ __le64 check_point_start; // 0x08:
+ __le64 open_attr_table_lsn; // 0x10:
+ __le64 attr_names_lsn; // 0x18:
+ __le64 dirty_pages_table_lsn; // 0x20:
+ __le64 transact_table_lsn; // 0x28:
+ __le32 open_attr_len; // 0x30: In bytes
+ __le32 attr_names_len; // 0x34: In bytes
+ __le32 dirty_pages_len; // 0x38: In bytes
+ __le32 transact_table_len; // 0x3C: In bytes
+};
+
+static_assert(sizeof(struct NTFS_RESTART) == 0x40);
+
+struct NEW_ATTRIBUTE_SIZES {
+ __le64 alloc_size;
+ __le64 valid_size;
+ __le64 data_size;
+ __le64 total_size;
+};
+
+struct BITMAP_RANGE {
+ __le32 bitmap_off;
+ __le32 bits;
+};
+
+struct LCN_RANGE {
+ __le64 lcn;
+ __le64 len;
+};
+
+/* The following type defines the different log record types. */
+#define LfsClientRecord cpu_to_le32(1)
+#define LfsClientRestart cpu_to_le32(2)
+
+/* This is used to uniquely identify a client for a particular log file. */
+struct CLIENT_ID {
+ __le16 seq_num;
+ __le16 client_idx;
+};
+
+/* This is the header that begins every Log Record in the log file. */
+struct LFS_RECORD_HDR {
+ __le64 this_lsn; // 0x00:
+ __le64 client_prev_lsn; // 0x08:
+ __le64 client_undo_next_lsn; // 0x10:
+ __le32 client_data_len; // 0x18:
+ struct CLIENT_ID client; // 0x1C: Owner of this log record.
+ __le32 record_type; // 0x20: LfsClientRecord or LfsClientRestart.
+ __le32 transact_id; // 0x24:
+ __le16 flags; // 0x28: LOG_RECORD_MULTI_PAGE
+ u8 align[6]; // 0x2A:
+};
+
+#define LOG_RECORD_MULTI_PAGE cpu_to_le16(1)
+
+static_assert(sizeof(struct LFS_RECORD_HDR) == 0x30);
+
+struct LFS_RECORD {
+ __le16 next_record_off; // 0x00: Offset of the free space in the page,
+ u8 align[6]; // 0x02:
+ __le64 last_end_lsn; // 0x08: lsn for the last log record which ends on the page,
+};
+
+static_assert(sizeof(struct LFS_RECORD) == 0x10);
+
+struct RECORD_PAGE_HDR {
+ struct NTFS_RECORD_HEADER rhdr; // 'RCRD'
+ __le32 rflags; // 0x10: See LOG_PAGE_LOG_RECORD_END
+ __le16 page_count; // 0x14:
+ __le16 page_pos; // 0x16:
+ struct LFS_RECORD record_hdr; // 0x18:
+ __le16 fixups[10]; // 0x28:
+ __le32 file_off; // 0x3c: Used when major version >= 2
+};
+
+// clang-format on
+
+// Page contains the end of a log record.
+#define LOG_PAGE_LOG_RECORD_END cpu_to_le32(0x00000001)
+
+static inline bool is_log_record_end(const struct RECORD_PAGE_HDR *hdr)
+{
+ return hdr->rflags & LOG_PAGE_LOG_RECORD_END;
+}
+
+static_assert(offsetof(struct RECORD_PAGE_HDR, file_off) == 0x3c);
+
+/*
+ * END of NTFS LOG structures
+ */
+
+/* Define some tuning parameters to keep the restart tables a reasonable size. */
+#define INITIAL_NUMBER_TRANSACTIONS 5
+
+enum NTFS_LOG_OPERATION {
+
+ Noop = 0x00,
+ CompensationLogRecord = 0x01,
+ InitializeFileRecordSegment = 0x02,
+ DeallocateFileRecordSegment = 0x03,
+ WriteEndOfFileRecordSegment = 0x04,
+ CreateAttribute = 0x05,
+ DeleteAttribute = 0x06,
+ UpdateResidentValue = 0x07,
+ UpdateNonresidentValue = 0x08,
+ UpdateMappingPairs = 0x09,
+ DeleteDirtyClusters = 0x0A,
+ SetNewAttributeSizes = 0x0B,
+ AddIndexEntryRoot = 0x0C,
+ DeleteIndexEntryRoot = 0x0D,
+ AddIndexEntryAllocation = 0x0E,
+ DeleteIndexEntryAllocation = 0x0F,
+ WriteEndOfIndexBuffer = 0x10,
+ SetIndexEntryVcnRoot = 0x11,
+ SetIndexEntryVcnAllocation = 0x12,
+ UpdateFileNameRoot = 0x13,
+ UpdateFileNameAllocation = 0x14,
+ SetBitsInNonresidentBitMap = 0x15,
+ ClearBitsInNonresidentBitMap = 0x16,
+ HotFix = 0x17,
+ EndTopLevelAction = 0x18,
+ PrepareTransaction = 0x19,
+ CommitTransaction = 0x1A,
+ ForgetTransaction = 0x1B,
+ OpenNonresidentAttribute = 0x1C,
+ OpenAttributeTableDump = 0x1D,
+ AttributeNamesDump = 0x1E,
+ DirtyPageTableDump = 0x1F,
+ TransactionTableDump = 0x20,
+ UpdateRecordDataRoot = 0x21,
+ UpdateRecordDataAllocation = 0x22,
+
+ UpdateRelativeDataInIndex =
+ 0x23, // NtOfsRestartUpdateRelativeDataInIndex
+ UpdateRelativeDataInIndex2 = 0x24,
+ ZeroEndOfFileRecord = 0x25,
+};
+
+/*
+ * Array for log records which require a target attribute.
+ * A true indicates that the corresponding restart operation
+ * requires a target attribute.
+ */
+static const u8 AttributeRequired[] = {
+ 0xFC, 0xFB, 0xFF, 0x10, 0x06,
+};
+
+static inline bool is_target_required(u16 op)
+{
+ bool ret = op <= UpdateRecordDataAllocation &&
+ (AttributeRequired[op >> 3] >> (op & 7) & 1);
+ return ret;
+}
+
+static inline bool can_skip_action(enum NTFS_LOG_OPERATION op)
+{
+ switch (op) {
+ case Noop:
+ case DeleteDirtyClusters:
+ case HotFix:
+ case EndTopLevelAction:
+ case PrepareTransaction:
+ case CommitTransaction:
+ case ForgetTransaction:
+ case CompensationLogRecord:
+ case OpenNonresidentAttribute:
+ case OpenAttributeTableDump:
+ case AttributeNamesDump:
+ case DirtyPageTableDump:
+ case TransactionTableDump:
+ return true;
+ default:
+ return false;
+ }
+}
+
+enum { lcb_ctx_undo_next, lcb_ctx_prev, lcb_ctx_next };
+
+/* Bytes per restart table. */
+static inline u32 bytes_per_rt(const struct RESTART_TABLE *rt)
+{
+ return le16_to_cpu(rt->used) * le16_to_cpu(rt->size) +
+ sizeof(struct RESTART_TABLE);
+}
+
+/* Log record length. */
+static inline u32 lrh_length(const struct LOG_REC_HDR *lr)
+{
+ u16 t16 = le16_to_cpu(lr->lcns_follow);
+
+ return struct_size(lr, page_lcns, max_t(u16, 1, t16));
+}
+
+struct lcb {
+ struct LFS_RECORD_HDR *lrh; // Log record header of the current lsn.
+ struct LOG_REC_HDR *log_rec;
+ u32 ctx_mode; // lcb_ctx_undo_next/lcb_ctx_prev/lcb_ctx_next
+ struct CLIENT_ID client;
+ bool alloc; // If true the we should deallocate 'log_rec'.
+};
+
+static void lcb_put(struct lcb *lcb)
+{
+ if (lcb->alloc)
+ kfree(lcb->log_rec);
+ kfree(lcb->lrh);
+ kfree(lcb);
+}
+
+/* Find the oldest lsn from active clients. */
+static inline void oldest_client_lsn(const struct CLIENT_REC *ca,
+ __le16 next_client, u64 *oldest_lsn)
+{
+ while (next_client != LFS_NO_CLIENT_LE) {
+ const struct CLIENT_REC *cr = ca + le16_to_cpu(next_client);
+ u64 lsn = le64_to_cpu(cr->oldest_lsn);
+
+ /* Ignore this block if it's oldest lsn is 0. */
+ if (lsn && lsn < *oldest_lsn)
+ *oldest_lsn = lsn;
+
+ next_client = cr->next_client;
+ }
+}
+
+static inline bool is_rst_page_hdr_valid(u32 file_off,
+ const struct RESTART_HDR *rhdr)
+{
+ u32 sys_page = le32_to_cpu(rhdr->sys_page_size);
+ u32 page_size = le32_to_cpu(rhdr->page_size);
+ u32 end_usa;
+ u16 ro;
+
+ if (sys_page < SECTOR_SIZE || page_size < SECTOR_SIZE ||
+ sys_page & (sys_page - 1) || page_size & (page_size - 1)) {
+ return false;
+ }
+
+ /* Check that if the file offset isn't 0, it is the system page size. */
+ if (file_off && file_off != sys_page)
+ return false;
+
+ /* Check support version 1.1+. */
+ if (le16_to_cpu(rhdr->major_ver) <= 1 && !rhdr->minor_ver)
+ return false;
+
+ if (le16_to_cpu(rhdr->major_ver) > 2)
+ return false;
+
+ ro = le16_to_cpu(rhdr->ra_off);
+ if (!IS_ALIGNED(ro, 8) || ro > sys_page)
+ return false;
+
+ end_usa = ((sys_page >> SECTOR_SHIFT) + 1) * sizeof(short);
+ end_usa += le16_to_cpu(rhdr->rhdr.fix_off);
+
+ if (ro < end_usa)
+ return false;
+
+ return true;
+}
+
+static inline bool is_rst_area_valid(const struct RESTART_HDR *rhdr)
+{
+ const struct RESTART_AREA *ra;
+ u16 cl, fl, ul;
+ u32 off, l_size, file_dat_bits, file_size_round;
+ u16 ro = le16_to_cpu(rhdr->ra_off);
+ u32 sys_page = le32_to_cpu(rhdr->sys_page_size);
+
+ if (ro + offsetof(struct RESTART_AREA, l_size) >
+ SECTOR_SIZE - sizeof(short))
+ return false;
+
+ ra = Add2Ptr(rhdr, ro);
+ cl = le16_to_cpu(ra->log_clients);
+
+ if (cl > 1)
+ return false;
+
+ off = le16_to_cpu(ra->client_off);
+
+ if (!IS_ALIGNED(off, 8) || ro + off > SECTOR_SIZE - sizeof(short))
+ return false;
+
+ off += cl * sizeof(struct CLIENT_REC);
+
+ if (off > sys_page)
+ return false;
+
+ /*
+ * Check the restart length field and whether the entire
+ * restart area is contained that length.
+ */
+ if (le16_to_cpu(rhdr->ra_off) + le16_to_cpu(ra->ra_len) > sys_page ||
+ off > le16_to_cpu(ra->ra_len)) {
+ return false;
+ }
+
+ /*
+ * As a final check make sure that the use list and the free list
+ * are either empty or point to a valid client.
+ */
+ fl = le16_to_cpu(ra->client_idx[0]);
+ ul = le16_to_cpu(ra->client_idx[1]);
+ if ((fl != LFS_NO_CLIENT && fl >= cl) ||
+ (ul != LFS_NO_CLIENT && ul >= cl))
+ return false;
+
+ /* Make sure the sequence number bits match the log file size. */
+ l_size = le64_to_cpu(ra->l_size);
+
+ file_dat_bits = sizeof(u64) * 8 - le32_to_cpu(ra->seq_num_bits);
+ file_size_round = 1u << (file_dat_bits + 3);
+ if (file_size_round != l_size &&
+ (file_size_round < l_size || (file_size_round / 2) > l_size)) {
+ return false;
+ }
+
+ /* The log page data offset and record header length must be quad-aligned. */
+ if (!IS_ALIGNED(le16_to_cpu(ra->data_off), 8) ||
+ !IS_ALIGNED(le16_to_cpu(ra->rec_hdr_len), 8))
+ return false;
+
+ return true;
+}
+
+static inline bool is_client_area_valid(const struct RESTART_HDR *rhdr,
+ bool usa_error)
+{
+ u16 ro = le16_to_cpu(rhdr->ra_off);
+ const struct RESTART_AREA *ra = Add2Ptr(rhdr, ro);
+ u16 ra_len = le16_to_cpu(ra->ra_len);
+ const struct CLIENT_REC *ca;
+ u32 i;
+
+ if (usa_error && ra_len + ro > SECTOR_SIZE - sizeof(short))
+ return false;
+
+ /* Find the start of the client array. */
+ ca = Add2Ptr(ra, le16_to_cpu(ra->client_off));
+
+ /*
+ * Start with the free list.
+ * Check that all the clients are valid and that there isn't a cycle.
+ * Do the in-use list on the second pass.
+ */
+ for (i = 0; i < 2; i++) {
+ u16 client_idx = le16_to_cpu(ra->client_idx[i]);
+ bool first_client = true;
+ u16 clients = le16_to_cpu(ra->log_clients);
+
+ while (client_idx != LFS_NO_CLIENT) {
+ const struct CLIENT_REC *cr;
+
+ if (!clients ||
+ client_idx >= le16_to_cpu(ra->log_clients))
+ return false;
+
+ clients -= 1;
+ cr = ca + client_idx;
+
+ client_idx = le16_to_cpu(cr->next_client);
+
+ if (first_client) {
+ first_client = false;
+ if (cr->prev_client != LFS_NO_CLIENT_LE)
+ return false;
+ }
+ }
+ }
+
+ return true;
+}
+
+/*
+ * remove_client
+ *
+ * Remove a client record from a client record list an restart area.
+ */
+static inline void remove_client(struct CLIENT_REC *ca,
+ const struct CLIENT_REC *cr, __le16 *head)
+{
+ if (cr->prev_client == LFS_NO_CLIENT_LE)
+ *head = cr->next_client;
+ else
+ ca[le16_to_cpu(cr->prev_client)].next_client = cr->next_client;
+
+ if (cr->next_client != LFS_NO_CLIENT_LE)
+ ca[le16_to_cpu(cr->next_client)].prev_client = cr->prev_client;
+}
+
+/*
+ * add_client - Add a client record to the start of a list.
+ */
+static inline void add_client(struct CLIENT_REC *ca, u16 index, __le16 *head)
+{
+ struct CLIENT_REC *cr = ca + index;
+
+ cr->prev_client = LFS_NO_CLIENT_LE;
+ cr->next_client = *head;
+
+ if (*head != LFS_NO_CLIENT_LE)
+ ca[le16_to_cpu(*head)].prev_client = cpu_to_le16(index);
+
+ *head = cpu_to_le16(index);
+}
+
+static inline void *enum_rstbl(struct RESTART_TABLE *t, void *c)
+{
+ __le32 *e;
+ u32 bprt;
+ u16 rsize = t ? le16_to_cpu(t->size) : 0;
+
+ if (!c) {
+ if (!t || !t->total)
+ return NULL;
+ e = Add2Ptr(t, sizeof(struct RESTART_TABLE));
+ } else {
+ e = Add2Ptr(c, rsize);
+ }
+
+ /* Loop until we hit the first one allocated, or the end of the list. */
+ for (bprt = bytes_per_rt(t); PtrOffset(t, e) < bprt;
+ e = Add2Ptr(e, rsize)) {
+ if (*e == RESTART_ENTRY_ALLOCATED_LE)
+ return e;
+ }
+ return NULL;
+}
+
+/*
+ * find_dp - Search for a @vcn in Dirty Page Table.
+ */
+static inline struct DIR_PAGE_ENTRY *find_dp(struct RESTART_TABLE *dptbl,
+ u32 target_attr, u64 vcn)
+{
+ __le32 ta = cpu_to_le32(target_attr);
+ struct DIR_PAGE_ENTRY *dp = NULL;
+
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ u64 dp_vcn = le64_to_cpu(dp->vcn);
+
+ if (dp->target_attr == ta && vcn >= dp_vcn &&
+ vcn < dp_vcn + le32_to_cpu(dp->lcns_follow)) {
+ return dp;
+ }
+ }
+ return NULL;
+}
+
+static inline u32 norm_file_page(u32 page_size, u32 *l_size, bool use_default)
+{
+ if (use_default)
+ page_size = DefaultLogPageSize;
+
+ /* Round the file size down to a system page boundary. */
+ *l_size &= ~(page_size - 1);
+
+ /* File should contain at least 2 restart pages and MinLogRecordPages pages. */
+ if (*l_size < (MinLogRecordPages + 2) * page_size)
+ return 0;
+
+ return page_size;
+}
+
+static bool check_log_rec(const struct LOG_REC_HDR *lr, u32 bytes, u32 tr,
+ u32 bytes_per_attr_entry)
+{
+ u16 t16;
+
+ if (bytes < sizeof(struct LOG_REC_HDR))
+ return false;
+ if (!tr)
+ return false;
+
+ if ((tr - sizeof(struct RESTART_TABLE)) %
+ sizeof(struct TRANSACTION_ENTRY))
+ return false;
+
+ if (le16_to_cpu(lr->redo_off) & 7)
+ return false;
+
+ if (le16_to_cpu(lr->undo_off) & 7)
+ return false;
+
+ if (lr->target_attr)
+ goto check_lcns;
+
+ if (is_target_required(le16_to_cpu(lr->redo_op)))
+ return false;
+
+ if (is_target_required(le16_to_cpu(lr->undo_op)))
+ return false;
+
+check_lcns:
+ if (!lr->lcns_follow)
+ goto check_length;
+
+ t16 = le16_to_cpu(lr->target_attr);
+ if ((t16 - sizeof(struct RESTART_TABLE)) % bytes_per_attr_entry)
+ return false;
+
+check_length:
+ if (bytes < lrh_length(lr))
+ return false;
+
+ return true;
+}
+
+static bool check_rstbl(const struct RESTART_TABLE *rt, size_t bytes)
+{
+ u32 ts;
+ u32 i, off;
+ u16 rsize = le16_to_cpu(rt->size);
+ u16 ne = le16_to_cpu(rt->used);
+ u32 ff = le32_to_cpu(rt->first_free);
+ u32 lf = le32_to_cpu(rt->last_free);
+
+ ts = rsize * ne + sizeof(struct RESTART_TABLE);
+
+ if (!rsize || rsize > bytes ||
+ rsize + sizeof(struct RESTART_TABLE) > bytes || bytes < ts ||
+ le16_to_cpu(rt->total) > ne || ff > ts || lf > ts ||
+ (ff && ff < sizeof(struct RESTART_TABLE)) ||
+ (lf && lf < sizeof(struct RESTART_TABLE))) {
+ return false;
+ }
+
+ /*
+ * Verify each entry is either allocated or points
+ * to a valid offset the table.
+ */
+ for (i = 0; i < ne; i++) {
+ off = le32_to_cpu(*(__le32 *)Add2Ptr(
+ rt, i * rsize + sizeof(struct RESTART_TABLE)));
+
+ if (off != RESTART_ENTRY_ALLOCATED && off &&
+ (off < sizeof(struct RESTART_TABLE) ||
+ ((off - sizeof(struct RESTART_TABLE)) % rsize))) {
+ return false;
+ }
+ }
+
+ /*
+ * Walk through the list headed by the first entry to make
+ * sure none of the entries are currently being used.
+ */
+ for (off = ff; off;) {
+ if (off == RESTART_ENTRY_ALLOCATED)
+ return false;
+
+ off = le32_to_cpu(*(__le32 *)Add2Ptr(rt, off));
+ }
+
+ return true;
+}
+
+/*
+ * free_rsttbl_idx - Free a previously allocated index a Restart Table.
+ */
+static inline void free_rsttbl_idx(struct RESTART_TABLE *rt, u32 off)
+{
+ __le32 *e;
+ u32 lf = le32_to_cpu(rt->last_free);
+ __le32 off_le = cpu_to_le32(off);
+
+ e = Add2Ptr(rt, off);
+
+ if (off < le32_to_cpu(rt->free_goal)) {
+ *e = rt->first_free;
+ rt->first_free = off_le;
+ if (!lf)
+ rt->last_free = off_le;
+ } else {
+ if (lf)
+ *(__le32 *)Add2Ptr(rt, lf) = off_le;
+ else
+ rt->first_free = off_le;
+
+ rt->last_free = off_le;
+ *e = 0;
+ }
+
+ le16_sub_cpu(&rt->total, 1);
+}
+
+static inline struct RESTART_TABLE *init_rsttbl(u16 esize, u16 used)
+{
+ __le32 *e, *last_free;
+ u32 off;
+ u32 bytes = esize * used + sizeof(struct RESTART_TABLE);
+ u32 lf = sizeof(struct RESTART_TABLE) + (used - 1) * esize;
+ struct RESTART_TABLE *t = kzalloc(bytes, GFP_NOFS);
+
+ if (!t)
+ return NULL;
+
+ t->size = cpu_to_le16(esize);
+ t->used = cpu_to_le16(used);
+ t->free_goal = cpu_to_le32(~0u);
+ t->first_free = cpu_to_le32(sizeof(struct RESTART_TABLE));
+ t->last_free = cpu_to_le32(lf);
+
+ e = (__le32 *)(t + 1);
+ last_free = Add2Ptr(t, lf);
+
+ for (off = sizeof(struct RESTART_TABLE) + esize; e < last_free;
+ e = Add2Ptr(e, esize), off += esize) {
+ *e = cpu_to_le32(off);
+ }
+ return t;
+}
+
+static inline struct RESTART_TABLE *extend_rsttbl(struct RESTART_TABLE *tbl,
+ u32 add, u32 free_goal)
+{
+ u16 esize = le16_to_cpu(tbl->size);
+ __le32 osize = cpu_to_le32(bytes_per_rt(tbl));
+ u32 used = le16_to_cpu(tbl->used);
+ struct RESTART_TABLE *rt;
+
+ rt = init_rsttbl(esize, used + add);
+ if (!rt)
+ return NULL;
+
+ memcpy(rt + 1, tbl + 1, esize * used);
+
+ rt->free_goal = free_goal == ~0u
+ ? cpu_to_le32(~0u)
+ : cpu_to_le32(sizeof(struct RESTART_TABLE) +
+ free_goal * esize);
+
+ if (tbl->first_free) {
+ rt->first_free = tbl->first_free;
+ *(__le32 *)Add2Ptr(rt, le32_to_cpu(tbl->last_free)) = osize;
+ } else {
+ rt->first_free = osize;
+ }
+
+ rt->total = tbl->total;
+
+ kfree(tbl);
+ return rt;
+}
+
+/*
+ * alloc_rsttbl_idx
+ *
+ * Allocate an index from within a previously initialized Restart Table.
+ */
+static inline void *alloc_rsttbl_idx(struct RESTART_TABLE **tbl)
+{
+ u32 off;
+ __le32 *e;
+ struct RESTART_TABLE *t = *tbl;
+
+ if (!t->first_free) {
+ *tbl = t = extend_rsttbl(t, 16, ~0u);
+ if (!t)
+ return NULL;
+ }
+
+ off = le32_to_cpu(t->first_free);
+
+ /* Dequeue this entry and zero it. */
+ e = Add2Ptr(t, off);
+
+ t->first_free = *e;
+
+ memset(e, 0, le16_to_cpu(t->size));
+
+ *e = RESTART_ENTRY_ALLOCATED_LE;
+
+ /* If list is going empty, then we fix the last_free as well. */
+ if (!t->first_free)
+ t->last_free = 0;
+
+ le16_add_cpu(&t->total, 1);
+
+ return Add2Ptr(t, off);
+}
+
+/*
+ * alloc_rsttbl_from_idx
+ *
+ * Allocate a specific index from within a previously initialized Restart Table.
+ */
+static inline void *alloc_rsttbl_from_idx(struct RESTART_TABLE **tbl, u32 vbo)
+{
+ u32 off;
+ __le32 *e;
+ struct RESTART_TABLE *rt = *tbl;
+ u32 bytes = bytes_per_rt(rt);
+ u16 esize = le16_to_cpu(rt->size);
+
+ /* If the entry is not the table, we will have to extend the table. */
+ if (vbo >= bytes) {
+ /*
+ * Extend the size by computing the number of entries between
+ * the existing size and the desired index and adding 1 to that.
+ */
+ u32 bytes2idx = vbo - bytes;
+
+ /*
+ * There should always be an integral number of entries
+ * being added. Now extend the table.
+ */
+ *tbl = rt = extend_rsttbl(rt, bytes2idx / esize + 1, bytes);
+ if (!rt)
+ return NULL;
+ }
+
+ /* See if the entry is already allocated, and just return if it is. */
+ e = Add2Ptr(rt, vbo);
+
+ if (*e == RESTART_ENTRY_ALLOCATED_LE)
+ return e;
+
+ /*
+ * Walk through the table, looking for the entry we're
+ * interested and the previous entry.
+ */
+ off = le32_to_cpu(rt->first_free);
+ e = Add2Ptr(rt, off);
+
+ if (off == vbo) {
+ /* this is a match */
+ rt->first_free = *e;
+ goto skip_looking;
+ }
+
+ /*
+ * Need to walk through the list looking for the predecessor
+ * of our entry.
+ */
+ for (;;) {
+ /* Remember the entry just found */
+ u32 last_off = off;
+ __le32 *last_e = e;
+
+ /* Should never run of entries. */
+
+ /* Lookup up the next entry the list. */
+ off = le32_to_cpu(*last_e);
+ e = Add2Ptr(rt, off);
+
+ /* If this is our match we are done. */
+ if (off == vbo) {
+ *last_e = *e;
+
+ /*
+ * If this was the last entry, we update that
+ * table as well.
+ */
+ if (le32_to_cpu(rt->last_free) == off)
+ rt->last_free = cpu_to_le32(last_off);
+ break;
+ }
+ }
+
+skip_looking:
+ /* If the list is now empty, we fix the last_free as well. */
+ if (!rt->first_free)
+ rt->last_free = 0;
+
+ /* Zero this entry. */
+ memset(e, 0, esize);
+ *e = RESTART_ENTRY_ALLOCATED_LE;
+
+ le16_add_cpu(&rt->total, 1);
+
+ return e;
+}
+
+#define RESTART_SINGLE_PAGE_IO cpu_to_le16(0x0001)
+
+#define NTFSLOG_WRAPPED 0x00000001
+#define NTFSLOG_MULTIPLE_PAGE_IO 0x00000002
+#define NTFSLOG_NO_LAST_LSN 0x00000004
+#define NTFSLOG_REUSE_TAIL 0x00000010
+#define NTFSLOG_NO_OLDEST_LSN 0x00000020
+
+/* Helper struct to work with NTFS $LogFile. */
+struct ntfs_log {
+ struct ntfs_inode *ni;
+
+ u32 l_size;
+ u32 sys_page_size;
+ u32 sys_page_mask;
+ u32 page_size;
+ u32 page_mask; // page_size - 1
+ u8 page_bits;
+ struct RECORD_PAGE_HDR *one_page_buf;
+
+ struct RESTART_TABLE *open_attr_tbl;
+ u32 transaction_id;
+ u32 clst_per_page;
+
+ u32 first_page;
+ u32 next_page;
+ u32 ra_off;
+ u32 data_off;
+ u32 restart_size;
+ u32 data_size;
+ u16 record_header_len;
+ u64 seq_num;
+ u32 seq_num_bits;
+ u32 file_data_bits;
+ u32 seq_num_mask; /* (1 << file_data_bits) - 1 */
+
+ struct RESTART_AREA *ra; /* In-memory image of the next restart area. */
+ u32 ra_size; /* The usable size of the restart area. */
+
+ /*
+ * If true, then the in-memory restart area is to be written
+ * to the first position on the disk.
+ */
+ bool init_ra;
+ bool set_dirty; /* True if we need to set dirty flag. */
+
+ u64 oldest_lsn;
+
+ u32 oldest_lsn_off;
+ u64 last_lsn;
+
+ u32 total_avail;
+ u32 total_avail_pages;
+ u32 total_undo_commit;
+ u32 max_current_avail;
+ u32 current_avail;
+ u32 reserved;
+
+ short major_ver;
+ short minor_ver;
+
+ u32 l_flags; /* See NTFSLOG_XXX */
+ u32 current_openlog_count; /* On-disk value for open_log_count. */
+
+ struct CLIENT_ID client_id;
+ u32 client_undo_commit;
+};
+
+static inline u32 lsn_to_vbo(struct ntfs_log *log, const u64 lsn)
+{
+ u32 vbo = (lsn << log->seq_num_bits) >> (log->seq_num_bits - 3);
+
+ return vbo;
+}
+
+/* Compute the offset in the log file of the next log page. */
+static inline u32 next_page_off(struct ntfs_log *log, u32 off)
+{
+ off = (off & ~log->sys_page_mask) + log->page_size;
+ return off >= log->l_size ? log->first_page : off;
+}
+
+static inline u32 lsn_to_page_off(struct ntfs_log *log, u64 lsn)
+{
+ return (((u32)lsn) << 3) & log->page_mask;
+}
+
+static inline u64 vbo_to_lsn(struct ntfs_log *log, u32 off, u64 Seq)
+{
+ return (off >> 3) + (Seq << log->file_data_bits);
+}
+
+static inline bool is_lsn_in_file(struct ntfs_log *log, u64 lsn)
+{
+ return lsn >= log->oldest_lsn &&
+ lsn <= le64_to_cpu(log->ra->current_lsn);
+}
+
+static inline u32 hdr_file_off(struct ntfs_log *log,
+ struct RECORD_PAGE_HDR *hdr)
+{
+ if (log->major_ver < 2)
+ return le64_to_cpu(hdr->rhdr.lsn);
+
+ return le32_to_cpu(hdr->file_off);
+}
+
+static inline u64 base_lsn(struct ntfs_log *log,
+ const struct RECORD_PAGE_HDR *hdr, u64 lsn)
+{
+ u64 h_lsn = le64_to_cpu(hdr->rhdr.lsn);
+ u64 ret = (((h_lsn >> log->file_data_bits) +
+ (lsn < (lsn_to_vbo(log, h_lsn) & ~log->page_mask) ? 1 : 0))
+ << log->file_data_bits) +
+ ((((is_log_record_end(hdr) &&
+ h_lsn <= le64_to_cpu(hdr->record_hdr.last_end_lsn))
+ ? le16_to_cpu(hdr->record_hdr.next_record_off)
+ : log->page_size) +
+ lsn) >>
+ 3);
+
+ return ret;
+}
+
+static inline bool verify_client_lsn(struct ntfs_log *log,
+ const struct CLIENT_REC *client, u64 lsn)
+{
+ return lsn >= le64_to_cpu(client->oldest_lsn) &&
+ lsn <= le64_to_cpu(log->ra->current_lsn) && lsn;
+}
+
+struct restart_info {
+ u64 last_lsn;
+ struct RESTART_HDR *r_page;
+ u32 vbo;
+ bool chkdsk_was_run;
+ bool valid_page;
+ bool initialized;
+ bool restart;
+};
+
+static int read_log_page(struct ntfs_log *log, u32 vbo,
+ struct RECORD_PAGE_HDR **buffer, bool *usa_error)
+{
+ int err = 0;
+ u32 page_idx = vbo >> log->page_bits;
+ u32 page_off = vbo & log->page_mask;
+ u32 bytes = log->page_size - page_off;
+ void *to_free = NULL;
+ u32 page_vbo = page_idx << log->page_bits;
+ struct RECORD_PAGE_HDR *page_buf;
+ struct ntfs_inode *ni = log->ni;
+ bool bBAAD;
+
+ if (vbo >= log->l_size)
+ return -EINVAL;
+
+ if (!*buffer) {
+ to_free = kmalloc(bytes, GFP_NOFS);
+ if (!to_free)
+ return -ENOMEM;
+ *buffer = to_free;
+ }
+
+ page_buf = page_off ? log->one_page_buf : *buffer;
+
+ err = ntfs_read_run_nb(ni->mi.sbi, &ni->file.run, page_vbo, page_buf,
+ log->page_size, NULL);
+ if (err)
+ goto out;
+
+ if (page_buf->rhdr.sign != NTFS_FFFF_SIGNATURE)
+ ntfs_fix_post_read(&page_buf->rhdr, PAGE_SIZE, false);
+
+ if (page_buf != *buffer)
+ memcpy(*buffer, Add2Ptr(page_buf, page_off), bytes);
+
+ bBAAD = page_buf->rhdr.sign == NTFS_BAAD_SIGNATURE;
+
+ if (usa_error)
+ *usa_error = bBAAD;
+ /* Check that the update sequence array for this page is valid */
+ /* If we don't allow errors, raise an error status */
+ else if (bBAAD)
+ err = -EINVAL;
+
+out:
+ if (err && to_free) {
+ kfree(to_free);
+ *buffer = NULL;
+ }
+
+ return err;
+}
+
+/*
+ * log_read_rst
+ *
+ * It walks through 512 blocks of the file looking for a valid
+ * restart page header. It will stop the first time we find a
+ * valid page header.
+ */
+static int log_read_rst(struct ntfs_log *log, u32 l_size, bool first,
+ struct restart_info *info)
+{
+ u32 skip, vbo;
+ struct RESTART_HDR *r_page = kmalloc(DefaultLogPageSize, GFP_NOFS);
+
+ if (!r_page)
+ return -ENOMEM;
+
+ memset(info, 0, sizeof(struct restart_info));
+
+ /* Determine which restart area we are looking for. */
+ if (first) {
+ vbo = 0;
+ skip = 512;
+ } else {
+ vbo = 512;
+ skip = 0;
+ }
+
+ /* Loop continuously until we succeed. */
+ for (; vbo < l_size; vbo = 2 * vbo + skip, skip = 0) {
+ bool usa_error;
+ u32 sys_page_size;
+ bool brst, bchk;
+ struct RESTART_AREA *ra;
+
+ /* Read a page header at the current offset. */
+ if (read_log_page(log, vbo, (struct RECORD_PAGE_HDR **)&r_page,
+ &usa_error)) {
+ /* Ignore any errors. */
+ continue;
+ }
+
+ /* Exit if the signature is a log record page. */
+ if (r_page->rhdr.sign == NTFS_RCRD_SIGNATURE) {
+ info->initialized = true;
+ break;
+ }
+
+ brst = r_page->rhdr.sign == NTFS_RSTR_SIGNATURE;
+ bchk = r_page->rhdr.sign == NTFS_CHKD_SIGNATURE;
+
+ if (!bchk && !brst) {
+ if (r_page->rhdr.sign != NTFS_FFFF_SIGNATURE) {
+ /*
+ * Remember if the signature does not
+ * indicate uninitialized file.
+ */
+ info->initialized = true;
+ }
+ continue;
+ }
+
+ ra = NULL;
+ info->valid_page = false;
+ info->initialized = true;
+ info->vbo = vbo;
+
+ /* Let's check the restart area if this is a valid page. */
+ if (!is_rst_page_hdr_valid(vbo, r_page))
+ goto check_result;
+ ra = Add2Ptr(r_page, le16_to_cpu(r_page->ra_off));
+
+ if (!is_rst_area_valid(r_page))
+ goto check_result;
+
+ /*
+ * We have a valid restart page header and restart area.
+ * If chkdsk was run or we have no clients then we have
+ * no more checking to do.
+ */
+ if (bchk || ra->client_idx[1] == LFS_NO_CLIENT_LE) {
+ info->valid_page = true;
+ goto check_result;
+ }
+
+ /* Read the entire restart area. */
+ sys_page_size = le32_to_cpu(r_page->sys_page_size);
+ if (DefaultLogPageSize != sys_page_size) {
+ kfree(r_page);
+ r_page = kzalloc(sys_page_size, GFP_NOFS);
+ if (!r_page)
+ return -ENOMEM;
+
+ if (read_log_page(log, vbo,
+ (struct RECORD_PAGE_HDR **)&r_page,
+ &usa_error)) {
+ /* Ignore any errors. */
+ kfree(r_page);
+ r_page = NULL;
+ continue;
+ }
+ }
+
+ if (is_client_area_valid(r_page, usa_error)) {
+ info->valid_page = true;
+ ra = Add2Ptr(r_page, le16_to_cpu(r_page->ra_off));
+ }
+
+check_result:
+ /*
+ * If chkdsk was run then update the caller's
+ * values and return.
+ */
+ if (r_page->rhdr.sign == NTFS_CHKD_SIGNATURE) {
+ info->chkdsk_was_run = true;
+ info->last_lsn = le64_to_cpu(r_page->rhdr.lsn);
+ info->restart = true;
+ info->r_page = r_page;
+ return 0;
+ }
+
+ /*
+ * If we have a valid page then copy the values
+ * we need from it.
+ */
+ if (info->valid_page) {
+ info->last_lsn = le64_to_cpu(ra->current_lsn);
+ info->restart = true;
+ info->r_page = r_page;
+ return 0;
+ }
+ }
+
+ kfree(r_page);
+
+ return 0;
+}
+
+/*
+ * Ilog_init_pg_hdr - Init @log from restart page header.
+ */
+static void log_init_pg_hdr(struct ntfs_log *log, u32 sys_page_size,
+ u32 page_size, u16 major_ver, u16 minor_ver)
+{
+ log->sys_page_size = sys_page_size;
+ log->sys_page_mask = sys_page_size - 1;
+ log->page_size = page_size;
+ log->page_mask = page_size - 1;
+ log->page_bits = blksize_bits(page_size);
+
+ log->clst_per_page = log->page_size >> log->ni->mi.sbi->cluster_bits;
+ if (!log->clst_per_page)
+ log->clst_per_page = 1;
+
+ log->first_page = major_ver >= 2
+ ? 0x22 * page_size
+ : ((sys_page_size << 1) + (page_size << 1));
+ log->major_ver = major_ver;
+ log->minor_ver = minor_ver;
+}
+
+/*
+ * log_create - Init @log in cases when we don't have a restart area to use.
+ */
+static void log_create(struct ntfs_log *log, u32 l_size, const u64 last_lsn,
+ u32 open_log_count, bool wrapped, bool use_multi_page)
+{
+ log->l_size = l_size;
+ /* All file offsets must be quadword aligned. */
+ log->file_data_bits = blksize_bits(l_size) - 3;
+ log->seq_num_mask = (8 << log->file_data_bits) - 1;
+ log->seq_num_bits = sizeof(u64) * 8 - log->file_data_bits;
+ log->seq_num = (last_lsn >> log->file_data_bits) + 2;
+ log->next_page = log->first_page;
+ log->oldest_lsn = log->seq_num << log->file_data_bits;
+ log->oldest_lsn_off = 0;
+ log->last_lsn = log->oldest_lsn;
+
+ log->l_flags |= NTFSLOG_NO_LAST_LSN | NTFSLOG_NO_OLDEST_LSN;
+
+ /* Set the correct flags for the I/O and indicate if we have wrapped. */
+ if (wrapped)
+ log->l_flags |= NTFSLOG_WRAPPED;
+
+ if (use_multi_page)
+ log->l_flags |= NTFSLOG_MULTIPLE_PAGE_IO;
+
+ /* Compute the log page values. */
+ log->data_off = ALIGN(
+ offsetof(struct RECORD_PAGE_HDR, fixups) +
+ sizeof(short) * ((log->page_size >> SECTOR_SHIFT) + 1),
+ 8);
+ log->data_size = log->page_size - log->data_off;
+ log->record_header_len = sizeof(struct LFS_RECORD_HDR);
+
+ /* Remember the different page sizes for reservation. */
+ log->reserved = log->data_size - log->record_header_len;
+
+ /* Compute the restart page values. */
+ log->ra_off = ALIGN(
+ offsetof(struct RESTART_HDR, fixups) +
+ sizeof(short) *
+ ((log->sys_page_size >> SECTOR_SHIFT) + 1),
+ 8);
+ log->restart_size = log->sys_page_size - log->ra_off;
+ log->ra_size = struct_size(log->ra, clients, 1);
+ log->current_openlog_count = open_log_count;
+
+ /*
+ * The total available log file space is the number of
+ * log file pages times the space available on each page.
+ */
+ log->total_avail_pages = log->l_size - log->first_page;
+ log->total_avail = log->total_avail_pages >> log->page_bits;
+
+ /*
+ * We assume that we can't use the end of the page less than
+ * the file record size.
+ * Then we won't need to reserve more than the caller asks for.
+ */
+ log->max_current_avail = log->total_avail * log->reserved;
+ log->total_avail = log->total_avail * log->data_size;
+ log->current_avail = log->max_current_avail;
+}
+
+/*
+ * log_create_ra - Fill a restart area from the values stored in @log.
+ */
+static struct RESTART_AREA *log_create_ra(struct ntfs_log *log)
+{
+ struct CLIENT_REC *cr;
+ struct RESTART_AREA *ra = kzalloc(log->restart_size, GFP_NOFS);
+
+ if (!ra)
+ return NULL;
+
+ ra->current_lsn = cpu_to_le64(log->last_lsn);
+ ra->log_clients = cpu_to_le16(1);
+ ra->client_idx[1] = LFS_NO_CLIENT_LE;
+ if (log->l_flags & NTFSLOG_MULTIPLE_PAGE_IO)
+ ra->flags = RESTART_SINGLE_PAGE_IO;
+ ra->seq_num_bits = cpu_to_le32(log->seq_num_bits);
+ ra->ra_len = cpu_to_le16(log->ra_size);
+ ra->client_off = cpu_to_le16(offsetof(struct RESTART_AREA, clients));
+ ra->l_size = cpu_to_le64(log->l_size);
+ ra->rec_hdr_len = cpu_to_le16(log->record_header_len);
+ ra->data_off = cpu_to_le16(log->data_off);
+ ra->open_log_count = cpu_to_le32(log->current_openlog_count + 1);
+
+ cr = ra->clients;
+
+ cr->prev_client = LFS_NO_CLIENT_LE;
+ cr->next_client = LFS_NO_CLIENT_LE;
+
+ return ra;
+}
+
+static u32 final_log_off(struct ntfs_log *log, u64 lsn, u32 data_len)
+{
+ u32 base_vbo = lsn << 3;
+ u32 final_log_off = (base_vbo & log->seq_num_mask) & ~log->page_mask;
+ u32 page_off = base_vbo & log->page_mask;
+ u32 tail = log->page_size - page_off;
+
+ page_off -= 1;
+
+ /* Add the length of the header. */
+ data_len += log->record_header_len;
+
+ /*
+ * If this lsn is contained this log page we are done.
+ * Otherwise we need to walk through several log pages.
+ */
+ if (data_len > tail) {
+ data_len -= tail;
+ tail = log->data_size;
+ page_off = log->data_off - 1;
+
+ for (;;) {
+ final_log_off = next_page_off(log, final_log_off);
+
+ /*
+ * We are done if the remaining bytes
+ * fit on this page.
+ */
+ if (data_len <= tail)
+ break;
+ data_len -= tail;
+ }
+ }
+
+ /*
+ * We add the remaining bytes to our starting position on this page
+ * and then add that value to the file offset of this log page.
+ */
+ return final_log_off + data_len + page_off;
+}
+
+static int next_log_lsn(struct ntfs_log *log, const struct LFS_RECORD_HDR *rh,
+ u64 *lsn)
+{
+ int err;
+ u64 this_lsn = le64_to_cpu(rh->this_lsn);
+ u32 vbo = lsn_to_vbo(log, this_lsn);
+ u32 end =
+ final_log_off(log, this_lsn, le32_to_cpu(rh->client_data_len));
+ u32 hdr_off = end & ~log->sys_page_mask;
+ u64 seq = this_lsn >> log->file_data_bits;
+ struct RECORD_PAGE_HDR *page = NULL;
+
+ /* Remember if we wrapped. */
+ if (end <= vbo)
+ seq += 1;
+
+ /* Log page header for this page. */
+ err = read_log_page(log, hdr_off, &page, NULL);
+ if (err)
+ return err;
+
+ /*
+ * If the lsn we were given was not the last lsn on this page,
+ * then the starting offset for the next lsn is on a quad word
+ * boundary following the last file offset for the current lsn.
+ * Otherwise the file offset is the start of the data on the next page.
+ */
+ if (this_lsn == le64_to_cpu(page->rhdr.lsn)) {
+ /* If we wrapped, we need to increment the sequence number. */
+ hdr_off = next_page_off(log, hdr_off);
+ if (hdr_off == log->first_page)
+ seq += 1;
+
+ vbo = hdr_off + log->data_off;
+ } else {
+ vbo = ALIGN(end, 8);
+ }
+
+ /* Compute the lsn based on the file offset and the sequence count. */
+ *lsn = vbo_to_lsn(log, vbo, seq);
+
+ /*
+ * If this lsn is within the legal range for the file, we return true.
+ * Otherwise false indicates that there are no more lsn's.
+ */
+ if (!is_lsn_in_file(log, *lsn))
+ *lsn = 0;
+
+ kfree(page);
+
+ return 0;
+}
+
+/*
+ * current_log_avail - Calculate the number of bytes available for log records.
+ */
+static u32 current_log_avail(struct ntfs_log *log)
+{
+ u32 oldest_off, next_free_off, free_bytes;
+
+ if (log->l_flags & NTFSLOG_NO_LAST_LSN) {
+ /* The entire file is available. */
+ return log->max_current_avail;
+ }
+
+ /*
+ * If there is a last lsn the restart area then we know that we will
+ * have to compute the free range.
+ * If there is no oldest lsn then start at the first page of the file.
+ */
+ oldest_off = (log->l_flags & NTFSLOG_NO_OLDEST_LSN)
+ ? log->first_page
+ : (log->oldest_lsn_off & ~log->sys_page_mask);
+
+ /*
+ * We will use the next log page offset to compute the next free page.
+ * If we are going to reuse this page go to the next page.
+ * If we are at the first page then use the end of the file.
+ */
+ next_free_off = (log->l_flags & NTFSLOG_REUSE_TAIL)
+ ? log->next_page + log->page_size
+ : log->next_page == log->first_page
+ ? log->l_size
+ : log->next_page;
+
+ /* If the two offsets are the same then there is no available space. */
+ if (oldest_off == next_free_off)
+ return 0;
+ /*
+ * If the free offset follows the oldest offset then subtract
+ * this range from the total available pages.
+ */
+ free_bytes =
+ oldest_off < next_free_off
+ ? log->total_avail_pages - (next_free_off - oldest_off)
+ : oldest_off - next_free_off;
+
+ free_bytes >>= log->page_bits;
+ return free_bytes * log->reserved;
+}
+
+static bool check_subseq_log_page(struct ntfs_log *log,
+ const struct RECORD_PAGE_HDR *rp, u32 vbo,
+ u64 seq)
+{
+ u64 lsn_seq;
+ const struct NTFS_RECORD_HEADER *rhdr = &rp->rhdr;
+ u64 lsn = le64_to_cpu(rhdr->lsn);
+
+ if (rhdr->sign == NTFS_FFFF_SIGNATURE || !rhdr->sign)
+ return false;
+
+ /*
+ * If the last lsn on the page occurs was written after the page
+ * that caused the original error then we have a fatal error.
+ */
+ lsn_seq = lsn >> log->file_data_bits;
+
+ /*
+ * If the sequence number for the lsn the page is equal or greater
+ * than lsn we expect, then this is a subsequent write.
+ */
+ return lsn_seq >= seq ||
+ (lsn_seq == seq - 1 && log->first_page == vbo &&
+ vbo != (lsn_to_vbo(log, lsn) & ~log->page_mask));
+}
+
+/*
+ * last_log_lsn
+ *
+ * Walks through the log pages for a file, searching for the
+ * last log page written to the file.
+ */
+static int last_log_lsn(struct ntfs_log *log)
+{
+ int err;
+ bool usa_error = false;
+ bool replace_page = false;
+ bool reuse_page = log->l_flags & NTFSLOG_REUSE_TAIL;
+ bool wrapped_file, wrapped;
+
+ u32 page_cnt = 1, page_pos = 1;
+ u32 page_off = 0, page_off1 = 0, saved_off = 0;
+ u32 final_off, second_off, final_off_prev = 0, second_off_prev = 0;
+ u32 first_file_off = 0, second_file_off = 0;
+ u32 part_io_count = 0;
+ u32 tails = 0;
+ u32 this_off, curpage_off, nextpage_off, remain_pages;
+
+ u64 expected_seq, seq_base = 0, lsn_base = 0;
+ u64 best_lsn, best_lsn1, best_lsn2;
+ u64 lsn_cur, lsn1, lsn2;
+ u64 last_ok_lsn = reuse_page ? log->last_lsn : 0;
+
+ u16 cur_pos, best_page_pos;
+
+ struct RECORD_PAGE_HDR *page = NULL;
+ struct RECORD_PAGE_HDR *tst_page = NULL;
+ struct RECORD_PAGE_HDR *first_tail = NULL;
+ struct RECORD_PAGE_HDR *second_tail = NULL;
+ struct RECORD_PAGE_HDR *tail_page = NULL;
+ struct RECORD_PAGE_HDR *second_tail_prev = NULL;
+ struct RECORD_PAGE_HDR *first_tail_prev = NULL;
+ struct RECORD_PAGE_HDR *page_bufs = NULL;
+ struct RECORD_PAGE_HDR *best_page;
+
+ if (log->major_ver >= 2) {
+ final_off = 0x02 * log->page_size;
+ second_off = 0x12 * log->page_size;
+
+ // 0x10 == 0x12 - 0x2
+ page_bufs = kmalloc(log->page_size * 0x10, GFP_NOFS);
+ if (!page_bufs)
+ return -ENOMEM;
+ } else {
+ second_off = log->first_page - log->page_size;
+ final_off = second_off - log->page_size;
+ }
+
+next_tail:
+ /* Read second tail page (at pos 3/0x12000). */
+ if (read_log_page(log, second_off, &second_tail, &usa_error) ||
+ usa_error || second_tail->rhdr.sign != NTFS_RCRD_SIGNATURE) {
+ kfree(second_tail);
+ second_tail = NULL;
+ second_file_off = 0;
+ lsn2 = 0;
+ } else {
+ second_file_off = hdr_file_off(log, second_tail);
+ lsn2 = le64_to_cpu(second_tail->record_hdr.last_end_lsn);
+ }
+
+ /* Read first tail page (at pos 2/0x2000). */
+ if (read_log_page(log, final_off, &first_tail, &usa_error) ||
+ usa_error || first_tail->rhdr.sign != NTFS_RCRD_SIGNATURE) {
+ kfree(first_tail);
+ first_tail = NULL;
+ first_file_off = 0;
+ lsn1 = 0;
+ } else {
+ first_file_off = hdr_file_off(log, first_tail);
+ lsn1 = le64_to_cpu(first_tail->record_hdr.last_end_lsn);
+ }
+
+ if (log->major_ver < 2) {
+ int best_page;
+
+ first_tail_prev = first_tail;
+ final_off_prev = first_file_off;
+ second_tail_prev = second_tail;
+ second_off_prev = second_file_off;
+ tails = 1;
+
+ if (!first_tail && !second_tail)
+ goto tail_read;
+
+ if (first_tail && second_tail)
+ best_page = lsn1 < lsn2 ? 1 : 0;
+ else if (first_tail)
+ best_page = 0;
+ else
+ best_page = 1;
+
+ page_off = best_page ? second_file_off : first_file_off;
+ seq_base = (best_page ? lsn2 : lsn1) >> log->file_data_bits;
+ goto tail_read;
+ }
+
+ best_lsn1 = first_tail ? base_lsn(log, first_tail, first_file_off) : 0;
+ best_lsn2 =
+ second_tail ? base_lsn(log, second_tail, second_file_off) : 0;
+
+ if (first_tail && second_tail) {
+ if (best_lsn1 > best_lsn2) {
+ best_lsn = best_lsn1;
+ best_page = first_tail;
+ this_off = first_file_off;
+ } else {
+ best_lsn = best_lsn2;
+ best_page = second_tail;
+ this_off = second_file_off;
+ }
+ } else if (first_tail) {
+ best_lsn = best_lsn1;
+ best_page = first_tail;
+ this_off = first_file_off;
+ } else if (second_tail) {
+ best_lsn = best_lsn2;
+ best_page = second_tail;
+ this_off = second_file_off;
+ } else {
+ goto tail_read;
+ }
+
+ best_page_pos = le16_to_cpu(best_page->page_pos);
+
+ if (!tails) {
+ if (best_page_pos == page_pos) {
+ seq_base = best_lsn >> log->file_data_bits;
+ saved_off = page_off = le32_to_cpu(best_page->file_off);
+ lsn_base = best_lsn;
+
+ memmove(page_bufs, best_page, log->page_size);
+
+ page_cnt = le16_to_cpu(best_page->page_count);
+ if (page_cnt > 1)
+ page_pos += 1;
+
+ tails = 1;
+ }
+ } else if (seq_base == (best_lsn >> log->file_data_bits) &&
+ saved_off + log->page_size == this_off &&
+ lsn_base < best_lsn &&
+ (page_pos != page_cnt || best_page_pos == page_pos ||
+ best_page_pos == 1) &&
+ (page_pos >= page_cnt || best_page_pos == page_pos)) {
+ u16 bppc = le16_to_cpu(best_page->page_count);
+
+ saved_off += log->page_size;
+ lsn_base = best_lsn;
+
+ memmove(Add2Ptr(page_bufs, tails * log->page_size), best_page,
+ log->page_size);
+
+ tails += 1;
+
+ if (best_page_pos != bppc) {
+ page_cnt = bppc;
+ page_pos = best_page_pos;
+
+ if (page_cnt > 1)
+ page_pos += 1;
+ } else {
+ page_pos = page_cnt = 1;
+ }
+ } else {
+ kfree(first_tail);
+ kfree(second_tail);
+ goto tail_read;
+ }
+
+ kfree(first_tail_prev);
+ first_tail_prev = first_tail;
+ final_off_prev = first_file_off;
+ first_tail = NULL;
+
+ kfree(second_tail_prev);
+ second_tail_prev = second_tail;
+ second_off_prev = second_file_off;
+ second_tail = NULL;
+
+ final_off += log->page_size;
+ second_off += log->page_size;
+
+ if (tails < 0x10)
+ goto next_tail;
+tail_read:
+ first_tail = first_tail_prev;
+ final_off = final_off_prev;
+
+ second_tail = second_tail_prev;
+ second_off = second_off_prev;
+
+ page_cnt = page_pos = 1;
+
+ curpage_off = seq_base == log->seq_num ? min(log->next_page, page_off)
+ : log->next_page;
+
+ wrapped_file =
+ curpage_off == log->first_page &&
+ !(log->l_flags & (NTFSLOG_NO_LAST_LSN | NTFSLOG_REUSE_TAIL));
+
+ expected_seq = wrapped_file ? (log->seq_num + 1) : log->seq_num;
+
+ nextpage_off = curpage_off;
+
+next_page:
+ tail_page = NULL;
+ /* Read the next log page. */
+ err = read_log_page(log, curpage_off, &page, &usa_error);
+
+ /* Compute the next log page offset the file. */
+ nextpage_off = next_page_off(log, curpage_off);
+ wrapped = nextpage_off == log->first_page;
+
+ if (tails > 1) {
+ struct RECORD_PAGE_HDR *cur_page =
+ Add2Ptr(page_bufs, curpage_off - page_off);
+
+ if (curpage_off == saved_off) {
+ tail_page = cur_page;
+ goto use_tail_page;
+ }
+
+ if (page_off > curpage_off || curpage_off >= saved_off)
+ goto use_tail_page;
+
+ if (page_off1)
+ goto use_cur_page;
+
+ if (!err && !usa_error &&
+ page->rhdr.sign == NTFS_RCRD_SIGNATURE &&
+ cur_page->rhdr.lsn == page->rhdr.lsn &&
+ cur_page->record_hdr.next_record_off ==
+ page->record_hdr.next_record_off &&
+ ((page_pos == page_cnt &&
+ le16_to_cpu(page->page_pos) == 1) ||
+ (page_pos != page_cnt &&
+ le16_to_cpu(page->page_pos) == page_pos + 1 &&
+ le16_to_cpu(page->page_count) == page_cnt))) {
+ cur_page = NULL;
+ goto use_tail_page;
+ }
+
+ page_off1 = page_off;
+
+use_cur_page:
+
+ lsn_cur = le64_to_cpu(cur_page->rhdr.lsn);
+
+ if (last_ok_lsn !=
+ le64_to_cpu(cur_page->record_hdr.last_end_lsn) &&
+ ((lsn_cur >> log->file_data_bits) +
+ ((curpage_off <
+ (lsn_to_vbo(log, lsn_cur) & ~log->page_mask))
+ ? 1
+ : 0)) != expected_seq) {
+ goto check_tail;
+ }
+
+ if (!is_log_record_end(cur_page)) {
+ tail_page = NULL;
+ last_ok_lsn = lsn_cur;
+ goto next_page_1;
+ }
+
+ log->seq_num = expected_seq;
+ log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+ log->last_lsn = le64_to_cpu(cur_page->record_hdr.last_end_lsn);
+ log->ra->current_lsn = cur_page->record_hdr.last_end_lsn;
+
+ if (log->record_header_len <=
+ log->page_size -
+ le16_to_cpu(cur_page->record_hdr.next_record_off)) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = curpage_off;
+ } else {
+ log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+ log->next_page = nextpage_off;
+ }
+
+ if (wrapped_file)
+ log->l_flags |= NTFSLOG_WRAPPED;
+
+ last_ok_lsn = le64_to_cpu(cur_page->record_hdr.last_end_lsn);
+ goto next_page_1;
+ }
+
+ /*
+ * If we are at the expected first page of a transfer check to see
+ * if either tail copy is at this offset.
+ * If this page is the last page of a transfer, check if we wrote
+ * a subsequent tail copy.
+ */
+ if (page_cnt == page_pos || page_cnt == page_pos + 1) {
+ /*
+ * Check if the offset matches either the first or second
+ * tail copy. It is possible it will match both.
+ */
+ if (curpage_off == final_off)
+ tail_page = first_tail;
+
+ /*
+ * If we already matched on the first page then
+ * check the ending lsn's.
+ */
+ if (curpage_off == second_off) {
+ if (!tail_page ||
+ (second_tail &&
+ le64_to_cpu(second_tail->record_hdr.last_end_lsn) >
+ le64_to_cpu(first_tail->record_hdr
+ .last_end_lsn))) {
+ tail_page = second_tail;
+ }
+ }
+ }
+
+use_tail_page:
+ if (tail_page) {
+ /* We have a candidate for a tail copy. */
+ lsn_cur = le64_to_cpu(tail_page->record_hdr.last_end_lsn);
+
+ if (last_ok_lsn < lsn_cur) {
+ /*
+ * If the sequence number is not expected,
+ * then don't use the tail copy.
+ */
+ if (expected_seq != (lsn_cur >> log->file_data_bits))
+ tail_page = NULL;
+ } else if (last_ok_lsn > lsn_cur) {
+ /*
+ * If the last lsn is greater than the one on
+ * this page then forget this tail.
+ */
+ tail_page = NULL;
+ }
+ }
+
+ /*
+ *If we have an error on the current page,
+ * we will break of this loop.
+ */
+ if (err || usa_error)
+ goto check_tail;
+
+ /*
+ * Done if the last lsn on this page doesn't match the previous known
+ * last lsn or the sequence number is not expected.
+ */
+ lsn_cur = le64_to_cpu(page->rhdr.lsn);
+ if (last_ok_lsn != lsn_cur &&
+ expected_seq != (lsn_cur >> log->file_data_bits)) {
+ goto check_tail;
+ }
+
+ /*
+ * Check that the page position and page count values are correct.
+ * If this is the first page of a transfer the position must be 1
+ * and the count will be unknown.
+ */
+ if (page_cnt == page_pos) {
+ if (page->page_pos != cpu_to_le16(1) &&
+ (!reuse_page || page->page_pos != page->page_count)) {
+ /*
+ * If the current page is the first page we are
+ * looking at and we are reusing this page then
+ * it can be either the first or last page of a
+ * transfer. Otherwise it can only be the first.
+ */
+ goto check_tail;
+ }
+ } else if (le16_to_cpu(page->page_count) != page_cnt ||
+ le16_to_cpu(page->page_pos) != page_pos + 1) {
+ /*
+ * The page position better be 1 more than the last page
+ * position and the page count better match.
+ */
+ goto check_tail;
+ }
+
+ /*
+ * We have a valid page the file and may have a valid page
+ * the tail copy area.
+ * If the tail page was written after the page the file then
+ * break of the loop.
+ */
+ if (tail_page &&
+ le64_to_cpu(tail_page->record_hdr.last_end_lsn) > lsn_cur) {
+ /* Remember if we will replace the page. */
+ replace_page = true;
+ goto check_tail;
+ }
+
+ tail_page = NULL;
+
+ if (is_log_record_end(page)) {
+ /*
+ * Since we have read this page we know the sequence number
+ * is the same as our expected value.
+ */
+ log->seq_num = expected_seq;
+ log->last_lsn = le64_to_cpu(page->record_hdr.last_end_lsn);
+ log->ra->current_lsn = page->record_hdr.last_end_lsn;
+ log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+
+ /*
+ * If there is room on this page for another header then
+ * remember we want to reuse the page.
+ */
+ if (log->record_header_len <=
+ log->page_size -
+ le16_to_cpu(page->record_hdr.next_record_off)) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = curpage_off;
+ } else {
+ log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+ log->next_page = nextpage_off;
+ }
+
+ /* Remember if we wrapped the log file. */
+ if (wrapped_file)
+ log->l_flags |= NTFSLOG_WRAPPED;
+ }
+
+ /*
+ * Remember the last page count and position.
+ * Also remember the last known lsn.
+ */
+ page_cnt = le16_to_cpu(page->page_count);
+ page_pos = le16_to_cpu(page->page_pos);
+ last_ok_lsn = le64_to_cpu(page->rhdr.lsn);
+
+next_page_1:
+
+ if (wrapped) {
+ expected_seq += 1;
+ wrapped_file = 1;
+ }
+
+ curpage_off = nextpage_off;
+ kfree(page);
+ page = NULL;
+ reuse_page = 0;
+ goto next_page;
+
+check_tail:
+ if (tail_page) {
+ log->seq_num = expected_seq;
+ log->last_lsn = le64_to_cpu(tail_page->record_hdr.last_end_lsn);
+ log->ra->current_lsn = tail_page->record_hdr.last_end_lsn;
+ log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+
+ if (log->page_size -
+ le16_to_cpu(
+ tail_page->record_hdr.next_record_off) >=
+ log->record_header_len) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = curpage_off;
+ } else {
+ log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+ log->next_page = nextpage_off;
+ }
+
+ if (wrapped)
+ log->l_flags |= NTFSLOG_WRAPPED;
+ }
+
+ /* Remember that the partial IO will start at the next page. */
+ second_off = nextpage_off;
+
+ /*
+ * If the next page is the first page of the file then update
+ * the sequence number for log records which begon the next page.
+ */
+ if (wrapped)
+ expected_seq += 1;
+
+ /*
+ * If we have a tail copy or are performing single page I/O we can
+ * immediately look at the next page.
+ */
+ if (replace_page || (log->ra->flags & RESTART_SINGLE_PAGE_IO)) {
+ page_cnt = 2;
+ page_pos = 1;
+ goto check_valid;
+ }
+
+ if (page_pos != page_cnt)
+ goto check_valid;
+ /*
+ * If the next page causes us to wrap to the beginning of the log
+ * file then we know which page to check next.
+ */
+ if (wrapped) {
+ page_cnt = 2;
+ page_pos = 1;
+ goto check_valid;
+ }
+
+ cur_pos = 2;
+
+next_test_page:
+ kfree(tst_page);
+ tst_page = NULL;
+
+ /* Walk through the file, reading log pages. */
+ err = read_log_page(log, nextpage_off, &tst_page, &usa_error);
+
+ /*
+ * If we get a USA error then assume that we correctly found
+ * the end of the original transfer.
+ */
+ if (usa_error)
+ goto file_is_valid;
+
+ /*
+ * If we were able to read the page, we examine it to see if it
+ * is the same or different Io block.
+ */
+ if (err)
+ goto next_test_page_1;
+
+ if (le16_to_cpu(tst_page->page_pos) == cur_pos &&
+ check_subseq_log_page(log, tst_page, nextpage_off, expected_seq)) {
+ page_cnt = le16_to_cpu(tst_page->page_count) + 1;
+ page_pos = le16_to_cpu(tst_page->page_pos);
+ goto check_valid;
+ } else {
+ goto file_is_valid;
+ }
+
+next_test_page_1:
+
+ nextpage_off = next_page_off(log, curpage_off);
+ wrapped = nextpage_off == log->first_page;
+
+ if (wrapped) {
+ expected_seq += 1;
+ page_cnt = 2;
+ page_pos = 1;
+ }
+
+ cur_pos += 1;
+ part_io_count += 1;
+ if (!wrapped)
+ goto next_test_page;
+
+check_valid:
+ /* Skip over the remaining pages this transfer. */
+ remain_pages = page_cnt - page_pos - 1;
+ part_io_count += remain_pages;
+
+ while (remain_pages--) {
+ nextpage_off = next_page_off(log, curpage_off);
+ wrapped = nextpage_off == log->first_page;
+
+ if (wrapped)
+ expected_seq += 1;
+ }
+
+ /* Call our routine to check this log page. */
+ kfree(tst_page);
+ tst_page = NULL;
+
+ err = read_log_page(log, nextpage_off, &tst_page, &usa_error);
+ if (!err && !usa_error &&
+ check_subseq_log_page(log, tst_page, nextpage_off, expected_seq)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+file_is_valid:
+
+ /* We have a valid file. */
+ if (page_off1 || tail_page) {
+ struct RECORD_PAGE_HDR *tmp_page;
+
+ if (sb_rdonly(log->ni->mi.sbi->sb)) {
+ err = -EROFS;
+ goto out;
+ }
+
+ if (page_off1) {
+ tmp_page = Add2Ptr(page_bufs, page_off1 - page_off);
+ tails -= (page_off1 - page_off) / log->page_size;
+ if (!tail_page)
+ tails -= 1;
+ } else {
+ tmp_page = tail_page;
+ tails = 1;
+ }
+
+ while (tails--) {
+ u64 off = hdr_file_off(log, tmp_page);
+
+ if (!page) {
+ page = kmalloc(log->page_size, GFP_NOFS);
+ if (!page)
+ return -ENOMEM;
+ }
+
+ /*
+ * Correct page and copy the data from this page
+ * into it and flush it to disk.
+ */
+ memcpy(page, tmp_page, log->page_size);
+
+ /* Fill last flushed lsn value flush the page. */
+ if (log->major_ver < 2)
+ page->rhdr.lsn = page->record_hdr.last_end_lsn;
+ else
+ page->file_off = 0;
+
+ page->page_pos = page->page_count = cpu_to_le16(1);
+
+ ntfs_fix_pre_write(&page->rhdr, log->page_size);
+
+ err = ntfs_sb_write_run(log->ni->mi.sbi,
+ &log->ni->file.run, off, page,
+ log->page_size);
+
+ if (err)
+ goto out;
+
+ if (part_io_count && second_off == off) {
+ second_off += log->page_size;
+ part_io_count -= 1;
+ }
+
+ tmp_page = Add2Ptr(tmp_page, log->page_size);
+ }
+ }
+
+ if (part_io_count) {
+ if (sb_rdonly(log->ni->mi.sbi->sb)) {
+ err = -EROFS;
+ goto out;
+ }
+ }
+
+out:
+ kfree(second_tail);
+ kfree(first_tail);
+ kfree(page);
+ kfree(tst_page);
+ kfree(page_bufs);
+
+ return err;
+}
+
+/*
+ * read_log_rec_buf - Copy a log record from the file to a buffer.
+ *
+ * The log record may span several log pages and may even wrap the file.
+ */
+static int read_log_rec_buf(struct ntfs_log *log,
+ const struct LFS_RECORD_HDR *rh, void *buffer)
+{
+ int err;
+ struct RECORD_PAGE_HDR *ph = NULL;
+ u64 lsn = le64_to_cpu(rh->this_lsn);
+ u32 vbo = lsn_to_vbo(log, lsn) & ~log->page_mask;
+ u32 off = lsn_to_page_off(log, lsn) + log->record_header_len;
+ u32 data_len = le32_to_cpu(rh->client_data_len);
+
+ /*
+ * While there are more bytes to transfer,
+ * we continue to attempt to perform the read.
+ */
+ for (;;) {
+ bool usa_error;
+ u32 tail = log->page_size - off;
+
+ if (tail >= data_len)
+ tail = data_len;
+
+ data_len -= tail;
+
+ err = read_log_page(log, vbo, &ph, &usa_error);
+ if (err)
+ goto out;
+
+ /*
+ * The last lsn on this page better be greater or equal
+ * to the lsn we are copying.
+ */
+ if (lsn > le64_to_cpu(ph->rhdr.lsn)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ memcpy(buffer, Add2Ptr(ph, off), tail);
+
+ /* If there are no more bytes to transfer, we exit the loop. */
+ if (!data_len) {
+ if (!is_log_record_end(ph) ||
+ lsn > le64_to_cpu(ph->record_hdr.last_end_lsn)) {
+ err = -EINVAL;
+ goto out;
+ }
+ break;
+ }
+
+ if (ph->rhdr.lsn == ph->record_hdr.last_end_lsn ||
+ lsn > le64_to_cpu(ph->rhdr.lsn)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ vbo = next_page_off(log, vbo);
+ off = log->data_off;
+
+ /*
+ * Adjust our pointer the user's buffer to transfer
+ * the next block to.
+ */
+ buffer = Add2Ptr(buffer, tail);
+ }
+
+out:
+ kfree(ph);
+ return err;
+}
+
+static int read_rst_area(struct ntfs_log *log, struct NTFS_RESTART **rst_,
+ u64 *lsn)
+{
+ int err;
+ struct LFS_RECORD_HDR *rh = NULL;
+ const struct CLIENT_REC *cr =
+ Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off));
+ u64 lsnr, lsnc = le64_to_cpu(cr->restart_lsn);
+ u32 len;
+ struct NTFS_RESTART *rst;
+
+ *lsn = 0;
+ *rst_ = NULL;
+
+ /* If the client doesn't have a restart area, go ahead and exit now. */
+ if (!lsnc)
+ return 0;
+
+ err = read_log_page(log, lsn_to_vbo(log, lsnc),
+ (struct RECORD_PAGE_HDR **)&rh, NULL);
+ if (err)
+ return err;
+
+ rst = NULL;
+ lsnr = le64_to_cpu(rh->this_lsn);
+
+ if (lsnc != lsnr) {
+ /* If the lsn values don't match, then the disk is corrupt. */
+ err = -EINVAL;
+ goto out;
+ }
+
+ *lsn = lsnr;
+ len = le32_to_cpu(rh->client_data_len);
+
+ if (!len) {
+ err = 0;
+ goto out;
+ }
+
+ if (len < sizeof(struct NTFS_RESTART)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ rst = kmalloc(len, GFP_NOFS);
+ if (!rst) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Copy the data into the 'rst' buffer. */
+ err = read_log_rec_buf(log, rh, rst);
+ if (err)
+ goto out;
+
+ *rst_ = rst;
+ rst = NULL;
+
+out:
+ kfree(rh);
+ kfree(rst);
+
+ return err;
+}
+
+static int find_log_rec(struct ntfs_log *log, u64 lsn, struct lcb *lcb)
+{
+ int err;
+ struct LFS_RECORD_HDR *rh = lcb->lrh;
+ u32 rec_len, len;
+
+ /* Read the record header for this lsn. */
+ if (!rh) {
+ err = read_log_page(log, lsn_to_vbo(log, lsn),
+ (struct RECORD_PAGE_HDR **)&rh, NULL);
+
+ lcb->lrh = rh;
+ if (err)
+ return err;
+ }
+
+ /*
+ * If the lsn the log record doesn't match the desired
+ * lsn then the disk is corrupt.
+ */
+ if (lsn != le64_to_cpu(rh->this_lsn))
+ return -EINVAL;
+
+ len = le32_to_cpu(rh->client_data_len);
+
+ /*
+ * Check that the length field isn't greater than the total
+ * available space the log file.
+ */
+ rec_len = len + log->record_header_len;
+ if (rec_len >= log->total_avail)
+ return -EINVAL;
+
+ /*
+ * If the entire log record is on this log page,
+ * put a pointer to the log record the context block.
+ */
+ if (rh->flags & LOG_RECORD_MULTI_PAGE) {
+ void *lr = kmalloc(len, GFP_NOFS);
+
+ if (!lr)
+ return -ENOMEM;
+
+ lcb->log_rec = lr;
+ lcb->alloc = true;
+
+ /* Copy the data into the buffer returned. */
+ err = read_log_rec_buf(log, rh, lr);
+ if (err)
+ return err;
+ } else {
+ /* If beyond the end of the current page -> an error. */
+ u32 page_off = lsn_to_page_off(log, lsn);
+
+ if (page_off + len + log->record_header_len > log->page_size)
+ return -EINVAL;
+
+ lcb->log_rec = Add2Ptr(rh, sizeof(struct LFS_RECORD_HDR));
+ lcb->alloc = false;
+ }
+
+ return 0;
+}
+
+/*
+ * read_log_rec_lcb - Init the query operation.
+ */
+static int read_log_rec_lcb(struct ntfs_log *log, u64 lsn, u32 ctx_mode,
+ struct lcb **lcb_)
+{
+ int err;
+ const struct CLIENT_REC *cr;
+ struct lcb *lcb;
+
+ switch (ctx_mode) {
+ case lcb_ctx_undo_next:
+ case lcb_ctx_prev:
+ case lcb_ctx_next:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Check that the given lsn is the legal range for this client. */
+ cr = Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off));
+
+ if (!verify_client_lsn(log, cr, lsn))
+ return -EINVAL;
+
+ lcb = kzalloc(sizeof(struct lcb), GFP_NOFS);
+ if (!lcb)
+ return -ENOMEM;
+ lcb->client = log->client_id;
+ lcb->ctx_mode = ctx_mode;
+
+ /* Find the log record indicated by the given lsn. */
+ err = find_log_rec(log, lsn, lcb);
+ if (err)
+ goto out;
+
+ *lcb_ = lcb;
+ return 0;
+
+out:
+ lcb_put(lcb);
+ *lcb_ = NULL;
+ return err;
+}
+
+/*
+ * find_client_next_lsn
+ *
+ * Attempt to find the next lsn to return to a client based on the context mode.
+ */
+static int find_client_next_lsn(struct ntfs_log *log, struct lcb *lcb, u64 *lsn)
+{
+ int err;
+ u64 next_lsn;
+ struct LFS_RECORD_HDR *hdr;
+
+ hdr = lcb->lrh;
+ *lsn = 0;
+
+ if (lcb_ctx_next != lcb->ctx_mode)
+ goto check_undo_next;
+
+ /* Loop as long as another lsn can be found. */
+ for (;;) {
+ u64 current_lsn;
+
+ err = next_log_lsn(log, hdr, &current_lsn);
+ if (err)
+ goto out;
+
+ if (!current_lsn)
+ break;
+
+ if (hdr != lcb->lrh)
+ kfree(hdr);
+
+ hdr = NULL;
+ err = read_log_page(log, lsn_to_vbo(log, current_lsn),
+ (struct RECORD_PAGE_HDR **)&hdr, NULL);
+ if (err)
+ goto out;
+
+ if (memcmp(&hdr->client, &lcb->client,
+ sizeof(struct CLIENT_ID))) {
+ /*err = -EINVAL; */
+ } else if (LfsClientRecord == hdr->record_type) {
+ kfree(lcb->lrh);
+ lcb->lrh = hdr;
+ *lsn = current_lsn;
+ return 0;
+ }
+ }
+
+out:
+ if (hdr != lcb->lrh)
+ kfree(hdr);
+ return err;
+
+check_undo_next:
+ if (lcb_ctx_undo_next == lcb->ctx_mode)
+ next_lsn = le64_to_cpu(hdr->client_undo_next_lsn);
+ else if (lcb_ctx_prev == lcb->ctx_mode)
+ next_lsn = le64_to_cpu(hdr->client_prev_lsn);
+ else
+ return 0;
+
+ if (!next_lsn)
+ return 0;
+
+ if (!verify_client_lsn(
+ log, Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off)),
+ next_lsn))
+ return 0;
+
+ hdr = NULL;
+ err = read_log_page(log, lsn_to_vbo(log, next_lsn),
+ (struct RECORD_PAGE_HDR **)&hdr, NULL);
+ if (err)
+ return err;
+ kfree(lcb->lrh);
+ lcb->lrh = hdr;
+
+ *lsn = next_lsn;
+
+ return 0;
+}
+
+static int read_next_log_rec(struct ntfs_log *log, struct lcb *lcb, u64 *lsn)
+{
+ int err;
+
+ err = find_client_next_lsn(log, lcb, lsn);
+ if (err)
+ return err;
+
+ if (!*lsn)
+ return 0;
+
+ if (lcb->alloc)
+ kfree(lcb->log_rec);
+
+ lcb->log_rec = NULL;
+ lcb->alloc = false;
+ kfree(lcb->lrh);
+ lcb->lrh = NULL;
+
+ return find_log_rec(log, *lsn, lcb);
+}
+
+static inline bool check_index_header(const struct INDEX_HDR *hdr, size_t bytes)
+{
+ __le16 mask;
+ u32 min_de, de_off, used, total;
+ const struct NTFS_DE *e;
+
+ if (hdr_has_subnode(hdr)) {
+ min_de = sizeof(struct NTFS_DE) + sizeof(u64);
+ mask = NTFS_IE_HAS_SUBNODES;
+ } else {
+ min_de = sizeof(struct NTFS_DE);
+ mask = 0;
+ }
+
+ de_off = le32_to_cpu(hdr->de_off);
+ used = le32_to_cpu(hdr->used);
+ total = le32_to_cpu(hdr->total);
+
+ if (de_off > bytes - min_de || used > bytes || total > bytes ||
+ de_off + min_de > used || used > total) {
+ return false;
+ }
+
+ e = Add2Ptr(hdr, de_off);
+ for (;;) {
+ u16 esize = le16_to_cpu(e->size);
+ struct NTFS_DE *next = Add2Ptr(e, esize);
+
+ if (esize < min_de || PtrOffset(hdr, next) > used ||
+ (e->flags & NTFS_IE_HAS_SUBNODES) != mask) {
+ return false;
+ }
+
+ if (de_is_last(e))
+ break;
+
+ e = next;
+ }
+
+ return true;
+}
+
+static inline bool check_index_buffer(const struct INDEX_BUFFER *ib, u32 bytes)
+{
+ u16 fo;
+ const struct NTFS_RECORD_HEADER *r = &ib->rhdr;
+
+ if (r->sign != NTFS_INDX_SIGNATURE)
+ return false;
+
+ fo = (SECTOR_SIZE - ((bytes >> SECTOR_SHIFT) + 1) * sizeof(short));
+
+ if (le16_to_cpu(r->fix_off) > fo)
+ return false;
+
+ if ((le16_to_cpu(r->fix_num) - 1) * SECTOR_SIZE != bytes)
+ return false;
+
+ return check_index_header(&ib->ihdr,
+ bytes - offsetof(struct INDEX_BUFFER, ihdr));
+}
+
+static inline bool check_index_root(const struct ATTRIB *attr,
+ struct ntfs_sb_info *sbi)
+{
+ bool ret;
+ const struct INDEX_ROOT *root = resident_data(attr);
+ u8 index_bits = le32_to_cpu(root->index_block_size) >= sbi->cluster_size
+ ? sbi->cluster_bits
+ : SECTOR_SHIFT;
+ u8 block_clst = root->index_block_clst;
+
+ if (le32_to_cpu(attr->res.data_size) < sizeof(struct INDEX_ROOT) ||
+ (root->type != ATTR_NAME && root->type != ATTR_ZERO) ||
+ (root->type == ATTR_NAME &&
+ root->rule != NTFS_COLLATION_TYPE_FILENAME) ||
+ (le32_to_cpu(root->index_block_size) !=
+ (block_clst << index_bits)) ||
+ (block_clst != 1 && block_clst != 2 && block_clst != 4 &&
+ block_clst != 8 && block_clst != 0x10 && block_clst != 0x20 &&
+ block_clst != 0x40 && block_clst != 0x80)) {
+ return false;
+ }
+
+ ret = check_index_header(&root->ihdr,
+ le32_to_cpu(attr->res.data_size) -
+ offsetof(struct INDEX_ROOT, ihdr));
+ return ret;
+}
+
+static inline bool check_attr(const struct MFT_REC *rec,
+ const struct ATTRIB *attr,
+ struct ntfs_sb_info *sbi)
+{
+ u32 asize = le32_to_cpu(attr->size);
+ u32 rsize = 0;
+ u64 dsize, svcn, evcn;
+ u16 run_off;
+
+ /* Check the fixed part of the attribute record header. */
+ if (asize >= sbi->record_size ||
+ asize + PtrOffset(rec, attr) >= sbi->record_size ||
+ (attr->name_len &&
+ le16_to_cpu(attr->name_off) + attr->name_len * sizeof(short) >
+ asize)) {
+ return false;
+ }
+
+ /* Check the attribute fields. */
+ switch (attr->non_res) {
+ case 0:
+ rsize = le32_to_cpu(attr->res.data_size);
+ if (rsize >= asize ||
+ le16_to_cpu(attr->res.data_off) + rsize > asize) {
+ return false;
+ }
+ break;
+
+ case 1:
+ dsize = le64_to_cpu(attr->nres.data_size);
+ svcn = le64_to_cpu(attr->nres.svcn);
+ evcn = le64_to_cpu(attr->nres.evcn);
+ run_off = le16_to_cpu(attr->nres.run_off);
+
+ if (svcn > evcn + 1 || run_off >= asize ||
+ le64_to_cpu(attr->nres.valid_size) > dsize ||
+ dsize > le64_to_cpu(attr->nres.alloc_size)) {
+ return false;
+ }
+
+ if (run_unpack(NULL, sbi, 0, svcn, evcn, svcn,
+ Add2Ptr(attr, run_off), asize - run_off) < 0) {
+ return false;
+ }
+
+ return true;
+
+ default:
+ return false;
+ }
+
+ switch (attr->type) {
+ case ATTR_NAME:
+ if (fname_full_size(Add2Ptr(
+ attr, le16_to_cpu(attr->res.data_off))) > asize) {
+ return false;
+ }
+ break;
+
+ case ATTR_ROOT:
+ return check_index_root(attr, sbi);
+
+ case ATTR_STD:
+ if (rsize < sizeof(struct ATTR_STD_INFO5) &&
+ rsize != sizeof(struct ATTR_STD_INFO)) {
+ return false;
+ }
+ break;
+
+ case ATTR_LIST:
+ case ATTR_ID:
+ case ATTR_SECURE:
+ case ATTR_LABEL:
+ case ATTR_VOL_INFO:
+ case ATTR_DATA:
+ case ATTR_ALLOC:
+ case ATTR_BITMAP:
+ case ATTR_REPARSE:
+ case ATTR_EA_INFO:
+ case ATTR_EA:
+ case ATTR_PROPERTYSET:
+ case ATTR_LOGGED_UTILITY_STREAM:
+ break;
+
+ default:
+ return false;
+ }
+
+ return true;
+}
+
+static inline bool check_file_record(const struct MFT_REC *rec,
+ const struct MFT_REC *rec2,
+ struct ntfs_sb_info *sbi)
+{
+ const struct ATTRIB *attr;
+ u16 fo = le16_to_cpu(rec->rhdr.fix_off);
+ u16 fn = le16_to_cpu(rec->rhdr.fix_num);
+ u16 ao = le16_to_cpu(rec->attr_off);
+ u32 rs = sbi->record_size;
+
+ /* Check the file record header for consistency. */
+ if (rec->rhdr.sign != NTFS_FILE_SIGNATURE ||
+ fo > (SECTOR_SIZE - ((rs >> SECTOR_SHIFT) + 1) * sizeof(short)) ||
+ (fn - 1) * SECTOR_SIZE != rs || ao < MFTRECORD_FIXUP_OFFSET_1 ||
+ ao > sbi->record_size - SIZEOF_RESIDENT || !is_rec_inuse(rec) ||
+ le32_to_cpu(rec->total) != rs) {
+ return false;
+ }
+
+ /* Loop to check all of the attributes. */
+ for (attr = Add2Ptr(rec, ao); attr->type != ATTR_END;
+ attr = Add2Ptr(attr, le32_to_cpu(attr->size))) {
+ if (check_attr(rec, attr, sbi))
+ continue;
+ return false;
+ }
+
+ return true;
+}
+
+static inline int check_lsn(const struct NTFS_RECORD_HEADER *hdr,
+ const u64 *rlsn)
+{
+ u64 lsn;
+
+ if (!rlsn)
+ return true;
+
+ lsn = le64_to_cpu(hdr->lsn);
+
+ if (hdr->sign == NTFS_HOLE_SIGNATURE)
+ return false;
+
+ if (*rlsn > lsn)
+ return true;
+
+ return false;
+}
+
+static inline bool check_if_attr(const struct MFT_REC *rec,
+ const struct LOG_REC_HDR *lrh)
+{
+ u16 ro = le16_to_cpu(lrh->record_off);
+ u16 o = le16_to_cpu(rec->attr_off);
+ const struct ATTRIB *attr = Add2Ptr(rec, o);
+
+ while (o < ro) {
+ u32 asize;
+
+ if (attr->type == ATTR_END)
+ break;
+
+ asize = le32_to_cpu(attr->size);
+ if (!asize)
+ break;
+
+ o += asize;
+ attr = Add2Ptr(attr, asize);
+ }
+
+ return o == ro;
+}
+
+static inline bool check_if_index_root(const struct MFT_REC *rec,
+ const struct LOG_REC_HDR *lrh)
+{
+ u16 ro = le16_to_cpu(lrh->record_off);
+ u16 o = le16_to_cpu(rec->attr_off);
+ const struct ATTRIB *attr = Add2Ptr(rec, o);
+
+ while (o < ro) {
+ u32 asize;
+
+ if (attr->type == ATTR_END)
+ break;
+
+ asize = le32_to_cpu(attr->size);
+ if (!asize)
+ break;
+
+ o += asize;
+ attr = Add2Ptr(attr, asize);
+ }
+
+ return o == ro && attr->type == ATTR_ROOT;
+}
+
+static inline bool check_if_root_index(const struct ATTRIB *attr,
+ const struct INDEX_HDR *hdr,
+ const struct LOG_REC_HDR *lrh)
+{
+ u16 ao = le16_to_cpu(lrh->attr_off);
+ u32 de_off = le32_to_cpu(hdr->de_off);
+ u32 o = PtrOffset(attr, hdr) + de_off;
+ const struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+ u32 asize = le32_to_cpu(attr->size);
+
+ while (o < ao) {
+ u16 esize;
+
+ if (o >= asize)
+ break;
+
+ esize = le16_to_cpu(e->size);
+ if (!esize)
+ break;
+
+ o += esize;
+ e = Add2Ptr(e, esize);
+ }
+
+ return o == ao;
+}
+
+static inline bool check_if_alloc_index(const struct INDEX_HDR *hdr,
+ u32 attr_off)
+{
+ u32 de_off = le32_to_cpu(hdr->de_off);
+ u32 o = offsetof(struct INDEX_BUFFER, ihdr) + de_off;
+ const struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+ u32 used = le32_to_cpu(hdr->used);
+
+ while (o < attr_off) {
+ u16 esize;
+
+ if (de_off >= used)
+ break;
+
+ esize = le16_to_cpu(e->size);
+ if (!esize)
+ break;
+
+ o += esize;
+ de_off += esize;
+ e = Add2Ptr(e, esize);
+ }
+
+ return o == attr_off;
+}
+
+static inline void change_attr_size(struct MFT_REC *rec, struct ATTRIB *attr,
+ u32 nsize)
+{
+ u32 asize = le32_to_cpu(attr->size);
+ int dsize = nsize - asize;
+ u8 *next = Add2Ptr(attr, asize);
+ u32 used = le32_to_cpu(rec->used);
+
+ memmove(Add2Ptr(attr, nsize), next, used - PtrOffset(rec, next));
+
+ rec->used = cpu_to_le32(used + dsize);
+ attr->size = cpu_to_le32(nsize);
+}
+
+struct OpenAttr {
+ struct ATTRIB *attr;
+ struct runs_tree *run1;
+ struct runs_tree run0;
+ struct ntfs_inode *ni;
+ // CLST rno;
+};
+
+/*
+ * cmp_type_and_name
+ *
+ * Return: 0 if 'attr' has the same type and name.
+ */
+static inline int cmp_type_and_name(const struct ATTRIB *a1,
+ const struct ATTRIB *a2)
+{
+ return a1->type != a2->type || a1->name_len != a2->name_len ||
+ (a1->name_len && memcmp(attr_name(a1), attr_name(a2),
+ a1->name_len * sizeof(short)));
+}
+
+static struct OpenAttr *find_loaded_attr(struct ntfs_log *log,
+ const struct ATTRIB *attr, CLST rno)
+{
+ struct OPEN_ATTR_ENRTY *oe = NULL;
+
+ while ((oe = enum_rstbl(log->open_attr_tbl, oe))) {
+ struct OpenAttr *op_attr;
+
+ if (ino_get(&oe->ref) != rno)
+ continue;
+
+ op_attr = (struct OpenAttr *)oe->ptr;
+ if (!cmp_type_and_name(op_attr->attr, attr))
+ return op_attr;
+ }
+ return NULL;
+}
+
+static struct ATTRIB *attr_create_nonres_log(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE type, u64 size,
+ const u16 *name, size_t name_len,
+ __le16 flags)
+{
+ struct ATTRIB *attr;
+ u32 name_size = ALIGN(name_len * sizeof(short), 8);
+ bool is_ext = flags & (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED);
+ u32 asize = name_size +
+ (is_ext ? SIZEOF_NONRESIDENT_EX : SIZEOF_NONRESIDENT);
+
+ attr = kzalloc(asize, GFP_NOFS);
+ if (!attr)
+ return NULL;
+
+ attr->type = type;
+ attr->size = cpu_to_le32(asize);
+ attr->flags = flags;
+ attr->non_res = 1;
+ attr->name_len = name_len;
+
+ attr->nres.evcn = cpu_to_le64((u64)bytes_to_cluster(sbi, size) - 1);
+ attr->nres.alloc_size = cpu_to_le64(ntfs_up_cluster(sbi, size));
+ attr->nres.data_size = cpu_to_le64(size);
+ attr->nres.valid_size = attr->nres.data_size;
+ if (is_ext) {
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ if (is_attr_compressed(attr))
+ attr->nres.c_unit = COMPRESSION_UNIT;
+
+ attr->nres.run_off =
+ cpu_to_le16(SIZEOF_NONRESIDENT_EX + name_size);
+ memcpy(Add2Ptr(attr, SIZEOF_NONRESIDENT_EX), name,
+ name_len * sizeof(short));
+ } else {
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ attr->nres.run_off =
+ cpu_to_le16(SIZEOF_NONRESIDENT + name_size);
+ memcpy(Add2Ptr(attr, SIZEOF_NONRESIDENT), name,
+ name_len * sizeof(short));
+ }
+
+ return attr;
+}
+
+/*
+ * do_action - Common routine for the Redo and Undo Passes.
+ * @rlsn: If it is NULL then undo.
+ */
+static int do_action(struct ntfs_log *log, struct OPEN_ATTR_ENRTY *oe,
+ const struct LOG_REC_HDR *lrh, u32 op, void *data,
+ u32 dlen, u32 rec_len, const u64 *rlsn)
+{
+ int err = 0;
+ struct ntfs_sb_info *sbi = log->ni->mi.sbi;
+ struct inode *inode = NULL, *inode_parent;
+ struct mft_inode *mi = NULL, *mi2_child = NULL;
+ CLST rno = 0, rno_base = 0;
+ struct INDEX_BUFFER *ib = NULL;
+ struct MFT_REC *rec = NULL;
+ struct ATTRIB *attr = NULL, *attr2;
+ struct INDEX_HDR *hdr;
+ struct INDEX_ROOT *root;
+ struct NTFS_DE *e, *e1, *e2;
+ struct NEW_ATTRIBUTE_SIZES *new_sz;
+ struct ATTR_FILE_NAME *fname;
+ struct OpenAttr *oa, *oa2;
+ u32 nsize, t32, asize, used, esize, bmp_off, bmp_bits;
+ u16 id, id2;
+ u32 record_size = sbi->record_size;
+ u64 t64;
+ u16 roff = le16_to_cpu(lrh->record_off);
+ u16 aoff = le16_to_cpu(lrh->attr_off);
+ u64 lco = 0;
+ u64 cbo = (u64)le16_to_cpu(lrh->cluster_off) << SECTOR_SHIFT;
+ u64 tvo = le64_to_cpu(lrh->target_vcn) << sbi->cluster_bits;
+ u64 vbo = cbo + tvo;
+ void *buffer_le = NULL;
+ u32 bytes = 0;
+ bool a_dirty = false;
+ u16 data_off;
+
+ oa = oe->ptr;
+
+ /* Big switch to prepare. */
+ switch (op) {
+ /* ============================================================
+ * Process MFT records, as described by the current log record.
+ * ============================================================
+ */
+ case InitializeFileRecordSegment:
+ case DeallocateFileRecordSegment:
+ case WriteEndOfFileRecordSegment:
+ case CreateAttribute:
+ case DeleteAttribute:
+ case UpdateResidentValue:
+ case UpdateMappingPairs:
+ case SetNewAttributeSizes:
+ case AddIndexEntryRoot:
+ case DeleteIndexEntryRoot:
+ case SetIndexEntryVcnRoot:
+ case UpdateFileNameRoot:
+ case UpdateRecordDataRoot:
+ case ZeroEndOfFileRecord:
+ rno = vbo >> sbi->record_bits;
+ inode = ilookup(sbi->sb, rno);
+ if (inode) {
+ mi = &ntfs_i(inode)->mi;
+ } else if (op == InitializeFileRecordSegment) {
+ mi = kzalloc(sizeof(struct mft_inode), GFP_NOFS);
+ if (!mi)
+ return -ENOMEM;
+ err = mi_format_new(mi, sbi, rno, 0, false);
+ if (err)
+ goto out;
+ } else {
+ /* Read from disk. */
+ err = mi_get(sbi, rno, &mi);
+ if (err)
+ return err;
+ }
+ rec = mi->mrec;
+
+ if (op == DeallocateFileRecordSegment)
+ goto skip_load_parent;
+
+ if (InitializeFileRecordSegment != op) {
+ if (rec->rhdr.sign == NTFS_BAAD_SIGNATURE)
+ goto dirty_vol;
+ if (!check_lsn(&rec->rhdr, rlsn))
+ goto out;
+ if (!check_file_record(rec, NULL, sbi))
+ goto dirty_vol;
+ attr = Add2Ptr(rec, roff);
+ }
+
+ if (is_rec_base(rec) || InitializeFileRecordSegment == op) {
+ rno_base = rno;
+ goto skip_load_parent;
+ }
+
+ rno_base = ino_get(&rec->parent_ref);
+ inode_parent = ntfs_iget5(sbi->sb, &rec->parent_ref, NULL);
+ if (IS_ERR(inode_parent))
+ goto skip_load_parent;
+
+ if (is_bad_inode(inode_parent)) {
+ iput(inode_parent);
+ goto skip_load_parent;
+ }
+
+ if (ni_load_mi_ex(ntfs_i(inode_parent), rno, &mi2_child)) {
+ iput(inode_parent);
+ } else {
+ if (mi2_child->mrec != mi->mrec)
+ memcpy(mi2_child->mrec, mi->mrec,
+ sbi->record_size);
+
+ if (inode)
+ iput(inode);
+ else if (mi)
+ mi_put(mi);
+
+ inode = inode_parent;
+ mi = mi2_child;
+ rec = mi2_child->mrec;
+ attr = Add2Ptr(rec, roff);
+ }
+
+skip_load_parent:
+ inode_parent = NULL;
+ break;
+
+ /*
+ * Process attributes, as described by the current log record.
+ */
+ case UpdateNonresidentValue:
+ case AddIndexEntryAllocation:
+ case DeleteIndexEntryAllocation:
+ case WriteEndOfIndexBuffer:
+ case SetIndexEntryVcnAllocation:
+ case UpdateFileNameAllocation:
+ case SetBitsInNonresidentBitMap:
+ case ClearBitsInNonresidentBitMap:
+ case UpdateRecordDataAllocation:
+ attr = oa->attr;
+ bytes = UpdateNonresidentValue == op ? dlen : 0;
+ lco = (u64)le16_to_cpu(lrh->lcns_follow) << sbi->cluster_bits;
+
+ if (attr->type == ATTR_ALLOC) {
+ t32 = le32_to_cpu(oe->bytes_per_index);
+ if (bytes < t32)
+ bytes = t32;
+ }
+
+ if (!bytes)
+ bytes = lco - cbo;
+
+ bytes += roff;
+ if (attr->type == ATTR_ALLOC)
+ bytes = (bytes + 511) & ~511; // align
+
+ buffer_le = kmalloc(bytes, GFP_NOFS);
+ if (!buffer_le)
+ return -ENOMEM;
+
+ err = ntfs_read_run_nb(sbi, oa->run1, vbo, buffer_le, bytes,
+ NULL);
+ if (err)
+ goto out;
+
+ if (attr->type == ATTR_ALLOC && *(int *)buffer_le)
+ ntfs_fix_post_read(buffer_le, bytes, false);
+ break;
+
+ default:
+ WARN_ON(1);
+ }
+
+ /* Big switch to do operation. */
+ switch (op) {
+ case InitializeFileRecordSegment:
+ if (roff + dlen > record_size)
+ goto dirty_vol;
+
+ memcpy(Add2Ptr(rec, roff), data, dlen);
+ mi->dirty = true;
+ break;
+
+ case DeallocateFileRecordSegment:
+ clear_rec_inuse(rec);
+ le16_add_cpu(&rec->seq, 1);
+ mi->dirty = true;
+ break;
+
+ case WriteEndOfFileRecordSegment:
+ attr2 = (struct ATTRIB *)data;
+ if (!check_if_attr(rec, lrh) || roff + dlen > record_size)
+ goto dirty_vol;
+
+ memmove(attr, attr2, dlen);
+ rec->used = cpu_to_le32(ALIGN(roff + dlen, 8));
+
+ mi->dirty = true;
+ break;
+
+ case CreateAttribute:
+ attr2 = (struct ATTRIB *)data;
+ asize = le32_to_cpu(attr2->size);
+ used = le32_to_cpu(rec->used);
+
+ if (!check_if_attr(rec, lrh) || dlen < SIZEOF_RESIDENT ||
+ !IS_ALIGNED(asize, 8) ||
+ Add2Ptr(attr2, asize) > Add2Ptr(lrh, rec_len) ||
+ dlen > record_size - used) {
+ goto dirty_vol;
+ }
+
+ memmove(Add2Ptr(attr, asize), attr, used - roff);
+ memcpy(attr, attr2, asize);
+
+ rec->used = cpu_to_le32(used + asize);
+ id = le16_to_cpu(rec->next_attr_id);
+ id2 = le16_to_cpu(attr2->id);
+ if (id <= id2)
+ rec->next_attr_id = cpu_to_le16(id2 + 1);
+ if (is_attr_indexed(attr))
+ le16_add_cpu(&rec->hard_links, 1);
+
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2) {
+ void *p2 = kmemdup(attr, le32_to_cpu(attr->size),
+ GFP_NOFS);
+ if (p2) {
+ // run_close(oa2->run1);
+ kfree(oa2->attr);
+ oa2->attr = p2;
+ }
+ }
+
+ mi->dirty = true;
+ break;
+
+ case DeleteAttribute:
+ asize = le32_to_cpu(attr->size);
+ used = le32_to_cpu(rec->used);
+
+ if (!check_if_attr(rec, lrh))
+ goto dirty_vol;
+
+ rec->used = cpu_to_le32(used - asize);
+ if (is_attr_indexed(attr))
+ le16_add_cpu(&rec->hard_links, -1);
+
+ memmove(attr, Add2Ptr(attr, asize), used - asize - roff);
+
+ mi->dirty = true;
+ break;
+
+ case UpdateResidentValue:
+ nsize = aoff + dlen;
+
+ if (!check_if_attr(rec, lrh))
+ goto dirty_vol;
+
+ asize = le32_to_cpu(attr->size);
+ used = le32_to_cpu(rec->used);
+
+ if (lrh->redo_len == lrh->undo_len) {
+ if (nsize > asize)
+ goto dirty_vol;
+ goto move_data;
+ }
+
+ if (nsize > asize && nsize - asize > record_size - used)
+ goto dirty_vol;
+
+ nsize = ALIGN(nsize, 8);
+ data_off = le16_to_cpu(attr->res.data_off);
+
+ if (nsize < asize) {
+ memmove(Add2Ptr(attr, aoff), data, dlen);
+ data = NULL; // To skip below memmove().
+ }
+
+ memmove(Add2Ptr(attr, nsize), Add2Ptr(attr, asize),
+ used - le16_to_cpu(lrh->record_off) - asize);
+
+ rec->used = cpu_to_le32(used + nsize - asize);
+ attr->size = cpu_to_le32(nsize);
+ attr->res.data_size = cpu_to_le32(aoff + dlen - data_off);
+
+move_data:
+ if (data)
+ memmove(Add2Ptr(attr, aoff), data, dlen);
+
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2) {
+ void *p2 = kmemdup(attr, le32_to_cpu(attr->size),
+ GFP_NOFS);
+ if (p2) {
+ // run_close(&oa2->run0);
+ oa2->run1 = &oa2->run0;
+ kfree(oa2->attr);
+ oa2->attr = p2;
+ }
+ }
+
+ mi->dirty = true;
+ break;
+
+ case UpdateMappingPairs:
+ nsize = aoff + dlen;
+ asize = le32_to_cpu(attr->size);
+ used = le32_to_cpu(rec->used);
+
+ if (!check_if_attr(rec, lrh) || !attr->non_res ||
+ aoff < le16_to_cpu(attr->nres.run_off) || aoff > asize ||
+ (nsize > asize && nsize - asize > record_size - used)) {
+ goto dirty_vol;
+ }
+
+ nsize = ALIGN(nsize, 8);
+
+ memmove(Add2Ptr(attr, nsize), Add2Ptr(attr, asize),
+ used - le16_to_cpu(lrh->record_off) - asize);
+ rec->used = cpu_to_le32(used + nsize - asize);
+ attr->size = cpu_to_le32(nsize);
+ memmove(Add2Ptr(attr, aoff), data, dlen);
+
+ if (run_get_highest_vcn(le64_to_cpu(attr->nres.svcn),
+ attr_run(attr), &t64)) {
+ goto dirty_vol;
+ }
+
+ attr->nres.evcn = cpu_to_le64(t64);
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2 && oa2->attr->non_res)
+ oa2->attr->nres.evcn = attr->nres.evcn;
+
+ mi->dirty = true;
+ break;
+
+ case SetNewAttributeSizes:
+ new_sz = data;
+ if (!check_if_attr(rec, lrh) || !attr->non_res)
+ goto dirty_vol;
+
+ attr->nres.alloc_size = new_sz->alloc_size;
+ attr->nres.data_size = new_sz->data_size;
+ attr->nres.valid_size = new_sz->valid_size;
+
+ if (dlen >= sizeof(struct NEW_ATTRIBUTE_SIZES))
+ attr->nres.total_size = new_sz->total_size;
+
+ oa2 = find_loaded_attr(log, attr, rno_base);
+ if (oa2) {
+ void *p2 = kmemdup(attr, le32_to_cpu(attr->size),
+ GFP_NOFS);
+ if (p2) {
+ kfree(oa2->attr);
+ oa2->attr = p2;
+ }
+ }
+ mi->dirty = true;
+ break;
+
+ case AddIndexEntryRoot:
+ e = (struct NTFS_DE *)data;
+ esize = le16_to_cpu(e->size);
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+ used = le32_to_cpu(hdr->used);
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh) ||
+ Add2Ptr(data, esize) > Add2Ptr(lrh, rec_len) ||
+ esize > le32_to_cpu(rec->total) - le32_to_cpu(rec->used)) {
+ goto dirty_vol;
+ }
+
+ e1 = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+ change_attr_size(rec, attr, le32_to_cpu(attr->size) + esize);
+
+ memmove(Add2Ptr(e1, esize), e1,
+ PtrOffset(e1, Add2Ptr(hdr, used)));
+ memmove(e1, e, esize);
+
+ le32_add_cpu(&attr->res.data_size, esize);
+ hdr->used = cpu_to_le32(used + esize);
+ le32_add_cpu(&hdr->total, esize);
+
+ mi->dirty = true;
+ break;
+
+ case DeleteIndexEntryRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+ used = le32_to_cpu(hdr->used);
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e1 = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+ esize = le16_to_cpu(e1->size);
+ e2 = Add2Ptr(e1, esize);
+
+ memmove(e1, e2, PtrOffset(e2, Add2Ptr(hdr, used)));
+
+ le32_sub_cpu(&attr->res.data_size, esize);
+ hdr->used = cpu_to_le32(used - esize);
+ le32_sub_cpu(&hdr->total, esize);
+
+ change_attr_size(rec, attr, le32_to_cpu(attr->size) - esize);
+
+ mi->dirty = true;
+ break;
+
+ case SetIndexEntryVcnRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+ de_set_vbn_le(e, *(__le64 *)data);
+ mi->dirty = true;
+ break;
+
+ case UpdateFileNameRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+ fname = (struct ATTR_FILE_NAME *)(e + 1);
+ memmove(&fname->dup, data, sizeof(fname->dup)); //
+ mi->dirty = true;
+ break;
+
+ case UpdateRecordDataRoot:
+ root = resident_data(attr);
+ hdr = &root->ihdr;
+
+ if (!check_if_index_root(rec, lrh) ||
+ !check_if_root_index(attr, hdr, lrh)) {
+ goto dirty_vol;
+ }
+
+ e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+ memmove(Add2Ptr(e, le16_to_cpu(e->view.data_off)), data, dlen);
+
+ mi->dirty = true;
+ break;
+
+ case ZeroEndOfFileRecord:
+ if (roff + dlen > record_size)
+ goto dirty_vol;
+
+ memset(attr, 0, dlen);
+ mi->dirty = true;
+ break;
+
+ case UpdateNonresidentValue:
+ if (lco < cbo + roff + dlen)
+ goto dirty_vol;
+
+ memcpy(Add2Ptr(buffer_le, roff), data, dlen);
+
+ a_dirty = true;
+ if (attr->type == ATTR_ALLOC)
+ ntfs_fix_pre_write(buffer_le, bytes);
+ break;
+
+ case AddIndexEntryAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = data;
+ esize = le16_to_cpu(e->size);
+ e1 = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+
+ used = le32_to_cpu(hdr->used);
+
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff) ||
+ Add2Ptr(e, esize) > Add2Ptr(lrh, rec_len) ||
+ used + esize > le32_to_cpu(hdr->total)) {
+ goto dirty_vol;
+ }
+
+ memmove(Add2Ptr(e1, esize), e1,
+ PtrOffset(e1, Add2Ptr(hdr, used)));
+ memcpy(e1, e, esize);
+
+ hdr->used = cpu_to_le32(used + esize);
+
+ a_dirty = true;
+
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case DeleteIndexEntryAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+ esize = le16_to_cpu(e->size);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ e1 = Add2Ptr(e, esize);
+ nsize = esize;
+ used = le32_to_cpu(hdr->used);
+
+ memmove(e, e1, PtrOffset(e1, Add2Ptr(hdr, used)));
+
+ hdr->used = cpu_to_le32(used - nsize);
+
+ a_dirty = true;
+
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case WriteEndOfIndexBuffer:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff) ||
+ aoff + dlen > offsetof(struct INDEX_BUFFER, ihdr) +
+ le32_to_cpu(hdr->total)) {
+ goto dirty_vol;
+ }
+
+ hdr->used = cpu_to_le32(dlen + PtrOffset(hdr, e));
+ memmove(e, data, dlen);
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case SetIndexEntryVcnAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ de_set_vbn_le(e, *(__le64 *)data);
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case UpdateFileNameAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ fname = (struct ATTR_FILE_NAME *)(e + 1);
+ memmove(&fname->dup, data, sizeof(fname->dup));
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ case SetBitsInNonresidentBitMap:
+ bmp_off =
+ le32_to_cpu(((struct BITMAP_RANGE *)data)->bitmap_off);
+ bmp_bits = le32_to_cpu(((struct BITMAP_RANGE *)data)->bits);
+
+ if (cbo + (bmp_off + 7) / 8 > lco ||
+ cbo + ((bmp_off + bmp_bits + 7) / 8) > lco) {
+ goto dirty_vol;
+ }
+
+ __bitmap_set(Add2Ptr(buffer_le, roff), bmp_off, bmp_bits);
+ a_dirty = true;
+ break;
+
+ case ClearBitsInNonresidentBitMap:
+ bmp_off =
+ le32_to_cpu(((struct BITMAP_RANGE *)data)->bitmap_off);
+ bmp_bits = le32_to_cpu(((struct BITMAP_RANGE *)data)->bits);
+
+ if (cbo + (bmp_off + 7) / 8 > lco ||
+ cbo + ((bmp_off + bmp_bits + 7) / 8) > lco) {
+ goto dirty_vol;
+ }
+
+ __bitmap_clear(Add2Ptr(buffer_le, roff), bmp_off, bmp_bits);
+ a_dirty = true;
+ break;
+
+ case UpdateRecordDataAllocation:
+ ib = Add2Ptr(buffer_le, roff);
+ hdr = &ib->ihdr;
+ e = Add2Ptr(ib, aoff);
+
+ if (is_baad(&ib->rhdr))
+ goto dirty_vol;
+
+ if (!check_lsn(&ib->rhdr, rlsn))
+ goto out;
+ if (!check_index_buffer(ib, bytes) ||
+ !check_if_alloc_index(hdr, aoff)) {
+ goto dirty_vol;
+ }
+
+ memmove(Add2Ptr(e, le16_to_cpu(e->view.data_off)), data, dlen);
+
+ a_dirty = true;
+ ntfs_fix_pre_write(&ib->rhdr, bytes);
+ break;
+
+ default:
+ WARN_ON(1);
+ }
+
+ if (rlsn) {
+ __le64 t64 = cpu_to_le64(*rlsn);
+
+ if (rec)
+ rec->rhdr.lsn = t64;
+ if (ib)
+ ib->rhdr.lsn = t64;
+ }
+
+ if (mi && mi->dirty) {
+ err = mi_write(mi, 0);
+ if (err)
+ goto out;
+ }
+
+ if (a_dirty) {
+ attr = oa->attr;
+ err = ntfs_sb_write_run(sbi, oa->run1, vbo, buffer_le, bytes);
+ if (err)
+ goto out;
+ }
+
+out:
+
+ if (inode)
+ iput(inode);
+ else if (mi != mi2_child)
+ mi_put(mi);
+
+ kfree(buffer_le);
+
+ return err;
+
+dirty_vol:
+ log->set_dirty = true;
+ goto out;
+}
+
+/*
+ * log_replay - Replays log and empties it.
+ *
+ * This function is called during mount operation.
+ * It replays log and empties it.
+ * Initialized is set false if logfile contains '-1'.
+ */
+int log_replay(struct ntfs_inode *ni, bool *initialized)
+{
+ int err;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ntfs_log *log;
+
+ struct restart_info rst_info, rst_info2;
+ u64 rec_lsn, ra_lsn, checkpt_lsn = 0, rlsn = 0;
+ struct ATTR_NAME_ENTRY *attr_names = NULL;
+ struct ATTR_NAME_ENTRY *ane;
+ struct RESTART_TABLE *dptbl = NULL;
+ struct RESTART_TABLE *trtbl = NULL;
+ const struct RESTART_TABLE *rt;
+ struct RESTART_TABLE *oatbl = NULL;
+ struct inode *inode;
+ struct OpenAttr *oa;
+ struct ntfs_inode *ni_oe;
+ struct ATTRIB *attr = NULL;
+ u64 size, vcn, undo_next_lsn;
+ CLST rno, lcn, lcn0, len0, clen;
+ void *data;
+ struct NTFS_RESTART *rst = NULL;
+ struct lcb *lcb = NULL;
+ struct OPEN_ATTR_ENRTY *oe;
+ struct TRANSACTION_ENTRY *tr;
+ struct DIR_PAGE_ENTRY *dp;
+ u32 i, bytes_per_attr_entry;
+ u32 l_size = ni->vfs_inode.i_size;
+ u32 orig_file_size = l_size;
+ u32 page_size, vbo, tail, off, dlen;
+ u32 saved_len, rec_len, transact_id;
+ bool use_second_page;
+ struct RESTART_AREA *ra2, *ra = NULL;
+ struct CLIENT_REC *ca, *cr;
+ __le16 client;
+ struct RESTART_HDR *rh;
+ const struct LFS_RECORD_HDR *frh;
+ const struct LOG_REC_HDR *lrh;
+ bool is_mapped;
+ bool is_ro = sb_rdonly(sbi->sb);
+ u64 t64;
+ u16 t16;
+ u32 t32;
+
+ /* Get the size of page. NOTE: To replay we can use default page. */
+#if PAGE_SIZE >= DefaultLogPageSize && PAGE_SIZE <= DefaultLogPageSize * 2
+ page_size = norm_file_page(PAGE_SIZE, &l_size, true);
+#else
+ page_size = norm_file_page(PAGE_SIZE, &l_size, false);
+#endif
+ if (!page_size)
+ return -EINVAL;
+
+ log = kzalloc(sizeof(struct ntfs_log), GFP_NOFS);
+ if (!log)
+ return -ENOMEM;
+
+ log->ni = ni;
+ log->l_size = l_size;
+ log->one_page_buf = kmalloc(page_size, GFP_NOFS);
+
+ if (!log->one_page_buf) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ log->page_size = page_size;
+ log->page_mask = page_size - 1;
+ log->page_bits = blksize_bits(page_size);
+
+ /* Look for a restart area on the disk. */
+ err = log_read_rst(log, l_size, true, &rst_info);
+ if (err)
+ goto out;
+
+ /* remember 'initialized' */
+ *initialized = rst_info.initialized;
+
+ if (!rst_info.restart) {
+ if (rst_info.initialized) {
+ /* No restart area but the file is not initialized. */
+ err = -EINVAL;
+ goto out;
+ }
+
+ log_init_pg_hdr(log, page_size, page_size, 1, 1);
+ log_create(log, l_size, 0, get_random_int(), false, false);
+
+ log->ra = ra;
+
+ ra = log_create_ra(log);
+ if (!ra) {
+ err = -ENOMEM;
+ goto out;
+ }
+ log->ra = ra;
+ log->init_ra = true;
+
+ goto process_log;
+ }
+
+ /*
+ * If the restart offset above wasn't zero then we won't
+ * look for a second restart.
+ */
+ if (rst_info.vbo)
+ goto check_restart_area;
+
+ err = log_read_rst(log, l_size, false, &rst_info2);
+
+ /* Determine which restart area to use. */
+ if (!rst_info2.restart || rst_info2.last_lsn <= rst_info.last_lsn)
+ goto use_first_page;
+
+ use_second_page = true;
+
+ if (rst_info.chkdsk_was_run && page_size != rst_info.vbo) {
+ struct RECORD_PAGE_HDR *sp = NULL;
+ bool usa_error;
+
+ if (!read_log_page(log, page_size, &sp, &usa_error) &&
+ sp->rhdr.sign == NTFS_CHKD_SIGNATURE) {
+ use_second_page = false;
+ }
+ kfree(sp);
+ }
+
+ if (use_second_page) {
+ kfree(rst_info.r_page);
+ memcpy(&rst_info, &rst_info2, sizeof(struct restart_info));
+ rst_info2.r_page = NULL;
+ }
+
+use_first_page:
+ kfree(rst_info2.r_page);
+
+check_restart_area:
+ /*
+ * If the restart area is at offset 0, we want
+ * to write the second restart area first.
+ */
+ log->init_ra = !!rst_info.vbo;
+
+ /* If we have a valid page then grab a pointer to the restart area. */
+ ra2 = rst_info.valid_page
+ ? Add2Ptr(rst_info.r_page,
+ le16_to_cpu(rst_info.r_page->ra_off))
+ : NULL;
+
+ if (rst_info.chkdsk_was_run ||
+ (ra2 && ra2->client_idx[1] == LFS_NO_CLIENT_LE)) {
+ bool wrapped = false;
+ bool use_multi_page = false;
+ u32 open_log_count;
+
+ /* Do some checks based on whether we have a valid log page. */
+ if (!rst_info.valid_page) {
+ open_log_count = get_random_int();
+ goto init_log_instance;
+ }
+ open_log_count = le32_to_cpu(ra2->open_log_count);
+
+ /*
+ * If the restart page size isn't changing then we want to
+ * check how much work we need to do.
+ */
+ if (page_size != le32_to_cpu(rst_info.r_page->sys_page_size))
+ goto init_log_instance;
+
+init_log_instance:
+ log_init_pg_hdr(log, page_size, page_size, 1, 1);
+
+ log_create(log, l_size, rst_info.last_lsn, open_log_count,
+ wrapped, use_multi_page);
+
+ ra = log_create_ra(log);
+ if (!ra) {
+ err = -ENOMEM;
+ goto out;
+ }
+ log->ra = ra;
+
+ /* Put the restart areas and initialize
+ * the log file as required.
+ */
+ goto process_log;
+ }
+
+ if (!ra2) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * If the log page or the system page sizes have changed, we can't
+ * use the log file. We must use the system page size instead of the
+ * default size if there is not a clean shutdown.
+ */
+ t32 = le32_to_cpu(rst_info.r_page->sys_page_size);
+ if (page_size != t32) {
+ l_size = orig_file_size;
+ page_size =
+ norm_file_page(t32, &l_size, t32 == DefaultLogPageSize);
+ }
+
+ if (page_size != t32 ||
+ page_size != le32_to_cpu(rst_info.r_page->page_size)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* If the file size has shrunk then we won't mount it. */
+ if (l_size < le64_to_cpu(ra2->l_size)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ log_init_pg_hdr(log, page_size, page_size,
+ le16_to_cpu(rst_info.r_page->major_ver),
+ le16_to_cpu(rst_info.r_page->minor_ver));
+
+ log->l_size = le64_to_cpu(ra2->l_size);
+ log->seq_num_bits = le32_to_cpu(ra2->seq_num_bits);
+ log->file_data_bits = sizeof(u64) * 8 - log->seq_num_bits;
+ log->seq_num_mask = (8 << log->file_data_bits) - 1;
+ log->last_lsn = le64_to_cpu(ra2->current_lsn);
+ log->seq_num = log->last_lsn >> log->file_data_bits;
+ log->ra_off = le16_to_cpu(rst_info.r_page->ra_off);
+ log->restart_size = log->sys_page_size - log->ra_off;
+ log->record_header_len = le16_to_cpu(ra2->rec_hdr_len);
+ log->ra_size = le16_to_cpu(ra2->ra_len);
+ log->data_off = le16_to_cpu(ra2->data_off);
+ log->data_size = log->page_size - log->data_off;
+ log->reserved = log->data_size - log->record_header_len;
+
+ vbo = lsn_to_vbo(log, log->last_lsn);
+
+ if (vbo < log->first_page) {
+ /* This is a pseudo lsn. */
+ log->l_flags |= NTFSLOG_NO_LAST_LSN;
+ log->next_page = log->first_page;
+ goto find_oldest;
+ }
+
+ /* Find the end of this log record. */
+ off = final_log_off(log, log->last_lsn,
+ le32_to_cpu(ra2->last_lsn_data_len));
+
+ /* If we wrapped the file then increment the sequence number. */
+ if (off <= vbo) {
+ log->seq_num += 1;
+ log->l_flags |= NTFSLOG_WRAPPED;
+ }
+
+ /* Now compute the next log page to use. */
+ vbo &= ~log->sys_page_mask;
+ tail = log->page_size - (off & log->page_mask) - 1;
+
+ /*
+ *If we can fit another log record on the page,
+ * move back a page the log file.
+ */
+ if (tail >= log->record_header_len) {
+ log->l_flags |= NTFSLOG_REUSE_TAIL;
+ log->next_page = vbo;
+ } else {
+ log->next_page = next_page_off(log, vbo);
+ }
+
+find_oldest:
+ /*
+ * Find the oldest client lsn. Use the last
+ * flushed lsn as a starting point.
+ */
+ log->oldest_lsn = log->last_lsn;
+ oldest_client_lsn(Add2Ptr(ra2, le16_to_cpu(ra2->client_off)),
+ ra2->client_idx[1], &log->oldest_lsn);
+ log->oldest_lsn_off = lsn_to_vbo(log, log->oldest_lsn);
+
+ if (log->oldest_lsn_off < log->first_page)
+ log->l_flags |= NTFSLOG_NO_OLDEST_LSN;
+
+ if (!(ra2->flags & RESTART_SINGLE_PAGE_IO))
+ log->l_flags |= NTFSLOG_WRAPPED | NTFSLOG_MULTIPLE_PAGE_IO;
+
+ log->current_openlog_count = le32_to_cpu(ra2->open_log_count);
+ log->total_avail_pages = log->l_size - log->first_page;
+ log->total_avail = log->total_avail_pages >> log->page_bits;
+ log->max_current_avail = log->total_avail * log->reserved;
+ log->total_avail = log->total_avail * log->data_size;
+
+ log->current_avail = current_log_avail(log);
+
+ ra = kzalloc(log->restart_size, GFP_NOFS);
+ if (!ra) {
+ err = -ENOMEM;
+ goto out;
+ }
+ log->ra = ra;
+
+ t16 = le16_to_cpu(ra2->client_off);
+ if (t16 == offsetof(struct RESTART_AREA, clients)) {
+ memcpy(ra, ra2, log->ra_size);
+ } else {
+ memcpy(ra, ra2, offsetof(struct RESTART_AREA, clients));
+ memcpy(ra->clients, Add2Ptr(ra2, t16),
+ le16_to_cpu(ra2->ra_len) - t16);
+
+ log->current_openlog_count = get_random_int();
+ ra->open_log_count = cpu_to_le32(log->current_openlog_count);
+ log->ra_size = offsetof(struct RESTART_AREA, clients) +
+ sizeof(struct CLIENT_REC);
+ ra->client_off =
+ cpu_to_le16(offsetof(struct RESTART_AREA, clients));
+ ra->ra_len = cpu_to_le16(log->ra_size);
+ }
+
+ le32_add_cpu(&ra->open_log_count, 1);
+
+ /* Now we need to walk through looking for the last lsn. */
+ err = last_log_lsn(log);
+ if (err)
+ goto out;
+
+ log->current_avail = current_log_avail(log);
+
+ /* Remember which restart area to write first. */
+ log->init_ra = rst_info.vbo;
+
+process_log:
+ /* 1.0, 1.1, 2.0 log->major_ver/minor_ver - short values. */
+ switch ((log->major_ver << 16) + log->minor_ver) {
+ case 0x10000:
+ case 0x10001:
+ case 0x20000:
+ break;
+ default:
+ ntfs_warn(sbi->sb, "\x24LogFile version %d.%d is not supported",
+ log->major_ver, log->minor_ver);
+ err = -EOPNOTSUPP;
+ log->set_dirty = true;
+ goto out;
+ }
+
+ /* One client "NTFS" per logfile. */
+ ca = Add2Ptr(ra, le16_to_cpu(ra->client_off));
+
+ for (client = ra->client_idx[1];; client = cr->next_client) {
+ if (client == LFS_NO_CLIENT_LE) {
+ /* Insert "NTFS" client LogFile. */
+ client = ra->client_idx[0];
+ if (client == LFS_NO_CLIENT_LE)
+ return -EINVAL;
+
+ t16 = le16_to_cpu(client);
+ cr = ca + t16;
+
+ remove_client(ca, cr, &ra->client_idx[0]);
+
+ cr->restart_lsn = 0;
+ cr->oldest_lsn = cpu_to_le64(log->oldest_lsn);
+ cr->name_bytes = cpu_to_le32(8);
+ cr->name[0] = cpu_to_le16('N');
+ cr->name[1] = cpu_to_le16('T');
+ cr->name[2] = cpu_to_le16('F');
+ cr->name[3] = cpu_to_le16('S');
+
+ add_client(ca, t16, &ra->client_idx[1]);
+ break;
+ }
+
+ cr = ca + le16_to_cpu(client);
+
+ if (cpu_to_le32(8) == cr->name_bytes &&
+ cpu_to_le16('N') == cr->name[0] &&
+ cpu_to_le16('T') == cr->name[1] &&
+ cpu_to_le16('F') == cr->name[2] &&
+ cpu_to_le16('S') == cr->name[3])
+ break;
+ }
+
+ /* Update the client handle with the client block information. */
+ log->client_id.seq_num = cr->seq_num;
+ log->client_id.client_idx = client;
+
+ err = read_rst_area(log, &rst, &ra_lsn);
+ if (err)
+ goto out;
+
+ if (!rst)
+ goto out;
+
+ bytes_per_attr_entry = !rst->major_ver ? 0x2C : 0x28;
+
+ checkpt_lsn = le64_to_cpu(rst->check_point_start);
+ if (!checkpt_lsn)
+ checkpt_lsn = ra_lsn;
+
+ /* Allocate and Read the Transaction Table. */
+ if (!rst->transact_table_len)
+ goto check_dirty_page_table;
+
+ t64 = le64_to_cpu(rst->transact_table_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t16 = le16_to_cpu(lrh->redo_off);
+
+ rt = Add2Ptr(lrh, t16);
+ t32 = rec_len - t16;
+
+ /* Now check that this is a valid restart table. */
+ if (!check_rstbl(rt, t32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ trtbl = kmemdup(rt, t32, GFP_NOFS);
+ if (!trtbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+check_dirty_page_table:
+ /* The next record back should be the Dirty Pages Table. */
+ if (!rst->dirty_pages_len)
+ goto check_attribute_names;
+
+ t64 = le64_to_cpu(rst->dirty_pages_table_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t16 = le16_to_cpu(lrh->redo_off);
+
+ rt = Add2Ptr(lrh, t16);
+ t32 = rec_len - t16;
+
+ /* Now check that this is a valid restart table. */
+ if (!check_rstbl(rt, t32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ dptbl = kmemdup(rt, t32, GFP_NOFS);
+ if (!dptbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Convert Ra version '0' into version '1'. */
+ if (rst->major_ver)
+ goto end_conv_1;
+
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ struct DIR_PAGE_ENTRY_32 *dp0 = (struct DIR_PAGE_ENTRY_32 *)dp;
+ // NOTE: Danger. Check for of boundary.
+ memmove(&dp->vcn, &dp0->vcn_low,
+ 2 * sizeof(u64) +
+ le32_to_cpu(dp->lcns_follow) * sizeof(u64));
+ }
+
+end_conv_1:
+ lcb_put(lcb);
+ lcb = NULL;
+
+ /*
+ * Go through the table and remove the duplicates,
+ * remembering the oldest lsn values.
+ */
+ if (sbi->cluster_size <= log->page_size)
+ goto trace_dp_table;
+
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ struct DIR_PAGE_ENTRY *next = dp;
+
+ while ((next = enum_rstbl(dptbl, next))) {
+ if (next->target_attr == dp->target_attr &&
+ next->vcn == dp->vcn) {
+ if (le64_to_cpu(next->oldest_lsn) <
+ le64_to_cpu(dp->oldest_lsn)) {
+ dp->oldest_lsn = next->oldest_lsn;
+ }
+
+ free_rsttbl_idx(dptbl, PtrOffset(dptbl, next));
+ }
+ }
+ }
+trace_dp_table:
+check_attribute_names:
+ /* The next record should be the Attribute Names. */
+ if (!rst->attr_names_len)
+ goto check_attr_table;
+
+ t64 = le64_to_cpu(rst->attr_names_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t32 = lrh_length(lrh);
+ rec_len -= t32;
+
+ attr_names = kmemdup(Add2Ptr(lrh, t32), rec_len, GFP_NOFS);
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+check_attr_table:
+ /* The next record should be the attribute Table. */
+ if (!rst->open_attr_len)
+ goto check_attribute_names2;
+
+ t64 = le64_to_cpu(rst->open_attr_table_lsn);
+ err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+ if (err)
+ goto out;
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+ bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ t16 = le16_to_cpu(lrh->redo_off);
+
+ rt = Add2Ptr(lrh, t16);
+ t32 = rec_len - t16;
+
+ if (!check_rstbl(rt, t32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oatbl = kmemdup(rt, t32, GFP_NOFS);
+ if (!oatbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ log->open_attr_tbl = oatbl;
+
+ /* Clear all of the Attr pointers. */
+ oe = NULL;
+ while ((oe = enum_rstbl(oatbl, oe))) {
+ if (!rst->major_ver) {
+ struct OPEN_ATTR_ENRTY_32 oe0;
+
+ /* Really 'oe' points to OPEN_ATTR_ENRTY_32. */
+ memcpy(&oe0, oe, SIZEOF_OPENATTRIBUTEENTRY0);
+
+ oe->bytes_per_index = oe0.bytes_per_index;
+ oe->type = oe0.type;
+ oe->is_dirty_pages = oe0.is_dirty_pages;
+ oe->name_len = 0;
+ oe->ref = oe0.ref;
+ oe->open_record_lsn = oe0.open_record_lsn;
+ }
+
+ oe->is_attr_name = 0;
+ oe->ptr = NULL;
+ }
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+check_attribute_names2:
+ if (!rst->attr_names_len)
+ goto trace_attribute_table;
+
+ ane = attr_names;
+ if (!oatbl)
+ goto trace_attribute_table;
+ while (ane->off) {
+ /* TODO: Clear table on exit! */
+ oe = Add2Ptr(oatbl, le16_to_cpu(ane->off));
+ t16 = le16_to_cpu(ane->name_bytes);
+ oe->name_len = t16 / sizeof(short);
+ oe->ptr = ane->name;
+ oe->is_attr_name = 2;
+ ane = Add2Ptr(ane, sizeof(struct ATTR_NAME_ENTRY) + t16);
+ }
+
+trace_attribute_table:
+ /*
+ * If the checkpt_lsn is zero, then this is a freshly
+ * formatted disk and we have no work to do.
+ */
+ if (!checkpt_lsn) {
+ err = 0;
+ goto out;
+ }
+
+ if (!oatbl) {
+ oatbl = init_rsttbl(bytes_per_attr_entry, 8);
+ if (!oatbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ log->open_attr_tbl = oatbl;
+
+ /* Start the analysis pass from the Checkpoint lsn. */
+ rec_lsn = checkpt_lsn;
+
+ /* Read the first lsn. */
+ err = read_log_rec_lcb(log, checkpt_lsn, lcb_ctx_next, &lcb);
+ if (err)
+ goto out;
+
+ /* Loop to read all subsequent records to the end of the log file. */
+next_log_record_analyze:
+ err = read_next_log_rec(log, lcb, &rec_lsn);
+ if (err)
+ goto out;
+
+ if (!rec_lsn)
+ goto end_log_records_enumerate;
+
+ frh = lcb->lrh;
+ transact_id = le32_to_cpu(frh->transact_id);
+ rec_len = le32_to_cpu(frh->client_data_len);
+ lrh = lcb->log_rec;
+
+ if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * The first lsn after the previous lsn remembered
+ * the checkpoint is the first candidate for the rlsn.
+ */
+ if (!rlsn)
+ rlsn = rec_lsn;
+
+ if (LfsClientRecord != frh->record_type)
+ goto next_log_record_analyze;
+
+ /*
+ * Now update the Transaction Table for this transaction. If there
+ * is no entry present or it is unallocated we allocate the entry.
+ */
+ if (!trtbl) {
+ trtbl = init_rsttbl(sizeof(struct TRANSACTION_ENTRY),
+ INITIAL_NUMBER_TRANSACTIONS);
+ if (!trtbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ tr = Add2Ptr(trtbl, transact_id);
+
+ if (transact_id >= bytes_per_rt(trtbl) ||
+ tr->next != RESTART_ENTRY_ALLOCATED_LE) {
+ tr = alloc_rsttbl_from_idx(&trtbl, transact_id);
+ if (!tr) {
+ err = -ENOMEM;
+ goto out;
+ }
+ tr->transact_state = TransactionActive;
+ tr->first_lsn = cpu_to_le64(rec_lsn);
+ }
+
+ tr->prev_lsn = tr->undo_next_lsn = cpu_to_le64(rec_lsn);
+
+ /*
+ * If this is a compensation log record, then change
+ * the undo_next_lsn to be the undo_next_lsn of this record.
+ */
+ if (lrh->undo_op == cpu_to_le16(CompensationLogRecord))
+ tr->undo_next_lsn = frh->client_undo_next_lsn;
+
+ /* Dispatch to handle log record depending on type. */
+ switch (le16_to_cpu(lrh->redo_op)) {
+ case InitializeFileRecordSegment:
+ case DeallocateFileRecordSegment:
+ case WriteEndOfFileRecordSegment:
+ case CreateAttribute:
+ case DeleteAttribute:
+ case UpdateResidentValue:
+ case UpdateNonresidentValue:
+ case UpdateMappingPairs:
+ case SetNewAttributeSizes:
+ case AddIndexEntryRoot:
+ case DeleteIndexEntryRoot:
+ case AddIndexEntryAllocation:
+ case DeleteIndexEntryAllocation:
+ case WriteEndOfIndexBuffer:
+ case SetIndexEntryVcnRoot:
+ case SetIndexEntryVcnAllocation:
+ case UpdateFileNameRoot:
+ case UpdateFileNameAllocation:
+ case SetBitsInNonresidentBitMap:
+ case ClearBitsInNonresidentBitMap:
+ case UpdateRecordDataRoot:
+ case UpdateRecordDataAllocation:
+ case ZeroEndOfFileRecord:
+ t16 = le16_to_cpu(lrh->target_attr);
+ t64 = le64_to_cpu(lrh->target_vcn);
+ dp = find_dp(dptbl, t16, t64);
+
+ if (dp)
+ goto copy_lcns;
+
+ /*
+ * Calculate the number of clusters per page the system
+ * which wrote the checkpoint, possibly creating the table.
+ */
+ if (dptbl) {
+ t32 = (le16_to_cpu(dptbl->size) -
+ sizeof(struct DIR_PAGE_ENTRY)) /
+ sizeof(u64);
+ } else {
+ t32 = log->clst_per_page;
+ kfree(dptbl);
+ dptbl = init_rsttbl(struct_size(dp, page_lcns, t32),
+ 32);
+ if (!dptbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ dp = alloc_rsttbl_idx(&dptbl);
+ if (!dp) {
+ err = -ENOMEM;
+ goto out;
+ }
+ dp->target_attr = cpu_to_le32(t16);
+ dp->transfer_len = cpu_to_le32(t32 << sbi->cluster_bits);
+ dp->lcns_follow = cpu_to_le32(t32);
+ dp->vcn = cpu_to_le64(t64 & ~((u64)t32 - 1));
+ dp->oldest_lsn = cpu_to_le64(rec_lsn);
+
+copy_lcns:
+ /*
+ * Copy the Lcns from the log record into the Dirty Page Entry.
+ * TODO: For different page size support, must somehow make
+ * whole routine a loop, case Lcns do not fit below.
+ */
+ t16 = le16_to_cpu(lrh->lcns_follow);
+ for (i = 0; i < t16; i++) {
+ size_t j = (size_t)(le64_to_cpu(lrh->target_vcn) -
+ le64_to_cpu(dp->vcn));
+ dp->page_lcns[j + i] = lrh->page_lcns[i];
+ }
+
+ goto next_log_record_analyze;
+
+ case DeleteDirtyClusters: {
+ u32 range_count =
+ le16_to_cpu(lrh->redo_len) / sizeof(struct LCN_RANGE);
+ const struct LCN_RANGE *r =
+ Add2Ptr(lrh, le16_to_cpu(lrh->redo_off));
+
+ /* Loop through all of the Lcn ranges this log record. */
+ for (i = 0; i < range_count; i++, r++) {
+ u64 lcn0 = le64_to_cpu(r->lcn);
+ u64 lcn_e = lcn0 + le64_to_cpu(r->len) - 1;
+
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ u32 j;
+
+ t32 = le32_to_cpu(dp->lcns_follow);
+ for (j = 0; j < t32; j++) {
+ t64 = le64_to_cpu(dp->page_lcns[j]);
+ if (t64 >= lcn0 && t64 <= lcn_e)
+ dp->page_lcns[j] = 0;
+ }
+ }
+ }
+ goto next_log_record_analyze;
+ ;
+ }
+
+ case OpenNonresidentAttribute:
+ t16 = le16_to_cpu(lrh->target_attr);
+ if (t16 >= bytes_per_rt(oatbl)) {
+ /*
+ * Compute how big the table needs to be.
+ * Add 10 extra entries for some cushion.
+ */
+ u32 new_e = t16 / le16_to_cpu(oatbl->size);
+
+ new_e += 10 - le16_to_cpu(oatbl->used);
+
+ oatbl = extend_rsttbl(oatbl, new_e, ~0u);
+ log->open_attr_tbl = oatbl;
+ if (!oatbl) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ /* Point to the entry being opened. */
+ oe = alloc_rsttbl_from_idx(&oatbl, t16);
+ log->open_attr_tbl = oatbl;
+ if (!oe) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Initialize this entry from the log record. */
+ t16 = le16_to_cpu(lrh->redo_off);
+ if (!rst->major_ver) {
+ /* Convert version '0' into version '1'. */
+ struct OPEN_ATTR_ENRTY_32 *oe0 = Add2Ptr(lrh, t16);
+
+ oe->bytes_per_index = oe0->bytes_per_index;
+ oe->type = oe0->type;
+ oe->is_dirty_pages = oe0->is_dirty_pages;
+ oe->name_len = 0; //oe0.name_len;
+ oe->ref = oe0->ref;
+ oe->open_record_lsn = oe0->open_record_lsn;
+ } else {
+ memcpy(oe, Add2Ptr(lrh, t16), bytes_per_attr_entry);
+ }
+
+ t16 = le16_to_cpu(lrh->undo_len);
+ if (t16) {
+ oe->ptr = kmalloc(t16, GFP_NOFS);
+ if (!oe->ptr) {
+ err = -ENOMEM;
+ goto out;
+ }
+ oe->name_len = t16 / sizeof(short);
+ memcpy(oe->ptr,
+ Add2Ptr(lrh, le16_to_cpu(lrh->undo_off)), t16);
+ oe->is_attr_name = 1;
+ } else {
+ oe->ptr = NULL;
+ oe->is_attr_name = 0;
+ }
+
+ goto next_log_record_analyze;
+
+ case HotFix:
+ t16 = le16_to_cpu(lrh->target_attr);
+ t64 = le64_to_cpu(lrh->target_vcn);
+ dp = find_dp(dptbl, t16, t64);
+ if (dp) {
+ size_t j = le64_to_cpu(lrh->target_vcn) -
+ le64_to_cpu(dp->vcn);
+ if (dp->page_lcns[j])
+ dp->page_lcns[j] = lrh->page_lcns[0];
+ }
+ goto next_log_record_analyze;
+
+ case EndTopLevelAction:
+ tr = Add2Ptr(trtbl, transact_id);
+ tr->prev_lsn = cpu_to_le64(rec_lsn);
+ tr->undo_next_lsn = frh->client_undo_next_lsn;
+ goto next_log_record_analyze;
+
+ case PrepareTransaction:
+ tr = Add2Ptr(trtbl, transact_id);
+ tr->transact_state = TransactionPrepared;
+ goto next_log_record_analyze;
+
+ case CommitTransaction:
+ tr = Add2Ptr(trtbl, transact_id);
+ tr->transact_state = TransactionCommitted;
+ goto next_log_record_analyze;
+
+ case ForgetTransaction:
+ free_rsttbl_idx(trtbl, transact_id);
+ goto next_log_record_analyze;
+
+ case Noop:
+ case OpenAttributeTableDump:
+ case AttributeNamesDump:
+ case DirtyPageTableDump:
+ case TransactionTableDump:
+ /* The following cases require no action the Analysis Pass. */
+ goto next_log_record_analyze;
+
+ default:
+ /*
+ * All codes will be explicitly handled.
+ * If we see a code we do not expect, then we are trouble.
+ */
+ goto next_log_record_analyze;
+ }
+
+end_log_records_enumerate:
+ lcb_put(lcb);
+ lcb = NULL;
+
+ /*
+ * Scan the Dirty Page Table and Transaction Table for
+ * the lowest lsn, and return it as the Redo lsn.
+ */
+ dp = NULL;
+ while ((dp = enum_rstbl(dptbl, dp))) {
+ t64 = le64_to_cpu(dp->oldest_lsn);
+ if (t64 && t64 < rlsn)
+ rlsn = t64;
+ }
+
+ tr = NULL;
+ while ((tr = enum_rstbl(trtbl, tr))) {
+ t64 = le64_to_cpu(tr->first_lsn);
+ if (t64 && t64 < rlsn)
+ rlsn = t64;
+ }
+
+ /*
+ * Only proceed if the Dirty Page Table or Transaction
+ * table are not empty.
+ */
+ if ((!dptbl || !dptbl->total) && (!trtbl || !trtbl->total))
+ goto end_reply;
+
+ sbi->flags |= NTFS_FLAGS_NEED_REPLAY;
+ if (is_ro)
+ goto out;
+
+ /* Reopen all of the attributes with dirty pages. */
+ oe = NULL;
+next_open_attribute:
+
+ oe = enum_rstbl(oatbl, oe);
+ if (!oe) {
+ err = 0;
+ dp = NULL;
+ goto next_dirty_page;
+ }
+
+ oa = kzalloc(sizeof(struct OpenAttr), GFP_NOFS);
+ if (!oa) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ inode = ntfs_iget5(sbi->sb, &oe->ref, NULL);
+ if (IS_ERR(inode))
+ goto fake_attr;
+
+ if (is_bad_inode(inode)) {
+ iput(inode);
+fake_attr:
+ if (oa->ni) {
+ iput(&oa->ni->vfs_inode);
+ oa->ni = NULL;
+ }
+
+ attr = attr_create_nonres_log(sbi, oe->type, 0, oe->ptr,
+ oe->name_len, 0);
+ if (!attr) {
+ kfree(oa);
+ err = -ENOMEM;
+ goto out;
+ }
+ oa->attr = attr;
+ oa->run1 = &oa->run0;
+ goto final_oe;
+ }
+
+ ni_oe = ntfs_i(inode);
+ oa->ni = ni_oe;
+
+ attr = ni_find_attr(ni_oe, NULL, NULL, oe->type, oe->ptr, oe->name_len,
+ NULL, NULL);
+
+ if (!attr)
+ goto fake_attr;
+
+ t32 = le32_to_cpu(attr->size);
+ oa->attr = kmemdup(attr, t32, GFP_NOFS);
+ if (!oa->attr)
+ goto fake_attr;
+
+ if (!S_ISDIR(inode->i_mode)) {
+ if (attr->type == ATTR_DATA && !attr->name_len) {
+ oa->run1 = &ni_oe->file.run;
+ goto final_oe;
+ }
+ } else {
+ if (attr->type == ATTR_ALLOC &&
+ attr->name_len == ARRAY_SIZE(I30_NAME) &&
+ !memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME))) {
+ oa->run1 = &ni_oe->dir.alloc_run;
+ goto final_oe;
+ }
+ }
+
+ if (attr->non_res) {
+ u16 roff = le16_to_cpu(attr->nres.run_off);
+ CLST svcn = le64_to_cpu(attr->nres.svcn);
+
+ err = run_unpack(&oa->run0, sbi, inode->i_ino, svcn,
+ le64_to_cpu(attr->nres.evcn), svcn,
+ Add2Ptr(attr, roff), t32 - roff);
+ if (err < 0) {
+ kfree(oa->attr);
+ oa->attr = NULL;
+ goto fake_attr;
+ }
+ err = 0;
+ }
+ oa->run1 = &oa->run0;
+ attr = oa->attr;
+
+final_oe:
+ if (oe->is_attr_name == 1)
+ kfree(oe->ptr);
+ oe->is_attr_name = 0;
+ oe->ptr = oa;
+ oe->name_len = attr->name_len;
+
+ goto next_open_attribute;
+
+ /*
+ * Now loop through the dirty page table to extract all of the Vcn/Lcn.
+ * Mapping that we have, and insert it into the appropriate run.
+ */
+next_dirty_page:
+ dp = enum_rstbl(dptbl, dp);
+ if (!dp)
+ goto do_redo_1;
+
+ oe = Add2Ptr(oatbl, le32_to_cpu(dp->target_attr));
+
+ if (oe->next != RESTART_ENTRY_ALLOCATED_LE)
+ goto next_dirty_page;
+
+ oa = oe->ptr;
+ if (!oa)
+ goto next_dirty_page;
+
+ i = -1;
+next_dirty_page_vcn:
+ i += 1;
+ if (i >= le32_to_cpu(dp->lcns_follow))
+ goto next_dirty_page;
+
+ vcn = le64_to_cpu(dp->vcn) + i;
+ size = (vcn + 1) << sbi->cluster_bits;
+
+ if (!dp->page_lcns[i])
+ goto next_dirty_page_vcn;
+
+ rno = ino_get(&oe->ref);
+ if (rno <= MFT_REC_MIRR &&
+ size < (MFT_REC_VOL + 1) * sbi->record_size &&
+ oe->type == ATTR_DATA) {
+ goto next_dirty_page_vcn;
+ }
+
+ lcn = le64_to_cpu(dp->page_lcns[i]);
+
+ if ((!run_lookup_entry(oa->run1, vcn, &lcn0, &len0, NULL) ||
+ lcn0 != lcn) &&
+ !run_add_entry(oa->run1, vcn, lcn, 1, false)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ attr = oa->attr;
+ t64 = le64_to_cpu(attr->nres.alloc_size);
+ if (size > t64) {
+ attr->nres.valid_size = attr->nres.data_size =
+ attr->nres.alloc_size = cpu_to_le64(size);
+ }
+ goto next_dirty_page_vcn;
+
+do_redo_1:
+ /*
+ * Perform the Redo Pass, to restore all of the dirty pages to the same
+ * contents that they had immediately before the crash. If the dirty
+ * page table is empty, then we can skip the entire Redo Pass.
+ */
+ if (!dptbl || !dptbl->total)
+ goto do_undo_action;
+
+ rec_lsn = rlsn;
+
+ /*
+ * Read the record at the Redo lsn, before falling
+ * into common code to handle each record.
+ */
+ err = read_log_rec_lcb(log, rlsn, lcb_ctx_next, &lcb);
+ if (err)
+ goto out;
+
+ /*
+ * Now loop to read all of our log records forwards, until
+ * we hit the end of the file, cleaning up at the end.
+ */
+do_action_next:
+ frh = lcb->lrh;
+
+ if (LfsClientRecord != frh->record_type)
+ goto read_next_log_do_action;
+
+ transact_id = le32_to_cpu(frh->transact_id);
+ rec_len = le32_to_cpu(frh->client_data_len);
+ lrh = lcb->log_rec;
+
+ if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Ignore log records that do not update pages. */
+ if (lrh->lcns_follow)
+ goto find_dirty_page;
+
+ goto read_next_log_do_action;
+
+find_dirty_page:
+ t16 = le16_to_cpu(lrh->target_attr);
+ t64 = le64_to_cpu(lrh->target_vcn);
+ dp = find_dp(dptbl, t16, t64);
+
+ if (!dp)
+ goto read_next_log_do_action;
+
+ if (rec_lsn < le64_to_cpu(dp->oldest_lsn))
+ goto read_next_log_do_action;
+
+ t16 = le16_to_cpu(lrh->target_attr);
+ if (t16 >= bytes_per_rt(oatbl)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oe = Add2Ptr(oatbl, t16);
+
+ if (oe->next != RESTART_ENTRY_ALLOCATED_LE) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ oa = oe->ptr;
+
+ if (!oa) {
+ err = -EINVAL;
+ goto out;
+ }
+ attr = oa->attr;
+
+ vcn = le64_to_cpu(lrh->target_vcn);
+
+ if (!run_lookup_entry(oa->run1, vcn, &lcn, NULL, NULL) ||
+ lcn == SPARSE_LCN) {
+ goto read_next_log_do_action;
+ }
+
+ /* Point to the Redo data and get its length. */
+ data = Add2Ptr(lrh, le16_to_cpu(lrh->redo_off));
+ dlen = le16_to_cpu(lrh->redo_len);
+
+ /* Shorten length by any Lcns which were deleted. */
+ saved_len = dlen;
+
+ for (i = le16_to_cpu(lrh->lcns_follow); i; i--) {
+ size_t j;
+ u32 alen, voff;
+
+ voff = le16_to_cpu(lrh->record_off) +
+ le16_to_cpu(lrh->attr_off);
+ voff += le16_to_cpu(lrh->cluster_off) << SECTOR_SHIFT;
+
+ /* If the Vcn question is allocated, we can just get out. */
+ j = le64_to_cpu(lrh->target_vcn) - le64_to_cpu(dp->vcn);
+ if (dp->page_lcns[j + i - 1])
+ break;
+
+ if (!saved_len)
+ saved_len = 1;
+
+ /*
+ * Calculate the allocated space left relative to the
+ * log record Vcn, after removing this unallocated Vcn.
+ */
+ alen = (i - 1) << sbi->cluster_bits;
+
+ /*
+ * If the update described this log record goes beyond
+ * the allocated space, then we will have to reduce the length.
+ */
+ if (voff >= alen)
+ dlen = 0;
+ else if (voff + dlen > alen)
+ dlen = alen - voff;
+ }
+
+ /*
+ * If the resulting dlen from above is now zero,
+ * we can skip this log record.
+ */
+ if (!dlen && saved_len)
+ goto read_next_log_do_action;
+
+ t16 = le16_to_cpu(lrh->redo_op);
+ if (can_skip_action(t16))
+ goto read_next_log_do_action;
+
+ /* Apply the Redo operation a common routine. */
+ err = do_action(log, oe, lrh, t16, data, dlen, rec_len, &rec_lsn);
+ if (err)
+ goto out;
+
+ /* Keep reading and looping back until end of file. */
+read_next_log_do_action:
+ err = read_next_log_rec(log, lcb, &rec_lsn);
+ if (!err && rec_lsn)
+ goto do_action_next;
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+do_undo_action:
+ /* Scan Transaction Table. */
+ tr = NULL;
+transaction_table_next:
+ tr = enum_rstbl(trtbl, tr);
+ if (!tr)
+ goto undo_action_done;
+
+ if (TransactionActive != tr->transact_state || !tr->undo_next_lsn) {
+ free_rsttbl_idx(trtbl, PtrOffset(trtbl, tr));
+ goto transaction_table_next;
+ }
+
+ log->transaction_id = PtrOffset(trtbl, tr);
+ undo_next_lsn = le64_to_cpu(tr->undo_next_lsn);
+
+ /*
+ * We only have to do anything if the transaction has
+ * something its undo_next_lsn field.
+ */
+ if (!undo_next_lsn)
+ goto commit_undo;
+
+ /* Read the first record to be undone by this transaction. */
+ err = read_log_rec_lcb(log, undo_next_lsn, lcb_ctx_undo_next, &lcb);
+ if (err)
+ goto out;
+
+ /*
+ * Now loop to read all of our log records forwards,
+ * until we hit the end of the file, cleaning up at the end.
+ */
+undo_action_next:
+
+ lrh = lcb->log_rec;
+ frh = lcb->lrh;
+ transact_id = le32_to_cpu(frh->transact_id);
+ rec_len = le32_to_cpu(frh->client_data_len);
+
+ if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (lrh->undo_op == cpu_to_le16(Noop))
+ goto read_next_log_undo_action;
+
+ oe = Add2Ptr(oatbl, le16_to_cpu(lrh->target_attr));
+ oa = oe->ptr;
+
+ t16 = le16_to_cpu(lrh->lcns_follow);
+ if (!t16)
+ goto add_allocated_vcns;
+
+ is_mapped = run_lookup_entry(oa->run1, le64_to_cpu(lrh->target_vcn),
+ &lcn, &clen, NULL);
+
+ /*
+ * If the mapping isn't already the table or the mapping
+ * corresponds to a hole the mapping, we need to make sure
+ * there is no partial page already memory.
+ */
+ if (is_mapped && lcn != SPARSE_LCN && clen >= t16)
+ goto add_allocated_vcns;
+
+ vcn = le64_to_cpu(lrh->target_vcn);
+ vcn &= ~(log->clst_per_page - 1);
+
+add_allocated_vcns:
+ for (i = 0, vcn = le64_to_cpu(lrh->target_vcn),
+ size = (vcn + 1) << sbi->cluster_bits;
+ i < t16; i++, vcn += 1, size += sbi->cluster_size) {
+ attr = oa->attr;
+ if (!attr->non_res) {
+ if (size > le32_to_cpu(attr->res.data_size))
+ attr->res.data_size = cpu_to_le32(size);
+ } else {
+ if (size > le64_to_cpu(attr->nres.data_size))
+ attr->nres.valid_size = attr->nres.data_size =
+ attr->nres.alloc_size =
+ cpu_to_le64(size);
+ }
+ }
+
+ t16 = le16_to_cpu(lrh->undo_op);
+ if (can_skip_action(t16))
+ goto read_next_log_undo_action;
+
+ /* Point to the Redo data and get its length. */
+ data = Add2Ptr(lrh, le16_to_cpu(lrh->undo_off));
+ dlen = le16_to_cpu(lrh->undo_len);
+
+ /* It is time to apply the undo action. */
+ err = do_action(log, oe, lrh, t16, data, dlen, rec_len, NULL);
+
+read_next_log_undo_action:
+ /*
+ * Keep reading and looping back until we have read the
+ * last record for this transaction.
+ */
+ err = read_next_log_rec(log, lcb, &rec_lsn);
+ if (err)
+ goto out;
+
+ if (rec_lsn)
+ goto undo_action_next;
+
+ lcb_put(lcb);
+ lcb = NULL;
+
+commit_undo:
+ free_rsttbl_idx(trtbl, log->transaction_id);
+
+ log->transaction_id = 0;
+
+ goto transaction_table_next;
+
+undo_action_done:
+
+ ntfs_update_mftmirr(sbi, 0);
+
+ sbi->flags &= ~NTFS_FLAGS_NEED_REPLAY;
+
+end_reply:
+
+ err = 0;
+ if (is_ro)
+ goto out;
+
+ rh = kzalloc(log->page_size, GFP_NOFS);
+ if (!rh) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ rh->rhdr.sign = NTFS_RSTR_SIGNATURE;
+ rh->rhdr.fix_off = cpu_to_le16(offsetof(struct RESTART_HDR, fixups));
+ t16 = (log->page_size >> SECTOR_SHIFT) + 1;
+ rh->rhdr.fix_num = cpu_to_le16(t16);
+ rh->sys_page_size = cpu_to_le32(log->page_size);
+ rh->page_size = cpu_to_le32(log->page_size);
+
+ t16 = ALIGN(offsetof(struct RESTART_HDR, fixups) + sizeof(short) * t16,
+ 8);
+ rh->ra_off = cpu_to_le16(t16);
+ rh->minor_ver = cpu_to_le16(1); // 0x1A:
+ rh->major_ver = cpu_to_le16(1); // 0x1C:
+
+ ra2 = Add2Ptr(rh, t16);
+ memcpy(ra2, ra, sizeof(struct RESTART_AREA));
+
+ ra2->client_idx[0] = 0;
+ ra2->client_idx[1] = LFS_NO_CLIENT_LE;
+ ra2->flags = cpu_to_le16(2);
+
+ le32_add_cpu(&ra2->open_log_count, 1);
+
+ ntfs_fix_pre_write(&rh->rhdr, log->page_size);
+
+ err = ntfs_sb_write_run(sbi, &ni->file.run, 0, rh, log->page_size);
+ if (!err)
+ err = ntfs_sb_write_run(sbi, &log->ni->file.run, log->page_size,
+ rh, log->page_size);
+
+ kfree(rh);
+ if (err)
+ goto out;
+
+out:
+ kfree(rst);
+ if (lcb)
+ lcb_put(lcb);
+
+ /*
+ * Scan the Open Attribute Table to close all of
+ * the open attributes.
+ */
+ oe = NULL;
+ while ((oe = enum_rstbl(oatbl, oe))) {
+ rno = ino_get(&oe->ref);
+
+ if (oe->is_attr_name == 1) {
+ kfree(oe->ptr);
+ oe->ptr = NULL;
+ continue;
+ }
+
+ if (oe->is_attr_name)
+ continue;
+
+ oa = oe->ptr;
+ if (!oa)
+ continue;
+
+ run_close(&oa->run0);
+ kfree(oa->attr);
+ if (oa->ni)
+ iput(&oa->ni->vfs_inode);
+ kfree(oa);
+ }
+
+ kfree(trtbl);
+ kfree(oatbl);
+ kfree(dptbl);
+ kfree(attr_names);
+ kfree(rst_info.r_page);
+
+ kfree(ra);
+ kfree(log->one_page_buf);
+
+ if (err)
+ sbi->flags |= NTFS_FLAGS_NEED_REPLAY;
+
+ if (err == -EROFS)
+ err = 0;
+ else if (log->set_dirty)
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
+ kfree(log);
+
+ return err;
+}
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
new file mode 100644
index 000000000000..91e3743e1442
--- /dev/null
+++ b/fs/ntfs3/fsntfs.c
@@ -0,0 +1,2509 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+const struct cpu_str NAME_MFT = {
+ 4, 0, { '$', 'M', 'F', 'T' },
+};
+const struct cpu_str NAME_MIRROR = {
+ 8, 0, { '$', 'M', 'F', 'T', 'M', 'i', 'r', 'r' },
+};
+const struct cpu_str NAME_LOGFILE = {
+ 8, 0, { '$', 'L', 'o', 'g', 'F', 'i', 'l', 'e' },
+};
+const struct cpu_str NAME_VOLUME = {
+ 7, 0, { '$', 'V', 'o', 'l', 'u', 'm', 'e' },
+};
+const struct cpu_str NAME_ATTRDEF = {
+ 8, 0, { '$', 'A', 't', 't', 'r', 'D', 'e', 'f' },
+};
+const struct cpu_str NAME_ROOT = {
+ 1, 0, { '.' },
+};
+const struct cpu_str NAME_BITMAP = {
+ 7, 0, { '$', 'B', 'i', 't', 'm', 'a', 'p' },
+};
+const struct cpu_str NAME_BOOT = {
+ 5, 0, { '$', 'B', 'o', 'o', 't' },
+};
+const struct cpu_str NAME_BADCLUS = {
+ 8, 0, { '$', 'B', 'a', 'd', 'C', 'l', 'u', 's' },
+};
+const struct cpu_str NAME_QUOTA = {
+ 6, 0, { '$', 'Q', 'u', 'o', 't', 'a' },
+};
+const struct cpu_str NAME_SECURE = {
+ 7, 0, { '$', 'S', 'e', 'c', 'u', 'r', 'e' },
+};
+const struct cpu_str NAME_UPCASE = {
+ 7, 0, { '$', 'U', 'p', 'C', 'a', 's', 'e' },
+};
+const struct cpu_str NAME_EXTEND = {
+ 7, 0, { '$', 'E', 'x', 't', 'e', 'n', 'd' },
+};
+const struct cpu_str NAME_OBJID = {
+ 6, 0, { '$', 'O', 'b', 'j', 'I', 'd' },
+};
+const struct cpu_str NAME_REPARSE = {
+ 8, 0, { '$', 'R', 'e', 'p', 'a', 'r', 's', 'e' },
+};
+const struct cpu_str NAME_USNJRNL = {
+ 8, 0, { '$', 'U', 's', 'n', 'J', 'r', 'n', 'l' },
+};
+const __le16 BAD_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('B'), cpu_to_le16('a'), cpu_to_le16('d'),
+};
+const __le16 I30_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('I'), cpu_to_le16('3'), cpu_to_le16('0'),
+};
+const __le16 SII_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('I'), cpu_to_le16('I'),
+};
+const __le16 SDH_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('D'), cpu_to_le16('H'),
+};
+const __le16 SDS_NAME[4] = {
+ cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('D'), cpu_to_le16('S'),
+};
+const __le16 SO_NAME[2] = {
+ cpu_to_le16('$'), cpu_to_le16('O'),
+};
+const __le16 SQ_NAME[2] = {
+ cpu_to_le16('$'), cpu_to_le16('Q'),
+};
+const __le16 SR_NAME[2] = {
+ cpu_to_le16('$'), cpu_to_le16('R'),
+};
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+const __le16 WOF_NAME[17] = {
+ cpu_to_le16('W'), cpu_to_le16('o'), cpu_to_le16('f'), cpu_to_le16('C'),
+ cpu_to_le16('o'), cpu_to_le16('m'), cpu_to_le16('p'), cpu_to_le16('r'),
+ cpu_to_le16('e'), cpu_to_le16('s'), cpu_to_le16('s'), cpu_to_le16('e'),
+ cpu_to_le16('d'), cpu_to_le16('D'), cpu_to_le16('a'), cpu_to_le16('t'),
+ cpu_to_le16('a'),
+};
+#endif
+
+// clang-format on
+
+/*
+ * ntfs_fix_pre_write - Insert fixups into @rhdr before writing to disk.
+ */
+bool ntfs_fix_pre_write(struct NTFS_RECORD_HEADER *rhdr, size_t bytes)
+{
+ u16 *fixup, *ptr;
+ u16 sample;
+ u16 fo = le16_to_cpu(rhdr->fix_off);
+ u16 fn = le16_to_cpu(rhdr->fix_num);
+
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+ return false;
+ }
+
+ /* Get fixup pointer. */
+ fixup = Add2Ptr(rhdr, fo);
+
+ if (*fixup >= 0x7FFF)
+ *fixup = 1;
+ else
+ *fixup += 1;
+
+ sample = *fixup;
+
+ ptr = Add2Ptr(rhdr, SECTOR_SIZE - sizeof(short));
+
+ while (fn--) {
+ *++fixup = *ptr;
+ *ptr = sample;
+ ptr += SECTOR_SIZE / sizeof(short);
+ }
+ return true;
+}
+
+/*
+ * ntfs_fix_post_read - Remove fixups after reading from disk.
+ *
+ * Return: < 0 if error, 0 if ok, 1 if need to update fixups.
+ */
+int ntfs_fix_post_read(struct NTFS_RECORD_HEADER *rhdr, size_t bytes,
+ bool simple)
+{
+ int ret;
+ u16 *fixup, *ptr;
+ u16 sample, fo, fn;
+
+ fo = le16_to_cpu(rhdr->fix_off);
+ fn = simple ? ((bytes >> SECTOR_SHIFT) + 1)
+ : le16_to_cpu(rhdr->fix_num);
+
+ /* Check errors. */
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+ return -EINVAL; /* Native chkntfs returns ok! */
+ }
+
+ /* Get fixup pointer. */
+ fixup = Add2Ptr(rhdr, fo);
+ sample = *fixup;
+ ptr = Add2Ptr(rhdr, SECTOR_SIZE - sizeof(short));
+ ret = 0;
+
+ while (fn--) {
+ /* Test current word. */
+ if (*ptr != sample) {
+ /* Fixup does not match! Is it serious error? */
+ ret = -E_NTFS_FIXUP;
+ }
+
+ /* Replace fixup. */
+ *ptr = *++fixup;
+ ptr += SECTOR_SIZE / sizeof(short);
+ }
+
+ return ret;
+}
+
+/*
+ * ntfs_extend_init - Load $Extend file.
+ */
+int ntfs_extend_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ struct inode *inode, *inode2;
+ struct MFT_REF ref;
+
+ if (sbi->volume.major_ver < 3) {
+ ntfs_notice(sb, "Skip $Extend 'cause NTFS version");
+ return 0;
+ }
+
+ ref.low = cpu_to_le32(MFT_REC_EXTEND);
+ ref.high = 0;
+ ref.seq = cpu_to_le16(MFT_REC_EXTEND);
+ inode = ntfs_iget5(sb, &ref, &NAME_EXTEND);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $Extend.");
+ inode = NULL;
+ goto out;
+ }
+
+ /* If ntfs_iget5() reads from disk it never returns bad inode. */
+ if (!S_ISDIR(inode->i_mode)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Try to find $ObjId */
+ inode2 = dir_search_u(inode, &NAME_OBJID, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ if (is_bad_inode(inode2)) {
+ iput(inode2);
+ } else {
+ sbi->objid.ni = ntfs_i(inode2);
+ sbi->objid_no = inode2->i_ino;
+ }
+ }
+
+ /* Try to find $Quota */
+ inode2 = dir_search_u(inode, &NAME_QUOTA, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ sbi->quota_no = inode2->i_ino;
+ iput(inode2);
+ }
+
+ /* Try to find $Reparse */
+ inode2 = dir_search_u(inode, &NAME_REPARSE, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ sbi->reparse.ni = ntfs_i(inode2);
+ sbi->reparse_no = inode2->i_ino;
+ }
+
+ /* Try to find $UsnJrnl */
+ inode2 = dir_search_u(inode, &NAME_USNJRNL, NULL);
+ if (inode2 && !IS_ERR(inode2)) {
+ sbi->usn_jrnl_no = inode2->i_ino;
+ iput(inode2);
+ }
+
+ err = 0;
+out:
+ iput(inode);
+ return err;
+}
+
+int ntfs_loadlog_and_replay(struct ntfs_inode *ni, struct ntfs_sb_info *sbi)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ bool initialized = false;
+ struct MFT_REF ref;
+ struct inode *inode;
+
+ /* Check for 4GB. */
+ if (ni->vfs_inode.i_size >= 0x100000000ull) {
+ ntfs_err(sb, "\x24LogFile is too big");
+ err = -EINVAL;
+ goto out;
+ }
+
+ sbi->flags |= NTFS_FLAGS_LOG_REPLAYING;
+
+ ref.low = cpu_to_le32(MFT_REC_MFT);
+ ref.high = 0;
+ ref.seq = cpu_to_le16(1);
+
+ inode = ntfs_iget5(sb, &ref, NULL);
+
+ if (IS_ERR(inode))
+ inode = NULL;
+
+ if (!inode) {
+ /* Try to use MFT copy. */
+ u64 t64 = sbi->mft.lbo;
+
+ sbi->mft.lbo = sbi->mft.lbo2;
+ inode = ntfs_iget5(sb, &ref, NULL);
+ sbi->mft.lbo = t64;
+ if (IS_ERR(inode))
+ inode = NULL;
+ }
+
+ if (!inode) {
+ err = -EINVAL;
+ ntfs_err(sb, "Failed to load $MFT.");
+ goto out;
+ }
+
+ sbi->mft.ni = ntfs_i(inode);
+
+ /* LogFile should not contains attribute list. */
+ err = ni_load_all_mi(sbi->mft.ni);
+ if (!err)
+ err = log_replay(ni, &initialized);
+
+ iput(inode);
+ sbi->mft.ni = NULL;
+
+ sync_blockdev(sb->s_bdev);
+ invalidate_bdev(sb->s_bdev);
+
+ if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+ err = 0;
+ goto out;
+ }
+
+ if (sb_rdonly(sb) || !initialized)
+ goto out;
+
+ /* Fill LogFile by '-1' if it is initialized. */
+ err = ntfs_bio_fill_1(sbi, &ni->file.run);
+
+out:
+ sbi->flags &= ~NTFS_FLAGS_LOG_REPLAYING;
+
+ return err;
+}
+
+/*
+ * ntfs_query_def
+ *
+ * Return: Current ATTR_DEF_ENTRY for given attribute type.
+ */
+const struct ATTR_DEF_ENTRY *ntfs_query_def(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE type)
+{
+ int type_in = le32_to_cpu(type);
+ size_t min_idx = 0;
+ size_t max_idx = sbi->def_entries - 1;
+
+ while (min_idx <= max_idx) {
+ size_t i = min_idx + ((max_idx - min_idx) >> 1);
+ const struct ATTR_DEF_ENTRY *entry = sbi->def_table + i;
+ int diff = le32_to_cpu(entry->type) - type_in;
+
+ if (!diff)
+ return entry;
+ if (diff < 0)
+ min_idx = i + 1;
+ else if (i)
+ max_idx = i - 1;
+ else
+ return NULL;
+ }
+ return NULL;
+}
+
+/*
+ * ntfs_look_for_free_space - Look for a free space in bitmap.
+ */
+int ntfs_look_for_free_space(struct ntfs_sb_info *sbi, CLST lcn, CLST len,
+ CLST *new_lcn, CLST *new_len,
+ enum ALLOCATE_OPT opt)
+{
+ int err;
+ CLST alen = 0;
+ struct super_block *sb = sbi->sb;
+ size_t alcn, zlen, zeroes, zlcn, zlen2, ztrim, new_zlen;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+ if (opt & ALLOCATE_MFT) {
+ zlen = wnd_zone_len(wnd);
+
+ if (!zlen) {
+ err = ntfs_refresh_zone(sbi);
+ if (err)
+ goto out;
+ zlen = wnd_zone_len(wnd);
+ }
+
+ if (!zlen) {
+ ntfs_err(sbi->sb, "no free space to extend mft");
+ goto out;
+ }
+
+ lcn = wnd_zone_bit(wnd);
+ alen = zlen > len ? len : zlen;
+
+ wnd_zone_set(wnd, lcn + alen, zlen - alen);
+
+ err = wnd_set_used(wnd, lcn, alen);
+ if (err) {
+ up_write(&wnd->rw_lock);
+ return err;
+ }
+ alcn = lcn;
+ goto out;
+ }
+ /*
+ * 'Cause cluster 0 is always used this value means that we should use
+ * cached value of 'next_free_lcn' to improve performance.
+ */
+ if (!lcn)
+ lcn = sbi->used.next_free_lcn;
+
+ if (lcn >= wnd->nbits)
+ lcn = 0;
+
+ alen = wnd_find(wnd, len, lcn, BITMAP_FIND_MARK_AS_USED, &alcn);
+ if (alen)
+ goto out;
+
+ /* Try to use clusters from MftZone. */
+ zlen = wnd_zone_len(wnd);
+ zeroes = wnd_zeroes(wnd);
+
+ /* Check too big request */
+ if (len > zeroes + zlen || zlen <= NTFS_MIN_MFT_ZONE)
+ goto out;
+
+ /* How many clusters to cat from zone. */
+ zlcn = wnd_zone_bit(wnd);
+ zlen2 = zlen >> 1;
+ ztrim = len > zlen ? zlen : (len > zlen2 ? len : zlen2);
+ new_zlen = zlen - ztrim;
+
+ if (new_zlen < NTFS_MIN_MFT_ZONE) {
+ new_zlen = NTFS_MIN_MFT_ZONE;
+ if (new_zlen > zlen)
+ new_zlen = zlen;
+ }
+
+ wnd_zone_set(wnd, zlcn, new_zlen);
+
+ /* Allocate continues clusters. */
+ alen = wnd_find(wnd, len, 0,
+ BITMAP_FIND_MARK_AS_USED | BITMAP_FIND_FULL, &alcn);
+
+out:
+ if (alen) {
+ err = 0;
+ *new_len = alen;
+ *new_lcn = alcn;
+
+ ntfs_unmap_meta(sb, alcn, alen);
+
+ /* Set hint for next requests. */
+ if (!(opt & ALLOCATE_MFT))
+ sbi->used.next_free_lcn = alcn + alen;
+ } else {
+ err = -ENOSPC;
+ }
+
+ up_write(&wnd->rw_lock);
+ return err;
+}
+
+/*
+ * ntfs_extend_mft - Allocate additional MFT records.
+ *
+ * sbi->mft.bitmap is locked for write.
+ *
+ * NOTE: recursive:
+ * ntfs_look_free_mft ->
+ * ntfs_extend_mft ->
+ * attr_set_size ->
+ * ni_insert_nonresident ->
+ * ni_insert_attr ->
+ * ni_ins_attr_ext ->
+ * ntfs_look_free_mft ->
+ * ntfs_extend_mft
+ *
+ * To avoid recursive always allocate space for two new MFT records
+ * see attrib.c: "at least two MFT to avoid recursive loop".
+ */
+static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->mft.ni;
+ size_t new_mft_total;
+ u64 new_mft_bytes, new_bitmap_bytes;
+ struct ATTRIB *attr;
+ struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+
+ new_mft_total = (wnd->nbits + MFT_INCREASE_CHUNK + 127) & (CLST)~127;
+ new_mft_bytes = (u64)new_mft_total << sbi->record_bits;
+
+ /* Step 1: Resize $MFT::DATA. */
+ down_write(&ni->file.run_lock);
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run,
+ new_mft_bytes, NULL, false, &attr);
+
+ if (err) {
+ up_write(&ni->file.run_lock);
+ goto out;
+ }
+
+ attr->nres.valid_size = attr->nres.data_size;
+ new_mft_total = le64_to_cpu(attr->nres.alloc_size) >> sbi->record_bits;
+ ni->mi.dirty = true;
+
+ /* Step 2: Resize $MFT::BITMAP. */
+ new_bitmap_bytes = bitmap_size(new_mft_total);
+
+ err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
+ new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
+
+ /* Refresh MFT Zone if necessary. */
+ down_write_nested(&sbi->used.bitmap.rw_lock, BITMAP_MUTEX_CLUSTERS);
+
+ ntfs_refresh_zone(sbi);
+
+ up_write(&sbi->used.bitmap.rw_lock);
+ up_write(&ni->file.run_lock);
+
+ if (err)
+ goto out;
+
+ err = wnd_extend(wnd, new_mft_total);
+
+ if (err)
+ goto out;
+
+ ntfs_clear_mft_tail(sbi, sbi->mft.used, new_mft_total);
+
+ err = _ni_write_inode(&ni->vfs_inode, 0);
+out:
+ return err;
+}
+
+/*
+ * ntfs_look_free_mft - Look for a free MFT record.
+ */
+int ntfs_look_free_mft(struct ntfs_sb_info *sbi, CLST *rno, bool mft,
+ struct ntfs_inode *ni, struct mft_inode **mi)
+{
+ int err = 0;
+ size_t zbit, zlen, from, to, fr;
+ size_t mft_total;
+ struct MFT_REF ref;
+ struct super_block *sb = sbi->sb;
+ struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+ u32 ir;
+
+ static_assert(sizeof(sbi->mft.reserved_bitmap) * 8 >=
+ MFT_REC_FREE - MFT_REC_RESERVED);
+
+ if (!mft)
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT);
+
+ zlen = wnd_zone_len(wnd);
+
+ /* Always reserve space for MFT. */
+ if (zlen) {
+ if (mft) {
+ zbit = wnd_zone_bit(wnd);
+ *rno = zbit;
+ wnd_zone_set(wnd, zbit + 1, zlen - 1);
+ }
+ goto found;
+ }
+
+ /* No MFT zone. Find the nearest to '0' free MFT. */
+ if (!wnd_find(wnd, 1, MFT_REC_FREE, 0, &zbit)) {
+ /* Resize MFT */
+ mft_total = wnd->nbits;
+
+ err = ntfs_extend_mft(sbi);
+ if (!err) {
+ zbit = mft_total;
+ goto reserve_mft;
+ }
+
+ if (!mft || MFT_REC_FREE == sbi->mft.next_reserved)
+ goto out;
+
+ err = 0;
+
+ /*
+ * Look for free record reserved area [11-16) ==
+ * [MFT_REC_RESERVED, MFT_REC_FREE ) MFT bitmap always
+ * marks it as used.
+ */
+ if (!sbi->mft.reserved_bitmap) {
+ /* Once per session create internal bitmap for 5 bits. */
+ sbi->mft.reserved_bitmap = 0xFF;
+
+ ref.high = 0;
+ for (ir = MFT_REC_RESERVED; ir < MFT_REC_FREE; ir++) {
+ struct inode *i;
+ struct ntfs_inode *ni;
+ struct MFT_REC *mrec;
+
+ ref.low = cpu_to_le32(ir);
+ ref.seq = cpu_to_le16(ir);
+
+ i = ntfs_iget5(sb, &ref, NULL);
+ if (IS_ERR(i)) {
+next:
+ ntfs_notice(
+ sb,
+ "Invalid reserved record %x",
+ ref.low);
+ continue;
+ }
+ if (is_bad_inode(i)) {
+ iput(i);
+ goto next;
+ }
+
+ ni = ntfs_i(i);
+
+ mrec = ni->mi.mrec;
+
+ if (!is_rec_base(mrec))
+ goto next;
+
+ if (mrec->hard_links)
+ goto next;
+
+ if (!ni_std(ni))
+ goto next;
+
+ if (ni_find_attr(ni, NULL, NULL, ATTR_NAME,
+ NULL, 0, NULL, NULL))
+ goto next;
+
+ __clear_bit(ir - MFT_REC_RESERVED,
+ &sbi->mft.reserved_bitmap);
+ }
+ }
+
+ /* Scan 5 bits for zero. Bit 0 == MFT_REC_RESERVED */
+ zbit = find_next_zero_bit(&sbi->mft.reserved_bitmap,
+ MFT_REC_FREE, MFT_REC_RESERVED);
+ if (zbit >= MFT_REC_FREE) {
+ sbi->mft.next_reserved = MFT_REC_FREE;
+ goto out;
+ }
+
+ zlen = 1;
+ sbi->mft.next_reserved = zbit;
+ } else {
+reserve_mft:
+ zlen = zbit == MFT_REC_FREE ? (MFT_REC_USER - MFT_REC_FREE) : 4;
+ if (zbit + zlen > wnd->nbits)
+ zlen = wnd->nbits - zbit;
+
+ while (zlen > 1 && !wnd_is_free(wnd, zbit, zlen))
+ zlen -= 1;
+
+ /* [zbit, zbit + zlen) will be used for MFT itself. */
+ from = sbi->mft.used;
+ if (from < zbit)
+ from = zbit;
+ to = zbit + zlen;
+ if (from < to) {
+ ntfs_clear_mft_tail(sbi, from, to);
+ sbi->mft.used = to;
+ }
+ }
+
+ if (mft) {
+ *rno = zbit;
+ zbit += 1;
+ zlen -= 1;
+ }
+
+ wnd_zone_set(wnd, zbit, zlen);
+
+found:
+ if (!mft) {
+ /* The request to get record for general purpose. */
+ if (sbi->mft.next_free < MFT_REC_USER)
+ sbi->mft.next_free = MFT_REC_USER;
+
+ for (;;) {
+ if (sbi->mft.next_free >= sbi->mft.bitmap.nbits) {
+ } else if (!wnd_find(wnd, 1, MFT_REC_USER, 0, &fr)) {
+ sbi->mft.next_free = sbi->mft.bitmap.nbits;
+ } else {
+ *rno = fr;
+ sbi->mft.next_free = *rno + 1;
+ break;
+ }
+
+ err = ntfs_extend_mft(sbi);
+ if (err)
+ goto out;
+ }
+ }
+
+ if (ni && !ni_add_subrecord(ni, *rno, mi)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* We have found a record that are not reserved for next MFT. */
+ if (*rno >= MFT_REC_FREE)
+ wnd_set_used(wnd, *rno, 1);
+ else if (*rno >= MFT_REC_RESERVED && sbi->mft.reserved_bitmap_inited)
+ __set_bit(*rno - MFT_REC_RESERVED, &sbi->mft.reserved_bitmap);
+
+out:
+ if (!mft)
+ up_write(&wnd->rw_lock);
+
+ return err;
+}
+
+/*
+ * ntfs_mark_rec_free - Mark record as free.
+ */
+void ntfs_mark_rec_free(struct ntfs_sb_info *sbi, CLST rno)
+{
+ struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT);
+ if (rno >= wnd->nbits)
+ goto out;
+
+ if (rno >= MFT_REC_FREE) {
+ if (!wnd_is_used(wnd, rno, 1))
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ else
+ wnd_set_free(wnd, rno, 1);
+ } else if (rno >= MFT_REC_RESERVED && sbi->mft.reserved_bitmap_inited) {
+ __clear_bit(rno - MFT_REC_RESERVED, &sbi->mft.reserved_bitmap);
+ }
+
+ if (rno < wnd_zone_bit(wnd))
+ wnd_zone_set(wnd, rno, 1);
+ else if (rno < sbi->mft.next_free && rno >= MFT_REC_USER)
+ sbi->mft.next_free = rno;
+
+out:
+ up_write(&wnd->rw_lock);
+}
+
+/*
+ * ntfs_clear_mft_tail - Format empty records [from, to).
+ *
+ * sbi->mft.bitmap is locked for write.
+ */
+int ntfs_clear_mft_tail(struct ntfs_sb_info *sbi, size_t from, size_t to)
+{
+ int err;
+ u32 rs;
+ u64 vbo;
+ struct runs_tree *run;
+ struct ntfs_inode *ni;
+
+ if (from >= to)
+ return 0;
+
+ rs = sbi->record_size;
+ ni = sbi->mft.ni;
+ run = &ni->file.run;
+
+ down_read(&ni->file.run_lock);
+ vbo = (u64)from * rs;
+ for (; from < to; from++, vbo += rs) {
+ struct ntfs_buffers nb;
+
+ err = ntfs_get_bh(sbi, run, vbo, rs, &nb);
+ if (err)
+ goto out;
+
+ err = ntfs_write_bh(sbi, &sbi->new_rec->rhdr, &nb, 0);
+ nb_put(&nb);
+ if (err)
+ goto out;
+ }
+
+out:
+ sbi->mft.used = from;
+ up_read(&ni->file.run_lock);
+ return err;
+}
+
+/*
+ * ntfs_refresh_zone - Refresh MFT zone.
+ *
+ * sbi->used.bitmap is locked for rw.
+ * sbi->mft.bitmap is locked for write.
+ * sbi->mft.ni->file.run_lock for write.
+ */
+int ntfs_refresh_zone(struct ntfs_sb_info *sbi)
+{
+ CLST zone_limit, zone_max, lcn, vcn, len;
+ size_t lcn_s, zlen;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+ struct ntfs_inode *ni = sbi->mft.ni;
+
+ /* Do not change anything unless we have non empty MFT zone. */
+ if (wnd_zone_len(wnd))
+ return 0;
+
+ /*
+ * Compute the MFT zone at two steps.
+ * It would be nice if we are able to allocate 1/8 of
+ * total clusters for MFT but not more then 512 MB.
+ */
+ zone_limit = (512 * 1024 * 1024) >> sbi->cluster_bits;
+ zone_max = wnd->nbits >> 3;
+ if (zone_max > zone_limit)
+ zone_max = zone_limit;
+
+ vcn = bytes_to_cluster(sbi,
+ (u64)sbi->mft.bitmap.nbits << sbi->record_bits);
+
+ if (!run_lookup_entry(&ni->file.run, vcn - 1, &lcn, &len, NULL))
+ lcn = SPARSE_LCN;
+
+ /* We should always find Last Lcn for MFT. */
+ if (lcn == SPARSE_LCN)
+ return -EINVAL;
+
+ lcn_s = lcn + 1;
+
+ /* Try to allocate clusters after last MFT run. */
+ zlen = wnd_find(wnd, zone_max, lcn_s, 0, &lcn_s);
+ if (!zlen) {
+ ntfs_notice(sbi->sb, "MftZone: unavailable");
+ return 0;
+ }
+
+ /* Truncate too large zone. */
+ wnd_zone_set(wnd, lcn_s, zlen);
+
+ return 0;
+}
+
+/*
+ * ntfs_update_mftmirr - Update $MFTMirr data.
+ */
+int ntfs_update_mftmirr(struct ntfs_sb_info *sbi, int wait)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ u32 blocksize = sb->s_blocksize;
+ sector_t block1, block2;
+ u32 bytes;
+
+ if (!(sbi->flags & NTFS_FLAGS_MFTMIRR))
+ return 0;
+
+ err = 0;
+ bytes = sbi->mft.recs_mirr << sbi->record_bits;
+ block1 = sbi->mft.lbo >> sb->s_blocksize_bits;
+ block2 = sbi->mft.lbo2 >> sb->s_blocksize_bits;
+
+ for (; bytes >= blocksize; bytes -= blocksize) {
+ struct buffer_head *bh1, *bh2;
+
+ bh1 = sb_bread(sb, block1++);
+ if (!bh1) {
+ err = -EIO;
+ goto out;
+ }
+
+ bh2 = sb_getblk(sb, block2++);
+ if (!bh2) {
+ put_bh(bh1);
+ err = -EIO;
+ goto out;
+ }
+
+ if (buffer_locked(bh2))
+ __wait_on_buffer(bh2);
+
+ lock_buffer(bh2);
+ memcpy(bh2->b_data, bh1->b_data, blocksize);
+ set_buffer_uptodate(bh2);
+ mark_buffer_dirty(bh2);
+ unlock_buffer(bh2);
+
+ put_bh(bh1);
+ bh1 = NULL;
+
+ if (wait)
+ err = sync_dirty_buffer(bh2);
+
+ put_bh(bh2);
+ if (err)
+ goto out;
+ }
+
+ sbi->flags &= ~NTFS_FLAGS_MFTMIRR;
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_set_state
+ *
+ * Mount: ntfs_set_state(NTFS_DIRTY_DIRTY)
+ * Umount: ntfs_set_state(NTFS_DIRTY_CLEAR)
+ * NTFS error: ntfs_set_state(NTFS_DIRTY_ERROR)
+ */
+int ntfs_set_state(struct ntfs_sb_info *sbi, enum NTFS_DIRTY_FLAGS dirty)
+{
+ int err;
+ struct ATTRIB *attr;
+ struct VOLUME_INFO *info;
+ struct mft_inode *mi;
+ struct ntfs_inode *ni;
+
+ /*
+ * Do not change state if fs was real_dirty.
+ * Do not change state if fs already dirty(clear).
+ * Do not change any thing if mounted read only.
+ */
+ if (sbi->volume.real_dirty || sb_rdonly(sbi->sb))
+ return 0;
+
+ /* Check cached value. */
+ if ((dirty == NTFS_DIRTY_CLEAR ? 0 : VOLUME_FLAG_DIRTY) ==
+ (sbi->volume.flags & VOLUME_FLAG_DIRTY))
+ return 0;
+
+ ni = sbi->volume.ni;
+ if (!ni)
+ return -EINVAL;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_DIRTY);
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_VOL_INFO, NULL, 0, NULL, &mi);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ info = resident_data_ex(attr, SIZEOF_ATTRIBUTE_VOLUME_INFO);
+ if (!info) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ switch (dirty) {
+ case NTFS_DIRTY_ERROR:
+ ntfs_notice(sbi->sb, "Mark volume as dirty due to NTFS errors");
+ sbi->volume.real_dirty = true;
+ fallthrough;
+ case NTFS_DIRTY_DIRTY:
+ info->flags |= VOLUME_FLAG_DIRTY;
+ break;
+ case NTFS_DIRTY_CLEAR:
+ info->flags &= ~VOLUME_FLAG_DIRTY;
+ break;
+ }
+ /* Cache current volume flags. */
+ sbi->volume.flags = info->flags;
+ mi->dirty = true;
+ err = 0;
+
+out:
+ ni_unlock(ni);
+ if (err)
+ return err;
+
+ mark_inode_dirty(&ni->vfs_inode);
+ /* verify(!ntfs_update_mftmirr()); */
+
+ /*
+ * If we used wait=1, sync_inode_metadata waits for the io for the
+ * inode to finish. It hangs when media is removed.
+ * So wait=0 is sent down to sync_inode_metadata
+ * and filemap_fdatawrite is used for the data blocks.
+ */
+ err = sync_inode_metadata(&ni->vfs_inode, 0);
+ if (!err)
+ err = filemap_fdatawrite(ni->vfs_inode.i_mapping);
+
+ return err;
+}
+
+/*
+ * security_hash - Calculates a hash of security descriptor.
+ */
+static inline __le32 security_hash(const void *sd, size_t bytes)
+{
+ u32 hash = 0;
+ const __le32 *ptr = sd;
+
+ bytes >>= 2;
+ while (bytes--)
+ hash = ((hash >> 0x1D) | (hash << 3)) + le32_to_cpu(*ptr++);
+ return cpu_to_le32(hash);
+}
+
+int ntfs_sb_read(struct super_block *sb, u64 lbo, size_t bytes, void *buffer)
+{
+ struct block_device *bdev = sb->s_bdev;
+ u32 blocksize = sb->s_blocksize;
+ u64 block = lbo >> sb->s_blocksize_bits;
+ u32 off = lbo & (blocksize - 1);
+ u32 op = blocksize - off;
+
+ for (; bytes; block += 1, off = 0, op = blocksize) {
+ struct buffer_head *bh = __bread(bdev, block, blocksize);
+
+ if (!bh)
+ return -EIO;
+
+ if (op > bytes)
+ op = bytes;
+
+ memcpy(buffer, bh->b_data + off, op);
+
+ put_bh(bh);
+
+ bytes -= op;
+ buffer = Add2Ptr(buffer, op);
+ }
+
+ return 0;
+}
+
+int ntfs_sb_write(struct super_block *sb, u64 lbo, size_t bytes,
+ const void *buf, int wait)
+{
+ u32 blocksize = sb->s_blocksize;
+ struct block_device *bdev = sb->s_bdev;
+ sector_t block = lbo >> sb->s_blocksize_bits;
+ u32 off = lbo & (blocksize - 1);
+ u32 op = blocksize - off;
+ struct buffer_head *bh;
+
+ if (!wait && (sb->s_flags & SB_SYNCHRONOUS))
+ wait = 1;
+
+ for (; bytes; block += 1, off = 0, op = blocksize) {
+ if (op > bytes)
+ op = bytes;
+
+ if (op < blocksize) {
+ bh = __bread(bdev, block, blocksize);
+ if (!bh) {
+ ntfs_err(sb, "failed to read block %llx",
+ (u64)block);
+ return -EIO;
+ }
+ } else {
+ bh = __getblk(bdev, block, blocksize);
+ if (!bh)
+ return -ENOMEM;
+ }
+
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+
+ lock_buffer(bh);
+ if (buf) {
+ memcpy(bh->b_data + off, buf, op);
+ buf = Add2Ptr(buf, op);
+ } else {
+ memset(bh->b_data + off, -1, op);
+ }
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+
+ if (wait) {
+ int err = sync_dirty_buffer(bh);
+
+ if (err) {
+ ntfs_err(
+ sb,
+ "failed to sync buffer at block %llx, error %d",
+ (u64)block, err);
+ put_bh(bh);
+ return err;
+ }
+ }
+
+ put_bh(bh);
+
+ bytes -= op;
+ }
+ return 0;
+}
+
+int ntfs_sb_write_run(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, const void *buf, size_t bytes)
+{
+ struct super_block *sb = sbi->sb;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 off = vbo & sbi->cluster_mask;
+ CLST lcn, clen, vcn = vbo >> cluster_bits, vcn_next;
+ u64 lbo, len;
+ size_t idx;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+ return -ENOENT;
+
+ if (lcn == SPARSE_LCN)
+ return -EINVAL;
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+
+ for (;;) {
+ u32 op = len < bytes ? len : bytes;
+ int err = ntfs_sb_write(sb, lbo, op, buf, 0);
+
+ if (err)
+ return err;
+
+ bytes -= op;
+ if (!bytes)
+ break;
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next)
+ return -ENOENT;
+
+ if (lcn == SPARSE_LCN)
+ return -EINVAL;
+
+ if (buf)
+ buf = Add2Ptr(buf, op);
+
+ lbo = ((u64)lcn << cluster_bits);
+ len = ((u64)clen << cluster_bits);
+ }
+
+ return 0;
+}
+
+struct buffer_head *ntfs_bread_run(struct ntfs_sb_info *sbi,
+ const struct runs_tree *run, u64 vbo)
+{
+ struct super_block *sb = sbi->sb;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST lcn;
+ u64 lbo;
+
+ if (!run_lookup_entry(run, vbo >> cluster_bits, &lcn, NULL, NULL))
+ return ERR_PTR(-ENOENT);
+
+ lbo = ((u64)lcn << cluster_bits) + (vbo & sbi->cluster_mask);
+
+ return ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+}
+
+int ntfs_read_run_nb(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, void *buf, u32 bytes, struct ntfs_buffers *nb)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ u32 blocksize = sb->s_blocksize;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 off = vbo & sbi->cluster_mask;
+ u32 nbh = 0;
+ CLST vcn_next, vcn = vbo >> cluster_bits;
+ CLST lcn, clen;
+ u64 lbo, len;
+ size_t idx;
+ struct buffer_head *bh;
+
+ if (!run) {
+ /* First reading of $Volume + $MFTMirr + $LogFile goes here. */
+ if (vbo > MFT_REC_VOL * sbi->record_size) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ /* Use absolute boot's 'MFTCluster' to read record. */
+ lbo = vbo + sbi->mft.lbo;
+ len = sbi->record_size;
+ } else if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+ err = -ENOENT;
+ goto out;
+ } else {
+ if (lcn == SPARSE_LCN) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+ }
+
+ off = lbo & (blocksize - 1);
+ if (nb) {
+ nb->off = off;
+ nb->bytes = bytes;
+ }
+
+ for (;;) {
+ u32 len32 = len >= bytes ? bytes : len;
+ sector_t block = lbo >> sb->s_blocksize_bits;
+
+ do {
+ u32 op = blocksize - off;
+
+ if (op > len32)
+ op = len32;
+
+ bh = ntfs_bread(sb, block);
+ if (!bh) {
+ err = -EIO;
+ goto out;
+ }
+
+ if (buf) {
+ memcpy(buf, bh->b_data + off, op);
+ buf = Add2Ptr(buf, op);
+ }
+
+ if (!nb) {
+ put_bh(bh);
+ } else if (nbh >= ARRAY_SIZE(nb->bh)) {
+ err = -EINVAL;
+ goto out;
+ } else {
+ nb->bh[nbh++] = bh;
+ nb->nbufs = nbh;
+ }
+
+ bytes -= op;
+ if (!bytes)
+ return 0;
+ len32 -= op;
+ block += 1;
+ off = 0;
+
+ } while (len32);
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (lcn == SPARSE_LCN) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits);
+ len = ((u64)clen << cluster_bits);
+ }
+
+out:
+ if (!nbh)
+ return err;
+
+ while (nbh) {
+ put_bh(nb->bh[--nbh]);
+ nb->bh[nbh] = NULL;
+ }
+
+ nb->nbufs = 0;
+ return err;
+}
+
+/*
+ * ntfs_read_bh
+ *
+ * Return: < 0 if error, 0 if ok, -E_NTFS_FIXUP if need to update fixups.
+ */
+int ntfs_read_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+ struct NTFS_RECORD_HEADER *rhdr, u32 bytes,
+ struct ntfs_buffers *nb)
+{
+ int err = ntfs_read_run_nb(sbi, run, vbo, rhdr, bytes, nb);
+
+ if (err)
+ return err;
+ return ntfs_fix_post_read(rhdr, nb->bytes, true);
+}
+
+int ntfs_get_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+ u32 bytes, struct ntfs_buffers *nb)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ u32 blocksize = sb->s_blocksize;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST vcn_next, vcn = vbo >> cluster_bits;
+ u32 off;
+ u32 nbh = 0;
+ CLST lcn, clen;
+ u64 lbo, len;
+ size_t idx;
+
+ nb->bytes = bytes;
+
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ off = vbo & sbi->cluster_mask;
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+
+ nb->off = off = lbo & (blocksize - 1);
+
+ for (;;) {
+ u32 len32 = len < bytes ? len : bytes;
+ sector_t block = lbo >> sb->s_blocksize_bits;
+
+ do {
+ u32 op;
+ struct buffer_head *bh;
+
+ if (nbh >= ARRAY_SIZE(nb->bh)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ op = blocksize - off;
+ if (op > len32)
+ op = len32;
+
+ if (op == blocksize) {
+ bh = sb_getblk(sb, block);
+ if (!bh) {
+ err = -ENOMEM;
+ goto out;
+ }
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+ set_buffer_uptodate(bh);
+ } else {
+ bh = ntfs_bread(sb, block);
+ if (!bh) {
+ err = -EIO;
+ goto out;
+ }
+ }
+
+ nb->bh[nbh++] = bh;
+ bytes -= op;
+ if (!bytes) {
+ nb->nbufs = nbh;
+ return 0;
+ }
+
+ block += 1;
+ len32 -= op;
+ off = 0;
+ } while (len32);
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ lbo = ((u64)lcn << cluster_bits);
+ len = ((u64)clen << cluster_bits);
+ }
+
+out:
+ while (nbh) {
+ put_bh(nb->bh[--nbh]);
+ nb->bh[nbh] = NULL;
+ }
+
+ nb->nbufs = 0;
+
+ return err;
+}
+
+int ntfs_write_bh(struct ntfs_sb_info *sbi, struct NTFS_RECORD_HEADER *rhdr,
+ struct ntfs_buffers *nb, int sync)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ u32 block_size = sb->s_blocksize;
+ u32 bytes = nb->bytes;
+ u32 off = nb->off;
+ u16 fo = le16_to_cpu(rhdr->fix_off);
+ u16 fn = le16_to_cpu(rhdr->fix_num);
+ u32 idx;
+ __le16 *fixup;
+ __le16 sample;
+
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+ return -EINVAL;
+ }
+
+ for (idx = 0; bytes && idx < nb->nbufs; idx += 1, off = 0) {
+ u32 op = block_size - off;
+ char *bh_data;
+ struct buffer_head *bh = nb->bh[idx];
+ __le16 *ptr, *end_data;
+
+ if (op > bytes)
+ op = bytes;
+
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+
+ lock_buffer(nb->bh[idx]);
+
+ bh_data = bh->b_data + off;
+ end_data = Add2Ptr(bh_data, op);
+ memcpy(bh_data, rhdr, op);
+
+ if (!idx) {
+ u16 t16;
+
+ fixup = Add2Ptr(bh_data, fo);
+ sample = *fixup;
+ t16 = le16_to_cpu(sample);
+ if (t16 >= 0x7FFF) {
+ sample = *fixup = cpu_to_le16(1);
+ } else {
+ sample = cpu_to_le16(t16 + 1);
+ *fixup = sample;
+ }
+
+ *(__le16 *)Add2Ptr(rhdr, fo) = sample;
+ }
+
+ ptr = Add2Ptr(bh_data, SECTOR_SIZE - sizeof(short));
+
+ do {
+ *++fixup = *ptr;
+ *ptr = sample;
+ ptr += SECTOR_SIZE / sizeof(short);
+ } while (ptr < end_data);
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ unlock_buffer(bh);
+
+ if (sync) {
+ int err2 = sync_dirty_buffer(bh);
+
+ if (!err && err2)
+ err = err2;
+ }
+
+ bytes -= op;
+ rhdr = Add2Ptr(rhdr, op);
+ }
+
+ return err;
+}
+
+static inline struct bio *ntfs_alloc_bio(u32 nr_vecs)
+{
+ struct bio *bio = bio_alloc(GFP_NOFS | __GFP_HIGH, nr_vecs);
+
+ if (!bio && (current->flags & PF_MEMALLOC)) {
+ while (!bio && (nr_vecs /= 2))
+ bio = bio_alloc(GFP_NOFS | __GFP_HIGH, nr_vecs);
+ }
+ return bio;
+}
+
+/*
+ * ntfs_bio_pages - Read/write pages from/to disk.
+ */
+int ntfs_bio_pages(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ struct page **pages, u32 nr_pages, u64 vbo, u32 bytes,
+ u32 op)
+{
+ int err = 0;
+ struct bio *new, *bio = NULL;
+ struct super_block *sb = sbi->sb;
+ struct block_device *bdev = sb->s_bdev;
+ struct page *page;
+ u8 cluster_bits = sbi->cluster_bits;
+ CLST lcn, clen, vcn, vcn_next;
+ u32 add, off, page_idx;
+ u64 lbo, len;
+ size_t run_idx;
+ struct blk_plug plug;
+
+ if (!bytes)
+ return 0;
+
+ blk_start_plug(&plug);
+
+ /* Align vbo and bytes to be 512 bytes aligned. */
+ lbo = (vbo + bytes + 511) & ~511ull;
+ vbo = vbo & ~511ull;
+ bytes = lbo - vbo;
+
+ vcn = vbo >> cluster_bits;
+ if (!run_lookup_entry(run, vcn, &lcn, &clen, &run_idx)) {
+ err = -ENOENT;
+ goto out;
+ }
+ off = vbo & sbi->cluster_mask;
+ page_idx = 0;
+ page = pages[0];
+
+ for (;;) {
+ lbo = ((u64)lcn << cluster_bits) + off;
+ len = ((u64)clen << cluster_bits) - off;
+new_bio:
+ new = ntfs_alloc_bio(nr_pages - page_idx);
+ if (!new) {
+ err = -ENOMEM;
+ goto out;
+ }
+ if (bio) {
+ bio_chain(bio, new);
+ submit_bio(bio);
+ }
+ bio = new;
+ bio_set_dev(bio, bdev);
+ bio->bi_iter.bi_sector = lbo >> 9;
+ bio->bi_opf = op;
+
+ while (len) {
+ off = vbo & (PAGE_SIZE - 1);
+ add = off + len > PAGE_SIZE ? (PAGE_SIZE - off) : len;
+
+ if (bio_add_page(bio, page, add, off) < add)
+ goto new_bio;
+
+ if (bytes <= add)
+ goto out;
+ bytes -= add;
+ vbo += add;
+
+ if (add + off == PAGE_SIZE) {
+ page_idx += 1;
+ if (WARN_ON(page_idx >= nr_pages)) {
+ err = -EINVAL;
+ goto out;
+ }
+ page = pages[page_idx];
+ }
+
+ if (len <= add)
+ break;
+ len -= add;
+ lbo += add;
+ }
+
+ vcn_next = vcn + clen;
+ if (!run_get_entry(run, ++run_idx, &vcn, &lcn, &clen) ||
+ vcn != vcn_next) {
+ err = -ENOENT;
+ goto out;
+ }
+ off = 0;
+ }
+out:
+ if (bio) {
+ if (!err)
+ err = submit_bio_wait(bio);
+ bio_put(bio);
+ }
+ blk_finish_plug(&plug);
+
+ return err;
+}
+
+/*
+ * ntfs_bio_fill_1 - Helper for ntfs_loadlog_and_replay().
+ *
+ * Fill on-disk logfile range by (-1)
+ * this means empty logfile.
+ */
+int ntfs_bio_fill_1(struct ntfs_sb_info *sbi, const struct runs_tree *run)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ struct block_device *bdev = sb->s_bdev;
+ u8 cluster_bits = sbi->cluster_bits;
+ struct bio *new, *bio = NULL;
+ CLST lcn, clen;
+ u64 lbo, len;
+ size_t run_idx;
+ struct page *fill;
+ void *kaddr;
+ struct blk_plug plug;
+
+ fill = alloc_page(GFP_KERNEL);
+ if (!fill)
+ return -ENOMEM;
+
+ kaddr = kmap_atomic(fill);
+ memset(kaddr, -1, PAGE_SIZE);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(fill);
+ lock_page(fill);
+
+ if (!run_lookup_entry(run, 0, &lcn, &clen, &run_idx)) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ /*
+ * TODO: Try blkdev_issue_write_same.
+ */
+ blk_start_plug(&plug);
+ do {
+ lbo = (u64)lcn << cluster_bits;
+ len = (u64)clen << cluster_bits;
+new_bio:
+ new = ntfs_alloc_bio(BIO_MAX_VECS);
+ if (!new) {
+ err = -ENOMEM;
+ break;
+ }
+ if (bio) {
+ bio_chain(bio, new);
+ submit_bio(bio);
+ }
+ bio = new;
+ bio_set_dev(bio, bdev);
+ bio->bi_opf = REQ_OP_WRITE;
+ bio->bi_iter.bi_sector = lbo >> 9;
+
+ for (;;) {
+ u32 add = len > PAGE_SIZE ? PAGE_SIZE : len;
+
+ if (bio_add_page(bio, fill, add, 0) < add)
+ goto new_bio;
+
+ lbo += add;
+ if (len <= add)
+ break;
+ len -= add;
+ }
+ } while (run_get_entry(run, ++run_idx, NULL, &lcn, &clen));
+
+ if (bio) {
+ if (!err)
+ err = submit_bio_wait(bio);
+ bio_put(bio);
+ }
+ blk_finish_plug(&plug);
+out:
+ unlock_page(fill);
+ put_page(fill);
+
+ return err;
+}
+
+int ntfs_vbo_to_lbo(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, u64 *lbo, u64 *bytes)
+{
+ u32 off;
+ CLST lcn, len;
+ u8 cluster_bits = sbi->cluster_bits;
+
+ if (!run_lookup_entry(run, vbo >> cluster_bits, &lcn, &len, NULL))
+ return -ENOENT;
+
+ off = vbo & sbi->cluster_mask;
+ *lbo = lcn == SPARSE_LCN ? -1 : (((u64)lcn << cluster_bits) + off);
+ *bytes = ((u64)len << cluster_bits) - off;
+
+ return 0;
+}
+
+struct ntfs_inode *ntfs_new_inode(struct ntfs_sb_info *sbi, CLST rno, bool dir)
+{
+ int err = 0;
+ struct super_block *sb = sbi->sb;
+ struct inode *inode = new_inode(sb);
+ struct ntfs_inode *ni;
+
+ if (!inode)
+ return ERR_PTR(-ENOMEM);
+
+ ni = ntfs_i(inode);
+
+ err = mi_format_new(&ni->mi, sbi, rno, dir ? RECORD_FLAG_DIR : 0,
+ false);
+ if (err)
+ goto out;
+
+ inode->i_ino = rno;
+ if (insert_inode_locked(inode) < 0) {
+ err = -EIO;
+ goto out;
+ }
+
+out:
+ if (err) {
+ iput(inode);
+ ni = ERR_PTR(err);
+ }
+ return ni;
+}
+
+/*
+ * O:BAG:BAD:(A;OICI;FA;;;WD)
+ * Owner S-1-5-32-544 (Administrators)
+ * Group S-1-5-32-544 (Administrators)
+ * ACE: allow S-1-1-0 (Everyone) with FILE_ALL_ACCESS
+ */
+const u8 s_default_security[] __aligned(8) = {
+ 0x01, 0x00, 0x04, 0x80, 0x30, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x02, 0x00, 0x1C, 0x00,
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x03, 0x14, 0x00, 0xFF, 0x01, 0x1F, 0x00,
+ 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00,
+ 0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x20, 0x00, 0x00, 0x00,
+ 0x20, 0x02, 0x00, 0x00, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05,
+ 0x20, 0x00, 0x00, 0x00, 0x20, 0x02, 0x00, 0x00,
+};
+
+static_assert(sizeof(s_default_security) == 0x50);
+
+static inline u32 sid_length(const struct SID *sid)
+{
+ return struct_size(sid, SubAuthority, sid->SubAuthorityCount);
+}
+
+/*
+ * is_acl_valid
+ *
+ * Thanks Mark Harmstone for idea.
+ */
+static bool is_acl_valid(const struct ACL *acl, u32 len)
+{
+ const struct ACE_HEADER *ace;
+ u32 i;
+ u16 ace_count, ace_size;
+
+ if (acl->AclRevision != ACL_REVISION &&
+ acl->AclRevision != ACL_REVISION_DS) {
+ /*
+ * This value should be ACL_REVISION, unless the ACL contains an
+ * object-specific ACE, in which case this value must be ACL_REVISION_DS.
+ * All ACEs in an ACL must be at the same revision level.
+ */
+ return false;
+ }
+
+ if (acl->Sbz1)
+ return false;
+
+ if (le16_to_cpu(acl->AclSize) > len)
+ return false;
+
+ if (acl->Sbz2)
+ return false;
+
+ len -= sizeof(struct ACL);
+ ace = (struct ACE_HEADER *)&acl[1];
+ ace_count = le16_to_cpu(acl->AceCount);
+
+ for (i = 0; i < ace_count; i++) {
+ if (len < sizeof(struct ACE_HEADER))
+ return false;
+
+ ace_size = le16_to_cpu(ace->AceSize);
+ if (len < ace_size)
+ return false;
+
+ len -= ace_size;
+ ace = Add2Ptr(ace, ace_size);
+ }
+
+ return true;
+}
+
+bool is_sd_valid(const struct SECURITY_DESCRIPTOR_RELATIVE *sd, u32 len)
+{
+ u32 sd_owner, sd_group, sd_sacl, sd_dacl;
+
+ if (len < sizeof(struct SECURITY_DESCRIPTOR_RELATIVE))
+ return false;
+
+ if (sd->Revision != 1)
+ return false;
+
+ if (sd->Sbz1)
+ return false;
+
+ if (!(sd->Control & SE_SELF_RELATIVE))
+ return false;
+
+ sd_owner = le32_to_cpu(sd->Owner);
+ if (sd_owner) {
+ const struct SID *owner = Add2Ptr(sd, sd_owner);
+
+ if (sd_owner + offsetof(struct SID, SubAuthority) > len)
+ return false;
+
+ if (owner->Revision != 1)
+ return false;
+
+ if (sd_owner + sid_length(owner) > len)
+ return false;
+ }
+
+ sd_group = le32_to_cpu(sd->Group);
+ if (sd_group) {
+ const struct SID *group = Add2Ptr(sd, sd_group);
+
+ if (sd_group + offsetof(struct SID, SubAuthority) > len)
+ return false;
+
+ if (group->Revision != 1)
+ return false;
+
+ if (sd_group + sid_length(group) > len)
+ return false;
+ }
+
+ sd_sacl = le32_to_cpu(sd->Sacl);
+ if (sd_sacl) {
+ const struct ACL *sacl = Add2Ptr(sd, sd_sacl);
+
+ if (sd_sacl + sizeof(struct ACL) > len)
+ return false;
+
+ if (!is_acl_valid(sacl, len - sd_sacl))
+ return false;
+ }
+
+ sd_dacl = le32_to_cpu(sd->Dacl);
+ if (sd_dacl) {
+ const struct ACL *dacl = Add2Ptr(sd, sd_dacl);
+
+ if (sd_dacl + sizeof(struct ACL) > len)
+ return false;
+
+ if (!is_acl_valid(dacl, len - sd_dacl))
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * ntfs_security_init - Load and parse $Secure.
+ */
+int ntfs_security_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct super_block *sb = sbi->sb;
+ struct inode *inode;
+ struct ntfs_inode *ni;
+ struct MFT_REF ref;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ u64 sds_size;
+ size_t off;
+ struct NTFS_DE *ne;
+ struct NTFS_DE_SII *sii_e;
+ struct ntfs_fnd *fnd_sii = NULL;
+ const struct INDEX_ROOT *root_sii;
+ const struct INDEX_ROOT *root_sdh;
+ struct ntfs_index *indx_sdh = &sbi->security.index_sdh;
+ struct ntfs_index *indx_sii = &sbi->security.index_sii;
+
+ ref.low = cpu_to_le32(MFT_REC_SECURE);
+ ref.high = 0;
+ ref.seq = cpu_to_le16(MFT_REC_SECURE);
+
+ inode = ntfs_iget5(sb, &ref, &NAME_SECURE);
+ if (IS_ERR(inode)) {
+ err = PTR_ERR(inode);
+ ntfs_err(sb, "Failed to load $Secure.");
+ inode = NULL;
+ goto out;
+ }
+
+ ni = ntfs_i(inode);
+
+ le = NULL;
+
+ attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SDH_NAME,
+ ARRAY_SIZE(SDH_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_sdh = resident_data(attr);
+ if (root_sdh->type != ATTR_ZERO ||
+ root_sdh->rule != NTFS_COLLATION_TYPE_SECURITY_HASH) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx_sdh, sbi, attr, INDEX_MUTEX_SDH);
+ if (err)
+ goto out;
+
+ attr = ni_find_attr(ni, attr, &le, ATTR_ROOT, SII_NAME,
+ ARRAY_SIZE(SII_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_sii = resident_data(attr);
+ if (root_sii->type != ATTR_ZERO ||
+ root_sii->rule != NTFS_COLLATION_TYPE_UINT) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx_sii, sbi, attr, INDEX_MUTEX_SII);
+ if (err)
+ goto out;
+
+ fnd_sii = fnd_get();
+ if (!fnd_sii) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ sds_size = inode->i_size;
+
+ /* Find the last valid Id. */
+ sbi->security.next_id = SECURITY_ID_FIRST;
+ /* Always write new security at the end of bucket. */
+ sbi->security.next_off =
+ ALIGN(sds_size - SecurityDescriptorsBlockSize, 16);
+
+ off = 0;
+ ne = NULL;
+
+ for (;;) {
+ u32 next_id;
+
+ err = indx_find_raw(indx_sii, ni, root_sii, &ne, &off, fnd_sii);
+ if (err || !ne)
+ break;
+
+ sii_e = (struct NTFS_DE_SII *)ne;
+ if (le16_to_cpu(ne->view.data_size) < SIZEOF_SECURITY_HDR)
+ continue;
+
+ next_id = le32_to_cpu(sii_e->sec_id) + 1;
+ if (next_id >= sbi->security.next_id)
+ sbi->security.next_id = next_id;
+ }
+
+ sbi->security.ni = ni;
+ inode = NULL;
+out:
+ iput(inode);
+ fnd_put(fnd_sii);
+
+ return err;
+}
+
+/*
+ * ntfs_get_security_by_id - Read security descriptor by id.
+ */
+int ntfs_get_security_by_id(struct ntfs_sb_info *sbi, __le32 security_id,
+ struct SECURITY_DESCRIPTOR_RELATIVE **sd,
+ size_t *size)
+{
+ int err;
+ int diff;
+ struct ntfs_inode *ni = sbi->security.ni;
+ struct ntfs_index *indx = &sbi->security.index_sii;
+ void *p = NULL;
+ struct NTFS_DE_SII *sii_e;
+ struct ntfs_fnd *fnd_sii;
+ struct SECURITY_HDR d_security;
+ const struct INDEX_ROOT *root_sii;
+ u32 t32;
+
+ *sd = NULL;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_SECURITY);
+
+ fnd_sii = fnd_get();
+ if (!fnd_sii) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ root_sii = indx_get_root(indx, ni, NULL, NULL);
+ if (!root_sii) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Try to find this SECURITY descriptor in SII indexes. */
+ err = indx_find(indx, ni, root_sii, &security_id, sizeof(security_id),
+ NULL, &diff, (struct NTFS_DE **)&sii_e, fnd_sii);
+ if (err)
+ goto out;
+
+ if (diff)
+ goto out;
+
+ t32 = le32_to_cpu(sii_e->sec_hdr.size);
+ if (t32 < SIZEOF_SECURITY_HDR) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (t32 > SIZEOF_SECURITY_HDR + 0x10000) {
+ /* Looks like too big security. 0x10000 - is arbitrary big number. */
+ err = -EFBIG;
+ goto out;
+ }
+
+ *size = t32 - SIZEOF_SECURITY_HDR;
+
+ p = kmalloc(*size, GFP_NOFS);
+ if (!p) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = ntfs_read_run_nb(sbi, &ni->file.run,
+ le64_to_cpu(sii_e->sec_hdr.off), &d_security,
+ sizeof(d_security), NULL);
+ if (err)
+ goto out;
+
+ if (memcmp(&d_security, &sii_e->sec_hdr, SIZEOF_SECURITY_HDR)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = ntfs_read_run_nb(sbi, &ni->file.run,
+ le64_to_cpu(sii_e->sec_hdr.off) +
+ SIZEOF_SECURITY_HDR,
+ p, *size, NULL);
+ if (err)
+ goto out;
+
+ *sd = p;
+ p = NULL;
+
+out:
+ kfree(p);
+ fnd_put(fnd_sii);
+ ni_unlock(ni);
+
+ return err;
+}
+
+/*
+ * ntfs_insert_security - Insert security descriptor into $Secure::SDS.
+ *
+ * SECURITY Descriptor Stream data is organized into chunks of 256K bytes
+ * and it contains a mirror copy of each security descriptor. When writing
+ * to a security descriptor at location X, another copy will be written at
+ * location (X+256K).
+ * When writing a security descriptor that will cross the 256K boundary,
+ * the pointer will be advanced by 256K to skip
+ * over the mirror portion.
+ */
+int ntfs_insert_security(struct ntfs_sb_info *sbi,
+ const struct SECURITY_DESCRIPTOR_RELATIVE *sd,
+ u32 size_sd, __le32 *security_id, bool *inserted)
+{
+ int err, diff;
+ struct ntfs_inode *ni = sbi->security.ni;
+ struct ntfs_index *indx_sdh = &sbi->security.index_sdh;
+ struct ntfs_index *indx_sii = &sbi->security.index_sii;
+ struct NTFS_DE_SDH *e;
+ struct NTFS_DE_SDH sdh_e;
+ struct NTFS_DE_SII sii_e;
+ struct SECURITY_HDR *d_security;
+ u32 new_sec_size = size_sd + SIZEOF_SECURITY_HDR;
+ u32 aligned_sec_size = ALIGN(new_sec_size, 16);
+ struct SECURITY_KEY hash_key;
+ struct ntfs_fnd *fnd_sdh = NULL;
+ const struct INDEX_ROOT *root_sdh;
+ const struct INDEX_ROOT *root_sii;
+ u64 mirr_off, new_sds_size;
+ u32 next, left;
+
+ static_assert((1 << Log2OfSecurityDescriptorsBlockSize) ==
+ SecurityDescriptorsBlockSize);
+
+ hash_key.hash = security_hash(sd, size_sd);
+ hash_key.sec_id = SECURITY_ID_INVALID;
+
+ if (inserted)
+ *inserted = false;
+ *security_id = SECURITY_ID_INVALID;
+
+ /* Allocate a temporal buffer. */
+ d_security = kzalloc(aligned_sec_size, GFP_NOFS);
+ if (!d_security)
+ return -ENOMEM;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_SECURITY);
+
+ fnd_sdh = fnd_get();
+ if (!fnd_sdh) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ root_sdh = indx_get_root(indx_sdh, ni, NULL, NULL);
+ if (!root_sdh) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_sii = indx_get_root(indx_sii, ni, NULL, NULL);
+ if (!root_sii) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Check if such security already exists.
+ * Use "SDH" and hash -> to get the offset in "SDS".
+ */
+ err = indx_find(indx_sdh, ni, root_sdh, &hash_key, sizeof(hash_key),
+ &d_security->key.sec_id, &diff, (struct NTFS_DE **)&e,
+ fnd_sdh);
+ if (err)
+ goto out;
+
+ while (e) {
+ if (le32_to_cpu(e->sec_hdr.size) == new_sec_size) {
+ err = ntfs_read_run_nb(sbi, &ni->file.run,
+ le64_to_cpu(e->sec_hdr.off),
+ d_security, new_sec_size, NULL);
+ if (err)
+ goto out;
+
+ if (le32_to_cpu(d_security->size) == new_sec_size &&
+ d_security->key.hash == hash_key.hash &&
+ !memcmp(d_security + 1, sd, size_sd)) {
+ *security_id = d_security->key.sec_id;
+ /* Such security already exists. */
+ err = 0;
+ goto out;
+ }
+ }
+
+ err = indx_find_sort(indx_sdh, ni, root_sdh,
+ (struct NTFS_DE **)&e, fnd_sdh);
+ if (err)
+ goto out;
+
+ if (!e || e->key.hash != hash_key.hash)
+ break;
+ }
+
+ /* Zero unused space. */
+ next = sbi->security.next_off & (SecurityDescriptorsBlockSize - 1);
+ left = SecurityDescriptorsBlockSize - next;
+
+ /* Zero gap until SecurityDescriptorsBlockSize. */
+ if (left < new_sec_size) {
+ /* Zero "left" bytes from sbi->security.next_off. */
+ sbi->security.next_off += SecurityDescriptorsBlockSize + left;
+ }
+
+ /* Zero tail of previous security. */
+ //used = ni->vfs_inode.i_size & (SecurityDescriptorsBlockSize - 1);
+
+ /*
+ * Example:
+ * 0x40438 == ni->vfs_inode.i_size
+ * 0x00440 == sbi->security.next_off
+ * need to zero [0x438-0x440)
+ * if (next > used) {
+ * u32 tozero = next - used;
+ * zero "tozero" bytes from sbi->security.next_off - tozero
+ */
+
+ /* Format new security descriptor. */
+ d_security->key.hash = hash_key.hash;
+ d_security->key.sec_id = cpu_to_le32(sbi->security.next_id);
+ d_security->off = cpu_to_le64(sbi->security.next_off);
+ d_security->size = cpu_to_le32(new_sec_size);
+ memcpy(d_security + 1, sd, size_sd);
+
+ /* Write main SDS bucket. */
+ err = ntfs_sb_write_run(sbi, &ni->file.run, sbi->security.next_off,
+ d_security, aligned_sec_size);
+
+ if (err)
+ goto out;
+
+ mirr_off = sbi->security.next_off + SecurityDescriptorsBlockSize;
+ new_sds_size = mirr_off + aligned_sec_size;
+
+ if (new_sds_size > ni->vfs_inode.i_size) {
+ err = attr_set_size(ni, ATTR_DATA, SDS_NAME,
+ ARRAY_SIZE(SDS_NAME), &ni->file.run,
+ new_sds_size, &new_sds_size, false, NULL);
+ if (err)
+ goto out;
+ }
+
+ /* Write copy SDS bucket. */
+ err = ntfs_sb_write_run(sbi, &ni->file.run, mirr_off, d_security,
+ aligned_sec_size);
+ if (err)
+ goto out;
+
+ /* Fill SII entry. */
+ sii_e.de.view.data_off =
+ cpu_to_le16(offsetof(struct NTFS_DE_SII, sec_hdr));
+ sii_e.de.view.data_size = cpu_to_le16(SIZEOF_SECURITY_HDR);
+ sii_e.de.view.res = 0;
+ sii_e.de.size = cpu_to_le16(SIZEOF_SII_DIRENTRY);
+ sii_e.de.key_size = cpu_to_le16(sizeof(d_security->key.sec_id));
+ sii_e.de.flags = 0;
+ sii_e.de.res = 0;
+ sii_e.sec_id = d_security->key.sec_id;
+ memcpy(&sii_e.sec_hdr, d_security, SIZEOF_SECURITY_HDR);
+
+ err = indx_insert_entry(indx_sii, ni, &sii_e.de, NULL, NULL, 0);
+ if (err)
+ goto out;
+
+ /* Fill SDH entry. */
+ sdh_e.de.view.data_off =
+ cpu_to_le16(offsetof(struct NTFS_DE_SDH, sec_hdr));
+ sdh_e.de.view.data_size = cpu_to_le16(SIZEOF_SECURITY_HDR);
+ sdh_e.de.view.res = 0;
+ sdh_e.de.size = cpu_to_le16(SIZEOF_SDH_DIRENTRY);
+ sdh_e.de.key_size = cpu_to_le16(sizeof(sdh_e.key));
+ sdh_e.de.flags = 0;
+ sdh_e.de.res = 0;
+ sdh_e.key.hash = d_security->key.hash;
+ sdh_e.key.sec_id = d_security->key.sec_id;
+ memcpy(&sdh_e.sec_hdr, d_security, SIZEOF_SECURITY_HDR);
+ sdh_e.magic[0] = cpu_to_le16('I');
+ sdh_e.magic[1] = cpu_to_le16('I');
+
+ fnd_clear(fnd_sdh);
+ err = indx_insert_entry(indx_sdh, ni, &sdh_e.de, (void *)(size_t)1,
+ fnd_sdh, 0);
+ if (err)
+ goto out;
+
+ *security_id = d_security->key.sec_id;
+ if (inserted)
+ *inserted = true;
+
+ /* Update Id and offset for next descriptor. */
+ sbi->security.next_id += 1;
+ sbi->security.next_off += aligned_sec_size;
+
+out:
+ fnd_put(fnd_sdh);
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+ kfree(d_security);
+
+ return err;
+}
+
+/*
+ * ntfs_reparse_init - Load and parse $Extend/$Reparse.
+ */
+int ntfs_reparse_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->reparse.ni;
+ struct ntfs_index *indx = &sbi->reparse.index_r;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ const struct INDEX_ROOT *root_r;
+
+ if (!ni)
+ return 0;
+
+ le = NULL;
+ attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SR_NAME,
+ ARRAY_SIZE(SR_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_r = resident_data(attr);
+ if (root_r->type != ATTR_ZERO ||
+ root_r->rule != NTFS_COLLATION_TYPE_UINTS) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx, sbi, attr, INDEX_MUTEX_SR);
+ if (err)
+ goto out;
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_objid_init - Load and parse $Extend/$ObjId.
+ */
+int ntfs_objid_init(struct ntfs_sb_info *sbi)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->objid.ni;
+ struct ntfs_index *indx = &sbi->objid.index_o;
+ struct ATTRIB *attr;
+ struct ATTR_LIST_ENTRY *le;
+ const struct INDEX_ROOT *root;
+
+ if (!ni)
+ return 0;
+
+ le = NULL;
+ attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SO_NAME,
+ ARRAY_SIZE(SO_NAME), NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root = resident_data(attr);
+ if (root->type != ATTR_ZERO ||
+ root->rule != NTFS_COLLATION_TYPE_UINTS) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = indx_init(indx, sbi, attr, INDEX_MUTEX_SO);
+ if (err)
+ goto out;
+
+out:
+ return err;
+}
+
+int ntfs_objid_remove(struct ntfs_sb_info *sbi, struct GUID *guid)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->objid.ni;
+ struct ntfs_index *indx = &sbi->objid.index_o;
+
+ if (!ni)
+ return -EINVAL;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_OBJID);
+
+ err = indx_delete_entry(indx, ni, guid, sizeof(*guid), NULL);
+
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+
+ return err;
+}
+
+int ntfs_insert_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+ const struct MFT_REF *ref)
+{
+ int err;
+ struct ntfs_inode *ni = sbi->reparse.ni;
+ struct ntfs_index *indx = &sbi->reparse.index_r;
+ struct NTFS_DE_R re;
+
+ if (!ni)
+ return -EINVAL;
+
+ memset(&re, 0, sizeof(re));
+
+ re.de.view.data_off = cpu_to_le16(offsetof(struct NTFS_DE_R, zero));
+ re.de.size = cpu_to_le16(sizeof(struct NTFS_DE_R));
+ re.de.key_size = cpu_to_le16(sizeof(re.key));
+
+ re.key.ReparseTag = rtag;
+ memcpy(&re.key.ref, ref, sizeof(*ref));
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_REPARSE);
+
+ err = indx_insert_entry(indx, ni, &re.de, NULL, NULL, 0);
+
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+
+ return err;
+}
+
+int ntfs_remove_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+ const struct MFT_REF *ref)
+{
+ int err, diff;
+ struct ntfs_inode *ni = sbi->reparse.ni;
+ struct ntfs_index *indx = &sbi->reparse.index_r;
+ struct ntfs_fnd *fnd = NULL;
+ struct REPARSE_KEY rkey;
+ struct NTFS_DE_R *re;
+ struct INDEX_ROOT *root_r;
+
+ if (!ni)
+ return -EINVAL;
+
+ rkey.ReparseTag = rtag;
+ rkey.ref = *ref;
+
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_REPARSE);
+
+ if (rtag) {
+ err = indx_delete_entry(indx, ni, &rkey, sizeof(rkey), NULL);
+ goto out1;
+ }
+
+ fnd = fnd_get();
+ if (!fnd) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ root_r = indx_get_root(indx, ni, NULL, NULL);
+ if (!root_r) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* 1 - forces to ignore rkey.ReparseTag when comparing keys. */
+ err = indx_find(indx, ni, root_r, &rkey, sizeof(rkey), (void *)1, &diff,
+ (struct NTFS_DE **)&re, fnd);
+ if (err)
+ goto out;
+
+ if (memcmp(&re->key.ref, ref, sizeof(*ref))) {
+ /* Impossible. Looks like volume corrupt? */
+ goto out;
+ }
+
+ memcpy(&rkey, &re->key, sizeof(rkey));
+
+ fnd_put(fnd);
+ fnd = NULL;
+
+ err = indx_delete_entry(indx, ni, &rkey, sizeof(rkey), NULL);
+ if (err)
+ goto out;
+
+out:
+ fnd_put(fnd);
+
+out1:
+ mark_inode_dirty(&ni->vfs_inode);
+ ni_unlock(ni);
+
+ return err;
+}
+
+static inline void ntfs_unmap_and_discard(struct ntfs_sb_info *sbi, CLST lcn,
+ CLST len)
+{
+ ntfs_unmap_meta(sbi->sb, lcn, len);
+ ntfs_discard(sbi, lcn, len);
+}
+
+void mark_as_free_ex(struct ntfs_sb_info *sbi, CLST lcn, CLST len, bool trim)
+{
+ CLST end, i;
+ struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+ down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+ if (!wnd_is_used(wnd, lcn, len)) {
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
+ end = lcn + len;
+ len = 0;
+ for (i = lcn; i < end; i++) {
+ if (wnd_is_used(wnd, i, 1)) {
+ if (!len)
+ lcn = i;
+ len += 1;
+ continue;
+ }
+
+ if (!len)
+ continue;
+
+ if (trim)
+ ntfs_unmap_and_discard(sbi, lcn, len);
+
+ wnd_set_free(wnd, lcn, len);
+ len = 0;
+ }
+
+ if (!len)
+ goto out;
+ }
+
+ if (trim)
+ ntfs_unmap_and_discard(sbi, lcn, len);
+ wnd_set_free(wnd, lcn, len);
+
+out:
+ up_write(&wnd->rw_lock);
+}
+
+/*
+ * run_deallocate - Deallocate clusters.
+ */
+int run_deallocate(struct ntfs_sb_info *sbi, struct runs_tree *run, bool trim)
+{
+ CLST lcn, len;
+ size_t idx = 0;
+
+ while (run_get_entry(run, idx++, NULL, &lcn, &len)) {
+ if (lcn == SPARSE_LCN)
+ continue;
+
+ mark_as_free_ex(sbi, lcn, len, trim);
+ }
+
+ return 0;
+}
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
new file mode 100644
index 000000000000..0daca9adc54c
--- /dev/null
+++ b/fs/ntfs3/index.c
@@ -0,0 +1,2650 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+static const struct INDEX_NAMES {
+ const __le16 *name;
+ u8 name_len;
+} s_index_names[INDEX_MUTEX_TOTAL] = {
+ { I30_NAME, ARRAY_SIZE(I30_NAME) }, { SII_NAME, ARRAY_SIZE(SII_NAME) },
+ { SDH_NAME, ARRAY_SIZE(SDH_NAME) }, { SO_NAME, ARRAY_SIZE(SO_NAME) },
+ { SQ_NAME, ARRAY_SIZE(SQ_NAME) }, { SR_NAME, ARRAY_SIZE(SR_NAME) },
+};
+
+/*
+ * cmp_fnames - Compare two names in index.
+ *
+ * if l1 != 0
+ * Both names are little endian on-disk ATTR_FILE_NAME structs.
+ * else
+ * key1 - cpu_str, key2 - ATTR_FILE_NAME
+ */
+static int cmp_fnames(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const struct ATTR_FILE_NAME *f2 = key2;
+ const struct ntfs_sb_info *sbi = data;
+ const struct ATTR_FILE_NAME *f1;
+ u16 fsize2;
+ bool both_case;
+
+ if (l2 <= offsetof(struct ATTR_FILE_NAME, name))
+ return -1;
+
+ fsize2 = fname_full_size(f2);
+ if (l2 < fsize2)
+ return -1;
+
+ both_case = f2->type != FILE_NAME_DOS /*&& !sbi->options.nocase*/;
+ if (!l1) {
+ const struct le_str *s2 = (struct le_str *)&f2->name_len;
+
+ /*
+ * If names are equal (case insensitive)
+ * try to compare it case sensitive.
+ */
+ return ntfs_cmp_names_cpu(key1, s2, sbi->upcase, both_case);
+ }
+
+ f1 = key1;
+ return ntfs_cmp_names(f1->name, f1->name_len, f2->name, f2->name_len,
+ sbi->upcase, both_case);
+}
+
+/*
+ * cmp_uint - $SII of $Secure and $Q of Quota
+ */
+static int cmp_uint(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const u32 *k1 = key1;
+ const u32 *k2 = key2;
+
+ if (l2 < sizeof(u32))
+ return -1;
+
+ if (*k1 < *k2)
+ return -1;
+ if (*k1 > *k2)
+ return 1;
+ return 0;
+}
+
+/*
+ * cmp_sdh - $SDH of $Secure
+ */
+static int cmp_sdh(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const struct SECURITY_KEY *k1 = key1;
+ const struct SECURITY_KEY *k2 = key2;
+ u32 t1, t2;
+
+ if (l2 < sizeof(struct SECURITY_KEY))
+ return -1;
+
+ t1 = le32_to_cpu(k1->hash);
+ t2 = le32_to_cpu(k2->hash);
+
+ /* First value is a hash value itself. */
+ if (t1 < t2)
+ return -1;
+ if (t1 > t2)
+ return 1;
+
+ /* Second value is security Id. */
+ if (data) {
+ t1 = le32_to_cpu(k1->sec_id);
+ t2 = le32_to_cpu(k2->sec_id);
+ if (t1 < t2)
+ return -1;
+ if (t1 > t2)
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
+ * cmp_uints - $O of ObjId and "$R" for Reparse.
+ */
+static int cmp_uints(const void *key1, size_t l1, const void *key2, size_t l2,
+ const void *data)
+{
+ const __le32 *k1 = key1;
+ const __le32 *k2 = key2;
+ size_t count;
+
+ if ((size_t)data == 1) {
+ /*
+ * ni_delete_all -> ntfs_remove_reparse ->
+ * delete all with this reference.
+ * k1, k2 - pointers to REPARSE_KEY
+ */
+
+ k1 += 1; // Skip REPARSE_KEY.ReparseTag
+ k2 += 1; // Skip REPARSE_KEY.ReparseTag
+ if (l2 <= sizeof(int))
+ return -1;
+ l2 -= sizeof(int);
+ if (l1 <= sizeof(int))
+ return 1;
+ l1 -= sizeof(int);
+ }
+
+ if (l2 < sizeof(int))
+ return -1;
+
+ for (count = min(l1, l2) >> 2; count > 0; --count, ++k1, ++k2) {
+ u32 t1 = le32_to_cpu(*k1);
+ u32 t2 = le32_to_cpu(*k2);
+
+ if (t1 > t2)
+ return 1;
+ if (t1 < t2)
+ return -1;
+ }
+
+ if (l1 > l2)
+ return 1;
+ if (l1 < l2)
+ return -1;
+
+ return 0;
+}
+
+static inline NTFS_CMP_FUNC get_cmp_func(const struct INDEX_ROOT *root)
+{
+ switch (root->type) {
+ case ATTR_NAME:
+ if (root->rule == NTFS_COLLATION_TYPE_FILENAME)
+ return &cmp_fnames;
+ break;
+ case ATTR_ZERO:
+ switch (root->rule) {
+ case NTFS_COLLATION_TYPE_UINT:
+ return &cmp_uint;
+ case NTFS_COLLATION_TYPE_SECURITY_HASH:
+ return &cmp_sdh;
+ case NTFS_COLLATION_TYPE_UINTS:
+ return &cmp_uints;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+
+ return NULL;
+}
+
+struct bmp_buf {
+ struct ATTRIB *b;
+ struct mft_inode *mi;
+ struct buffer_head *bh;
+ ulong *buf;
+ size_t bit;
+ u32 nbits;
+ u64 new_valid;
+};
+
+static int bmp_buf_get(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit, struct bmp_buf *bbuf)
+{
+ struct ATTRIB *b;
+ size_t data_size, valid_size, vbo, off = bit >> 3;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ CLST vcn = off >> sbi->cluster_bits;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct buffer_head *bh;
+ struct super_block *sb;
+ u32 blocksize;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ bbuf->bh = NULL;
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ &vcn, &bbuf->mi);
+ bbuf->b = b;
+ if (!b)
+ return -EINVAL;
+
+ if (!b->non_res) {
+ data_size = le32_to_cpu(b->res.data_size);
+
+ if (off >= data_size)
+ return -EINVAL;
+
+ bbuf->buf = (ulong *)resident_data(b);
+ bbuf->bit = 0;
+ bbuf->nbits = data_size * 8;
+
+ return 0;
+ }
+
+ data_size = le64_to_cpu(b->nres.data_size);
+ if (WARN_ON(off >= data_size)) {
+ /* Looks like filesystem error. */
+ return -EINVAL;
+ }
+
+ valid_size = le64_to_cpu(b->nres.valid_size);
+
+ bh = ntfs_bread_run(sbi, &indx->bitmap_run, off);
+ if (!bh)
+ return -EIO;
+
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
+
+ bbuf->bh = bh;
+
+ if (buffer_locked(bh))
+ __wait_on_buffer(bh);
+
+ lock_buffer(bh);
+
+ sb = sbi->sb;
+ blocksize = sb->s_blocksize;
+
+ vbo = off & ~(size_t)sbi->block_mask;
+
+ bbuf->new_valid = vbo + blocksize;
+ if (bbuf->new_valid <= valid_size)
+ bbuf->new_valid = 0;
+ else if (bbuf->new_valid > data_size)
+ bbuf->new_valid = data_size;
+
+ if (vbo >= valid_size) {
+ memset(bh->b_data, 0, blocksize);
+ } else if (vbo + blocksize > valid_size) {
+ u32 voff = valid_size & sbi->block_mask;
+
+ memset(bh->b_data + voff, 0, blocksize - voff);
+ }
+
+ bbuf->buf = (ulong *)bh->b_data;
+ bbuf->bit = 8 * (off & ~(size_t)sbi->block_mask);
+ bbuf->nbits = 8 * blocksize;
+
+ return 0;
+}
+
+static void bmp_buf_put(struct bmp_buf *bbuf, bool dirty)
+{
+ struct buffer_head *bh = bbuf->bh;
+ struct ATTRIB *b = bbuf->b;
+
+ if (!bh) {
+ if (b && !b->non_res && dirty)
+ bbuf->mi->dirty = true;
+ return;
+ }
+
+ if (!dirty)
+ goto out;
+
+ if (bbuf->new_valid) {
+ b->nres.valid_size = cpu_to_le64(bbuf->new_valid);
+ bbuf->mi->dirty = true;
+ }
+
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+
+out:
+ unlock_buffer(bh);
+ put_bh(bh);
+}
+
+/*
+ * indx_mark_used - Mark the bit @bit as used.
+ */
+static int indx_mark_used(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit)
+{
+ int err;
+ struct bmp_buf bbuf;
+
+ err = bmp_buf_get(indx, ni, bit, &bbuf);
+ if (err)
+ return err;
+
+ __set_bit(bit - bbuf.bit, bbuf.buf);
+
+ bmp_buf_put(&bbuf, true);
+
+ return 0;
+}
+
+/*
+ * indx_mark_free - Mark the bit @bit as free.
+ */
+static int indx_mark_free(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit)
+{
+ int err;
+ struct bmp_buf bbuf;
+
+ err = bmp_buf_get(indx, ni, bit, &bbuf);
+ if (err)
+ return err;
+
+ __clear_bit(bit - bbuf.bit, bbuf.buf);
+
+ bmp_buf_put(&bbuf, true);
+
+ return 0;
+}
+
+/*
+ * scan_nres_bitmap
+ *
+ * If ntfs_readdir calls this function (indx_used_bit -> scan_nres_bitmap),
+ * inode is shared locked and no ni_lock.
+ * Use rw_semaphore for read/write access to bitmap_run.
+ */
+static int scan_nres_bitmap(struct ntfs_inode *ni, struct ATTRIB *bitmap,
+ struct ntfs_index *indx, size_t from,
+ bool (*fn)(const ulong *buf, u32 bit, u32 bits,
+ size_t *ret),
+ size_t *ret)
+{
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct super_block *sb = sbi->sb;
+ struct runs_tree *run = &indx->bitmap_run;
+ struct rw_semaphore *lock = &indx->run_lock;
+ u32 nbits = sb->s_blocksize * 8;
+ u32 blocksize = sb->s_blocksize;
+ u64 valid_size = le64_to_cpu(bitmap->nres.valid_size);
+ u64 data_size = le64_to_cpu(bitmap->nres.data_size);
+ sector_t eblock = bytes_to_block(sb, data_size);
+ size_t vbo = from >> 3;
+ sector_t blk = (vbo & sbi->cluster_mask) >> sb->s_blocksize_bits;
+ sector_t vblock = vbo >> sb->s_blocksize_bits;
+ sector_t blen, block;
+ CLST lcn, clen, vcn, vcn_next;
+ size_t idx;
+ struct buffer_head *bh;
+ bool ok;
+
+ *ret = MINUS_ONE_T;
+
+ if (vblock >= eblock)
+ return 0;
+
+ from &= nbits - 1;
+ vcn = vbo >> sbi->cluster_bits;
+
+ down_read(lock);
+ ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+ up_read(lock);
+
+next_run:
+ if (!ok) {
+ int err;
+ const struct INDEX_NAMES *name = &s_index_names[indx->type];
+
+ down_write(lock);
+ err = attr_load_runs_vcn(ni, ATTR_BITMAP, name->name,
+ name->name_len, run, vcn);
+ up_write(lock);
+ if (err)
+ return err;
+ down_read(lock);
+ ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+ up_read(lock);
+ if (!ok)
+ return -EINVAL;
+ }
+
+ blen = (sector_t)clen * sbi->blocks_per_cluster;
+ block = (sector_t)lcn * sbi->blocks_per_cluster;
+
+ for (; blk < blen; blk++, from = 0) {
+ bh = ntfs_bread(sb, block + blk);
+ if (!bh)
+ return -EIO;
+
+ vbo = (u64)vblock << sb->s_blocksize_bits;
+ if (vbo >= valid_size) {
+ memset(bh->b_data, 0, blocksize);
+ } else if (vbo + blocksize > valid_size) {
+ u32 voff = valid_size & sbi->block_mask;
+
+ memset(bh->b_data + voff, 0, blocksize - voff);
+ }
+
+ if (vbo + blocksize > data_size)
+ nbits = 8 * (data_size - vbo);
+
+ ok = nbits > from ? (*fn)((ulong *)bh->b_data, from, nbits, ret)
+ : false;
+ put_bh(bh);
+
+ if (ok) {
+ *ret += 8 * vbo;
+ return 0;
+ }
+
+ if (++vblock >= eblock) {
+ *ret = MINUS_ONE_T;
+ return 0;
+ }
+ }
+ blk = 0;
+ vcn_next = vcn + clen;
+ down_read(lock);
+ ok = run_get_entry(run, ++idx, &vcn, &lcn, &clen) && vcn == vcn_next;
+ if (!ok)
+ vcn = vcn_next;
+ up_read(lock);
+ goto next_run;
+}
+
+static bool scan_for_free(const ulong *buf, u32 bit, u32 bits, size_t *ret)
+{
+ size_t pos = find_next_zero_bit(buf, bits, bit);
+
+ if (pos >= bits)
+ return false;
+ *ret = pos;
+ return true;
+}
+
+/*
+ * indx_find_free - Look for free bit.
+ *
+ * Return: -1 if no free bits.
+ */
+static int indx_find_free(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t *bit, struct ATTRIB **bitmap)
+{
+ struct ATTRIB *b;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+ int err;
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ NULL, NULL);
+
+ if (!b)
+ return -ENOENT;
+
+ *bitmap = b;
+ *bit = MINUS_ONE_T;
+
+ if (!b->non_res) {
+ u32 nbits = 8 * le32_to_cpu(b->res.data_size);
+ size_t pos = find_next_zero_bit(resident_data(b), nbits, 0);
+
+ if (pos < nbits)
+ *bit = pos;
+ } else {
+ err = scan_nres_bitmap(ni, b, indx, 0, &scan_for_free, bit);
+
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static bool scan_for_used(const ulong *buf, u32 bit, u32 bits, size_t *ret)
+{
+ size_t pos = find_next_bit(buf, bits, bit);
+
+ if (pos >= bits)
+ return false;
+ *ret = pos;
+ return true;
+}
+
+/*
+ * indx_used_bit - Look for used bit.
+ *
+ * Return: MINUS_ONE_T if no used bits.
+ */
+int indx_used_bit(struct ntfs_index *indx, struct ntfs_inode *ni, size_t *bit)
+{
+ struct ATTRIB *b;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ size_t from = *bit;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+ int err;
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ NULL, NULL);
+
+ if (!b)
+ return -ENOENT;
+
+ *bit = MINUS_ONE_T;
+
+ if (!b->non_res) {
+ u32 nbits = le32_to_cpu(b->res.data_size) * 8;
+ size_t pos = find_next_bit(resident_data(b), nbits, from);
+
+ if (pos < nbits)
+ *bit = pos;
+ } else {
+ err = scan_nres_bitmap(ni, b, indx, from, &scan_for_used, bit);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+/*
+ * hdr_find_split
+ *
+ * Find a point at which the index allocation buffer would like to be split.
+ * NOTE: This function should never return 'END' entry NULL returns on error.
+ */
+static const struct NTFS_DE *hdr_find_split(const struct INDEX_HDR *hdr)
+{
+ size_t o;
+ const struct NTFS_DE *e = hdr_first_de(hdr);
+ u32 used_2 = le32_to_cpu(hdr->used) >> 1;
+ u16 esize;
+
+ if (!e || de_is_last(e))
+ return NULL;
+
+ esize = le16_to_cpu(e->size);
+ for (o = le32_to_cpu(hdr->de_off) + esize; o < used_2; o += esize) {
+ const struct NTFS_DE *p = e;
+
+ e = Add2Ptr(hdr, o);
+
+ /* We must not return END entry. */
+ if (de_is_last(e))
+ return p;
+
+ esize = le16_to_cpu(e->size);
+ }
+
+ return e;
+}
+
+/*
+ * hdr_insert_head - Insert some entries at the beginning of the buffer.
+ *
+ * It is used to insert entries into a newly-created buffer.
+ */
+static const struct NTFS_DE *hdr_insert_head(struct INDEX_HDR *hdr,
+ const void *ins, u32 ins_bytes)
+{
+ u32 to_move;
+ struct NTFS_DE *e = hdr_first_de(hdr);
+ u32 used = le32_to_cpu(hdr->used);
+
+ if (!e)
+ return NULL;
+
+ /* Now we just make room for the inserted entries and jam it in. */
+ to_move = used - le32_to_cpu(hdr->de_off);
+ memmove(Add2Ptr(e, ins_bytes), e, to_move);
+ memcpy(e, ins, ins_bytes);
+ hdr->used = cpu_to_le32(used + ins_bytes);
+
+ return e;
+}
+
+void fnd_clear(struct ntfs_fnd *fnd)
+{
+ int i;
+
+ for (i = 0; i < fnd->level; i++) {
+ struct indx_node *n = fnd->nodes[i];
+
+ if (!n)
+ continue;
+
+ put_indx_node(n);
+ fnd->nodes[i] = NULL;
+ }
+ fnd->level = 0;
+ fnd->root_de = NULL;
+}
+
+static int fnd_push(struct ntfs_fnd *fnd, struct indx_node *n,
+ struct NTFS_DE *e)
+{
+ int i;
+
+ i = fnd->level;
+ if (i < 0 || i >= ARRAY_SIZE(fnd->nodes))
+ return -EINVAL;
+ fnd->nodes[i] = n;
+ fnd->de[i] = e;
+ fnd->level += 1;
+ return 0;
+}
+
+static struct indx_node *fnd_pop(struct ntfs_fnd *fnd)
+{
+ struct indx_node *n;
+ int i = fnd->level;
+
+ i -= 1;
+ n = fnd->nodes[i];
+ fnd->nodes[i] = NULL;
+ fnd->level = i;
+
+ return n;
+}
+
+static bool fnd_is_empty(struct ntfs_fnd *fnd)
+{
+ if (!fnd->level)
+ return !fnd->root_de;
+
+ return !fnd->de[fnd->level - 1];
+}
+
+/*
+ * hdr_find_e - Locate an entry the index buffer.
+ *
+ * If no matching entry is found, it returns the first entry which is greater
+ * than the desired entry If the search key is greater than all the entries the
+ * buffer, it returns the 'end' entry. This function does a binary search of the
+ * current index buffer, for the first entry that is <= to the search value.
+ *
+ * Return: NULL if error.
+ */
+static struct NTFS_DE *hdr_find_e(const struct ntfs_index *indx,
+ const struct INDEX_HDR *hdr, const void *key,
+ size_t key_len, const void *ctx, int *diff)
+{
+ struct NTFS_DE *e;
+ NTFS_CMP_FUNC cmp = indx->cmp;
+ u32 e_size, e_key_len;
+ u32 end = le32_to_cpu(hdr->used);
+ u32 off = le32_to_cpu(hdr->de_off);
+
+#ifdef NTFS3_INDEX_BINARY_SEARCH
+ int max_idx = 0, fnd, min_idx;
+ int nslots = 64;
+ u16 *offs;
+
+ if (end > 0x10000)
+ goto next;
+
+ offs = kmalloc(sizeof(u16) * nslots, GFP_NOFS);
+ if (!offs)
+ goto next;
+
+ /* Use binary search algorithm. */
+next1:
+ if (off + sizeof(struct NTFS_DE) > end) {
+ e = NULL;
+ goto out1;
+ }
+ e = Add2Ptr(hdr, off);
+ e_size = le16_to_cpu(e->size);
+
+ if (e_size < sizeof(struct NTFS_DE) || off + e_size > end) {
+ e = NULL;
+ goto out1;
+ }
+
+ if (max_idx >= nslots) {
+ u16 *ptr;
+ int new_slots = ALIGN(2 * nslots, 8);
+
+ ptr = kmalloc(sizeof(u16) * new_slots, GFP_NOFS);
+ if (ptr)
+ memcpy(ptr, offs, sizeof(u16) * max_idx);
+ kfree(offs);
+ offs = ptr;
+ nslots = new_slots;
+ if (!ptr)
+ goto next;
+ }
+
+ /* Store entry table. */
+ offs[max_idx] = off;
+
+ if (!de_is_last(e)) {
+ off += e_size;
+ max_idx += 1;
+ goto next1;
+ }
+
+ /*
+ * Table of pointers is created.
+ * Use binary search to find entry that is <= to the search value.
+ */
+ fnd = -1;
+ min_idx = 0;
+
+ while (min_idx <= max_idx) {
+ int mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+ int diff2;
+
+ e = Add2Ptr(hdr, offs[mid_idx]);
+
+ e_key_len = le16_to_cpu(e->key_size);
+
+ diff2 = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
+
+ if (!diff2) {
+ *diff = 0;
+ goto out1;
+ }
+
+ if (diff2 < 0) {
+ max_idx = mid_idx - 1;
+ fnd = mid_idx;
+ if (!fnd)
+ break;
+ } else {
+ min_idx = mid_idx + 1;
+ }
+ }
+
+ if (fnd == -1) {
+ e = NULL;
+ goto out1;
+ }
+
+ *diff = -1;
+ e = Add2Ptr(hdr, offs[fnd]);
+
+out1:
+ kfree(offs);
+
+ return e;
+#endif
+
+next:
+ /*
+ * Entries index are sorted.
+ * Enumerate all entries until we find entry
+ * that is <= to the search value.
+ */
+ if (off + sizeof(struct NTFS_DE) > end)
+ return NULL;
+
+ e = Add2Ptr(hdr, off);
+ e_size = le16_to_cpu(e->size);
+
+ if (e_size < sizeof(struct NTFS_DE) || off + e_size > end)
+ return NULL;
+
+ off += e_size;
+
+ e_key_len = le16_to_cpu(e->key_size);
+
+ *diff = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
+ if (!*diff)
+ return e;
+
+ if (*diff <= 0)
+ return e;
+
+ if (de_is_last(e)) {
+ *diff = 1;
+ return e;
+ }
+ goto next;
+}
+
+/*
+ * hdr_insert_de - Insert an index entry into the buffer.
+ *
+ * 'before' should be a pointer previously returned from hdr_find_e.
+ */
+static struct NTFS_DE *hdr_insert_de(const struct ntfs_index *indx,
+ struct INDEX_HDR *hdr,
+ const struct NTFS_DE *de,
+ struct NTFS_DE *before, const void *ctx)
+{
+ int diff;
+ size_t off = PtrOffset(hdr, before);
+ u32 used = le32_to_cpu(hdr->used);
+ u32 total = le32_to_cpu(hdr->total);
+ u16 de_size = le16_to_cpu(de->size);
+
+ /* First, check to see if there's enough room. */
+ if (used + de_size > total)
+ return NULL;
+
+ /* We know there's enough space, so we know we'll succeed. */
+ if (before) {
+ /* Check that before is inside Index. */
+ if (off >= used || off < le32_to_cpu(hdr->de_off) ||
+ off + le16_to_cpu(before->size) > total) {
+ return NULL;
+ }
+ goto ok;
+ }
+ /* No insert point is applied. Get it manually. */
+ before = hdr_find_e(indx, hdr, de + 1, le16_to_cpu(de->key_size), ctx,
+ &diff);
+ if (!before)
+ return NULL;
+ off = PtrOffset(hdr, before);
+
+ok:
+ /* Now we just make room for the entry and jam it in. */
+ memmove(Add2Ptr(before, de_size), before, used - off);
+
+ hdr->used = cpu_to_le32(used + de_size);
+ memcpy(before, de, de_size);
+
+ return before;
+}
+
+/*
+ * hdr_delete_de - Remove an entry from the index buffer.
+ */
+static inline struct NTFS_DE *hdr_delete_de(struct INDEX_HDR *hdr,
+ struct NTFS_DE *re)
+{
+ u32 used = le32_to_cpu(hdr->used);
+ u16 esize = le16_to_cpu(re->size);
+ u32 off = PtrOffset(hdr, re);
+ int bytes = used - (off + esize);
+
+ if (off >= used || esize < sizeof(struct NTFS_DE) ||
+ bytes < sizeof(struct NTFS_DE))
+ return NULL;
+
+ hdr->used = cpu_to_le32(used - esize);
+ memmove(re, Add2Ptr(re, esize), bytes);
+
+ return re;
+}
+
+void indx_clear(struct ntfs_index *indx)
+{
+ run_close(&indx->alloc_run);
+ run_close(&indx->bitmap_run);
+}
+
+int indx_init(struct ntfs_index *indx, struct ntfs_sb_info *sbi,
+ const struct ATTRIB *attr, enum index_mutex_classed type)
+{
+ u32 t32;
+ const struct INDEX_ROOT *root = resident_data(attr);
+
+ /* Check root fields. */
+ if (!root->index_block_clst)
+ return -EINVAL;
+
+ indx->type = type;
+ indx->idx2vbn_bits = __ffs(root->index_block_clst);
+
+ t32 = le32_to_cpu(root->index_block_size);
+ indx->index_bits = blksize_bits(t32);
+
+ /* Check index record size. */
+ if (t32 < sbi->cluster_size) {
+ /* Index record is smaller than a cluster, use 512 blocks. */
+ if (t32 != root->index_block_clst * SECTOR_SIZE)
+ return -EINVAL;
+
+ /* Check alignment to a cluster. */
+ if ((sbi->cluster_size >> SECTOR_SHIFT) &
+ (root->index_block_clst - 1)) {
+ return -EINVAL;
+ }
+
+ indx->vbn2vbo_bits = SECTOR_SHIFT;
+ } else {
+ /* Index record must be a multiple of cluster size. */
+ if (t32 != root->index_block_clst << sbi->cluster_bits)
+ return -EINVAL;
+
+ indx->vbn2vbo_bits = sbi->cluster_bits;
+ }
+
+ init_rwsem(&indx->run_lock);
+
+ indx->cmp = get_cmp_func(root);
+ return indx->cmp ? 0 : -EINVAL;
+}
+
+static struct indx_node *indx_new(struct ntfs_index *indx,
+ struct ntfs_inode *ni, CLST vbn,
+ const __le64 *sub_vbn)
+{
+ int err;
+ struct NTFS_DE *e;
+ struct indx_node *r;
+ struct INDEX_HDR *hdr;
+ struct INDEX_BUFFER *index;
+ u64 vbo = (u64)vbn << indx->vbn2vbo_bits;
+ u32 bytes = 1u << indx->index_bits;
+ u16 fn;
+ u32 eo;
+
+ r = kzalloc(sizeof(struct indx_node), GFP_NOFS);
+ if (!r)
+ return ERR_PTR(-ENOMEM);
+
+ index = kzalloc(bytes, GFP_NOFS);
+ if (!index) {
+ kfree(r);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ err = ntfs_get_bh(ni->mi.sbi, &indx->alloc_run, vbo, bytes, &r->nb);
+
+ if (err) {
+ kfree(index);
+ kfree(r);
+ return ERR_PTR(err);
+ }
+
+ /* Create header. */
+ index->rhdr.sign = NTFS_INDX_SIGNATURE;
+ index->rhdr.fix_off = cpu_to_le16(sizeof(struct INDEX_BUFFER)); // 0x28
+ fn = (bytes >> SECTOR_SHIFT) + 1; // 9
+ index->rhdr.fix_num = cpu_to_le16(fn);
+ index->vbn = cpu_to_le64(vbn);
+ hdr = &index->ihdr;
+ eo = ALIGN(sizeof(struct INDEX_BUFFER) + fn * sizeof(short), 8);
+ hdr->de_off = cpu_to_le32(eo);
+
+ e = Add2Ptr(hdr, eo);
+
+ if (sub_vbn) {
+ e->flags = NTFS_IE_LAST | NTFS_IE_HAS_SUBNODES;
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE) + sizeof(u64));
+ hdr->used =
+ cpu_to_le32(eo + sizeof(struct NTFS_DE) + sizeof(u64));
+ de_set_vbn_le(e, *sub_vbn);
+ hdr->flags = 1;
+ } else {
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+ hdr->used = cpu_to_le32(eo + sizeof(struct NTFS_DE));
+ e->flags = NTFS_IE_LAST;
+ }
+
+ hdr->total = cpu_to_le32(bytes - offsetof(struct INDEX_BUFFER, ihdr));
+
+ r->index = index;
+ return r;
+}
+
+struct INDEX_ROOT *indx_get_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct ATTRIB **attr, struct mft_inode **mi)
+{
+ struct ATTR_LIST_ENTRY *le = NULL;
+ struct ATTRIB *a;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ a = ni_find_attr(ni, NULL, &le, ATTR_ROOT, in->name, in->name_len, NULL,
+ mi);
+ if (!a)
+ return NULL;
+
+ if (attr)
+ *attr = a;
+
+ return resident_data_ex(a, sizeof(struct INDEX_ROOT));
+}
+
+static int indx_write(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct indx_node *node, int sync)
+{
+ struct INDEX_BUFFER *ib = node->index;
+
+ return ntfs_write_bh(ni->mi.sbi, &ib->rhdr, &node->nb, sync);
+}
+
+/*
+ * indx_read
+ *
+ * If ntfs_readdir calls this function
+ * inode is shared locked and no ni_lock.
+ * Use rw_semaphore for read/write access to alloc_run.
+ */
+int indx_read(struct ntfs_index *indx, struct ntfs_inode *ni, CLST vbn,
+ struct indx_node **node)
+{
+ int err;
+ struct INDEX_BUFFER *ib;
+ struct runs_tree *run = &indx->alloc_run;
+ struct rw_semaphore *lock = &indx->run_lock;
+ u64 vbo = (u64)vbn << indx->vbn2vbo_bits;
+ u32 bytes = 1u << indx->index_bits;
+ struct indx_node *in = *node;
+ const struct INDEX_NAMES *name;
+
+ if (!in) {
+ in = kzalloc(sizeof(struct indx_node), GFP_NOFS);
+ if (!in)
+ return -ENOMEM;
+ } else {
+ nb_put(&in->nb);
+ }
+
+ ib = in->index;
+ if (!ib) {
+ ib = kmalloc(bytes, GFP_NOFS);
+ if (!ib) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ down_read(lock);
+ err = ntfs_read_bh(ni->mi.sbi, run, vbo, &ib->rhdr, bytes, &in->nb);
+ up_read(lock);
+ if (!err)
+ goto ok;
+
+ if (err == -E_NTFS_FIXUP)
+ goto ok;
+
+ if (err != -ENOENT)
+ goto out;
+
+ name = &s_index_names[indx->type];
+ down_write(lock);
+ err = attr_load_runs_range(ni, ATTR_ALLOC, name->name, name->name_len,
+ run, vbo, vbo + bytes);
+ up_write(lock);
+ if (err)
+ goto out;
+
+ down_read(lock);
+ err = ntfs_read_bh(ni->mi.sbi, run, vbo, &ib->rhdr, bytes, &in->nb);
+ up_read(lock);
+ if (err == -E_NTFS_FIXUP)
+ goto ok;
+
+ if (err)
+ goto out;
+
+ok:
+ if (err == -E_NTFS_FIXUP) {
+ ntfs_write_bh(ni->mi.sbi, &ib->rhdr, &in->nb, 0);
+ err = 0;
+ }
+
+ in->index = ib;
+ *node = in;
+
+out:
+ if (ib != in->index)
+ kfree(ib);
+
+ if (*node != in) {
+ nb_put(&in->nb);
+ kfree(in);
+ }
+
+ return err;
+}
+
+/*
+ * indx_find - Scan NTFS directory for given entry.
+ */
+int indx_find(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, const void *key, size_t key_len,
+ const void *ctx, int *diff, struct NTFS_DE **entry,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ struct NTFS_DE *e;
+ const struct INDEX_HDR *hdr;
+ struct indx_node *node;
+
+ if (!root)
+ root = indx_get_root(&ni->dir, ni, NULL, NULL);
+
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ hdr = &root->ihdr;
+
+ /* Check cache. */
+ e = fnd->level ? fnd->de[fnd->level - 1] : fnd->root_de;
+ if (e && !de_is_last(e) &&
+ !(*indx->cmp)(key, key_len, e + 1, le16_to_cpu(e->key_size), ctx)) {
+ *entry = e;
+ *diff = 0;
+ return 0;
+ }
+
+ /* Soft finder reset. */
+ fnd_clear(fnd);
+
+ /* Lookup entry that is <= to the search value. */
+ e = hdr_find_e(indx, hdr, key, key_len, ctx, diff);
+ if (!e)
+ return -EINVAL;
+
+ if (fnd)
+ fnd->root_de = e;
+
+ err = 0;
+
+ for (;;) {
+ node = NULL;
+ if (*diff >= 0 || !de_has_vcn_ex(e)) {
+ *entry = e;
+ goto out;
+ }
+
+ /* Read next level. */
+ err = indx_read(indx, ni, de_get_vbn(e), &node);
+ if (err)
+ goto out;
+
+ /* Lookup entry that is <= to the search value. */
+ e = hdr_find_e(indx, &node->index->ihdr, key, key_len, ctx,
+ diff);
+ if (!e) {
+ err = -EINVAL;
+ put_indx_node(node);
+ goto out;
+ }
+
+ fnd_push(fnd, node, e);
+ }
+
+out:
+ return err;
+}
+
+int indx_find_sort(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ struct indx_node *n = NULL;
+ struct NTFS_DE *e;
+ size_t iter = 0;
+ int level = fnd->level;
+
+ if (!*entry) {
+ /* Start find. */
+ e = hdr_first_de(&root->ihdr);
+ if (!e)
+ return 0;
+ fnd_clear(fnd);
+ fnd->root_de = e;
+ } else if (!level) {
+ if (de_is_last(fnd->root_de)) {
+ *entry = NULL;
+ return 0;
+ }
+
+ e = hdr_next_de(&root->ihdr, fnd->root_de);
+ if (!e)
+ return -EINVAL;
+ fnd->root_de = e;
+ } else {
+ n = fnd->nodes[level - 1];
+ e = fnd->de[level - 1];
+
+ if (de_is_last(e))
+ goto pop_level;
+
+ e = hdr_next_de(&n->index->ihdr, e);
+ if (!e)
+ return -EINVAL;
+
+ fnd->de[level - 1] = e;
+ }
+
+ /* Just to avoid tree cycle. */
+next_iter:
+ if (iter++ >= 1000)
+ return -EINVAL;
+
+ while (de_has_vcn_ex(e)) {
+ if (le16_to_cpu(e->size) <
+ sizeof(struct NTFS_DE) + sizeof(u64)) {
+ if (n) {
+ fnd_pop(fnd);
+ kfree(n);
+ }
+ return -EINVAL;
+ }
+
+ /* Read next level. */
+ err = indx_read(indx, ni, de_get_vbn(e), &n);
+ if (err)
+ return err;
+
+ /* Try next level. */
+ e = hdr_first_de(&n->index->ihdr);
+ if (!e) {
+ kfree(n);
+ return -EINVAL;
+ }
+
+ fnd_push(fnd, n, e);
+ }
+
+ if (le16_to_cpu(e->size) > sizeof(struct NTFS_DE)) {
+ *entry = e;
+ return 0;
+ }
+
+pop_level:
+ for (;;) {
+ if (!de_is_last(e))
+ goto next_iter;
+
+ /* Pop one level. */
+ if (n) {
+ fnd_pop(fnd);
+ kfree(n);
+ }
+
+ level = fnd->level;
+
+ if (level) {
+ n = fnd->nodes[level - 1];
+ e = fnd->de[level - 1];
+ } else if (fnd->root_de) {
+ n = NULL;
+ e = fnd->root_de;
+ fnd->root_de = NULL;
+ } else {
+ *entry = NULL;
+ return 0;
+ }
+
+ if (le16_to_cpu(e->size) > sizeof(struct NTFS_DE)) {
+ *entry = e;
+ if (!fnd->root_de)
+ fnd->root_de = e;
+ return 0;
+ }
+ }
+}
+
+int indx_find_raw(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+ size_t *off, struct ntfs_fnd *fnd)
+{
+ int err;
+ struct indx_node *n = NULL;
+ struct NTFS_DE *e = NULL;
+ struct NTFS_DE *e2;
+ size_t bit;
+ CLST next_used_vbn;
+ CLST next_vbn;
+ u32 record_size = ni->mi.sbi->record_size;
+
+ /* Use non sorted algorithm. */
+ if (!*entry) {
+ /* This is the first call. */
+ e = hdr_first_de(&root->ihdr);
+ if (!e)
+ return 0;
+ fnd_clear(fnd);
+ fnd->root_de = e;
+
+ /* The first call with setup of initial element. */
+ if (*off >= record_size) {
+ next_vbn = (((*off - record_size) >> indx->index_bits))
+ << indx->idx2vbn_bits;
+ /* Jump inside cycle 'for'. */
+ goto next;
+ }
+
+ /* Start enumeration from root. */
+ *off = 0;
+ } else if (!fnd->root_de)
+ return -EINVAL;
+
+ for (;;) {
+ /* Check if current entry can be used. */
+ if (e && le16_to_cpu(e->size) > sizeof(struct NTFS_DE))
+ goto ok;
+
+ if (!fnd->level) {
+ /* Continue to enumerate root. */
+ if (!de_is_last(fnd->root_de)) {
+ e = hdr_next_de(&root->ihdr, fnd->root_de);
+ if (!e)
+ return -EINVAL;
+ fnd->root_de = e;
+ continue;
+ }
+
+ /* Start to enumerate indexes from 0. */
+ next_vbn = 0;
+ } else {
+ /* Continue to enumerate indexes. */
+ e2 = fnd->de[fnd->level - 1];
+
+ n = fnd->nodes[fnd->level - 1];
+
+ if (!de_is_last(e2)) {
+ e = hdr_next_de(&n->index->ihdr, e2);
+ if (!e)
+ return -EINVAL;
+ fnd->de[fnd->level - 1] = e;
+ continue;
+ }
+
+ /* Continue with next index. */
+ next_vbn = le64_to_cpu(n->index->vbn) +
+ root->index_block_clst;
+ }
+
+next:
+ /* Release current index. */
+ if (n) {
+ fnd_pop(fnd);
+ put_indx_node(n);
+ n = NULL;
+ }
+
+ /* Skip all free indexes. */
+ bit = next_vbn >> indx->idx2vbn_bits;
+ err = indx_used_bit(indx, ni, &bit);
+ if (err == -ENOENT || bit == MINUS_ONE_T) {
+ /* No used indexes. */
+ *entry = NULL;
+ return 0;
+ }
+
+ next_used_vbn = bit << indx->idx2vbn_bits;
+
+ /* Read buffer into memory. */
+ err = indx_read(indx, ni, next_used_vbn, &n);
+ if (err)
+ return err;
+
+ e = hdr_first_de(&n->index->ihdr);
+ fnd_push(fnd, n, e);
+ if (!e)
+ return -EINVAL;
+ }
+
+ok:
+ /* Return offset to restore enumerator if necessary. */
+ if (!n) {
+ /* 'e' points in root, */
+ *off = PtrOffset(&root->ihdr, e);
+ } else {
+ /* 'e' points in index, */
+ *off = (le64_to_cpu(n->index->vbn) << indx->vbn2vbo_bits) +
+ record_size + PtrOffset(&n->index->ihdr, e);
+ }
+
+ *entry = e;
+ return 0;
+}
+
+/*
+ * indx_create_allocate - Create "Allocation + Bitmap" attributes.
+ */
+static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+ CLST *vbn)
+{
+ int err = -ENOMEM;
+ struct ntfs_sb_info *sbi = ni->mi.sbi;
+ struct ATTRIB *bitmap;
+ struct ATTRIB *alloc;
+ u32 data_size = 1u << indx->index_bits;
+ u32 alloc_size = ntfs_up_cluster(sbi, data_size);
+ CLST len = alloc_size >> sbi->cluster_bits;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+ CLST alen;
+ struct runs_tree run;
+
+ run_init(&run);
+
+ err = attr_allocate_clusters(sbi, &run, 0, 0, len, NULL, 0, &alen, 0,
+ NULL);
+ if (err)
+ goto out;
+
+ err = ni_insert_nonresident(ni, ATTR_ALLOC, in->name, in->name_len,
+ &run, 0, len, 0, &alloc, NULL);
+ if (err)
+ goto out1;
+
+ alloc->nres.valid_size = alloc->nres.data_size = cpu_to_le64(data_size);
+
+ err = ni_insert_resident(ni, bitmap_size(1), ATTR_BITMAP, in->name,
+ in->name_len, &bitmap, NULL, NULL);
+ if (err)
+ goto out2;
+
+ if (in->name == I30_NAME) {
+ ni->vfs_inode.i_size = data_size;
+ inode_set_bytes(&ni->vfs_inode, alloc_size);
+ }
+
+ memcpy(&indx->alloc_run, &run, sizeof(run));
+
+ *vbn = 0;
+
+ return 0;
+
+out2:
+ mi_remove_attr(NULL, &ni->mi, alloc);
+
+out1:
+ run_deallocate(sbi, &run, false);
+
+out:
+ return err;
+}
+
+/*
+ * indx_add_allocate - Add clusters to index.
+ */
+static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+ CLST *vbn)
+{
+ int err;
+ size_t bit;
+ u64 data_size;
+ u64 bmp_size, bmp_size_v;
+ struct ATTRIB *bmp, *alloc;
+ struct mft_inode *mi;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ err = indx_find_free(indx, ni, &bit, &bmp);
+ if (err)
+ goto out1;
+
+ if (bit != MINUS_ONE_T) {
+ bmp = NULL;
+ } else {
+ if (bmp->non_res) {
+ bmp_size = le64_to_cpu(bmp->nres.data_size);
+ bmp_size_v = le64_to_cpu(bmp->nres.valid_size);
+ } else {
+ bmp_size = bmp_size_v = le32_to_cpu(bmp->res.data_size);
+ }
+
+ bit = bmp_size << 3;
+ }
+
+ data_size = (u64)(bit + 1) << indx->index_bits;
+
+ if (bmp) {
+ /* Increase bitmap. */
+ err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, bitmap_size(bit + 1),
+ NULL, true, NULL);
+ if (err)
+ goto out1;
+ }
+
+ alloc = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, in->name, in->name_len,
+ NULL, &mi);
+ if (!alloc) {
+ err = -EINVAL;
+ if (bmp)
+ goto out2;
+ goto out1;
+ }
+
+ /* Increase allocation. */
+ err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+ &indx->alloc_run, data_size, &data_size, true,
+ NULL);
+ if (err) {
+ if (bmp)
+ goto out2;
+ goto out1;
+ }
+
+ *vbn = bit << indx->idx2vbn_bits;
+
+ return 0;
+
+out2:
+ /* Ops. No space? */
+ attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, bmp_size, &bmp_size_v, false, NULL);
+
+out1:
+ return err;
+}
+
+/*
+ * indx_insert_into_root - Attempt to insert an entry into the index root.
+ *
+ * @undo - True if we undoing previous remove.
+ * If necessary, it will twiddle the index b-tree.
+ */
+static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *new_de,
+ struct NTFS_DE *root_de, const void *ctx,
+ struct ntfs_fnd *fnd, bool undo)
+{
+ int err = 0;
+ struct NTFS_DE *e, *e0, *re;
+ struct mft_inode *mi;
+ struct ATTRIB *attr;
+ struct INDEX_HDR *hdr;
+ struct indx_node *n;
+ CLST new_vbn;
+ __le64 *sub_vbn, t_vbn;
+ u16 new_de_size;
+ u32 hdr_used, hdr_total, asize, to_move;
+ u32 root_size, new_root_size;
+ struct ntfs_sb_info *sbi;
+ int ds_root;
+ struct INDEX_ROOT *root, *a_root;
+
+ /* Get the record this root placed in. */
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root)
+ return -EINVAL;
+
+ /*
+ * Try easy case:
+ * hdr_insert_de will succeed if there's
+ * room the root for the new entry.
+ */
+ hdr = &root->ihdr;
+ sbi = ni->mi.sbi;
+ new_de_size = le16_to_cpu(new_de->size);
+ hdr_used = le32_to_cpu(hdr->used);
+ hdr_total = le32_to_cpu(hdr->total);
+ asize = le32_to_cpu(attr->size);
+ root_size = le32_to_cpu(attr->res.data_size);
+
+ ds_root = new_de_size + hdr_used - hdr_total;
+
+ /* If 'undo' is set then reduce requirements. */
+ if ((undo || asize + ds_root < sbi->max_bytes_per_attr) &&
+ mi_resize_attr(mi, attr, ds_root)) {
+ hdr->total = cpu_to_le32(hdr_total + ds_root);
+ e = hdr_insert_de(indx, hdr, new_de, root_de, ctx);
+ WARN_ON(!e);
+ fnd_clear(fnd);
+ fnd->root_de = e;
+
+ return 0;
+ }
+
+ /* Make a copy of root attribute to restore if error. */
+ a_root = kmemdup(attr, asize, GFP_NOFS);
+ if (!a_root)
+ return -ENOMEM;
+
+ /*
+ * Copy all the non-end entries from
+ * the index root to the new buffer.
+ */
+ to_move = 0;
+ e0 = hdr_first_de(hdr);
+
+ /* Calculate the size to copy. */
+ for (e = e0;; e = hdr_next_de(hdr, e)) {
+ if (!e) {
+ err = -EINVAL;
+ goto out_free_root;
+ }
+
+ if (de_is_last(e))
+ break;
+ to_move += le16_to_cpu(e->size);
+ }
+
+ if (!to_move) {
+ re = NULL;
+ } else {
+ re = kmemdup(e0, to_move, GFP_NOFS);
+ if (!re) {
+ err = -ENOMEM;
+ goto out_free_root;
+ }
+ }
+
+ sub_vbn = NULL;
+ if (de_has_vcn(e)) {
+ t_vbn = de_get_vbn_le(e);
+ sub_vbn = &t_vbn;
+ }
+
+ new_root_size = sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE) +
+ sizeof(u64);
+ ds_root = new_root_size - root_size;
+
+ if (ds_root > 0 && asize + ds_root > sbi->max_bytes_per_attr) {
+ /* Make root external. */
+ err = -EOPNOTSUPP;
+ goto out_free_re;
+ }
+
+ if (ds_root)
+ mi_resize_attr(mi, attr, ds_root);
+
+ /* Fill first entry (vcn will be set later). */
+ e = (struct NTFS_DE *)(root + 1);
+ memset(e, 0, sizeof(struct NTFS_DE));
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE) + sizeof(u64));
+ e->flags = NTFS_IE_HAS_SUBNODES | NTFS_IE_LAST;
+
+ hdr->flags = 1;
+ hdr->used = hdr->total =
+ cpu_to_le32(new_root_size - offsetof(struct INDEX_ROOT, ihdr));
+
+ fnd->root_de = hdr_first_de(hdr);
+ mi->dirty = true;
+
+ /* Create alloc and bitmap attributes (if not). */
+ err = run_is_empty(&indx->alloc_run)
+ ? indx_create_allocate(indx, ni, &new_vbn)
+ : indx_add_allocate(indx, ni, &new_vbn);
+
+ /* Layout of record may be changed, so rescan root. */
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ /* Bug? */
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ err = -EINVAL;
+ goto out_free_re;
+ }
+
+ if (err) {
+ /* Restore root. */
+ if (mi_resize_attr(mi, attr, -ds_root))
+ memcpy(attr, a_root, asize);
+ else {
+ /* Bug? */
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ }
+ goto out_free_re;
+ }
+
+ e = (struct NTFS_DE *)(root + 1);
+ *(__le64 *)(e + 1) = cpu_to_le64(new_vbn);
+ mi->dirty = true;
+
+ /* Now we can create/format the new buffer and copy the entries into. */
+ n = indx_new(indx, ni, new_vbn, sub_vbn);
+ if (IS_ERR(n)) {
+ err = PTR_ERR(n);
+ goto out_free_re;
+ }
+
+ hdr = &n->index->ihdr;
+ hdr_used = le32_to_cpu(hdr->used);
+ hdr_total = le32_to_cpu(hdr->total);
+
+ /* Copy root entries into new buffer. */
+ hdr_insert_head(hdr, re, to_move);
+
+ /* Update bitmap attribute. */
+ indx_mark_used(indx, ni, new_vbn >> indx->idx2vbn_bits);
+
+ /* Check if we can insert new entry new index buffer. */
+ if (hdr_used + new_de_size > hdr_total) {
+ /*
+ * This occurs if MFT record is the same or bigger than index
+ * buffer. Move all root new index and have no space to add
+ * new entry classic case when MFT record is 1K and index
+ * buffer 4K the problem should not occurs.
+ */
+ kfree(re);
+ indx_write(indx, ni, n, 0);
+
+ put_indx_node(n);
+ fnd_clear(fnd);
+ err = indx_insert_entry(indx, ni, new_de, ctx, fnd, undo);
+ goto out_free_root;
+ }
+
+ /*
+ * Now root is a parent for new index buffer.
+ * Insert NewEntry a new buffer.
+ */
+ e = hdr_insert_de(indx, hdr, new_de, NULL, ctx);
+ if (!e) {
+ err = -EINVAL;
+ goto out_put_n;
+ }
+ fnd_push(fnd, n, e);
+
+ /* Just write updates index into disk. */
+ indx_write(indx, ni, n, 0);
+
+ n = NULL;
+
+out_put_n:
+ put_indx_node(n);
+out_free_re:
+ kfree(re);
+out_free_root:
+ kfree(a_root);
+ return err;
+}
+
+/*
+ * indx_insert_into_buffer
+ *
+ * Attempt to insert an entry into an Index Allocation Buffer.
+ * If necessary, it will split the buffer.
+ */
+static int
+indx_insert_into_buffer(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct INDEX_ROOT *root, const struct NTFS_DE *new_de,
+ const void *ctx, int level, struct ntfs_fnd *fnd)
+{
+ int err;
+ const struct NTFS_DE *sp;
+ struct NTFS_DE *e, *de_t, *up_e = NULL;
+ struct indx_node *n2 = NULL;
+ struct indx_node *n1 = fnd->nodes[level];
+ struct INDEX_HDR *hdr1 = &n1->index->ihdr;
+ struct INDEX_HDR *hdr2;
+ u32 to_copy, used;
+ CLST new_vbn;
+ __le64 t_vbn, *sub_vbn;
+ u16 sp_size;
+
+ /* Try the most easy case. */
+ e = fnd->level - 1 == level ? fnd->de[level] : NULL;
+ e = hdr_insert_de(indx, hdr1, new_de, e, ctx);
+ fnd->de[level] = e;
+ if (e) {
+ /* Just write updated index into disk. */
+ indx_write(indx, ni, n1, 0);
+ return 0;
+ }
+
+ /*
+ * No space to insert into buffer. Split it.
+ * To split we:
+ * - Save split point ('cause index buffers will be changed)
+ * - Allocate NewBuffer and copy all entries <= sp into new buffer
+ * - Remove all entries (sp including) from TargetBuffer
+ * - Insert NewEntry into left or right buffer (depending on sp <=>
+ * NewEntry)
+ * - Insert sp into parent buffer (or root)
+ * - Make sp a parent for new buffer
+ */
+ sp = hdr_find_split(hdr1);
+ if (!sp)
+ return -EINVAL;
+
+ sp_size = le16_to_cpu(sp->size);
+ up_e = kmalloc(sp_size + sizeof(u64), GFP_NOFS);
+ if (!up_e)
+ return -ENOMEM;
+ memcpy(up_e, sp, sp_size);
+
+ if (!hdr1->flags) {
+ up_e->flags |= NTFS_IE_HAS_SUBNODES;
+ up_e->size = cpu_to_le16(sp_size + sizeof(u64));
+ sub_vbn = NULL;
+ } else {
+ t_vbn = de_get_vbn_le(up_e);
+ sub_vbn = &t_vbn;
+ }
+
+ /* Allocate on disk a new index allocation buffer. */
+ err = indx_add_allocate(indx, ni, &new_vbn);
+ if (err)
+ goto out;
+
+ /* Allocate and format memory a new index buffer. */
+ n2 = indx_new(indx, ni, new_vbn, sub_vbn);
+ if (IS_ERR(n2)) {
+ err = PTR_ERR(n2);
+ goto out;
+ }
+
+ hdr2 = &n2->index->ihdr;
+
+ /* Make sp a parent for new buffer. */
+ de_set_vbn(up_e, new_vbn);
+
+ /* Copy all the entries <= sp into the new buffer. */
+ de_t = hdr_first_de(hdr1);
+ to_copy = PtrOffset(de_t, sp);
+ hdr_insert_head(hdr2, de_t, to_copy);
+
+ /* Remove all entries (sp including) from hdr1. */
+ used = le32_to_cpu(hdr1->used) - to_copy - sp_size;
+ memmove(de_t, Add2Ptr(sp, sp_size), used - le32_to_cpu(hdr1->de_off));
+ hdr1->used = cpu_to_le32(used);
+
+ /*
+ * Insert new entry into left or right buffer
+ * (depending on sp <=> new_de).
+ */
+ hdr_insert_de(indx,
+ (*indx->cmp)(new_de + 1, le16_to_cpu(new_de->key_size),
+ up_e + 1, le16_to_cpu(up_e->key_size),
+ ctx) < 0
+ ? hdr2
+ : hdr1,
+ new_de, NULL, ctx);
+
+ indx_mark_used(indx, ni, new_vbn >> indx->idx2vbn_bits);
+
+ indx_write(indx, ni, n1, 0);
+ indx_write(indx, ni, n2, 0);
+
+ put_indx_node(n2);
+
+ /*
+ * We've finished splitting everybody, so we are ready to
+ * insert the promoted entry into the parent.
+ */
+ if (!level) {
+ /* Insert in root. */
+ err = indx_insert_into_root(indx, ni, up_e, NULL, ctx, fnd, 0);
+ if (err)
+ goto out;
+ } else {
+ /*
+ * The target buffer's parent is another index buffer.
+ * TODO: Remove recursion.
+ */
+ err = indx_insert_into_buffer(indx, ni, root, up_e, ctx,
+ level - 1, fnd);
+ if (err)
+ goto out;
+ }
+
+out:
+ kfree(up_e);
+
+ return err;
+}
+
+/*
+ * indx_insert_entry - Insert new entry into index.
+ *
+ * @undo - True if we undoing previous remove.
+ */
+int indx_insert_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *new_de, const void *ctx,
+ struct ntfs_fnd *fnd, bool undo)
+{
+ int err;
+ int diff;
+ struct NTFS_DE *e;
+ struct ntfs_fnd *fnd_a = NULL;
+ struct INDEX_ROOT *root;
+
+ if (!fnd) {
+ fnd_a = fnd_get();
+ if (!fnd_a) {
+ err = -ENOMEM;
+ goto out1;
+ }
+ fnd = fnd_a;
+ }
+
+ root = indx_get_root(indx, ni, NULL, NULL);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (fnd_is_empty(fnd)) {
+ /*
+ * Find the spot the tree where we want to
+ * insert the new entry.
+ */
+ err = indx_find(indx, ni, root, new_de + 1,
+ le16_to_cpu(new_de->key_size), ctx, &diff, &e,
+ fnd);
+ if (err)
+ goto out;
+
+ if (!diff) {
+ err = -EEXIST;
+ goto out;
+ }
+ }
+
+ if (!fnd->level) {
+ /*
+ * The root is also a leaf, so we'll insert the
+ * new entry into it.
+ */
+ err = indx_insert_into_root(indx, ni, new_de, fnd->root_de, ctx,
+ fnd, undo);
+ if (err)
+ goto out;
+ } else {
+ /*
+ * Found a leaf buffer, so we'll insert the new entry into it.
+ */
+ err = indx_insert_into_buffer(indx, ni, root, new_de, ctx,
+ fnd->level - 1, fnd);
+ if (err)
+ goto out;
+ }
+
+out:
+ fnd_put(fnd_a);
+out1:
+ return err;
+}
+
+/*
+ * indx_find_buffer - Locate a buffer from the tree.
+ */
+static struct indx_node *indx_find_buffer(struct ntfs_index *indx,
+ struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root,
+ __le64 vbn, struct indx_node *n)
+{
+ int err;
+ const struct NTFS_DE *e;
+ struct indx_node *r;
+ const struct INDEX_HDR *hdr = n ? &n->index->ihdr : &root->ihdr;
+
+ /* Step 1: Scan one level. */
+ for (e = hdr_first_de(hdr);; e = hdr_next_de(hdr, e)) {
+ if (!e)
+ return ERR_PTR(-EINVAL);
+
+ if (de_has_vcn(e) && vbn == de_get_vbn_le(e))
+ return n;
+
+ if (de_is_last(e))
+ break;
+ }
+
+ /* Step2: Do recursion. */
+ e = Add2Ptr(hdr, le32_to_cpu(hdr->de_off));
+ for (;;) {
+ if (de_has_vcn_ex(e)) {
+ err = indx_read(indx, ni, de_get_vbn(e), &n);
+ if (err)
+ return ERR_PTR(err);
+
+ r = indx_find_buffer(indx, ni, root, vbn, n);
+ if (r)
+ return r;
+ }
+
+ if (de_is_last(e))
+ break;
+
+ e = Add2Ptr(e, le16_to_cpu(e->size));
+ }
+
+ return NULL;
+}
+
+/*
+ * indx_shrink - Deallocate unused tail indexes.
+ */
+static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
+ size_t bit)
+{
+ int err = 0;
+ u64 bpb, new_data;
+ size_t nbits;
+ struct ATTRIB *b;
+ struct ATTR_LIST_ENTRY *le = NULL;
+ const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+ b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+ NULL, NULL);
+
+ if (!b)
+ return -ENOENT;
+
+ if (!b->non_res) {
+ unsigned long pos;
+ const unsigned long *bm = resident_data(b);
+
+ nbits = (size_t)le32_to_cpu(b->res.data_size) * 8;
+
+ if (bit >= nbits)
+ return 0;
+
+ pos = find_next_bit(bm, nbits, bit);
+ if (pos < nbits)
+ return 0;
+ } else {
+ size_t used = MINUS_ONE_T;
+
+ nbits = le64_to_cpu(b->nres.data_size) * 8;
+
+ if (bit >= nbits)
+ return 0;
+
+ err = scan_nres_bitmap(ni, b, indx, bit, &scan_for_used, &used);
+ if (err)
+ return err;
+
+ if (used != MINUS_ONE_T)
+ return 0;
+ }
+
+ new_data = (u64)bit << indx->index_bits;
+
+ err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+ &indx->alloc_run, new_data, &new_data, false, NULL);
+ if (err)
+ return err;
+
+ bpb = bitmap_size(bit);
+ if (bpb * 8 == nbits)
+ return 0;
+
+ err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, bpb, &bpb, false, NULL);
+
+ return err;
+}
+
+static int indx_free_children(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *e, bool trim)
+{
+ int err;
+ struct indx_node *n;
+ struct INDEX_HDR *hdr;
+ CLST vbn = de_get_vbn(e);
+ size_t i;
+
+ err = indx_read(indx, ni, vbn, &n);
+ if (err)
+ return err;
+
+ hdr = &n->index->ihdr;
+ /* First, recurse into the children, if any. */
+ if (hdr_has_subnode(hdr)) {
+ for (e = hdr_first_de(hdr); e; e = hdr_next_de(hdr, e)) {
+ indx_free_children(indx, ni, e, false);
+ if (de_is_last(e))
+ break;
+ }
+ }
+
+ put_indx_node(n);
+
+ i = vbn >> indx->idx2vbn_bits;
+ /*
+ * We've gotten rid of the children; add this buffer to the free list.
+ */
+ indx_mark_free(indx, ni, i);
+
+ if (!trim)
+ return 0;
+
+ /*
+ * If there are no used indexes after current free index
+ * then we can truncate allocation and bitmap.
+ * Use bitmap to estimate the case.
+ */
+ indx_shrink(indx, ni, i + 1);
+ return 0;
+}
+
+/*
+ * indx_get_entry_to_replace
+ *
+ * Find a replacement entry for a deleted entry.
+ * Always returns a node entry:
+ * NTFS_IE_HAS_SUBNODES is set the flags and the size includes the sub_vcn.
+ */
+static int indx_get_entry_to_replace(struct ntfs_index *indx,
+ struct ntfs_inode *ni,
+ const struct NTFS_DE *de_next,
+ struct NTFS_DE **de_to_replace,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ int level = -1;
+ CLST vbn;
+ struct NTFS_DE *e, *te, *re;
+ struct indx_node *n;
+ struct INDEX_BUFFER *ib;
+
+ *de_to_replace = NULL;
+
+ /* Find first leaf entry down from de_next. */
+ vbn = de_get_vbn(de_next);
+ for (;;) {
+ n = NULL;
+ err = indx_read(indx, ni, vbn, &n);
+ if (err)
+ goto out;
+
+ e = hdr_first_de(&n->index->ihdr);
+ fnd_push(fnd, n, e);
+
+ if (!de_is_last(e)) {
+ /*
+ * This buffer is non-empty, so its first entry
+ * could be used as the replacement entry.
+ */
+ level = fnd->level - 1;
+ }
+
+ if (!de_has_vcn(e))
+ break;
+
+ /* This buffer is a node. Continue to go down. */
+ vbn = de_get_vbn(e);
+ }
+
+ if (level == -1)
+ goto out;
+
+ n = fnd->nodes[level];
+ te = hdr_first_de(&n->index->ihdr);
+ /* Copy the candidate entry into the replacement entry buffer. */
+ re = kmalloc(le16_to_cpu(te->size) + sizeof(u64), GFP_NOFS);
+ if (!re) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ *de_to_replace = re;
+ memcpy(re, te, le16_to_cpu(te->size));
+
+ if (!de_has_vcn(re)) {
+ /*
+ * The replacement entry we found doesn't have a sub_vcn.
+ * increase its size to hold one.
+ */
+ le16_add_cpu(&re->size, sizeof(u64));
+ re->flags |= NTFS_IE_HAS_SUBNODES;
+ } else {
+ /*
+ * The replacement entry we found was a node entry, which
+ * means that all its child buffers are empty. Return them
+ * to the free pool.
+ */
+ indx_free_children(indx, ni, te, true);
+ }
+
+ /*
+ * Expunge the replacement entry from its former location,
+ * and then write that buffer.
+ */
+ ib = n->index;
+ e = hdr_delete_de(&ib->ihdr, te);
+
+ fnd->de[level] = e;
+ indx_write(indx, ni, n, 0);
+
+ /* Check to see if this action created an empty leaf. */
+ if (ib_is_leaf(ib) && ib_is_empty(ib))
+ return 0;
+
+out:
+ fnd_clear(fnd);
+ return err;
+}
+
+/*
+ * indx_delete_entry - Delete an entry from the index.
+ */
+int indx_delete_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const void *key, u32 key_len, const void *ctx)
+{
+ int err, diff;
+ struct INDEX_ROOT *root;
+ struct INDEX_HDR *hdr;
+ struct ntfs_fnd *fnd, *fnd2;
+ struct INDEX_BUFFER *ib;
+ struct NTFS_DE *e, *re, *next, *prev, *me;
+ struct indx_node *n, *n2d = NULL;
+ __le64 sub_vbn;
+ int level, level2;
+ struct ATTRIB *attr;
+ struct mft_inode *mi;
+ u32 e_size, root_size, new_root_size;
+ size_t trim_bit;
+ const struct INDEX_NAMES *in;
+
+ fnd = fnd_get();
+ if (!fnd) {
+ err = -ENOMEM;
+ goto out2;
+ }
+
+ fnd2 = fnd_get();
+ if (!fnd2) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Locate the entry to remove. */
+ err = indx_find(indx, ni, root, key, key_len, ctx, &diff, &e, fnd);
+ if (err)
+ goto out;
+
+ if (!e || diff) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ level = fnd->level;
+
+ if (level) {
+ n = fnd->nodes[level - 1];
+ e = fnd->de[level - 1];
+ ib = n->index;
+ hdr = &ib->ihdr;
+ } else {
+ hdr = &root->ihdr;
+ e = fnd->root_de;
+ n = NULL;
+ }
+
+ e_size = le16_to_cpu(e->size);
+
+ if (!de_has_vcn_ex(e)) {
+ /* The entry to delete is a leaf, so we can just rip it out. */
+ hdr_delete_de(hdr, e);
+
+ if (!level) {
+ hdr->total = hdr->used;
+
+ /* Shrink resident root attribute. */
+ mi_resize_attr(mi, attr, 0 - e_size);
+ goto out;
+ }
+
+ indx_write(indx, ni, n, 0);
+
+ /*
+ * Check to see if removing that entry made
+ * the leaf empty.
+ */
+ if (ib_is_leaf(ib) && ib_is_empty(ib)) {
+ fnd_pop(fnd);
+ fnd_push(fnd2, n, e);
+ }
+ } else {
+ /*
+ * The entry we wish to delete is a node buffer, so we
+ * have to find a replacement for it.
+ */
+ next = de_get_next(e);
+
+ err = indx_get_entry_to_replace(indx, ni, next, &re, fnd2);
+ if (err)
+ goto out;
+
+ if (re) {
+ de_set_vbn_le(re, de_get_vbn_le(e));
+ hdr_delete_de(hdr, e);
+
+ err = level ? indx_insert_into_buffer(indx, ni, root,
+ re, ctx,
+ fnd->level - 1,
+ fnd)
+ : indx_insert_into_root(indx, ni, re, e,
+ ctx, fnd, 0);
+ kfree(re);
+
+ if (err)
+ goto out;
+ } else {
+ /*
+ * There is no replacement for the current entry.
+ * This means that the subtree rooted at its node
+ * is empty, and can be deleted, which turn means
+ * that the node can just inherit the deleted
+ * entry sub_vcn.
+ */
+ indx_free_children(indx, ni, next, true);
+
+ de_set_vbn_le(next, de_get_vbn_le(e));
+ hdr_delete_de(hdr, e);
+ if (level) {
+ indx_write(indx, ni, n, 0);
+ } else {
+ hdr->total = hdr->used;
+
+ /* Shrink resident root attribute. */
+ mi_resize_attr(mi, attr, 0 - e_size);
+ }
+ }
+ }
+
+ /* Delete a branch of tree. */
+ if (!fnd2 || !fnd2->level)
+ goto out;
+
+ /* Reinit root 'cause it can be changed. */
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ n2d = NULL;
+ sub_vbn = fnd2->nodes[0]->index->vbn;
+ level2 = 0;
+ level = fnd->level;
+
+ hdr = level ? &fnd->nodes[level - 1]->index->ihdr : &root->ihdr;
+
+ /* Scan current level. */
+ for (e = hdr_first_de(hdr);; e = hdr_next_de(hdr, e)) {
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_has_vcn(e) && sub_vbn == de_get_vbn_le(e))
+ break;
+
+ if (de_is_last(e)) {
+ e = NULL;
+ break;
+ }
+ }
+
+ if (!e) {
+ /* Do slow search from root. */
+ struct indx_node *in;
+
+ fnd_clear(fnd);
+
+ in = indx_find_buffer(indx, ni, root, sub_vbn, NULL);
+ if (IS_ERR(in)) {
+ err = PTR_ERR(in);
+ goto out;
+ }
+
+ if (in)
+ fnd_push(fnd, in, NULL);
+ }
+
+ /* Merge fnd2 -> fnd. */
+ for (level = 0; level < fnd2->level; level++) {
+ fnd_push(fnd, fnd2->nodes[level], fnd2->de[level]);
+ fnd2->nodes[level] = NULL;
+ }
+ fnd2->level = 0;
+
+ hdr = NULL;
+ for (level = fnd->level; level; level--) {
+ struct indx_node *in = fnd->nodes[level - 1];
+
+ ib = in->index;
+ if (ib_is_empty(ib)) {
+ sub_vbn = ib->vbn;
+ } else {
+ hdr = &ib->ihdr;
+ n2d = in;
+ level2 = level;
+ break;
+ }
+ }
+
+ if (!hdr)
+ hdr = &root->ihdr;
+
+ e = hdr_first_de(hdr);
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (hdr != &root->ihdr || !de_is_last(e)) {
+ prev = NULL;
+ while (!de_is_last(e)) {
+ if (de_has_vcn(e) && sub_vbn == de_get_vbn_le(e))
+ break;
+ prev = e;
+ e = hdr_next_de(hdr, e);
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ if (sub_vbn != de_get_vbn_le(e)) {
+ /*
+ * Didn't find the parent entry, although this buffer
+ * is the parent trail. Something is corrupt.
+ */
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_is_last(e)) {
+ /*
+ * Since we can't remove the end entry, we'll remove
+ * its predecessor instead. This means we have to
+ * transfer the predecessor's sub_vcn to the end entry.
+ * Note: This index block is not empty, so the
+ * predecessor must exist.
+ */
+ if (!prev) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (de_has_vcn(prev)) {
+ de_set_vbn_le(e, de_get_vbn_le(prev));
+ } else if (de_has_vcn(e)) {
+ le16_sub_cpu(&e->size, sizeof(u64));
+ e->flags &= ~NTFS_IE_HAS_SUBNODES;
+ le32_sub_cpu(&hdr->used, sizeof(u64));
+ }
+ e = prev;
+ }
+
+ /*
+ * Copy the current entry into a temporary buffer (stripping
+ * off its down-pointer, if any) and delete it from the current
+ * buffer or root, as appropriate.
+ */
+ e_size = le16_to_cpu(e->size);
+ me = kmemdup(e, e_size, GFP_NOFS);
+ if (!me) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (de_has_vcn(me)) {
+ me->flags &= ~NTFS_IE_HAS_SUBNODES;
+ le16_sub_cpu(&me->size, sizeof(u64));
+ }
+
+ hdr_delete_de(hdr, e);
+
+ if (hdr == &root->ihdr) {
+ level = 0;
+ hdr->total = hdr->used;
+
+ /* Shrink resident root attribute. */
+ mi_resize_attr(mi, attr, 0 - e_size);
+ } else {
+ indx_write(indx, ni, n2d, 0);
+ level = level2;
+ }
+
+ /* Mark unused buffers as free. */
+ trim_bit = -1;
+ for (; level < fnd->level; level++) {
+ ib = fnd->nodes[level]->index;
+ if (ib_is_empty(ib)) {
+ size_t k = le64_to_cpu(ib->vbn) >>
+ indx->idx2vbn_bits;
+
+ indx_mark_free(indx, ni, k);
+ if (k < trim_bit)
+ trim_bit = k;
+ }
+ }
+
+ fnd_clear(fnd);
+ /*fnd->root_de = NULL;*/
+
+ /*
+ * Re-insert the entry into the tree.
+ * Find the spot the tree where we want to insert the new entry.
+ */
+ err = indx_insert_entry(indx, ni, me, ctx, fnd, 0);
+ kfree(me);
+ if (err)
+ goto out;
+
+ if (trim_bit != -1)
+ indx_shrink(indx, ni, trim_bit);
+ } else {
+ /*
+ * This tree needs to be collapsed down to an empty root.
+ * Recreate the index root as an empty leaf and free all
+ * the bits the index allocation bitmap.
+ */
+ fnd_clear(fnd);
+ fnd_clear(fnd2);
+
+ in = &s_index_names[indx->type];
+
+ err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+ &indx->alloc_run, 0, NULL, false, NULL);
+ err = ni_remove_attr(ni, ATTR_ALLOC, in->name, in->name_len,
+ false, NULL);
+ run_close(&indx->alloc_run);
+
+ err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+ &indx->bitmap_run, 0, NULL, false, NULL);
+ err = ni_remove_attr(ni, ATTR_BITMAP, in->name, in->name_len,
+ false, NULL);
+ run_close(&indx->bitmap_run);
+
+ root = indx_get_root(indx, ni, &attr, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ root_size = le32_to_cpu(attr->res.data_size);
+ new_root_size =
+ sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE);
+
+ if (new_root_size != root_size &&
+ !mi_resize_attr(mi, attr, new_root_size - root_size)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Fill first entry. */
+ e = (struct NTFS_DE *)(root + 1);
+ e->ref.low = 0;
+ e->ref.high = 0;
+ e->ref.seq = 0;
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+ e->flags = NTFS_IE_LAST; // 0x02
+ e->key_size = 0;
+ e->res = 0;
+
+ hdr = &root->ihdr;
+ hdr->flags = 0;
+ hdr->used = hdr->total = cpu_to_le32(
+ new_root_size - offsetof(struct INDEX_ROOT, ihdr));
+ mi->dirty = true;
+ }
+
+out:
+ fnd_put(fnd2);
+out1:
+ fnd_put(fnd);
+out2:
+ return err;
+}
+
+/*
+ * Update duplicated information in directory entry
+ * 'dup' - info from MFT record
+ */
+int indx_update_dup(struct ntfs_inode *ni, struct ntfs_sb_info *sbi,
+ const struct ATTR_FILE_NAME *fname,
+ const struct NTFS_DUP_INFO *dup, int sync)
+{
+ int err, diff;
+ struct NTFS_DE *e = NULL;
+ struct ATTR_FILE_NAME *e_fname;
+ struct ntfs_fnd *fnd;
+ struct INDEX_ROOT *root;
+ struct mft_inode *mi;
+ struct ntfs_index *indx = &ni->dir;
+
+ fnd = fnd_get();
+ if (!fnd)
+ return -ENOMEM;
+
+ root = indx_get_root(indx, ni, NULL, &mi);
+ if (!root) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Find entry in directory. */
+ err = indx_find(indx, ni, root, fname, fname_full_size(fname), sbi,
+ &diff, &e, fnd);
+ if (err)
+ goto out;
+
+ if (!e) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (diff) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ e_fname = (struct ATTR_FILE_NAME *)(e + 1);
+
+ if (!memcmp(&e_fname->dup, dup, sizeof(*dup))) {
+ /*
+ * Nothing to update in index! Try to avoid this call.
+ */
+ goto out;
+ }
+
+ memcpy(&e_fname->dup, dup, sizeof(*dup));
+
+ if (fnd->level) {
+ /* Directory entry in index. */
+ err = indx_write(indx, ni, fnd->nodes[fnd->level - 1], sync);
+ } else {
+ /* Directory entry in directory MFT record. */
+ mi->dirty = true;
+ if (sync)
+ err = mi_write(mi, 1);
+ else
+ mark_inode_dirty(&ni->vfs_inode);
+ }
+
+out:
+ fnd_put(fnd);
+ return err;
+}
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
new file mode 100644
index 000000000000..db2a5a4c38e4
--- /dev/null
+++ b/fs/ntfs3/inode.c
@@ -0,0 +1,1957 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/iversion.h>
+#include <linux/mpage.h>
+#include <linux/namei.h>
+#include <linux/nls.h>
+#include <linux/uio.h>
+#include <linux/writeback.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * ntfs_read_mft - Read record and parses MFT.
+ */
+static struct inode *ntfs_read_mft(struct inode *inode,
+ const struct cpu_str *name,
+ const struct MFT_REF *ref)
+{
+ int err = 0;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ mode_t mode = 0;
+ struct ATTR_STD_INFO5 *std5 = NULL;
+ struct ATTR_LIST_ENTRY *le;
+ struct ATTRIB *attr;
+ bool is_match = false;
+ bool is_root = false;
+ bool is_dir;
+ unsigned long ino = inode->i_ino;
+ u32 rp_fa = 0, asize, t32;
+ u16 roff, rsize, names = 0;
+ const struct ATTR_FILE_NAME *fname = NULL;
+ const struct INDEX_ROOT *root;
+ struct REPARSE_DATA_BUFFER rp; // 0x18 bytes
+ u64 t64;
+ struct MFT_REC *rec;
+ struct runs_tree *run;
+
+ inode->i_op = NULL;
+ /* Setup 'uid' and 'gid' */
+ inode->i_uid = sbi->options.fs_uid;
+ inode->i_gid = sbi->options.fs_gid;
+
+ err = mi_init(&ni->mi, sbi, ino);
+ if (err)
+ goto out;
+
+ if (!sbi->mft.ni && ino == MFT_REC_MFT && !sb->s_root) {
+ t64 = sbi->mft.lbo >> sbi->cluster_bits;
+ t32 = bytes_to_cluster(sbi, MFT_REC_VOL * sbi->record_size);
+ sbi->mft.ni = ni;
+ init_rwsem(&ni->file.run_lock);
+
+ if (!run_add_entry(&ni->file.run, 0, t64, t32, true)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ err = mi_read(&ni->mi, ino == MFT_REC_MFT);
+
+ if (err)
+ goto out;
+
+ rec = ni->mi.mrec;
+
+ if (sbi->flags & NTFS_FLAGS_LOG_REPLAYING) {
+ ;
+ } else if (ref->seq != rec->seq) {
+ err = -EINVAL;
+ ntfs_err(sb, "MFT: r=%lx, expect seq=%x instead of %x!", ino,
+ le16_to_cpu(ref->seq), le16_to_cpu(rec->seq));
+ goto out;
+ } else if (!is_rec_inuse(rec)) {
+ err = -EINVAL;
+ ntfs_err(sb, "Inode r=%x is not in use!", (u32)ino);
+ goto out;
+ }
+
+ if (le32_to_cpu(rec->total) != sbi->record_size) {
+ /* Bad inode? */
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!is_rec_base(rec))
+ goto Ok;
+
+ /* Record should contain $I30 root. */
+ is_dir = rec->flags & RECORD_FLAG_DIR;
+
+ inode->i_generation = le16_to_cpu(rec->seq);
+
+ /* Enumerate all struct Attributes MFT. */
+ le = NULL;
+ attr = NULL;
+
+ /*
+ * To reduce tab pressure use goto instead of
+ * while( (attr = ni_enum_attr_ex(ni, attr, &le, NULL) ))
+ */
+next_attr:
+ run = NULL;
+ err = -EINVAL;
+ attr = ni_enum_attr_ex(ni, attr, &le, NULL);
+ if (!attr)
+ goto end_enum;
+
+ if (le && le->vcn) {
+ /* This is non primary attribute segment. Ignore if not MFT. */
+ if (ino != MFT_REC_MFT || attr->type != ATTR_DATA)
+ goto next_attr;
+
+ run = &ni->file.run;
+ asize = le32_to_cpu(attr->size);
+ goto attr_unpack_run;
+ }
+
+ roff = attr->non_res ? 0 : le16_to_cpu(attr->res.data_off);
+ rsize = attr->non_res ? 0 : le32_to_cpu(attr->res.data_size);
+ asize = le32_to_cpu(attr->size);
+
+ switch (attr->type) {
+ case ATTR_STD:
+ if (attr->non_res ||
+ asize < sizeof(struct ATTR_STD_INFO) + roff ||
+ rsize < sizeof(struct ATTR_STD_INFO))
+ goto out;
+
+ if (std5)
+ goto next_attr;
+
+ std5 = Add2Ptr(attr, roff);
+
+#ifdef STATX_BTIME
+ nt2kernel(std5->cr_time, &ni->i_crtime);
+#endif
+ nt2kernel(std5->a_time, &inode->i_atime);
+ nt2kernel(std5->c_time, &inode->i_ctime);
+ nt2kernel(std5->m_time, &inode->i_mtime);
+
+ ni->std_fa = std5->fa;
+
+ if (asize >= sizeof(struct ATTR_STD_INFO5) + roff &&
+ rsize >= sizeof(struct ATTR_STD_INFO5))
+ ni->std_security_id = std5->security_id;
+ goto next_attr;
+
+ case ATTR_LIST:
+ if (attr->name_len || le || ino == MFT_REC_LOG)
+ goto out;
+
+ err = ntfs_load_attr_list(ni, attr);
+ if (err)
+ goto out;
+
+ le = NULL;
+ attr = NULL;
+ goto next_attr;
+
+ case ATTR_NAME:
+ if (attr->non_res || asize < SIZEOF_ATTRIBUTE_FILENAME + roff ||
+ rsize < SIZEOF_ATTRIBUTE_FILENAME)
+ goto out;
+
+ fname = Add2Ptr(attr, roff);
+ if (fname->type == FILE_NAME_DOS)
+ goto next_attr;
+
+ names += 1;
+ if (name && name->len == fname->name_len &&
+ !ntfs_cmp_names_cpu(name, (struct le_str *)&fname->name_len,
+ NULL, false))
+ is_match = true;
+
+ goto next_attr;
+
+ case ATTR_DATA:
+ if (is_dir) {
+ /* Ignore data attribute in dir record. */
+ goto next_attr;
+ }
+
+ if (ino == MFT_REC_BADCLUST && !attr->non_res)
+ goto next_attr;
+
+ if (attr->name_len &&
+ ((ino != MFT_REC_BADCLUST || !attr->non_res ||
+ attr->name_len != ARRAY_SIZE(BAD_NAME) ||
+ memcmp(attr_name(attr), BAD_NAME, sizeof(BAD_NAME))) &&
+ (ino != MFT_REC_SECURE || !attr->non_res ||
+ attr->name_len != ARRAY_SIZE(SDS_NAME) ||
+ memcmp(attr_name(attr), SDS_NAME, sizeof(SDS_NAME))))) {
+ /* File contains stream attribute. Ignore it. */
+ goto next_attr;
+ }
+
+ if (is_attr_sparsed(attr))
+ ni->std_fa |= FILE_ATTRIBUTE_SPARSE_FILE;
+ else
+ ni->std_fa &= ~FILE_ATTRIBUTE_SPARSE_FILE;
+
+ if (is_attr_compressed(attr))
+ ni->std_fa |= FILE_ATTRIBUTE_COMPRESSED;
+ else
+ ni->std_fa &= ~FILE_ATTRIBUTE_COMPRESSED;
+
+ if (is_attr_encrypted(attr))
+ ni->std_fa |= FILE_ATTRIBUTE_ENCRYPTED;
+ else
+ ni->std_fa &= ~FILE_ATTRIBUTE_ENCRYPTED;
+
+ if (!attr->non_res) {
+ ni->i_valid = inode->i_size = rsize;
+ inode_set_bytes(inode, rsize);
+ t32 = asize;
+ } else {
+ t32 = le16_to_cpu(attr->nres.run_off);
+ }
+
+ mode = S_IFREG | (0777 & sbi->options.fs_fmask_inv);
+
+ if (!attr->non_res) {
+ ni->ni_flags |= NI_FLAG_RESIDENT;
+ goto next_attr;
+ }
+
+ inode_set_bytes(inode, attr_ondisk_size(attr));
+
+ ni->i_valid = le64_to_cpu(attr->nres.valid_size);
+ inode->i_size = le64_to_cpu(attr->nres.data_size);
+ if (!attr->nres.alloc_size)
+ goto next_attr;
+
+ run = ino == MFT_REC_BITMAP ? &sbi->used.bitmap.run
+ : &ni->file.run;
+ break;
+
+ case ATTR_ROOT:
+ if (attr->non_res)
+ goto out;
+
+ root = Add2Ptr(attr, roff);
+ is_root = true;
+
+ if (attr->name_len != ARRAY_SIZE(I30_NAME) ||
+ memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME)))
+ goto next_attr;
+
+ if (root->type != ATTR_NAME ||
+ root->rule != NTFS_COLLATION_TYPE_FILENAME)
+ goto out;
+
+ if (!is_dir)
+ goto next_attr;
+
+ ni->ni_flags |= NI_FLAG_DIR;
+
+ err = indx_init(&ni->dir, sbi, attr, INDEX_MUTEX_I30);
+ if (err)
+ goto out;
+
+ mode = sb->s_root
+ ? (S_IFDIR | (0777 & sbi->options.fs_dmask_inv))
+ : (S_IFDIR | 0777);
+ goto next_attr;
+
+ case ATTR_ALLOC:
+ if (!is_root || attr->name_len != ARRAY_SIZE(I30_NAME) ||
+ memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME)))
+ goto next_attr;
+
+ inode->i_size = le64_to_cpu(attr->nres.data_size);
+ ni->i_valid = le64_to_cpu(attr->nres.valid_size);
+ inode_set_bytes(inode, le64_to_cpu(attr->nres.alloc_size));
+
+ run = &ni->dir.alloc_run;
+ break;
+
+ case ATTR_BITMAP:
+ if (ino == MFT_REC_MFT) {
+ if (!attr->non_res)
+ goto out;
+#ifndef CONFIG_NTFS3_64BIT_CLUSTER
+ /* 0x20000000 = 2^32 / 8 */
+ if (le64_to_cpu(attr->nres.alloc_size) >= 0x20000000)
+ goto out;
+#endif
+ run = &sbi->mft.bitmap.run;
+ break;
+ } else if (is_dir && attr->name_len == ARRAY_SIZE(I30_NAME) &&
+ !memcmp(attr_name(attr), I30_NAME,
+ sizeof(I30_NAME)) &&
+ attr->non_res) {
+ run = &ni->dir.bitmap_run;
+ break;
+ }
+ goto next_attr;
+
+ case ATTR_REPARSE:
+ if (attr->name_len)
+ goto next_attr;
+
+ rp_fa = ni_parse_reparse(ni, attr, &rp);
+ switch (rp_fa) {
+ case REPARSE_LINK:
+ if (!attr->non_res) {
+ inode->i_size = rsize;
+ inode_set_bytes(inode, rsize);
+ t32 = asize;
+ } else {
+ inode->i_size =
+ le64_to_cpu(attr->nres.data_size);
+ t32 = le16_to_cpu(attr->nres.run_off);
+ }
+
+ /* Looks like normal symlink. */
+ ni->i_valid = inode->i_size;
+
+ /* Clear directory bit. */
+ if (ni->ni_flags & NI_FLAG_DIR) {
+ indx_clear(&ni->dir);
+ memset(&ni->dir, 0, sizeof(ni->dir));
+ ni->ni_flags &= ~NI_FLAG_DIR;
+ } else {
+ run_close(&ni->file.run);
+ }
+ mode = S_IFLNK | 0777;
+ is_dir = false;
+ if (attr->non_res) {
+ run = &ni->file.run;
+ goto attr_unpack_run; // Double break.
+ }
+ break;
+
+ case REPARSE_COMPRESSED:
+ break;
+
+ case REPARSE_DEDUPLICATED:
+ break;
+ }
+ goto next_attr;
+
+ case ATTR_EA_INFO:
+ if (!attr->name_len &&
+ resident_data_ex(attr, sizeof(struct EA_INFO))) {
+ ni->ni_flags |= NI_FLAG_EA;
+ /*
+ * ntfs_get_wsl_perm updates inode->i_uid, inode->i_gid, inode->i_mode
+ */
+ inode->i_mode = mode;
+ ntfs_get_wsl_perm(inode);
+ mode = inode->i_mode;
+ }
+ goto next_attr;
+
+ default:
+ goto next_attr;
+ }
+
+attr_unpack_run:
+ roff = le16_to_cpu(attr->nres.run_off);
+
+ t64 = le64_to_cpu(attr->nres.svcn);
+ err = run_unpack_ex(run, sbi, ino, t64, le64_to_cpu(attr->nres.evcn),
+ t64, Add2Ptr(attr, roff), asize - roff);
+ if (err < 0)
+ goto out;
+ err = 0;
+ goto next_attr;
+
+end_enum:
+
+ if (!std5)
+ goto out;
+
+ if (!is_match && name) {
+ /* Reuse rec as buffer for ascii name. */
+ err = -ENOENT;
+ goto out;
+ }
+
+ if (std5->fa & FILE_ATTRIBUTE_READONLY)
+ mode &= ~0222;
+
+ if (!names) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (names != le16_to_cpu(rec->hard_links)) {
+ /* Correct minor error on the fly. Do not mark inode as dirty. */
+ rec->hard_links = cpu_to_le16(names);
+ ni->mi.dirty = true;
+ }
+
+ set_nlink(inode, names);
+
+ if (S_ISDIR(mode)) {
+ ni->std_fa |= FILE_ATTRIBUTE_DIRECTORY;
+
+ /*
+ * Dot and dot-dot should be included in count but was not
+ * included in enumeration.
+ * Usually a hard links to directories are disabled.
+ */
+ inode->i_op = &ntfs_dir_inode_operations;
+ inode->i_fop = &ntfs_dir_operations;
+ ni->i_valid = 0;
+ } else if (S_ISLNK(mode)) {
+ ni->std_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+ inode->i_op = &ntfs_link_inode_operations;
+ inode->i_fop = NULL;
+ inode_nohighmem(inode); // ??
+ } else if (S_ISREG(mode)) {
+ ni->std_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+ inode->i_op = &ntfs_file_inode_operations;
+ inode->i_fop = &ntfs_file_operations;
+ inode->i_mapping->a_ops =
+ is_compressed(ni) ? &ntfs_aops_cmpr : &ntfs_aops;
+ if (ino != MFT_REC_MFT)
+ init_rwsem(&ni->file.run_lock);
+ } else if (S_ISCHR(mode) || S_ISBLK(mode) || S_ISFIFO(mode) ||
+ S_ISSOCK(mode)) {
+ inode->i_op = &ntfs_special_inode_operations;
+ init_special_inode(inode, mode, inode->i_rdev);
+ } else if (fname && fname->home.low == cpu_to_le32(MFT_REC_EXTEND) &&
+ fname->home.seq == cpu_to_le16(MFT_REC_EXTEND)) {
+ /* Records in $Extend are not a files or general directories. */
+ } else {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if ((sbi->options.sys_immutable &&
+ (std5->fa & FILE_ATTRIBUTE_SYSTEM)) &&
+ !S_ISFIFO(mode) && !S_ISSOCK(mode) && !S_ISLNK(mode)) {
+ inode->i_flags |= S_IMMUTABLE;
+ } else {
+ inode->i_flags &= ~S_IMMUTABLE;
+ }
+
+ inode->i_mode = mode;
+ if (!(ni->ni_flags & NI_FLAG_EA)) {
+ /* If no xattr then no security (stored in xattr). */
+ inode->i_flags |= S_NOSEC;
+ }
+
+Ok:
+ if (ino == MFT_REC_MFT && !sb->s_root)
+ sbi->mft.ni = NULL;
+
+ unlock_new_inode(inode);
+
+ return inode;
+
+out:
+ if (ino == MFT_REC_MFT && !sb->s_root)
+ sbi->mft.ni = NULL;
+
+ iget_failed(inode);
+ return ERR_PTR(err);
+}
+
+/*
+ * ntfs_test_inode
+ *
+ * Return: 1 if match.
+ */
+static int ntfs_test_inode(struct inode *inode, void *data)
+{
+ struct MFT_REF *ref = data;
+
+ return ino_get(ref) == inode->i_ino;
+}
+
+static int ntfs_set_inode(struct inode *inode, void *data)
+{
+ const struct MFT_REF *ref = data;
+
+ inode->i_ino = ino_get(ref);
+ return 0;
+}
+
+struct inode *ntfs_iget5(struct super_block *sb, const struct MFT_REF *ref,
+ const struct cpu_str *name)
+{
+ struct inode *inode;
+
+ inode = iget5_locked(sb, ino_get(ref), ntfs_test_inode, ntfs_set_inode,
+ (void *)ref);
+ if (unlikely(!inode))
+ return ERR_PTR(-ENOMEM);
+
+ /* If this is a freshly allocated inode, need to read it now. */
+ if (inode->i_state & I_NEW)
+ inode = ntfs_read_mft(inode, name, ref);
+ else if (ref->seq != ntfs_i(inode)->mi.mrec->seq) {
+ /* Inode overlaps? */
+ make_bad_inode(inode);
+ }
+
+ return inode;
+}
+
+enum get_block_ctx {
+ GET_BLOCK_GENERAL = 0,
+ GET_BLOCK_WRITE_BEGIN = 1,
+ GET_BLOCK_DIRECT_IO_R = 2,
+ GET_BLOCK_DIRECT_IO_W = 3,
+ GET_BLOCK_BMAP = 4,
+};
+
+static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo,
+ struct buffer_head *bh, int create,
+ enum get_block_ctx ctx)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct page *page = bh->b_page;
+ u8 cluster_bits = sbi->cluster_bits;
+ u32 block_size = sb->s_blocksize;
+ u64 bytes, lbo, valid;
+ u32 off;
+ int err;
+ CLST vcn, lcn, len;
+ bool new;
+
+ /* Clear previous state. */
+ clear_buffer_new(bh);
+ clear_buffer_uptodate(bh);
+
+ /* Direct write uses 'create=0'. */
+ if (!create && vbo >= ni->i_valid) {
+ /* Out of valid. */
+ return 0;
+ }
+
+ if (vbo >= inode->i_size) {
+ /* Out of size. */
+ return 0;
+ }
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_read_resident(ni, page);
+ ni_unlock(ni);
+
+ if (!err)
+ set_buffer_uptodate(bh);
+ bh->b_size = block_size;
+ return err;
+ }
+
+ vcn = vbo >> cluster_bits;
+ off = vbo & sbi->cluster_mask;
+ new = false;
+
+ err = attr_data_get_block(ni, vcn, 1, &lcn, &len, create ? &new : NULL);
+ if (err)
+ goto out;
+
+ if (!len)
+ return 0;
+
+ bytes = ((u64)len << cluster_bits) - off;
+
+ if (lcn == SPARSE_LCN) {
+ if (!create) {
+ if (bh->b_size > bytes)
+ bh->b_size = bytes;
+ return 0;
+ }
+ WARN_ON(1);
+ }
+
+ if (new) {
+ set_buffer_new(bh);
+ if ((len << cluster_bits) > block_size)
+ ntfs_sparse_cluster(inode, page, vcn, len);
+ }
+
+ lbo = ((u64)lcn << cluster_bits) + off;
+
+ set_buffer_mapped(bh);
+ bh->b_bdev = sb->s_bdev;
+ bh->b_blocknr = lbo >> sb->s_blocksize_bits;
+
+ valid = ni->i_valid;
+
+ if (ctx == GET_BLOCK_DIRECT_IO_W) {
+ /* ntfs_direct_IO will update ni->i_valid. */
+ if (vbo >= valid)
+ set_buffer_new(bh);
+ } else if (create) {
+ /* Normal write. */
+ if (bytes > bh->b_size)
+ bytes = bh->b_size;
+
+ if (vbo >= valid)
+ set_buffer_new(bh);
+
+ if (vbo + bytes > valid) {
+ ni->i_valid = vbo + bytes;
+ mark_inode_dirty(inode);
+ }
+ } else if (vbo >= valid) {
+ /* Read out of valid data. */
+ /* Should never be here 'cause already checked. */
+ clear_buffer_mapped(bh);
+ } else if (vbo + bytes <= valid) {
+ /* Normal read. */
+ } else if (vbo + block_size <= valid) {
+ /* Normal short read. */
+ bytes = block_size;
+ } else {
+ /*
+ * Read across valid size: vbo < valid && valid < vbo + block_size
+ */
+ bytes = block_size;
+
+ if (page) {
+ u32 voff = valid - vbo;
+
+ bh->b_size = block_size;
+ off = vbo & (PAGE_SIZE - 1);
+ set_bh_page(bh, page, off);
+ ll_rw_block(REQ_OP_READ, 0, 1, &bh);
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh)) {
+ err = -EIO;
+ goto out;
+ }
+ zero_user_segment(page, off + voff, off + block_size);
+ }
+ }
+
+ if (bh->b_size > bytes)
+ bh->b_size = bytes;
+
+#ifndef __LP64__
+ if (ctx == GET_BLOCK_DIRECT_IO_W || ctx == GET_BLOCK_DIRECT_IO_R) {
+ static_assert(sizeof(size_t) < sizeof(loff_t));
+ if (bytes > 0x40000000u)
+ bh->b_size = 0x40000000u;
+ }
+#endif
+
+ return 0;
+
+out:
+ return err;
+}
+
+int ntfs_get_block(struct inode *inode, sector_t vbn,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)vbn << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_GENERAL);
+}
+
+static int ntfs_get_block_bmap(struct inode *inode, sector_t vsn,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode,
+ (u64)vsn << inode->i_sb->s_blocksize_bits,
+ bh_result, create, GET_BLOCK_BMAP);
+}
+
+static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
+{
+ return generic_block_bmap(mapping, block, ntfs_get_block_bmap);
+}
+
+static int ntfs_readpage(struct file *file, struct page *page)
+{
+ int err;
+ struct address_space *mapping = page->mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_read_resident(ni, page);
+ ni_unlock(ni);
+ if (err != E_NTFS_NONRESIDENT) {
+ unlock_page(page);
+ return err;
+ }
+ }
+
+ if (is_compressed(ni)) {
+ ni_lock(ni);
+ err = ni_readpage_cmpr(ni, page);
+ ni_unlock(ni);
+ return err;
+ }
+
+ /* Normal + sparse files. */
+ return mpage_readpage(page, ntfs_get_block);
+}
+
+static void ntfs_readahead(struct readahead_control *rac)
+{
+ struct address_space *mapping = rac->mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u64 valid;
+ loff_t pos;
+
+ if (is_resident(ni)) {
+ /* No readahead for resident. */
+ return;
+ }
+
+ if (is_compressed(ni)) {
+ /* No readahead for compressed. */
+ return;
+ }
+
+ valid = ni->i_valid;
+ pos = readahead_pos(rac);
+
+ if (valid < i_size_read(inode) && pos <= valid &&
+ valid < pos + readahead_length(rac)) {
+ /* Range cross 'valid'. Read it page by page. */
+ return;
+ }
+
+ mpage_readahead(rac, ntfs_get_block);
+}
+
+static int ntfs_get_block_direct_IO_R(struct inode *inode, sector_t iblock,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)iblock << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_DIRECT_IO_R);
+}
+
+static int ntfs_get_block_direct_IO_W(struct inode *inode, sector_t iblock,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)iblock << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_DIRECT_IO_W);
+}
+
+static ssize_t ntfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+ struct file *file = iocb->ki_filp;
+ struct address_space *mapping = file->f_mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ loff_t vbo = iocb->ki_pos;
+ loff_t end;
+ int wr = iov_iter_rw(iter) & WRITE;
+ loff_t valid;
+ ssize_t ret;
+
+ if (is_resident(ni)) {
+ /* Switch to buffered write. */
+ ret = 0;
+ goto out;
+ }
+
+ ret = blockdev_direct_IO(iocb, inode, iter,
+ wr ? ntfs_get_block_direct_IO_W
+ : ntfs_get_block_direct_IO_R);
+
+ if (ret <= 0)
+ goto out;
+
+ end = vbo + ret;
+ valid = ni->i_valid;
+ if (wr) {
+ if (end > valid && !S_ISBLK(inode->i_mode)) {
+ ni->i_valid = end;
+ mark_inode_dirty(inode);
+ }
+ } else if (vbo < valid && valid < end) {
+ /* Fix page. */
+ iov_iter_revert(iter, end - valid);
+ iov_iter_zero(end - valid, iter);
+ }
+
+out:
+ return ret;
+}
+
+int ntfs_set_size(struct inode *inode, u64 new_size)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ int err;
+
+ /* Check for maximum file size. */
+ if (is_sparsed(ni) || is_compressed(ni)) {
+ if (new_size > sbi->maxbytes_sparse) {
+ err = -EFBIG;
+ goto out;
+ }
+ } else if (new_size > sbi->maxbytes) {
+ err = -EFBIG;
+ goto out;
+ }
+
+ ni_lock(ni);
+ down_write(&ni->file.run_lock);
+
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size,
+ &ni->i_valid, true, NULL);
+
+ up_write(&ni->file.run_lock);
+ ni_unlock(ni);
+
+ mark_inode_dirty(inode);
+
+out:
+ return err;
+}
+
+static int ntfs_writepage(struct page *page, struct writeback_control *wbc)
+{
+ struct address_space *mapping = page->mapping;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ int err;
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_write_resident(ni, page);
+ ni_unlock(ni);
+ if (err != E_NTFS_NONRESIDENT) {
+ unlock_page(page);
+ return err;
+ }
+ }
+
+ return block_write_full_page(page, ntfs_get_block, wbc);
+}
+
+static int ntfs_writepages(struct address_space *mapping,
+ struct writeback_control *wbc)
+{
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ /* Redirect call to 'ntfs_writepage' for resident files. */
+ get_block_t *get_block = is_resident(ni) ? NULL : &ntfs_get_block;
+
+ return mpage_writepages(mapping, wbc, get_block);
+}
+
+static int ntfs_get_block_write_begin(struct inode *inode, sector_t vbn,
+ struct buffer_head *bh_result, int create)
+{
+ return ntfs_get_block_vbo(inode, (u64)vbn << inode->i_blkbits,
+ bh_result, create, GET_BLOCK_WRITE_BEGIN);
+}
+
+static int ntfs_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, u32 len, u32 flags, struct page **pagep,
+ void **fsdata)
+{
+ int err;
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+
+ *pagep = NULL;
+ if (is_resident(ni)) {
+ struct page *page = grab_cache_page_write_begin(
+ mapping, pos >> PAGE_SHIFT, flags);
+
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ ni_lock(ni);
+ err = attr_data_read_resident(ni, page);
+ ni_unlock(ni);
+
+ if (!err) {
+ *pagep = page;
+ goto out;
+ }
+ unlock_page(page);
+ put_page(page);
+
+ if (err != E_NTFS_NONRESIDENT)
+ goto out;
+ }
+
+ err = block_write_begin(mapping, pos, len, flags, pagep,
+ ntfs_get_block_write_begin);
+
+out:
+ return err;
+}
+
+/*
+ * ntfs_write_end - Address_space_operations::write_end.
+ */
+static int ntfs_write_end(struct file *file, struct address_space *mapping,
+ loff_t pos, u32 len, u32 copied, struct page *page,
+ void *fsdata)
+
+{
+ struct inode *inode = mapping->host;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ u64 valid = ni->i_valid;
+ bool dirty = false;
+ int err;
+
+ if (is_resident(ni)) {
+ ni_lock(ni);
+ err = attr_data_write_resident(ni, page);
+ ni_unlock(ni);
+ if (!err) {
+ dirty = true;
+ /* Clear any buffers in page. */
+ if (page_has_buffers(page)) {
+ struct buffer_head *head, *bh;
+
+ bh = head = page_buffers(page);
+ do {
+ clear_buffer_dirty(bh);
+ clear_buffer_mapped(bh);
+ set_buffer_uptodate(bh);
+ } while (head != (bh = bh->b_this_page));
+ }
+ SetPageUptodate(page);
+ err = copied;
+ }
+ unlock_page(page);
+ put_page(page);
+ } else {
+ err = generic_write_end(file, mapping, pos, len, copied, page,
+ fsdata);
+ }
+
+ if (err >= 0) {
+ if (!(ni->std_fa & FILE_ATTRIBUTE_ARCHIVE)) {
+ inode->i_ctime = inode->i_mtime = current_time(inode);
+ ni->std_fa |= FILE_ATTRIBUTE_ARCHIVE;
+ dirty = true;
+ }
+
+ if (valid != ni->i_valid) {
+ /* ni->i_valid is changed in ntfs_get_block_vbo. */
+ dirty = true;
+ }
+
+ if (dirty)
+ mark_inode_dirty(inode);
+ }
+
+ return err;
+}
+
+int reset_log_file(struct inode *inode)
+{
+ int err;
+ loff_t pos = 0;
+ u32 log_size = inode->i_size;
+ struct address_space *mapping = inode->i_mapping;
+
+ for (;;) {
+ u32 len;
+ void *kaddr;
+ struct page *page;
+
+ len = pos + PAGE_SIZE > log_size ? (log_size - pos) : PAGE_SIZE;
+
+ err = block_write_begin(mapping, pos, len, 0, &page,
+ ntfs_get_block_write_begin);
+ if (err)
+ goto out;
+
+ kaddr = kmap_atomic(page);
+ memset(kaddr, -1, len);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(page);
+
+ err = block_write_end(NULL, mapping, pos, len, len, page, NULL);
+ if (err < 0)
+ goto out;
+ pos += len;
+
+ if (pos >= log_size)
+ break;
+ balance_dirty_pages_ratelimited(mapping);
+ }
+out:
+ mark_inode_dirty_sync(inode);
+
+ return err;
+}
+
+int ntfs3_write_inode(struct inode *inode, struct writeback_control *wbc)
+{
+ return _ni_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+}
+
+int ntfs_sync_inode(struct inode *inode)
+{
+ return _ni_write_inode(inode, 1);
+}
+
+/*
+ * writeback_inode - Helper function for ntfs_flush_inodes().
+ *
+ * This writes both the inode and the file data blocks, waiting
+ * for in flight data blocks before the start of the call. It
+ * does not wait for any io started during the call.
+ */
+static int writeback_inode(struct inode *inode)
+{
+ int ret = sync_inode_metadata(inode, 0);
+
+ if (!ret)
+ ret = filemap_fdatawrite(inode->i_mapping);
+ return ret;
+}
+
+/*
+ * ntfs_flush_inodes
+ *
+ * Write data and metadata corresponding to i1 and i2. The io is
+ * started but we do not wait for any of it to finish.
+ *
+ * filemap_flush() is used for the block device, so if there is a dirty
+ * page for a block already in flight, we will not wait and start the
+ * io over again.
+ */
+int ntfs_flush_inodes(struct super_block *sb, struct inode *i1,
+ struct inode *i2)
+{
+ int ret = 0;
+
+ if (i1)
+ ret = writeback_inode(i1);
+ if (!ret && i2)
+ ret = writeback_inode(i2);
+ if (!ret)
+ ret = filemap_flush(sb->s_bdev->bd_inode->i_mapping);
+ return ret;
+}
+
+int inode_write_data(struct inode *inode, const void *data, size_t bytes)
+{
+ pgoff_t idx;
+
+ /* Write non resident data. */
+ for (idx = 0; bytes; idx++) {
+ size_t op = bytes > PAGE_SIZE ? PAGE_SIZE : bytes;
+ struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+ if (IS_ERR(page))
+ return PTR_ERR(page);
+
+ lock_page(page);
+ WARN_ON(!PageUptodate(page));
+ ClearPageUptodate(page);
+
+ memcpy(page_address(page), data, op);
+
+ flush_dcache_page(page);
+ SetPageUptodate(page);
+ unlock_page(page);
+
+ ntfs_unmap_page(page);
+
+ bytes -= op;
+ data = Add2Ptr(data, PAGE_SIZE);
+ }
+ return 0;
+}
+
+/*
+ * ntfs_reparse_bytes
+ *
+ * Number of bytes for REPARSE_DATA_BUFFER(IO_REPARSE_TAG_SYMLINK)
+ * for unicode string of @uni_len length.
+ */
+static inline u32 ntfs_reparse_bytes(u32 uni_len)
+{
+ /* Header + unicode string + decorated unicode string. */
+ return sizeof(short) * (2 * uni_len + 4) +
+ offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer.PathBuffer);
+}
+
+static struct REPARSE_DATA_BUFFER *
+ntfs_create_reparse_buffer(struct ntfs_sb_info *sbi, const char *symname,
+ u32 size, u16 *nsize)
+{
+ int i, err;
+ struct REPARSE_DATA_BUFFER *rp;
+ __le16 *rp_name;
+ typeof(rp->SymbolicLinkReparseBuffer) *rs;
+
+ rp = kzalloc(ntfs_reparse_bytes(2 * size + 2), GFP_NOFS);
+ if (!rp)
+ return ERR_PTR(-ENOMEM);
+
+ rs = &rp->SymbolicLinkReparseBuffer;
+ rp_name = rs->PathBuffer;
+
+ /* Convert link name to UTF-16. */
+ err = ntfs_nls_to_utf16(sbi, symname, size,
+ (struct cpu_str *)(rp_name - 1), 2 * size,
+ UTF16_LITTLE_ENDIAN);
+ if (err < 0)
+ goto out;
+
+ /* err = the length of unicode name of symlink. */
+ *nsize = ntfs_reparse_bytes(err);
+
+ if (*nsize > sbi->reparse.max_size) {
+ err = -EFBIG;
+ goto out;
+ }
+
+ /* Translate Linux '/' into Windows '\'. */
+ for (i = 0; i < err; i++) {
+ if (rp_name[i] == cpu_to_le16('/'))
+ rp_name[i] = cpu_to_le16('\\');
+ }
+
+ rp->ReparseTag = IO_REPARSE_TAG_SYMLINK;
+ rp->ReparseDataLength =
+ cpu_to_le16(*nsize - offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer));
+
+ /* PrintName + SubstituteName. */
+ rs->SubstituteNameOffset = cpu_to_le16(sizeof(short) * err);
+ rs->SubstituteNameLength = cpu_to_le16(sizeof(short) * err + 8);
+ rs->PrintNameLength = rs->SubstituteNameOffset;
+
+ /*
+ * TODO: Use relative path if possible to allow Windows to
+ * parse this path.
+ * 0-absolute path 1- relative path (SYMLINK_FLAG_RELATIVE).
+ */
+ rs->Flags = 0;
+
+ memmove(rp_name + err + 4, rp_name, sizeof(short) * err);
+
+ /* Decorate SubstituteName. */
+ rp_name += err;
+ rp_name[0] = cpu_to_le16('\\');
+ rp_name[1] = cpu_to_le16('?');
+ rp_name[2] = cpu_to_le16('?');
+ rp_name[3] = cpu_to_le16('\\');
+
+ return rp;
+out:
+ kfree(rp);
+ return ERR_PTR(err);
+}
+
+struct inode *ntfs_create_inode(struct user_namespace *mnt_userns,
+ struct inode *dir, struct dentry *dentry,
+ const struct cpu_str *uni, umode_t mode,
+ dev_t dev, const char *symname, u32 size,
+ struct ntfs_fnd *fnd)
+{
+ int err;
+ struct super_block *sb = dir->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ const struct qstr *name = &dentry->d_name;
+ CLST ino = 0;
+ struct ntfs_inode *dir_ni = ntfs_i(dir);
+ struct ntfs_inode *ni = NULL;
+ struct inode *inode = NULL;
+ struct ATTRIB *attr;
+ struct ATTR_STD_INFO5 *std5;
+ struct ATTR_FILE_NAME *fname;
+ struct MFT_REC *rec;
+ u32 asize, dsize, sd_size;
+ enum FILE_ATTRIBUTE fa;
+ __le32 security_id = SECURITY_ID_INVALID;
+ CLST vcn;
+ const void *sd;
+ u16 t16, nsize = 0, aid = 0;
+ struct INDEX_ROOT *root, *dir_root;
+ struct NTFS_DE *e, *new_de = NULL;
+ struct REPARSE_DATA_BUFFER *rp = NULL;
+ bool rp_inserted = false;
+
+ dir_root = indx_get_root(&dir_ni->dir, dir_ni, NULL, NULL);
+ if (!dir_root)
+ return ERR_PTR(-EINVAL);
+
+ if (S_ISDIR(mode)) {
+ /* Use parent's directory attributes. */
+ fa = dir_ni->std_fa | FILE_ATTRIBUTE_DIRECTORY |
+ FILE_ATTRIBUTE_ARCHIVE;
+ /*
+ * By default child directory inherits parent attributes.
+ * Root directory is hidden + system.
+ * Make an exception for children in root.
+ */
+ if (dir->i_ino == MFT_REC_ROOT)
+ fa &= ~(FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM);
+ } else if (S_ISLNK(mode)) {
+ /* It is good idea that link should be the same type (file/dir) as target */
+ fa = FILE_ATTRIBUTE_REPARSE_POINT;
+
+ /*
+ * Linux: there are dir/file/symlink and so on.
+ * NTFS: symlinks are "dir + reparse" or "file + reparse"
+ * It is good idea to create:
+ * dir + reparse if 'symname' points to directory
+ * or
+ * file + reparse if 'symname' points to file
+ * Unfortunately kern_path hangs if symname contains 'dir'.
+ */
+
+ /*
+ * struct path path;
+ *
+ * if (!kern_path(symname, LOOKUP_FOLLOW, &path)){
+ * struct inode *target = d_inode(path.dentry);
+ *
+ * if (S_ISDIR(target->i_mode))
+ * fa |= FILE_ATTRIBUTE_DIRECTORY;
+ * // if ( target->i_sb == sb ){
+ * // use relative path?
+ * // }
+ * path_put(&path);
+ * }
+ */
+ } else if (S_ISREG(mode)) {
+ if (sbi->options.sparse) {
+ /* Sparsed regular file, cause option 'sparse'. */
+ fa = FILE_ATTRIBUTE_SPARSE_FILE |
+ FILE_ATTRIBUTE_ARCHIVE;
+ } else if (dir_ni->std_fa & FILE_ATTRIBUTE_COMPRESSED) {
+ /* Compressed regular file, if parent is compressed. */
+ fa = FILE_ATTRIBUTE_COMPRESSED | FILE_ATTRIBUTE_ARCHIVE;
+ } else {
+ /* Regular file, default attributes. */
+ fa = FILE_ATTRIBUTE_ARCHIVE;
+ }
+ } else {
+ fa = FILE_ATTRIBUTE_ARCHIVE;
+ }
+
+ if (!(mode & 0222))
+ fa |= FILE_ATTRIBUTE_READONLY;
+
+ /* Allocate PATH_MAX bytes. */
+ new_de = __getname();
+ if (!new_de) {
+ err = -ENOMEM;
+ goto out1;
+ }
+
+ /* Mark rw ntfs as dirty. it will be cleared at umount. */
+ ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+ /* Step 1: allocate and fill new mft record. */
+ err = ntfs_look_free_mft(sbi, &ino, false, NULL, NULL);
+ if (err)
+ goto out2;
+
+ ni = ntfs_new_inode(sbi, ino, fa & FILE_ATTRIBUTE_DIRECTORY);
+ if (IS_ERR(ni)) {
+ err = PTR_ERR(ni);
+ ni = NULL;
+ goto out3;
+ }
+ inode = &ni->vfs_inode;
+ inode_init_owner(mnt_userns, inode, dir, mode);
+ mode = inode->i_mode;
+
+ inode->i_atime = inode->i_mtime = inode->i_ctime = ni->i_crtime =
+ current_time(inode);
+
+ rec = ni->mi.mrec;
+ rec->hard_links = cpu_to_le16(1);
+ attr = Add2Ptr(rec, le16_to_cpu(rec->attr_off));
+
+ /* Get default security id. */
+ sd = s_default_security;
+ sd_size = sizeof(s_default_security);
+
+ if (is_ntfs3(sbi)) {
+ security_id = dir_ni->std_security_id;
+ if (le32_to_cpu(security_id) < SECURITY_ID_FIRST) {
+ security_id = sbi->security.def_security_id;
+
+ if (security_id == SECURITY_ID_INVALID &&
+ !ntfs_insert_security(sbi, sd, sd_size,
+ &security_id, NULL))
+ sbi->security.def_security_id = security_id;
+ }
+ }
+
+ /* Insert standard info. */
+ std5 = Add2Ptr(attr, SIZEOF_RESIDENT);
+
+ if (security_id == SECURITY_ID_INVALID) {
+ dsize = sizeof(struct ATTR_STD_INFO);
+ } else {
+ dsize = sizeof(struct ATTR_STD_INFO5);
+ std5->security_id = security_id;
+ ni->std_security_id = security_id;
+ }
+ asize = SIZEOF_RESIDENT + dsize;
+
+ attr->type = ATTR_STD;
+ attr->size = cpu_to_le32(asize);
+ attr->id = cpu_to_le16(aid++);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_size = cpu_to_le32(dsize);
+
+ std5->cr_time = std5->m_time = std5->c_time = std5->a_time =
+ kernel2nt(&inode->i_atime);
+
+ ni->std_fa = fa;
+ std5->fa = fa;
+
+ attr = Add2Ptr(attr, asize);
+
+ /* Insert file name. */
+ err = fill_name_de(sbi, new_de, name, uni);
+ if (err)
+ goto out4;
+
+ mi_get_ref(&ni->mi, &new_de->ref);
+
+ fname = (struct ATTR_FILE_NAME *)(new_de + 1);
+ mi_get_ref(&dir_ni->mi, &fname->home);
+ fname->dup.cr_time = fname->dup.m_time = fname->dup.c_time =
+ fname->dup.a_time = std5->cr_time;
+ fname->dup.alloc_size = fname->dup.data_size = 0;
+ fname->dup.fa = std5->fa;
+ fname->dup.ea_size = fname->dup.reparse = 0;
+
+ dsize = le16_to_cpu(new_de->key_size);
+ asize = ALIGN(SIZEOF_RESIDENT + dsize, 8);
+
+ attr->type = ATTR_NAME;
+ attr->size = cpu_to_le32(asize);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.flags = RESIDENT_FLAG_INDEXED;
+ attr->id = cpu_to_le16(aid++);
+ attr->res.data_size = cpu_to_le32(dsize);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), fname, dsize);
+
+ attr = Add2Ptr(attr, asize);
+
+ if (security_id == SECURITY_ID_INVALID) {
+ /* Insert security attribute. */
+ asize = SIZEOF_RESIDENT + ALIGN(sd_size, 8);
+
+ attr->type = ATTR_SECURE;
+ attr->size = cpu_to_le32(asize);
+ attr->id = cpu_to_le16(aid++);
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_size = cpu_to_le32(sd_size);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), sd, sd_size);
+
+ attr = Add2Ptr(attr, asize);
+ }
+
+ attr->id = cpu_to_le16(aid++);
+ if (fa & FILE_ATTRIBUTE_DIRECTORY) {
+ /*
+ * Regular directory or symlink to directory.
+ * Create root attribute.
+ */
+ dsize = sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE);
+ asize = sizeof(I30_NAME) + SIZEOF_RESIDENT + dsize;
+
+ attr->type = ATTR_ROOT;
+ attr->size = cpu_to_le32(asize);
+
+ attr->name_len = ARRAY_SIZE(I30_NAME);
+ attr->name_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_off =
+ cpu_to_le16(sizeof(I30_NAME) + SIZEOF_RESIDENT);
+ attr->res.data_size = cpu_to_le32(dsize);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), I30_NAME,
+ sizeof(I30_NAME));
+
+ root = Add2Ptr(attr, sizeof(I30_NAME) + SIZEOF_RESIDENT);
+ memcpy(root, dir_root, offsetof(struct INDEX_ROOT, ihdr));
+ root->ihdr.de_off =
+ cpu_to_le32(sizeof(struct INDEX_HDR)); // 0x10
+ root->ihdr.used = cpu_to_le32(sizeof(struct INDEX_HDR) +
+ sizeof(struct NTFS_DE));
+ root->ihdr.total = root->ihdr.used;
+
+ e = Add2Ptr(root, sizeof(struct INDEX_ROOT));
+ e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+ e->flags = NTFS_IE_LAST;
+ } else if (S_ISLNK(mode)) {
+ /*
+ * Symlink to file.
+ * Create empty resident data attribute.
+ */
+ asize = SIZEOF_RESIDENT;
+
+ /* Insert empty ATTR_DATA */
+ attr->type = ATTR_DATA;
+ attr->size = cpu_to_le32(SIZEOF_RESIDENT);
+ attr->name_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ } else if (S_ISREG(mode)) {
+ /*
+ * Regular file. Create empty non resident data attribute.
+ */
+ attr->type = ATTR_DATA;
+ attr->non_res = 1;
+ attr->nres.evcn = cpu_to_le64(-1ll);
+ if (fa & FILE_ATTRIBUTE_SPARSE_FILE) {
+ attr->size = cpu_to_le32(SIZEOF_NONRESIDENT_EX + 8);
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ attr->flags = ATTR_FLAG_SPARSED;
+ asize = SIZEOF_NONRESIDENT_EX + 8;
+ } else if (fa & FILE_ATTRIBUTE_COMPRESSED) {
+ attr->size = cpu_to_le32(SIZEOF_NONRESIDENT_EX + 8);
+ attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+ attr->flags = ATTR_FLAG_COMPRESSED;
+ attr->nres.c_unit = COMPRESSION_UNIT;
+ asize = SIZEOF_NONRESIDENT_EX + 8;
+ } else {
+ attr->size = cpu_to_le32(SIZEOF_NONRESIDENT + 8);
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ asize = SIZEOF_NONRESIDENT + 8;
+ }
+ attr->nres.run_off = attr->name_off;
+ } else {
+ /*
+ * Node. Create empty resident data attribute.
+ */
+ attr->type = ATTR_DATA;
+ attr->size = cpu_to_le32(SIZEOF_RESIDENT);
+ attr->name_off = SIZEOF_RESIDENT_LE;
+ if (fa & FILE_ATTRIBUTE_SPARSE_FILE)
+ attr->flags = ATTR_FLAG_SPARSED;
+ else if (fa & FILE_ATTRIBUTE_COMPRESSED)
+ attr->flags = ATTR_FLAG_COMPRESSED;
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ asize = SIZEOF_RESIDENT;
+ ni->ni_flags |= NI_FLAG_RESIDENT;
+ }
+
+ if (S_ISDIR(mode)) {
+ ni->ni_flags |= NI_FLAG_DIR;
+ err = indx_init(&ni->dir, sbi, attr, INDEX_MUTEX_I30);
+ if (err)
+ goto out4;
+ } else if (S_ISLNK(mode)) {
+ rp = ntfs_create_reparse_buffer(sbi, symname, size, &nsize);
+
+ if (IS_ERR(rp)) {
+ err = PTR_ERR(rp);
+ rp = NULL;
+ goto out4;
+ }
+
+ /*
+ * Insert ATTR_REPARSE.
+ */
+ attr = Add2Ptr(attr, asize);
+ attr->type = ATTR_REPARSE;
+ attr->id = cpu_to_le16(aid++);
+
+ /* Resident or non resident? */
+ asize = ALIGN(SIZEOF_RESIDENT + nsize, 8);
+ t16 = PtrOffset(rec, attr);
+
+ /* 0x78 - the size of EA + EAINFO to store WSL */
+ if (asize + t16 + 0x78 + 8 > sbi->record_size) {
+ CLST alen;
+ CLST clst = bytes_to_cluster(sbi, nsize);
+
+ /* Bytes per runs. */
+ t16 = sbi->record_size - t16 - SIZEOF_NONRESIDENT;
+
+ attr->non_res = 1;
+ attr->nres.evcn = cpu_to_le64(clst - 1);
+ attr->name_off = SIZEOF_NONRESIDENT_LE;
+ attr->nres.run_off = attr->name_off;
+ attr->nres.data_size = cpu_to_le64(nsize);
+ attr->nres.valid_size = attr->nres.data_size;
+ attr->nres.alloc_size =
+ cpu_to_le64(ntfs_up_cluster(sbi, nsize));
+
+ err = attr_allocate_clusters(sbi, &ni->file.run, 0, 0,
+ clst, NULL, 0, &alen, 0,
+ NULL);
+ if (err)
+ goto out5;
+
+ err = run_pack(&ni->file.run, 0, clst,
+ Add2Ptr(attr, SIZEOF_NONRESIDENT), t16,
+ &vcn);
+ if (err < 0)
+ goto out5;
+
+ if (vcn != clst) {
+ err = -EINVAL;
+ goto out5;
+ }
+
+ asize = SIZEOF_NONRESIDENT + ALIGN(err, 8);
+ inode->i_size = nsize;
+ } else {
+ attr->res.data_off = SIZEOF_RESIDENT_LE;
+ attr->res.data_size = cpu_to_le32(nsize);
+ memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), rp, nsize);
+ inode->i_size = nsize;
+ nsize = 0;
+ }
+
+ attr->size = cpu_to_le32(asize);
+
+ err = ntfs_insert_reparse(sbi, IO_REPARSE_TAG_SYMLINK,
+ &new_de->ref);
+ if (err)
+ goto out5;
+
+ rp_inserted = true;
+ }
+
+ attr = Add2Ptr(attr, asize);
+ attr->type = ATTR_END;
+
+ rec->used = cpu_to_le32(PtrOffset(rec, attr) + 8);
+ rec->next_attr_id = cpu_to_le16(aid);
+
+ /* Step 2: Add new name in index. */
+ err = indx_insert_entry(&dir_ni->dir, dir_ni, new_de, sbi, fnd, 0);
+ if (err)
+ goto out6;
+
+ inode->i_generation = le16_to_cpu(rec->seq);
+
+ dir->i_mtime = dir->i_ctime = inode->i_atime;
+
+ if (S_ISDIR(mode)) {
+ inode->i_op = &ntfs_dir_inode_operations;
+ inode->i_fop = &ntfs_dir_operations;
+ } else if (S_ISLNK(mode)) {
+ inode->i_op = &ntfs_link_inode_operations;
+ inode->i_fop = NULL;
+ inode->i_mapping->a_ops = &ntfs_aops;
+ } else if (S_ISREG(mode)) {
+ inode->i_op = &ntfs_file_inode_operations;
+ inode->i_fop = &ntfs_file_operations;
+ inode->i_mapping->a_ops =
+ is_compressed(ni) ? &ntfs_aops_cmpr : &ntfs_aops;
+ init_rwsem(&ni->file.run_lock);
+ } else {
+ inode->i_op = &ntfs_special_inode_operations;
+ init_special_inode(inode, mode, dev);
+ }
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+ if (!S_ISLNK(mode) && (sb->s_flags & SB_POSIXACL)) {
+ err = ntfs_init_acl(mnt_userns, inode, dir);
+ if (err)
+ goto out6;
+ } else
+#endif
+ {
+ inode->i_flags |= S_NOSEC;
+ }
+
+ /* Write non resident data. */
+ if (nsize) {
+ err = ntfs_sb_write_run(sbi, &ni->file.run, 0, rp, nsize);
+ if (err)
+ goto out7;
+ }
+
+ /*
+ * Call 'd_instantiate' after inode->i_op is set
+ * but before finish_open.
+ */
+ d_instantiate(dentry, inode);
+
+ ntfs_save_wsl_perm(inode);
+ mark_inode_dirty(dir);
+ mark_inode_dirty(inode);
+
+ /* Normal exit. */
+ goto out2;
+
+out7:
+
+ /* Undo 'indx_insert_entry'. */
+ indx_delete_entry(&dir_ni->dir, dir_ni, new_de + 1,
+ le16_to_cpu(new_de->key_size), sbi);
+out6:
+ if (rp_inserted)
+ ntfs_remove_reparse(sbi, IO_REPARSE_TAG_SYMLINK, &new_de->ref);
+
+out5:
+ if (S_ISDIR(mode) || run_is_empty(&ni->file.run))
+ goto out4;
+
+ run_deallocate(sbi, &ni->file.run, false);
+
+out4:
+ clear_rec_inuse(rec);
+ clear_nlink(inode);
+ ni->mi.dirty = false;
+ discard_new_inode(inode);
+out3:
+ ntfs_mark_rec_free(sbi, ino);
+
+out2:
+ __putname(new_de);
+ kfree(rp);
+
+out1:
+ if (err)
+ return ERR_PTR(err);
+
+ unlock_new_inode(inode);
+
+ return inode;
+}
+
+int ntfs_link_inode(struct inode *inode, struct dentry *dentry)
+{
+ int err;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+ struct NTFS_DE *de;
+ struct ATTR_FILE_NAME *de_name;
+
+ /* Allocate PATH_MAX bytes. */
+ de = __getname();
+ if (!de)
+ return -ENOMEM;
+
+ /* Mark rw ntfs as dirty. It will be cleared at umount. */
+ ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+ /* Construct 'de'. */
+ err = fill_name_de(sbi, de, &dentry->d_name, NULL);
+ if (err)
+ goto out;
+
+ de_name = (struct ATTR_FILE_NAME *)(de + 1);
+ /* Fill duplicate info. */
+ de_name->dup.cr_time = de_name->dup.m_time = de_name->dup.c_time =
+ de_name->dup.a_time = kernel2nt(&inode->i_ctime);
+ de_name->dup.alloc_size = de_name->dup.data_size =
+ cpu_to_le64(inode->i_size);
+ de_name->dup.fa = ni->std_fa;
+ de_name->dup.ea_size = de_name->dup.reparse = 0;
+
+ err = ni_add_name(ntfs_i(d_inode(dentry->d_parent)), ni, de);
+out:
+ __putname(de);
+ return err;
+}
+
+/*
+ * ntfs_unlink_inode
+ *
+ * inode_operations::unlink
+ * inode_operations::rmdir
+ */
+int ntfs_unlink_inode(struct inode *dir, const struct dentry *dentry)
+{
+ int err;
+ struct ntfs_sb_info *sbi = dir->i_sb->s_fs_info;
+ struct inode *inode = d_inode(dentry);
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct ntfs_inode *dir_ni = ntfs_i(dir);
+ struct NTFS_DE *de, *de2 = NULL;
+ int undo_remove;
+
+ if (ntfs_is_meta_file(sbi, ni->mi.rno))
+ return -EINVAL;
+
+ /* Allocate PATH_MAX bytes. */
+ de = __getname();
+ if (!de)
+ return -ENOMEM;
+
+ ni_lock(ni);
+
+ if (S_ISDIR(inode->i_mode) && !dir_is_empty(inode)) {
+ err = -ENOTEMPTY;
+ goto out;
+ }
+
+ err = fill_name_de(sbi, de, &dentry->d_name, NULL);
+ if (err < 0)
+ goto out;
+
+ undo_remove = 0;
+ err = ni_remove_name(dir_ni, ni, de, &de2, &undo_remove);
+
+ if (!err) {
+ drop_nlink(inode);
+ dir->i_mtime = dir->i_ctime = current_time(dir);
+ mark_inode_dirty(dir);
+ inode->i_ctime = dir->i_ctime;
+ if (inode->i_nlink)
+ mark_inode_dirty(inode);
+ } else if (!ni_remove_name_undo(dir_ni, ni, de, de2, undo_remove)) {
+ make_bad_inode(inode);
+ ntfs_inode_err(inode, "failed to undo unlink");
+ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+ } else {
+ if (ni_is_dirty(dir))
+ mark_inode_dirty(dir);
+ if (ni_is_dirty(inode))
+ mark_inode_dirty(inode);
+ }
+
+out:
+ ni_unlock(ni);
+ __putname(de);
+ return err;
+}
+
+void ntfs_evict_inode(struct inode *inode)
+{
+ truncate_inode_pages_final(&inode->i_data);
+
+ if (inode->i_nlink)
+ _ni_write_inode(inode, inode_needs_sync(inode));
+
+ invalidate_inode_buffers(inode);
+ clear_inode(inode);
+
+ ni_clear(ntfs_i(inode));
+}
+
+static noinline int ntfs_readlink_hlp(struct inode *inode, char *buffer,
+ int buflen)
+{
+ int i, err = 0;
+ struct ntfs_inode *ni = ntfs_i(inode);
+ struct super_block *sb = inode->i_sb;
+ struct ntfs_sb_info *sbi = sb->s_fs_info;
+ u64 i_size = inode->i_size;
+ u16 nlen = 0;
+ void *to_free = NULL;
+ struct REPARSE_DATA_BUFFER *rp;
+ struct le_str *uni;
+ struct ATTRIB *attr;
+
+ /* Reparse data present. Try to parse it. */
+ static_assert(!offsetof(struct REPARSE_DATA_BUFFER, ReparseTag));
+ static_assert(sizeof(u32) == sizeof(rp->ReparseTag));
+
+ *buffer = 0;
+
+ /* Read into temporal buffer. */
+ if (i_size > sbi->reparse.max_size || i_size <= sizeof(u32)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ attr = ni_find_attr(ni, NULL, NULL, ATTR_REPARSE, NULL, 0, NULL, NULL);
+ if (!attr) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (!attr->non_res) {
+ rp = resident_data_ex(attr, i_size);
+ if (!rp) {
+ err = -EINVAL;
+ goto out;
+ }
+ } else {
+ rp = kmalloc(i_size, GFP_NOFS);
+ if (!rp) {
+ err = -ENOMEM;
+ goto out;
+ }
+ to_free = rp;
+ err = ntfs_read_run_nb(sbi, &ni->file.run, 0, rp, i_size, NULL);
+ if (err)
+ goto out;
+ }
+
+ err = -EINVAL;
+
+ /* Microsoft Tag. */
+ switch (rp->ReparseTag) {
+ case IO_REPARSE_TAG_MOUNT_POINT:
+ /* Mount points and junctions. */
+ /* Can we use 'Rp->MountPointReparseBuffer.PrintNameLength'? */
+ if (i_size <= offsetof(struct REPARSE_DATA_BUFFER,
+ MountPointReparseBuffer.PathBuffer))
+ goto out;
+ uni = Add2Ptr(rp,
+ offsetof(struct REPARSE_DATA_BUFFER,
+ MountPointReparseBuffer.PathBuffer) +
+ le16_to_cpu(rp->MountPointReparseBuffer
+ .PrintNameOffset) -
+ 2);
+ nlen = le16_to_cpu(rp->MountPointReparseBuffer.PrintNameLength);
+ break;
+
+ case IO_REPARSE_TAG_SYMLINK:
+ /* FolderSymbolicLink */
+ /* Can we use 'Rp->SymbolicLinkReparseBuffer.PrintNameLength'? */
+ if (i_size <= offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer.PathBuffer))
+ goto out;
+ uni = Add2Ptr(rp,
+ offsetof(struct REPARSE_DATA_BUFFER,
+ SymbolicLinkReparseBuffer.PathBuffer) +
+ le16_to_cpu(rp->SymbolicLinkReparseBuffer
+ .PrintNameOffset) -
+ 2);
+ nlen = le16_to_cpu(
+ rp->SymbolicLinkReparseBuffer.PrintNameLength);
+ break;
+
+ case IO_REPARSE_TAG_CLOUD:
+ case IO_REPARSE_TAG_CLOUD_1:
+ case IO_REPARSE_TAG_CLOUD_2:
+ case IO_REPARSE_TAG_CLOUD_3:
+ case IO_REPARSE_TAG_CLOUD_4:
+ case IO_REPARSE_TAG_CLOUD_5:
+ case IO_REPARSE_TAG_CLOUD_6:
+ case IO_REPARSE_TAG_CLOUD_7:
+ case IO_REPARSE_TAG_CLOUD_8:
+ case IO_REPARSE_TAG_CLOUD_9:
+ case IO_REPARSE_TAG_CLOUD_A:
+ case IO_REPARSE_TAG_CLOUD_B:
+ case IO_REPARSE_TAG_CLOUD_C:
+ case IO_REPARSE_TAG_CLOUD_D:
+ case IO_REPARSE_TAG_CLOUD_E:
+ case IO_REPARSE_TAG_CLOUD_F:
+ err = sizeof("OneDrive") - 1;
+ if (err > buflen)
+ err = buflen;
+ memcpy(buffer, "OneDrive", err);
+ goto out;
+
+ default:
+ if (IsReparseTagMicrosoft(rp->ReparseTag)) {
+ /* Unknown Microsoft Tag. */
+ goto out;
+ }
+ if (!IsReparseTagNameSurrogate(rp->ReparseTag) ||
+ i_size <= sizeof(struct REPARSE_POINT)) {
+ goto out;
+ }
+
+ /* Users tag. */
+ uni = Add2Ptr(rp, sizeof(struct REPARSE_POINT) - 2);
+ nlen = le16_to_cpu(rp->ReparseDataLength) -
+ sizeof(struct REPARSE_POINT);
+ }
+
+ /* Convert nlen from bytes to UNICODE chars. */
+ nlen >>= 1;
+
+ /* Check that name is available. */
+ if (!nlen || &uni->name[nlen] > (__le16 *)Add2Ptr(rp, i_size))
+ goto out;
+
+ /* If name is already zero terminated then truncate it now. */
+ if (!uni->name[nlen - 1])
+ nlen -= 1;
+ uni->len = nlen;
+
+ err = ntfs_utf16_to_nls(sbi, uni, buffer, buflen);
+
+ if (err < 0)
+ goto out;
+
+ /* Translate Windows '\' into Linux '/'. */
+ for (i = 0; i < err; i++) {
+ if (buffer[i] == '\\')
+ buffer[i] = '/';
+ }
+
+ /* Always set last zero. */
+ buffer[err] = 0;
+out:
+ kfree(to_free);
+ return err;
+}
+
+static const char *ntfs_get_link(struct dentry *de, struct inode *inode,
+ struct delayed_call *done)
+{
+ int err;
+ char *ret;
+
+ if (!de)
+ return ERR_PTR(-ECHILD);
+
+ ret = kmalloc(PAGE_SIZE, GFP_NOFS);
+ if (!ret)
+ return ERR_PTR(-ENOMEM);
+
+ err = ntfs_readlink_hlp(inode, ret, PAGE_SIZE);
+ if (err < 0) {
+ kfree(ret);
+ return ERR_PTR(err);
+ }
+
+ set_delayed_call(done, kfree_link, ret);
+
+ return ret;
+}
+
+// clang-format off
+const struct inode_operations ntfs_link_inode_operations = {
+ .get_link = ntfs_get_link,
+ .setattr = ntfs3_setattr,
+ .listxattr = ntfs_listxattr,
+ .permission = ntfs_permission,
+ .get_acl = ntfs_get_acl,
+ .set_acl = ntfs_set_acl,
+};
+
+const struct address_space_operations ntfs_aops = {
+ .readpage = ntfs_readpage,
+ .readahead = ntfs_readahead,
+ .writepage = ntfs_writepage,
+ .writepages = ntfs_writepages,
+ .write_begin = ntfs_write_begin,
+ .write_end = ntfs_write_end,
+ .direct_IO = ntfs_direct_IO,
+ .bmap = ntfs_bmap,
+ .set_page_dirty = __set_page_dirty_buffers,
+};
+
+const struct address_space_operations ntfs_aops_cmpr = {
+ .readpage = ntfs_readpage,
+ .readahead = ntfs_readahead,
+};
+// clang-format on
diff --git a/fs/ntfs3/lib/decompress_common.c b/fs/ntfs3/lib/decompress_common.c
new file mode 100644
index 000000000000..e96652240859
--- /dev/null
+++ b/fs/ntfs3/lib/decompress_common.c
@@ -0,0 +1,319 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * decompress_common.c - Code shared by the XPRESS and LZX decompressors
+ *
+ * Copyright (C) 2015 Eric Biggers
+ */
+
+#include "decompress_common.h"
+
+/*
+ * make_huffman_decode_table() -
+ *
+ * Build a decoding table for a canonical prefix code, or "Huffman code".
+ *
+ * This is an internal function, not part of the library API!
+ *
+ * This takes as input the length of the codeword for each symbol in the
+ * alphabet and produces as output a table that can be used for fast
+ * decoding of prefix-encoded symbols using read_huffsym().
+ *
+ * Strictly speaking, a canonical prefix code might not be a Huffman
+ * code. But this algorithm will work either way; and in fact, since
+ * Huffman codes are defined in terms of symbol frequencies, there is no
+ * way for the decompressor to know whether the code is a true Huffman
+ * code or not until all symbols have been decoded.
+ *
+ * Because the prefix code is assumed to be "canonical", it can be
+ * reconstructed directly from the codeword lengths. A prefix code is
+ * canonical if and only if a longer codeword never lexicographically
+ * precedes a shorter codeword, and the lexicographic ordering of
+ * codewords of the same length is the same as the lexicographic ordering
+ * of the corresponding symbols. Consequently, we can sort the symbols
+ * primarily by codeword length and secondarily by symbol value, then
+ * reconstruct the prefix code by generating codewords lexicographically
+ * in that order.
+ *
+ * This function does not, however, generate the prefix code explicitly.
+ * Instead, it directly builds a table for decoding symbols using the
+ * code. The basic idea is this: given the next 'max_codeword_len' bits
+ * in the input, we can look up the decoded symbol by indexing a table
+ * containing 2**max_codeword_len entries. A codeword with length
+ * 'max_codeword_len' will have exactly one entry in this table, whereas