aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ext4/super.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2014-10-30ext4: remove extent status procfs files if journal load failsDarrick J. Wong1-2/+3
If we can't load the journal, remove the procfs files for the extent status information file to avoid leaking resources. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-10-30ext4: disallow changing journal_csum option during remountDarrick J. Wong1-0/+8
ext4 does not permit changing the metadata or journal checksum feature flag while mounted. Until we decide to support that, don't allow a remount to change the journal_csum flag (right now we silently fail to change anything). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-10-30ext4: enable journal checksum when metadata checksum feature enabledDarrick J. Wong1-0/+4
If metadata checksumming is turned on for the FS, we need to tell the journal to use checksumming too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-10-20Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds1-117/+128
Pull ext4 updates from Ted Ts'o: "A large number of cleanups and bug fixes, with some (minor) journal optimizations" [ This got sent to me before -rc1, but was stuck in my spam folder. - Linus ] * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (67 commits) ext4: check s_chksum_driver when looking for bg csum presence ext4: move error report out of atomic context in ext4_init_block_bitmap() ext4: Replace open coded mdata csum feature to helper function ext4: delete useless comments about ext4_move_extents ext4: fix reservation overflow in ext4_da_write_begin ext4: add ext4_iget_normal() which is to be used for dir tree lookups ext4: don't orphan or truncate the boot loader inode ext4: grab missed write_count for EXT4_IOC_SWAP_BOOT ext4: optimize block allocation on grow indepth ext4: get rid of code duplication ext4: fix over-defensive complaint after journal abort ext4: fix return value of ext4_do_update_inode ext4: fix mmap data corruption when blocksize < pagesize vfs: fix data corruption when blocksize < pagesize for mmaped data ext4: fold ext4_nojournal_sops into ext4_sops ext4: support freezing ext2 (nojournal) file systems ext4: fold ext4_sync_fs_nojournal() into ext4_sync_fs() ext4: don't check quota format when there are no quota files jbd2: simplify calling convention around __jbd2_journal_clean_checkpoint_list jbd2: avoid pointless scanning of checkpoint lists ...
2014-10-14ext4: check s_chksum_driver when looking for bg csum presenceDarrick J. Wong1-0/+4
Convert the ext4_has_group_desc_csum predicate to look for a checksum driver instead of the metadata_csum flag and change the bg checksum calculation function to look for GDT_CSUM before taking the crc16 path. Without this patch, if we mount with ^uninit_bg,^metadata_csum and later metadata_csum gets turned on by accident, the block group checksum functions will incorrectly assume that checksumming is enabled (metadata_csum) but that crc16 should be used (!s_chksum_driver). This is totally wrong, so fix the predicate and the checksum formula selection. (Granted, if the metadata_csum feature bit gets enabled on a live FS then something underhanded is going on, but we could at least avoid writing garbage into the on-disk fields.) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Cc: stable@vger.kernel.org
2014-10-13ext4: Replace open coded mdata csum feature to helper functionDmitry Monakhov1-10/+5
Besides the fact that this replacement improves code readability it also protects from errors caused direct EXT4_S(sb)->s_es manipulation which may result attempt to use uninitialized csum machinery. #Testcase_BEGIN IMG=/dev/ram0 MNT=/mnt mkfs.ext4 $IMG mount $IMG $MNT #Enable feature directly on disk, on mounted fs tune2fs -O metadata_csum $IMG # Provoke metadata update, likey result in OOPS touch $MNT/test umount $MNT #Testcase_END # Replacement script @@ expression E; @@ - EXT4_HAS_RO_COMPAT_FEATURE(E, EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) + ext4_has_metadata_csum(E) https://bugzilla.kernel.org/show_bug.cgi?id=82201 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-10-05ext4: add ext4_iget_normal() which is to be used for dir tree lookupsTheodore Ts'o1-1/+1
If there is a corrupted file system which has directory entries that point at reserved, metadata inodes, prohibit them from being used by treating them the same way we treat Boot Loader inodes --- that is, mark them to be bad inodes. This prohibits them from being opened, deleted, or modified via chmod, chown, utimes, etc. In particular, this prevents a corrupted file system which has a directory entry which points at the journal inode from being deleted and its blocks released, after which point Much Hilarity Ensues. Reported-by: Sami Liedes <sami.liedes@iki.fi> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-09-24Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block into for-3.18Tejun Heo1-2/+3
This is to receive 0a30288da1ae ("blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe") which implements __percpu_ref_kill_expedited() to work around SCSI blk-mq stall. The commit reverted and patches to implement proper fix will be added. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de>
2014-09-18ext4: fold ext4_nojournal_sops into ext4_sopsTheodore Ts'o1-26/+1
There's no longer any need to have a separate set of super_operations for nojournal mode. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-18ext4: support freezing ext2 (nojournal) file systemsTheodore Ts'o1-11/+16
Through an oversight, when we added nojournal support to ext4, we didn't add support to allow file system freezing. This is relatively easy to add, so let's do it. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Dexuan Cui <decui@microsoft.com>
2014-09-18ext4: fold ext4_sync_fs_nojournal() into ext4_sync_fs()Theodore Ts'o1-23/+13
This allows us to eliminate duplicate code, and eventually allow us to also fold ext4_sops and ext4_nojournal_sops together. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-18ext4: don't check quota format when there are no quota filesJan Kara1-7/+0
The check whether quota format is set even though there are no quota files with journalled quota is pointless and it actually makes it impossible to turn off journalled quotas (as there's no way to unset journalled quota format). Just remove the check. CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-16ext4: explicitly inform user about orphan list cleanupDmitry Monakhov1-1/+1
Production fs likely compiled/mounted w/o jbd debugging, so orphan list clearing will be silent. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11ext4: validate external journal superblock checksumDarrick J. Wong1-0/+9
If the external journal device has metadata_csum enabled, verify that the superblock checksum matches the block before we try to mount. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11jbd2: fix journal checksum feature flag handlingDarrick J. Wong1-5/+6
Clear all three journal checksum feature flags before turning on whichever journal checksum options we want. Rearrange the error checking so that newer flags get complained about first. Reported-by: TR Reardon <thomas_reardon@hotmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11ext4: provide separate operations for sysfs feature filesLukas Czerner1-1/+17
Currently sysfs feature files uses ext4_attr_ops as the file operations to show/store data. However the feature files is not supposed to contain any data at all, the sole existence of the file means that the module support the feature. Moreover, none of the sysfs feature attributes actually register show/store functions so that would not be a problem. However if a sysfs feature attribute register a show or store function we might be in trouble because the kobject in this case is _not_ embedded in the ext4_sb_info structure as ext4_attr_show/store expect. So just to be safe, provide separate empty sysfs_ops to use in ext4_feat_ktype. This might safe us from potential problems in the future. As a bonus we can "store" something more descriptive than nothing in the files, so let it contain "enabled" to make it clear that the feature is really present in the module. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11ext4: add sysfs entry showing whether the fs contains errorsLukas Czerner1-0/+31
Currently there is no easy way to tell that the mounted file system contains errors other than checking for log messages, or reading the information directly from superblock. This patch adds new sysfs entries: errors_count (number of fs errors we encounter) first_error_time (unix timestamp for the first error we see) last_error_time (unix timestamp for the last error we see) If the file system is not marked as containing errors then any of the file will return 0. Otherwise it will contain valid information. More details about the errors should as always be found in the logs. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-11ext4: don't use MAXQUOTAS valueJan Kara1-11/+11
MAXQUOTAS value defines maximum number of quota types VFS supports. This isn't necessarily the number of types ext4 supports. Although ext4 will support project quotas, use ext4 private definition for consistency with other filesystems. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-08percpu_counter: add @gfp to percpu_counter_init()Tejun Heo1-5/+9
Percpu allocator now supports allocation mask. Add @gfp to percpu_counter_init() so that !GFP_KERNEL allocation masks can be used with percpu_counters too. We could have left percpu_counter_init() alone and added percpu_counter_init_gfp(); however, the number of users isn't that high and introducing _gfp variants to all percpu data structures would be quite ugly, so let's just do the conversion. This is the one with the most users. Other percpu data structures are a lot easier to convert. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jan Kara <jack@suse.cz> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: x86@kernel.org Cc: Jens Axboe <axboe@kernel.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org>
2014-09-04ext4: use non-movable memory for the ext4 superblockGioh Kim1-3/+3
Since the ext4 superblock is not released until the file system is unmounted, allocate the buffer cache entry for the ext4 superblock out of the non-moveable are to allow page migrations and thus CMA allocations to more easily succeed if the CMA area is limited. Signed-off-by: Gioh Kim <gioh.kim@lge.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2014-09-01ext4: track extent status tree shrinker delay staticticsZheng Liu1-8/+3
This commit adds some statictics in extent status tree shrinker. The purpose to add these is that we want to collect more details when we encounter a stall caused by extent status tree shrinker. Here we count the following statictics: stats: the number of all objects on all extent status trees the number of reclaimable objects on lru list cache hits/misses the last sorted interval the number of inodes on lru list average: scan time for shrinking some objects the number of shrunk objects maximum: the inode that has max nr. of objects on lru list the maximum scan time for shrinking some objects The output looks like below: $ cat /proc/fs/ext4/sda1/es_shrinker_info stats: 28228 objects 6341 reclaimable objects 5281/631 cache hits/misses 586 ms last sorted interval 250 inodes on lru list average: 153 us scan time 128 shrunk objects maximum: 255 inode (255 objects, 198 reclaimable) 125723 us max scan time If the lru list has never been sorted, the following line will not be printed: 586ms last sorted interval If there is an empty lru list, the following lines also will not be printed: 250 inodes on lru list ... maximum: 255 inode (255 objects, 198 reclaimable) 0 us max scan time Meanwhile in this commit a new trace point is defined to print some details in __ext4_es_shrink(). Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01ext4: enable block_validity by defaultDarrick J. Wong1-2/+2
Enable by default the block_validity feature, which checks for collisions between newly allocated blocks and critical system metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29ext4: convert ext4_bread() to use the ERR_PTR conventionTheodore Ts'o1-10/+8
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-28jbd2: fix descriptor block size handling errors with journal_csumDarrick J. Wong1-2/+3
It turns out that there are some serious problems with the on-disk format of journal checksum v2. The foremost is that the function to calculate descriptor tag size returns sizes that are too big. This causes alignment issues on some architectures and is compounded by the fact that some parts of jbd2 use the structure size (incorrectly) to determine the presence of a 64bit journal instead of checking the feature flags. Therefore, introduce journal checksum v3, which enlarges the descriptor block tag format to allow for full 32-bit checksums of journal blocks, fix the journal tag function to return the correct sizes, and fix the jbd2 recovery code to use feature flags to determine 64bitness. Add a few function helpers so we don't have to open-code quite so many pieces. Switching to a 16-byte block size was found to increase journal size overhead by a maximum of 0.1%, to convert a 32-bit journal with no checksumming to a 32-bit journal with checksum v3 enabled. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reported-by: TR Reardon <thomas_reardon@hotmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-07-15ext4: rearrange initialization to fix EXT4FS_DEBUGTheodore Ts'o1-49/+39
The EXT4FS_DEBUG is a *very* developer specific #ifdef designed for ext4 developers only. (You have to modify fs/ext4/ext4.h to enable it.) Rearrange how we initialize data structures to avoid calling ext4_count_free_clusters() until the multiblock allocator has been initialized. This also allows us to only call ext4_count_free_clusters() once, and simplifies the code somewhat. (Thanks to Chen Gang <gang.chen.5i5j@gmail.com> for pointing out a !CONFIG_SMP compile breakage in the original patch.) Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
2014-07-11ext4: revert commit which was causing fs corruption after journal replaysTheodore Ts'o1-27/+24
Commit 007649375f6af2 ("ext4: initialize multi-block allocator before checking block descriptors") causes the block group descriptor's count of the number of free blocks to become inconsistent with the number of free blocks in the allocation bitmap. This is a harmless form of fs corruption, but it causes the kernel to potentially remount the file system read-only, or to panic, depending on the file systems's error behavior. Thanks to Eric Whitney for his tireless work to reproduce and to find the guilty commit. Fixes: 007649375f6af2 ("ext4: initialize multi-block allocator before checking block descriptors" Cc: stable@vger.kernel.org # 3.15 Reported-by: David Jander <david@protonic.nl> Reported-by: Matteo Croce <technoboy85@gmail.com> Tested-by: Eric Whitney <enwlinux@gmail.com> Suggested-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-07-05ext4: disable synchronous transaction batching if max_batch_time==0Eric Sandeen1-2/+0
The mount manpage says of the max_batch_time option, This optimization can be turned off entirely by setting max_batch_time to 0. But the code doesn't do that. So fix the code to do that. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-07-05ext4: clarify error count warning messagesTheodore Ts'o1-3/+4
Make it clear that values printed are times, and that it is error since last fsck. Also add note about fsck version required. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: stable@vger.kernel.org
2014-05-12ext4: add missing BUFFER_TRACE before ext4_journal_get_write_accessliang xie1-0/+1
Make them more consistently Signed-off-by: xieliang <xieliang@xiaomi.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-05-12ext4: remove unnecessary double parenthesesLukas Czerner1-1/+1
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-05-12ext4: make local functions staticStephen Hemminger1-2/+2
I have been running make namespacecheck to look for unneeded globals, and found these in ext4. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-05-12ext4: find the group descriptors on a 1k-block bigalloc,meta_bg filesystemDarrick J. Wong1-0/+10
On a filesystem with a 1k block size, the group descriptors live in block 2, not block 1. If the filesystem has bigalloc,meta_bg set, however, the calculation of the group descriptor table location does not take this into account and returns the wrong block number. Fix the calculation to return the correct value for this case. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-21ext4: add a new spinlock i_raw_lock to protect the ext4's raw inodeTheodore Ts'o1-0/+1
To avoid potential data races, use a spinlock which protects the raw (on-disk) inode. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2014-04-20ext4: rename uninitialized extents to unwrittenLukas Czerner1-1/+1
Currently in ext4 there is quite a mess when it comes to naming unwritten extents. Sometimes we call it uninitialized and sometimes we refer to it as unwritten. The right name for the extent which has been allocated but does not contain any written data is _unwritten_. Other file systems are using this name consistently, even the buffer head state refers to it as unwritten. We need to fix this confusion in ext4. This commit changes every reference to an uninitialized extent (meaning allocated but unwritten) to unwritten extent. This includes comments, function names and variable names. It even covers abbreviation of the word uninitialized (such as uninit) and some misspellings. This commit does not change any of the code paths at all. This has been confirmed by comparing md5sums of the assembly code of each object file after all the function names were stripped from it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-07ext4: initialize multi-block allocator before checking block descriptorsAzat Khuzhin1-24/+27
With EXT4FS_DEBUG ext4_count_free_clusters() will call ext4_read_block_bitmap() without s_group_info initialized, so we need to initialize multi-block allocator before. And dependencies that must be solved, to allow this: - multi-block allocator needs in group descriptors - need to install s_op before initializing multi-block allocator, because in ext4_mb_init_backend() new inode is created. - initialize number of group desc blocks (s_gdb_count) otherwise number of clusters returned by ext4_free_clusters_after_init() is not correct. (see ext4_bg_num_gdb_nometa()) Here is the stack backtrace: (gdb) bt #0 ext4_get_group_info (group=0, sb=0xffff880079a10000) at ext4.h:2430 #1 ext4_validate_block_bitmap (sb=sb@entry=0xffff880079a10000, desc=desc@entry=0xffff880056510000, block_group=block_group@entry=0, bh=bh@entry=0xffff88007bf2b2d8) at balloc.c:358 #2 0xffffffff81232202 in ext4_wait_block_bitmap (sb=sb@entry=0xffff880079a10000, block_group=block_group@entry=0, bh=bh@entry=0xffff88007bf2b2d8) at balloc.c:476 #3 0xffffffff81232eaf in ext4_read_block_bitmap (sb=sb@entry=0xffff880079a10000, block_group=block_group@entry=0) at balloc.c:489 #4 0xffffffff81232fc0 in ext4_count_free_clusters (sb=sb@entry=0xffff880079a10000) at balloc.c:665 #5 0xffffffff81259ffa in ext4_check_descriptors (first_not_zeroed=<synthetic pointer>, sb=0xffff880079a10000) at super.c:2143 #6 ext4_fill_super (sb=sb@entry=0xffff880079a10000, data=<optimized out>, data@entry=0x0 <irq_stack_union>, silent=silent@entry=0) at super.c:3851 ... Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-24ext4: optimize Hurd tests when reading/writing inodesTheodore Ts'o1-0/+10
Set a in-memory superblock flag to indicate whether the file system is designed to support the Hurd. Also, add a sanity check to make sure the 64-bit feature is not set for Hurd file systems, since i_file_acl_high conflicts with a Hurd-specific field. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-18ext4: each filesystem creates and uses its own mb_cacheT Makphaibulchoke1-8/+17
This patch adds new interfaces to create and destory cache, ext4_xattr_create_cache() and ext4_xattr_destroy_cache(), and remove the cache creation and destory calls from ex4_init_xattr() and ext4_exitxattr() in fs/ext4/xattr.c. fs/ext4/super.c has been changed so that when a filesystem is mounted a cache is allocated and attched to its ext4_sb_info structure. fs/mbcache.c has been changed so that only one slab allocator is allocated and used by all mbcache structures. Signed-off-by: T. Makphaibulchoke <tmac@hp.com>
2014-03-13ext4: only call sync_filesystm() when remounting read-onlyTheodore Ts'o1-2/+3
This is the only time it is required for ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-13fs: push sync_filesystem() down to the file system's remount_fs()Theodore Ts'o1-0/+2
Previously, the no-op "mount -o mount /dev/xxx" operation when the file system is already mounted read-write causes an implied, unconditional syncfs(). This seems pretty stupid, and it's certainly documented or guaraunteed to do this, nor is it particularly useful, except in the case where the file system was mounted rw and is getting remounted read-only. However, it's possible that there might be some file systems that are actually depending on this behavior. In most file systems, it's probably fine to only call sync_filesystem() when transitioning from read-write to read-only, and there are some file systems where this is not needed at all (for example, for a pseudo-filesystem or something like romfs). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Jan Kara <jack@suse.cz> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anders Larsen <al@alarsen.net> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: xfs@oss.sgi.com Cc: linux-btrfs@vger.kernel.org Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-f2fs-devel@lists.sourceforge.net Cc: fuse-devel@lists.sourceforge.net Cc: cluster-devel@redhat.com Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: linux-nfs@vger.kernel.org Cc: linux-nilfs@vger.kernel.org Cc: linux-ntfs-dev@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Cc: reiserfs-devel@vger.kernel.org
2014-02-17ext4: Add __init marking to init_inodecacheFabian Frederick1-1/+1
init_inodecache is only called by __init init_ext4_fs. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-02-12ext4: don't try to modify s_flags if the the file system is read-onlyTheodore Ts'o1-7/+13
If an ext4 file system is created by some tool other than mke2fs (perhaps by someone who has a pathalogical fear of the GPL) that doesn't set one or the other of the EXT2_FLAGS_{UN}SIGNED_HASH flags, and that file system is then mounted read-only, don't try to modify the s_flags field. Otherwise, if dm_verity is in use, the superblock will change, causing an dm_verity failure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-12-08ext4: Do not reserve clusters when fs doesn't support extentsJan Kara1-4/+13
When the filesystem doesn't support extents (like in ext2/3 compatibility modes), there is no need to reserve any clusters. Space estimates for writing are exact, hole punching doesn't need new metadata, and there are no unwritten extents to convert. This fixes a problem when filesystem still having some free space when accessed with a native ext2/3 driver suddently reports ENOSPC when accessed with ext4 driver. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-12-08ext4: fix del_timer() misuse for ->s_err_reportAl Viro1-2/+2
That thing should be del_timer_sync(); consider what happens if ext4_put_super() call of del_timer() happens to come just as it's getting run on another CPU. Since that timer reschedules itself to run next day, you are pretty much guaranteed that you'll end up with kfree'd scheduled timer, with usual fun consequences. AFAICS, that's -stable fodder all way back to 2010... [the second del_timer_sync() is almost certainly not needed, but it doesn't hurt either] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-11-08ext4: use prandom_u32() instead of get_random_bytes()Theodore Ts'o1-5/+2
Many of the uses of get_random_bytes() do not actually need cryptographically secure random numbers. Replace those uses with a call to prandom_u32(), which is faster and which doesn't consume entropy from the /dev/random driver. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-10-17ext4: add ratelimiting to ext4 messagesTheodore Ts'o1-58/+94
In the case of a storage device that suddenly disappears, or in the case of significant file system corruption, this can result in a huge flood of messages being sent to the console. This can overflow the file system containing /var/log/messages, or if a serial console is configured, this can slow down the system so much that a hardware watchdog can end up triggering forcing a system reboot. Google-Bug-Id: 7258357 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-09-06Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivialLinus Torvalds1-2/+2
Pull trivial tree from Jiri Kosina: "The usual trivial updates all over the tree -- mostly typo fixes and documentation updates" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits) doc: Documentation/cputopology.txt fix typo treewide: Convert retrun typos to return Fix comment typo for init_cma_reserved_pageblock Documentation/trace: Correcting and extending tracepoint documentation mm/hotplug: fix a typo in Documentation/memory-hotplug.txt power: Documentation: Update s2ram link doc: fix a typo in Documentation/00-INDEX Documentation/printk-formats.txt: No casts needed for u64/s64 doc: Fix typo "is is" in Documentations treewide: Fix printks with 0x%# zram: doc fixes Documentation/kmemcheck: update kmemcheck documentation doc: documentation/hwspinlock.txt fix typo PM / Hibernate: add section for resume options doc: filesystems : Fix typo in Documentations/filesystems scsi/megaraid fixed several typos in comments ppc: init_32: Fix error typo "CONFIG_START_KERNEL" treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks page_isolation: Fix a comment typo in test_pages_isolated() doc: fix a typo about irq affinity ...
2013-09-05Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-16/+0
Pull vfs pile 1 from Al Viro: "Unfortunately, this merge window it'll have a be a lot of small piles - my fault, actually, for not keeping #for-next in anything that would resemble a sane shape ;-/ This pile: assorted fixes (the first 3 are -stable fodder, IMO) and cleanups + %pd/%pD formats (dentry/file pathname, up to 4 last components) + several long-standing patches from various folks. There definitely will be a lot more (starting with Miklos' check_submount_and_drop() series)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) direct-io: Handle O_(D)SYNC AIO direct-io: Implement generic deferred AIO completions add formats for dentry/file pathnames kvm eventfd: switch to fdget powerpc kvm: use fdget switch fchmod() to fdget switch epoll_ctl() to fdget switch copy_module_from_fd() to fdget git simplify nilfs check for busy subtree ibmasmfs: don't bother passing superblock when not needed don't pass superblock to hypfs_{mkdir,create*} don't pass superblock to hypfs_diag_create_files don't pass superblock to hypfs_vm_create_files() oprofile: get rid of pointless forward declarations of struct super_block oprofilefs_create_...() do not need superblock argument oprofilefs_mkdir() doesn't need superblock argument don't bother with passing superblock to oprofile_create_stats_files() oprofile: don't bother with passing superblock to ->create_files() don't bother passing sb to oprofile_create_files() coh901318: don't open-code simple_read_from_buffer() ...
2013-09-04direct-io: Implement generic deferred AIO completionsChristoph Hellwig1-16/+0
Add support to the core direct-io code to defer AIO completions to user context using a workqueue. This replaces opencoded and less efficient code in XFS and ext4 (we save a memory allocation for each direct IO) and will be needed to properly support O_(D)SYNC for AIO. The communication between the filesystem and the direct I/O code requires a new buffer head flag, which is a bit ugly but not avoidable until the direct I/O code stops abusing the buffer_head structure for communicating with the filesystems. Currently this creates a per-superblock unbound workqueue for these completions, which is taken from an earlier patch by Jan Kara. I'm not really convinced about this use and would prefer a "normal" global workqueue with a high concurrency limit, but this needs further discussion. JK: Fixed ext4 part, dynamic allocation of the workqueue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-08-28ext4: allow specifying external journal by pathname mount optionEric Sandeen1-3/+44
It's always been a hassle that if an external journal's device number changes, the filesystem won't mount. And since boot-time enumeration can change, device number changes aren't unusual. The current mechanism to update the journal location is by passing in a mount option w/ a new devnum, but that's a hassle; it's a manual approach, fixing things after the fact. Adding a mount option, "-o journal_path=/dev/$DEVICE" would help, since then we can do i.e. # mount -o journal_path=/dev/disk/by-label/$JOURNAL_LABEL ... and it'll mount even if the devnum has changed, as shown here: # losetup /dev/loop0 journalfile # mke2fs -L mylabel-journal -O journal_dev /dev/loop0 # mkfs.ext4 -L mylabel -J device=/dev/loop0 /dev/sdb1 Change the journal device number: # losetup -d /dev/loop0 # losetup /dev/loop1 journalfile And today it will fail: # mount /dev/sdb1 /mnt/test mount: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so # dmesg | tail -n 1 [17343.240702] EXT4-fs (sdb1): error: couldn't read superblock of external journal But with this new mount option, we can specify the new path: # mount -o journal_path=/dev/loop1 /dev/sdb1 /mnt/test # (which does update the encoded device number, incidentally): # umount /dev/sdb1 # dumpe2fs -h /dev/sdb1 | grep "Journal device" dumpe2fs 1.41.12 (17-May-2010) Journal device: 0x0701 But best of all we can just always mount by journal-path, and it'll always work: # mount -o journal_path=/dev/disk/by-label/mylabel-journal /dev/sdb1 /mnt/test # So the journal_path option can be specified in fstab, and as long as the disk is available somewhere, and findable by label (or by UUID), we can mount. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2013-08-20treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacksJoe Perches1-2/+2
Don't emit OOM warnings when k.alloc calls fail when there there is a v.alloc immediately afterwards. Converted a kmalloc/vmalloc with memset to kzalloc/vzalloc. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jiri Kosina <jkosina@suse.cz>