aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ext4/inode.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-07-26Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writebackLinus Torvalds1-2/+2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback: (27 commits) mm: properly reflect task dirty limits in dirty_exceeded logic writeback: don't busy retry writeback on new/freeing inodes writeback: scale IO chunk size up to half device bandwidth writeback: trace global_dirty_state writeback: introduce max-pause and pass-good dirty limits writeback: introduce smoothed global dirty limit writeback: consolidate variable names in balance_dirty_pages() writeback: show bdi write bandwidth in debugfs writeback: bdi write bandwidth estimation writeback: account per-bdi accumulated written pages writeback: make writeback_control.nr_to_write straight writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr() writeback: trace event writeback_queue_io writeback: trace event writeback_single_inode writeback: remove .nonblocking and .encountered_congestion writeback: remove writeback_control.more_io writeback: skip balance_dirty_pages() for in-memory fs writeback: add bdi_dirty_limit() kernel-doc writeback: avoid extra sync work at enqueue time writeback: elevate queue_io() into wb_writeback() ... Fix up trivial conflicts in fs/fs-writeback.c and mm/filemap.c
2011-07-20fs: move inode_dio_done to the end_io handlerChristoph Hellwig1-0/+5
For filesystems that delay their end_io processing we should keep our i_dio_count until the the processing is done. Enable this by moving the inode_dio_done call to the end_io handler if one exist. Note that the actual move to the workqueue for ext4 and XFS is not done in this patch yet, but left to the filesystem maintainers. At least for XFS it's not needed yet either as XFS has an internal equivalent to i_dio_count. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20fs: simplify the blockdev_direct_IO prototypeChristoph Hellwig1-6/+6
Simple filesystems always pass inode->i_sb_bdev as the block device argument, and never need a end_io handler. Let's simply things for them and for my grepping activity by dropping these arguments. The only thing not falling into that scheme is ext4, which passes and end_io handler without needing special flags (yet), but given how messy the direct I/O code there is use of __blockdev_direct_IO in one instead of two out of three cases isn't going to make a large difference anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20fs: move inode_dio_wait calls into ->setattrChristoph Hellwig1-0/+2
Let filesystems handle waiting for direct I/O requests themselves instead of doing it beforehand. This means filesystem-specific locks to prevent new dio referenes from appearing can be held. This is important to allow generalizing i_dio_count to non-DIO_LOCKING filesystems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20ext4: Rewrite ext4_page_mkwrite() to use generic helpersJan Kara1-51/+55
Rewrite ext4_page_mkwrite() to use __block_page_mkwrite() helper. This removes the need of using i_alloc_sem to avoid races with truncate which seems to be the wrong locking order according to lock ordering documented in mm/rmap.c. Also calling ext4_da_write_begin() as used by the old code seems to be problematic because we can decide to flush delay-allocated blocks which will acquire s_umount semaphore - again creating unpleasant lock dependency if not directly a deadlock. Also add a check for frozen filesystem so that we don't busyloop in page fault when the filesystem is frozen. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-08writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stageWu Fengguang1-2/+2
sync(2) is performed in two stages: the WB_SYNC_NONE sync and the WB_SYNC_ALL sync. Identify the first stage with .tagged_writepages and do livelock prevention for it, too. Jan's commit f446daaea9 ("mm: implement writeback livelock avoidance using page tagging") is a partial fix in that it only fixed the WB_SYNC_ALL phase livelock. Although ext4 is tested to no longer livelock with commit f446daaea9, it may due to some "redirty_tail() after pages_skipped" effect which is by no means a guarantee for _all_ the file systems. Note that writeback_inodes_sb() is called by not only sync(), they are treated the same because the other callers also need livelock prevention. Impact: It changes the order in which pages/inodes are synced to disk. Now in the WB_SYNC_NONE stage, it won't proceed to write the next inode until finished with the current inode. Acked-by: Jan Kara <jack@suse.cz> CC: Dave Chinner <david@fromorbit.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-06-06ext4: fixed tracepoints cleanupLukas Czerner1-1/+1
While creating fixed tracepoints for ext3, basically by porting them from ext4, I found a lot of useless retyping, wrong type usage, useless variable passing and other inconsistencies in the ext4 fixed tracepoint code. This patch cleans the fixed tracepoint code for ext4 and also simplify some of them. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-27fs: pass exact type of data dirties to ->dirty_inodeChristoph Hellwig1-1/+1
Tell the filesystem if we just updated timestamp (I_DIRTY_SYNC) or anything else, so that the filesystem can track internally if it needs to push out a transaction for fdatasync or not. This is just the prototype change with no user for it yet. I plan to push large XFS changes for the next merge window, and getting this trivial infrastructure in this window would help a lot to avoid tree interdependencies. Also remove incorrect comments that ->dirty_inode can't block. That has been changed a long time ago, and many implementations rely on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-25ext4: Convert ext4 to new truncate calling conventionJan Kara1-3/+1
Trivial conversion. Fixup one error handling case calling vmtruncate() and remove ->truncate callback. We also fix a bug that IS_IMMUTABLE and IS_APPEND files could not be truncated during failed writes. In fact, the test can be completely removed as upper layers do necessary permission checks for truncate in do_sys_[f]truncate() and may_open() anyway. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-25ext4: enable "punch hole" functionalityAllison Henderson1-0/+25
This patch adds new routines: "ext4_punch_hole" "ext4_ext_punch_hole" and "ext4_ext_check_cache" fallocate has been modified to call ext4_punch_hole when the punch hole flag is passed. At the moment, we only support punching holes in extents, so this routine is pretty much a wrapper for the ext4_ext_punch_hole routine. The ext4_ext_punch_hole routine first completes all outstanding writes with the associated pages, and then releases them. The unblock aligned data is zeroed, and all blocks in between are punched out. The ext4_ext_check_cache routine is very similar to ext4_ext_in_cache except it accepts a ext4_ext_cache parameter instead of a ext4_extent parameter. This routine is used by ext4_ext_punch_hole to check and see if a block in a hole that has been cached. The ext4_ext_cache parameter is necessary because the members ext4_extent structure are not large enough to hold a 32 bit value. The existing ext4_ext_in_cache routine has become a wrapper to this new function. [ext4 punch hole patch series 5/5 v7] Signed-off-by: Allison Henderson <achender@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>
2011-05-25ext4: add new function ext4_block_zero_page_range()Allison Henderson1-2/+31
This patch modifies the existing ext4_block_truncate_page() function which was used by the truncate code path, and which zeroes out block unaligned data, by adding a new length parameter, and renames it to ext4_block_zero_page_rage(). This function can now be used to zero out the head of a block, the tail of a block, or the middle of a block. The ext4_block_truncate_page() function is now a wrapper to ext4_block_zero_page_range(). [ext4 punch hole patch series 2/5 v7] Signed-off-by: Allison Henderson <achender@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>
2011-05-25ext4: add flag to ext4_has_free_blocksAllison Henderson1-3/+3
This patch adds an allocation request flag to the ext4_has_free_blocks function which enables the use of reserved blocks. This will allow a punch hole to proceed even if the disk is full. Punching a hole may require additional blocks to first split the extents. Because ext4_has_free_blocks is a low level function, the flag needs to be passed down through several functions listed below: ext4_ext_insert_extent ext4_ext_create_new_leaf ext4_ext_grow_indepth ext4_ext_split ext4_ext_new_meta_block ext4_mb_new_blocks ext4_claim_free_blocks ext4_has_free_blocks [ext4 punch hole patch series 1/5 v7] Signed-off-by: Allison Henderson <achender@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>
2011-05-23ext4: use truncate_setsize() unconditionallyTheodore Ts'o1-8/+8
In commit c8d46e41 (ext4: Add flag to files with blocks intentionally past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate() before calling vmtruncate(). This caused any allocated but unwritten blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE flag to be dropped. This was done to make to make sure that EOFBLOCKS_FL would not be cleared while still leaving blocks past i_size allocated. This was not necessary, since ext4_truncate() guarantees that blocks past i_size will be dropped, even in the case where truncate() has increased i_size before calling ext4_truncate(). So fix this by removing the EOFBLOCKS_FL special case treatment in ext4_setattr(). In addition, use truncate_setsize() followed by a call to ext4_truncate() instead of using vmtruncate(). This is more efficient since it skips the call to inode_newsize_ok(), which has been checked already by inode_change_ok(). This is also in a win in the case where EOFBLOCKS_FL is set since it avoids calling ext4_truncate() twice. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-18ext4: wait for writeback to complete while making pages writableDarrick J. Wong1-5/+19
In order to stabilize pages during disk writes, ext4_page_mkwrite must wait for writeback operations to complete before making a page writable. Furthermore, the function must return locked pages, and recheck the writeback status if the page lock is ever dropped. The "someone could wander in" part of this patch was suggested by Chris Mason. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-18ext4: clean up some wait_on_page_writeback callsDarrick J. Wong1-3/+1
wait_on_page_writeback already checks the writeback bit, so callers of it needn't do that test. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-05-09ext4: use s_inodes_per_block directly in __ext4_get_inode_locTao Ma1-1/+1
In __ext4_get_inode_loc, we calculate inodes_per_block every time by EXT4_BLOCK_SIZE(sb) / EXT4_INODE_SIZE(sb). AFAICS, this function is a hot path for ext4, so we'd better use s_inodes_per_block directly instead of calculating every time. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-04-11Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds1-12/+23
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix data corruption regression by reverting commit 6de9843dab3f ext4: Allow indirect-block file to grow the file size to max file size ext4: allow an active handle to be started when freezing ext4: sync the directory inode in ext4_sync_parent() ext4: init timer earlier to avoid a kernel panic in __save_error_info jbd2: fix potential memory leak on transaction commit ext4: fix a double free in ext4_register_li_request ext4: fix credits computing for indirect mapped files ext4: remove unnecessary [cm]time update of quota file jbd2: move bdget out of critical section
2011-04-10ext4: fix data corruption regression by reverting commit 6de9843dab3fTheodore Ts'o1-0/+1
Revert commit 6de9843dab3f2a1d4d66d80aa9e5782f80977d20, since it caused a data corruption regression with BitTorrent downloads. Thanks to Damien for discovering and bisecting to find the problem commit. https://bugzilla.kernel.org/show_bug.cgi?id=32972 Reported-by: Damien Grassart <damien@grassart.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-04-10ext4: Allow indirect-block file to grow the file size to max file sizeKazuya Mio1-6/+17
We can create 4402345721856 byte file with indirect block mapping. However, if we grow an indirect-block file to the size with ftruncate(), we can see an ext4 warning. The following patch fixes this problem. How to reproduce: # dd if=/dev/zero of=/mnt/mp1/hoge bs=1 count=0 seek=4402345721856 0+0 records in 0+0 records out 0 bytes (0 B) copied, 0.000221428 s, 0.0 kB/s # tail -n 1 /var/log/messages Nov 25 15:10:27 test kernel: EXT4-fs warning (device sda8): ext4_block_to_path:345: block 1074791436 > max in inode 12 Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-04-04ext4: fix credits computing for indirect mapped filesYongqiang Yang1-6/+5
When writing a contiguous set of blocks, two indirect blocks could be needed depending on how the blocks are aligned, so we need to increase the number of credits needed by one. [ Also fixed a another bug which could further underestimate the number of journal credits needed by 1; the code was using integer division instead of DIV_ROUND_UP() -- tytso] Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
2011-03-31Fix common misspellingsLucas De Marchi1-10/+10
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-25Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds1-234/+176
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (43 commits) ext4: fix a BUG in mb_mark_used during trim. ext4: unused variables cleanup in fs/ext4/extents.c ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep() ext4: add more tracepoints and use dev_t in the trace buffer ext4: don't kfree uninitialized s_group_info members ext4: add missing space in printk's in __ext4_grp_locked_error() ext4: add FITRIM to compat_ioctl. ext4: handle errors in ext4_clear_blocks() ext4: unify the ext4_handle_release_buffer() api ext4: handle errors in ext4_rename jbd2: add COW fields to struct jbd2_journal_handle jbd2: add the b_cow_tid field to journal_head struct ext4: Initialize fsync transaction ids in ext4_new_inode() ext4: Use single thread to perform DIO unwritten convertion ext4: optimize ext4_bio_write_page() when no extent conversion is needed ext4: skip orphan cleanup if fs has unknown ROCOMPAT features ext4: use the nblocks arg to ext4_truncate_restart_trans() ext4: fix missing iput of root inode for some mount error paths ext4: make FIEMAP and delayed allocation play well together ext4: suppress verbose debugging information if malloc-debug is off ... Fi up conflicts in fs/ext4/super.c due to workqueue changes
2011-03-23ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep()Feng Tang1-1/+0
The map_bh() call will have already set the buffer_head to mapped. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-03-21ext4: add more tracepoints and use dev_t in the trace bufferJiaying Zhang1-3/+21
- Add more ext4 tracepoints. - Change ext4 tracepoints to use dev_t field with MAJOR/MINOR macros so that we can save 4 bytes in the ring buffer on some platforms. - Add sync_mode to ext4_da_writepages, ext4_da_write_pages, and ext4_da_writepages_result tracepoints. Also remove for_reclaim field from ext4_da_writepages since it is usually not very useful. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-03-20ext4: handle errors in ext4_clear_blocks()Amir Goldstein1-20/+26
Checking return code from ext4_journal_get_write_access() is important with snapshots, because this function invokes COW, so may return new errors, such as ENOSPC. ext4_clear_blocks() now returns < 0 for fatal errors, in which case, ext4_free_data() is aborted. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-03-10block: remove per-queue pluggingJens Axboe1-4/+0
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-02-27ext4: use the nblocks arg to ext4_truncate_restart_trans()Amir Goldstein1-1/+1
nblocks is passed into ext4_truncate_restart_trans() from ext4_ext_truncate_extend_restart() with a value different from the default blocks_for_truncate(), but is being ignored. The two other calls to ext4_truncate_restart_trans() already pass the default value, which is then being recalculated inside the function. Fix the problem by using the passed argument. Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
2011-02-26ext4: move setup of the mpd structure to write_cache_pages_da()Theodore Ts'o1-22/+7
Move the initialization of all of the fields of the mpd structure to write_cache_pages_da(). This simplifies the code considerably. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: don't lock the next page in write_cache_pages if not neededTheodore Ts'o1-17/+10
If we have accumulated a contiguous region of memory to be written out, and the next page can added to this region, don't bother locking (and then unlocking the page) before writing out the memory. In the unlikely event that the next page was being written back by some other CPU, we can also skip waiting that page to finish writeback. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: remove page_skipped hackery in ext4_da_writepages()Theodore Ts'o1-10/+0
Because the ext4 page writeback codepath had been prematurely calling clear_page_dirty_for_io(), if it turned out that a particular page couldn't be written out during a particular pass of write_cache_pages_da(), the page would have to get redirtied by calling redirty_pages_for_writeback(). Not only was this wasted work, but redirty_page_for_writeback() would increment wbc->pages_skipped to signal to writeback_sb_inodes() that buffers were locked, and that it should skip this inode until later. Since this signal was incorrect in ext4's case --- which was caused by ext4's historically incorrect use of write_cache_pages() --- ext4_da_writepages() saved and restored wbc->skipped_pages to avoid confusing writeback_sb_inodes(). Now that we've fixed ext4 to call clear_page_dirty_for_io() right before initiating the page I/O, we can nuke the page_skipped save/restore hackery, and breathe a sigh of relief. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: clear the dirty bit for a page in writeback at the last minuteTheodore Ts'o1-17/+11
Move when we call clear_page_dirty_for_io() to just before we actually write the page. This simplifies the code somewhat, and avoids marking pages as clean and then needing to remark them as dirty later. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: simple cleanups to write_cache_pages_da()Theodore Ts'o1-67/+48
Eliminate duplicate code, unneeded variables, etc., to make it easier to understand the code. No behavioral changes were made in this patch. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: fold __mpage_da_writepage() into write_cache_pages_da()Theodore Ts'o1-115/+91
Fold the __mpage_da_writepage() function into write_cache_pages_da(). This will give us opportunities to clean up and simplify the resulting code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: fix ext4_da_block_invalidatepages() to handle page range properlyCurt Wohlgemuth1-7/+4
If ext4_da_block_invalidatepages() is called because of a failure from ext4_map_blocks() in mpage_da_map_and_submit(), it's supposed to clean up -- including unlock -- all the pages in the mpd structure. But these values may not match up, even on a system in which block size == page size: mpd->b_blocknr != mpd->first_page mpd->b_size != (mpd->next_page - mpd->first_page) ext4_da_block_invalidatepages() has been using b_blocknr and b_size; this patch changes it to use first_page and next_page. Tested: I injected a small number (5%) of failures in ext4_map_blocks() in the case that the flags contain EXT4_GET_BLOCKS_DELALLOC_RESERVE, and ran fsstress on this kernel. Without this patch, I got hung tasks every time. With this patch, I see no hangs in many runs of fsstress. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-26ext4: mark multi-page IO complete on mapping failureCurt Wohlgemuth1-0/+3
In mpage_da_map_and_submit(), if we have a delayed block allocation failure from ext4_map_blocks(), we need to mark the IO as complete, by setting mpd->io_done = 1; Otherwise, we could end up submitting the pages in an outer loop; since they are unlocked on mapping failure in ext4_da_block_invalidatepages(), this will cause a bug check in mpage_da_submit_io(). I tested this by injected failures into ext4_map_blocks(). Without this patch, a simple fsstress run will bug check; with the patch, it works fine. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-02-21ext4: Fix sparse warning: Using plain integer as NULL pointerPeter Huewe1-9/+9
This patch fixes the warning "Using plain integer as NULL pointer", generated by sparse, by replacing the offending 0s with NULL. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-13Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivialLinus Torvalds1-3/+3
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13fs/ext4/inode.c: use pr_warn_ratelimited()Andrew Morton1-1/+2
pr_warning_ratelimited() doesn't exist. Also include printk.h, which defines these things. Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-10ext4: fix memory leak in ext4_free_branchesTheodore Ts'o1-0/+1
Commit 40389687 moved a call to ext4_forget() out of ext4_free_branches and let ext4_free_blocks() handle calling bforget(). But that change unfortunately did not replace the call to ext4_forget() with brelse(), which was needed to drop the in-use count of the indirect block's buffer head, which lead to a memory leak when deleting files that used indirect blocks. Fix this. Thanks to Hugh Dickins for pointing this out. Cc: stable@kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: add error checking to calls to ext4_handle_dirty_metadata()Theodore Ts'o1-4/+17
Call ext4_std_error() in various places when we can't bail out cleanly, so the file system can be marked as in error. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: dynamically allocate the jbd2_inode in ext4_inode_info as necessaryTheodore Ts'o1-5/+12
Replace the jbd2_inode structure (which is 48 bytes) with a pointer and only allocate the jbd2_inode when it is needed --- that is, when the file system has a journal present and the inode has been opened for writing. This allows us to further slim down the ext4_inode_info structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: drop i_state_flags on architectures with 64-bit longsTheodore Ts'o1-2/+2
We can store the dynamic inode state flags in the high bits of EXT4_I(inode)->i_flags, and eliminate i_state_flags. This saves 8 bytes from the size of ext4_inode_info structure, which when multiplied by the number of the number of in the inode cache, can save a lot of memory. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: use ext4_lblk_t instead of sector_t for logical blocksTheodore Ts'o1-2/+2
This fixes a number of places where we used sector_t instead of ext4_lblk_t for logical blocks, which for ext4 are still 32-bit data types. No point wasting space in the ext4_inode_info structure, and requiring 64-bit arithmetic on 32-bit systems, when it isn't necessary. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10ext4: replace i_delalloc_reserved_flag with EXT4_STATE_DELALLOC_RESERVEDTheodore Ts'o1-3/+3
Remove the short element i_delalloc_reserved_flag from the ext4_inode_info structure and replace it a new bit in i_state_flags. Since we have an ext4_inode_info for every ext4 inode cached in the inode cache, any savings we can produce here is a very good thing from a memory utilization perspective. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-22Merge branch 'master' into for-nextJiri Kosina1-3/+7
Conflicts: MAINTAINERS arch/arm/mach-omap2/pm24xx.c drivers/scsi/bfa/bfa_fcpim.c Needed to update to apply fixes for which the old branch was too outdated.
2010-12-17ext4: Use pr_warning_ratelimited() instead of printk_ratelimit()Theodore Ts'o1-2/+2
printk_ratelimit() is deprecated since it is a global instead of a per-printk ratelimit. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-16ext4: Fix up comments in inode.cTheodore Ts'o1-4/+13
This fixes up some broken argument descriptions that Namhyung Kim had originally submitted for ext3. This fixes the comments that were still applicable in ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-14ext4: Turn off multiple page-io submission by defaultTheodore Ts'o1-1/+4
Jon Nelson has found a test case which causes postgresql to fail with the error: psql:t.sql:4: ERROR: invalid page header in block 38269 of relation base/16384/16581 Under memory pressure, it looks like part of a file can end up getting replaced by zero's. Until we can figure out the cause, we'll roll back the change and use block_write_full_page() instead of ext4_bio_write_page(). The new, more efficient writing function can be used via the mount option mblk_io_submit, so we can test and fix the new page I/O code. To reproduce the problem, install postgres 8.4 or 9.0, and pin enough memory such that the system just at the end of triggering writeback before running the following sql script: begin; create temporary table foo as select x as a, ARRAY[x] as b FROM generate_series(1, 10000000 ) AS x; create index foo_a_idx on foo (a); create index foo_b_idx on foo USING GIN (b); rollback; If the temporary table is created on a hard drive partition which is encrypted using dm_crypt, then under memory pressure, approximately 30-40% of the time, pgsql will issue the above failure. This patch should fix this problem, and the problem will come back if the file system is mounted with the mblk_io_submit mount option. Reported-by: Jon Nelson <jnelson@jamponi.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-11-15ext4: fix redirty_page_for_writepage() typo in commentWu Fengguang1-1/+1
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-08Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds1-0/+3
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Add new ext4 inode tracepoints ext4: Don't call sb_issue_discard() in ext4_free_blocks() ext4: do not try to grab the s_umount semaphore in ext4_quota_off ext4: fix potential race when freeing ext4_io_page structures ext4: handle writeback of inodes which are being freed ext4: initialize the percpu counters before replaying the journal ext4: "ret" may be used uninitialized in ext4_lazyinit_thread() ext4: fix lazyinit hang after removing request