jbd: Write journal superblock with WRITE_FUA after checkpointing

If the journal superblock is written only to the disk's caches and another transaction starts reusing the space of a transaction cleaned from the log, blocks of the new transaction can reach the disk before the journal superblock does. If power fails in that case, the subsequent journal replay would still try to replay the old transaction, but some of its blocks may already have been overwritten by the new transaction. For this reason we must use WRITE_FUA when updating the log tail, and we must first write the new log tail to disk and update the in-memory information only after that.

Signed-off-by: Jan Kara <jack@suse.cz>
author: Jan Kara <jack@suse.cz> 2012-04-07 11:05:19 +0200
committer: Jan Kara <jack@suse.cz> 2012-05-15 23:34:37 +0200
commit: fd2cbd4dfa3db477dd6226d387d3f1911d36a6a9 (patch)
tree: b0ada946d14cdcf5db6da2d177be9590a3449e9a /fs/jbd/checkpoint.c
parent: jbd: protect all log tail updates with j_checkpoint_mutex (diff)
1 files changed, 10 insertions, 13 deletions
diff --git a/fs/jbd/checkpoint.c b/fs/jbd/checkpoint.c
index 80c85f3e087f..08c03044abdd 100644
--- a/fs/jbd/checkpoint.c
+++ b/fs/jbd/checkpoint.c
@@ -508,20 +508,19 @@ int cleanup_journal_tail(journal_t *journal)
 	/*
 	 * We need to make sure that any blocks that were recently written out
 	 * --- perhaps by log_do_checkpoint() --- are flushed out before we
-	 * drop the transactions from the journal. It's unlikely this will be
-	 * necessary, especially with an appropriately sized journal, but we
-	 * need this to guarantee correctness.  Fortunately
-	 * cleanup_journal_tail() doesn't get called all that often.
+	 * drop the transactions from the journal. Similarly we need to be sure
+	 * superblock makes it to disk before next transaction starts reusing
+	 * freed space (otherwise we could replay some blocks of the new
+	 * transaction thinking they belong to the old one). So we use
+	 * WRITE_FLUSH_FUA. It's unlikely this will be necessary, especially
+	 * with an appropriately sized journal, but we need this to guarantee
+	 * correctness.  Fortunately cleanup_journal_tail() doesn't get called
+	 * all that often.
 	 */
-	if (journal->j_flags & JFS_BARRIER)
-		blkdev_issue_flush(journal->j_fs_dev, GFP_KERNEL, NULL);
+	journal_update_sb_log_tail(journal, first_tid, blocknr,
+				   WRITE_FLUSH_FUA);
 
 	spin_lock(&journal->j_state_lock);
-	if (!tid_gt(first_tid, journal->j_tail_sequence)) {
-		spin_unlock(&journal->j_state_lock);
-		/* Someone else cleaned up journal so return 0 */
-		return 0;
-	}
 	/* OK, update the superblock to recover the freed space.
 	 * Physical blocks come first: have we wrapped beyond the end of
 	 * the log?  */
@@ -539,8 +538,6 @@ int cleanup_journal_tail(journal_t *journal)
 	journal->j_tail_sequence = first_tid;
 	journal->j_tail = blocknr;
 	spin_unlock(&journal->j_state_lock);
-	if (!(journal->j_flags & JFS_ABORT))
-		journal_update_sb_log_tail(journal);
 	return 0;
 }
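
The ordering argument in the commit message can be sketched as a small userspace model. This is an illustration only, not kernel code: every `toy_*` name is hypothetical, the "disk" is a struct with a durable field and a volatile cache field, and log wraparound is ignored. It assumes that a FUA-style write is on stable media when it completes, while a plain write may sit in the volatile cache and be lost on power failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all names below are hypothetical, not jbd APIs. */
struct toy_disk {
	uint32_t sb_tail_durable; /* log tail that survives power loss */
	uint32_t sb_tail_cached;  /* log tail held in the volatile write cache */
	bool cache_dirty;         /* cached value not yet on stable media */
};

struct toy_journal {
	struct toy_disk disk;
	uint32_t tail; /* in-memory log tail; space before it may be reused */
};

/* Plain write: completes once the data is in the volatile cache. */
static void toy_write_sb(struct toy_disk *d, uint32_t tail)
{
	d->sb_tail_cached = tail;
	d->cache_dirty = true;
}

/* FUA-style write: the data is on stable media when the write completes. */
static void toy_write_sb_fua(struct toy_disk *d, uint32_t tail)
{
	d->sb_tail_durable = tail;
	d->sb_tail_cached = tail;
	d->cache_dirty = false;
}

/* Power failure: cached data is lost; return the tail that replay will see. */
static uint32_t toy_power_fail(struct toy_disk *d)
{
	d->sb_tail_cached = d->sb_tail_durable;
	d->cache_dirty = false;
	return d->sb_tail_durable;
}

/* Old ordering: publish the new tail in memory, then write without FUA. */
static void toy_cleanup_tail_old(struct toy_journal *j, uint32_t new_tail)
{
	j->tail = new_tail;
	toy_write_sb(&j->disk, new_tail);
}

/* New ordering: make the tail durable first, then publish it in memory. */
static void toy_cleanup_tail_new(struct toy_journal *j, uint32_t new_tail)
{
	toy_write_sb_fua(&j->disk, new_tail);
	j->tail = new_tail;
}

/*
 * Replay after a crash is safe only if the durable tail is not behind the
 * in-memory tail: otherwise replay starts in space that new transactions
 * were already allowed to overwrite.
 */
static bool toy_replay_is_safe(struct toy_journal *j)
{
	uint32_t in_memory_tail = j->tail;

	return toy_power_fail(&j->disk) >= in_memory_tail;
}
```

Advancing the tail from 10 to 20 with `toy_cleanup_tail_old()` and then calling `toy_replay_is_safe()` reports an unsafe state, because the durable tail is still 10 while new transactions may already reuse blocks 10..19. The same sequence with `toy_cleanup_tail_new()` reports a safe state.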