bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()

The kworker routine update_writeback_rate() is scheduled to update the writeback rate every 5 seconds by default. Before calling __update_writeback_rate() to do the real work, the kworker routine must hold the semaphore dc->writeback_lock.

At the same time, the bcache writeback thread routine bch_writeback_thread() also needs to hold dc->writeback_lock before flushing dirty data back into the backing device. If the dirty data set is large, it may take bch_writeback_thread() a very long time to scan all dirty buckets and release dc->writeback_lock. In that case update_writeback_rate() can be starved long enough for the kernel to report a soft lockup warning starting like:
  watchdog: BUG: soft lockup - CPU#246 stuck for 23s! [kworker/246:31:179713]

Such a soft lockup condition is unnecessary, because after the writeback thread finishes its job and releases dc->writeback_lock, the kworker update_writeback_rate() can continue to work and everything is fine.

This patch avoids the unnecessary soft lockup by the following method,
- Add a new member to struct cached_dev
  - dc->rate_update_retry (0 by default)
- In update_writeback_rate(), call down_read_trylock(&dc->writeback_lock) first; if it fails, lock contention is happening.
- If dc->rate_update_retry <= BCH_WBRATE_UPDATE_MAX_SKIPS (15), don't acquire the lock and reschedule the kworker for the next try.
- If dc->rate_update_retry > BCH_WBRATE_UPDATE_MAX_SKIPS, don't retry anymore and call down_read(&dc->writeback_lock) to wait for the lock.

With the above method, in the worst case update_writeback_rate() may retry for 1+ minutes before blocking on dc->writeback_lock by calling down_read(). For a 4TB cache device with 1TB dirty data, 90%+ of the unnecessary soft lockup warning messages can be avoided.

While update_writeback_rate() is retrying to acquire dc->writeback_lock, the writeback rate of course cannot be updated. That is fair, because when the kworker is blocked on contention for dc->writeback_lock, the writeback rate cannot be updated either.

This change follows Jens Axboe's suggestion for a clearer and simpler version.

Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20220528124550.32834-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
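
The retry policy described in the commit message can be modelled in userspace as a small decision function. This is only a sketch under stated assumptions, not the kernel code: struct cached_dev_model, enum wbrate_action, wbrate_decide() and wbrate_skips_before_block() are hypothetical names, the result of down_read_trylock() is passed in as a boolean instead of taking a real lock, and the point at which the counter is reset (here, after the blocking path) is an assumption, since the message does not say when it happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as the constant this patch adds to bcache.h. */
#define BCH_WBRATE_UPDATE_MAX_SKIPS 15

/* Hypothetical stand-in for the single struct cached_dev field used here. */
struct cached_dev_model {
	unsigned int rate_update_retry;	/* 0 by default */
};

/* Outcome of one kworker run in this model. */
enum wbrate_action {
	WBRATE_UPDATE_NOW,	/* trylock succeeded: update the rate */
	WBRATE_SKIP,		/* lock busy, budget left: only reschedule */
	WBRATE_BLOCK,		/* lock busy, budget used up: wait for the lock */
};

/*
 * Decide what one run does. trylock_ok models the return value of
 * down_read_trylock(&dc->writeback_lock).
 */
static enum wbrate_action wbrate_decide(struct cached_dev_model *dc,
					bool trylock_ok)
{
	assert(dc != NULL);

	if (trylock_ok)
		return WBRATE_UPDATE_NOW;

	dc->rate_update_retry++;
	if (dc->rate_update_retry <= BCH_WBRATE_UPDATE_MAX_SKIPS)
		return WBRATE_SKIP;

	/* Assumption: the counter starts over once the run has blocked. */
	dc->rate_update_retry = 0;
	return WBRATE_BLOCK;
}

/* Count consecutive skipped runs while the lock stays contended. */
static unsigned int wbrate_skips_before_block(void)
{
	struct cached_dev_model dc = { .rate_update_retry = 0 };
	unsigned int skips = 0;

	while (wbrate_decide(&dc, false) == WBRATE_SKIP)
		skips++;
	return skips;
}
```

In this model a permanently contended lock gives BCH_WBRATE_UPDATE_MAX_SKIPS skipped runs before the next run blocks. At the default 5-second interval that is 15 * 5 = 75 seconds of skipped updates, in line with the "1+ minutes" figure in the message.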
author: Coly Li <colyli@suse.de> 2022-05-28 20:45:50 +0800
committer: Jens Axboe <axboe@kernel.dk> 2022-05-28 06:48:26 -0600
commit: a1a2d8f0162b27e85e7ce0ae6a35c96a490e0559 (patch)
tree: 5eac047e8f21b317acddfd009b7ad4577520a87c /drivers/md/bcache/bcache.h
parent: nbd: use pr_err to output error message (diff)
download: linux-dev-a1a2d8f0162b27e85e7ce0ae6a35c96a490e0559.tar.xz
linux-dev-a1a2d8f0162b27e85e7ce0ae6a35c96a490e0559.zip
1 files changed, 7 insertions, 0 deletions
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 9ed9c955add7..2acda9cea0f9 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -395,6 +395,13 @@ struct cached_dev {
 	atomic_t		io_errors;
 	unsigned int		error_limit;
 	unsigned int		offline_seconds;
+
+	/*
+	 * Retry to update writeback_rate if contention happens for
+	 * down_read(dc->writeback_lock) in update_writeback_rate()
+	 */
+#define BCH_WBRATE_UPDATE_MAX_SKIPS	15
+	unsigned int		rate_update_retry;
 };
 
 enum alloc_reserve {