author	Linus Torvalds <torvalds@linux-foundation.org>	2022-04-01 13:50:50 -0700
committer	Linus Torvalds <torvalds@linux-foundation.org>	2022-04-01 13:50:50 -0700
commit	cda4351252e710507d32178dfbb5a7f0f92488f8 (patch)
tree	71d39c61132eebd26b3814b9b3c5d36e51400fe5 /mm
parent	Merge tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarray (diff)
parent	btrfs: Remove a use of PAGE_SIZE in btrfs_invalidate_folio() (diff)
Merge tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache
Pull more filesystem folio updates from Matthew Wilcox:
 "A mixture of odd changes that didn't quite make it into the original
  pull and fixes for things that did. Also the readpages changes had to
  wait for the NFS tree to be pulled first.

   - Remove ->readpages infrastructure

   - Remove AOP_FLAG_CONT_EXPAND

   - Move read_descriptor_t to networking code

   - Pass the iocb to generic_perform_write

   - Minor updates to iomap, btrfs, ext4, f2fs, ntfs"

* tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache:
  btrfs: Remove a use of PAGE_SIZE in btrfs_invalidate_folio()
  ntfs: Correct mark_ntfs_record_dirty() folio conversion
  f2fs: Get the superblock from the mapping instead of the page
  f2fs: Correct f2fs_dirty_data_folio() conversion
  ext4: Correct ext4_journalled_dirty_folio() conversion
  filemap: Remove AOP_FLAG_CONT_EXPAND
  fs: Pass an iocb to generic_perform_write()
  fs, net: Move read_descriptor_t to net.h
  fs: Remove read_actor_t
  iomap: Simplify is_partially_uptodate a little
  readahead: Update comments
  mm: remove the skip_page argument to read_pages
  mm: remove the pages argument to read_pages
  fs: Remove ->readpages address space operation
  readahead: Remove read_cache_pages()
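For orientation, the central change of this pull is the removal of the ->readpages address space operation in favour of ->readahead. A minimal sketch (not from this series; the example_* names are hypothetical) of how a filesystem's operations table looks after the conversion:

	/*
	 * Illustrative only: a filesystem that used to provide ->readpages
	 * now wires up ->readahead instead. Assumes the usual kernel headers
	 * (linux/fs.h, linux/pagemap.h).
	 */
	static const struct address_space_operations example_aops = {
		.readpage	= example_readpage,
		.readahead	= example_readahead,	/* replaces the removed ->readpages */
		.writepage	= example_writepage,
	};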
Diffstat (limited to 'mm')
-rw-r--r--	mm/filemap.c	12
-rw-r--r--	mm/readahead.c	204
2 files changed, 62 insertions, 154 deletions
diff --git a/mm/filemap.c b/mm/filemap.c
index 647d72bf23b6..3a5ffb5587cd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2538,7 +2538,7 @@ static int filemap_create_folio(struct file *file,
* the page cache as the locked folio would then be enough to
* synchronize with hole punching. But there are code paths
* such as filemap_update_page() filling in partially uptodate
- * pages or ->readpages() that need to hold invalidate_lock
+ * pages or ->readahead() that need to hold invalidate_lock
* while mapping blocks for IO so let's hold the lock here as
* well to keep locking rules simple.
*/
@@ -3752,9 +3752,10 @@ out:
}
EXPORT_SYMBOL(generic_file_direct_write);
-ssize_t generic_perform_write(struct file *file,
- struct iov_iter *i, loff_t pos)
+ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
{
+ struct file *file = iocb->ki_filp;
+ loff_t pos = iocb->ki_pos;
struct address_space *mapping = file->f_mapping;
const struct address_space_operations *a_ops = mapping->a_ops;
long status = 0;
@@ -3884,7 +3885,8 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (written < 0 || !iov_iter_count(from) || IS_DAX(inode))
goto out;
- status = generic_perform_write(file, from, pos = iocb->ki_pos);
+ pos = iocb->ki_pos;
+ status = generic_perform_write(iocb, from);
/*
* If generic_perform_write() returned a synchronous error
* then we want to return the number of bytes which were
@@ -3916,7 +3918,7 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
*/
}
} else {
- written = generic_perform_write(file, from, iocb->ki_pos);
+ written = generic_perform_write(iocb, from);
if (likely(written > 0))
iocb->ki_pos += written;
}
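The filemap.c hunks above change generic_perform_write() to take the kiocb rather than a (file, pos) pair. A minimal sketch (not from this series; example_file_write_iter() is a hypothetical caller) of a buffered write path using the new signature:

	/*
	 * Sketch of a ->write_iter after the signature change: the whole kiocb
	 * is passed down and the write position comes from iocb->ki_pos.
	 */
	static ssize_t example_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
	{
		struct inode *inode = file_inode(iocb->ki_filp);
		ssize_t written;

		inode_lock(inode);
		written = generic_write_checks(iocb, from);
		if (written > 0)
			written = generic_perform_write(iocb, from);
		if (written > 0)
			iocb->ki_pos += written;
		inode_unlock(inode);
		return written;
	}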
diff --git a/mm/readahead.c b/mm/readahead.c
index d3a47546d17d..8e3775829513 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -13,29 +13,29 @@
*
* Readahead is used to read content into the page cache before it is
* explicitly requested by the application. Readahead only ever
- * attempts to read pages that are not yet in the page cache. If a
- * page is present but not up-to-date, readahead will not try to read
+ * attempts to read folios that are not yet in the page cache. If a
+ * folio is present but not up-to-date, readahead will not try to read
* it. In that case a simple ->readpage() will be requested.
*
* Readahead is triggered when an application read request (whether a
- * systemcall or a page fault) finds that the requested page is not in
+ * system call or a page fault) finds that the requested folio is not in
* the page cache, or that it is in the page cache and has the
- * %PG_readahead flag set. This flag indicates that the page was loaded
- * as part of a previous read-ahead request and now that it has been
- * accessed, it is time for the next read-ahead.
+ * readahead flag set. This flag indicates that the folio was read
+ * as part of a previous readahead request and now that it has been
+ * accessed, it is time for the next readahead.
*
* Each readahead request is partly synchronous read, and partly async
- * read-ahead. This is reflected in the struct file_ra_state which
- * contains ->size being to total number of pages, and ->async_size
- * which is the number of pages in the async section. The first page in
- * this async section will have %PG_readahead set as a trigger for a
- * subsequent read ahead. Once a series of sequential reads has been
+ * readahead. This is reflected in the struct file_ra_state which
+ * contains ->size being the total number of pages, and ->async_size
+ * which is the number of pages in the async section. The readahead
+ * flag will be set on the first folio in this async section to trigger
+ * a subsequent readahead. Once a series of sequential reads has been
* established, there should be no need for a synchronous component and
- * all read ahead request will be fully asynchronous.
+ * all readahead request will be fully asynchronous.
*
- * When either of the triggers causes a readahead, three numbers need to
- * be determined: the start of the region, the size of the region, and
- * the size of the async tail.
+ * When either of the triggers causes a readahead, three numbers need
+ * to be determined: the start of the region to read, the size of the
+ * region, and the size of the async tail.
*
* The start of the region is simply the first page address at or after
* the accessed address, which is not currently populated in the page
@@ -45,14 +45,14 @@
* was explicitly requested from the determined request size, unless
* this would be less than zero - then zero is used. NOTE THIS
* CALCULATION IS WRONG WHEN THE START OF THE REGION IS NOT THE ACCESSED
- * PAGE.
+ * PAGE. ALSO THIS CALCULATION IS NOT USED CONSISTENTLY.
*
* The size of the region is normally determined from the size of the
* previous readahead which loaded the preceding pages. This may be
* discovered from the struct file_ra_state for simple sequential reads,
* or from examining the state of the page cache when multiple
* sequential reads are interleaved. Specifically: where the readahead
- * was triggered by the %PG_readahead flag, the size of the previous
+ * was triggered by the readahead flag, the size of the previous
* readahead is assumed to be the number of pages from the triggering
* page to the start of the new readahead. In these cases, the size of
* the previous readahead is scaled, often doubled, for the new
@@ -65,52 +65,52 @@
* larger than the current request, and it is not scaled up, unless it
* is at the start of file.
*
- * In general read ahead is accelerated at the start of the file, as
+ * In general readahead is accelerated at the start of the file, as
* reads from there are often sequential. There are other minor
- * adjustments to the read ahead size in various special cases and these
+ * adjustments to the readahead size in various special cases and these
* are best discovered by reading the code.
*
- * The above calculation determines the readahead, to which any requested
- * read size may be added.
+ * The above calculation, based on the previous readahead size,
+ * determines the size of the readahead, to which any requested read
+ * size may be added.
*
* Readahead requests are sent to the filesystem using the ->readahead()
* address space operation, for which mpage_readahead() is a canonical
* implementation. ->readahead() should normally initiate reads on all
- * pages, but may fail to read any or all pages without causing an IO
+ * folios, but may fail to read any or all folios without causing an I/O
* error. The page cache reading code will issue a ->readpage() request
- * for any page which ->readahead() does not provided, and only an error
+ * for any folio which ->readahead() did not read, and only an error
* from this will be final.
*
- * ->readahead() will generally call readahead_page() repeatedly to get
- * each page from those prepared for read ahead. It may fail to read a
- * page by:
+ * ->readahead() will generally call readahead_folio() repeatedly to get
+ * each folio from those prepared for readahead. It may fail to read a
+ * folio by:
*
- * * not calling readahead_page() sufficiently many times, effectively
- * ignoring some pages, as might be appropriate if the path to
+ * * not calling readahead_folio() sufficiently many times, effectively
+ * ignoring some folios, as might be appropriate if the path to
* storage is congested.
*
- * * failing to actually submit a read request for a given page,
+ * * failing to actually submit a read request for a given folio,
* possibly due to insufficient resources, or
*
* * getting an error during subsequent processing of a request.
*
- * In the last two cases, the page should be unlocked to indicate that
- * the read attempt has failed. In the first case the page will be
- * unlocked by the caller.
+ * In the last two cases, the folio should be unlocked by the filesystem
+ * to indicate that the read attempt has failed. In the first case the
+ * folio will be unlocked by the VFS.
*
- * Those pages not in the final ``async_size`` of the request should be
+ * Those folios not in the final ``async_size`` of the request should be
* considered to be important and ->readahead() should not fail them due
* to congestion or temporary resource unavailability, but should wait
* for necessary resources (e.g. memory or indexing information) to
- * become available. Pages in the final ``async_size`` may be
+ * become available. Folios in the final ``async_size`` may be
* considered less urgent and failure to read them is more acceptable.
- * In this case it is best to use delete_from_page_cache() to remove the
- * pages from the page cache as is automatically done for pages that
- * were not fetched with readahead_page(). This will allow a
- * subsequent synchronous read ahead request to try them again. If they
+ * In this case it is best to use filemap_remove_folio() to remove the
+ * folios from the page cache as is automatically done for folios that
+ * were not fetched with readahead_folio(). This will allow a
+ * subsequent synchronous readahead request to try them again. If they
* are left in the page cache, then they will be read individually using
- * ->readpage().
- *
+ * ->readpage() which may be less efficient.
*/
#include <linux/kernel.h>
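Following the rules spelled out in the documentation comment above, a minimal ->readahead() implementation might look like the sketch below (not part of this series; example_submit_read() is a hypothetical helper that queues the I/O and unlocks the folio on completion):

	static void example_readahead(struct readahead_control *rac)
	{
		struct folio *folio;

		/* readahead_folio() hands back each locked folio prepared for us. */
		while ((folio = readahead_folio(rac))) {
			if (example_submit_read(folio) < 0)
				folio_unlock(folio);	/* read attempt failed; VFS falls back to ->readpage() */
		}
	}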
@@ -142,91 +142,14 @@ file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
}
EXPORT_SYMBOL_GPL(file_ra_state_init);
-/*
- * see if a page needs releasing upon read_cache_pages() failure
- * - the caller of read_cache_pages() may have set PG_private or PG_fscache
- * before calling, such as the NFS fs marking pages that are cached locally
- * on disk, thus we need to give the fs a chance to clean up in the event of
- * an error
- */
-static void read_cache_pages_invalidate_page(struct address_space *mapping,
- struct page *page)
-{
- if (page_has_private(page)) {
- if (!trylock_page(page))
- BUG();
- page->mapping = mapping;
- folio_invalidate(page_folio(page), 0, PAGE_SIZE);
- page->mapping = NULL;
- unlock_page(page);
- }
- put_page(page);
-}
-
-/*
- * release a list of pages, invalidating them first if need be
- */
-static void read_cache_pages_invalidate_pages(struct address_space *mapping,
- struct list_head *pages)
-{
- struct page *victim;
-
- while (!list_empty(pages)) {
- victim = lru_to_page(pages);
- list_del(&victim->lru);
- read_cache_pages_invalidate_page(mapping, victim);
- }
-}
-
-/**
- * read_cache_pages - populate an address space with some pages & start reads against them
- * @mapping: the address_space
- * @pages: The address of a list_head which contains the target pages. These
- * pages have their ->index populated and are otherwise uninitialised.
- * @filler: callback routine for filling a single page.
- * @data: private data for the callback routine.
- *
- * Hides the details of the LRU cache etc from the filesystems.
- *
- * Returns: %0 on success, error return by @filler otherwise
- */
-int read_cache_pages(struct address_space *mapping, struct list_head *pages,
- int (*filler)(void *, struct page *), void *data)
-{
- struct page *page;
- int ret = 0;
-
- while (!list_empty(pages)) {
- page = lru_to_page(pages);
- list_del(&page->lru);
- if (add_to_page_cache_lru(page, mapping, page->index,
- readahead_gfp_mask(mapping))) {
- read_cache_pages_invalidate_page(mapping, page);
- continue;
- }
- put_page(page);
-
- ret = filler(data, page);
- if (unlikely(ret)) {
- read_cache_pages_invalidate_pages(mapping, pages);
- break;
- }
- task_io_account_read(PAGE_SIZE);
- }
- return ret;
-}
-
-EXPORT_SYMBOL(read_cache_pages);
-
-static void read_pages(struct readahead_control *rac, struct list_head *pages,
- bool skip_page)
+static void read_pages(struct readahead_control *rac)
{
const struct address_space_operations *aops = rac->mapping->a_ops;
struct page *page;
struct blk_plug plug;
if (!readahead_count(rac))
- goto out;
+ return;
blk_start_plug(&plug);
@@ -234,7 +157,7 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages,
aops->readahead(rac);
/*
* Clean up the remaining pages. The sizes in ->ra
- * maybe be used to size next read-ahead, so make sure
+ * may be used to size the next readahead, so make sure
* they accurately reflect what happened.
*/
while ((page = readahead_page(rac))) {
@@ -246,13 +169,6 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages,
unlock_page(page);
put_page(page);
}
- } else if (aops->readpages) {
- aops->readpages(rac->file, rac->mapping, pages,
- readahead_count(rac));
- /* Clean up the remaining pages */
- put_pages_list(pages);
- rac->_index += rac->_nr_pages;
- rac->_nr_pages = 0;
} else {
while ((page = readahead_page(rac))) {
aops->readpage(rac->file, page);
@@ -262,12 +178,7 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages,
blk_finish_plug(&plug);
- BUG_ON(pages && !list_empty(pages));
BUG_ON(readahead_count(rac));
-
-out:
- if (skip_page)
- rac->_index++;
}
/**
@@ -289,7 +200,6 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
{
struct address_space *mapping = ractl->mapping;
unsigned long index = readahead_index(ractl);
- LIST_HEAD(page_pool);
gfp_t gfp_mask = readahead_gfp_mask(mapping);
unsigned long i;
@@ -321,7 +231,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
* have a stable reference to this page, and it's
* not worth getting one just for that.
*/
- read_pages(ractl, &page_pool, true);
+ read_pages(ractl);
+ ractl->_index++;
i = ractl->_index + ractl->_nr_pages - index - 1;
continue;
}
@@ -329,13 +240,11 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
folio = filemap_alloc_folio(gfp_mask, 0);
if (!folio)
break;
- if (mapping->a_ops->readpages) {
- folio->index = index + i;
- list_add(&folio->lru, &page_pool);
- } else if (filemap_add_folio(mapping, folio, index + i,
+ if (filemap_add_folio(mapping, folio, index + i,
gfp_mask) < 0) {
folio_put(folio);
- read_pages(ractl, &page_pool, true);
+ read_pages(ractl);
+ ractl->_index++;
i = ractl->_index + ractl->_nr_pages - index - 1;
continue;
}
@@ -349,7 +258,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
* uptodate then the caller will launch readpage again, and
* will then handle the error.
*/
- read_pages(ractl, &page_pool, false);
+ read_pages(ractl);
filemap_invalidate_unlock_shared(mapping);
memalloc_nofs_restore(nofs);
}
@@ -394,8 +303,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
unsigned long max_pages, index;
- if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages &&
- !mapping->a_ops->readahead))
+ if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readahead))
return;
/*
@@ -512,7 +420,7 @@ static pgoff_t count_history_pages(struct address_space *mapping,
}
/*
- * page cache context based read-ahead
+ * page cache context based readahead
*/
static int try_context_readahead(struct address_space *mapping,
struct file_ra_state *ra,
@@ -624,7 +532,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
ra->async_size += index - limit - 1;
}
- read_pages(ractl, NULL, false);
+ read_pages(ractl);
/*
* If there were already pages in the page cache, then we may have
@@ -763,9 +671,9 @@ void page_cache_sync_ra(struct readahead_control *ractl,
bool do_forced_ra = ractl->file && (ractl->file->f_mode & FMODE_RANDOM);
/*
- * Even if read-ahead is disabled, issue this request as read-ahead
+ * Even if readahead is disabled, issue this request as readahead
* as we'll need it to satisfy the requested range. The forced
- * read-ahead will do the right thing and limit the read to just the
+ * readahead will do the right thing and limit the read to just the
* requested range, which we'll set to 1 page for this case.
*/
if (!ractl->ra->ra_pages || blk_cgroup_congested()) {
@@ -781,7 +689,6 @@ void page_cache_sync_ra(struct readahead_control *ractl,
return;
}
- /* do read-ahead */
ondemand_readahead(ractl, NULL, req_count);
}
EXPORT_SYMBOL_GPL(page_cache_sync_ra);
@@ -789,7 +696,7 @@ EXPORT_SYMBOL_GPL(page_cache_sync_ra);
void page_cache_async_ra(struct readahead_control *ractl,
struct folio *folio, unsigned long req_count)
{
- /* no read-ahead */
+ /* no readahead */
if (!ractl->ra->ra_pages)
return;
@@ -804,7 +711,6 @@ void page_cache_async_ra(struct readahead_control *ractl,
if (blk_cgroup_congested())
return;
- /* do read-ahead */
ondemand_readahead(ractl, folio, req_count);
}
EXPORT_SYMBOL_GPL(page_cache_async_ra);