aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include/linux/mm_types.h
diff options
context:
space:
mode:
authorJohn Hubbard <jhubbard@nvidia.com>2020-04-01 21:05:33 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2020-04-02 09:35:27 -0700
commit47e29d32afba11b13efb51f03154a8cf22fb4360 (patch)
tree9d72042c87bfd951e56cf680c0372ed347240300 /include/linux/mm_types.h
parentmm/gup: track FOLL_PIN pages (diff)
downloadwireguard-linux-47e29d32afba11b13efb51f03154a8cf22fb4360.tar.xz
wireguard-linux-47e29d32afba11b13efb51f03154a8cf22fb4360.zip
mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages
For huge pages (and in fact, any compound page), the GUP_PIN_COUNTING_BIAS scheme tends to overflow too easily, each tail page increments the head page->_refcount by GUP_PIN_COUNTING_BIAS (1024). That limits the number of huge pages that can be pinned. This patch removes that limitation, by using an exact form of pin counting for compound pages of order > 1. The "order > 1" is required because this approach uses the 3rd struct page in the compound page, and order 1 compound pages only have two pages, so that won't work there. A new struct page field, hpage_pinned_refcount, has been added, replacing a padding field in the union (so no new space is used). This enhancement also has a useful side effect: huge pages and compound pages (of order > 1) do not suffer from the "potential false positives" problem that is discussed in the page_dma_pinned() comment block. That is because these compound pages have extra space for tracking things, so they get exact pin counts instead of overloading page->_refcount. Documentation/core-api/pin_user_pages.rst is updated accordingly. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200211001536.1027652-8-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/linux/mm_types.h')
-rw-r--r--include/linux/mm_types.h7
1 files changed, 6 insertions, 1 deletions
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c28911c3afa8..dd555e6d23f3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -137,7 +137,7 @@ struct page {
};
struct { /* Second tail page of compound page */
unsigned long _compound_pad_1; /* compound_head */
- unsigned long _compound_pad_2;
+ atomic_t hpage_pinned_refcount;
/* For both global and memcg */
struct list_head deferred_list;
};
@@ -226,6 +226,11 @@ static inline atomic_t *compound_mapcount_ptr(struct page *page)
return &page[1].compound_mapcount;
}
+static inline atomic_t *compound_pincount_ptr(struct page *page)
+{
+ return &page[2].hpage_pinned_refcount;
+}
+
/*
* Used for sizing the vmemmap region on some architectures
*/