From a03208652dad18232e9ec3432df69f9c19c856ec Mon Sep 17 00:00:00 2001 From: Dave Hansen Date: Thu, 30 Jan 2014 15:46:09 -0800 Subject: mm/slub.c: fix page->_count corruption (again) Commit abca7c496584 ("mm: fix slab->page _count corruption when using slub") notes that we can not _set_ a page->counters directly, except when using a real double-cmpxchg. Doing so can lose updates to ->_count. That is an absolute rule: You may not *set* page->counters except via a cmpxchg. Commit abca7c496584 fixed this for the folks who have the slub cmpxchg_double code turned off at compile time, but it left the bad case alone. It can still be reached, and the same bug triggered in two cases: 1. Turning on slub debugging at runtime, which is available on the distro kernels that I looked at. 2. On 64-bit CPUs with no CMPXCHG16B (some early AMD x86-64 cpus, evidently) There are at least 3 ways we could fix this: 1. Take all of the exising calls to cmpxchg_double_slab() and __cmpxchg_double_slab() and convert them to take an old, new and target 'struct page'. 2. Do (1), but with the newly-introduced 'slub_data'. 3. Do some magic inside the two cmpxchg...slab() functions to pull the counters out of new_counters and only set those fields in page->{inuse,frozen,objects}. I've done (2) as well, but it's a bunch more code. This patch is an attempt at (3). This was the most straightforward and foolproof way that I could think to do this. This would also technically allow us to get rid of the ugly #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) in 'struct page', but leaving it alone has the added benefit that 'counters' stays 'unsigned' instead of 'unsigned long', so all the copies that the slub code does stay a bit smaller. Signed-off-by: Dave Hansen Cc: Christoph Lameter Cc: Pekka Enberg Cc: Matt Mackall Cc: Pravin B Shelar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/slub.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) (limited to 'mm') diff --git a/mm/slub.c b/mm/slub.c index 545a170ebf9f..2b1a6970e46f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -355,6 +355,21 @@ static __always_inline void slab_unlock(struct page *page) __bit_spin_unlock(PG_locked, &page->flags); } +static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) +{ + struct page tmp; + tmp.counters = counters_new; + /* + * page->counters can cover frozen/inuse/objects as well + * as page->_count. If we assign to ->counters directly + * we run the risk of losing updates to page->_count, so + * be careful and only assign to the fields we need. + */ + page->frozen = tmp.frozen; + page->inuse = tmp.inuse; + page->objects = tmp.objects; +} + /* Interrupts must be disabled (for the fallback code to work right) */ static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct page *page, void *freelist_old, unsigned long counters_old, @@ -376,7 +391,7 @@ static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct page *page if (page->freelist == freelist_old && page->counters == counters_old) { page->freelist = freelist_new; - page->counters = counters_new; + set_page_slub_counters(page, counters_new); slab_unlock(page); return 1; } @@ -415,7 +430,7 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page, if (page->freelist == freelist_old && page->counters == counters_old) { page->freelist = freelist_new; - page->counters = counters_new; + set_page_slub_counters(page, counters_new); slab_unlock(page); local_irq_restore(flags); return 1; -- cgit v1.2.3-59-g8ed1b