aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/arch/powerpc (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-06-05Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds282-6196/+6535
Pull powerpc updates from Michael Ellerman: - Support for userspace to send requests directly to the on-chip GZIP accelerator on Power9. - Rework of our lockless page table walking (__find_linux_pte()) to make it safe against parallel page table manipulations without relying on an IPI for serialisation. - A series of fixes & enhancements to make our machine check handling more robust. - Lots of plumbing to add support for "prefixed" (64-bit) instructions on Power10. - Support for using huge pages for the linear mapping on 8xx (32-bit). - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver. - Removal of some obsolete 40x platforms and associated cruft. - Initial support for booting on Power10. - Lots of other small features, cleanups & fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram Sang, Xiongfeng Wang. * tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits) powerpc/pseries: Make vio and ibmebus initcalls pseries specific cxl: Remove dead Kconfig options powerpc: Add POWER10 architected mode powerpc/dt_cpu_ftrs: Add MMA feature powerpc/dt_cpu_ftrs: Enable Prefixed Instructions powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected powerpc: Add support for ISA v3.1 powerpc: Add new HWCAP bits powerpc/64s: Don't set FSCR bits in INIT_THREAD powerpc/64s: Save FSCR to init_task.thread.fscr after feature init powerpc/64s: Don't let DT CPU features set FSCR_DSCR powerpc/64s: Don't init FSCR_DSCR in __init_FSCR() powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations powerpc/module_64: Consolidate ftrace code powerpc/32: Disable KASAN with pages bigger than 16k powerpc/uaccess: Don't set KUEP by default on book3s/32 powerpc/uaccess: Don't set KUAP by default on book3s/32 powerpc/8xx: Reduce time spent in allow_user_access() and friends ...
2020-06-04powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()David Hildenbrand1-23/+3
In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory blocks as removable"), the user space interface to compute whether a memory block can be offlined (exposed via /sys/devices/system/memory/memoryX/removable) has effectively been deprecated. We want to remove the leftovers of the kernel implementation. When offlining a memory block (mm/memory_hotplug.c:__offline_pages()), we'll start by: 1. Testing if it contains any holes, and reject if so 2. Testing if pages belong to different zones, and reject if so 3. Isolating the page range, checking if it contains any unmovable pages Using is_mem_section_removable() before trying to offline is not only racy, it can easily result in false positives/negatives. Let's stop manually checking is_mem_section_removable(), and let device_offline() handle it completely instead. We can remove the racy is_mem_section_removable() implementation next. We now take more locks (e.g., memory hotplug lock when offlining and the zone lock when isolating), but maybe we should optimize that implementation instead if this ever becomes a real problem (after all, memory unplug is already an expensive operation). We started using is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries: Implement memory hotplug remove in the kernel"), with the initial hotremove support of lmbs. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richard.weiyang@gmail.com> Link: http://lkml.kernel.org/r/20200407135416.24093-2-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04kmap: consolidate kmap_prot definitionsIra Weiny1-1/+0
Most architectures define kmap_prot to be PAGE_KERNEL. Let sparc and xtensa define there own and define PAGE_KERNEL as the default if not overridden. [akpm@linux-foundation.org: coding style fixes] Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-16-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kmap: define kmap_atomic_prot() for all arch'sIra Weiny1-17/+0
To support kmap_atomic_prot(), all architectures need to support protections passed to their kmap_atomic_high() function. Pass protections into kmap_atomic_high() and change the name to kmap_atomic_high_prot() to match. Then define kmap_atomic_prot() as a core function which calls kmap_atomic_high_prot() when needed. Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with the default kmap_prot exported by the architectures. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-11-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kmap: ensure kmap_prot visibilityIra Weiny2-4/+1
We want to support kmap_atomic_prot() on all architectures and it makes sense to define kmap_atomic() to use the default kmap_prot. So we ensure all arch's have a globally available kmap_prot either as a define or exported symbol. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-9-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kunmap_atomic: consolidate duplicate codeIra Weiny2-9/+3
Every single architecture (including !CONFIG_HIGHMEM) calls... pagefault_enable(); preempt_enable(); ... before returning from __kunmap_atomic(). Lift this code into the kunmap_atomic() macro. While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to be consistent. [ira.weiny@intel.com: don't enable pagefault/preempt twice] Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com [akpm@linux-foundation.org: coding style fixes] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kmap_atomic: consolidate duplicate codeIra Weiny2-8/+2
Every arch has the same code to ensure atomic operations and a check for !HIGHMEM page. Remove the duplicate code by defining a core kmap_atomic() which only calls the arch specific kmap_atomic_high() when the page is high memory. [akpm@linux-foundation.org: coding style fixes] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-7-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04{x86,powerpc,microblaze}/kmap: move preempt disableIra Weiny2-8/+12
During this kmap() conversion series we must maintain bisect-ability. To do this, kmap_atomic_prot() in x86, powerpc, and microblaze need to remain functional. Create a temporary inline version of kmap_atomic_prot within these architectures so we can rework their kmap_atomic() calls and then lift kmap_atomic_prot() to the core. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-6-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kunmap: remove duplicate kunmap implementationsIra Weiny1-9/+0
All architectures do exactly the same thing for kunmap(); remove all the duplicate definitions and lift the call to the core. This also has the benefit of changing kmap_unmap() on a number of architectures to be an inline call rather than an actual function. [akpm@linux-foundation.org: fix CONFIG_HIGHMEM=n build on various architectures] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-5-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kmap: remove redundant arch specific kmapsIra Weiny1-9/+0
The kmap code for all the architectures is almost 100% identical. Lift the common code to the core. Use ARCH_HAS_KMAP_FLUSH_TLB to indicate if an arch defines kmap_flush_tlb() and call if if needed. This also has the benefit of changing kmap() on a number of architectures to be an inline call rather than an actual function. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-4-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04arch/kmap: remove BUG_ON()Ira Weiny1-1/+1
Patch series "Remove duplicated kmap code", v3. The kmap infrastructure has been copied almost verbatim to every architecture. This series consolidates obvious duplicated code by defining core functions which call into the architectures only when needed. Some of the k[un]map_atomic() implementations have some similarities but the similarities were not sufficient to warrant further changes. In addition we remove a duplicate implementation of kmap() in DRM. This patch (of 15): Replace the use of BUG_ON(in_interrupt()) in the kmap() and kunmap() in favor of might_sleep(). Besides the benefits of might_sleep(), this normalizes the implementations such that they can be made generic in subsequent patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Link: http://lkml.kernel.org/r/20200507150004.1423069-1-ira.weiny@intel.com Link: http://lkml.kernel.org/r/20200507150004.1423069-2-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04mm/debug: add tests validating architecture page table helpersAnshuman Khandual1-0/+1
This adds tests which will validate architecture page table helpers and other accessors in their compliance with expected generic MM semantics. This will help various architectures in validating changes to existing page table helpers or addition of new ones. This test covers basic page table entry transformations including but not limited to old, young, dirty, clean, write, write protect etc at various level along with populating intermediate entries with next page table page and validating them. Test page table pages are allocated from system memory with required size and alignments. The mapped pfns at page table levels are derived from a real pfn representing a valid kernel text symbol. This test gets called via late_initcall(). This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any architecture, which is willing to subscribe this test will need to select ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390 and powerpc platforms where the test is known to build and run successfully Going forward, other architectures too can subscribe the test after fixing any build or runtime problems with their page table helpers. Folks interested in making sure that a given platform's page table helpers conform to expected generic MM semantics should enable the above config which will just trigger this test during boot. Any non conformity here will be reported as an warning which would need to be fixed. This test will help catch any changes to the agreed upon semantics expected from generic MM and enable platforms to accommodate it thereafter. [anshuman.khandual@arm.com: v17] Link: http://lkml.kernel.org/r/1587436495-22033-3-git-send-email-anshuman.khandual@arm.com [anshuman.khandual@arm.com: v18] Link: http://lkml.kernel.org/r/1588564865-31160-3-git-send-email-anshuman.khandual@arm.com Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> [ppc32] Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Link: http://lkml.kernel.org/r/1583919272-24178-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04powerpc: add support for folded p4d page tablesMike Rapoport23-145/+200
Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h. [rppt@linux.ibm.com: powerpc/xmon: drop unused pgdir varialble in show_pte() function] Link: http://lkml.kernel.org/r/20200519181454.GI1059226@linux.ibm.com [rppt@linux.ibm.com; build fix] Link: http://lkml.kernel.org/r/20200423141845.GI13521@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> # 8xx and 83xx Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: James Morse <james.morse@arm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200414153455.21744-9-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03Merge branch 'akpm' (patches from Andrew)Linus Torvalds5-39/+12
Merge more updates from Andrew Morton: "More mm/ work, plenty more to come Subsystems affected by this patch series: slub, memcg, gup, kasan, pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs, thp, mmap, kconfig" * akpm: (131 commits) arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined riscv: support DEBUG_WX mm: add DEBUG_WX support drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid() powerpc/mm: drop platform defined pmd_mknotpresent() mm: thp: don't need to drain lru cache when splitting and mlocking THP hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs sparc32: register memory occupied by kernel as memblock.memory include/linux/memblock.h: fix minor typo and unclear comment mm, mempolicy: fix up gup usage in lookup_node tools/vm/page_owner_sort.c: filter out unneeded line mm: swap: memcg: fix memcg stats for huge pages mm: swap: fix vmstats for huge pages mm: vmscan: limit the range of LRU type balancing mm: vmscan: reclaim writepage is IO cost mm: vmscan: determine anon/file pressure balance at the reclaim root mm: balance LRU lists based on relative thrashing mm: only count actual rotations as LRU reclaim cost ...
2020-06-03powerpc/mm: drop platform defined pmd_mknotpresent()Anshuman Khandual1-4/+0
Patch series "mm/thp: Rename pmd_mknotpresent() as pmd_mknotvalid()", v2. This series renames pmd_mknotpresent() as pmd_mknotvalid(). Before that it drops an existing pmd_mknotpresent() definition from powerpc platform which was never required as it defines it's pmdp_invalidate() through subscribing __HAVE_ARCH_PMDP_INVALIDATE. This does not create any functional change. This rename was suggested by Catalin during a previous discussion while we were trying to change the THP helpers on arm64 platform for migration. https://patchwork.kernel.org/patch/11019637/ This patch (of 2): Platform needs to define pmd_mknotpresent() for generic pmdp_invalidate() only when __HAVE_ARCH_PMDP_INVALIDATE is not subscribed. Otherwise platform specific pmd_mknotpresent() is not required. Hence just drop it. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1587520326-10099-1-git-send-email-anshuman.khandual@arm.com Link: http://lkml.kernel.org/r/1584680057-13753-1-git-send-email-anshuman.khandual@arm.com Link: http://lkml.kernel.org/r/1584680057-13753-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03mm/hugetlb: define a generic fallback for arch_clear_hugepage_flags()Anshuman Khandual1-4/+0
There are multiple similar definitions for arch_clear_hugepage_flags() on various platforms. Lets just add it's generic fallback definition for platforms that do not override. This help reduce code duplication. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/r/1588907271-11920-4-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03mm/hugetlb: define a generic fallback for is_hugepage_only_range()Anshuman Khandual1-0/+1
There are multiple similar definitions for is_hugepage_only_range() on various platforms. Lets just add it's generic fallback definition for platforms that do not override. This help reduce code duplication. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/r/1588907271-11920-3-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03hugetlbfs: remove hugetlb_add_hstate() warning for existing hstateMike Kravetz1-2/+1
hugetlb_add_hstate() prints a warning if the hstate already exists. This was originally done as part of kernel command line parsing. If 'hugepagesz=' was specified more than once, the warning pr_warn("hugepagesz= specified twice, ignoring\n"); would be printed. Some architectures want to enable all huge page sizes. They would call hugetlb_add_hstate for all supported sizes. However, this was done after command line processing and as a result hstates could have already been created for some sizes. To make sure no warning were printed, there would often be code like: if (!size_to_hstate(size) hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT) The only time we want to print the warning is as the result of command line processing. So, remove the warning from hugetlb_add_hstate and add it to the single arch independent routine processing "hugepagesz=". After this, calls to size_to_hstate() in arch specific code can be removed and hugetlb_add_hstate can be called without worrying about warning messages. [mike.kravetz@oracle.com: fix hugetlb initialization] Link: http://lkml.kernel.org/r/4c36c6ce-3774-78fa-abc4-b7346bf24348@oracle.com Link: http://lkml.kernel.org/r/20200428205614.246260-5-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Mina Almasry <almasrymina@google.com> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Will Deacon <will@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Longpeng <longpeng2@huawei.com> Cc: Nitesh Narayan Lal <nitesh@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Qian Cai <cai@lca.pw> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20200417185049.275845-4-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20200428205614.246260-4-mike.kravetz@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03hugetlbfs: move hugepagesz= parsing to arch independent codeMike Kravetz1-15/+0
Now that architectures provide arch_hugetlb_valid_size(), parsing of "hugepagesz=" can be done in architecture independent code. Create a single routine to handle hugepagesz= parsing and remove all arch specific routines. We can also remove the interface hugetlb_bad_size() as this is no longer used outside arch independent code. This also provides consistent behavior of hugetlbfs command line options. The hugepagesz= option should only be specified once for a specific size, but some architectures allow multiple instances. This appears to be more of an oversight when code was added by some architectures to set up ALL huge pages sizes. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Mina Almasry <almasrymina@google.com> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Will Deacon <will@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Longpeng <longpeng2@huawei.com> Cc: Nitesh Narayan Lal <nitesh@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Qian Cai <cai@lca.pw> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20200417185049.275845-3-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20200428205614.246260-3-mike.kravetz@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03hugetlbfs: add arch_hugetlb_valid_sizeMike Kravetz1-7/+13
Patch series "Clean up hugetlb boot command line processing", v4. Longpeng(Mike) reported a weird message from hugetlb command line processing and proposed a solution [1]. While the proposed patch does address the specific issue, there are other related issues in command line processing. As hugetlbfs evolved, updates to command line processing have been made to meet immediate needs and not necessarily in a coordinated manner. The result is that some processing is done in arch specific code, some is done in arch independent code and coordination is problematic. Semantics can vary between architectures. The patch series does the following: - Define arch specific arch_hugetlb_valid_size routine used to validate passed huge page sizes. - Move hugepagesz= command line parsing out of arch specific code and into an arch independent routine. - Clean up command line processing to follow desired semantics and document those semantics. [1] https://lore.kernel.org/linux-mm/20200305033014.1152-1-longpeng2@huawei.com This patch (of 3): The architecture independent routine hugetlb_default_setup sets up the default huge pages size. It has no way to verify if the passed value is valid, so it accepts it and attempts to validate at a later time. This requires undocumented cooperation between the arch specific and arch independent code. For architectures that support more than one huge page size, provide a routine arch_hugetlb_valid_size to validate a huge page size. hugetlb_default_setup can use this to validate passed values. arch_hugetlb_valid_size will also be used in a subsequent patch to move processing of the "hugepagesz=" in arch specific code to a common routine in arch independent code. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Longpeng <longpeng2@huawei.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Mina Almasry <almasrymina@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Nitesh Narayan Lal <nitesh@redhat.com> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Qian Cai <cai@lca.pw> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20200428205614.246260-1-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20200428205614.246260-2-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20200417185049.275845-1-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20200417185049.275845-2-mike.kravetz@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODESMike Rapoport1-9/+0
The memmap_init() function was made to iterate over memblock regions and as the result the early_pfn_in_nid() function became obsolete. Since CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real implementation of early_pfn_in_nid(), it is also not needed anymore. Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES. Co-developed-by: Hoan Tran <Hoan@os.amperecomputing.com> Signed-off-by: Hoan Tran <Hoan@os.amperecomputing.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Hoan Tran <hoan@os.amperecomputing.com> [arm64] Cc: Baoquan He <bhe@redhat.com> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200412194859.12663-17-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03mm: use free_area_init() instead of free_area_init_nodes()Mike Rapoport1-1/+1
free_area_init() has effectively became a wrapper for free_area_init_nodes() and there is no point of keeping it. Still free_area_init() name is shorter and more general as it does not imply necessity to initialize multiple nodes. Rename free_area_init_nodes() to free_area_init(), update the callers and drop old version of free_area_init(). Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Hoan Tran <hoan@os.amperecomputing.com> [arm64] Reviewed-by: Baoquan He <bhe@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Brian Cain <bcain@codeaurora.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200412194859.12663-6-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP optionMike Rapoport1-1/+0
CONFIG_HAVE_MEMBLOCK_NODE_MAP is used to differentiate initialization of nodes and zones structures between the systems that have region to node mapping in memblock and those that don't. Currently all the NUMA architectures enable this option and for the non-NUMA systems we can presume that all the memory belongs to node 0 and therefore the compile time configuration option is not required. The remaining few architectures that use DISCONTIGMEM without NUMA are easily updated to use memblock_add_node() instead of memblock_add() and thus have proper correspondence of memblock regions to NUMA nodes. Still, free_area_init_node() must have a backward compatible version because its semantics with and without CONFIG_HAVE_MEMBLOCK_NODE_MAP is different. Once all the architectures will use the new semantics, the entire compatibility layer can be dropped. To avoid addition of extra run time memory to store node id for architectures that keep memblock but have only a single node, the node id field of the memblock_region is guarded by CONFIG_NEED_MULTIPLE_NODES and the corresponding accessors presume that in those cases it is always 0. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Hoan Tran <hoan@os.amperecomputing.com> [arm64] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Cc: Baoquan He <bhe@redhat.com> Cc: Brian Cain <bcain@codeaurora.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200412194859.12663-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds6-71/+65
Pull kvm updates from Paolo Bonzini: "ARM: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches x86: - Rework of TLB flushing - Rework of event injection, especially with respect to nested virtualization - Nested AMD event injection facelift, building on the rework of generic code and fixing a lot of corner cases - Nested AMD live migration support - Optimization for TSC deadline MSR writes and IPIs - Various cleanups - Asynchronous page fault cleanups (from tglx, common topic branch with tip tree) - Interrupt-based delivery of asynchronous "page ready" events (host side) - Hyper-V MSRs and hypercalls for guest debugging - VMX preemption timer fixes s390: - Cleanups Generic: - switch vCPU thread wakeup from swait to rcuwait The other architectures, and the guest side of the asynchronous page fault work, will come next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits) KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test KVM: check userspace_addr for all memslots KVM: selftests: update hyperv_cpuid with SynDBG tests x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls x86/kvm/hyper-v: enable hypercalls regardless of hypercall page x86/kvm/hyper-v: Add support for synthetic debugger interface x86/hyper-v: Add synthetic debugger definitions KVM: selftests: VMX preemption timer migration test KVM: nVMX: Fix VMX preemption timer migration x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit KVM: x86/pmu: Support full width counting KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT KVM: x86: acknowledgment mechanism for async pf page ready notifications KVM: x86: interrupt based APF 'page ready' event delivery KVM: introduce kvm_read_guest_offset_cached() KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present() KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously" KVM: VMX: Replace zero-length array with flexible-array ...
2020-06-03Merge tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+0
Pull scheduler updates from Ingo Molnar: "The changes in this cycle are: - Optimize the task wakeup CPU selection logic, to improve scalability and reduce wakeup latency spikes - PELT enhancements - CFS bandwidth handling fixes - Optimize the wakeup path by remove rq->wake_list and replacing it with ->ttwu_pending - Optimize IPI cross-calls by making flush_smp_call_function_queue() process sync callbacks first. - Misc fixes and enhancements" * tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) irq_work: Define irq_work_single() on !CONFIG_IRQ_WORK too sched/headers: Split out open-coded prototypes into kernel/sched/smp.h sched: Replace rq::wake_list sched: Add rq::ttwu_pending irq_work, smp: Allow irq_work on call_single_queue smp: Optimize send_call_function_single_ipi() smp: Move irq_work_run() out of flush_smp_call_function_queue() smp: Optimize flush_smp_call_function_queue() sched: Fix smp_call_function_single_async() usage for ILB sched/core: Offload wakee task activation if it the wakee is descheduling sched/core: Optimize ttwu() spinning on p->on_cpu sched: Defend cfs and rt bandwidth quota against overflow sched/cpuacct: Fix charge cpuacct.usage_sys sched/fair: Replace zero-length array with flexible-array sched/pelt: Sync util/runnable_sum with PELT window when propagating sched/cpuacct: Use __this_cpu_add() instead of this_cpu_ptr() sched/fair: Optimize enqueue_task_fair() sched: Make scheduler_ipi inline sched: Clean up scheduler_ipi() sched/core: Simplify sched_init() ...
2020-06-03Merge branch 'topic/ppc-kvm' into nextMichael Ellerman2-6/+39
Merge one more commit from the topic branch we shared with the kvm-ppc tree. This brings in a fix to the code that scans for dirty pages during migration of a VM, which was incorrectly triggering a warning.
2020-06-02Merge tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-blockLinus Torvalds1-9/+10
Pull block driver updates from Jens Axboe: "On top of the core changes, here are the block driver changes for this merge window: - NVMe changes: - NVMe over Fibre Channel protocol updates, which also reach over to drivers/scsi/lpfc (James Smart) - namespace revalidation support on the target (Anthony Iliopoulos) - gcc zero length array fix (Arnd Bergmann) - nvmet cleanups (Chaitanya Kulkarni) - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg) - use a SRQ per completion vector (Max Gurtovoy) - fix handling of runtime changes to the queue count (Weiping Zhang) - t10 protection information support for nvme-rdma and nvmet-rdma (Israel Rukshin and Max Gurtovoy) - target side AEN improvements (Chaitanya Kulkarni) - various fixes and minor improvements all over, icluding the nvme part of the lpfc driver" - Floppy code cleanup series (Willy, Denis) - Floppy contention fix (Jiri) - Loop CONFIGURE support (Martijn) - bcache fixes/improvements (Coly, Joe, Colin) - q->queuedata cleanups (Christoph) - Get rid of ioctl_by_bdev (Christoph, Stefan) - md/raid5 allocation fixes (Coly) - zero length array fixes (Gustavo) - swim3 task state fix (Xu)" * tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits) bcache: configure the asynchronous registertion to be experimental bcache: asynchronous devices registration bcache: fix refcount underflow in bcache_device_free() bcache: Convert pr_<level> uses to a more typical style bcache: remove redundant variables i and n lpfc: Fix return value in __lpfc_nvme_ls_abort lpfc: fix axchg pointer reference after free and double frees lpfc: Fix pointer checks and comments in LS receive refactoring nvme: set dma alignment to qword nvmet: cleanups the loop in nvmet_async_events_process nvmet: fix memory leak when removing namespaces and controllers concurrently nvmet-rdma: add metadata/T10-PI support nvmet: add metadata support for block devices nvmet: add metadata/T10-PI support nvme: add Metadata Capabilities enumerations nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len nvmet: rename nvmet_rw_len to nvmet_rw_data_len nvmet: add metadata characteristics for a namespace nvme-rdma: add metadata/T10-PI support nvme-rdma: introduce nvme_rdma_sgl structure ...
2020-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds6-87/+62
Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ...
2020-06-02powerpc: use __vmalloc_node in alloc_vm_stackChristoph Hellwig1-3/+2
alloc_vm_stack can use a slightly higher level vmalloc function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-29-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02mm: remove __get_vm_areaChristoph Hellwig1-1/+2
Switch the two remaining callers to use __get_vm_area_caller instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-9-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02powerpc: remove __ioremap_at and __iounmap_atChristoph Hellwig3-65/+21
These helpers are only used for remapping the ISA I/O base. Replace the mapping side with a remap_isa_range helper in isa-bridge.c that hard codes all the known arguments, and just remove __iounmap_at in favour of open coding it in the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-8-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02powerpc: add an ioremap_phb helperChristoph Hellwig3-19/+38
Factor code shared between pci_64 and electra_cf into a ioremap_pbh helper that follows the normal ioremap semantics, and returns a useful __iomem pointer. Note that it opencodes __ioremap_at as we know from the callers the slab is available. Switch pci_64 to also store the result as __iomem pointer, and unmap the result using iounmap instead of force casting and using vmalloc APIs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-7-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02powerpc/pseries: Make vio and ibmebus initcalls pseries specificOliver O'Halloran2-4/+6
The vio and ibmebus buses are used for pseries specific paravirtualised devices and currently they're initialised by the generic initcall types. This is mostly fine, but it can result in some nuisance errors in dmesg when booting on PowerNV on some OSes, e.g. [ 2.984439] synth uevent: /devices/vio: failed to send uevent [ 2.984442] vio vio: uevent: failed to send synthetic uevent [ 17.968551] synth uevent: /devices/vio: failed to send uevent [ 17.968554] vio vio: uevent: failed to send synthetic uevent We don't see anything similar for the ibmebus because that depends on !CONFIG_LITTLE_ENDIAN. This patch squashes those by switching to using machine_*_initcall() so the bus type is only registered when the kernel is running on a pseries machine. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200421081539.7485-1-oohall@gmail.com
2020-06-02powerpc: Add POWER10 architected modeAlistair Popple6-6/+65
PVR value of 0x0F000006 means we are arch v3.1 compliant (i.e. POWER10). This is used by phyp and kvm when booting as a pseries guest to detect the presence of new P10 features and to enable the appropriate hwcap and facility bits. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org> [mpe: Fall through to __init_FSCR rather than duplicating it, drop hack to set current->thread.fscr now that is handled elsewhere.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200521014341.29095-8-alistair@popple.id.au
2020-06-02powerpc/dt_cpu_ftrs: Add MMA featureAlistair Popple2-2/+18
Matrix multiple assist (MMA) is a new feature added to ISAv3.1 and POWER10. Support on powernv can be selected via a firmware CPU device tree feature which enables it via a PCR bit. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200521014341.29095-7-alistair@popple.id.au
2020-06-02powerpc/dt_cpu_ftrs: Enable Prefixed InstructionsAlistair Popple1-0/+1
Prefix instructions have their own FSCR bit which needs to be enabled via a CPU feature. The kernel will save the FSCR for problem state but it needs to be enabled initially. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200521014341.29095-6-alistair@popple.id.au
2020-06-02powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selectedAlistair Popple1-0/+6
On powernv hardware support for ISAv3.1 is advertised via a cpu feature bit in the device tree. This patch enables the associated HWCAP bit if the device tree indicates ISAv3.1 is available. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200521014341.29095-4-alistair@popple.id.au
2020-06-02powerpc: Add support for ISA v3.1Alistair Popple3-4/+3
Newer ISA versions are enabled by clearing all bits in the PCR associated with previous versions of the ISA. Enable ISA v3.1 support by updating the PCR mask to include ISA v3.0. This ensures all PCR bits corresponding to earlier architecture versions get cleared thereby enabling ISA v3.1 if supported by the hardware. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200521014341.29095-3-alistair@popple.id.au
2020-06-02powerpc: Add new HWCAP bitsAlistair Popple1-0/+2
POWER10 introduces two new architectural features - ISAv3.1 and matrix multiply assist (MMA) instructions. Userspace detects the presence of these features via two HWCAP bits introduced in this patch. These bits have been agreed to by the compiler and binutils team. According to ISAv3.1 MMA is an optional feature and software that makes use of it should first check for availability via this HWCAP bit and use alternate code paths if unavailable. Signed-off-by: Alistair Popple <alistair@popple.id.au> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200521014341.29095-2-alistair@popple.id.au
2020-06-02powerpc/64s: Don't set FSCR bits in INIT_THREADMichael Ellerman1-1/+0
Since the previous commit that saves the value of FSCR configured at boot into init_task.thread.fscr, the static initialisation in INIT_THREAD now no longer has any effect. So remove it. For non DT CPU features, the end result is the same, because __init_FSCR() is called on all CPUs that have an FSCR (Power8, Power9), and it sets FSCR_TAR & FSCR_EBB. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527145843.2761782-4-mpe@ellerman.id.au
2020-06-02powerpc/64s: Save FSCR to init_task.thread.fscr after feature initMichael Ellerman1-0/+19
At boot the FSCR is initialised via one of two paths. On most systems it's set to a hard coded value in __init_FSCR(). On newer skiboot systems we use the device tree CPU features binding, where firmware can tell Linux what bits to set in FSCR (and HFSCR). In both cases the value that's configured at boot is not propagated into the init_task.thread.fscr value prior to the initial fork of init (pid 1), which means the value is not used by any processes other than swapper (the idle task). For the __init_FSCR() case this is OK, because the value in init_task.thread.fscr is initialised to something sensible. However it does mean that the value set in __init_FSCR() is not used other than for swapper, which is odd and confusing. The bigger problem is for the device tree CPU features case it prevents firmware from setting (or clearing) FSCR bits for use by user space. This means all existing kernels can not have features enabled/disabled by firmware if those features require setting/clearing FSCR bits. We can handle both cases by saving the FSCR value into init_task.thread.fscr after we have initialised it at boot. This fixes the bug for device tree CPU features, and will allow us to simplify the initialisation for the __init_FSCR() case in a future patch. Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527145843.2761782-3-mpe@ellerman.id.au
2020-06-02powerpc/64s: Don't let DT CPU features set FSCR_DSCRMichael Ellerman1-0/+8
The device tree CPU features binding includes FSCR bit numbers which Linux is instructed to set by firmware. Whether that's a good idea or not, in the case of the DSCR the Linux implementation has a hard requirement that the FSCR_DSCR bit not be set by default. We use it to track when a process reads/writes to DSCR, so it must be clear to begin with. So if firmware tells us to set FSCR_DSCR we must ignore it. Currently this does not cause a bug in our DSCR handling because the value of FSCR that the device tree CPU features code establishes is only used by swapper. All other tasks use the value hard coded in init_task.thread.fscr. However we'd like to fix that in a future commit, at which point this will become necessary. Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527145843.2761782-2-mpe@ellerman.id.au
2020-06-02powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()Michael Ellerman1-1/+1
__init_FSCR() was added originally in commit 2468dcf641e4 ("powerpc: Add support for context switching the TAR register") (Feb 2013), and only set FSCR_TAR. At that point FSCR (Facility Status and Control Register) was not context switched, so the setting was permanent after boot. Later we added initialisation of FSCR_DSCR to __init_FSCR(), in commit 54c9b2253d34 ("powerpc: Set DSCR bit in FSCR setup") (Mar 2013), again that was permanent after boot. Then commit 2517617e0de6 ("powerpc: Fix context switch DSCR on POWER8") (Aug 2013) added a limited context switch of FSCR, just the FSCR_DSCR bit was context switched based on thread.dscr_inherit. That commit said "This clears the H/FSCR DSCR bit initially", but it didn't, it left the initialisation of FSCR_DSCR in __init_FSCR(). However the initial context switch from init_task to pid 1 would clear FSCR_DSCR because thread.dscr_inherit was 0. That commit also introduced the requirement that FSCR_DSCR be clear for user processes, so that we can take the facility unavailable interrupt in order to manage dscr_inherit. Then in commit 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") (Dec 2015) FSCR was added to thread_struct. However it still wasn't fully context switched, we just took the existing value and set FSCR_DSCR if the new thread had dscr_inherit set. FSCR was still initialised at boot to FSCR_DSCR | FSCR_TAR, but that value was not propagated into the thread_struct, so the initial context switch set FSCR_DSCR back to 0. Finally commit b57bd2de8c6c ("powerpc: Improve FSCR init and context switching") (Jun 2016) added a full context switch of the FSCR, and added an initialisation of init_task.thread.fscr to FSCR_TAR | FSCR_EBB, but omitted FSCR_DSCR. The end result is that swapper runs with FSCR_DSCR set because of the initialisation in __init_FSCR(), but no other processes do, they use the value from init_task.thread.fscr. Having FSCR_DSCR set for swapper allows it to access SPR 3 from userspace, but swapper never runs userspace, so it has no useful effect. It's also confusing to have the value initialised in two places to two different values. So remove FSCR_DSCR from __init_FSCR(), this at least gets us to the point where there's a single value of FSCR, even if it's still set in two places. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Alistair Popple <alistair@popple.id.au> Link: https://lore.kernel.org/r/20200527145843.2761782-1-mpe@ellerman.id.au
2020-06-02powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUGChristophe Leroy1-1/+2
'thread' doesn't exist in kuap_check() macro. Use 'current' instead. Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection") Cc: stable@vger.kernel.org Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b459e1600b969047a74e34251a84a3d6fdf1f312.1590858925.git.christophe.leroy@csgroup.eu
2020-06-02powerpc/module_64: Use special stub for _mcount() with -mprofile-kernelNaveen N. Rao1-118/+104
Since commit c55d7b5e64265f ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE"), powerpc kernels with -mprofile-kernel can crash in certain scenarios with a trace like below: BUG: Unable to handle kernel instruction fetch (NULL pointer?) Faulting instruction address: 0x00000000 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV <snip> NIP [0000000000000000] 0x0 LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4] Call Trace: iomap_apply+0x20c/0x920 (unreliable) iomap_bmap+0xfc/0x160 ext4_bmap+0xa4/0x180 [ext4] bmap+0x4c/0x80 jbd2_journal_init_inode+0x44/0x1a0 [jbd2] ext4_load_journal+0x440/0x860 [ext4] ext4_fill_super+0x342c/0x3ab0 [ext4] mount_bdev+0x25c/0x290 ext4_mount+0x28/0x50 [ext4] legacy_get_tree+0x4c/0xb0 vfs_get_tree+0x4c/0x130 do_mount+0xa18/0xc50 sys_mount+0x158/0x180 system_call+0x5c/0x68 The NIP points to NULL, or a random location (data even), while the LR always points to the LEP of a function (with an offset of 8), indicating that something went wrong with ftrace. However, ftrace is not necessarily active when such crashes occur. The kernel OOPS sometimes follows a warning from ftrace indicating that some module functions could not be patched with a nop. Other times, if a module is loaded early during boot, instruction patching can fail due to a separate bug, but the error is not reported due to missing error reporting. In all the above cases when instruction patching fails, ftrace will be disabled but certain kernel module functions will be left with default calls to _mcount(). This is not a problem with ELFv1. However, with -mprofile-kernel, the default stub is problematic since it depends on a valid module TOC in r2. If the kernel (or a different module) calls into a function that does not use the TOC, the function won't have a prologue to setup the module TOC. When that function calls into _mcount(), we will end up in the relocation stub that will use the previous TOC, and end up trying to jump into a random location. From the above trace: iomap_apply+0x20c/0x920 [kernel TOC] | V ext4_iomap_end+0x8/0x30 [no GEP == kernel TOC] | V _mcount() stub [uses kernel TOC -> random entry] To address this, let's change over to using the special stub that is used for ftrace_[regs_]caller() for _mcount(). This ensures that we are not dependent on a valid module TOC in r2 for default _mcount() handling. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Qian Cai <cai@lca.pw> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8affd4298d22099bbd82544fab8185700a6222b1.1587488954.git.naveen.n.rao@linux.vnet.ibm.com
2020-06-02powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocationsNaveen N. Rao1-14/+4
For -mprofile-kernel, we need special handling when generating stubs for ftrace calls such as _mcount(). To faciliate this, we check if a R_PPC64_REL24 relocation is for a symbol named "_mcount()" along with also checking the instruction sequence. The latter is not really required since "_mcount()" is an exported symbol and kernel modules cannot use it. As such, drop the additional checking and simplify the code. This helps unify stub creation for ftrace stubs with -mprofile-kernel and aids in code reuse. Also rename is_mprofile_mcount_callsite() to is_mprofile_ftrace_call() to reflect the checking being done. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7d9c316adfa1fb787ad268bb4691e7e4059ff2d5.1587488954.git.naveen.n.rao@linux.vnet.ibm.com
2020-06-02powerpc/module_64: Consolidate ftrace codeNaveen N. Rao2-39/+33
module_trampoline_target() is only used by ftrace. Move the prototype within the appropriate #ifdef in the header. Also, move the function body to the end of module_64.c so as to consolidate all ftrace code in one place. No functional changes. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2527351f65c53c5866068ae130dc34c5d4ee8ad9.1587488954.git.naveen.n.rao@linux.vnet.ibm.com
2020-06-02powerpc/32: Disable KASAN with pages bigger than 16kChristophe Leroy1-2/+2
Mapping of early shadow area is implemented by using a single static page table having all entries pointing to the same early shadow page. The shadow area must therefore occupy full PGD entries. The shadow area has a size of 128MB starting at 0xf8000000. With 4k pages, a PGD entry is 4MB With 16k pages, a PGD entry is 64MB With 64k pages, a PGD entry is 1GB which is too big. Until we rework the early shadow mapping, disable KASAN when the page size is too big. Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support") Cc: stable@vger.kernel.org # v5.2+ Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7195fcde7314ccbf7a081b356084a69d421b10d4.1590660977.git.christophe.leroy@csgroup.eu
2020-06-02powerpc/uaccess: Don't set KUEP by default on book3s/32Christophe Leroy1-1/+1
On book3s/32, KUEP is an heavy process as it requires to set/unset the NX bit in each of the 12 user segments everytime the kernel is entered/exited from/to user space. Don't select KUEP by default on book3s/32. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1492bb150c1aaa53d99a604b49992e60ea20cd5f.1586962582.git.christophe.leroy@c-s.fr
2020-06-02powerpc/uaccess: Don't set KUAP by default on book3s/32Christophe Leroy1-1/+1
On book3s/32, KUAP is an heavy process as it requires to determine which segments are impacted and unlock/lock each of them. And since the implementation of user_access_begin/end, it is even worth for the time being because unlike __get_user(), user_access_begin doesn't make difference between read and write and unlocks access also for read allthought that's unneeded on book3s/32. As shown by the size of a kernel built with KUAP and one without, the overhead is 64k bytes of code. As a comparison a similar build on an 8xx has an overhead of only 8k bytes of code. text data bss dec hex filename 7230416 1425868 837376 9493660 90dc9c vmlinux.kuap6xx 7165012 1425548 837376 9427936 8fdbe0 vmlinux.nokuap6xx 6519796 1960028 477464 8957288 88ad68 vmlinux.kuap8xx 6511664 1959864 477464 8948992 888d00 vmlinux.nokuap8xx Until a more optimised KUAP is implemented on book3s/32, don't select it by default. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/154a99399317b096ac1f04827b9f8d7a9179ddc1.1586962586.git.christophe.leroy@c-s.fr