aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-03-07Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-14/+11
Merge more updates from Andrew Morton: - some of the rest of MM - various misc things - dynamic-debug updates - checkpatch - some epoll speedups - autofs - rapidio - lib/, lib/lzo/ updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits) samples/mic/mpssd/mpssd.h: remove duplicate header kernel/fork.c: remove duplicated include include/linux/relay.h: fix percpu annotation in struct rchan arch/nios2/mm/fault.c: remove duplicate include unicore32: stop printing the virtual memory layout MAINTAINERS: fix GTA02 entry and mark as orphan mm: create the new vm_fault_t type arm, s390, unicore32: remove oneliner wrappers for memblock_alloc() arch: simplify several early memory allocations openrisc: simplify pte_alloc_one_kernel() sh: prefer memblock APIs returning virtual address microblaze: prefer memblock API returning virtual address powerpc: prefer memblock APIs returning virtual address lib/lzo: separate lzo-rle from lzo lib/lzo: implement run-length encoding lib/lzo: fast 8-byte copy on arm64 lib/lzo: 64-bit CTZ on arm64 lib/lzo: tidy-up ifdefs ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size ipc: annotate implicit fall through ...
2019-03-07arch: simplify several early memory allocationsMike Rapoport1-2/+2
There are several early memory allocations in arch/ code that use memblock_phys_alloc() to allocate memory, convert the returned physical address to the virtual address and then set the allocated memory to zero. Exactly the same behaviour can be achieved simply by calling memblock_alloc(): it allocates the memory in the same way as memblock_phys_alloc(), then it performs the phys_to_virt() conversion and clears the allocated memory. Replace the longer sequence with a simpler call to memblock_alloc(). Link: http://lkml.kernel.org/r/1546248566-14910-6-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07powerpc: prefer memblock APIs returning virtual addressMike Rapoport2-12/+9
Patch series "memblock: simplify several early memory allocation", v4. These patches simplify some of the early memory allocations by replacing usage of older memblock APIs with newer and shinier ones. Quite a few places in the arch/ code allocated memory using a memblock API that returns a physical address of the allocated area, then converted this physical address to a virtual one and then used memset(0) to clear the allocated range. More recent memblock APIs do all the three steps in one call and their usage simplifies the code. It's important to note that regardless of API used, the core allocation is nearly identical for any set of memblock allocators: first it tries to find a free memory with all the constraints specified by the caller and then falls back to the allocation with some or all constraints disabled. The first three patches perform the conversion of call sites that have exact requirements for the node and the possible memory range. The fourth patch is a bit one-off as it simplifies openrisc's implementation of pte_alloc_one_kernel(), and not only the memblock usage. The fifth patch takes care of simpler cases when the allocation can be satisfied with a simple call to memblock_alloc(). The sixth patch removes one-liner wrappers for memblock_alloc on arm and unicore32, as suggested by Christoph. This patch (of 6): There are a several places that allocate memory using memblock APIs that return a physical address, convert the returned address to the virtual address and frequently also memset(0) the allocated range. Update these places to use memblock allocators already returning a virtual address. Use memblock functions that clear the allocated memory instead of calling memset(0) where appropriate. The calls to memblock_alloc_base() that were not followed by memset(0) are replaced with memblock_alloc_try_nid_raw(). Since the latter does not panic() when the allocation fails, the appropriate panic() calls are added to the call sites. Link: http://lkml.kernel.org/r/1546248566-14910-2-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Greentime Hu <green.hu@gmail.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mark Salter <msalter@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07Merge tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds54-1293/+1081
Pull powerpc updates from Michael Ellerman: "Notable changes: - Enable THREAD_INFO_IN_TASK to move thread_info off the stack. - A big series from Christoph reworking our DMA code to use more of the generic infrastructure, as he said: "This series switches the powerpc port to use the generic swiotlb and noncoherent dma ops, and to use more generic code for the coherent direct mapping, as well as removing a lot of dead code." - Increase our vmalloc space to 512T with the Hash MMU on modern CPUs, allowing us to support machines with larger amounts of total RAM or distance between nodes. - Two series from Christophe, one to optimise TLB miss handlers on 6xx, and another to optimise the way STRICT_KERNEL_RWX is implemented on some 32-bit CPUs. - Support for KCOV coverage instrumentation which means we can run syzkaller and discover even more bugs in our code. And as always many clean-ups, reworks and minor fixes etc. Thanks to: Alan Modra, Alexey Kardashevskiy, Alistair Popple, Andrea Arcangeli, Andrew Donnellan, Aneesh Kumar K.V, Aravinda Prasad, Balbir Singh, Brajeswar Ghosh, Breno Leitao, Christian Lamparter, Christian Zigotzky, Christophe Leroy, Christoph Hellwig, Corentin Labbe, Daniel Axtens, David Gibson, Diana Craciun, Firoz Khan, Gustavo A. R. Silva, Igor Stoppa, Joe Lawrence, Joel Stanley, Jonathan Neuschäfer, Jordan Niethe, Laurent Dufour, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Masahiro Yamada, Mathieu Malaterre, Matteo Croce, Meelis Roos, Michael W. Bringmann, Nathan Chancellor, Nathan Fontenot, Nicholas Piggin, Nick Desaulniers, Nicolai Stange, Oliver O'Halloran, Paul Mackerras, Peter Xu, PrasannaKumar Muralidharan, Qian Cai, Rashmica Gupta, Reza Arbab, Robert P. J. Day, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Sandipan Das, Sergey Senozhatsky, Souptick Joarder, Stewart Smith, Tyrel Datwyler, Vaibhav Jain, YueHaibing" * tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/32: Clear on-stack exception marker upon exception return powerpc: Remove export of save_stack_trace_tsk_reliable() powerpc/mm: fix "section_base" set but not used powerpc/mm: Fix "sz" set but not used warning powerpc/mm: Check secondary hash page table powerpc: remove nargs from __SYSCALL powerpc/64s: Fix unrelocated interrupt trampoline address test powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables powerpc/fsl: Fix the flush of branch predictor. powerpc/powernv: Make opal log only readable by root powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C powerpc/64s: Fix data interrupts vs d-side MCE reentrancy powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy powerpc/64s: system reset interrupt preserve HSRRs powerpc/64s: Fix HV NMI vs HV interrupt recoverability test powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback selftests/powerpc: Remove duplicate header powerpc sstep: Add support for modsd, modud instructions ...
2019-03-06Merge tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds2-191/+3
Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver patch pull request for 5.1-rc1. The largest thing by far is the new habanalabs driver for their AI accelerator chip. For now it is in the drivers/misc directory but will probably move to a new directory soon along with other drivers of this type. Other than that, just the usual set of individual driver updates and fixes. There's an "odd" merge in here from the DRM tree that they asked me to do as the MEI driver is starting to interact with the i915 driver, and it needed some coordination. All of those patches have been properly acked by the relevant subsystem maintainers. All of these have been in linux-next with no reported issues, most for quite some time" * tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (219 commits) habanalabs: adjust Kconfig to fix build errors habanalabs: use %px instead of %p in error print habanalabs: use do_div for 64-bit divisions intel_th: gth: Fix an off-by-one in output unassigning habanalabs: fix little-endian<->cpu conversion warnings habanalabs: use NULL to initialize array of pointers habanalabs: fix little-endian<->cpu conversion warnings habanalabs: soft-reset device if context-switch fails habanalabs: print pointer using %p habanalabs: fix memory leak with CBs with unaligned size habanalabs: return correct error code on MMU mapping failure habanalabs: add comments in uapi/misc/habanalabs.h habanalabs: extend QMAN0 job timeout habanalabs: set DMA0 completion to SOB 1007 habanalabs: fix validation of WREG32 to DMA completion habanalabs: fix mmu cache registers init habanalabs: disable CPU access on timeouts habanalabs: add MMU DRAM default page mapping habanalabs: Dissociate RAZWI info from event types misc/habanalabs: adjust Kconfig to fix build errors ...
2019-03-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-4/+4
Merge misc updates from Andrew Morton: - a few misc things - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits) tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include proc: more robust bulk read test proc: test /proc/*/maps, smaps, smaps_rollup, statm proc: use seq_puts() everywhere proc: read kernel cpu stat pointer once proc: remove unused argument in proc_pid_lookup() fs/proc/thread_self.c: code cleanup for proc_setup_thread_self() fs/proc/self.c: code cleanup for proc_setup_self() proc: return exit code 4 for skipped tests mm,mremap: bail out earlier in mremap_to under map pressure mm/sparse: fix a bad comparison mm/memory.c: do_fault: avoid usage of stale vm_area_struct writeback: fix inode cgroup switching comment mm/huge_memory.c: fix "orig_pud" set but not used mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC mm/memcontrol.c: fix bad line in comment mm/cma.c: cma_declare_contiguous: correct err handling mm/page_ext.c: fix an imbalance with kmemleak mm/compaction: pass pgdat to too_many_isolated() instead of zone mm: remove zone_lru_lock() function, access ->lru_lock directly ...
2019-03-05powerpc/vdso: don't clear PG_reservedDavid Hildenbrand1-2/+0
The VDSO is part of the kernel image and therefore the struct pages are marked as reserved during boot. As we install a special mapping, the actual struct pages will never be exposed to MM via the page tables. We can therefore leave the pages marked as reserved. Link: http://lkml.kernel.org/r/20190114125903.24845-4-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Kees Cook <keescook@chromium.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05mm: replace all open encodings for NUMA_NO_NODEAnshuman Khandual2-2/+4
Patch series "Replace all open encodings for NUMA_NO_NODE", v3. All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review. 1. git grep "nid == -1" 2. git grep "node == -1" 3. git grep "nid = -1" 4. git grep "node = -1" This patch (of 2): At present there are multiple places where invalid node number is encoded as -1. Even though implicitly understood it is always better to have macros in there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like 'invalid node' from various places redirecting them to a common definition. Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe] Acked-by: Jens Axboe <axboe@kernel.dk> [mtip32xx] Acked-by: Vinod Koul <vkoul@kernel.org> [dmaengine.c] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Doug Ledford <dledford@redhat.com> [drivers/infiniband] Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Hans Verkuil <hverkuil@xs4all.nl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-04powerpc/32: Clear on-stack exception marker upon exception returnChristophe Leroy1-0/+9
Clear the on-stack STACK_FRAME_REGS_MARKER on exception exit in order to avoid confusing stacktrace like the one below. Call Trace: [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable) [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180 [c0e9dd10] [c0895130] memchr+0x24/0x74 [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574 [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8 [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4 --- interrupt: c0e9df00 at 0x400f330 LR = init_stack+0x1f00/0x2000 [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable) [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108 [c0e9df50] [c0c15434] start_kernel+0x310/0x488 [c0e9dff0] [00003484] 0x3484 With this patch the trace becomes: Call Trace: [c0e9dca0] [c01c42c0] print_address_description+0x64/0x2bc (unreliable) [c0e9dcd0] [c01c46a4] kasan_report+0xfc/0x180 [c0e9dd10] [c0895150] memchr+0x24/0x74 [c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574 [c0e9dde0] [c00ab730] console_unlock+0x114/0x4f8 [c0e9de40] [c00adc80] vprintk_emit+0x188/0x1c4 [c0e9de80] [c00ae3e4] printk+0xa8/0xcc [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108 [c0e9df50] [c0c15434] start_kernel+0x310/0x488 [c0e9dff0] [00003484] 0x3484 Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02powerpc: Remove export of save_stack_trace_tsk_reliable()Joe Lawrence1-1/+0
As tglx points out, there are no in-tree module users of save_stack_trace_tsk_reliable() and its x86 counterpart is not exported, so remove the powerpc symbol export. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02powerpc: remove nargs from __SYSCALLFiroz Khan2-5/+5
The __SYSCALL macro's arguments are system call number, system call entry name and number of arguments for the system call. Argument- nargs in __SYSCALL(nr, entry, nargs) is neither calculated nor used anywhere. So it would be better to keep the implementaion as __SYSCALL(nr, entry). This will unifies the implementation with some other architetures too. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02powerpc/64s: Fix unrelocated interrupt trampoline address testNicholas Piggin2-8/+9
The recent commit got this test wrong, it declared the assembler symbols the wrong way, and also used the wrong symbol name (xxx_start rather than start_xxx, see asm/head-64.h). Fixes: ccd477028a ("powerpc/64s: Fix HV NMI vs HV interrupt recoverability test") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-27powerpc/fsl: Fix the flush of branch predictor.Christophe Leroy1-0/+1
The commit identified below adds MC_BTB_FLUSH macro only when CONFIG_PPC_FSL_BOOK3E is defined. This results in the following error on some configs (seen several times with kisskb randconfig_defconfig) arch/powerpc/kernel/exceptions-64e.S:576: Error: Unrecognized opcode: `mc_btb_flush' make[3]: *** [scripts/Makefile.build:367: arch/powerpc/kernel/exceptions-64e.o] Error 1 make[2]: *** [scripts/Makefile.build:492: arch/powerpc/kernel] Error 2 make[1]: *** [Makefile:1043: arch/powerpc] Error 2 make: *** [Makefile:152: sub-make] Error 2 This patch adds a blank definition of MC_BTB_FLUSH for other cases. Fixes: 10c5e83afd4a ("powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)") Cc: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26powerpc/64s: Fix data interrupts vs d-side MCE reentrancyNicholas Piggin1-10/+26
Handlers for interrupts that set DAR / DSISR, set MSR[RI] before those SPRs are read. If a d-side machine check hits in this window, DAR / DSISR will be clobbered silently, leading to random corruption. Fix this by having handlers save those registers before setting MSR[RI]. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancyNicholas Piggin1-6/+42
A subsequent fix for data interrupts (those that set DAR / DSISR) requires some interrupt macros to be open-coded, and also requires the 0x300 interrupt handler to be moved out-of-line. This patch does that without changing behaviour, which makes the later fix a smaller change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26powerpc/64s: system reset interrupt preserve HSRRsNicholas Piggin1-1/+24
Code that uses HSRR registers is not required to clear MSR[RI] by convention, however the system reset NMI itself may use HSRR registers (e.g., to call OPAL) and clobber them. Rather than introduce the requirement to clear RI in order to use HSRRs, have system reset interrupt save and restore HSRRs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26powerpc/64s: Fix HV NMI vs HV interrupt recoverability testNicholas Piggin3-0/+77
HV interrupts that use HSRR registers do not enter with MSR[RI] clear, but their entry code is not recoverable vs NMI, due to shared use of HSPRG1 as a scratch register to save r13. This means that a system reset or machine check that hits in HSRR interrupt entry can cause r13 to be silently corrupted. Fix this by marking NMIs non-recoverable if they land in HV interrupt ranges. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: clean stack pointers namingChristophe Leroy2-18/+10
Some stack pointers used to also be thread_info pointers and were called tp. Now that they are only stack pointers, rename them sp. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/64: Replace CURRENT_THREAD_INFO with PACA_THREAD_INFOChristophe Leroy7-12/+14
Now that current_thread_info is located at the beginning of 'current' task struct, CURRENT_THREAD_INFO macro is not really needed any more. This patch replaces it by loads of the value at PACA_THREAD_INFO(r13). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Add PACA_THREAD_INFO rather than using PACACURRENT] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/32: Remove CURRENT_THREAD_INFO and rename TI_CPUChristophe Leroy7-57/+30
Now that thread_info is similar to task_struct, its address is in r2 so CURRENT_THREAD_INFO() macro is useless. This patch removes it. This patch also moves the 'tovirt(r2, r2)' down just before the reactivation of MMU translation, so that we keep the physical address of 'current' in r2 until then. It avoids a few calls to tophys(). At the same time, as the 'cpu' field is not anymore in thread_info, TI_CPU is renamed TASK_CPU by this patch. It also allows to get rid of a couple of '#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE' as ACCOUNT_CPU_USER_ENTRY() and ACCOUNT_CPU_USER_EXIT() are empty when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not defined. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Fix a missed conversion of TI_CPU idle_6xx.S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: 'current_set' is now a table of task_struct pointersChristophe Leroy4-13/+11
The table of pointers 'current_set' has been used for retrieving the stack and current. They used to be thread_info pointers as they were pointing to the stack and current was taken from the 'task' field of the thread_info. Now, the pointers of 'current_set' table are now both pointers to task_struct and pointers to thread_info. As they are used to get current, and the stack pointer is retrieved from current's stack field, this patch changes their type to task_struct, and renames secondary_ti to secondary_current. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: regain entire stack spaceChristophe Leroy6-48/+32
thread_info is not anymore in the stack, so the entire stack can now be used. There is also no risk anymore of corrupting task_cpu(p) with a stack overflow so the patch removes the test. When doing this, an explicit test for NULL stack pointer is needed in validate_sp() as it is not anymore implicitely covered by the sizeof(thread_info) gap. In the meantime, with the previous patch all pointers to the stacks are not anymore pointers to thread_info so this patch changes them to void* Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Activate CONFIG_THREAD_INFO_IN_TASKChristophe Leroy14-168/+24
This patch activates CONFIG_THREAD_INFO_IN_TASK which moves the thread_info into task_struct. Moving thread_info into task_struct has the following advantages: - It protects thread_info from corruption in the case of stack overflows. - Its address is harder to determine if stack addresses are leaked, making a number of attacks more difficult. This has the following consequences: - thread_info is now located at the beginning of task_struct. - The 'cpu' field is now in task_struct, and only exists when CONFIG_SMP is active. - thread_info doesn't have anymore the 'task' field. This patch: - Removes all recopy of thread_info struct when the stack changes. - Changes the CURRENT_THREAD_INFO() macro to point to current. - Selects CONFIG_THREAD_INFO_IN_TASK. - Modifies raw_smp_processor_id() to get ->cpu from current without including linux/sched.h to avoid circular inclusion and without including asm/asm-offsets.h to avoid symbol names duplication between ASM constants and C constants. - Modifies klp_init_thread_info() to take a task_struct pointer argument. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add task_stack.h to livepatch.h to fix build fails] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/idle/6xx: Use r1 with CURRENT_THREAD_INFO()Christophe Leroy1-1/+2
Make sure CURRENT_THREAD_INFO() is used with r1 which is the virtual address of the stack, in order to ease the switch to r2 when we enable THREAD_INFO_IN_TASK, as we have no register having the phys address of current. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/64: Use task_stack_page() to initialise paca->kstackChristophe Leroy1-1/+3
Rather than using the thread info use task_stack_page() to initialise paca->kstack, that way it will work with THREAD_INFO_IN_TASK. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Update comments in preparation for THREAD_INFO_IN_TASKChristophe Leroy3-3/+3
Update a few comments that talk about current_thread_info() in preparation for THREAD_INFO_IN_TASK. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Replace current_thread_info()->task with currentChristophe Leroy1-3/+3
We have a few places that use current_thread_info()->task to access current. This won't work with THREAD_INFO_IN_TASK so fix them now. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Don't use CURRENT_THREAD_INFO to find the stackChristophe Leroy3-3/+3
A few places use CURRENT_THREAD_INFO, or the C version, to find the stack. This will no longer work with THREAD_INFO_IN_TASK so change them to find the stack in other ways. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: call_do_[soft]irq() takes a pointer to the stackChristophe Leroy1-1/+1
The purpose of the pointer given to call_do_softirq() and call_do_irq() is to point the new stack. Currently that's the same thing as the thread_info, but won't be with THREAD_INFO_IN_TASK. So change the parameter to void* and rename it 'sp'. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Rename THREAD_INFO to TASK_STACKChristophe Leroy7-9/+9
This patch renames THREAD_INFO to TASK_STACK, because it is in fact the offset of the pointer to the stack in task_struct so this pointer will not be impacted by the move of THREAD_INFO. Also make it available on 64-bit, as we'll need it there when we activate THREAD_INFO_IN_TASK. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Make available on 64-bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: prep stack walkers for THREAD_INFO_IN_TASKChristophe Leroy2-6/+49
[text copied from commit 9bbd4c56b0b6 ("arm64: prep stack walkers for THREAD_INFO_IN_TASK")] When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed before a task is destroyed. To account for this, the stacks are refcounted, and when manipulating the stack of another task, it is necessary to get/put the stack to ensure it isn't freed and/or re-used while we do so. This patch reworks the powerpc stack walking code to account for this. When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no refcounting, and this should only be a structural change that does not affect behaviour. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Move try_get_task_stack() below tsk == NULL check in show_stack()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Only use task_struct 'cpu' field on SMPChristophe Leroy2-0/+6
When moving to CONFIG_THREAD_INFO_IN_TASK, the thread_info 'cpu' field gets moved into task_struct and only defined when CONFIG_SMP is set. This patch ensures that TI_CPU is only used when CONFIG_SMP is set and that task_struct 'cpu' field is not used directly out of SMP code. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/irq: use memblock functions returning virtual addressChristophe Leroy3-27/+23
Since only the virtual address of allocated blocks is used, lets use functions returning directly virtual address. Those functions have the advantage of also zeroing the block. Suggested-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/64: Simplify __secondary_start paca->kstack handlingMichael Ellerman1-11/+9
In __secondary_start() we load the thread_info of the idle task of the secondary CPU from current_set[cpu], and then convert it into a stack pointer before storing that back to paca->kstack. As pointed out in commit f761622e5943 ("powerpc: Initialise paca->kstack before early_setup_secondary") it's important that we initialise paca->kstack before calling the MMU setup code, in particular slb_initialize(), because it will bolt the SLB entry for the kstack into the SLB. However we have already setup paca->kstack in cpu_idle_thread_init(), since commit 3b5750644b2f ("[POWERPC] Bolt in SLB entry for kernel stack on secondary cpus") (May 2008). It's also in cpu_idle_thread_init() that we initialise current_set[cpu] with the thread_info pointer, so there is no issue of the timing being different between the two. Therefore the initialisation of paca->kstack in __setup_secondary() is completely redundant, so remove it. This has the added benefit of removing code that runs in real mode, and is therefore restricted by the RMO, and so opens the way for us to enable THREAD_INFO_IN_TASK. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/64s: Remove MSR_RI optimisation in system_call_exit()Michael Ellerman1-20/+14
Currently in system_call_exit() we have an optimisation where we disable MSR_RI (recoverable interrupt) and MSR_EE (external interrupt enable) in a single mtmsrd instruction. Unfortunately this will no longer work with THREAD_INFO_IN_TASK, because then the load of TI_FLAGS might fault and faulting with MSR_RI clear is treated as an unrecoverable exception which leads to a panic(). So change the code to only clear MSR_EE prior to loading TI_FLAGS, leaving the clear of MSR_RI until later. We have some latitude in where do the clear of MSR_RI. A bit of experimentation has shown that this location gives the least slow down. This still causes a noticeable slow down in our null_syscall performance. On a Power9 DD2.2: Before After Delta Delta % 955 cycles 999 cycles -44 -4.6% On the plus side this does simplify the code somewhat, because we don't have to reenable MSR_RI on the restore_math() or syscall_exit_work() paths which was necessitated previously by the optimisation. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc: Enable kcovAndrew Donnellan4-2/+15
kcov provides kernel coverage data that's useful for fuzzing tools like syzkaller. Wire up kcov support on powerpc. Disable kcov instrumentation on the same files where we currently disable gcov and UBSan instrumentation, plus some additional exclusions which appear necessary to boot on book3e machines. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Daniel Axtens <dja@axtens.net> # e6500 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/kconfig: make _etext and data areas alignment configurable on 8xxChristophe Leroy1-2/+2
On 8xx, large pages (512kb or 8M) are used to map kernel linear memory. Aligning to 8M reduces TLB misses as only 8M pages are used in that case. We make 8M the default for data. This patchs allows the user to do it via Kconfig. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/8xx: don't disable large TLBs with CONFIG_STRICT_KERNEL_RWXChristophe Leroy1-12/+42
This patch implements handling of STRICT_KERNEL_RWX with large TLBs directly in the TLB miss handlers. To do so, etext and sinittext are aligned on 512kB boundaries and the miss handlers use 512kB pages instead of 8Mb pages for addresses close to the boundaries. It sets RO PP flags for addresses under sinittext. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/mm/32s: add setibat() clearibat() and update_bats()Christophe Leroy1-0/+35
setibat() and clearibat() allows to manipulate IBATs independently of DBATs. update_bats() allows to update bats after init. This is done with MMU off. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/kconfig: define CONFIG_DATA_SHIFT and CONFIG_ETEXT_SHIFTChristophe Leroy1-6/+3
CONFIG_STRICT_KERNEL_RWX requires a special alignment for DATA for some subarches. Today it is just defined as an #ifdef in vmlinux.lds.S In order to get more flexibility, this patch moves the definition of this alignment in Kconfig On some subarches, CONFIG_STRICT_KERNEL_RWX will require a special alignment of _etext. This patch also adds a configuration item for it in Kconfig Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23powerpc/8xx: Map 32Mb of RAM at init.Christophe Leroy1-20/+31
At the time being, initial MMU setup allows 24 Mbytes of DATA and 8 Mbytes of code. Some debug setup like CONFIG_KASAN generate huge kernels with text size over the 8M limit and data over the 24 Mbytes limit. Here is an 8xx kernel compiled with CONFIG_KASAN_INLINE for one of my boards: [root@po16846vm linux-powerpc]# size -x vmlinux text data bss dec hex filename 0x111019c 0x41b0d4 0x490de0 26984528 19bc050 vmlinux This patch maps up to 32 Mbytes code based on _einittext symbol and allows 32 Mbytes of memory instead of 24. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23Revert "powerpc/book3s32: Reorder _PAGE_XXX flags to simplify TLB handling"Michael Ellerman1-1/+4
This reverts commit 78ca1108b10927b3d068c8da91352b0f4cd01fc5. It is causing boot failures with qemu mac99 in at least some configurations.
2019-02-22powerpc/32: Fix CONFIG_VIRT_CPU_ACCOUNTING_NATIVE for 40x/bookeChristophe Leroy1-5/+7
40x/booke have another path to reach 3f from transfer_to_handler, make sure it also calls ACCOUNT_CPU_USER_ENTRY() when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is selected. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/book3s32: Reorder _PAGE_XXX flags to simplify TLB handlingChristophe Leroy1-4/+1
For pages without _PAGE_USER, PP field is 00 For pages with _PAGE_USER, PP field is 10 for RW and 11 for RO. This patch sets _PAGE_USER to 0x002 and _PAGE_RW to 0x001 is order to simplify TLB handling by reducing amount of shifts. The location of _PAGE_PRESENT and _PAGE_HASHPTE doesn't matter as they are only SW related flags. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/603: don't handle PAGE_ACCESSED in TLB miss handlers.Christophe Leroy1-11/+13
PAGE_ACCESSED is only needed for CONFIG_SWAP. When CONFIG_SWAP is not set, just ignore it. If CONFIG_SWAP is set and PAGE_ACCESSED is not, let's take a minor fault. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/603: Don't worry about _PAGE_USER in TLB miss handlersChristophe Leroy1-9/+3
PP bits take user access into account, so no need to check _PAGE_USER here. A DSI or ISI will be generated if needed. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/603: let's handle PAGE_DIRTY directlyChristophe Leroy1-4/+2
PAGE_DIRTY corresponds to the C bit. If writing on a page for which the C bit is not set, a DataStoreTLBMiss is generated. No need to check it in DataLoadTLBMiss. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/603: Don't handle _PAGE_RW and _PAGE_DIRTY on ITLB missesChristophe Leroy1-6/+2
_PAGE_RW and _PAGE_DIRTY do not matter for ITLB misses. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/603: Don't handle kernel page TLB misses when not needChristophe Leroy1-0/+4
ITLB miss on kernel pages only occur with CONFIG_MODULES and CONFIG_DEBUG_PAGEALLOC. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-22powerpc/603: use physical address directly in TLB miss handlers.Christophe Leroy1-9/+6
Since commit c62ce9ef97ba ("powerpc: remove remaining bits from CONFIG_APUS"), tophys() has become a pure constant operation. PAGE_OFFSET is known at compile time so the physical address can be builtin directly. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>