aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/exceptions-64s.S (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-08-23powerpc/64s: Remove spurious IRQ reason in IRQ replayNicholas Piggin1-2/+0
HVI interrupts have always used 0x500, so remove the dead branch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc/64s: Use the HV handler for external IRQ replay in HV mode on POWER9Nicholas Piggin1-0/+4
POWER9 host external interrupts use the h_virt_irq_common handler, so use that to replay them rather than using the hardware_interrupt_common handler. Both call do_IRQ, but using the correct handler reduces i-cache footprint. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc/64s: Merge HV and non-HV paths for doorbell IRQ replayNicholas Piggin1-1/+1
This results in smaller code, and fewer branches. This relies on the fact that both the 0xe80 and 0xa00 handlers call the same upper level code, namely doorbell_exception(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Mention we rely on the implementation of the 0xe80/0xa00 handlers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc/64s: masked_interrupt() returns to kernel so avoid restoring r13Nicholas Piggin1-1/+1
Places in the kernel where r13 is not the PACA pointer must have maskable interrupts disabled, so r13 does not have to be restored when returning from a soft-masked interrupt. We should never have interrupts soft disabled when we're in user space. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc/64s: Optimise clearing of MSR_EE in masked_[H]interrupt()Nicholas Piggin1-2/+1
MSR_EE is always enabled in SRR1 for masked interrupts, so we can use xor to clear it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc/64s: Avoid a branch in masked_[H]interrupt()Nicholas Piggin1-4/+2
Interrupts which do not require EE to be cleared can all be tested with a single bitwise test. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23powerpc/64s: Fix replay interrupt return label nameMichael Ellerman1-2/+2
In __replay_interrupt() we take the address of a local label so we can return to it later. However the assembler turns the local label into a symbol with a name like ".L1^B42" - where "^B" is literally "\002". This does not make for pleasant stack traces. Fix it by giving the label a sensible name. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23Merge branch 'fixes' into nextMichael Ellerman1-1/+9
There's a non-trivial dependency between some commits we want to put in next and the KVM prefetch work around that went into fixes. So merge fixes into next.
2017-08-10powerpc: Fix powerpc-specific watchdog build configurationNicholas Piggin1-3/+3
The powerpc kernel/watchdog.o should be built when HARDLOCKUP_DETECTOR and HAVE_HARDLOCKUP_DETECTOR_ARCH are both selected. If only the former is selected, then the generic perf watchdog has been selected. To simplify this check, introduce a new Kconfig symbol PPC_WATCHDOG that depends on both. This Kconfig option means the powerpc specific watchdog is enabled. Without this patch, Book3E will attempt to build the powerpc watchdog. Fixes: 2104180a53 ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-03powerpc/mm: Use symbolic constants for filtering SRR1 bits on ISIsBenjamin Herrenschmidt1-1/+1
This uses the newly defined constants for this rather than open-coded numbers. There is a side effect on 64-bit which is to pass through some of the new P9 bits which we didn't before. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-03powerpc/mm: Update bits used to skip hash_pageBenjamin Herrenschmidt1-2/+4
We test a number of bits from DSISR/SRR1 before deciding to call hash_page(). If any of these is set, we go directly to do_page_fault() as the bit indicate a fault that needs to be handled there (no hashing needed). This updates the current open-coded masks to use the new DSISR definitions. This *does* change the masks actually used in two ways: - We used to test various bits that were defined as "always 0" in the architecture and could be repurposed for something else. From now on, we just ignore such bits. - We were missing some new bits defined on P9 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-31powerpc/64s: Fix stack setup in watchdog soft_nmi_common()Nicholas Piggin1-1/+9
The watchdog soft-NMI exception stack setup loads a stack pointer twice, which is an obvious error. It ends up using the system reset interrupt (true-NMI) stack, which is also a bug because the watchdog could be preempted by a system reset interrupt that overwrites the NMI stack. Change the soft-NMI to use the "emergency stack". The current kernel stack is not used, because of the longer-term goal to prevent asynchronous stack access using soft-disable. Fixes: 2104180a5369 ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-31Merge tag 'v4.13-rc1' into fixesMichael Ellerman1-2/+28
The fixes branch is based off a random pre-rc1 commit, because we had some fixes that needed to go in before rc1 was released. However we now need to fix some code that went in after that point, but before rc1, so merge rc1 to get that code into fixes so we can fix it!
2017-07-21Merge tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-14/+14
Pull powerpc fixes from Michael Ellerman: "A handful of fixes, mostly for new code: - some reworking of the new STRICT_KERNEL_RWX support to make sure we also remove executable permission from __init memory before it's freed. - a fix to some recent optimisations to the hypercall entry where we were clobbering r12, this was breaking nested guests (PR KVM). - a fix for the recent patch to opal_configure_cores(). This could break booting on bare metal Power8 boxes if the kernel was built without CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG. - .. and finally a workaround for spurious PMU interrupts on Power9 DD2. Thanks to: Nicholas Piggin, Anton Blanchard, Balbir Singh" * tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y powerpc/mm/hash: Refactor hash__mark_rodata_ro() powerpc/mm/radix: Refactor radix__mark_rodata_ro() powerpc/64s: Fix hypercall entry clobbering r12 input powerpc/perf: Avoid spurious PMU interrupts after idle powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()
2017-07-18powerpc/64s: Fix hypercall entry clobbering r12 inputNicholas Piggin1-14/+14
A previous optimisation incorrectly assumed the PAPR hcall does not use r12, and clobbers it upon entry. In fact it is used as an input. This can result in KVM guests crashing (observed with PR KVM). Instead of using r12 to save r13, tihs patch saves r13 in ctr. This is more costly, but not as slow as using the SPRG. Fixes: acd7d8cef0153 ("powerpc/64s: Optimize hypercall/syscall entry") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-12powerpc/64s: implement arch-specific hardlockup watchdogNicholas Piggin1-2/+28
Implement an arch-speicfic watchdog rather than use the perf-based hardlockup detector. The new watchdog takes the soft-NMI directly, rather than going through perf. Perf interrupts are to be made maskable in future, so that would prevent the perf detector from working in those regions. Additionally, implement a SMP based detector where all CPUs watch one another by pinging a shared cpumask. This is because powerpc Book3S does not have a true periodic local NMI, but some platforms do implement a true NMI IPI. If a CPU is stuck with interrupts hard disabled, the soft-NMI watchdog does not work, but the SMP watchdog will. Even on platforms without a true NMI IPI to get a good trace from the stuck CPU, other CPUs will notice the lockup sufficiently to report it and panic. [npiggin@gmail.com: honor watchdog disable at boot/hotplug] Link: http://lkml.kernel.org/r/20170621001346.5bb337c9@roar.ozlabs.ibm.com [npiggin@gmail.com: fix false positive warning at CPU unplug] Link: http://lkml.kernel.org/r/20170630080740.20766-1-npiggin@gmail.com [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170616065715.18390-6-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Don Zickus <dzickus@redhat.com> Tested-by: Babu Moger <babu.moger@oracle.com> [sparc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-03powerpc/64s: Blacklist functions invoked on a trapNaveen N. Rao1-0/+2
Blacklist all functions involved while handling a trap. We: - convert some of the symbols into private symbols, and - blacklist most functions involved while handling a trap. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-03powerpc/64s: Convert .L__replay_interrupt_return to a local labelNaveen N. Rao1-2/+2
Commit b48bbb82e2b835 ("powerpc/64s: Don't unbalance the return branch predictor in __replay_interrupt()") introduced __replay_interrupt_return symbol with '.L' prefix in hopes of keeping it private. However, due to the use of LOAD_REG_ADDR(), the assembler kept this symbol visible. Fix the same by instead using the local label '1'. Fixes: Commit b48bbb82e2b835 ("powerpc/64s: Don't unbalance the return branch predictor in __replay_interrupt()") Suggested-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-03Merge branch 'fixes' into nextMichael Ellerman1-4/+7
Merge our fixes branch, a few of them are tripping people up while working on top of next, and we also have a dependency between the CXL fixes and new CXL code we want to merge into next.
2017-06-27powerpc/64s: Invalidate ERAT on powersave wakeup for POWER9Benjamin Herrenschmidt1-3/+5
On POWER9 the ERAT may be incorrect on wakeup from some stop states that lose state. This causes random segvs and illegal instructions when these stop states are enabled. This patch invalidates the ERAT on wakeup on POWER9 to prevent this from causing a problem. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Merge comment change with upstream changes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-21powerpc/64s: Rename slb_allocate_realmode() to slb_allocate()Michael Ellerman1-1/+1
As for slb_miss_realmode(), rename slb_allocate_realmode() to avoid confusion over whether it runs in real or virtual mode - it runs in both. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2017-06-21powerpc/64s: Rename slb_miss_realmode() to slb_miss_common()Michael Ellerman1-6/+9
slb_miss_realmode() doesn't always runs in real mode, which is what the name implies. So rename it to avoid confusing people. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2017-06-21powerpc/64s: Use BRANCH_TO_COMMON() for slb_miss_realmodeMichael Ellerman1-38/+4
All the callers of slb_miss_realmode currently open code the #ifndef CONFIG_RELOCATABLE check and the branch via CTR in the RELOCATABLE case. We have a macro to do this, BRANCH_TO_COMMON(), so use it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2017-06-20powerpc/64s: Avoid r3 save/restore in SLB miss handlerNicholas Piggin1-15/+26
The SLB miss handler uses r3 for the faulting address but r12 is mostly able to be freed up to save r3 in. It just requires SRR1 be reloaded again on error. It would be more conventional to use r12 for SRR1 (and use r11 to save r3), but slb_allocate_realmode clobbers r11 and not r12. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-20powerpc/64s: SLB miss already has CTR saved for relocatable kernelNicholas Piggin1-8/+1
The EXCEPTION_PROLOG_1 used by SLB miss already saves CTR when the kernel is built with CONFIG_RELOCATABLE. So it does not have to be saved and reloaded when branching to slb_miss_realmode. It can be restored from the PACA as usual. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-20powerpc/64s: Avoid saving faulting address into EX_DAR in SLB missNicholas Piggin1-5/+8
The EX_DAR save area is only used in exceptional cases. With r3 no longer clobbered by slb_allocate_realmode, saving faulting address to EX_DAR can be deferred to those cases. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/64s/idle: Avoid SRR usage in idle sleep/wake pathsNicholas Piggin1-0/+1
Idle code now always runs at the 0xc... effective address whether in real or virtual mode. This means rfid can be ditched, along with a lot of SRR manipulations. In the wakeup path, carry SRR1 around in r12. Use mtmsrd to change MSR states as required. This also balances the return prediction for the idle call, by doing blr rather than rfid to return to the idle caller. On POWER9, 2-process context switch on different cores, with snooze disabled, increases performance by 2%. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Incorporate v2 fixes from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/64s/idle: Branch to handler with virtual mode offsetNicholas Piggin1-2/+4
Have the system reset idle wakeup handlers branched to in real mode with the 0xc... kernel address applied. This allows simplifications of avoiding rfid when switching to virtual mode in the wakeup handler. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/64s: Don't unbalance the return branch predictor in __replay_interrupt()Nicholas Piggin1-1/+7
The __replay_interrupt() code is branched to with bl, but the caller is returned to directly with rfid from the interrupt. Instead, rfid to a stub that returns to the caller with blr, which should keep the return branch predictor balanced. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/64s: msgclr when handling doorbell exceptions from system resetNicholas Piggin1-2/+21
msgsnd doorbell exceptions are cleared when the doorbell interrupt is taken. However if a doorbell exception causes a system reset interrupt wake from power saving state, the message is not cleared. Processing the doorbell from the system reset interrupt requires msgclr to avoid taking the exception again. Testing this plus the previous wakup direct patch gives: original wakeup direct msgclr Different threads, same core: 315k/s 264k/s 345k/s Different cores: 235k/s 242k/s 242k/s Net speedup is +10% for same core, and +3% for different core. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16powerpc/64s: Handle data breakpoints in Radix modeNaveen N. Rao1-4/+7
On Power9, trying to use data breakpoints throws the splat shown below. This is because the check for a data breakpoint in DSISR is in do_hash_page(), which is not called when in Radix mode. Unable to handle kernel paging request for data at address 0xc000000000e19218 Faulting instruction address: 0xc0000000001155e8 cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20] pc: c0000000001155e8: find_pid_ns+0x48/0xe0 lr: c000000000116ac4: find_task_by_vpid+0x44/0x90 sp: c0000000ef1e7da0 msr: 9000000000009033 dar: c000000000e19218 dsisr: 400000 Move the check to handle_page_fault() so as to catch data breakpoints in both Hash and Radix MMU modes. We have to change the check in do_hash_page() against 0xa410 to use 0xa450, so as to include the value of (DSISR_DABRMATCH << 16). There are two sites that call handle_page_fault() when in Radix, both already pass DSISR in r4. Fixes: caca285e5ab4 ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code") Cc: stable@vger.kernel.org # v4.7+ Reported-by: Shriya R. Kulkarni <shriykul@in.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Fix the fall-through case on hash, we need to reload DSISR] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15powerpc/64s: Optimize hypercall/syscall entryNicholas Piggin1-37/+97
After bc3551257a ("powerpc/64: Allow for relocation-on interrupts from guest to host"), a getppid() system call goes from 307 cycles to 358 cycles (+17%) on POWER8. This is due significantly to the scratch SPR used by the hypercall check. It turns out there are a some volatile registers common to both system call and hypercall (in particular, r12, cr0, ctr), which can be used to avoid the SPR and some other overheads. This brings getppid to 320 cycles (+4%). Testing hcall entry performance by running "sc 1" in guest userspace before this patch is 854 cycles, afterwards is 826. Also a small win there. POWER9 syscall is improved by about the same amount, hcall not tested. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-12Merge tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-3/+1
Pull more powerpc updates from Michael Ellerman: "The change to the Linux page table geometry was delayed for more testing with 16G pages, and there's the new CPU features stuff which just needed one more polish before going in. Plus a few changes from Scott which came in a bit late. And then various fixes, mostly minor. Summary highlights: - rework the Linux page table geometry to lower memory usage on 64-bit Book3S (IBM chips) using the Hash MMU. - support for a new device tree binding for discovering CPU features on future firmwares. - Freescale updates from Scott: "Includes a fix for a powerpc/next mm regression on 64e, a fix for a kernel hang on 64e when using a debugger inside a relocated kernel, a qman fix, and misc qe improvements." Thanks to: Christophe Leroy, Gavin Shan, Horia Geantă, LiuHailong, Nicholas Piggin, Roy Pledge, Scott Wood, Valentin Longchamp" * tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Support new device tree binding for discovering CPU features powerpc: Don't print cpu_spec->cpu_name if it's NULL of/fdt: introduce of_scan_flat_dt_subnodes and of_get_flat_dt_phandle powerpc/64s: Fix unnecessary machine check handler relocation branch powerpc/mm/book3s/64: Rework page table geometry for lower memory usage powerpc: Fix distclean with Makefile.postlink powerpc/64e: Don't place the stack beyond TASK_SIZE powerpc/powernv: Block PCI config access on BCM5718 during EEH recovery powerpc/8xx: Adding support of IRQ in MPC8xx GPIO soc/fsl/qbman: Disable IRQs for deferred QBMan work soc/fsl/qe: add EXPORT_SYMBOL for the 2 qe_tdm functions soc/fsl/qe: only apply QE_General4 workaround on affected SoCs soc/fsl/qe: round brg_freq to 1kHz granularity soc/fsl/qe: get rid of immrbar_virt_to_phys() net: ethernet: ucc_geth: fix MEM_PART_MURAM mode powerpc/64e: Fix hang when debugging programs with relocated kernel
2017-05-09powerpc/64s: Fix unnecessary machine check handler relocation branchNicholas Piggin1-3/+1
Similarly to commit 2563a70c3b ("powerpc/64s: Remove unnecessary relocation branch from idle handler"), the machine check handler has a BRANCH_TO from relocated to relocated code, which is unnecessary. It has also caused build errors with some toolchains: arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:395: Error: operand out of range (0xffffffffffff8280 is not between 0x0000000000000000 and 0x000000000000ffff) Fixes: 1945bc4549e5 ("powerpc/64s: Fix POWER9 machine check handler from stop state") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reported-and-tested-by : Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-05Merge tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-99/+87
Pull powerpc updates from Michael Ellerman: "Highlights include: - Larger virtual address space on 64-bit server CPUs. By default we use a 128TB virtual address space, but a process can request access to the full 512TB by passing a hint to mmap(). - Support for the new Power9 "XIVE" interrupt controller. - TLB flushing optimisations for the radix MMU on Power9. - Support for CAPI cards on Power9, using the "Coherent Accelerator Interface Architecture 2.0". - The ability to configure the mmap randomisation limits at build and runtime. - Several small fixes and cleanups to the kprobes code, as well as support for KPROBES_ON_FTRACE. - Major improvements to handling of system reset interrupts, correctly treating them as NMIs, giving them a dedicated stack and using a new hypervisor call to trigger them, all of which should aid debugging and robustness. - Many fixes and other minor enhancements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt, Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy, Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy, Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin, Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C. Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar, Yang Shi" * tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits) powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it powerpc/powernv: Fix TCE kill on NVLink2 powerpc/mm/radix: Drop support for CPUs without lockless tlbie powerpc/book3s/mce: Move add_taint() later in virtual mode powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body powerpc/smp: Document irq enable/disable after migrating IRQs powerpc/mpc52xx: Don't select user-visible RTAS_PROC powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset() powerpc/eeh: Clean up and document event handling functions powerpc/eeh: Avoid use after free in eeh_handle_special_event() cxl: Mask slice error interrupts after first occurrence cxl: Route eeh events to all drivers in cxl_pci_error_detected() cxl: Force context lock during EEH flow powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST powerpc/xmon: Teach xmon oops about radix vectors powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids powerpc/pseries: Enable VFIO powerpc/powernv: Fix iommu table size calculation hook for small tables powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc powerpc: Add arch/powerpc/tools directory ...
2017-04-28powerpc/64s: Dedicated system reset interrupt stackNicholas Piggin1-3/+5
The system reset interrupt is used for crash/debug situations, so it is desirable to have as little impact on the normal state of the system as possible. Currently it uses the current kernel stack to process the exception. This stores into the stack which may be involved with the crash. The stack pointer may be corrupted, or it may have overflowed. Avoid or minimise these problems by creating a dedicated NMI stack for the system reset interrupt to use. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28powerpc/64s: Disallow system reset vs system reset reentrancyNicholas Piggin1-5/+32
In preparation for using a dedicated stack for system reset interrupts, prevent a nested system reset from recovering, in order to simplify code that is called in crash/debug path. This allows a system reset interrupt to just use the base stack pointer. Keep an in_nmi nesting counter similarly to the in_mce counter. Consider the interrrupt non-recoverable if it is taken inside another system reset. Interrupt nesting could be allowed similarly to MCE, but system reset is a special case that's not for normal operation, so simplicity wins until there is requirement for nested system reset interrupts. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28powerpc/64s: Fix system reset vs general interrupt reentrancyNicholas Piggin1-3/+6
The system reset interrupt can occur when MSR_EE=0, and it currently uses the PACA_EXGEN save area. Some PACA_EXGEN interrupts have a window where MSR_RI=1 and MSR_EE=0 when the save area is still in use. A system reset interrupt in this window can lead to undetected corruption when the save area gets overwritten. This patch introduces PACA_EXNMI save area for system reset exceptions, which closes this corruption window. It's also helpful to retain the EXGEN state for debugging situations, even if not considering the recoverability aspect. This patch also moves the PACA_EXMC area down to a less frequently used part of the paca with the new save area. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28powerpc/64s: Exception macro for stack frame and initial register saveNicholas Piggin1-9/+4
This code is common to a few exceptions, and another user will be added. This causes a trivial change to generated code: - 604: std r9,416(r1) - 608: mfspr r11,314 - 60c: std r11,368(r1) - 610: mfspr r12,315 + 604: mfspr r11,314 + 608: mfspr r12,315 + 60c: std r9,416(r1) + 610: std r11,368(r1) machine_check_powernv_early could also use this, but that requires non trivial changes to generated code, so that's for another patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28powerpc/64s: Add exception macro that does not enable RINicholas Piggin1-13/+4
Subsequent patches will add more non-RI variant exceptions, so create a macro for it rather than open-code it. This does not change generated instructions. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23powerpc/64s: Fix POWER9 machine check handler from stop stateNicholas Piggin1-35/+44
The ISA specifies power save wakeup due to a machine check exception can cause a machine check interrupt (rather than the usual system reset interrupt). The machine check handler copes with this by doing low level machine check recovery without restoring full state from idle, then queues up a machine check event for logging, then directly executes the same idle instruction it woke from. This minimises the work done before recovery is performed. The problem is that it requires machine specific instructions and knowledge of the book3s idle code. Currently it only has code to handle POWER8 idle, so POWER9 crashes when trying to execute the P8 idle instructions which don't exist in ISAv3.0B. cpu 0x0: Vector: e40 (Emulation Assist) at [c0000000008f3810] pc: c000000000008380: machine_check_handle_early+0x130/0x2f0 lr: c00000000053a098: stop_loop+0x68/0xd0 sp: c0000000008f3a90 msr: 9000000000081001 current = 0xc0000000008a1080 paca = 0xc00000000ffd0000 softe: 0 irq_happened: 0x01 pid = 0, comm = swapper/0 Instead of going to sleep after recovery, do the usual idle wakeup and state restoration by calling into the normal idle wakeup path. This reuses the normal idle wakeup paths. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23powerpc/64s: Stop using bit in HSPRG0 to test winkleNicholas Piggin1-18/+3
The POWER8 idle code has a neat trick of programming the power on engine to restore a low bit into HSPRG0, so idle wakeup code can test and see if it has been programmed this way and therefore lost all state. Restore time can be reduced if winkle has not been reached. However this messes with our r13 PACA pointer, and requires HSPRG0 to be written to. It also optimizes the slowest and most uncommon case at the expense of another SPR write in the common nap state wakeup. Remove this complexity and assume winkle sleeps always require a state restore. This speedup could be made entirely contained within the winkle idle code by counting per-core winkles and setting a thread bitmap when all have gone to winkle. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23powerpc/64s: Move remaining system reset idle code into idle_book3s.SNicholas Piggin1-25/+1
No functional change. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23powerpc/64s: Remove unnecessary relocation branch from idle handlerNicholas Piggin1-1/+1
The system reset idle handler system_reset_idle_common is relocated, so relocation is not required to branch to kvm_start_guest. The superfluous relocation does not result in incorrect code, but it does not compile outside of exception-64s.S (with fixed section definitions). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-18powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=yMichael Ellerman1-1/+1
Prior to commit 2337d207288f ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts"), the branch from hmi_exception_early() to hmi_exception_realmode() was just a bl hmi_exception_realmode, which the linker would turn into a bl to the local entry point of hmi_exception_realmode. This was broken when CONFIG_RELOCATABLE=y because hmi_exception_realmode() is not in the low part of the kernel text that is copied down to 0x0. But in fixing that, we added a new bug on little endian kernels. Because the branch is now a bctrl when CONFIG_RELOCATABLE=y, we branch to the global entry point of hmi_exception_realmode(). The global entry point must be called with r12 containing the address of hmi_exception_realmode(), because it uses that value to calculate the TOC value (r2). This may manifest as a checkstop, because we take a junk value from r12 which came from HSRR1, add a small constant to it and then use that as the TOC pointer. The HSRR1 value will have 0x9 as the top nibble, which puts it above RAM and somewhere in MMIO space. Fix it by changing the BRANCH_LINK_TO_FAR() macro to always use r12 to load the label we're branching to. This means r12 will be setup correctly on LE, fixing this bug, and r12 is also volatile across function calls on BE so it's a good choice anyway. Fixes: 2337d207288f ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts") Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-14Merge branch 'topic/ppc-kvm' into nextMichael Ellerman1-30/+33
Merge the topic branch we're sharing with the kvm-ppc tree.
2017-02-07powerpc/64: CONFIG_RELOCATABLE support for hmi interruptsNicholas Piggin1-1/+1
The branch from hmi_exception_early to hmi_exception_realmode must use a "relocatable-style" branch, because it is branching from unrelocated exception code to beyond __end_interrupts. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-07powerpc/64s: Use (start, size) rather than (start, end) for exception handlersNicholas Piggin1-92/+103
start,size has the benefit of being easier to search for (start,end usually gives you the preceeding vector from the one you want, as first result). Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-07powerpc/64s: Tidy up after exception handler reworkNicholas Piggin1-1/+1
Somewhere along the line, search/replace left some naming garbled, and untidy alignment (aka. mpe stuffed it up). Might as well fix them all up now while git blame history doesn't extend too far. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31powerpc/64: Allow for relocation-on interrupts from guest to hostPaul Mackerras1-24/+29
With host and guest both using radix translation, it is feasible for the host to take interrupts that come from the guest with relocation on, and that is in fact what the POWER9 hardware will do when LPCR[AIL] = 3. All such interrupts use HSRR0/1 not SRR0/1 except for system call with LEV=1 (hcall). Therefore this adds the KVM tests to the _HV variants of the relocation-on interrupt handlers, and adds the KVM test to the relocation-on system call entry point. We also instantiate the relocation-on versions of the hypervisor data storage and instruction interrupt handlers, since these can occur with relocation on in radix guests. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>