aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-07-05Merge tag 'pstore-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds1-9/+5
Pull pstore updates from Kees Cook: "Various fixes and tweaks for the pstore subsystem. Highlights: - use memdup_user() instead of open-coded copies (Geliang Tang) - fix record memory leak during initialization (Douglas Anderson) - avoid confused compressed record warning (Ankit Kumar) - prepopulate record timestamp and remove redundant logic from backends" * tag 'pstore-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: powerpc/nvram: use memdup_user pstore: use memdup_user pstore: Fix format string to use %u for record id pstore: Populate pstore record->time field pstore: Create common record initializer efi-pstore: Refactor erase routine pstore: Avoid potential infinite loop pstore: Fix leaked pstore_record in pstore_get_backend_records() pstore: Don't warn if data is uncompressed and type is not PSTORE_TYPE_DMESG
2017-07-03Merge tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds5-43/+64
Pull driver core updates from Greg KH: "Here is the big driver core update for 4.13-rc1. The large majority of this is a lot of cleanup of old fields in the driver core structures and their remaining usages in random drivers. All of those fixes have been reviewed by the various subsystem maintainers. There's also some small firmware updates in here, a new kobject uevent api interface that makes userspace interaction easier, and a few other minor things. All of these have been in linux-next for a long while with no reported issues" * tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (56 commits) arm: mach-rpc: ecard: fix build error zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO() driver-core: remove struct bus_type.dev_attrs powerpc: vio_cmo: use dev_groups and not dev_attrs for bus_type powerpc: vio: use dev_groups and not dev_attrs for bus_type USB: usbip: convert to use DRIVER_ATTR_RW s390: drivers: convert to use DRIVER_ATTR_RO/WO platform: thinkpad_acpi: convert to use DRIVER_ATTR_RO/RW pcmcia: ds: convert to use DRIVER_ATTR_RO wireless: ipw2x00: convert to use DRIVER_ATTR_RW net: ehea: convert to use DRIVER_ATTR_RO net: caif: convert to use DRIVER_ATTR_RO TTY: hvc: convert to use DRIVER_ATTR_RW PCI: pci-driver: convert to use DRIVER_ATTR_WO IB: nes: convert to use DRIVER_ATTR_RW HID: hid-core: convert to use DRIVER_ATTR_RO and drv_groups arm: ecard: fix dev_groups patch typo tty: serdev: use dev_groups and not dev_attrs for bus_type sparc: vio: use dev_groups and not dev_attrs for bus_type hid: intel-ish-hid: use dev_groups and not dev_attrs for bus_type ...
2017-07-03Merge tag 'tty-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds1-0/+1
Pull tty/serial updates from Greg KH: "Here is the large tty/serial patchset for 4.13-rc1. A lot of tty and serial driver updates are in here, along with some fixups for some __get/put_user usages that were reported. Nothing huge, just lots of development by a number of different developers, full details in the shortlog. All of these have been in linux-next for a while" * tag 'tty-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (71 commits) tty: serial: lpuart: add a more accurate baud rate calculation method tty: serial: lpuart: add earlycon support for imx7ulp tty: serial: lpuart: add imx7ulp support dt-bindings: serial: fsl-lpuart: add i.MX7ULP support tty: serial: lpuart: add little endian 32 bit register support tty: serial: lpuart: refactor lpuart32_{read|write} prototype tty: serial: lpuart: introduce lpuart_soc_data to represent SoC property serial: imx-serial - move DMA buffer configuration to DT serial: imx: Enable RTSD only when needed serial: imx: Remove unused members from imx_port struct serial: 8250: 8250_omap: Fix race b/w dma completion and RX timeout serial: 8250: Fix THRE flag usage for CAP_MINI tty/serial: meson_uart: update to stable bindings dt-bindings: serial: Add bindings for the Amlogic Meson UARTs serial: Delete dead code for CIR serial ports serial: sirf: make of_device_ids const serial/mpsc: switch to dma_alloc_attrs tty: serial: Add Actions Semi Owl UART earlycon dt-bindings: serial: Document Actions Semi Owl UARTs tty/serial: atmel: make the driver DT only ...
2017-07-03Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds5-14/+37
Pull SMP hotplug updates from Thomas Gleixner: "This update is primarily a cleanup of the CPU hotplug locking code. The hotplug locking mechanism is an open coded RWSEM, which allows recursive locking. The main problem with that is the recursive nature as it evades the full lockdep coverage and hides potential deadlocks. The rework replaces the open coded RWSEM with a percpu RWSEM and establishes full lockdep coverage that way. The bulk of the changes fix up recursive locking issues and address the now fully reported potential deadlocks all over the place. Some of these deadlocks have been observed in the RT tree, but on mainline the probability was low enough to hide them away." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) cpu/hotplug: Constify attribute_group structures powerpc: Only obtain cpu_hotplug_lock if called by rtasd ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init cpu/hotplug: Remove unused check_for_tasks() function perf/core: Don't release cred_guard_mutex if not taken cpuhotplug: Link lock stacks for hotplug callbacks acpi/processor: Prevent cpu hotplug deadlock sched: Provide is_percpu_thread() helper cpu/hotplug: Convert hotplug locking to percpu rwsem s390: Prevent hotplug rwsem recursion arm: Prevent hotplug rwsem recursion arm64: Prevent cpu hotplug rwsem recursion kprobes: Cure hotplug lock ordering issues jump_label: Reorder hotplug lock and jump_label_lock perf/tracing/cpuhotplug: Fix locking order ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() PCI: Replace the racy recursion prevention PCI: Use cpu_hotplug_disable() instead of get_online_cpus() perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() x86/perf: Drop EXPORT of perf_check_microcode ...
2017-07-03Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull x86 mm updates from Ingo Molnar: "The main changes in this cycle were: - Continued work to add support for 5-level paging provided by future Intel CPUs. In particular we switch the x86 GUP code to the generic implementation. (Kirill A. Shutemov) - Continued work to add PCID CPU support to native kernels as well. In this round most of the focus is on reworking/refreshing the TLB flush infrastructure for the upcoming PCID changes. (Andy Lutomirski)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) x86/mm: Delete a big outdated comment about TLB flushing x86/mm: Don't reenter flush_tlb_func_common() x86/KASLR: Fix detection 32/64 bit bootloaders for 5-level paging x86/ftrace: Exclude functions in head64.c from function-tracing x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap x86/mm: Remove reset_lazy_tlbstate() x86/ldt: Simplify the LDT switching logic x86/boot/64: Put __startup_64() into .head.text x86/mm: Add support for 5-level paging for KASLR x86/mm: Make kernel_physical_mapping_init() support 5-level paging x86/mm: Add sync_global_pgds() for configuration with 5-level paging x86/boot/64: Add support of additional page table level during early boot x86/boot/64: Rename init_level4_pgt and early_level4_pgt x86/boot/64: Rewrite startup_64() in C x86/boot/compressed: Enable 5-level paging during decompression stage x86/boot/efi: Define __KERNEL32_CS GDT on 64-bit configurations x86/boot/efi: Fix __KERNEL_CS definition of GDT entry on 64-bit configurations x86/boot/efi: Cleanup initialization of GDT entries x86/asm: Fix comment in return_from_SYSCALL_64() x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation ...
2017-07-03Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...
2017-06-30Merge tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-7/+1
Pull powerpc fixes from Michael Ellerman: "Hopefully the last two powerpc fixes for 4.12. The CXL one is larger than I'd usually send at rc7, but it fixes new code this cycle, so better to have it working for the release. It was actually sent a few weeks back but got blocked in testing behind another fix that was causing issues. We are still tracking one crash in v4.12-rc7, but only one person has reproduced it and the commit identified by bisect doesn't touch any of the relevant code, so I think it's 50/50 whether that commit is actually the problem or it's some code layout / toolchain issue. Two fixes for code we merged this cycle: - cxl: Fixes for Coherent Accelerator Interface Architecture 2.0 - Avoid miscompilation w/GCC 4.6.3 on 32-bit - don't inline copy_to/from_user() Thanks to Al Viro, Larry Finger, Christophe Lombard" * tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user() cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
2017-06-28arch: remove unused macro/function thread_saved_pc()Tobias Klauser1-6/+0
The only user of thread_saved_pc() in non-arch-specific code was removed in commit 8243d5597793 ("sched/core: Remove pointless printout in sched_show_task()"). Remove the implementations as well. Some architectures use thread_saved_pc() in their arch-specific code. Leave their thread_saved_pc() intact. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-27powerpc/nvram: use memdup_userGeliang Tang1-9/+5
Use memdup_user() helper instead of open-coding to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-06-26powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()Michael Ellerman1-7/+1
Larry Finger reported that his Powerbook G4 was no longer booting with v4.12-rc, userspace was up but giving weird errors such as: udevd[64]: starting version 175 udevd[64]: Unable to receive ctrl message: Bad address. modprobe: chdir(4.12-rc1): No such file or directory He bisected the problem to commit 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER"). Al identified that the problem is actually a miscompilation by GCC 4.6.3, which is exposed by the above commit. Al also pointed out that inlining copy_to/from_user() is probably of little or no benefit, which is correct. Using Anton's copy_to_user benchmark, with a pathological single byte copy, we see a small increase in performance by *removing* inlining: Before (inlined): # time ./copy_to_user -w -l 1 -i 10000000 ( x 3 ) real 0m22.063s real 0m22.059s real 0m22.076s After: # time ./copy_to_user -w -l 1 -i 10000000 ( x 3 ) real 0m21.325s real 0m21.299s real 0m21.364s So as a small performance improvement and to avoid the miscompilation, drop inlining copy_to/from_user() on 32-bit. Fixes: 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER") Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-24Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar10-70/+284
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23Merge tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds7-50/+166
Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.12. Most of these actually came in last week but got held up for some more testing. - three fixes for kprobes/ftrace/livepatch interactions. - properly handle data breakpoints when using the Radix MMU. - fix for perf sampling of registers during call_usermodehelper(). - properly initialise the thread_info on our emergency stacks - add an explicit flush when doing TLB invalidations for a process using NPU2. Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi Bangoria, Masami Hiramatsu" * tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Initialise thread_info for emergency stacks powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD powerpc/perf: Fix oops when kthread execs user process powerpc/64s: Handle data breakpoints in Radix mode powerpc/kprobes: Skip livepatch_handler() for jprobes powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS powerpc/kprobes: Pause function_graph tracing during jprobes handling
2017-06-23powerpc: Only obtain cpu_hotplug_lock if called by rtasdThiago Jung Bauermann3-4/+26
Calling arch_update_cpu_topology from a CPU hotplug state machine callback hits a deadlock because the function tries to get a read lock on cpu_hotplug_lock while the state machine still holds a write lock on it. Since all callers of arch_update_cpu_topology except rtasd already hold cpu_hotplug_lock, this patch changes the function to use stop_machine_cpuslocked and creates a separate function for rtasd which still tries to obtain the lock. Michael Bringmann investigated the bug and provided a detailed analysis of the deadlock on this previous RFC for an alternate solution: Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michael Bringmann <mwb@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1497996510-4032-1-git-send-email-bauerman@linux.vnet.ibm.com Link: https://patchwork.ozlabs.org/patch/771293/
2017-06-23powerpc/64: Initialise thread_info for emergency stacksNicholas Piggin1-3/+28
Emergency stacks have their thread_info mostly uninitialised, which in particular means garbage preempt_count values. Emergency stack code runs with interrupts disabled entirely, and is used very rarely, so this has been unnoticed so far. It was found by a proposed new powerpc watchdog that takes a soft-NMI directly from the masked_interrupt handler and using the emergency stack. That crashed at BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be garbage. To fix this, zero the entire THREAD_SIZE allocation, and initialize the thread_info. Cc: stable@vger.kernel.org Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Move it all into setup_64.c, use a function not a macro. Fix crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-22powerpc/powernv/npu-dma: Add explicit flush when sending an ATSDAlistair Popple1-29/+65
NPU2 requires an extra explicit flush to an active GPU PID when sending address translation shoot downs (ATSDs) to reliably flush the GPU TLB. This patch adds just such a flush at the end of each sequence of ATSDs. We can safely use PID 0 which is always reserved and active on the GPU. PID 0 is only used for init_mm which will never be a user mm on the GPU. To enforce this we add a check in pnv_npu2_init_context() just in case someone tries to use PID 0 on the GPU. Signed-off-by: Alistair Popple <alistair@popple.id.au> [mpe: Use true/false for bool literals] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-22Merge branch 'linus' into x86/mm, to pick up fixesIngo Molnar9-19/+17
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcRadim Krčmář3-20/+118
* fix problems that could cause hangs or crashes in the host on POWER9 * fix problems that could allow guests to potentially affect or disrupt the execution of the controlling userspace
2017-06-20Merge branch 'WIP.sched/core' into sched/coreIngo Molnar28-74/+137
Conflicts: kernel/sched/Makefile Pick up the waitqueue related renames - it didn't get much feedback, so it appears to be uncontroversial. Famous last words? ;-) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-19mm: larger stack guard gap, between vmasHugh Dickins3-4/+4
Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-17Merge tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds5-10/+13
Pull powerpc fixes from Michael Ellerman: "Three small fixes for recently merged code: - remove a spurious WARN_ON when a PCI device has no of_node, it's allowed in some circumstances for there to be no of_node. - fix the offset for store EOI MMIOs in the XIVE interrupt controller. - fix non-const WARN_ONs which were becoming BUGs due to them losing BUGFLAG_WARNING in a recent cleanup patch. Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin Herrenschmidt" * tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path powerpc/xive: Fix offset for store EOI MMIOs powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
2017-06-16powerpc/perf: Fix oops when kthread execs user processRavi Bangoria1-1/+2
When a kthread calls call_usermodehelper() the steps are: 1. allocate current->mm 2. load_elf_binary() 3. populate current->thread.regs While doing this, interrupts are not disabled. If there is a perf interrupt in the middle of this process (i.e. step 1 has completed but not yet reached to step 3) and if perf tries to read userspace regs, kernel oops with following log: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0000000000da0fc ... Call Trace: perf_output_sample_regs+0x6c/0xd0 perf_output_sample+0x4e4/0x830 perf_event_output_forward+0x64/0x90 __perf_event_overflow+0x8c/0x1e0 record_and_restart+0x220/0x5c0 perf_event_interrupt+0x2d8/0x4d0 performance_monitor_exception+0x54/0x70 performance_monitor_common+0x158/0x160 --- interrupt: f01 at avtab_search_node+0x150/0x1a0 LR = avtab_search_node+0x100/0x1a0 ... load_elf_binary+0x6e8/0x15a0 search_binary_handler+0xe8/0x290 do_execveat_common.isra.14+0x5f4/0x840 call_usermodehelper_exec_async+0x170/0x210 ret_from_kernel_thread+0x5c/0x7c Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace pt_regs are not set. Fixes: ed4a4ef85cf5 ("powerpc/perf: Add support for sampling interrupt register state") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16powerpc/64s: Handle data breakpoints in Radix modeNaveen N. Rao1-4/+7
On Power9, trying to use data breakpoints throws the splat shown below. This is because the check for a data breakpoint in DSISR is in do_hash_page(), which is not called when in Radix mode. Unable to handle kernel paging request for data at address 0xc000000000e19218 Faulting instruction address: 0xc0000000001155e8 cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20] pc: c0000000001155e8: find_pid_ns+0x48/0xe0 lr: c000000000116ac4: find_task_by_vpid+0x44/0x90 sp: c0000000ef1e7da0 msr: 9000000000009033 dar: c000000000e19218 dsisr: 400000 Move the check to handle_page_fault() so as to catch data breakpoints in both Hash and Radix MMU modes. We have to change the check in do_hash_page() against 0xa410 to use 0xa450, so as to include the value of (DSISR_DABRMATCH << 16). There are two sites that call handle_page_fault() when in Radix, both already pass DSISR in r4. Fixes: caca285e5ab4 ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code") Cc: stable@vger.kernel.org # v4.7+ Reported-by: Shriya R. Kulkarni <shriykul@in.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Fix the fall-through case on hash, we need to reload DSISR] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16powerpc/kprobes: Skip livepatch_handler() for jprobesNaveen N. Rao3-5/+41
ftrace_caller() depends on a modified regs->nip to detect if a certain function has been livepatched. However, with KPROBES_ON_FTRACE, it is possible for regs->nip to have been modified by the kprobes pre_handler (jprobes, for instance). In this case, we do not want to invoke the livepatch_handler so as not to consume the livepatch stack. To distinguish between the two (kprobes and livepatch), we check if there is an active kprobe on the current function. If there is, then we know for sure that it must have modified the NIP as we don't support livepatching a kprobe'd function. In this case, we simply skip the livepatch_handler and branch to the new NIP. Otherwise, the livepatch_handler is invoked. Fixes: ead514d5fb30 ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGSNaveen N. Rao1-8/+12
For DYNAMIC_FTRACE_WITH_REGS, we should be passing-in the original set of registers in pt_regs, to capture the state _before_ ftrace_caller. However, we are instead passing the stack pointer *after* allocating a stack frame in ftrace_caller. Fix this by saving the proper value of r1 in pt_regs. Also, use SAVE_10GPRS() to simplify the code. Fixes: 153086644fd1 ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16powerpc/kprobes: Pause function_graph tracing during jprobes handlingNaveen N. Rao1-0/+11
This fixes a crash when function_graph and jprobes are used together. This is essentially commit 237d28db036e ("ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing"), but for powerpc. Jprobes breaks function_graph tracing since the jprobe hook needs to use jprobe_return(), which never returns back to the hook, but instead to the original jprobe'd function. The solution is to momentarily pause function_graph tracing before invoking the jprobe hook and re-enable it when returning back to the original jprobe'd function. Fixes: 6794c78243bf ("powerpc64: port of the function graph tracer") Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16powerpc/debug: Add missing warn flag to WARN_ON's non-builtin pathAlexey Kardashevskiy1-1/+1
When trapped on WARN_ON(), report_bug() is expected to return BUG_TRAP_TYPE_WARN so the caller will increment NIP by 4 and continue. The __builtin_constant_p() path of the PPC's WARN_ON() calls (indirectly) __WARN_FLAGS() which has BUGFLAG_WARNING set, however the other branch does not which makes report_bug() report a bug rather than a warning. Fixes: f26dee15103f ("debug: Avoid setting BUGFLAG_WARNING twice") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-16KVM: PPC: Book3S HV: Ignore timebase offset on POWER9 DD1Paul Mackerras1-0/+8
POWER9 DD1 has an erratum where writing to the TBU40 register, which is used to apply an offset to the timebase, can cause the timebase to lose counts. This results in the timebase on some CPUs getting out of sync with other CPUs, which then results in misbehaviour of the timekeeping code. To work around the problem, we make KVM ignore the timebase offset for all guests on POWER9 DD1 machines. This means that live migration cannot be supported on POWER9 DD1 machines. Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-16KVM: PPC: Book3S HV: Save/restore host values of debug registersPaul Mackerras1-13/+32
At present, HV KVM on POWER8 and POWER9 machines loses any instruction or data breakpoint set in the host whenever a guest is run. Instruction breakpoints are currently only used by xmon, but ptrace and the perf_event subsystem can set data breakpoints as well as xmon. To fix this, we save the host values of the debug registers (CIABR, DAWR and DAWRX) before entering the guest and restore them on exit. To provide space to save them in the stack frame, we expand the stack frame allocated by kvmppc_hv_entry() from 112 to 144 bytes. Fixes: b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-15powerpc/xive: Fix offset for store EOI MMIOsBenjamin Herrenschmidt3-8/+10
Architecturally we should apply a 0x400 offset for these. Not doing it will break future HW implementations. The offset of 0 is supposed to remain for "triggers" though not all sources support both trigger and store EOI, and in P9 specifically, some sources will treat 0 as a store EOI. But future chips will not. So this makes us use the properly architected offset which should work always. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15KVM: PPC: Book3S HV: Preserve userspace HTM state properlyPaul Mackerras1-0/+21
If userspace attempts to call the KVM_RUN ioctl when it has hardware transactional memory (HTM) enabled, the values that it has put in the HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by guest values. To fix this, we detect this condition and save those SPR values in the thread struct, and disable HTM for the task. If userspace goes to access those SPRs or the HTM facility in future, a TM-unavailable interrupt will occur and the handler will reload those SPRs and re-enable HTM. If userspace has started a transaction and suspended it, we would currently lose the transactional state in the guest entry path and would almost certainly get a "TM Bad Thing" interrupt, which would cause the host to crash. To avoid this, we detect this case and return from the KVM_RUN ioctl with an EINVAL error, with the KVM exit reason set to KVM_EXIT_FAIL_ENTRY. Fixes: b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-15KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exitPaul Mackerras2-3/+17
This restores several special-purpose registers (SPRs) to sane values on guest exit that were missed before. TAR and VRSAVE are readable and writable by userspace, and we need to save and restore them to prevent the guest from potentially affecting userspace execution (not that TAR or VRSAVE are used by any known program that run uses the KVM_RUN ioctl). We save/restore these in kvmppc_vcpu_run_hv() rather than on every guest entry/exit. FSCR affects userspace execution in that it can prohibit access to certain facilities by userspace. We restore it to the normal value for the task on exit from the KVM_RUN ioctl. IAMR is normally 0, and is restored to 0 on guest exit. However, with a radix host on POWER9, it is set to a value that prevents the kernel from executing user-accessible memory. On POWER9, we save IAMR on guest entry and restore it on guest exit to the saved value rather than 0. On POWER8 we continue to set it to 0 on guest exit. PSPB is normally 0. We restore it to 0 on guest exit to prevent userspace taking advantage of the guest having set it non-zero (which would allow userspace to set its SMT priority to high). UAMOR is normally 0. We restore it to 0 on guest exit to prevent the AMR from being used as a covert channel between userspace processes, since the AMR is not context-switched at present. Fixes: b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-14powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_nodeAlistair Popple1-1/+2
Commit 4c3b89effc28 ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev") introduced explicit warnings in pnv_pci_get_npu_dev() when a PCIe device has no associated device-tree node. However not all PCIe devices have an of_node and pnv_pci_get_npu_dev() gets indirectly called at least once for every PCIe device in the system. This results in spurious WARN_ON()'s so remove it. The same situation should not exist for pnv_pci_get_gpu_dev() as any NPU based PCIe device requires a device-tree node. Fixes: 4c3b89effc28 ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-13x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementationKirill A. Shutemov1-1/+1
This patch provides all required callbacks required by the generic get_user_pages_fast() code and switches x86 over - and removes the platform specific implementation. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170606113133.22974-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-13KVM: PPC: Book3S HV: Context-switch EBB registers properlyPaul Mackerras1-0/+15
This adds code to save the values of three SPRs (special-purpose registers) used by userspace to control event-based branches (EBBs), which are essentially interrupts that get delivered directly to userspace. These registers are loaded up with guest values when entering the guest, and their values are saved when exiting the guest, but we were not saving the host values and restoring them before going back to userspace. On POWER8 this would only affect userspace programs which explicitly request the use of EBBs and also use the KVM_RUN ioctl, since the only source of EBBs on POWER8 is the PMU, and there is an explicit enable bit in the PMU registers (and those PMU registers do get properly context-switched between host and guest). On POWER9 there is provision for externally-generated EBBs, and these are not subject to the control in the PMU registers. Since these registers only affect userspace, we can save them when we first come in from userspace and restore them before returning to userspace, rather than saving/restoring the host values on every guest entry/exit. Similarly, we don't need to worry about their values on offline secondary threads since they execute in the context of the idle task, which never executes in userspace. Fixes: b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-12powerpc: vio_cmo: use dev_groups and not dev_attrs for bus_typeGreg Kroah-Hartman1-23/+33
The dev_attrs field has long been "depreciated" and is finally being removed, so move the driver to use the "correct" dev_groups field instead for struct bus_type. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Johan Hovold <johan@kernel.org> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: <linuxppc-dev@lists.ozlabs.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-12powerpc: vio: use dev_groups and not dev_attrs for bus_typeGreg Kroah-Hartman1-6/+10
The dev_attrs field has long been "depreciated" and is finally being removed, so move the driver to use the "correct" dev_groups field instead for struct bus_type. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Johan Hovold <johan@kernel.org> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: <linuxppc-dev@lists.ozlabs.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-11Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds1-5/+0
Pull key subsystem fixes from James Morris: "Here are a bunch of fixes for Linux keyrings, including: - Fix up the refcount handling now that key structs use the refcount_t type and the refcount_t ops don't allow a 0->1 transition. - Fix a potential NULL deref after error in x509_cert_parse(). - Don't put data for the crypto algorithms to use on the stack. - Fix the handling of a null payload being passed to add_key(). - Fix incorrect cleanup an uninitialised key_preparsed_payload in key_update(). - Explicit sanitisation of potentially secure data before freeing. - Fixes for the Diffie-Helman code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits) KEYS: fix refcount_inc() on zero KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API crypto : asymmetric_keys : verify_pefile:zero memory content before freeing KEYS: DH: add __user annotations to keyctl_kdf_params KEYS: DH: ensure the KDF counter is properly aligned KEYS: DH: don't feed uninitialized "otherinfo" into KDF KEYS: DH: forbid using digest_null as the KDF hash KEYS: sanitize key structs before freeing KEYS: trusted: sanitize all key material KEYS: encrypted: sanitize all key material KEYS: user_defined: sanitize key payloads KEYS: sanitize add_key() and keyctl() key payloads KEYS: fix freeing uninitialized memory in key_update() KEYS: fix dereferencing NULL payload with nonzero length KEYS: encrypted: use constant-time HMAC comparison KEYS: encrypted: fix race causing incorrect HMAC calculations KEYS: encrypted: fix buffer overread in valid_master_desc() KEYS: encrypted: avoid encrypting/decrypting stack buffers KEYS: put keyring if install_session_keyring_to_cred() fails KEYS: Delete an error message for a failed memory allocation in get_derived_key() ...
2017-06-09Merge tag 'powerpc-4.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds16-50/+109
Pull powerpc fixes from Michael Ellerman: "Mostly fairly minor, of note are: - Fix percpu allocations to be NUMA aware - Limit 4k page size config to 64TB virtual address space - Avoid needlessly restoring FP and vector registers Thanks to Aneesh Kumar K.V, Breno Leitao, Christophe Leroy, Frederic Barrat, Madhavan Srinivasan, Michael Bringmann, Nicholas Piggin, Vaibhav Jain" * tag 'powerpc-4.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by default powerpc/mm/4k: Limit 4k page size config to 64TB virtual address space cxl: Fix error path on bad ioctl powerpc/perf: Fix Power9 test_adder fields powerpc/numa: Fix percpu allocations to be NUMA aware cxl: Avoid double free_irq() for psl,slice interrupts powerpc/kernel: Initialize load_tm on task creation powerpc/kernel: Fix FP and vector register restoration powerpc/64: Reclaim CPU_FTR_SUBCORE powerpc/hotplug-mem: Fix missing endian conversion of aa_index powerpc/sysdev/simple_gpio: Fix oops in gpio save_regs function powerpc/spufs: Fix coredump of SPU contexts powerpc/64s: Add dt_cpu_ftrs boot time setup option
2017-06-09tty: add TIOCGPTPEER ioctlAleksa Sarai1-0/+1
When opening the slave end of a PTY, it is not possible for userspace to safely ensure that /dev/pts/$num is actually a slave (in cases where the mount namespace in which devpts was mounted is controlled by an untrusted process). In addition, there are several unresolvable race conditions if userspace were to attempt to detect attacks through stat(2) and other similar methods [in addition it is not clear how userspace could detect attacks involving FUSE]. Resolve this by providing an interface for userpace to safely open the "peer" end of a PTY file descriptor by using the dentry cached by devpts. Since it is not possible to have an open master PTY without having its slave exposed in /dev/pts this interface is safe. This interface currently does not provide a way to get the master pty (since it is not clear whether such an interface is safe or even useful). Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Valentin Rothberg <vrothberg@suse.com> Signed-off-by: Aleksa Sarai <asarai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-09powerpc: ibmebus: use dev_groups and not dev_attrs for bus_typeGreg Kroah-Hartman1-6/+10
The dev_attrs field has long been "depreciated" and is finally being removed, so move the driver to use the "correct" dev_groups field instead for struct bus_type. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Johan Hovold <johan@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-09powerpc: ps3: use dev_groups and not dev_attrs for bus_typeGreg Kroah-Hartman1-4/+6
The dev_attrs field has long been "depreciated" and is finally being removed, so move the driver to use the "correct" dev_groups field instead for struct bus_type. Seems-ok: Geoff Levand <geoff@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-09security/keys: add CONFIG_KEYS_COMPAT to KconfigBilal Amarni1-5/+0
CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for several 64-bit architectures : mips, parisc, tile. At the moment and for those architectures, calling in 32-bit userspace the keyctl syscall would return an ENOSYS error. This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to make sure the compatibility wrapper is registered by default for any 64-bit architecture as long as it is configured with CONFIG_COMPAT. [DH: Modified to remove arm64 compat enablement also as requested by Eric Biggers] Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> cc: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-08powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by defaultMichael Ellerman2-11/+11
The PPC_DT_CPU_FTRs is a bit misplaced in menuconfig, it shows up with other general kernel options. It's really more at home in the "Platform Support" section, so move it there. Also enable it by default, for Book3s 64. It does mostly nothing unless the device tree properties are found, and we will want it enabled eventually in distro kernels, so turn it on to start getting more testing. Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-08powerpc/mm/4k: Limit 4k page size config to 64TB virtual address spaceAneesh Kumar K.V4-16/+15
Supporting 512TB requires us to do a order 3 allocation for level 1 page table (pgd). This results in page allocation failures with certain workloads. For now limit 4k linux page size config to 64TB. Fixes: f6eedbba7a26 ("powerpc/mm/hash: Increase VA range to 128TB") Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-07driver core: remove CLASS_ATTR usageGreg Kroah-Hartman2-4/+5
There was only 2 remaining users of CLASS_ATTR() so let's finally get rid of them and force everyone to use the correct RW/RO/WO versions instead. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-06powerpc/perf: Fix Power9 test_adder fieldsMadhavan Srinivasan1-2/+2
Commit 8d911904f3ce4 ('powerpc/perf: Add restrictions to PMC5 in power9 DD1') was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable the use of PMC5 using raw event code. But instead of updating the power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the power9_pmu structure. Fix it. Fixes: 8d911904f3ce ("powerpc/perf: Add restrictions to PMC5 in power9 DD1") Reported-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06powerpc/numa: Fix percpu allocations to be NUMA awareMichael Ellerman2-2/+16
In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Cc: stable@vger.kernel.org # v3.16+ Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-06powerpc/kernel: Initialize load_tm on task creationBreno Leitao1-0/+1
Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-05powerpc/kernel: Fix FP and vector register restorationBreno Leitao1-0/+2
Currently tsk->thread->load_vec and load_fp are not initialized during task creation, which can lead to garbage values in these variables (non-zero values). These variables will be checked later in restore_math() to validate if the FP and vector registers are being utilized. Since these values might be non-zero, the restore_math() will continue to save the FP and vectors even if they were never utilized by the userspace application. load_fp and load_vec counters will then overflow (they wrap at 255) and the FP and Altivec will be finally disabled, but before that condition is reached (counter overflow) several context switches will have restored FP and vector registers without need, causing a performance degradation. Fixes: 70fe3d980f5f ("powerpc: Restore FPU/VEC/VSX if previously used") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-01powerpc/64: Reclaim CPU_FTR_SUBCOREMichael Ellerman3-4/+8
We are running low on CPU feature bits, so we only want to use them when it's really necessary. CPU_FTR_SUBCORE is only used in one place, and only in C, so we don't need it in order to make asm patching work. It can only be set on "Power8" CPUs, which in practice means POWER8, POWER8E and POWER8NVL. There are no plans to implement it on future CPUs, but if there ever were we could retrofit it then. Although KVM uses subcores, it never looks at the CPU feature, it either looks at the ISA level or the threads_per_subcore value. So drop the CPU feature and do a PVR check instead. Drop the device tree "subcore" feature as we no longer support doing anything with it, and we will drop it from skiboot too. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>