aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/perf (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-04-05Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds4-0/+3354
Pull ARM SoC driver updates from Arnd Bergmann: "The main addition this time around is the new ARM "SCMI" framework, which is the latest in a series of standards coming from ARM to do power management in a platform independent way. This has been through many review cycles, and it relies on a rather interesting way of using the mailbox subsystem, but in the end I agreed that Sudeep's version was the best we could do after all. Other changes include: - the ARM CCN driver is moved out of drivers/bus into drivers/perf, which makes more sense. Similarly, the performance monitoring portion of the CCI driver are moved the same way and cleaned up a little more. - a series of updates to the SCPI framework - support for the Mediatek mt7623a SoC in drivers/soc - support for additional NVIDIA Tegra hardware in drivers/soc - a new reset driver for Socionext Uniphier - lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and drivers/firmware and drivers/reset across platforms" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits) reset: uniphier: add ethernet reset control support for PXs3 reset: stm32mp1: Enable stm32mp1 reset driver dt-bindings: reset: add STM32MP1 resets reset: uniphier: add Pro4/Pro5/PXs2 audio systems reset control reset: imx7: add 'depends on HAS_IOMEM' to fix unmet dependency reset: modify the way reset lookup works for board files reset: add support for non-DT systems clk: scmi: use devm_of_clk_add_hw_provider() API and drop scmi_clocks_remove firmware: arm_scmi: prevent accessing rate_discrete uninitialized hwmon: (scmi) return -EINVAL when sensor information is unavailable amlogic: meson-gx-socinfo: Update soc ids soc/tegra: pmc: Use the new reset APIs to manage reset controllers soc: mediatek: update power domain data of MT2712 dt-bindings: soc: update MT2712 power dt-bindings cpufreq: scmi: add thermal dependency soc: mediatek: fix the mistaken pointer accessed when subdomains are added soc: mediatek: add SCPSYS power domain driver for MediaTek MT7623A SoC soc: mediatek: avoid hardcoded value with bus_prot_mask dt-bindings: soc: add header files required for MT7623A SCPSYS dt-binding dt-bindings: soc: add SCPSYS binding for MT7623 and MT7623A SoC ...
2018-04-04Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds2-1/+15
Pull arm64 updates from Will Deacon: "Nothing particularly stands out here, probably because people were tied up with spectre/meltdown stuff last time around. Still, the main pieces are: - Rework of our CPU features framework so that we can whitelist CPUs that don't require kpti even in a heterogeneous system - Support for the IDC/DIC architecture extensions, which allow us to elide instruction and data cache maintenance when writing out instructions - Removal of the large memory model which resulted in suboptimal codegen by the compiler and increased the use of literal pools, which could potentially be used as ROP gadgets since they are mapped as executable - Rework of forced signal delivery so that the siginfo_t is well-formed and handling of show_unhandled_signals is consolidated and made consistent between different fault types - More siginfo cleanup based on the initial patches from Eric Biederman - Workaround for Cortex-A55 erratum #1024718 - Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi - Misc cleanups and non-critical fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (70 commits) arm64: uaccess: Fix omissions from usercopy whitelist arm64: fpsimd: Split cpu field out from struct fpsimd_state arm64: tlbflush: avoid writing RES0 bits arm64: cmpxchg: Include linux/compiler.h in asm/cmpxchg.h arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.h arm64: cmpxchg: Include build_bug.h instead of bug.h for BUILD_BUG arm64: lse: Include compiler_types.h and export.h for out-of-line LL/SC arm64: fpsimd: include <linux/init.h> in fpsimd.h drivers/perf: arm_pmu_platform: do not warn about affinity on uniprocessor perf: arm_spe: include linux/vmalloc.h for vmap() Revert "arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)" arm64: cpufeature: Avoid warnings due to unused symbols arm64: Add work around for Arm Cortex-A55 Erratum 1024718 arm64: Delay enabling hardware DBM feature arm64: Add MIDR encoding for Arm Cortex-A55 and Cortex-A35 arm64: capabilities: Handle shared entries arm64: capabilities: Add support for checks based on a list of MIDRs arm64: Add helpers for checking CPU MIDR against a range arm64: capabilities: Clean up midr range helpers arm64: capabilities: Change scope of VHE to Boot CPU feature ...
2018-03-27drivers/perf: arm_pmu_platform: do not warn about affinity on uniprocessorAlexander Monakov1-1/+1
If there is exactly one CPU present, there is no ambiguity: do not warn that PMU setup would need to guess IRQ affinity. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27perf: arm_spe: include linux/vmalloc.h for vmap()Arnd Bergmann1-0/+14
On linux-next, I get a build failure in some configurations: drivers/perf/arm_spe_pmu.c: In function 'arm_spe_pmu_setup_aux': drivers/perf/arm_spe_pmu.c:857:14: error: implicit declaration of function 'vmap'; did you mean 'swap'? [-Werror=implicit-function-declaration] buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL); ^~~~ swap drivers/perf/arm_spe_pmu.c:857:37: error: 'VM_MAP' undeclared (first use in this function); did you mean 'VM_MPX'? buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL); ^~~~~~ VM_MPX drivers/perf/arm_spe_pmu.c:857:37: note: each undeclared identifier is reported only once for each function it appears in drivers/perf/arm_spe_pmu.c: In function 'arm_spe_pmu_free_aux': drivers/perf/arm_spe_pmu.c:878:2: error: implicit declaration of function 'vunmap'; did you mean 'iounmap'? [-Werror=implicit-function-declaration] vmap() is declared in linux/vmalloc.h, so we should include that header file. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> [will: add additional missing #includes reported by Mark] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-19Merge tag 'v4.16-rc6' into perf/core, to pick up fixesIngo Molnar1-1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-16perf: Fix sibling iterationPeter Zijlstra6-10/+9
Mark noticed that the change to sibling_list changed some iteration semantics; because previously we used group_list as list entry, sibling events would always have an empty sibling_list. But because we now use sibling_list for both list head and list entry, siblings will report as having siblings. Fix this with a custom for_each_sibling_event() iterator. Fixes: 8343aae66167 ("perf/core: Remove perf_event::group_entry") Reported-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: vincent.weaver@maine.edu Cc: alexander.shishkin@linux.intel.com Cc: torvalds@linux-foundation.org Cc: alexey.budankov@linux.intel.com Cc: valery.cherepennikov@intel.com Cc: eranian@google.com Cc: acme@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: davidcc@google.com Cc: kan.liang@intel.com Cc: Dmitry.Prohorov@intel.com Cc: jolsa@redhat.com Link: https://lkml.kernel.org/r/20180315170129.GX4043@hirez.programming.kicks-ass.net
2018-03-12perf/core: Remove perf_event::group_entryPeter Zijlstra6-8/+7
Now that all the grouping is done with RB trees, we no longer need group_entry and can replace the whole thing with sibling_list. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-06perf/arm-cci: Untangle global cci_ctrl_baseRobin Murphy1-23/+24
Depending directly on the bus driver's global cci_ctrl_base variable is a little unpleasant, and exporting it to allow the PMU driver to be modular would be even more so. Let's make things a little better abstracted by adding the control register block to the cci_pmu instance data alongside the PMU register block, and communicating the mapped address from the bus driver via platform data. It's not practical to try the same thing for the bus driver itself, given that the globals are entangled with the hairy assembly code for port control, so we leave them be there. It would however be prudent to move them to the __ro_after_init section in passing, since the addresses really should never be changing once set. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06perf/arm-cci: Clean up model discoveryRobin Murphy1-24/+16
Since I am the self-appointed of_device_get_match_data() police, it's only right that I should clean up this driver while I'm otherwise touching it. This also reveals that we're passing around a struct platform_device in places where we only ever care about its regular device, so straighten that out in the process. Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06perf/arm-cci: Simplify CPU hotplugRobin Murphy1-37/+19
Realistically, systems with multiple CCIs are unlikely to ever exist, and since the driver only actually supports a single instance anyway there's really no need to do the multi-instance hotplug state dance. Take the opportunity to simplify the hotplug-related code all over, addressing the context-migration TODO in the process for good measure. Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06drivers/bus: Split Arm CCI driverRobin Murphy3-0/+1774
The arm-cci driver is really two entirely separate drivers; one for MCPM port control and the other for the performance monitors. Since they are already relatively self-contained, let's take the plunge and move the PMU parts out to drivers/perf where they belong these days. For non-MCPM systems this leaves a small dependency on the remaining "bus" stub for initial probing and discovery, but we end up with something that still fits the general pattern of its fellow system PMU drivers to ease future maintenance. Moving code to a new file also offers a perfect excuse to modernise the license/copyright headers and clean up some funky linewraps on the way. Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06drivers/bus: Move Arm CCN PMU driverRobin Murphy3-0/+1605
The arm-ccn driver is purely a perf driver for the CCN PMU, not a bus driver in the sense of the other residents of drivers/bus/, so let's move it to the appropriate place for SoC PMU drivers. Not to mention moving the documentation accordingly as well. Acked-by: Pawel Moll <pawel.moll@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-02-28arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hookWill Deacon1-1/+1
Commit 6de3f79112cc ("arm_pmu: explicitly enable/disable SPIs at hotplug") moved all of the arm_pmu IRQ enable/disable calls to the CPU hotplug hooks, regardless of whether they are implemented as PPIs or SPIs. This can lead to us sleeping from atomic context due to disable_irq blocking: | BUG: sleeping function called from invalid context at kernel/irq/manage.c:112 | in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1 | no locks held by migration/1/15. | irq event stamp: 192 | hardirqs last enabled at (191): [<00000000803c2507>] | _raw_spin_unlock_irq+0x2c/0x4c | hardirqs last disabled at (192): [<000000007f57ad28>] multi_cpu_stop+0x9c/0x140 | softirqs last enabled at (0): [<0000000004ee1b58>] | copy_process.isra.77.part.78+0x43c/0x1504 | softirqs last disabled at (0): [< (null)>] (null) | CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651 | Hardware name: Renesas Salvator-X board based on r8a7796 (DT) | Call trace: | dump_backtrace+0x0/0x140 | show_stack+0x14/0x1c | dump_stack+0xb4/0xf0 | ___might_sleep+0x1fc/0x218 | __might_sleep+0x70/0x80 | synchronize_irq+0x40/0xa8 | disable_irq+0x20/0x2c | arm_perf_teardown_cpu+0x80/0xac Since the interrupt is always CPU-affine and this code is running with interrupts disabled, we can just use disable_irq_nosync as we know there isn't a concurrent invocation of the handler to worry about. Fixes: 6de3f79112cc ("arm_pmu: explicitly enable/disable SPIs at hotplug") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-20arm_pmu: acpi: request IRQs up-frontMark Rutland3-36/+20
We can't request IRQs in atomic context, so for ACPI systems we'll have to request them up-front, and later associate them with CPUs. This patch reorganises the arm_pmu code to do so. As we no longer have the arm_pmu structure at probe time, a number of prototypes need to be adjusted, requiring changes to the common arm_pmu code and arm_pmu platform code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: note IRQs and PMUs per-cpuMark Rutland1-17/+52
To support ACPI systems, we need to request IRQs before we know the associated PMU, and thus we need some percpu variable that the IRQ handler can find the PMU from. As we're going to request IRQs without the PMU, we can't rely on the arm_pmu::active_irqs mask, and similarly need to track requested IRQs with a percpu variable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [will: made armpmu_count_irq_users static] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: explicitly enable/disable SPIs at hotplugMark Rutland1-5/+10
To support ACPI systems, we need to request IRQs before CPUs are hotplugged, and thus we need to request IRQs before we know their associated PMU. This is problematic if a PMU IRQ is pending out of reset, as it may be taken before we know the PMU, and thus the IRQ handler won't be able to handle it, leaving it screaming. To avoid such problems, lets request all IRQs in a disabled state, and explicitly enable/disable them at hotplug time, when we're sure the PMU has been probed. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: acpi: check for mismatched PPIsMark Rutland3-24/+42
The arm_pmu platform code explicitly checks for mismatched PPIs at probe time, while the ACPI code leaves this to the core code. Future refactoring will make this difficult for the core code to check, so let's have the ACPI code check this explicitly. As before, upon a failure we'll continue on without an interrupt. Ho hum. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: add armpmu_alloc_atomic()Mark Rutland2-4/+15
In ACPI systems, we don't know the makeup of CPUs until we hotplug them on, and thus have to allocate the PMU datastructures at hotplug time. Thus, we must use GFP_ATOMIC allocations. Let's add an armpmu_alloc_atomic() that we can use in this case. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: fold platform helpers into platform codeMark Rutland2-21/+21
The armpmu_{request,free}_irqs() helpers are only used by arm_pmu_platform.c, so let's fold them in and make them static. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20arm_pmu: kill arm_pmu_platdataMark Rutland1-23/+4
Now that we have no platforms passing platform data to the arm_pmu code, we can get rid of the platdata and associated hooks, paving the way for rework of our IRQ handling. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-06bitmap: replace bitmap_{from,to}_u32arrayYury Norov1-4/+2
with bitmap_{from,to}_arr32 over the kernel. Additionally to it: * __check_eq_bitmap() now takes single nbits argument. * __check_eq_u32_array is not used in new test but may be used in future. So I don't remove it here, but annotate as __used. Tested on arm64 and 32-bit BE mips. [arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32] Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com [ynorov@caviumnetworks.com: fix net/core/ethtool.c] Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: David Decotigny <decot@googlers.com>, Cc: David S. Miller <davem@davemloft.net>, Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-15perf: dsu: Use signed field for dsu_pmu->num_countersSuzuki K Poulose1-1/+1
We set dsu_pmu->num_counters to -1, when the DSU is allocated but not initialised when none of the CPUs are active in the DSU. However, we use an unsigned field for num_counters. Switch this to a signed field. Fixes: 7520fa99246d ("perf: ARM DynamIQ Shared Unit PMU support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-12Merge branch 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linuxCatalin Marinas4-12/+856
Support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU). * 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux: perf: ARM DynamIQ Shared Unit PMU support dt-bindings: Document devicetree binding for ARM DSU PMU arm_pmu: Use of_cpu_node_to_id helper arm64: Use of_cpu_node_to_id helper for CPU topology parsing irqchip: gic-v3: Use of_cpu_node_to_id helper coresight: of: Use of_cpu_node_to_id helper of: Add helper for mapping device node to logical CPU number perf: Export perf_event_update_userpage
2018-01-02perf: ARM DynamIQ Shared Unit PMU supportSuzuki K Poulose3-0/+853
Add support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU). The DSU integrates one or more cores with an L3 memory system, control logic, and external interfaces to form a multicore cluster. The PMU allows counting the various events related to L3, SCU etc, along with providing a cycle counter. The PMU can be accessed via system registers, which are common to the cores in the same cluster. The PMU registers follow the semantics of the ARMv8 PMU, mostly, with the exception that the counters record the cluster wide events. This driver is mostly based on the ARMv8 and CCI PMU drivers. The driver only supports ARM64 at the moment. It can be extended to support ARM32 by providing register accessors like we do in arch/arm64/include/arm_dsu_pmu.h. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-01-02arm_pmu: Use of_cpu_node_to_id helperSuzuki K Poulose1-12/+3
Use the new generic helper, of_cpu_node_to_id(), to map a a phandle to the logical CPU number while parsing the PMU irq affinity. Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11perf: arm_spe: Fail device probe when arm64_kernel_unmapped_at_el0()Will Deacon1-0/+9
When running with the kernel unmapped whilst at EL0, the virtually-addressed SPE buffer is also unmapped, which can lead to buffer faults if userspace profiling is enabled and potentially also when writing back kernel samples unless an expensive drain operation is performed on exception return. For now, fail the SPE driver probe when arm64_kernel_unmapped_at_el0(). Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-15Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds13-10/+3276
Pull arm64 updates from Will Deacon: "The big highlight is support for the Scalable Vector Extension (SVE) which required extensive ABI work to ensure we don't break existing applications by blowing away their signal stack with the rather large new vector context (<= 2 kbit per vector register). There's further work to be done optimising things like exception return, but the ABI is solid now. Much of the line count comes from some new PMU drivers we have, but they're pretty self-contained and I suspect we'll have more of them in future. Plenty of acronym soup here: - initial support for the Scalable Vector Extension (SVE) - improved handling for SError interrupts (required to handle RAS events) - enable GCC support for 128-bit integer types - remove kernel text addresses from backtraces and register dumps - use of WFE to implement long delay()s - ACPI IORT updates from Lorenzo Pieralisi - perf PMU driver for the Statistical Profiling Extension (SPE) - perf PMU driver for Hisilicon's system PMUs - misc cleanups and non-critical fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits) arm64: Make ARMV8_DEPRECATED depend on SYSCTL arm64: Implement __lshrti3 library function arm64: support __int128 on gcc 5+ arm64/sve: Add documentation arm64/sve: Detect SVE and activate runtime support arm64/sve: KVM: Hide SVE from CPU features exposed to guests arm64/sve: KVM: Treat guest SVE use as undefined instruction execution arm64/sve: KVM: Prevent guests from using SVE arm64/sve: Add sysctl to set the default vector length for new processes arm64/sve: Add prctl controls for userspace vector length management arm64/sve: ptrace and ELF coredump support arm64/sve: Preserve SVE registers around EFI runtime service calls arm64/sve: Preserve SVE registers around kernel-mode NEON use arm64/sve: Probe SVE capabilities and usable vector lengths arm64: cpufeature: Move sys_caps_initialised declarations arm64/sve: Backend logic for setting the vector length arm64/sve: Signal handling support arm64/sve: Support vector length resetting for new processes arm64/sve: Core task context handling arm64/sve: Low-level CPU setup ...
2017-11-03perf: arm_spe: Prevent module unload while the PMU is in useSuzuki K Poulose1-0/+1
When the PMU driver is built as a module, the perf expects the pmu->module to be valid, so that the driver is prevented from being unloaded while it is in use. Fix the SPE pmu driver to fill in this field. Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman2-0/+2
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-24Merge branch 'for-next/perf' into aarch64/for-next/coreWill Deacon9-0/+3214
Merge in ARM PMU and perf updates for 4.15: - Support for the Statistical Profiling Extension - Support for Hisilicon's SoC PMU Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-24arm/arm64: pmu: Distinguish percpu irq and percpu_devid irqJulien Thierry2-7/+7
arm_pmu interrupts are maked as PERCPU even when these are not local physical interrupts to a single CPU. When using non-local interrupts, interrupts marked as PERCPU will not get freed not disabled properly by the PMU driver. Check if interrupts are local to a single CPU with PERCPU_DEVID since this is what the PMU driver really needs to know. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19perf: hisi: Add support for HiSilicon SoC DDRC PMU driverShaokun Zhang2-1/+464
This patch adds support for DDRC PMU driver in HiSilicon SoC chip, Each DDRC has own control, counter and interrupt registers and is an separate PMU. For each DDRC PMU, it has 8-fixed-purpose counters which have been mapped to 8-events by hardware, it assumes that counter index is equal to event code (0 - 7) in DDRC PMU driver. Interrupt is supported to handle counter (32-bits) overflow. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Anurup M <anurup.m@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19perf: hisi: Add support for HiSilicon SoC HHA PMU driverShaokun Zhang2-1/+474
L3 cache coherence is maintained by Hydra Home Agent (HHA) in HiSilicon SoC. This patch adds support for HHA PMU driver, Each HHA has own control, counter and interrupt registers and is an separate PMU. For each HHA PMU, it has 16-programable counters and each counter is free-running. Interrupt is supported to handle counter (48-bits) overflow. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Anurup M <anurup.m@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19perf: hisi: Add support for HiSilicon SoC L3C PMU driverShaokun Zhang2-1/+464
This patch adds support for L3C PMU driver in HiSilicon SoC chip, Each L3C has own control, counter and interrupt registers and is an separate PMU. For each L3C PMU, it has 8-programable counters and each counter is free-running. Interrupt is supported to handle counter (48-bits) overflow. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Anurup M <anurup.m@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19perf: hisi: Add support for HiSilicon SoC uncore PMU driverShaokun Zhang5-0/+558
This patch adds support HiSilicon SoC uncore PMU driver framework and interfaces. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Anurup M <anurup.m@huawei.com> [will: Fix leader accounting in uncore group validation] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18drivers/perf: Add support for ARMv8.2 Statistical Profiling ExtensionWill Deacon3-0/+1257
The ARMv8.2 architecture introduces the optional Statistical Profiling Extension (SPE). SPE can be used to profile a population of operations in the CPU pipeline after instruction decode. These are either architected instructions (i.e. a dynamic instruction trace) or CPU-specific uops and the choice is fixed statically in the hardware and advertised to userspace via caps/. Sampling is controlled using a sampling interval, similar to a regular PMU counter, but also with an optional random perturbation to avoid falling into patterns where you continuously profile the same instruction in a hot loop. After each operation is decoded, the interval counter is decremented. When it hits zero, an operation is chosen for profiling and tracked within the pipeline until it retires. Along the way, information such as TLB lookups, cache misses, time spent to issue etc is captured in the form of a sample. The sample is then filtered according to certain criteria (e.g. load latency) that can be specified in the event config (described under format/) and, if the sample satisfies the filter, it is written out to memory as a record, otherwise it is discarded. Only one operation can be sampled at a time. The in-memory buffer is linear and virtually addressed, raising an interrupt when it fills up. The PMU driver handles these interrupts to give the appearance of a ring buffer, as expected by the AUX code. The in-memory trace-like format is self-describing (though not parseable in reverse) and written as a series of records, with each record corresponding to a sample and consisting of a sequence of packets. These packets are defined by the architecture, although some have CPU-specific fields for recording information specific to the microarchitecture. As a simple example, a record generated for a branch instruction may consist of the following packets: 0 (Address) : Virtual PC of the branch instruction 1 (Type) : Conditional direct branch 2 (Counter) : Number of cycles taken from Dispatch to Issue 3 (Address) : Virtual branch target + condition flags 4 (Counter) : Number of cycles taken from Dispatch to Complete 5 (Events) : Mispredicted as not-taken 6 (END) : End of record It is also possible to toggle properties such as timestamp packets in each record. This patch adds support for SPE in the form of a new perf driver. Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-04drivers/perf: arm_pmu_acpi: drop redundant acpi_disabled checkShaokun Zhang1-3/+0
acpi_disabled has been checked in armv8_pmu_driver_init and it shall be ZERO in arm_pmu_acpi_probe, clean up this unnecessary check. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-02perf: qcom_l2_pmu: add event namesNeil Leeder1-0/+54
Add event names so that common events can be specified symbolically, for example: l2cache_0/total-reads/,l2cache_0/cycles/ Event names are displayed in 'perf list'. Signed-off-by: Neil Leeder <nleeder@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-09-22drivers/perf: arm_pmu_acpi: Release memory obtained by kasprintfArvind Yadav1-0/+1
Free memory region, if arm_pmu_acpi_probe is not successful. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-08arm64: perf: Allow standard PMUv3 events to be extended by the CPU typeWill Deacon1-0/+6
Rather than continue adding CPU-specific event maps, instead look up by default in the PMUv3 event map and only fallback to the CPU-specific maps if either the event isn't described by PMUv3, or it is described but the PMCEID registers say that it is unsupported by the current CPU. Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-08perf: xgene: Remove unnecessary managed resources cleanupTai Nguyen1-52/+22
Managed resources in the driver should be automatically cleaned up on driver detach. It's unnecessary to manually free/unmmap these resources. One of the manual cleanup causes static checkers to complain. The bug is reported by Dan Carpenter <dan.carpenter@oracle.com> in [1] [1] https://www.spinics.net/lists/arm-kernel/msg593012.html This patch gets rid of all the unnecessary manual cleanup and properly unregister all the registered PMU devices by the driver on driver detach. Signed-off-by: Tai Nguyen <ttnguyen@apm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-27drivers/perf: arm_pmu: Request PMU SPIs with IRQF_PER_CPUWill Deacon1-14/+27
Since the PMU register interface is banked per CPU, CPU PMU interrrupts cannot be handled by a CPU other than the one with the PMU asserting the interrupt. This means that migrating PMU SPIs, as we do during a CPU hotplug operation doesn't make any sense and can lead to the IRQ being disabled entirely if we route a spurious IRQ to the new affinity target. This has been observed in practice on AMD Seattle, where CPUs on the non-boot cluster appear to take a spurious PMU IRQ when coming online, which is routed to CPU0 where it cannot be handled. This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their affinity prior to requesting them, ensuring that they cannot be migrated during hotplug events. This interacts badly with the DB8500 erratum workaround that ping-pongs the interrupt affinity from the handler, so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags to be overridden in the platdata. Fixes: 3cf7ee98b848 ("drivers/perf: arm_pmu: move irq request/free into probe") Cc: Mark Rutland <mark.rutland@arm.com> Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-26perf: qcom_l2: fix column exclusion checkNeil Leeder1-0/+2
The check for column exclusion did not verify that the event being checked was an L2 event, and not a software event. Software events should not be checked for column exclusion. This resulted in a group with both software and L2 events sometimes incorrectly rejecting the L2 event for column exclusion and not counting it. Add a check for PMU type before applying column exclusion logic. Fixes: 21bdbb7102edeaeb ("perf: add qcom l2 cache perf events driver") Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Neil Leeder <nleeder@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-20perf: Convert to using %pOF instead of full_nameRob Herring1-5/+4
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-26Merge branch 'aarch64/for-next/ras-apei' into aarch64/for-next/coreWill Deacon1-0/+11
Merge in arm64 ACPI RAS support (APEI/GHES) from Tyler Baicar.
2017-06-22perf: xgene: Add support for SoC PMU version 3Hoan Tran1-29/+534
This patch adds support for SoC-wide (AKA uncore) Performance Monitoring Unit version 3. It can support up to - 2 IOB PMU instances - 8 L3C PMU instances - 2 MCB PMU instances - 8 MCU PMU instances and these PMUs support 64 bit counter Signed-off-by: Hoan Tran <hotran@apm.com> [Mark: stop counters in _xgene_pmu_isr()] Signed-off-by: Mark Rutland <mark.rutland@arm.com> [will: make xgene_pmu_v3_ops static] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-22perf: xgene: Move PMU leaf functions into function pointer structureHoan Tran1-18/+65
This patch moves PMU leaf functions into a function pointer structure. It helps code maintain and expasion easier. Signed-off-by: Hoan Tran <hotran@apm.com> [Mark: remove redundant cast] Signed-off-by: Mark Rutland <mark.rutland@arm.com> [will: make xgene_pmu_ops static] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-22perf: xgene: Parse PMU subnode from the match tableHoan Tran1-10/+30
This patch parses PMU Subnode from a match table. Signed-off-by: Hoan Tran <hotran@apm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15drivers/perf: commonise PERF_EVENTS dependencyMark Rutland1-4/+5
All PMU drivers are going to depend on PERF_EVENTS, so let's make this dependency common and simplify the individual Kconfig entries. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30drivers/perf: arm_pmu_acpi: avoid perf IRQ init when guest PMU is offWei Huang1-0/+11
We saw perf IRQ init failures when running Linux kernel in an ACPI guest without PMU (i.e. pmu=off). This is because perf IRQ is not present when pmu=off, but arm_pmu_acpi still tries to register or unregister GSI. This patch addresses the problem by checking gicc->performance_interrupt. If it is 0, which is the value set by qemu when pmu=off, we skip the IRQ register/unregister process. [ 4.069470] bc00: 0000000000040b00 ffff0000089db190 [ 4.070267] [<ffff000008134f80>] enable_percpu_irq+0xdc/0xe4 [ 4.071192] [<ffff000008667cc4>] arm_perf_starting_cpu+0x108/0x10c [ 4.072200] [<ffff0000080cbdd4>] cpuhp_invoke_callback+0x14c/0x4ac [ 4.073210] [<ffff0000080ccd3c>] cpuhp_thread_fun+0xd4/0x11c [ 4.074132] [<ffff0000080f1394>] smpboot_thread_fn+0x1b4/0x1c4 [ 4.075081] [<ffff0000080ec90c>] kthread+0x10c/0x138 [ 4.075921] [<ffff0000080833c0>] ret_from_fork+0x10/0x50 [ 4.076947] genirq: Setting trigger mode 4 for irq 43 failed (gic_set_type+0x0/0x74) Signed-off-by: Wei Huang <wei@redhat.com> [will: add comment justifying deviation from ACPI spec, removed redundant hunk] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>