aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/perf (follow)
AgeCommit message (Collapse)AuthorFilesLines
2021-03-30perf/arm_pmu_platform: Clean up with dev_printkRobin Murphy1-25/+22
Nearly all of the messages we can log from the platform device code relate to the specific PMU device and the properties we're parsing from its DT node. In some cases we use %pOF to point at where something was wrong, but even that is inconsistent. Let's convert these logs to the appropriate dev_printk variants, so that every issue specific to the device and/or its DT description is clearly and instantly attributable, particularly if there is more than one PMU node present in the DT. The local refactoring in a couple of functions invites some extra cleanup in the process - the init_fn matching can be streamlined, and the PMU registration failure message moved to the appropriate place and log level. CC: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/10a4aacdf071d0c03d061c408a5899e5b32cc0a6.1616774562.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-30perf/arm_pmu_platform: Fix error handlingRobin Murphy1-1/+1
If we're aborting after failing to register the PMU device, we probably don't want to leak the IRQs that we've claimed. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/53031a607fc8412a60024bfb3bb8cd7141f998f5.1616774562.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-30perf/arm_pmu_platform: Use dev_err_probe() for IRQ errorsRobin Murphy1-4/+3
By virtue of using platform_irq_get_optional() under the covers, platform_irq_count() needs the target interrupt controller to be available and may return -EPROBE_DEFER if it isn't. Let's use dev_err_probe() to avoid a spurious error log (and help debug any deferral issues) in that case. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/073d5e0d3ed1f040592cb47ca6fe3759f40cc7d1.1616774562.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Add support for HiSilicon PA PMU driverShaokun Zhang2-1/+502
On HiSilicon Hip09 platform, there is a PA (Protocol Adapter) module on each chip SICL (Super I/O Cluster) which incorporates three Hydra interface and facilitates the cache coherency between the dies on the chip. While PA uncore PMU model is the same as other Hip09 PMU modules and many PMU events are supported. Let's support the PMU driver using the HiSilicon uncore PMU framework. PA PMU supports the following filter functions: * tracetag_en: allows user to count events according to tt_req or tt_core set in L3C PMU. It's the same as other PMUs. * srcid_cmd & srcid_msk: allows user to filter statistics that come from specific CCL/ICL by configuration source ID. * tgtid_cmd & tgtid_msk: it is the similar function to srcid_cmd & srcid_msk. Both are used to check where the data comes from or go to. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-9-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Add support for HiSilicon SLLC PMU driverShaokun Zhang2-1/+531
HiSilicon's Hip09 is comprised by multi-dies that can be connected by SLLC module (Skyros Link Layer Controller), its has separate PMU registers which the driver can program it freely and interrupt is supported to handle counter overflow. Let's support its driver under the framework of HiSilicon uncore PMU driver. SLLC PMU supports the following filter functions: * tracetag_en: allows user to count data according to tt_req or tt_core set in L3C PMU. * srcid_cmd & srcid_msk: allows user to filter statistics that come from specific CCL/ICL by configuration source ID. * tgtid_hi & tgtid_lo: it also supports event statistics that these operations will go to the CCL/ICL by configuration target ID or target ID range. It's the same as source ID with 11-bit width in the SoC. More introduction is added in documentation: Documentation/admin-guide/perf/hisi-pmu.rst Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-8-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Update DDRC PMU for programmable counterShaokun Zhang2-13/+196
DDRC PMU's events are useful for performance profiling, but the events are limited and counter is fixed. On HiSilicon Hip09 platform, PMU counters are the programmable and more events are supported. Let's add the DDRC PMU v2 driver. Bandwidth events are exposed directly in driver and some more events will listed in JSON file later. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-7-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Add new functions for HHA PMUShaokun Zhang1-15/+188
On HiSilicon Hip09 platform, some new functions are also supported on HHA PMU. * tracetag_en: it is the abbreviation of tracetag enable and allows user to count events according to tt_req or tt_core set in L3C PMU. * datasrc_skt: it is the abbreviation of data source from another socket and it is used in the multi-chips. It's the same as L3C PMU. * srcid_cmd & srcid_msk: pair of the fields are used to filter statistics that come from the specific CCL/ICL by the configuration. These are the abbreviation of source ID command and mask. The source ID is 11-bit and detailed descriptions are documented in Documentation/admin-guide/perf/hisi-pmu.rst. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-6-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Add new functions for L3C PMUShaokun Zhang3-20/+258
On HiSilicon Hip09 platform, some new functions are enhanced on L3C PMU: * tt_req: it is the abbreviation of tracetag request and allows user to count only read/write/atomic operations. tt_req is 3-bit and details are listed in the hisi-pmu document. $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5 * tt_core: it is the abbreviation of tracetag core and allows user to filter by core/thread within the cluster, it is a 8-bit bitmap that each bit represents the corresponding core/thread in this L3C. $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0xf/ sleep 5 * datasrc_cfg: it is the abbreviation of data source configuration and allows user to check where the data comes from, such as: from local DDR, cross-die DDR or cross-socket DDR. Its is 5-bit and represents different data source in the SoC. $# perf stat -a -e hisi_sccl3_l3c0/dat_access,datasrc_cfg=0xe/ sleep 5 * datasrc_skt: it is the abbreviation of data source from another socket and is used in the multi-chips, if user wants to check the cross-socket datat source, it shall be added in perf command. Only one bit is used to control this. $# perf stat -a -e hisi_sccl3_l3c0/dat_access,datasrc_cfg=0x10,datasrc_skt=1/ sleep 5 Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-5-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Add PMU version for uncore PMU drivers.Shaokun Zhang3-71/+75
For HiSilicon uncore PMU, more versions are supported and some variables shall be added suffix to distinguish the version which are prepared for the new drivers. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-4-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Refactor code for more uncore PMUsShaokun Zhang5-157/+80
On HiSilicon uncore PMU drivers, interrupt handling function and interrupt registration function are very similar in differents PMU modules. Let's refactor the frame. Two new callbacks are added for the HW accessors: * hisi_uncore_ops::get_int_status returns a bitmap of events which have overflowed and raised an interrupt * hisi_uncore_ops::clear_int_status clears the overflow status for a specific event These callback functions are used by a common IRQ handler, hisi_uncore_pmu_isr(). One more function hisi_uncore_pmu_init_irq() is added to replace each PMU initialization IRQ interface and simplify the code. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-3-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: hisi: Remove unnecessary check of counter indexShaokun Zhang5-61/+7
The sanity check for counter index has been done in the function hisi_uncore_pmu_get_event_idx, so remove the redundant interface hisi_uncore_pmu_counter_valid() and sanity check. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1615186237-22263-2-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: Simplify the SMMUv3 PMU event attributesQi Liu1-19/+13
For each PMU event, there is a SMMU_EVENT_ATTR(xx, XX) and &smmu_event_attr_xx.attr.attr. Let's redefine the SMMU_EVENT_ATTR to simplify the smmu_pmu_events. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1612789498-12957-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: convert sysfs sprintf family to sysfs_emitQi Liu8-14/+14
sprintf does not know the PAGE_SIZE maximum of the temporary buffer used for sysfs content and it's possible to overrun the buffer length. Use sysfs_emit() function to ensures that no overrun is done. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1616148273-16374-4-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: convert sysfs scnprintf family to sysfs_emit_at() and sysfs_emit()Qi Liu1-16/+11
Use the generic sysfs_emit_at() and sysfs_emit() function to take place of scnprintf() Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1616148273-16374-3-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25drivers/perf: convert sysfs snprintf family to sysfs_emitZihao Tang8-28/+25
Fix the following coccicheck warning: ./drivers/perf/hisilicon/hisi_uncore_pmu.c:128:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/fsl_imx8_ddr_perf.c:173:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm_spe_pmu.c:129:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm_smmu_pmu.c:563:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm_dsu_pmu.c:149:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm_dsu_pmu.c:139:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-cmn.c:563:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-cmn.c:351:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-ccn.c:224:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-cci.c:708:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-cci.c:699:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-cci.c:528:8-16: WARNING: use scnprintf or sprintf. ./drivers/perf/arm-cci.c:309:8-16: WARNING: use scnprintf or sprintf. Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1616148273-16374-2-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-12perf/arm_dmc620_pmu: Fix error return code in dmc620_pmu_device_probe()Wei Yongjun1-0/+1
Fix to return negative error code -ENOMEM from the error handling case instead of 0, as done elsewhere in this function. Fixes: 53c218da220c ("driver/perf: Add PMU driver for the ARM DMC-620 memory controller") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20210312080421.277562-1-weiyongjun1@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-22Merge tag 'iommu-updates-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds1-1/+1
Pull iommu updates from Joerg Roedel: - ARM SMMU and Mediatek updates from Will Deacon: - Support for MT8192 IOMMU from Mediatek - Arm v7s io-pgtable extensions for MT8192 - Removal of TLBI_ON_MAP quirk - New Qualcomm compatible strings - Allow SVA without hardware broadcast TLB maintenance on SMMUv3 - Virtualization Host Extension support for SMMUv3 (SVA) - Allow SMMUv3 PMU perf driver to be built independently from IOMMU - Some tidy-up in IOVA and core code - Conversion of the AMD IOMMU code to use the generic IO-page-table framework - Intel VT-d updates from Lu Baolu: - Audit capability consistency among different IOMMUs - Add SATC reporting structure support - Add iotlb_sync_map callback support - SDHI support for Renesas IOMMU driver - Misc cleanups and other small improvments * tag 'iommu-updates-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (94 commits) iommu/amd: Fix performance counter initialization MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER iommu/mediatek: Fix error code in probe() iommu/mediatek: Fix unsigned domid comparison with less than zero iommu/vt-d: Parse SATC reporting structure iommu/vt-d: Add new enum value and structure for SATC iommu/vt-d: Add iotlb_sync_map callback iommu/vt-d: Move capability check code to cap_audit files iommu/vt-d: Audit IOMMU Capabilities and add helper functions iommu/vt-d: Fix 'physical' typos iommu: Properly pass gfp_t in _iommu_map() to avoid atomic sleeping iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration] driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3 MAINTAINERS: Add entry for MediaTek IOMMU iommu/mediatek: Add mt8192 support iommu/mediatek: Remove unnecessary check in attach_device iommu/mediatek: Support master use iova over 32bit iommu/mediatek: Add iova reserved function iommu/mediatek: Support for multi domains iommu/mediatek: Add get_domain_id from dev->dma_range_map ...
2021-02-10drivers/perf: Replace spin_lock_irqsave to spin_lockQi Liu2-6/+4
There is no need to do spin_lock_irqsave in context of hard IRQ, so replace them with spin_lock. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1612863742-1551-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-02drivers/perf: Prevent forced unbinding of ARM_DMC620_PMU driversQi Liu1-0/+1
Set "suppress_bind_attrs" to true, so that bind/unbind can be disabled via sysfs and prevent unbinding ARM_DMC620_PMU drivers during perf sampling. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1612252686-50329-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-02Merge tag 'arm-smmu-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmuJoerg Roedel1-1/+1
Arm SMMU updates for 5.12 - Support for MT8192 IOMMU from Mediatek - Arm v7s io-pgtable extensions for MT8192 - Removal of TLBI_ON_MAP quirk - New Qualcomm compatible strings - Allow SVA without hardware broadcast TLB maintenance on SMMUv3 - Virtualization Host Extension support for SMMUv3 (SVA) - Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU - Misc cleanups
2021-02-01driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3John Garry1-1/+1
The ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3_PMU was added with the idea that a SMMUv3 PMCG would only exist on a system with an associated SMMUv3. However it is not the job of Kconfig to make these sorts of decisions (even if it were true), so remove the dependency. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/1612175042-56866-1-git-send-email-john.garry@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-28perf/arm-cmn: Move IRQs when migrating contextRobin Murphy1-1/+3
If we migrate the PMU context to another CPU, we need to remember to retarget the IRQs as well. Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e080640aea4ed8dfa870b8549dfb31221803eb6b.1611839564.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-28perf/arm-cmn: Fix PMU instance namingRobin Murphy1-9/+4
Although it's neat to avoid the suffix for the typical case of a single PMU, it means systems with multiple CMN instances end up with inconsistent naming. I think it also breaks perf tool's "uncore alias" logic if the common instance prefix is also the full name of one. Avoid any surprises by not trying to be clever and simply numbering every instance, even when it might technically prove redundant. Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/649a2281233f193d59240b13ed91b57337c77b32.1611839564.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20perf: Constify static struct attribute_groupRikard Falkeborn6-12/+12
The only usage is to put their addresses in an array of pointers to const struct attribute group. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20210117212847.21319-5-rikard.falkeborn@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20perf: hisi: Constify static struct attribute_groupRikard Falkeborn3-3/+3
The only usage is to put their addresses in an array of pointers to const struct attribute group. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20210117212847.21319-4-rikard.falkeborn@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20perf/imx_ddr: Constify static struct attribute_groupRikard Falkeborn1-5/+5
The only usage is to put their addresses in an array of pointers to const struct attribute group. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20210117212847.21319-3-rikard.falkeborn@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20perf: qcom: Constify static struct attribute_groupRikard Falkeborn2-6/+6
The only usage is to put their addresses in an array of pointers to const struct attribute group. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20210117212847.21319-2-rikard.falkeborn@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20drivers/perf: Add support for ARMv8.3-SPEWei Li1-2/+15
Armv8.3 extends the SPE by adding: - Alignment field in the Events packet, and filtering on this event using PMSEVFR_EL1. - Support for the Scalable Vector Extension (SVE). The main additions for SVE are: - Recording the vector length for SVE operations in the Operation Type packet. It is not possible to filter on vector length. - Incomplete predicate and empty predicate fields in the Events packet, and filtering on these events using PMSEVFR_EL1. Update the check of pmsevfr for empty/partial predicated SVE and alignment event in SPE driver. Signed-off-by: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20201203141609.14148-1-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-13Revert "arm64: Enable perf events based hard lockup detector"Will Deacon1-5/+0
This reverts commit 367c820ef08082e68df8a3bc12e62393af21e4b5. lockup_detector_init() makes heavy use of per-cpu variables and must be called with preemption disabled. Usually, it's handled early during boot in kernel_init_freeable(), before SMP has been initialised. Since we do not know whether or not our PMU interrupt can be signalled as an NMI until considerably later in the boot process, the Arm PMU driver attempts to re-initialise the lockup detector off the back of a device_initcall(). Unfortunately, this is called from preemptible context and results in the following splat: | BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 | caller is debug_smp_processor_id+0x20/0x2c | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x3c0 | show_stack+0x20/0x6c | dump_stack+0x2f0/0x42c | check_preemption_disabled+0x1cc/0x1dc | debug_smp_processor_id+0x20/0x2c | hardlockup_detector_event_create+0x34/0x18c | hardlockup_detector_perf_init+0x2c/0x134 | watchdog_nmi_probe+0x18/0x24 | lockup_detector_init+0x44/0xa8 | armv8_pmu_driver_init+0x54/0x78 | do_one_initcall+0x184/0x43c | kernel_init_freeable+0x368/0x380 | kernel_init+0x1c/0x1cc | ret_from_fork+0x10/0x30 Rather than bodge this with raw_smp_processor_id() or randomly disabling preemption, simply revert the culprit for now until we figure out how to do this properly. Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com> Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Sumit Garg <sumit.garg@linaro.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-09perf/imx_ddr: Add system PMU identifier for userspaceJoakim Zhang1-0/+55
The DDR Perf for i.MX8 is a system PMU whose AXI ID would different from SoC to SoC. Need expose system PMU identifier for userspace which refer to /sys/bus/event_source/devices/<PMU DEVICE>/identifier. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20201130114202.26057-3-qiangqing.zhang@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25arm64: Enable perf events based hard lockup detectorSumit Garg1-0/+5
With the recent feature added to enable perf events to use pseudo NMIs as interrupts on platforms which support GICv3 or later, its now been possible to enable hard lockup detector (or NMI watchdog) on arm64 platforms. So enable corresponding support. One thing to note here is that normally lockup detector is initialized just after the early initcalls but PMU on arm64 comes up much later as device_initcall(). So we need to re-initialize lockup detection once PMU has been initialized. Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Acked-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25perf/imx_ddr: Add stop event counters support for i.MX8MPJoakim Zhang1-26/+54
DDR Perf driver only supports free-running event counters(counter1/2/3) now, this patch adds support for stop event counters. Legacy SoCs: Cycle counter(counter0) is a special counter, only count cycles. When cycle counter overflow, it will lock all counters and generate an interrupt. In ddr_perf_irq_handler, disable cycle counter then all counters would stop at the same time, update all counters' count, then enable cycle counter that all counters count again. During this process, only clear cycle counter, no need to clear event counters since they are free-running counters. They would continue counting after overflow and do/while loop from ddr_perf_event_update can handle event counters overflow case. i.MX8MP: Almost all is the same as legacy SoCs, the only difference is that, event counters are not free-running any more. Like cycle counter, when event counters overflow, they would stop counting unless clear the counter, and no interrupt generate for event counters. So we should clear event counters that let them re-count when cycle counter overflow, which ensure event counters will not lose data. This patch adds stop event counters support which would be compatible to free-running event counters. We use the cycle counter to stop overflow of the event counters. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20201027104451.15434-1-qiangqing.zhang@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25perf/smmuv3: Support sysfs identifier fileJohn Garry1-0/+39
SMMU_PMCG_IIDR was added in the SMMUv3.3 spec. For the perf tool to know the specific HW implementation, expose the PMCG_IIDR contents only when set. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1602149181-237415-5-git-send-email-john.garry@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25drivers/perf: hisi: Add identifier sysfs fileJohn Garry5-0/+65
To allow userspace to identify the specific implementation of the device, add an "identifier" sysfs file. Encoding is as follows (same for all uncore drivers): hi1620: 0x0 hi1630: 0x30 Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1602149181-237415-2-git-send-email-john.garry@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25perf: remove duplicate check on fwnodeWang Qing1-3/+0
fwnode is checked IS_ERR_OR_NULL in following check by is_of_node() or is_acpi_device_node(), remove duplicate check. Signed-off-by: Wang Qing <wangqing@vivo.com> Link: https://lore.kernel.org/r/1604644902-29655-1-git-send-email-wangqing@vivo.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25driver/perf: Add PMU driver for the ARM DMC-620 memory controllerTuan Phan3-0/+756
DMC-620 PMU supports total 10 counters which each is independently programmable to different events and can be started and stopped individually. Currently, it only supports ACPI. Other platforms feel free to test and add support for device tree. Usage example: #perf stat -e arm_dmc620_10008c000/clk_cycle_count/ -C 0 Get perf event for clk_cycle_count counter. #perf stat -e arm_dmc620_10008c000/clkdiv2_allocate,mask=0x1f,match=0x2f, incr=2,invert=1/ -C 0 The above example shows how to specify mask, match, incr, invert parameters for clkdiv2_allocate event. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com> Link: https://lore.kernel.org/r/1604518246-6198-1-git-send-email-tuanphan@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-01perf: arm-cmn: Fix conversion specifiers for node typeWill Deacon1-2/+2
The node type field is an enum type, so print it as a 32-bit quantity rather than as an unsigned short. Link: https://lore.kernel.org/r/202009302350.QIzfkx62-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-10-01perf: arm-cmn: Fix unsigned comparison to less than zeroWill Deacon1-1/+1
Ensure that the 'irq' field of 'struct arm_cmn_dtc' is a signed int so that it can be compared '< 0'. Link: https://lore.kernel.org/r/20200929170835.GA15956@embeddedor Addresses-Coverity-ID: 1497488 ("Unsigned compared against 0") Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm_pmu: arm64: Use NMIs for PMUJulien Thierry1-8/+63
Add required PMU interrupt operations for NMIs. Request interrupt lines as NMIs when possible, otherwise fall back to normal interrupts. NMIs are only supported on the arm64 architecture with a GICv3 irqchip. [Alexandru E.: Added that NMIs only work on arm64 + GICv3, print message when PMU is using NMIs] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-8-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm_pmu: Introduce pmu_irq_opsJulien Thierry1-16/+74
Currently the PMU interrupt can either be a normal irq or a percpu irq. Supporting NMI will introduce two cases for each existing one. It becomes a mess of 'if's when managing the interrupt. Define sets of callbacks for operations commonly done on the interrupt. The appropriate set of callbacks is selected at interrupt request time and simplifies interrupt enabling/disabling and freeing. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-7-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28perf: Add Arm CMN-600 PMU driverRobin Murphy3-0/+1649
Initial driver for PMU event counting on the Arm CMN-600 interconnect. CMN sports an obnoxiously complex distributed PMU system as part of its debug and trace features, which can do all manner of things like sampling, cross-triggering and generating CoreSight trace. This driver covers the PMU functionality, plus the relevant aspects of watchpoints for simply counting matching flits. Tested-by: Tsahi Zidenberg <tsahee@amazon.com> Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18drivers/perf: thunderx2_pmu: Fix memory resource error handlingMark Salter1-2/+5
In tx2_uncore_pmu_init_dev(), a call to acpi_dev_get_resources() is used to create a list _CRS resources which is searched for the device base address. There is an error check following this: if (!rentry->res) return NULL In no case, will rentry->res be NULL, so the test is useless. Even if the test worked, it comes before the resource list memory is freed. None of this really matters as long as the ACPI table has the memory resource. Let's clean it up so that it makes sense and will give a meaningful error should firmware leave out the memory resource. Fixes: 69c32972d593 ("drivers/perf: Add Cavium ThunderX2 SoC UNCORE PMU driver") Signed-off-by: Mark Salter <msalter@redhat.com> Link: https://lore.kernel.org/r/20200915204110.326138-2-msalter@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18drivers/perf: xgene_pmu: Fix uninitialized resource structMark Salter1-15/+17
This splat was reported on newer Fedora kernels booting on certain X-gene based machines: xgene-pmu APMC0D83:00: X-Gene PMU version 3 Unable to handle kernel read from unreadable memory at virtual \ address 0000000000004006 ... Call trace: string+0x50/0x100 vsnprintf+0x160/0x750 devm_kvasprintf+0x5c/0xb4 devm_kasprintf+0x54/0x60 __devm_ioremap_resource+0xdc/0x1a0 devm_ioremap_resource+0x14/0x20 acpi_get_pmu_hw_inf.isra.0+0x84/0x15c acpi_pmu_dev_add+0xbc/0x21c acpi_ns_walk_namespace+0x16c/0x1e4 acpi_walk_namespace+0xb4/0xfc xgene_pmu_probe_pmu_dev+0x7c/0xe0 xgene_pmu_probe.part.0+0x2c0/0x310 xgene_pmu_probe+0x54/0x64 platform_drv_probe+0x60/0xb4 really_probe+0xe8/0x4a0 driver_probe_device+0xe4/0x100 device_driver_attach+0xcc/0xd4 __driver_attach+0xb0/0x17c bus_for_each_dev+0x6c/0xb0 driver_attach+0x30/0x40 bus_add_driver+0x154/0x250 driver_register+0x84/0x140 __platform_driver_register+0x54/0x60 xgene_pmu_driver_init+0x28/0x34 do_one_initcall+0x40/0x204 do_initcalls+0x104/0x144 kernel_init_freeable+0x198/0x210 kernel_init+0x20/0x12c ret_from_fork+0x10/0x18 Code: 91000400 110004e1 eb08009f 540000c0 (38646846) ---[ end trace f08c10566496a703 ]--- This is due to use of an uninitialized local resource struct in the xgene pmu driver. The thunderx2_pmu driver avoids this by using the resource list constructed by acpi_dev_get_resources() rather than using a callback from that function. The callback in the xgene driver didn't fully initialize the resource. So get rid of the callback and search the resource list as done by thunderx2. Fixes: 832c927d119b ("perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver") Signed-off-by: Mark Salter <msalter@redhat.com> Link: https://lore.kernel.org/r/20200915204110.326138-1-msalter@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-15perf: arm_dsu: Support DSU ACPI devicesTuan Phan1-6/+57
Add support for probing device from ACPI node. Each DSU ACPI node and its associated cpus are inside a cluster node. Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1600106656-9542-1-git-send-email-tuanphan@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07drivers/perf: hisi: Add missing include of linux/module.hShaokun Zhang1-0/+1
MODULE_*** is used in HiSilicon uncore PMU drivers and is provided by linux/module.h, but the header file is not directly included. Add the missing include. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1599186097-18599-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2-3/+3
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-07Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds2-66/+25
Pull clk updates from Stephen Boyd: "It looks like a smaller batch of clk updates this time around. In the core framework we just have some minor tweaks and a debugfs feature, so not much to see there. The driver updates are fairly well split between AT91 and Qualcomm clk support. Adding those two drivers together equals about 50% of the diffstat. Otherwise, the big amount of work this time was on supporting Broadcom's Raspberry Pi firmware clks. Highlights: Core: - Document clk_hw_round_rate() so it gets some more use - Remove unused __clk_get_flags() - Add a prepare/enable debugfs feature similar to rate setting New Drivers: - Add support for SAMA7G5 SoC clks - Enable CPU clks on Qualcomm IPQ6018 SoCs - Enable CPU clks on Qualcomm MSM8996 SoCs - GPU clk support for Qualcomm SM8150 and SM8250 SoCs - Audio clks on Qualcomm SC7180 SoCs - Microchip Sparx5 DPLL clk - Add support for the new Renesas RZ/G2H (R8A774E1) SoC Updates: - Make defines for bcm63xx-gate clks to use in DT - Support BCM2711 SoC firmware clks - Add HDMI clks for BCM2711 SoCs - Add RTC related clks on Ingenic SoCs - Support USB PHY clks on Ingenic SoCs - Support gate clks on BCM6318 SoCs - RMU and DMAC/GPIO clock support for Actions Semi S500 SoCs - Use poll_timeout functions in Rockchip clk driver - Support Rockchip rk3288w SoC variant - Mark mac_lbtest critical on Rockchip rk3188 - Add CAAM clock support for i.MX vf610 driver - Add MU root clock support for i.MX imx8mp driver - Amlogic g12: add neural network accelerator clock sources - Amlogic meson8: remove critical flag for main PLL divider - Amlogic meson8: add video decoder clock gates - Convert one more Renesas DT binding to json-schema - Enhance critical clock handling on Renesas platforms to only consider clocks that were enabled at boot time" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (79 commits) clk: qcom: gcc: Make disp gpll0 branch aon for sc7180/sdm845 ipq806x: gcc: add support for child probe clk: qcom: msm8996: Make symbol 'cpu_msm8996_clks' static clk: qcom: ipq8074: Add correct index for PCIe clocks clk: <linux/clk-provider.h>: drop a duplicated word clk: renesas: cpg-mssr: Add r8a774e1 support dt-bindings: clock: renesas,cpg-mssr: Document r8a774e1 clk: Drop duplicate selection in Kconfig clk: qcom: smd: Add support for MSM8992/4 rpm clocks clk: qcom: ipq8074: Add missing clocks for pcie dt-bindings: clock: qcom: ipq8074: Add missing bindings for PCIe Replace HTTP links with HTTPS ones: Common CLK framework clk: qcom: Add CPU clock driver for msm8996 dt-bindings: clk: qcom: Add bindings for CPU clock for msm8996 soc: qcom: Separate kryo l2 accessors from PMU driver clk: meson: meson8b: add the vclk2_en gate clock clk: meson: meson8b: add the vclk_en gate clock clk: qcom: Fix return value check in apss_ipq6018_probe() clk: bcm: dvp: Add missing module informations clk: meson: meson8b: Drop CLK_IS_CRITICAL from fclk_div2 ...
2020-08-03Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds1-2/+1
Pull arm64 and cross-arch updates from Catalin Marinas: "Here's a slightly wider-spread set of updates for 5.9. Going outside the usual arch/arm64/ area is the removal of read_barrier_depends() series from Will and the MSI/IOMMU ID translation series from Lorenzo. The notable arm64 updates include ARMv8.4 TLBI range operations and translation level hint, time namespace support, and perf. Summary: - Removal of the tremendously unpopular read_barrier_depends() barrier, which is a NOP on all architectures apart from Alpha, in favour of allowing architectures to override READ_ONCE() and do whatever dance they need to do to ensure address dependencies provide LOAD -> LOAD/STORE ordering. This work also offers a potential solution if compilers are shown to convert LOAD -> LOAD address dependencies into control dependencies (e.g. under LTO), as weakly ordered architectures will effectively be able to upgrade READ_ONCE() to smp_load_acquire(). The latter case is not used yet, but will be discussed further at LPC. - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter and apply the resulting changes to the device ID space provided by the Freescale FSL bus. - arm64 support for TLBI range operations and translation table level hints (part of the ARMv8.4 architecture version). - Time namespace support for arm64. - Export the virtual and physical address sizes in vmcoreinfo for makedumpfile and crash utilities. - CPU feature handling cleanups and checks for programmer errors (overlapping bit-fields). - ACPI updates for arm64: disallow AML accesses to EFI code regions and kernel memory. - perf updates for arm64. - Miscellaneous fixes and cleanups, most notably PLT counting optimisation for module loading, recordmcount fix to ignore relocations other than R_AARCH64_CALL26, CMA areas reserved for gigantic pages on 16K and 64K configurations. - Trivial typos, duplicate words" Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits) arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack arm64/mm: save memory access in check_and_switch_context() fast switch path arm64: sigcontext.h: delete duplicated word arm64: ptrace.h: delete duplicated word arm64: pgtable-hwdef.h: delete duplicated words bus: fsl-mc: Add ACPI support for fsl-mc bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver of/irq: Make of_msi_map_rid() PCI bus agnostic of/irq: make of_msi_map_get_device_domain() bus agnostic dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus of/device: Add input id to of_dma_configure() of/iommu: Make of_map_rid() PCI agnostic ACPI/IORT: Add an input ID to acpi_dma_configure() ACPI/IORT: Remove useless PCI bus walk ACPI/IORT: Make iort_msi_map_rid() PCI agnostic ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC arm64: enable time namespace support arm64/vdso: Restrict splitting VVAR VMA arm64/vdso: Handle faults on timens page ...
2020-07-17drivers/perf: Prevent forced unbinding of PMU driversQi Liu13-0/+13
Forcefully unbinding PMU drivers during perf sampling will lead to a kernel panic, because the perf upper-layer framework call a NULL pointer in this situation. To solve this issue, "suppress_bind_attrs" should be set to true, so that bind/unbind can be disabled via sysfs and prevent unbinding PMU drivers during perf sampling. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16drivers/perf: Fix kernel panic when rmmod PMU modules during perf samplingQi Liu5-0/+5
When users try to remove PMU modules during perf sampling, kernel panic will happen because the pmu->read() is a NULL pointer here. INFO on HiSilicon hip08 platform as follow: pc : hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu] lr : hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu] sp : ffff800010103e90 x29: ffff800010103e90 x28: ffff0027db0c0e40 x27: ffffa29a76f129d8 x26: ffffa29a77ceb000 x25: ffffa29a773a5000 x24: ffffa29a77392000 x23: ffffddffe5943f08 x22: ffff002784285960 x21: ffff002784285800 x20: ffff0027d2e76c80 x19: ffff0027842859e0 x18: ffff80003498bcc8 x17: ffffa29a76afe910 x16: ffffa29a7583f530 x15: 16151a1512061a1e x14: 0000000000000000 x13: ffffa29a76f1e238 x12: 0000000000000001 x11: 0000000000000400 x10: 00000000000009f0 x9 : ffff8000107b3e70 x8 : ffff0027db0c1890 x7 : ffffa29a773a7000 x6 : 00000007f5131013 x5 : 00000007f5131013 x4 : 09f257d417c00000 x3 : 00000002187bd7ce x2 : ffffa29a38f0f0d8 x1 : ffffa29a38eae268 x0 : ffff0027d2e76c80 Call trace: hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu] hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu] __perf_event_read+0x1a0/0x1f8 flush_smp_call_function_queue+0xa0/0x160 generic_smp_call_function_single_interrupt+0x18/0x20 handle_IPI+0x31c/0x4dc gic_handle_irq+0x2c8/0x310 el1_irq+0xcc/0x180 arch_cpu_idle+0x4c/0x20c default_idle_call+0x20/0x30 do_idle+0x1b4/0x270 cpu_startup_entry+0x28/0x30 secondary_start_kernel+0x1a4/0x1fc To solve the above issue, current module should be registered to kernel, so that try_module_get() can be invoked when perf sampling starts. This adds the reference counting of module and could prevent users from removing modules during sampling. Reported-by: Haifeng Wang <wang.wanghaifeng@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1594891165-8228-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>