aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-07-14drivers/perf: hisi: Simplify the probe process for each DDRC versionJunhao He2-188/+142
Version 1 and 2 of DDRC PMU also use different HID. Make use of struct acpi_device_id::driver_data for version specific information rather than judge the version register. This will help to simplify the probe process and also a bit easier for extension. In order to support this extend struct hisi_pmu_dev_info for version specific counter bits and event range. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14perf/arm-ni: Support sharing IRQs within an NI instanceShouping Wang1-27/+55
NI-700 has a distinct PMU interrupt output for each Clock Domain, however some integrations may still combine these together externally. The initial driver didn't attempt to support this, in anticipation of a more general solution for IRQ sharing between system PMU instances, but that's still a way off, so let's make this intermediate step for now to at least allow sharing IRQs within an individual NI instance. Now that CPU affinity and migration are cleaned up, it's fairly straightforward to adopt similar logic to arm-cmn, to identify CDs with a common interrupt and loop over them directly in the handler. Signed-off-by: Shouping Wang <allen.wang@hj-micro.com> [ rm: Rework for affinity handling, cosmetics, new commit message ] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/f62db639d3b54c959ec477db7b8ccecbef1ca310.1752256072.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14perf/arm-ni: Consolidate CPU affinity handlingRobin Murphy1-40/+34
Since overflow interrupts from the individual PMUs are infrequent and unlikely to coincide, and we make no attempt to balance them across CPUs anyway, there's really not much point tracking a separate CPU affinity per PMU. Move the CPU affinity and hotplug migration up to the NI instance level. Tested-by: Shouping Wang <allen.wang@hj-micro.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/00b622872006c2f0c89485e343b1cb8caaa79c47.1752256072.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14perf/cxlpmu: Fix typos in cxl_pmu.c comments and documentationAlok Tiwari1-3/+3
Fix several minor typo errors in comments: - Remove duplicated word "a" in "a a VID / GroupID". - Correct "Opcopdes" to "Opcodes" in CXL spec reference. - Fix spelling of "implemnted" to "implemented". Improves code readability and documentation consistency. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20250624194350.109790-4-alok.a.tiwari@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14perf/cxlpmu: Remove unintended newline from IRQ name format stringAlok Tiwari1-1/+1
The IRQ name format string used in devm_kasprintf() mistakenly included a newline character "\n". This could lead to confusing log output or misformatted names in sysfs or debug messages. This fix removes the newline to ensure proper IRQ naming. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20250624194350.109790-3-alok.a.tiwari@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14perf/cxlpmu: Fix devm_kcalloc() argument order in cxl_pmu_probe()Alok Tiwari1-2/+2
The previous code mistakenly swapped the count and size parameters. This fix corrects the argument order in devm_kcalloc() to follow the conventional count, size form, avoiding potential confusion or bugs. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20250624194350.109790-2-alok.a.tiwari@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08perf: arm_spe: Relax period restrictionLeo Yan1-7/+11
The minimum interval specified the PMSIDR_EL1.Interval field is a hardware recommendation. However, this value is set by hardware designer before the production. It is not actual hardware limitation but tools currently have no way to test shorter periods. This change relaxes the limitation by allowing any non-zero periods, with simplifying code with clamp_t(). The downside is that small periods may increase the risk of AUX ring buffer overruns. When an overrun occurs, the perf core layer will trigger an irq work to disable the event and wake up the tool in user space to read the trace data. After the tool finishes reading, it will re-enable the AUX event. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250627163028.3503122-1-leo.yan@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)Rob Herring (Arm)7-5/+990
The ARMv9.2 architecture introduces the optional Branch Record Buffer Extension (BRBE), which records information about branches as they are executed into set of branch record registers. BRBE is similar to x86's Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer (BHRB). BRBE supports filtering by exception level and can filter just the source or target address if excluded to avoid leaking privileged addresses. The h/w filter would be sufficient except when there are multiple events with disjoint filtering requirements. In this case, BRBE is configured with a union of all the events' desired branches, and then the recorded branches are filtered based on each event's filter. For example, with one event capturing kernel events and another event capturing user events, BRBE will be configured to capture both kernel and user branches. When handling event overflow, the branch records have to be filtered by software to only include kernel or user branch addresses for that event. In contrast, x86 simply configures LBR using the last installed event which seems broken. It is possible on x86 to configure branch filter such that no branches are ever recorded (e.g. -j save_type). For BRBE, events with a configuration that will result in no samples are rejected. Recording branches in KVM guests is not supported like x86. However, perf on x86 allows requesting branch recording in guests. The guest events are recorded, but the resulting branches are all from the host. For BRBE, events with branch recording and "exclude_host" set are rejected. Requiring "exclude_guest" to be set did not work. The default for the perf tool does set "exclude_guest" if no exception level options are specified. However, specifying kernel or user events defaults to including both host and guest. In this case, only host branches are recorded. BRBE can support some additional exception branch types compared to x86. On x86, all exceptions other than syscalls are recorded as IRQ. With BRBE, it is possible to better categorize these exceptions. One limitation relative to x86 is we cannot distinguish a syscall return from other exception returns. So all exception returns are recorded as ERET type. The FIQ branch type is omitted as the only FIQ user is Apple platforms which don't support BRBE. The debug branch types are omitted as there is no clear need for them. BRBE records are invalidated whenever events are reconfigured, a new task is scheduled in, or after recording is paused (and the records have been recorded for the event). The architecture allows branch records to be invalidated by the PE under implementation defined conditions. It is expected that these conditions are rare. Cc: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Co-developed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> tested-by: Adam Young <admiyo@os.amperecomputing.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-4-e7775563036e@kernel.org [will: Fix sparse warnings about mixed declarations and code. Fix C99 comment syntax.] Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08KVM: arm64: nvhe: Disable branch generation in nVHE guestsAnshuman Khandual4-1/+39
While BRBE can record branches within guests, the host recording branches in guests is not supported by perf (though events are). Support for BRBE in guests will supported by providing direct access to BRBE within the guests. That is how x86 LBR works for guests. Therefore, BRBE needs to be disabled on guest entry and restored on exit. For nVHE, this requires explicit handling for guests. Before entering a guest, save the BRBE state and disable the it. When returning to the host, restore the state. For VHE, it is not necessary. We initialize BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable branch recording in the host. When entering the guest, we set HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2. Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when running a guest. Should recording in guests (by the host) ever be desired, the perf ABI will need to be extended to distinguish guest addresses (struct perf_branch_entry.priv) for starters. BRBE records would also need to be invalidated on guest entry/exit as guest/host EL1 and EL0 records can't be distinguished. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Co-developed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: James Clark <james.clark@linaro.org> Reviewed-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-3-e7775563036e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08arm64: Handle BRBE booting requirementsAnshuman Khandual2-3/+89
To use the Branch Record Buffer Extension (BRBE), some configuration is necessary at EL3 and EL2. This patch documents the requirements and adds the initial EL2 setup code, which largely consists of configuring the fine-grained traps and initializing a couple of BRBE control registers. Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and HDFGWTR_EL2 with the same value, relying on the read/write trap controls for a register occupying the same bit position in either register. The 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit 59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds. To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and HDFGWTR_EL2 control bits separately. While making this change the open-coded value (1 << 62) is replaced with HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK. The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special initialisation: even though they are subject to E2H renaming, both have an effect regardless of HCR_EL2.TGE, even when running at EL2. So we must initialize BRBCR_EL2 in case we run in nVHE mode. This is handled in __init_el2_brbe() with a comment to explain the situation. Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> [Mark: rewrite commit message, fix typo in comment] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Co-developed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> tested-by: Adam Young <admiyo@os.amperecomputing.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-2-e7775563036e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08arm64/sysreg: Add BRBE registers and fieldsAnshuman Khandual2-10/+138
This patch adds definitions related to the Branch Record Buffer Extension (BRBE) as per ARM DDI 0487K.a. These will be used by KVM and a BRBE driver in subsequent patches. Some existing BRBE definitions in asm/sysreg.h are replaced with equivalent generated definitions. Cc: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: James Clark <james.clark@linaro.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> tested-by: Adam Young <admiyo@os.amperecomputing.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-1-e7775563036e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04perf/arm: Add missing .suppress_bind_attrsRobin Murphy2-0/+2
PMU drivers should set .suppress_bind_attrs so that userspace is denied the opportunity to pull the driver out from underneath an in-use PMU (with predictably unpleasant consequences). Somehow both the CMN and NI drivers have managed to miss this; put that right. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/acd48c341b33b96804a3969ee00b355d40c546e2.1751465293.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04perf/arm-cmn: Reduce stack usage during discoveryRobin Murphy1-7/+8
Arnd reports that Clang's aggressive inlining of arm_cmn_discover() can lead to stack frame size warnings, and while we could simply prevent such inlining to hide the issue, it seems more productive to actually heed the warning and do something about the overall stack footprint. The xp_region array is already rather large, and CMN_MAX_XPS might only grow larger in future, however it only serves as a convenience to save repeating the first level's worth of register reads in the second pass of discovery. There's no performance concern here, and it only takes a small tweak to the flow to re-extract the offsets instead of stashing them, so let's just do that and save several hundred bytes of stack. Reported-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/e7dd41bf0f1b098e2e4b01ef91318a4b272abff8.1751046159.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04perf: imx9_perf: make the read-only array mask static constColin Ian King1-3/+5
Don't populate the read-only array mask on the stack at run time, instead make it static const. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250611133917.170888-1-colin.i.king@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04perf/arm-cmn: Broaden module description for wider interconnect supportZhiyuan Dai1-2/+2
The current MODULE_DESCRIPTION only mentions CMN-600, but this driver now supports several Arm mesh interconnects including CMN-650, CMN-700, CI-700, and CMN-S3. Update the MODULE_DESCRIPTION to reflect the expanded scope. Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn> Link: https://lore.kernel.org/r/20250522032122.949373-1-daizhiyuan@phytium.com.cn Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04perf/arm-ni: Set initial IRQ affinityRobin Murphy1-0/+2
While we do request our IRQs with the right flags to stop their affinity changing unexpectedly, we forgot to actually set it to start with. Oops. Cc: stable@vger.kernel.org Fixes: 4d5a7680f2b4 ("perf: Add driver for Arm NI-700 interconnect PMU") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Shouping Wang <allen.wang@hj-micro.com> Link: https://lore.kernel.org/r/614ced9149ee8324e58930862bd82cbf46228d27.1747149165.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04arm64/watchdog_hld: Add a cpufreq notifier for update watchdog threshYicong Yang1-0/+58
arm64 depends on the cpufreq driver to gain the maximum cpu frequency to convert the watchdog_thresh to perf event period. cpufreq drivers like cppc_cpufreq will be initialized lately after the initializing of the hard lockup detector so just use a safe cpufreq which will be inaccurency. Use a cpufreq notifier to adjust the event's period to a more accurate one. Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250701110214.27242-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04watchdog/perf: Provide function for adjusting the event periodYicong Yang2-0/+24
Architecture's using perf events for hard lockup detection needs to convert the watchdog_thresh to the event's period, some architecture for example arm64 perform this conversion using the CPU's maximum frequency which will be acquired by cpufreq. However by the time the lockup detector's initialized the cpufreq driver may not be initialized, thus launch a watchdog with inaccurate period. Provide a function hardlockup_detector_perf_adjust_period() to allowing adjust the event period. Then architecture can update with more accurate period if cpufreq is initialized. Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250701110214.27242-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-06-08Linux 6.16-rc1Linus Torvalds1-2/+2
2025-06-08tools/power turbostat: version 2025.06.08Len Brown1-37/+36
Add initial DMR support, which required smarter RAPL probe Fix AMD MSR RAPL energy reporting Add RAPL power limit configuration output Minor fixes Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Add initial support for BartlettLakeZhang Rui1-0/+1
Add initial support for BartlettLake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Add initial support for DMRZhang Rui1-0/+18
Add initial support for DMR. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Dump RAPL sysfs infoZhang Rui1-0/+156
for example: intel-rapl:1: psys 28.0s:100W 976.0us:100W intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W intel-rapl:0/intel-rapl:0:0: core disabled intel-rapl:0/intel-rapl:0:1: uncore disabled intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W [lenb: simplified format] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> squish me Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Avoid probing the same perf countersZhang Rui1-0/+15
For the RAPL package energy status counter, Intel and AMD share the same perf_subsys and perf_name, but with different MSR addresses. Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are introduced to describe this counter for different Vendors. As a result, the perf counter is probed twice, and causes a failure in in get_rapl_counters() because expected_read_size and actual_read_size don't match. Fix the problem by skipping the already probed counter. Note, this is not a perfect fix. For example, if different vendors/platforms use the same MSR value for different purpose, the code can be fooled when it probes a rapl_counter_arch_infos[] entry that does not belong to the running Vendor/Platform. In a long run, better to put rapl_counter_arch_infos[] into the platform_features so that this becomes Vendor/Platform specific. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs clearedZhang Rui1-25/+24
platform_features->rapl_msrs describes the RAPL MSRs supported. While RAPL Perf counters can be exposed from different kernel backend drivers, e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver. Thus, turbostat should first blindly probe all the available RAPL Perf counters, and falls back to the RAPL MSR counters if they are listed in platform_features->rapl_msrs. With this, platforms that don't have RAPL MSRs can clear the platform_features->rapl_msrs bits and use RAPL Perf counters only. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Clean up add perf/msr counter logicZhang Rui1-7/+18
Increase the code readability by moving the no_perf/no_msr flag and the cai->perf_name/cai->msr sanity checks into the counter probe functions. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Introduce add_msr_counter()Zhang Rui1-9/+23
probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR counters and MPERF/APERF/SMI MSR counters, thus its name is misleading. Similar to add_perf_counter(), introduce add_msr_counter() to probe a counter via MSR. Introduce wrapper function add_rapl_msr_counter() at the same time to add extra check for Zero return value for specified RAPL counters. No functional change intended. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Remove add_msr_perf_counter_()Zhang Rui1-12/+8
As the only caller of add_msr_perf_counter_(), add_msr_perf_counter() just gives extra debug output on top. There is no need to keep both functions. Remove add_msr_perf_counter_() and move all the logic to add_msr_perf_counter(). No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Remove add_cstate_perf_counter_()Zhang Rui1-13/+9
As the only caller of add_cstate_perf_counter_(), add_cstate_perf_counter() just gives extra debug output on top. There is no need to keep both functions. Remove add_cstate_perf_counter_() and move all the logic to add_cstate_perf_counter(). No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Remove add_rapl_perf_counter_()Zhang Rui1-15/+10
As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter() just gives extra debug output on top. There is no need to keep both functions. Remove add_rapl_perf_counter_() and move all the logic to add_rapl_perf_counter(). No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Quit early for unsupported RAPL countersZhang Rui1-1/+4
Quit early for unsupported RAPL counters. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Always check rapl_joules flagZhang Rui1-3/+9
rapl_joules bit should always be checked even if platform_features->rapl_msrs is not set or no_msr flag is used. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Fix AMD package-energy reportingGautham R. Shenoy1-5/+36
commit 05a2f07db888 ("tools/power turbostat: read RAPL counters via perf") that adds support to read RAPL counters via perf defines the notion of a RAPL domain_id which is set to physical_core_id on platforms which support per_core_rapl counters (Eg: AMD processors Family 17h onwards) and is set to the physical_package_id on all the other platforms. However, the physical_core_id is only unique within a package and on platforms with multiple packages more than one core can have the same physical_core_id and thus the same domain_id. (For eg, the first cores of each package have the physical_core_id = 0). This results in all these cores with the same physical_core_id using the same entry in the rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the perf-initialization for cores whose domain_ids have already been visited, cores that have the same physical_core_id always read the perf file corresponding to the physical_core_id of the first package and thus the package-energy is incorrectly reported to be the same value for different packages. Note: This issue only arises when RAPL counters are read via perf and not when they are read via MSRs since in the latter case the MSRs are read separately on each core. Fix this issue by associating each CPU with rapl_core_id which is unique across all the packages in the system. Fixes: 05a2f07db888 ("tools/power turbostat: read RAPL counters via perf") Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Fix RAPL_GFX_ALL typoKaushlendra Kumar1-1/+1
Fix typo in the currently unused RAPL_GFX_ALL macro definition. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat: Add Android support for MSR device handlingKaushlendra Kumar1-3/+17
It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr, updates error messages and permission checks to reflect the Android device path, and wraps platform-specific code with #if defined(ANDROID) to ensure correct behavior on both Android and non-Android systems. These changes improve compatibility and usability of turbostat on Android devices. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat.8: pm_domain wording fixLen Brown1-2/+2
turbostat.8: clarify that uncore "domains" are Power Management domains, aka pm_domains. Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08tools/power turbostat.8: fix typo: idle_pct should be pct_idleLen Brown1-1/+1
idle_pct should be pct_idle Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08treewide, timers: Rename from_timer() to timer_container_of()Ingo Molnar689-955/+1151
Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-07tracing: Add rcu annotation around file->filter accessesSteven Rostedt1-4/+6
Running sparse on trace_events_filter.c triggered several warnings about file->filter being accessed directly even though it's annotated with __rcu. Add rcu_dereference() around it and shuffle the logic slightly so that it's always referenced via accessor functions. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250607102821.6c7effbf@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-07sh: kprobes: Remove unused variables in kprobe_exceptions_notify()Mike Rapoport1-4/+0
kbuild reports the following warning: arch/sh/kernel/kprobes.c: In function 'kprobe_exceptions_notify': >> arch/sh/kernel/kprobes.c:412:24: warning: variable 'p' set but not used [-Wunused-but-set-variable] 412 | struct kprobe *p = NULL; | ^ The variable 'p' is indeed unused since the commit fa5a24b16f94 ("sh/kprobes: Don't call the ->break_handler() in SH kprobes code") Remove that variable along with 'kprobe_opcode_t *addr' which also becomes unused after 'p' is removed. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505151341.EuRFR22l-lkp@intel.com/ Fixes: fa5a24b16f94 ("sh/kprobes: Don't call the ->break_handler() in SH kprobes code") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2025-06-07sh: ecovec24: Make SPI mode explicitGeert Uytterhoeven1-0/+1
Commit cf9e4784f3bde3e4 ("spi: sh-msiof: Add slave mode support") added a new mode member to the sh_msiof_spi_info structure, but did not update any board files. Hence all users in board files rely on the default being host mode. Make this unambiguous by configuring host mode explicitly. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2025-06-07sh: Replace __ASSEMBLY__ with __ASSEMBLER__ in all headersThomas Huth17-46/+46
While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembly code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This can be very confusing when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's standardize on the __ASSEMBLER__ macro that is provided by the compilers now. This is a completely mechanical patch (done with a simple "sed -i" statement). Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: linux-sh@vger.kernel.org Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2025-06-07genksyms: Fix enum consts from a reference affecting new valuesPetr Pavlu1-7/+20
Enumeration constants read from a symbol reference file can incorrectly affect new enumeration constants parsed from an actual input file. Example: $ cat test.c enum { E_A, E_B, E_MAX }; struct bar { int mem[E_MAX]; }; int foo(struct bar *a) {} __GENKSYMS_EXPORT_SYMBOL(foo); $ cat test.c | ./scripts/genksyms/genksyms -T test.0.symtypes #SYMVER foo 0x070d854d $ cat test.0.symtypes E#E_MAX 2 s#bar struct bar { int mem [ E#E_MAX ] ; } foo int foo ( s#bar * ) $ cat test.c | ./scripts/genksyms/genksyms -T test.1.symtypes -r test.0.symtypes <stdin>:4: warning: foo: modversion changed because of changes in enum constant E_MAX #SYMVER foo 0x9c9dfd81 $ cat test.1.symtypes E#E_MAX ( 2 ) + 3 s#bar struct bar { int mem [ E#E_MAX ] ; } foo int foo ( s#bar * ) The __add_symbol() function includes logic to handle the incrementation of enumeration values, but this code is also invoked when reading a reference file. As a result, the variables last_enum_expr and enum_counter might be incorrectly set after reading the reference file, which later affects parsing of the actual input. Fix the problem by splitting the logic for the incrementation of enumeration values into a separate function process_enum() and call it from __add_symbol() only when processing non-reference data. Fixes: e37ddb825003 ("genksyms: Track changes to enum constants") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-07arch: use always-$(KBUILD_BUILTIN) for vmlinux.ldsMasahiro Yamada21-21/+21
The extra-y syntax is deprecated. Instead, use always-$(KBUILD_BUILTIN), which behaves equivalently. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2025-06-07do_change_type(): refuse to operate on unmounted/not ours mountsAl Viro1-0/+4
Ensure that propagation settings can only be changed for mounts located in the caller's mount namespace. This change aligns permission checking with the rest of mount(2). Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 07b20889e305 ("beginning of the shared-subtree proper") Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-07kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES}Masahiro Yamada2-8/+12
KBUILD_BUILTIN is set to 1 unless you are building only modules. KBUILD_MODULES is set to 1 when you are building only modules (a typical use case is "make modules"). It is more useful to set them to 'y' instead, so we can do something like: always-$(KBUILD_BUILTIN) += vmlinux.lds This works equivalently to: extra-y += vmlinux.lds This allows us to deprecate extra-y. extra-y and always-y are quite similar, and we do not need both. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2025-06-07clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right usernsAl Viro1-0/+3
What we want is to verify there is that clone won't expose something hidden by a mount we wouldn't be able to undo. "Wouldn't be able to undo" may be a result of MNT_LOCKED on a child, but it may also come from lacking admin rights in the userns of the namespace mount belongs to. clone_private_mnt() checks the former, but not the latter. There's a number of rather confusing CAP_SYS_ADMIN checks in various userns during the mount, especially with the new mount API; they serve different purposes and in case of clone_private_mnt() they usually, but not always end up covering the missing check mentioned above. Reviewed-by: Christian Brauner <brauner@kernel.org> Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com> Fixes: 427215d85e8d ("ovl: prevent private clone if bind mount is not allowed") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-07selftests/mount_setattr: adapt detached mount propagation testChristian Brauner1-16/+1
Make sure that detached trees don't receive mount propagation. Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-07do_move_mount(): split the checks in subtree-of-our-ns and entire-anon casesAl Viro1-21/+25
... and fix the breakage in anon-to-anon case. There are two cases acceptable for do_move_mount() and mixing checks for those is making things hard to follow. One case is move of a subtree in caller's namespace. * source and destination must be in caller's namespace * source must be detachable from parent Another is moving the entire anon namespace elsewhere * source must be the root of anon namespace * target must either in caller's namespace or in a suitable anon namespace (see may_use_mount() for details). * target must not be in the same namespace as source. It's really easier to follow if tests are *not* mixed together... Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 3b5260d12b1f ("Don't propagate mounts into detached trees") Reported-by: Allison Karlitskaya <lis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-07fs: allow clone_private_mount() for a path on real rootfsKONDO KAZUMA(近藤 和真)1-10/+11
Mounting overlayfs with a directory on real rootfs (initramfs) as upperdir has failed with following message since commit db04662e2f4f ("fs: allow detached mounts in clone_private_mount()"). [ 4.080134] overlayfs: failed to clone upperpath Overlayfs mount uses clone_private_mount() to create internal mount for the underlying layers. The commit made clone_private_mount() reject real rootfs because it does not have a parent mount and is in the initial mount namespace, that is not an anonymous mount namespace. This issue can be fixed by modifying the permission check of clone_private_mount() following [1]. Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: db04662e2f4f ("fs: allow detached mounts in clone_private_mount()") Link: https://lore.kernel.org/all/20250514190252.GQ2023217@ZenIV/ [1] Link: https://lore.kernel.org/all/20250506194849.GT2023217@ZenIV/ Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Kazuma Kondo <kazuma-kondo@nec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>