From 547b60988e631f74ed025cf1ec50cfc17f49fd13 Mon Sep 17 00:00:00 2001 From: Suzuki K Poulose Date: Mon, 5 Apr 2021 17:42:48 +0100 Subject: perf: aux: Add flags for the buffer format Allocate a byte for advertising the PMU specific format type of the given AUX record. A PMU could end up providing hardware trace data in multiple format in a single session. e.g, The format of hardware buffer produced by CoreSight ETM PMU depends on the type of the "sink" device used for collection for an event (Traditional TMC-ETR/Bs with formatting or TRBEs without any formatting). # Boring story of why this is needed. Goto The_End_of_Story for skipping. CoreSight ETM trace allows instruction level tracing of Arm CPUs. The ETM generates the CPU excecution trace and pumps it into CoreSight AMBA Trace Bus and is collected by a different CoreSight component (traditionally CoreSight TMC-ETR /ETB/ETF), called "sink". Important to note that there is no guarantee that every CPU has a dedicated sink. Thus multiple ETMs could pump the trace data into the same "sink" and thus they apply additional formatting of the trace data for the user to decode it properly and attribute the trace data to the corresponding ETM. However, with the introduction of Arm Trace buffer Extensions (TRBE), we now have a dedicated per-CPU architected sink for collecting the trace. Since the TRBE is always per-CPU, it doesn't apply any formatting of the trace. The support for this driver is under review [1]. Now a system could have a per-cpu TRBE and one or more shared TMC-ETRs on the system. A user could choose a "specific" sink for a perf session (e.g, a TMC-ETR) or the driver could automatically select the nearest sink for a given ETM. It is possible that some ETMs could end up using TMC-ETR (e.g, if the TRBE is not usable on the CPU) while the others using TRBE in a single perf session. Thus we now have "formatted" trace collected from TMC-ETR and "unformatted" trace collected from TRBE. However, we don't get into a situation where a single event could end up using TMC-ETR & TRBE. i.e, any AUX buffer is guaranteed to be either RAW or FORMATTED, but not a mix of both. As for perf decoding, we need to know the type of the data in the individual AUX buffers, so that it can set up the "OpenCSD" (library for decoding CoreSight trace) decoder instance appropriately. Thus the perf.data file must conatin the hints for the tool to decode the data correctly. Since this is a runtime variable, and perf tool doesn't have a control on what sink gets used (in case of automatic sink selection), we need this information made available from the PMU driver for each AUX record. # The_End_of_Story Cc: Peter Ziljstra Cc: alexander.shishkin@linux.intel.com Cc: mingo@redhat.com Cc: will@kernel.org Cc: mark.rutland@arm.com Cc: mike.leach@linaro.org Cc: acme@kernel.org Cc: jolsa@redhat.com Cc: Mathieu Poirier Reviewed by: Mike Leach Acked-by: Peter Ziljstra Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20210405164307.1720226-2-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier --- include/uapi/linux/perf_event.h | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index ad15e40d7f5d..f006eeab6f0e 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1156,10 +1156,11 @@ enum perf_callchain_context { /** * PERF_RECORD_AUX::flags bits */ -#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */ -#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */ -#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */ -#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ +#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */ +#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */ +#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */ +#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ +#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */ #define PERF_FLAG_FD_NO_GROUP (1UL << 0) #define PERF_FLAG_FD_OUTPUT (1UL << 1) -- cgit v1.3-8-gc7d7 From 7dde51767ca5339ed33109056d92fdca05d56d8d Mon Sep 17 00:00:00 2001 From: Suzuki K Poulose Date: Mon, 5 Apr 2021 17:42:49 +0100 Subject: perf: aux: Add CoreSight PMU buffer formats CoreSight PMU supports aux-buffer for the ETM tracing. The trace generated by the ETM (associated with individual CPUs, like Intel PT) is captured by a separate IP (CoreSight TMC-ETR/ETF until now). The TMC-ETR applies formatting of the raw ETM trace data, as it can collect traces from multiple ETMs, with the TraceID to indicate the source of a given trace packet. Arm Trace Buffer Extension is new "sink" IP, attached to individual CPUs and thus do not provide additional formatting, like TMC-ETR. Additionally, a system could have both TRBE *and* TMC-ETR for the trace collection. e.g, TMC-ETR could be used as a single trace buffer to collect data from multiple ETMs to correlate the traces from different CPUs. It is possible to have a perf session where some events end up collecting the trace in TMC-ETR while the others in TRBE. Thus we need a way to identify the type of the trace for each AUX record. Define the trace formats exported by the CoreSight PMU. We don't define the flags following the "ETM" as this information is available to the user when issuing the session. What is missing is the additional formatting applied by the "sink" which is decided at the runtime and the user may not have a control on. So we define : - CORESIGHT format (indicates the Frame format) - RAW format (indicates the format of the source) The default value is CORESIGHT format for all the records (i,e == 0). Add the RAW format for others that use raw format. Cc: Peter Zijlstra Cc: Mike Leach Cc: Mathieu Poirier Cc: Leo Yan Cc: Anshuman Khandual Reviewed-by: Mike Leach Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20210405164307.1720226-3-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier --- include/uapi/linux/perf_event.h | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index f006eeab6f0e..63971eaef127 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1162,6 +1162,10 @@ enum perf_callchain_context { #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */ +/* CoreSight PMU AUX buffer formats */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */ +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */ + #define PERF_FLAG_FD_NO_GROUP (1UL << 0) #define PERF_FLAG_FD_OUTPUT (1UL << 1) #define PERF_FLAG_PID_CGROUP (1UL << 2) /* pid=cgroup id, per-cpu mode only */ -- cgit v1.3-8-gc7d7 From 3bf725699bf62494b3e179f1795f08c7d749f061 Mon Sep 17 00:00:00 2001 From: Jianyong Wu Date: Wed, 9 Dec 2020 14:09:29 +0800 Subject: KVM: arm64: Add support for the KVM PTP service Implement the hypervisor side of the KVM PTP interface. The service offers wall time and cycle count from host to guest. The caller must specify whether they want the host's view of either the virtual or physical counter. Signed-off-by: Jianyong Wu Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20201209060932.212364-7-jianyong.wu@arm.com --- Documentation/virt/kvm/api.rst | 10 +++++++ Documentation/virt/kvm/arm/index.rst | 1 + Documentation/virt/kvm/arm/ptp_kvm.rst | 25 ++++++++++++++++ arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/hypercalls.c | 53 ++++++++++++++++++++++++++++++++++ include/linux/arm-smccc.h | 16 ++++++++++ include/uapi/linux/kvm.h | 1 + 7 files changed, 107 insertions(+) create mode 100644 Documentation/virt/kvm/arm/ptp_kvm.rst (limited to 'include/uapi/linux') diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 38e327d4b479..987d99e39887 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6724,3 +6724,13 @@ vcpu_info is set. The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls. + +8.31 KVM_CAP_PTP_KVM +-------------------- + +:Architectures: arm64 + +This capability indicates that the KVM virtual PTP service is +supported in the host. A VMM can check whether the service is +available to the guest on migration. + diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst index 3e2b2aba90fc..78a9b670aafe 100644 --- a/Documentation/virt/kvm/arm/index.rst +++ b/Documentation/virt/kvm/arm/index.rst @@ -10,3 +10,4 @@ ARM hyp-abi psci pvtime + ptp_kvm diff --git a/Documentation/virt/kvm/arm/ptp_kvm.rst b/Documentation/virt/kvm/arm/ptp_kvm.rst new file mode 100644 index 000000000000..68cffb50d8bf --- /dev/null +++ b/Documentation/virt/kvm/arm/ptp_kvm.rst @@ -0,0 +1,25 @@ +.. SPDX-License-Identifier: GPL-2.0 + +PTP_KVM support for arm/arm64 +============================= + +PTP_KVM is used for high precision time sync between host and guests. +It relies on transferring the wall clock and counter value from the +host to the guest using a KVM-specific hypercall. + +* ARM_SMCCC_HYP_KVM_PTP_FUNC_ID: 0x86000001 + +This hypercall uses the SMC32/HVC32 calling convention: + +ARM_SMCCC_HYP_KVM_PTP_FUNC_ID + ============= ========== ========== + Function ID: (uint32) 0x86000001 + Arguments: (uint32) KVM_PTP_VIRT_COUNTER(0) + KVM_PTP_PHYS_COUNTER(1) + Return Values: (int32) NOT_SUPPORTED(-1) on error, or + (uint32) Upper 32 bits of wall clock time (r0) + (uint32) Lower 32 bits of wall clock time (r1) + (uint32) Upper 32 bits of counter (r2) + (uint32) Lower 32 bits of counter (r3) + Endianness: No Restrictions. + ============= ========== ========== diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 7f06ba76698d..46401798c644 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -206,6 +206,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ARM_INJECT_EXT_DABT: case KVM_CAP_SET_GUEST_DEBUG: case KVM_CAP_VCPU_ATTRIBUTES: + case KVM_CAP_PTP_KVM: r = 1; break; case KVM_CAP_ARM_SET_DEVICE_ADDR: diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 78d32c34d49c..30da78f72b3b 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -9,6 +9,55 @@ #include #include +static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val) +{ + struct system_time_snapshot systime_snapshot; + u64 cycles = ~0UL; + u32 feature; + + /* + * system time and counter value must captured at the same + * time to keep consistency and precision. + */ + ktime_get_snapshot(&systime_snapshot); + + /* + * This is only valid if the current clocksource is the + * architected counter, as this is the only one the guest + * can see. + */ + if (systime_snapshot.cs_id != CSID_ARM_ARCH_COUNTER) + return; + + /* + * The guest selects one of the two reference counters + * (virtual or physical) with the first argument of the SMCCC + * call. In case the identifier is not supported, error out. + */ + feature = smccc_get_arg1(vcpu); + switch (feature) { + case KVM_PTP_VIRT_COUNTER: + cycles = systime_snapshot.cycles - vcpu_read_sys_reg(vcpu, CNTVOFF_EL2); + break; + case KVM_PTP_PHYS_COUNTER: + cycles = systime_snapshot.cycles; + break; + default: + return; + } + + /* + * This relies on the top bit of val[0] never being set for + * valid values of system time, because that is *really* far + * in the future (about 292 years from 1970, and at that stage + * nobody will give a damn about it). + */ + val[0] = upper_32_bits(systime_snapshot.real); + val[1] = lower_32_bits(systime_snapshot.real); + val[2] = upper_32_bits(cycles); + val[3] = lower_32_bits(cycles); +} + int kvm_hvc_call_handler(struct kvm_vcpu *vcpu) { u32 func_id = smccc_get_function(vcpu); @@ -79,6 +128,10 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu) break; case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID: val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES); + val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP); + break; + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID: + kvm_ptp_get_time(vcpu, val); break; case ARM_SMCCC_TRNG_VERSION: case ARM_SMCCC_TRNG_FEATURES: diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 1a27bd9493fe..6861489a1890 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -103,6 +103,7 @@ /* KVM "vendor specific" services */ #define ARM_SMCCC_KVM_FUNC_FEATURES 0 +#define ARM_SMCCC_KVM_FUNC_PTP 1 #define ARM_SMCCC_KVM_FUNC_FEATURES_2 127 #define ARM_SMCCC_KVM_NUM_FUNCS 128 @@ -114,6 +115,21 @@ #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 +/* + * ptp_kvm is a feature used for time sync between vm and host. + * ptp_kvm module in guest kernel will get service from host using + * this hypercall ID. + */ +#define ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_VENDOR_HYP, \ + ARM_SMCCC_KVM_FUNC_PTP) + +/* ptp_kvm counter type ID */ +#define KVM_PTP_VIRT_COUNTER 0 +#define KVM_PTP_PHYS_COUNTER 1 + /* Paravirtualised time calls (defined by ARM DEN0057A) */ #define ARM_SMCCC_HV_PV_TIME_FEATURES \ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index f6afee209620..0e0f70c0d0dc 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1078,6 +1078,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_DIRTY_LOG_RING 192 #define KVM_CAP_X86_BUS_LOCK_EXIT 193 #define KVM_CAP_PPC_DAWR1 194 +#define KVM_CAP_PTP_KVM 195 #ifdef KVM_CAP_IRQ_ROUTING -- cgit v1.3-8-gc7d7