aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/perf/power7-pmu.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2013-07-12perf tools: Make Power7 events available for perfRunzhen Wang1-116/+32
Power7 supports over 530 different perf events but only a small subset of these can be specified by name, for the remaining events, we must specify them by their raw code: perf stat -e r2003c <application> This patch makes all the POWER7 events available in sysfs. So we can instead specify these as: perf stat -e 'cpu/PM_CMPLU_STALL_DFU/' <application> where PM_CMPLU_STALL_DFU is the r2003c in previous example. Before this patch is applied, the size of power7-pmu.o is: $ size arch/powerpc/perf/power7-pmu.o text data bss dec hex filename 3073 2720 0 5793 16a1 arch/powerpc/perf/power7-pmu.o and after the patch is applied, it is: $ size arch/powerpc/perf/power7-pmu.o text data bss dec hex filename 15950 31112 0 47062 b7d6 arch/powerpc/perf/power7-pmu.o For the run time overhead, I use two scripts, one is "event_name.sh", which contains 50 event names, it looks like: # ./perf record -e 'cpu/PM_CMPLU_STALL_DFU/' -e ..... /bin/sleep 1 the other one is named "event_code.sh" which use corresponding events raw code instead of events names, it looks like: # ./perf record -e r2003c -e ...... /bin/sleep 1 below is the result. Using events name: [root@localhost perf]# time ./event_name.sh [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.002 MB perf.data (~102 samples) ] real 0m1.192s user 0m0.028s sys 0m0.106s Using events raw code: [root@localhost perf]# time ./event_code.sh [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.003 MB perf.data (~112 samples) ] real 0m1.198s user 0m0.028s sys 0m0.105s Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Cc: icycoder@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhew@clemson.edu> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1372407297-6996-3-git-send-email-runzhen@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-08perf tools: fix a typo of a Power7 event nameRunzhen Wang1-6/+6
In the Power7 PMU guide: https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis/ PM_BRU_MPRED is referred to as PM_BR_MPRED. It fixed the typo by changing the name of the event in kernel and documentation accordingly. This patch changes the ABI, there are some reasons I think it's ok: - It is relatively new interface, specific to the Power7 platform. - No tools that we know of actually use this interface at this point (none are listed near the interface). - Users of this interface (eg oprofile users migrating to perf) would be more used to the "PM_BR_MPRED" rather than "PM_BRU_MPRED". - These are in the ABI/testing at this point rather than ABI/stable, so hoping we have some wiggle room. Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Cc: icycoder@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhew@clemson.edu> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1372407297-6996-2-git-send-email-runzhen@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-05-29perf: Power7: Make CPI stack events available in sysfsSukadev Bhattiprolu1-0/+73
A set of Power7 events are often used for Cycles Per Instruction (CPI) stack analysis. Make these events available in sysfs (/sys/devices/cpu/events/) so they can be identified using their symbolic names: perf stat -e 'cpu/PM_CMPLU_STALL_DCACHE_MISS/' /bin/ls Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130406164803.GA408@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-13perf/POWER7: Create a sysfs format entry for Power7 eventsSukadev Bhattiprolu1-0/+13
Create a sysfs entry, '/sys/bus/event_source/devices/cpu/format/event' which describes the format of the POWER7 PMU events. This code is based on corresponding code in x86. Changelog[v4]: [Michael Ellerman, Paul Mckerras] The event format is different for other POWER cpus. So move the code to POWER7-specific, power7-pmu.c Also, the POWER7 format uses bits 0-19 not 0-20. Changelog[v2]: [Jiri Osla] Use PMU_FORMAT_ATTR rather than duplicating code. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Tested-by: Michael Ellerman <michael@ellerman.id.au> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: benh@kernel.crashing.org Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130306054826.GA14627@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-31perf/POWER7: Make some POWER7 events available in sysfsSukadev Bhattiprolu1-0/+18
Make some POWER7-specific perf events available in sysfs. $ /bin/ls -1 /sys/bus/event_source/devices/cpu/events/ branch-instructions branch-misses cache-misses cache-references cpu-cycles instructions PM_BRU_FIN PM_BRU_MPRED PM_CMPLU_STALL PM_CYC PM_GCT_NOSLOT_CYC PM_INST_CMPL PM_LD_MISS_L1 PM_LD_REF_L1 stalled-cycles-backend stalled-cycles-frontend where the 'PM_*' events are POWER specific and the others are the generic events. This will enable users to specify these events with their symbolic names rather than with their raw code. perf stat -e 'cpu/PM_CYC' ... Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062528.GE13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-31perf/POWER7: Make generic event translations available in sysfsSukadev Bhattiprolu1-0/+34
Make the generic perf events in POWER7 available via sysfs. $ ls /sys/bus/event_source/devices/cpu/events branch-instructions branch-misses cache-misses cache-references cpu-cycles instructions stalled-cycles-backend stalled-cycles-frontend $ cat /sys/bus/event_source/devices/cpu/events/cache-misses event=0x400f0 This patch is based on commits that implement this functionality on x86. Eg: commit a47473939db20e3961b200eb00acf5fcf084d755 Author: Jiri Olsa <jolsa@redhat.com> Date: Wed Oct 10 14:53:11 2012 +0200 perf/x86: Make hardware event translations available in sysfs Changelog:[v2] [Jiri Osla] Drop EVENT_ID() macro since it is only used once. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062454.GD13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-31perf/Power7: Use macros to identify perf eventsSukadev Bhattiprolu1-8/+20
Define and use macros to identify perf events codes This would make it easier and more readable when these event codes need to be used in more than one place. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062353.GB13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-15powerpc/perf: Add missing L2 constraint handling in Power7 PMUMichael Ellerman1-3/+14
If we have two cache events that require different settings of the L2SEL bits in MMCR1 then we can not schedule those events simultaneously. Add logic to the constraint handling to express that. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-27powerpc/perf: Sample only if SIAR-Valid bit is set in P7+sukadev@linux.vnet.ibm.com1-0/+3
powerpc/perf: Sample only if SIAR-Valid bit is set in P7+ On POWER7+ two new bits (mmcra[35] and mmcra[36]) indicate whether the contents of SIAR and SDAR are valid. For marked instructions on P7+, we must save the contents of SIAR and SDAR registers only if these new bits are set. This code/check for the SIAR-Valid bit is specific to P7+, so rather than waste a CPU-feature bit use the PVR flag. Note that Carl Love proposed a similar change for oprofile: https://lkml.org/lkml/2012/6/22/309 Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23powerpc/perf: Move perf core & PMU code into a subdirectoryMichael Ellerman1-0/+379
The perf code has grown a lot since it started, and is big enough to warrant its own subdirectory. For reference it's ~60% bigger than the oprofile code. It declutters the kernel directory, makes it simpler to grep for "just perf stuff", and allows us to shorten some filenames. While we're at it, make it more obvious that we have two implementations of the core perf logic. One for (roughly) Book3S CPUs, which was the original implementation, and the other for Freescale embedded CPUs. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>