aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu/perf_event_intel_lbr.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-02-17perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.cBorislav Petkov1-1062/+0
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-06perf/x86/intel: Add perf core PMU support for Intel Knights LandingHarish Chegondi1-0/+14
Knights Landing core is based on Silvermont core with several differences. Like Silvermont, Knights Landing has 8 pairs of LBR MSRs. However, the LBR MSRs addresses match those of the Xeon cores' first 8 pairs of LBR MSRs Unlike Silvermont, Knights Landing supports hyperthreading. Knights Landing offcore response events config register mask is different from that of the Silvermont. This patch was developed based on a patch from Andi Kleen. For more details, please refer to the public document: https://software.intel.com/sites/default/files/managed/15/8d/IntelXeonPhi%E2%84%A2x200ProcessorPerformanceMonitoringReferenceManual_Volume1_Registers_v0%206.pdf Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Harish Chegondi <harish.chegondi@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/d14593c7311f78c93c9cf6b006be843777c5ad5c.1449517401.git.harish.chegondi@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-06perf/x86: Fix LBR related crashes on Intel AtomStephane Eranian1-4/+7
This patches fixes the LBR kernel crashes on Intel Atom. The kernel was assuming that if the CPU supports 64-bit format LBR, then it has an LBR_SELECT MSR. Atom uses 64-bit LBR format but does not have LBR_SELECT. That was causing NULL pointer dereferences in a couple of places. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Fixes: 96f3eda67fcf ("perf/x86/intel: Fix static checker warning in lbr enable") Link: http://lkml.kernel.org/r/1449182000-31524-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23perf/x86: Add option to disable reading branch flags/cyclesAndi Kleen1-2/+17
With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles is not free anymore, as it has moved to a separate MSR. For callstack mode we don't need any of this information; so we can avoid the unnecessary MSR read. Add flags to the perf interface where perf record can request not collecting this information. Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual for branch_sample_types to be negative (disable), not positive (enable), but since the legacy ABI reported the flags we need some form of explicit disabling to avoid breaking the ABI. After we have the flags the x86 perf code can keep track if any users need the flags. If noone needs it the information is not collected. This cuts down the cost of LBR callstack on Skylake significantly. Profiling a kernel build with LBR call stack the average run time of the PMI handler drops by 43%. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23perf/x86: Fix LBR call stack save/restoreAndi Kleen1-1/+3
This fixes a bug I added in the following commit: 90405aa02247 ("perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode") The bug could lead to lost LBR call stacks. When restoring the LBR state we need to use the TOS of the previous context, not the current context. To do that we need to save/restore the TOS. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1445366797-30894-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf/x86: Add support for PERF_SAMPLE_BRANCH_CALLStephane Eranian1-0/+4
This patch enables the suport for the PERF_SAMPLE_BRANCH_CALL for Intel x86 processors. When the processor support LBR filtering this the selection is done in hardware. Otherwise, the filter is applied by software. Note that we chose to include zero length calls because they also represent calls. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-18perf/x86/intel: Fix static checker warning in lbr enableKan Liang1-2/+3
Commit deb27519bf1f ("perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMI") leads to the following Smatch complaint: warn: variable dereferenced before check 'cpuc->lbr_sel' (see line 154) Fix the warning. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: deb27519bf1f ("perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMI") Link: http://lkml.kernel.org/r/1442240047-48149-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMIKan Liang1-3/+2
This patch fixes an issue which introduced by commit 1a78d93750bb5f61abdc59a91fc3bd06a214542a ("perf/x86/intel: Streamline LBR MSR handling in PMI"). The old patch not only avoids writing LBR_SELECT MSR in PMI, but also avoids updating lbr_select variable. So in PMI, FREEZE_LBRS_ON_PMI bit is always mistakenly set for IA32_DEBUGCTLMSR MSR, which causes superfluous increase/decrease of LBR_TOS when collecting LBR callstack. Reported-by: Milian Wolff <mail@milianw.de> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1439815051-8616-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-04perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack modeAndi Kleen1-3/+7
In callstack mode the LBR is not a ring buffer, but a stack that grows up and down. This means in this case we don't need to access all LBRs, only the ones up to TOS. Do this optimization for the normal LBR read, and the context switch save/restore code. For save/restore it can be done unconditionally, as it only runs when call stack mode is active. This recovers some of the cost of going to 32 LBRs on Skylake. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1432786398-23861-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-04perf/x86/intel/lbr: Use correct index to save/restore LBR_INFO with call stackAndi Kleen1-2/+2
Use the correct index to save/restore the LBR_INFO_x MSR in callstack mode. This is more a cleanup, as even with the wrong index the register was correctly saved/restored, and also LBR callgraph mode in perf tools do not really need anything in LBR_INFO. But still better to use the right index. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1432786398-23861-5-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-04perf/x86/intel: Add Intel Skylake PMU supportAndi Kleen1-0/+20
Add perf core PMU support for future Intel Skylake CPU cores. The code is based on Haswell/Broadwell. There is a new cache event list, based on the updated Haswell event list. Skylake has removed most counter constraints on basic events, so the basic constraints table now only has a single entry (plus the fixed counters). TSX support and various other setups are all shared with Haswell. Skylake has 32 LBR entries. Add a new LBR init function to set this up. The filters are all the same as Haswell. It also has a new LBR format with a separate LBR_INFO_* MSR, but that has been already added earlier. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1431285767-27027-7-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-04perf/x86/intel/lbr: Optimize v4 LBR unfreezingAndi Kleen1-0/+7
In Arch perfmon v4 the GLOBAL_STATUS reset automatically unfreezes LBRs. So no need to do it manually in the LBR code. Add a check to skip it. v2: Move test up to beginning of function. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1431285767-27027-9-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-04perf/x86/intel/lbr: Add support for LBRv5Andi Kleen1-1/+20
Add support for the new LBRv5 format used on Intel Skylake CPUs. The flags for mispredict, abort, in_tx etc. moved to range of separate LBR_INFO_* MSRs. Teach the LBR code to read those. The original LBR registers stay the same, except they have full sign extension now. LBR_INFO also reports a cycle count to the last branch. Report the cycle information using the new "cycles" branch_info output field. In addition we have to context switch and clear the new INFO MSRs to avoid any information leaks. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1431285767-27027-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07perf/x86/intel: Drain the PEBS buffer during context switchesYan, Zheng1-3/+0
Flush the PEBS buffer during context switches if PEBS interrupt threshold is larger than one. This allows perf to supply TID for sample outputs. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1430940834-8964-6-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07perf/x86/intel: add support for PERF_SAMPLE_BRANCH_IND_JUMPStephane Eranian1-1/+9
This patch enables support for branch sampling filter for indirect jumps (IND_JUMP). It enables LBR IND_JMP filtering where available. There is also software filtering support. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: dsahern@gmail.com Cc: jolsa@redhat.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1431637800-31061-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02perf/x86/intel: Avoid rewriting DEBUGCTL with the same value for LBRsAndi Kleen1-2/+4
perf with LBRs on has a tendency to rewrite the DEBUGCTL MSR with the same value. Add a little optimization to skip the unnecessary write. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1426871484-21285-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02perf/x86/intel: Streamline LBR MSR handling in PMIAndi Kleen1-4/+8
The perf PMI currently does unnecessary MSR accesses when LBRs are enabled. We use LBR freezing, or when in callstack mode force the LBRs to only filter on ring 3. So there is no need to disable the LBRs explicitely in the PMI handler. Also we always unnecessarily rewrite LBR_SELECT in the LBR handler, even though it can never change. 5) | /* write_msr: MSR_LBR_SELECT(1c8), value 0 */ 5) | /* read_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */ 5) | /* write_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */ 5) | /* write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 70000000f */ 5) | /* write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 */ 5) | /* write_msr: MSR_LBR_SELECT(1c8), value 0 */ 5) | /* read_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */ 5) | /* write_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */ This patch: - Avoids disabling already frozen LBRs unnecessarily in the PMI - Avoids changing LBR_SELECT in the PMI Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1426871484-21285-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Expose LBR callstack to user space toolingPeter Zijlstra1-4/+4
With LBR call stack feature enable, there are three callchain options. Enable the 3rd callchain option (LBR callstack) to user space tooling. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/20141105093759.GQ10501@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Discard zero length call entries in LBR call stackYan, Zheng1-2/+11
"Zero length call" uses the attribute of the call instruction to push the immediate instruction pointer on to the stack and then pops off that address into a register. This is accomplished without any matching return instruction. It confuses the hardware and make the recorded call stack incorrect. We can partially resolve this issue by: decode call instructions and discard any zero length call entry in the LBR stack. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1415156173-10035-16-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Disable FREEZE_LBRS_ON_PMI when LBR operates in callstack modeYan, Zheng1-4/+13
LBR callstack is designed for PEBS, It does not work well with FREEZE_LBRS_ON_PMI for non PEBS event. If FREEZE_LBRS_ON_PMI is set for non PEBS event, PMIs near call/return instructions may cause superfluous increase/decrease of LBR_TOS. This patch modifies __intel_pmu_lbr_enable() to not enable FREEZE_LBRS_ON_PMI when LBR operates in callstack mode. We currently don't use LBR callstack to capture kernel space callchain, so disabling FREEZE_LBRS_ON_PMI should not be a problem. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1415156173-10035-15-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Save/restore LBR stack during context switchYan, Zheng1-12/+76
When the LBR call stack is enabled, it is necessary to save/restore the LBR stack on context switch. The solution is saving/restoring the LBR stack to/from task's perf event context. The LBR stack is saved/restored only when there are events that use the LBR call stack. If no event uses LBR call stack, the LBR stack is reset when task is scheduled in. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1415156173-10035-10-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Track number of events that use the LBR callstackYan, Zheng1-0/+19
When enabling/disabling an event, check if the event uses the LBR callstack feature, adjust the LBR callstack usage count accordingly. Later patch will use the usage count to decide if LBR stack should be saved/restored. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1415156173-10035-9-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Add basic Haswell LBR call stack supportYan, Zheng1-22/+69
Haswell has a new feature that utilizes the existing LBR facility to record call chains. To enable this feature, bits (JCC, NEAR_IND_JMP, NEAR_REL_JMP, FAR_BRANCH, EN_CALLSTACK) in LBR_SELECT must be set to 1, bits (NEAR_REL_CALL, NEAR-IND_CALL, NEAR_RET) must be cleared. Due to a hardware bug of Haswell, this feature doesn't work well with FREEZE_LBRS_ON_PMI. When the call stack feature is enabled, the LBR stack will capture unfiltered call data normally, but as return instructions are executed, the last captured branch record is flushed from the on-chip registers in a last-in first-out (LIFO) manner. Thus, branch information relative to leaf functions will not be captured, while preserving the call stack information of the main line execution path. This patch defines a separate lbr_sel map for Haswell. The map contains a new entry for the call stack feature. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1415156173-10035-5-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Use context switch callback to flush LBR stackYan, Zheng1-0/+27
Previous commit introduces context switch callback, its function overlaps with the flush branch stack callback. So we can use the context switch callback to flush LBR stack. This patch adds code that uses the flush branch callback to flush the LBR stack when task is being scheduled in. The callback is enabled only when there are events use the LBR hardware. This patch also removes all old flush branch stack code. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1415156173-10035-4-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18perf/x86/intel: Reduce lbr_sel_map[] sizeYan, Zheng1-29/+25
The index of lbr_sel_map is bit value of perf branch_sample_type. PERF_SAMPLE_BRANCH_MAX is 1024 at present, so each lbr_sel_map uses 4096 bytes. By using bit shift as index, we can reduce lbr_sel_map size to 40 bytes. This patch defines 'bit shift' for branch types, and use 'bit shift' to define lbr_sel_maps. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: jolsa@redhat.com Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/1415156173-10035-2-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-18x86: Remove arbitrary instruction size limit in instruction decoderDave Hansen1-6/+19
The current x86 instruction decoder steps along through the instruction stream but always ensures that it never steps farther than the largest possible instruction size (MAX_INSN_SIZE). The MPX code is now going to be doing some decoding of userspace instructions. We copy those from userspace in to the kernel and they're obviously completely untrusted coming from userspace. In addition to the constraint that instructions can only be so long, we also have to be aware of how long the buffer is that came in from userspace. This _looks_ to be similar to what the perf and kprobes is doing, but it's unclear to me whether they are affected. The whole reason we need this is that it is perfectly valid to be executing an instruction within MAX_INSN_SIZE bytes of an unreadable page. We should be able to gracefully handle short reads in those cases. This adds support to the decoder to record how long the buffer being decoded is and to refuse to "validate" the instruction if we would have gone over the end of the buffer to decode it. The kprobes code probably needs to be looked at here a bit more carefully. This patch still respects the MAX_INSN_SIZE limit there but the kprobes code does look like it might be able to be a bit more strict than it currently is. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: x86@kernel.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20141114153957.E6B01535@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-10-15Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds1-6/+6
Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_*() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...
2014-09-09perf/x86/intel: Mark initialization code as suchMathias Krause1-4/+4
A few of the initialization functions are missing the __init annotation. Fix this and thereby allow ~680 additional bytes of code to be released after initialization. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1409071785-26015-1-git-send-email-minipli@googlemail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-08-26x86: Replace __get_cpu_var usesChristoph Lameter1-6/+6
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-06-05perf/x86: Add conditional branch filtering supportAnshuman Khandual1-0/+5
This patch adds conditional branch filtering support, enabling it for PERF_SAMPLE_BRANCH_COND in perf branch stack sampling framework by utilizing an available software filter X86_BR_JCC. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mpe@ellerman.id.au Cc: benh@kernel.crashing.org Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1400743210-32289-3-git-send-email-khandual@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Fix arch_perf_out_copy_user defaultPeter Zijlstra1-1/+1
The arch_perf_output_copy_user() default of __copy_from_user_inatomic() returns bytes not copied, while all other argument functions given DEFINE_OUTPUT_COPY() return bytes copied. Since copy_from_user_nmi() is the odd duck out by returning bytes copied where all other *copy_{to,from}* functions return bytes not copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes not copied. Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied while expecting its worker functions to return bytes copied. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: will.deacon@arm.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04perf/x86: Suppress duplicated abort LBR recordsAndi Kleen1-8/+21
Haswell always give an extra LBR record after every TSX abort. Suppress the extra record. This only works when the abort is visible in the LBR If the original abort has already left the 16 LBR entries the extra entry will will stay. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-7-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19perf/x86/intel: Support Haswell/v4 LBR formatAndi Kleen1-5/+51
Haswell has two additional LBR from flags for TSX: in_tx and abort_tx, implemented as a new "v4" version of the LBR format. Handle those in and adjust the sign extension code to still correctly extend. The flags are exported similarly in the LBR record to the existing misprediction flag Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28perf/x86: Check branch sampling priv level in generic codeStephane Eranian1-10/+3
This patch moves commit 7cc23cd to the generic code: perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL The check is now implemented in generic code instead of x86 specific code. That way we do not have to repeat the test in each arch supporting branch sampling. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20130521105337.GA2879@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-05perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNELPeter Zijlstra1-3/+10
We should always have proper privileges when requesting kernel data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20130503121256.230745028@chello.nl [ Fix build error reported by fengguang.wu@intel.com, propagate error code back. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/n/tip-v0x9ky3ahzr6nm3c6ilwrili@git.kernel.org
2013-05-04perf/x86/intel/lbr: Fix LBR filterPeter Zijlstra1-2/+12
The LBR 'from' adddress is under full userspace control; ensure we validate it before reading from it. Note: is_module_text_address() can potentially be quite expensive; for those running into that with high overhead in modules optimize it using an RCU backed rb-tree. Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20130503121256.158211806@chello.nl Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/n/tip-mk8i82ffzax01cnqo829iy1q@git.kernel.org
2012-09-04perf/x86: Enable Intel Cedarview Atom suppportStephane Eranian1-1/+2
This patch enables perf_events support for Intel Cedarview Atom (model 54) processors. Support includes PEBS and LBR. Tested on my Atom N2600 netbook. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120820092421.GA11284@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-05perf/x86: Add LBR software filter support for Intel CPUsStephane Eranian1-22/+310
This patch adds an internal sofware filter to complement the (optional) LBR hardware filter. The software filter is necessary: - as a substitute when there is no HW LBR filter (e.g., Atom, Core) - to complement HW LBR filter in case of errata (e.g., Nehalem/Westmere) - to provide finer grain filtering (e.g., all processors) Sometimes the LBR HW filter cannot distinguish between two types of branches. For instance, to capture syscall as CALLS, it is necessary to enable the LBR_FAR filter which will also capture JMP instructions. Thus, a second pass is necessary to filter those out, this is what the SW filter can do. The SW filter is built on top of the internal x86 disassembler. It is a best effort filter especially for user level code. It is subject to the availability of the text page of the program. The SW filter is enabled on all Intel processors. It is bypassed when the user is capturing all branches at all priv levels. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-9-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-05perf/x86: Implement PERF_SAMPLE_BRANCH for Intel CPUsStephane Eranian1-2/+84
This patch implements PERF_SAMPLE_BRANCH support for Intel x86processors. It connects PERF_SAMPLE_BRANCH to the actual LBR. The patch adds the hooks in the PMU irq handler to save the LBR on counter overflow for both regular and PEBS modes. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-8-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-05perf/x86: Disable LBR support for older Intel Atom processorsStephane Eranian1-0/+10
The patch adds a restriction for Intel Atom LBR support. Only steppings 10 (PineView) and more recent are supported. Older models do not have a functional LBR. Their LBR does not freeze on PMU interrupt which makes LBR unusable in the context of perf_events. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-7-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-05perf/x86: Add Intel LBR mappings for PERF_SAMPLE_BRANCH filtersStephane Eranian1-2/+101
This patch adds the mappings from the generic PERF_SAMPLE_BRANCH_* filters to the actual Intel x86LBR filters, whenever they exist. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-6-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-05perf/x86: Add Intel LBR MSR definitionsStephane Eranian1-9/+9
This patch adds the LBR definitions for NHM/WSM/SNB and Core. It also adds the definitions for the architected LBR MSR: LBR_SELECT, LBRT_TOS. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-05perf: Add generic taken branch sampling supportStephane Eranian1-8/+13
This patch adds the ability to sample taken branches to the perf_event interface. The ability to capture taken branches is very useful for all sorts of analysis. For instance, basic block profiling, call counts, statistical call graph. This new capability requires hardware assist and as such may not be available on all HW platforms. On Intel x86 it is implemented on top of the Last Branch Record (LBR) facility. To enable taken branches sampling, the PERF_SAMPLE_BRANCH_STACK bit must be set in attr->sample_type. Sampled taken branches may be filtered by type and/or priv levels. The patch adds a new field, called branch_sample_type, to the perf_event_attr structure. It contains a bitmask of filters to apply to the sampled taken branches. Filters may be implemented in HW. If the HW filter does not exist or is not good enough, some arch may also implement a SW filter. The following generic filters are currently defined: - PERF_SAMPLE_USER only branches whose targets are at the user level - PERF_SAMPLE_KERNEL only branches whose targets are at the kernel level - PERF_SAMPLE_HV only branches whose targets are at the hypervisor level - PERF_SAMPLE_ANY any type of branches (subject to priv levels filters) - PERF_SAMPLE_ANY_CALL any call branches (may incl. syscall on some arch) - PERF_SAMPLE_ANY_RET any return branches (may incl. syscall returns on some arch) - PERF_SAMPLE_IND_CALL indirect call branches Obviously filter may be combined. The priv level bits are optional. If not provided, the priv level of the associated event are used. It is possible to collect branches at a priv level different from the associated event. Use of kernel, hv priv levels is subject to permissions and availability (hv). The number of taken branch records present in each sample may vary based on HW, the type of sampled branches, the executed code. Therefore each sample contains the number of taken branches it contains. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-03perf: Remove deprecated WARN_ON_ONCE()Stephane Eranian1-2/+0
With the new throttling/unthrottling code introduced with commit: e050e3f0a71b ("perf: Fix broken interrupt rate throttling") we occasionally hit two WARN_ON_ONCE() checks in: - intel_pmu_pebs_enable() - intel_pmu_lbr_enable() - x86_pmu_start() The assertions are no longer problematic. There is a valid path where they can trigger but it is harmless. The assertion can be triggered with: $ perf record -e instructions:pp .... Leading to paths: intel_pmu_pebs_enable intel_pmu_enable_event x86_perf_event_set_period x86_pmu_start perf_adjust_freq_unthr_context perf_event_task_tick scheduler_tick And: intel_pmu_lbr_enable intel_pmu_enable_event x86_perf_event_set_period x86_pmu_start perf_adjust_freq_unthr_context. perf_event_task_tick scheduler_tick cpuc->enabled is always on because when we get to perf_adjust_freq_unthr_context() the PMU is not totally disabled. Furthermore when we need to adjust a period, we only stop the event we need to change and not the entire PMU. Thus, when we re-enable, cpuc->enabled is already set. Note that when we stop the event, both pebs and lbr are stopped if necessary (and possible). Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-26x86, perf: Clean up perf_event cpu codeKevin Winchester1-12/+16
The CPU support for perf events on x86 was implemented via included C files with #ifdefs. Clean this up by creating a new header file and compiling the vendor-specific files as needed. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1314747665-2090-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26perf, x86: Clean up debugctlmsr bit definitionsPeter Zijlstra1-5/+2
Move all debugctlmsr thingies into msr-index.h Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100325135413.861425293@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Fix LBR read-outPeter Zijlstra1-2/+2
Don't decrement the TOS twice... Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Don't reset the LBR as frequentlyPeter Zijlstra1-5/+4
If we reset the LBR on each first counter, simple counter rotation which first deschedules all counters and then reschedules the new ones will lead to LBR reset, even though we're still in the same task context. Reduce this by not flushing on the first counter but only flushing on different task contexts. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Fix LBR enable/disable vs cpuc->enabledPeter Zijlstra1-3/+4
We should never call ->enable with the pmu enabled, and we _can_ have ->disable called with the pmu enabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10perf, x86: Clear the LBRs on initPeter Zijlstra1-0/+3
Some CPUs have errata where the LBR is not cleared on Power-On. So always clear the LBRs before use. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154128.966563424@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>