aboutsummaryrefslogtreecommitdiffstats
path: root/tools (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2011-07-29cpupowerutils - cpufrequtils extended with quite some featuresDominik Brodowski72-0/+14417
CPU power consumption vs performance tuning is no longer limited to CPU frequency switching anymore: deep sleep states, traditional dynamic frequency scaling and hidden turbo/boost frequencies are tied close together and depend on each other. The first two exist on different architectures like PPC, Itanium and ARM, the latter (so far) only on X86. On X86 the APU (CPU+GPU) will only run most efficiently if CPU and GPU has proper power management in place. Users and Developers want to have *one* tool to get an overview what their system supports and to monitor and debug CPU power management in detail. The tool should compile and work on as many architectures as possible. Once this tool stabilizes a bit, it is intended to replace the Intel-specific tools in tools/power/x86 Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2011-06-16tools/perf: Fix static build of perf toolMathias Krause1-1/+1
To build a statically linked version of the perf tool all needed libraries must be added in the correct order to get the symbols resolved. Currently this is broken when, e.g. python or newt support is enabled -- libpython needs libpthread which is an unconditional link dependency of the perf tool; libslang needs libm, another unconditional dependency. To solve the problem in the long run without the need to keep track of transitive library dependencies, simply make the linker look at the EXTLIBS multiple times until it has all symbols resolved. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/1308171818-20370-1-git-send-email-minipli@googlemail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-15perf: clear out make flags when calling kernel make kernelverAndy Whitcroft1-1/+1
When generating the perf version from the kernel version using 'make kernelver' it is necessary to clear out any MAKEFLAGS otherwise they may trigger additional output which pollute the contents. Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-14rcu: Use softirq to address performance regressionShaohua Li1-0/+1
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread) introduced performance regression. In an AIM7 test, this commit degraded performance by about 40%. The commit runs rcu callbacks in a kthread instead of softirq. We observed high rate of context switch which is caused by this. Out test system has 64 CPUs and HZ is 1000, so we saw more than 64k context switch per second which is caused by RCU's per-CPU kthread. A trace showed that most of the time the RCU per-CPU kthread doesn't actually handle any callbacks, but instead just does a very small amount of work handling grace periods. This means that RCU's per-CPU kthreads are making the scheduler do quite a bit of work in order to allow a very small amount of RCU-related processing to be done. Alex Shi's analysis determined that this slowdown is due to lock contention within the scheduler. Unfortunately, as Peter Zijlstra points out, the scheduler's real-time semantics require global action, which means that this contention is inherent in real-time scheduling. (Yes, perhaps someone will come up with a workaround -- otherwise, -rt is not going to do well on large SMP systems -- but this patch will work around this issue in the meantime. And "the meantime" might well be forever.) This patch therefore re-introduces softirq processing to RCU, but only for core RCU work. RCU callbacks are still executed in kthread context, so that only a small amount of RCU work runs in softirq context in the common case. This should minimize ksoftirqd execution, allowing us to skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Tested-by: "Alex,Shi" <alex.shi@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-06-09perf: Use make kernelversion instead of parsing the MakefileMichal Marek1-6/+1
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-03perf python: Fix argument name list of read_on_cpu()Frederic Weisbecker1-1/+1
Mandatory arguments need to be present in the argument name list, as well as optional arguments, otherwise python barfs: # ./python/twatch.py Traceback (most recent call last): File "./python/twatch.py", line 41, in <module> main() File "./python/twatch.py", line 32, in main event = evlist.read_on_cpu(cpu) RuntimeError: more argument specifiers than keyword list entries Hence, add cpu to the name list. Cc: David Ahern <daahern@cisco.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03perf evlist: Don't die if sample_{id_all|type} is invalidArnaldo Carvalho de Melo9-44/+74
Fixes two more cases where the python binding would not load: . Not finding die(), which it shouldn't anyway, not good to just stop the world because some particular perf.data file is invalid, just propagate the error to the caller. . Not finding perf_sample_size: fix it by moving it from event.c to evsel, where it belongs, as most cases are moving to operate on an evsel object.o One of the fixed problems: [root@emilia ~]# python >>> import perf Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size >>> [root@emilia ~]# Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03perf python: Use exception to propagate errorsArnaldo Carvalho de Melo1-6/+4
We were using pr_debug to tell the user about not being able to parse a sample where we should really use the python way of reporting errors: exceptions. Fixes this problem: [root@emilia ~]# python >>> import perf Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf >>> [root@emilia ~] As we want to keep the objects linked in the python binding (and in the future in a shared library) minimal. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03perf evlist: Remove dependency on debug routinesArnaldo Carvalho de Melo1-9/+4
So far we avoided having to link debug.o in the python binding, keep it that way by not using ui__warning() in evlist.c. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02ktest: Ignore unset values of the minconfig in config_bisectSteven Rostedt1-1/+1
By ignoring the unset values of the minconfig in deciding what to test in the config_bisect can cause the problem config from being tested too. Just do not test the configs that are set in the minconfig. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02ktest: Fix result of rebooting the kernelSteven Rostedt1-1/+1
The command that is called that reboots the kernel may fail but the return code is not passed back to the ktest.pl script. This is because a ';' is used between the two commands and if the second command fails, only the first command's return code is returned. Using a '&&' between the two commands fixes this. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02ktest: Fix off-by-one in config bisect resultSteven Rostedt1-2/+2
Because in perl the array size returned by $#arr, is the last index and not the actually size of the array, we end the config bisect early, thinking there is only one config left when there are in fact two. Thus the result has a 50% chance of picking the correct config that caused the problem. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-30virtio_test: support event indexMichael S. Tsirkin1-2/+17
Add ability to test the new event idx feature, enable by default. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-27perf top: Don't stop if no kernel symtab is foundArnaldo Carvalho de Melo1-5/+15
We now just warn the user about the fact and go on providing just userspace samples. This fixes a problem when no vmlinux is explicetely passed by the user, thus symbol_conf.vmlinux_name is NULL, no suitable vmlinux is found, and then we get: aldebaran:~> perf top -p 7557 [kernel.kallsyms] with build id 44d9a989eabbd79e486bc079d6b743d397c204e0 not found, continuing without symbols The (null) file can't be used Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-cj2g81hn64wv2bipmqk4fy2m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27perf top: Handle kptr_restrictArnaldo Carvalho de Melo1-0/+15
Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-cyl5zmi1nu35vyu7l5im2pyv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27perf top: Remove unused macroArnaldo Carvalho de Melo1-2/+0
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-weqbs0tkk2u0qp1xxdxxosfg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27perf events: initialize fd array to -1 instead of 0David Ahern1-0/+10
perf_evsel__alloc_fd allocates an array of file descriptors with the memory initialized to 0. The array has dimensions for cpus and threads. Later, __perf_evsel__open calls sys_perf_event_open for each cpu and thread dimensions. If the open fails for any of the cpus or threads then the fd's for this event are closed and the fd entry in the array is set to -1. Now, if the first attempt fails for the event (e.g., the event is not supported) the remaining dimensions (cpu > 0 and thread > 0) are not touched and left at the initialized value of 0. builtin-stat catches ENOENT and ENOSYS failures and allows the command to continue. The end result is that stat attempts to read from an fd of 0 which of course is stdin and so the command hangs until you type ctrl-D. Resolve by initializing the array to -1 since an fd < 0 is already handled. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1306511914-8016-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27perf tools: Make sure kptr_restrict warnings fit 80 col termsArnaldo Carvalho de Melo2-21/+15
Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26perf tools: Fix build on older systemsArnaldo Carvalho de Melo2-0/+3
Where /usr/include/linux/const.h is not present, e.g. RHEL5. Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-ypcw2mu0w7dl1rrc6ncz3pee@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26perf symbols: Handle /proc/sys/kernel/kptr_restrictArnaldo Carvalho de Melo6-6/+107
Perf uses /proc/modules to figure out where kernel modules are loaded. With the advent of kptr_restrict, non root users get zeroes for all module start addresses. So check if kptr_restrict is non zero and don't generate the syntethic PERF_RECORD_MMAP events for them. Warn the user about it in perf record and in perf report. In perf report the reference relocation symbol being zero means that kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't use it to fixup symbol addresses when using a valid kallsyms (in the buildid cache) or vmlinux (in the vmlinux path) build-id located automatically or specified by the user. Provide an explanation about it in 'perf report' if kernel samples were taken, checking if a suitable vmlinux or kallsyms was found/specified. Restricted /proc/kallsyms don't go to the buildid cache anymore. Example: [acme@emilia ~]$ perf record -F 100000 sleep 1 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ] [acme@emilia ~]$ [acme@emilia ~]$ perf report --stdio Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. If some relocation was applied (e.g. kexec) symbols may be misresolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ..................... # 20.24% sleep [kernel.kallsyms] [k] page_fault 20.04% sleep [kernel.kallsyms] [k] filemap_fault 19.78% sleep [kernel.kallsyms] [k] __lru_cache_add 19.69% sleep ld-2.12.so [.] memcpy 14.71% sleep [kernel.kallsyms] [k] dput 4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers 0.73% sleep [kernel.kallsyms] [k] perf_event_comm 0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ This is because it found a suitable vmlinux (build-id checked) in /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long file name). If we remove that file from the vmlinux path: [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \ /lib/modules/2.6.39-rc7+/build/vmlinux.OFF [acme@emilia ~]$ perf report --stdio [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562 not found, continuing without symbols Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. As no suitable kallsyms nor vmlinux was found, kernel samples can't be resolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ...... # 80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a 19.69% sleep ld-2.12.so [.] memcpy # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ Reported-by: Stephane Eranian <eranian@google.com> Suggested-by: David Miller <davem@davemloft.net> Cc: Dave Jones <davej@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26perf: Remove duplicate headersJesper Juhl2-3/+0
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: trivial@kernel.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1105261011290.17400@swampdragon.chaosbits.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-24perf tools: Fix sample type size calculation in 32 bits archsFrederic Weisbecker1-1/+1
The shift used here to count the number of bits set in the mask doesn't work above the low part for archs that are not 64 bits. Fix the constant used for the shift. This fixes a 32-bit perf top failure reported by Eric Dumazet: Can't parse sample, err = -14 Can't parse sample, err = -14 ... Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com Link: http://lkml.kernel.org/r/1306200686-17317-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23perf tools: Fix sample size bit operationsFrederic Weisbecker1-16/+16
What we want is to count the number of bits in the mask, not some other random operation written in the middle of the night. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1306148788-6179-2-git-send-email-fweisbec@gmail.com [ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23perf tools: Fix ommitted mmap data update on remapFrederic Weisbecker1-13/+26
Commit eac9eacee16 "perf tools: Check we are able to read the event size on mmap" brought a check to ensure we can read the size of the event before dereferencing it, and do a remap otherwise to move the buffer forward. However that remap was ommitting all the necessary work to update the new page offset, head, and to unmap previous pages, etc... To fix this, gather all the code that fetches the event in a seperate helper which does all the necessary checks about the header/event size and tells us anytime a remap is needed. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1306148788-6179-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-22perf tools: Propagate event parse error handlingFrederic Weisbecker4-11/+33
Better handle event parsing error by propagating the details in upper layers or by dumping some failure message. So that the user knows he has some crazy events in the batch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
2011-05-22perf tools: Robustify dynamic sample content fetchFrederic Weisbecker1-0/+26
Ensure the size of the dynamic fields such as callchains or raw events don't overlap the whole event boundaries. This prevents from dereferencing junk if the given size of the callchain goes too eager. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
2011-05-22perf tools: Pre-check sample size before parsingFrederic Weisbecker7-5/+41
Check that the total size of the sample fields having a fixed size do not exceed the one of the whole event. This robustifies the sample parsing. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
2011-05-22perf tools: Move evlist sample helpers to evlist areaFrederic Weisbecker4-33/+34
These APIs should belong to evlist.c as they may not be exclusively tied to the headers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com
2011-05-22perf tools: Remove junk code in mmap size handlingFrederic Weisbecker1-3/+0
size is overriden later and used only then. Those lines are only junk, probably a leftover. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
2011-05-22perf tools: Check we are able to read the event size on mmapFrederic Weisbecker1-0/+7
Check we have enough mmaped space to read the current event size from its headers, otherwise we may dereference some hell there. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
2011-05-20sanitize <linux/prefetch.h> usageLinus Torvalds1-1/+1
Commit e66eed651fd1 ("list: remove prefetching from regular list iterators") removed the include of prefetch.h from list.h, which uncovered several cases that had apparently relied on that rather obscure header file dependency. So this fixes things up a bit, using grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]') grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]') to guide us in finding files that either need <linux/prefetch.h> inclusion, or have it despite not needing it. There are more of them around (mostly network drivers), but this gets many core ones. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-20ktest: Allow options to be used by other optionsSteven Rostedt2-1/+97
There are cases where one ktest option may be used within another ktest option. Allow them to be reused just like config variables but there are evaluated at time of test not config processing time. Thus having something like: MAKE_CMD = make ARCH=${ARCH} TEST_START ARCH = powerpc TEST_START ARCH = arm Will have the arch defined for each test iteration. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20ktest: Create variables for the ktest config filesSteven Rostedt2-0/+121
I found that I constantly reuse information for each test case. It would be nice to just define a variable to reuse. For example I may have: TEST_START [...] TEST = ssh root@mybox /path/to/my/script TEST_START [...] TEST = ssh root@mybox /path/to/my/script [etc] The issue is, I may wont to change that script or one of the other fields. Then I need to update each line individually. With the addition of config variables (variables only used during parsing the config) we can simplify the config files. These variables can also be defined multiple times and each time the new value will overwrite the old value. The convention to use a config variable over a ktest option is to use := instead of =. Now we could do: USER := root TARGET := mybox TEST_SCRIPT := /path/to/my/script TEST_CASE := ${USER}@${TARGET} ${TEST_SCRIPT} TEST_START [...] TEST = ${TEST_CASE} TEST_START [...] TEST = ${TEST_CASE} [etc] Now we just need to update the variables at the top. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20ktest: Reboot after each patchcheck runSteven Rostedt2-0/+17
The patches being checked may not leave the kernel in a state that the next run will allow the new kernel to be copied to the machine. Reboot to a known good kernel before continuing to the next kernel to test. Added option PATCHCHECK_SLEEP_TIME for the max time to sleep between patchcheck reboots. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20ktest: Reboot to good kernel after every bisect runSteven Rostedt1-5/+5
Reboot after each bisect run regardless if the bisect passed or failed. The test may just be to boot the kernel and that kernel may not have a way to copy the next kerne to it. Reboot to a known good kernel after each bisect run. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20ktest: If test failed due to timeout, print thatSteven Rostedt1-0/+1
If the test failed due to timeout for boot, print a message saying so. Otherwise the user will be confused to why their test just failed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20ktest: Fix post install commandSteven Rostedt1-1/+1
The command to run post install (for those that want initrds) was broken. Instead of doing a substitution for the $KERNEL_VERSION variable. It was replacing the entire command with nothing. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-19perf stat: Add more cache-miss percentage printoutsIngo Molnar1-2/+136
Print out the cache-miss percentage as well if the cache refs were collected, for all the generic cache event types. Before: 11,103,723,230 dTLB-loads # 622.471 M/sec ( +- 0.30% ) 87,065,337 dTLB-load-misses # 4.881 M/sec ( +- 0.90% ) After: 11,353,713,242 dTLB-loads # 626.020 M/sec ( +- 0.35% ) 113,393,472 dTLB-load-misses # 1.00% of all dTLB cache hits ( +- 0.49% ) Also ASCII color highlight too high percentages, them when it's executed on the console. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-19perf stat: Add -d -d and -d -d -d options to show more CPU eventsIngo Molnar1-55/+154
Print even more detailed statistics if requested via perf stat -d: -d: detailed events, L1 and LLC data cache -d -d: more detailed events, dTLB and iTLB events -d -d -d: very detailed events, adding prefetch events Full output looks like this now: Performance counter stats for '/home/mingo/hackbench 10' (5 runs): 1703.674707 task-clock # 8.709 CPUs utilized ( +- 4.19% ) 49,068 context-switches # 0.029 M/sec ( +- 16.66% ) 8,303 CPU-migrations # 0.005 M/sec ( +- 24.90% ) 17,397 page-faults # 0.010 M/sec ( +- 0.46% ) 2,345,389,239 cycles # 1.377 GHz ( +- 4.61% ) [55.90%] 1,884,503,527 stalled-cycles-frontend # 80.35% frontend cycles idle ( +- 5.67% ) [50.39%] 743,919,737 stalled-cycles-backend # 31.72% backend cycles idle ( +- 8.75% ) [49.91%] 1,314,416,379 instructions # 0.56 insns per cycle # 1.43 stalled cycles per insn ( +- 2.53% ) [60.87%] 272,592,567 branches # 160.003 M/sec ( +- 1.74% ) [56.56%] 3,794,846 branch-misses # 1.39% of all branches ( +- 6.59% ) [58.50%] 449,982,778 L1-dcache-loads # 264.125 M/sec ( +- 2.47% ) [49.88%] 22,404,961 L1-dcache-load-misses # 4.98% of all L1-dcache hits ( +- 6.08% ) [55.05%] 6,204,750 LLC-loads # 3.642 M/sec ( +- 8.91% ) [43.75%] 1,837,411 LLC-load-misses # 1.078 M/sec ( +- 7.27% ) [12.07%] 411,440,421 L1-icache-loads # 241.502 M/sec ( +- 5.60% ) [36.52%] 27,556,832 L1-icache-load-misses # 16.175 M/sec ( +- 7.46% ) [46.72%] 464,067,627 dTLB-loads # 272.392 M/sec ( +- 4.46% ) [54.17%] 10,765,648 dTLB-load-misses # 6.319 M/sec ( +- 3.18% ) [48.68%] 1,273,080,386 iTLB-loads # 747.256 M/sec ( +- 3.38% ) [47.53%] 117,481 iTLB-load-misses # 0.069 M/sec ( +- 14.99% ) [47.01%] 4,590,653 L1-dcache-prefetches # 2.695 M/sec ( +- 4.49% ) [46.19%] 1,712,660 L1-dcache-prefetch-misses # 1.005 M/sec ( +- 3.75% ) [44.82%] 0.195622057 seconds time elapsed ( +- 6.84% ) Also clean up the attribute construction code to be appending, and factor it out into add_default_attributes(). Tweak the coverage percentage printout a bit, so that it's easier to view it alongside the +- sttddev colum. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-18perf bench, x86: Add alternatives-asm.h wrapperIngo Molnar1-0/+8
perf bench needs this to build the kernel's memcpy routine: In file included from bench/mem-memcpy-x86-64-asm.S:2:0: bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-17perf: Fix multi-event parsing bugStephane Eranian1-0/+3
This patch fixes an issue with event parsing. The following commit appears to have broken the ability to specify a comma separated list of events: commit ceb53fbf6dbb1df26d38379a262c6981fe73dd36 Author: Ingo Molnar <mingo@elte.hu> Date: Wed Apr 27 04:06:33 2011 +0200 perf stat: Fail more clearly when an invalid modifier is specified This patch fixes this while preserving the desired effect: $ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null Performance counter stats for 'ls /dev/null': 365956 instructions:u # 0.00 insns per cycle 731806 instructions:k # 0.00 insns per cycle 0.001108862 seconds time elapsed $ perf stat -e task-clock-msecs true invalid event modifier: '-msecs' Run 'perf list' for a list of valid events and modifiers Signed-off-by: Stephane Eranian <eranian@google.com> Cc: acme@redhat.com Cc: peterz@infradead.org Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-15perf evlist: Fix per thread mmap setupArnaldo Carvalho de Melo6-53/+115
The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using --pid when monitoring multithreaded apps, as we can only share a ring buffer for events on the same thread if not doing per cpu. Fix it by using per thread ring buffers. Tested with: [root@felicio ~]# tuna -t 26131 -CP | nl 1 thread ctxt_switches 2 pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 3 26131 OTHER 0 0,1 10814276 2397830 chromium-browse 4 642 OTHER 0 0,1 14688 0 chromium-browse 5 26148 OTHER 0 0,1 713602 115479 chromium-browse 6 26149 OTHER 0 0,1 801958 2262 chromium-browse 7 26150 OTHER 0 0,1 1271128 248 chromium-browse 8 26151 OTHER 0 0,1 3 0 chromium-browse 9 27049 OTHER 0 0,1 36796 9 chromium-browse 10 618 OTHER 0 0,1 14711 0 chromium-browse 11 661 OTHER 0 0,1 14593 0 chromium-browse 12 29048 OTHER 0 0,1 28125 0 chromium-browse 13 26143 OTHER 0 0,1 2202789 781 chromium-browse [root@felicio ~]# So 11 threads under pid 26131, then: [root@felicio ~]# perf record -F 50000 --pid 26131 [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# 11 mmaps, one per thread since we didn't specify any CPU list, so we need one mmap per thread and: [root@felicio ~]# perf record -F 50000 --pid 26131 ^M ^C[ perf record: Woken up 79 times to write data ] [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 371310 26131 2 96516 26148 3 95694 26149 4 95203 26150 5 7291 26143 6 87 27049 7 76 661 8 60 29048 9 47 618 10 43 642 [root@felicio ~]# Ok, one of the threads, 26151 was quiescent, so no samples there, but all the others are there. Then, if I specify one CPU: [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 8444 26131 2 2584 26149 3 2518 26148 4 2324 26150 5 123 26143 6 9 661 7 9 29048 [root@felicio ~]# This machine has two cores, so fewer threads appeared on the radar, and: [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# Just one mmap, as now we can use just one per-cpu buffer instead of the per-thread needed in the previous case. For global profiling: [root@felicio ~]# perf record -F 50000 -a ^C[ perf record: Woken up 26 times to write data ] [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ] [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 2 7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# It uses per-cpu buffers. For just one thread: [root@felicio ~]# perf record -F 50000 --tid 26148 ^C[ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 9969 26148 [root@felicio ~]# [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Lin Ming <ming.m.lin@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-15perf tools: Honour the cpu list parameter when also monitoring a thread listArnaldo Carvalho de Melo1-1/+1
The perf_evlist__create_maps was discarding the --cpu parameter when a --pid or --tid was specified, fix that. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-10perf probe: Fix the missed parameter initializationLin Ming1-0/+1
pubname_callback_param::found should be initialized to 0 in fastpath lookup, the structure is on the stack and uninitialized otherwise. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.cJesper Juhl1-1/+0
Including "../../annotate.h" once in tools/perf/util/ui/browsers/annotate.c is enough. No need to do it twice. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-07perf tools: Makefile: Use gcc to determine ARCHLin Ming1-6/+10
The original Makefile uses "uname -m" to determine ARCH. This causes problem on x86 when compile perf tool on 32 bit userspace with a 64 bit kernel. bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages: bench/../../../arch/x86/lib/memcpy_64.S:28: Error: bad register name `%rdi' This is because "uname -m" returns x86_64 and memcpy_64.S is included in 32 bit build. Reported-by: Riccardo Magliocchetti <riccardo.magliocchetti@gmail.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Link: http://lkml.kernel.org/r/1304743274.3132.17.camel@localhost Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-05rcu: move TREE_RCU from softirq to kthreadPaul E. McKenney1-1/+0
If RCU priority boosting is to be meaningful, callback invocation must be boosted in addition to preempted RCU readers. Otherwise, in presence of CPU real-time threads, the grace period ends, but the callbacks don't get invoked. If the callbacks don't get invoked, the associated memory doesn't get freed, so the system is still subject to OOM. But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit moves the callback invocations to a kthread, which can be boosted easily. Also add comments and properly synchronized all accesses to rcu_cpu_kthread_task, as suggested by Lai Jiangshan. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-04-30perf stat: Tell user about unsupported events in the listDavid Ahern1-1/+5
Similar to perf-record, tell user about unsupported events that will not be counted if invoked in verbose mode. e.g., $ perf stat -e dTLB-prefetch-misses -v -- sleep 1 dTLB-prefetch-misses event is not supported by the kernel. dTLB-prefetch-misses: 0 0 0 Performance counter stats for 'sleep 1': <not counted> dTLB-prefetch-misses 1.001884783 seconds time elapsed Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1304114655-10600-1-git-send-email-dsahern@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf list: Fix max event string sizeIngo Molnar1-12/+13
Recent stalled-cycles event names were larger than the 40 chars printout used by perf list. Extend that, make it robust for future extensions and also adjust alignments in face of wider event names. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf stat: Fail softly on unsupported eventsIngo Molnar1-3/+1
David Ahern reported this perf stat failure: > # /tmp/build-perf/perf stat -- sleep 1 > Error: stalled-cycles-frontend event is not supported. > Fatal: Not all events could be opened. > > This is a Dell R410 with an E5620 processor. Fail in a softer fashion on unknown/unsupported events. Reported-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n006io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>