aboutsummaryrefslogtreecommitdiffstats
path: root/tools (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-05-09perf tools: Support reading from backward ring bufferWang Nan2-0/+54
perf_evlist__mmap_read_backward() is introduced for reading backward ring buffer. Since direction for reading such ring buffer is different from the direction kernel writing to it, and since user need to fetch most recent record from it, a perf_evlist__mmap_read_catchup() is introduced to move the reading pointer to the end of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf script: Fix incorrect python db-export error messageChris Phlipot1-1/+1
Fix the error message printed when attempting and failing to create the call path root incorrectly references the call return process. This change fixes the message to properly reference the failure to create the call path root. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf stat: Scale values by unit before metricsAndi Kleen1-1/+3
Scale values by unit before passing them to the metrics printing functions. This is needed for TopDown, because it needs to scale the slots correctly by pipeline width / SMTness. For existing metrics it shouldn't make any difference, as those generally use events that don't have any units. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding supportHe Kuang1-2/+0
There is no need to check for DWARF unwinding support when using the 'dwarf' callchain record method, as this will only ask the kernel to collect stack dumps for later DWARF CFI processing, which can be done in another machine, where the support for DWARF unwinding need to be present. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-08tools: bpf_jit_disasm: check for klogctl failureColin Ian King1-0/+3
klogctl can fail and return -ve len, so check for this and return NULL to avoid passing a (size_t)-1 to malloc. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06perf trace: Move futex_op beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-45/+46
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vb8dpy7bptkf219q5c25ulfp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf trace: Move open_flags beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-56/+57
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jt293541hv9od7gqw6lilioh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf trace: Move signum beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-53/+54
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qecqxwwtreio6eaatfv58yq5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf stat: Add extra output of counter values with -vvAndi Kleen1-0/+8
Add debug output of raw counter values per CPU when perf stat -v is specified, together with their cpu numbers. This is very useful to debug problems with per core counters, where we can normally only see aggregated values. v2: Make it depend on -vv, not -v Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461787251-6702-12-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Update export-to-postgresql to support callchain exportChris Phlipot1-17/+30
Update the export-to-postgresql.py to support the newly introduced callchain export. callchains are added into the existing call_paths table and can now be associated with samples when the "callpaths" commandline option is used with the script. Ex.: $ perf script -s export-to-postgresql.py example_db all callchains Includes the following changes to enable callchain export via the python export APIs: - Add the "callchains" commandline option, which is used to enable callchain export by setting the perf_db_export_callchains global - Add perf_db_export_callchains checks for call_path table creation and population. - Add call_path_id to samples_table to conform with the new API example usage and output using a small test app: test_app.c: volatile int x = 0; void inc_x_loop() { int i; for(i=0; i<100000000; i++) x++; } void a() { inc_x_loop(); } void b() { inc_x_loop(); } int main() { a(); b(); return 0; } example usage: $ gcc -g -O0 test_app.c $ perf record --call-graph=dwarf ./a.out [ perf record: Woken up 77 times to write data ] [ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ] $ perf script -s scripts/python/export-to-postgresql.py example_db all callchains $ psql example_db example_db=# SELECT (SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol, (SELECT name FROM symbols WHERE id = (SELECT symbol_id from call_paths where id = cps.parent_id)) as parent_symbol, sum(period) as event_count FROM samples join call_paths as cps on call_path_id = cps.id GROUP BY cps.id,evsel_id ORDER BY event_count DESC LIMIT 5; symbol | parent_symbol | event_count ------------------+--------------------------+------------- inc_x_loop | a | 734250982 inc_x_loop | b | 731028057 unknown | unknown | 1335858 task_tick_fair | scheduler_tick | 1238842 update_wall_time | tick_do_update_jiffies64 | 650373 (5 rows) The above data shows total "self time" in cycles for each call path that was sampled. It is intended to demonstrate how it accounts separately for the two ways to reach the "inc_x_loop" function(via "a" and "b"). Recursive common table expressions can be used as well to get cumulative time spent in a function as well, but that is beyond the scope of this basic example. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Expose usage of the callchain db export via the python apiChris Phlipot3-16/+46
This change allows python scripts to be able to utilize the recent changes to the db export api allowing the export of call_paths derived from sampled callchains. These call paths are also now associated with the samples from which they were derived. - This feature is enabled by setting "perf_db_export_callchains" to true - When enabled, samples that have callchain information will have the callchains exported via call_path_table - The call_path_id field is added to sample_table to enable association of samples with the corresponding callchain stored in the call paths table. A call_path_id of 0 will be exported if there is no corresponding callchain. - When "perf_db_export_callchains" and "perf_db_export_calls" are both set to True, the call path root data structure will be shared. This prevents duplicating of data and call path ids that would result from building two separate call path trees in memory. - The call_return_processor structure definition was relocated to the header file to make its contents visible to db-export.c. This enables the sharing of call path trees between the two features, as mentioned above. This change is visible to python scripts using the python db export api. The change is backwards compatible with scripts written against the previous API, assuming that the scripts model the sample_table function after the one in export-to-postgresql.py script by allowing for additional arguments to be added in the future. ie. using *x as the final argument of the sample_table function. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Add call path id to exported sample in db exportChris Phlipot2-1/+4
The exported sample now contains a reference to the call_path_id that represents its callchain. While callchains themselves are nice to have, being able to associate them with samples makes them much more useful, and can allow for such things as determining how much cumulative time is spent in a particular function. This information is normally possible to get from the call return processor. However, when doing normal sampling, call/return information is not available, thus necessitating the need for associating samples directly with call paths. This commit include changes to db-export layer to make this information available for subsequent patches in this change set, but by itself, does not make any changes visible to the user. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Enable db export to output sampled callchainsChris Phlipot2-0/+84
This change enables the db export api to export callchains. This is accomplished by adding callchains obtained from samples to the call_path_root structure and exporting them via the current call path export API. While the current API does support exporting call paths, this is not supported when sampling. This commit addresses that missing feature by allowing the export of call paths when callchains are present in samples. Summary: - This feature is activated by initializing the call_path_root member inside the db_export structure to a non-null value. - Callchains are resolved with thread__resolve_callchain() and then stored and exported by adding a call path under call path root. - Symbol and DSO for each callchain node are exported via db_ids_from_al() This commit puts in place infrastructure to be used by subsequent commits, and by itself, does not introduce any user-visible changes. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlipot0@gmail.com [ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: tag ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf tools: Refactor code to move call path handling out of thread-stackChris Phlipot7-149/+204
Move the call path handling code out of thread-stack.c and thread-stack.h to allow other components that are not part of thread-stack to create call paths. Summary: - Create call-path.c and call-path.h and add them to the build. - Move all call path related code out of thread-stack.c and thread-stack.h and into call-path.c and call-path.h. - A small subset of structures and functions are now visible through call-path.h, which is required for thread-stack.c to continue to compile. This change is a prerequisite for subsequent patches in this change set and by itself contains no user-visible changes. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf callchain: Fix incorrect ordering of entriesChris Phlipot1-15/+41
The existing implementation of thread__resolve_callchain, under certain circumstances, can assemble callchain entries in the incorrect order. The callchain entries are resolved incorrectly for a sample when all of the following conditions are met: 1. callchain_param.order is set to ORDER_CALLER 2. thread__resolve_callchain_sample is able to resolve callchain entries for the sample. 3. unwind__get_entries is also able to resolve callchain entries for the sample. The fix is accomplished by reversing the order in which thread__resolve_callchain_sample and unwind__get_entries are called when callchain_param.order is set to ORDER_CALLER. Unwind specific code from thread__resolve_callchain is also moved into a new static function to improve readability of the fix. How to Reproduce the Existing Bug: Modifying perf script to print call trees in the opposite order or applying the remaining patches from this series and comparing the results output from export-to-postgtresql.py are the easiest ways to see the bug, however it can still be seen in current builds using perf report. Here is how i can reproduce the bug using perf report: # perf record --call-graph=dwarf stress -c 1 -t 5 when i run this command: # perf report --call-graph=flat,0,0,callee This callchain, containing kernel (handle_irq_event, etc) and userspace samples (__libc_start_main, etc) is contained in the output, which looks correct (callee order): gen8_irq_handler handle_irq_event_percpu handle_irq_event handle_edge_irq handle_irq do_IRQ ret_from_intr __random rand 0x558f2a04dded 0x558f2a04c774 __libc_start_main 0x558f2a04dcd9 Now run this command using caller order: # perf report --call-graph=flat,0,0,caller It is expected to see the exact reverse of the above when using caller order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom) in the output, but it is nowhere to be found. instead you see this: ret_from_intr do_IRQ handle_irq handle_edge_irq handle_irq_event handle_irq_event_percpu gen8_irq_handler 0x558f2a04dcd9 __libc_start_main 0x558f2a04c774 0x558f2a04dded rand __random Notice how internally the kernel symbols are reversed and the user space symbols are reversed, but the kernel symbols still appear above the user space symbols. if this patch is applied and perf script is re-run, you will see the expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom): 0x558f2a04dcd9 __libc_start_main 0x558f2a04c774 0x558f2a04dded rand __random ret_from_intr do_IRQ handle_irq handle_edge_irq handle_irq_event handle_irq_event_percpu gen8_irq_handler Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf trace: Do not print raw args list for syscalls with no argsArnaldo Carvalho de Melo1-1/+6
The test to check if the arg format had been read from the syscall:sys_enter_name/format file was looking at the list of non-commom fields, and if that is empty, it would think it had failed to read it, because it doesn't exist, for instance, for the clone() syscall. So instead before dumping the raw syscall args list check IS_ERR(sc->tp_format), if that is true, then an attempt was made to read the format file and failed, in which case dump the raw arg list values. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ls7pmdqb2xy9339vdburwvnk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf evlist: Rename variable in perf_mmap__read()Wang Nan1-14/+15
In perf_mmap__read(), give better names to pointers. Original name 'old' and 'head' directly related to pointers in ring buffer control page. For backward ring buffer, the meaning of 'head' point is not 'the first byte of free space', but 'the first byte of the last record'. To reduce confusion, rename 'old' to 'start', 'head' to 'end'. 'start' -> 'end' is the direction the records should be read from. Change parameter order. Change 'overwrite' to 'check_messup'. When reading from 'head', no need to check messup for for backward ring buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461723563-67451-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf evlist: Extract perf_mmap__read()Wang Nan1-15/+24
Extract event reader from perf_evlist__mmap_read() to perf__mmap_read(). Future commit will feed it with manually computed 'head' and 'old' pointers. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461723563-67451-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf symbols: Fix kallsyms perf test on ppc64leNaveen N. Rao5-15/+31
ppc64le functions have a Global Entry Point (GEP) and a Local Entry Point (LEP). While placing a probe, we always prefer the LEP since it catches function calls through both the GEP and the LEP. In order to do this, we fixup the function entry points during elf symbol table lookup to point to the LEPs. This works, but breaks 'perf test kallsyms' since the symbols loaded from the symbol table (pointing to the LEP) do not match the symbols in kallsyms. To fix this, we do not adjust all the symbols during symbol table load. Instead, we note down st_other in a newly introduced arch-specific member of perf symbol structure, and later use this to adjust the probe trace point. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/6be7c2b17e370100c2f79dd444509df7929bdd3e.1460451721.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf powerpc: Fix kprobe and kretprobe handling with kallsyms on ppc64leNaveen N. Rao1-5/+14
So far, we used to treat probe point offsets as being offset from the LEP. However, userspace applications (objdump/readelf) always show disassembly and offsets from the function GEP. This is confusing to the user as we will end up probing at an address different from what the user expects when looking at the function disassembly with readelf/objdump. Fix this by changing how we modify probe address with perf. If only the function name is provided, we assume the user needs the LEP. Otherwise, if an offset is specified, we assume that the user knows the exact address to probe based on function disassembly, and so we just place the probe from the GEP offset. Finally, kretprobe was also broken with kallsyms as we were trying to specify an offset. This patch also fixes that issue. Reported-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/75df860aad8216bf4b9bcd10c6351ecc0e3dee54.1460451721.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__has_comm into struct perf_hpp_listJiri Olsa4-5/+6
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_comm into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__has_thread into struct perf_hpp_listJiri Olsa4-9/+8
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_thread into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__has_socket into struct perf_hpp_listJiri Olsa5-6/+5
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_socket into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__has_dso into struct perf_hpp_listJiri Olsa4-9/+8
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_dso into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__has_sym into struct perf_hpp_listJiri Olsa8-14/+13
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_sym into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__has_parent into struct perf_hpp_listJiri Olsa6-7/+6
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_parent into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf hists: Move sort__need_collapse into struct perf_hpp_listJiri Olsa11-20/+22
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__need_collapse into struct perf_hpp_list. Adding hists__has macro to easily access this info perf struct hists object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf tools powerpc: Add support for generating bpf prologueNaveen N. Rao2-12/+29
Generalize existing macros to serve the purpose. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Wang Nan <wangnan0@huawei.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1462461799-17518-1-git-send-email-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf trace: Do not show the runtime_ms for a thread when not collecting itArnaldo Carvalho de Melo1-1/+5
That field is only updated when we use the "sched:sched_stat_runtime" tracepoint, and that is only done so far when we use the '--stat' command line option, without it we get just zeros, confusing the users: Without this patch: # trace -a -s sleep 1 <SNIP> qemu-system-x86 (9931), 468 events, 9.6%, 0.000 msec syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) ---------- ------ --------- --------- --------- --------- ------ ppoll 98 982.374 0.000 10.024 29.983 12.65% write 34 0.401 0.005 0.012 0.027 5.49% ioctl 102 0.347 0.002 0.003 0.007 3.08% firefox (10871), 1856 events, 38.2%, 0.000 msec (msec) (msec) (msec) (msec) (%) ---------- ------ --------- --------- --------- --------- ------ poll 395 934.873 0.000 2.367 17.120 11.51% recvmsg 395 0.988 0.001 0.003 0.021 4.20% read 106 0.460 0.002 0.004 0.007 3.17% futex 24 0.108 0.001 0.004 0.010 10.05% mmap 2 0.041 0.016 0.021 0.026 23.92% write 6 0.027 0.004 0.004 0.005 2.52% After this patch that ', 0.000 msecs' gets suppressed when --stat is not in use. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-p7emqrsw7900tdkg43v9l1e1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf trace: Sort syscalls stats by msecs in --summaryArnaldo Carvalho de Melo1-10/+22
# trace -a -s sleep 1 <SNIP> Xorg (1965), 788 events, 19.0%, 0.000 msec syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ select 89 731.038 0.000 8.214 175.218 36.71% ioctl 22 0.661 0.010 0.030 0.072 10.43% writev 42 0.253 0.002 0.006 0.011 5.94% recvmsg 60 0.185 0.001 0.003 0.009 5.90% setitimer 60 0.127 0.001 0.002 0.006 6.14% read 52 0.102 0.001 0.002 0.005 8.55% rt_sigprocmask 45 0.092 0.001 0.002 0.023 23.65% poll 12 0.021 0.001 0.002 0.003 7.21% epoll_wait 12 0.019 0.001 0.002 0.002 2.71% firefox (10871), 1080 events, 26.1%, 0.000 msec syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ poll 240 979.562 0.000 4.082 17.132 11.33% recvmsg 240 0.532 0.001 0.002 0.007 3.69% read 60 0.303 0.003 0.005 0.029 8.50% Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-52kdkuyxihq0kvc0n2aalhay@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf trace: Sort summary output by number of eventsArnaldo Carvalho de Melo1-21/+28
# trace -a -s sleep 1 |& grep events | tail gmain (1733), 34 events, 1.0%, 0.000 msec hexchat (9765), 46 events, 1.4%, 0.000 msec ssh (11109), 80 events, 2.4%, 0.000 msec sleep (32631), 81 events, 2.4%, 0.000 msec qemu-system-x86 (10021), 272 events, 8.2%, 0.000 msec Xorg (1965), 322 events, 9.7%, 0.000 msec SoftwareVsyncTh (10922), 366 events, 11.1%, 0.000 msec gnome-shell (2231), 446 events, 13.5%, 0.000 msec qemu-system-x86 (9931), 468 events, 14.1%, 0.000 msec firefox (10871), 1098 events, 33.2%, 0.000 msec [root@jouet ~]# Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ye4cnprhfeiq32ar4lt60dqs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf tools: Add template for generating rbtree resort classArnaldo Carvalho de Melo1-0/+149
Sometimes we want to sort an existing rbtree by a different key, introduce a template for that, that needs only to be provided the rbtree root and the number of entries in it. To do that a new rbtree will be created with extra space for each entry, where possibly pre-calculated keys will be stored to be used in the resort process and also later, when using the newly sorted rbtree. Please check the following two changesets to see it in use for resorting stats for threads and its syscalls in 'perf trace --summary'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9l6e1q34lmf3wwdeewstyakg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05perf machine: Introduce number of threads memberArnaldo Carvalho de Melo2-1/+7
To be used, for instance, for pre-allocating an rb_tree array for sorting by other keys besides the current pid one. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ja0ifkwue7ttjhbwijn6g6eu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06Merge back new ACPICA material for v4.7.Rafael J. Wysocki6-41/+49
2016-05-05ACPICA: Move all ASCII utilities to a common fileBob Moore2-1/+8
ACPICA commit ba60e4500053010bf775d58f6f61febbdb94d817 New file is utascii.c Link: https://github.com/acpica/acpica/commit/ba60e450 Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05ACPICA: Divergence: remove unwanted spaces for typedefLv Zheng3-37/+36
ACPICA commit b2294cae776f5a66a7697414b21949d307e6856f This patch removes unwanted spaces for typedef. This solution doesn't cover function types. Note that the linuxize result of this commit is very giant and should have many conflicts against the current Linux upstream. Thus it is required to modify the linuxize result of this commit and the commits around it manually in order to have them merged to the Linux upstream. Since this is very costy, we should do this only once, and if we can't ensure to do this only once, we need to revert the Linux code to the wrong indentation result before merging the linuxize result of this commit. Lv Zheng. Link: https://github.com/acpica/acpica/commit/b2294cae Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar5-35/+311
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05Merge tag 'v4.6-rc6' into x86/asm, to refresh the treeIngo Molnar6-36/+312
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05cpupower: fix potential memory leakArjun Sreedharan1-3/+5
Signed-off-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-04signals/sigaltstack: Change SS_AUTODISARM to (1U << 31)Andy Lutomirski1-1/+1
Using bit 4 divides the space of available bits strangely. Use bit 31 instead so that we have a better chance of keeping flag and mode bits separate in the long run. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Amanieu d'Antras <amanieu@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Jason Low <jason.low2@hp.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <pmoore@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Stas Sergeev <stsp@list.ru> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/bb996508a600af14b406810c3d58fe0e0d0afe0d.1462296606.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-04signals/sigaltstack: Report current flag bits in sigaltstack()Andy Lutomirski1-3/+16
sigaltstack()'s reported previous state uses a somewhat odd convention, but the concept of flag bits is new, and we can do the flag bits sensibly. Specifically, let's just report them directly. This will allow saving and restoring the sigaltstack state using sigaltstack() to work correctly. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Amanieu d'Antras <amanieu@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Stas Sergeev <stsp@list.ru> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/94b291ec9fd47741a9264851e316e158ded0b00d.1462296606.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-04selftests/sigaltstack: Fix the sigaltstack test on old kernelsAndy Lutomirski1-7/+14
The handling for old kernels was wrong, resulting in a segfault. Fix it. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Stas Sergeev <stsp@list.ru> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/f3e739bf435beeaecbd5f038f1359d2eac6d1e63.1462296606.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03selftests/sigaltstack: Add new testcase for sigaltstack(SS_ONSTACK|SS_AUTODISARM)Stas Sergeev3-0/+165
This patch adds the test case for SS_AUTODISARM flag. The test-case tries to set SS_AUTODISARM flag and checks if the nested signal corrupts the stack after swapcontext(). Signed-off-by: Stas Sergeev <stsp@list.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-api@vger.kernel.org Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1460665206-13646-5-git-send-email-stsp@list.ru Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29selftests/x86/ldt_gdt: Test set_thread_area() deletion of an active segmentAndy Lutomirski1-0/+250
Now that set_thread_area() is supposed to give deterministic behavior when it modifies in-use segments, test it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f2bc11af1ee1a0f815ed910840cbdba06b640a20.1461698311.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-28perf tests: Do not use sizeof on pointer typeVaishali Thakkar1-1/+1
Using sizeof on a malloced pointer type will return the wordsize which can often cause one to allocate a buffer much smaller than it is needed. So, here do not use sizeof on pointer type. Note that this has no effect on runtime because 'dsos' is a pointer to a pointer. Problem found using Coccinelle. Signed-off-by: Vaishali Thakkar <vaishali.thakkar@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461862017-23358-1-git-send-email-vaishali.thakkar@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-28cpupower: Add cpuidle parts into libraryThomas Renninger17-931/+1242
This more or less is a renaming and moving of functions and should not introduce any functional change. cpupower was built from cpufrequtils (which had a C library providing easy access to cpu frequency platform info). In the meantime it got enhanced by quite some neat cpuidle userspace tools. Now the cpu idle functions have been separated and added to the cpupower.so library. So beside an already existing public header file: cpufreq.h cpupower now also exports these cpu idle functions in: cpuidle.h Here again pasted for better review of the interfaces: ====================================== int cpuidle_is_state_disabled(unsigned int cpu, unsigned int idlestate); int cpuidle_state_disable(unsigned int cpu, unsigned int idlestate, unsigned int disable); unsigned long cpuidle_state_latency(unsigned int cpu, unsigned int idlestate); unsigned long cpuidle_state_usage(unsigned int cpu, unsigned int idlestate); unsigned long long cpuidle_state_time(unsigned int cpu, unsigned int idlestate); char *cpuidle_state_name(unsigned int cpu, unsigned int idlestate); char *cpuidle_state_desc(unsigned int cpu, unsigned int idlestate); unsigned int cpuidle_state_count(unsigned int cpu); char *cpuidle_get_governor(void); char *cpuidle_get_driver(void); ====================================== Signed-off-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-28cpupowerutils: bench: trivial fix of spelling mistake on "average"Colin Ian King2-3/+3
fix spelling mistake, avarage -> average Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-28Fix cpupower manpages "NAME" sectionMattia Dongili4-4/+4
The token before "-" should be the program name, no spaces allowed. See man(7) and lexgrog(1). Signed-off-by: Mattia Dongili <malattia@linux.it> Signed-off-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-28cpupower: bench: parse.c: fix several resource leaksColin Ian King1-3/+11
The error handling in prepare_output has several issues with resource leaks. Ensure that filename is free'd and the directory stream DIR is closed before returning. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-28Honour user's LDFLAGSMattia Dongili1-1/+1
Signed-off-by: Mattia Dongili <malattia@linux.it> Signed-off-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>