aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-04-14perf/x86: Fix spurious NMI with PEBS Load Latency eventKan Liang3-2/+3
Spurious NMIs will be observed with the following command: while :; do perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" -e "cpu/umask=0x03,event=0x0/" -e "cpu/umask=0x02,event=0x0/" -e cycles,branches,cache-misses -e cache-references -- sleep 10 done The bug was introduced by commit: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+") That commit clears the status bits for the counters used for PEBS events, by masking the whole 64 bits pebs_enabled. However, only the low 32 bits of both status and pebs_enabled are reserved for PEBS-able counters. For status bits 32-34 are fixed counter overflow bits. For pebs_enabled bits 32-34 are for PEBS Load Latency. In the test case, the PEBS Load Latency event and fixed counter event could overflow at the same time. The fixed counter overflow bit will be cleared by mistake. Once it is cleared, the fixed counter overflow never be processed, which finally trigger spurious NMI. Correct the PEBS enabled mask by ignoring the non-PEBS bits. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+") Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14perf/x86: Avoid exposing wrong/stale data in intel_pmu_lbr_read_32()Peter Zijlstra1-0/+3
When the perf_branch_entry::{in_tx,abort,cycles} fields were added, intel_pmu_lbr_read_32() wasn't updated to initialize them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> Fixes: 135c5612c460 ("perf/x86/intel: Support Haswell/v4 LBR format") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-13Revert "perf tools: Fix include of linux/mman.h"David Carrillo-Cisneros1-1/+1
In https://lkml.org/lkml/2017/2/2/16 I reported a build error that I believed was caused by wrong uapi includes. The synthom was fixed by Arnaldo in: commit 2f7db5557994 ("perf tools: Fix include of linux/mman.h") but I was wrong attributing the problem to the uapi include. The root cause was that I was using ARCH=x86_64, hence using the wrong uapi include path. This explains why no one else ran into this build problem. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170412064919.92449-8-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-13perf util: Hint missing file when tool tips fail to loadDavid Carrillo-Cisneros1-1/+2
Besides memory allocation failure, tips.txt may fail to load because the file is not found (a more likely cause). Communicate that to the user in tips failure warning. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170412064919.92449-5-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-13tools build: Fix feature detection redefinion of build flagsDavid Carrillo-Cisneros1-5/+5
This change is a follow up of https://lkml.org/lkml/2017/2/2/16 The patch above avoided redefining CC, CXX and PKG_CONFIG in feature detection. The patch was not merged due to a unsolved concern with the -MD flag. Later, commit c8c188679ccf ("tools build: Use the same CC for feature detection and actual build") did the change for CC and CXX but not PKG_CONFIG. This patch makes PKG_CONFIG consistent with CC and CXX and moves the -MD to CFLAGS, as suggested by Jiri in the thread above. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170412064919.92449-3-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-13perf tools: Disable JVMTI if no ELF support availableDavid Carrillo-Cisneros1-1/+3
The build of JVMTI depends on LIBELF (-lelf). Make Makefile.conf check this dependendancy and notify user when not present. v2: Comma nitpicking. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Tested-by: Kim Phillips <kim.phillips@arm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170412170745.26620-1-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-13perf trace: Add usage of --no-syscalls in man pageRavi Bangoria1-1/+2
perf trace supports --no-syscalls option but it's not listed in the man page. (Though, I see an example using --no-syscalls in EXAMPLES section.) Committer note: The --no-syscalls option tells 'perf trace' not to automagically ask for raw_syscalls:sys_{enter,exit} to then format it in a strace like way. This become more used as 'perf trace' got support for arbitrary events, such as tracepoints, so more and more we use: # perf trace --no-syscalls -e nmi:* 0.000 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 36649 handled: 1) 0.019 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 2907 handled: 0) 0.676 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 9401 handled: 1) 0.680 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 288 handled: 0) 0.701 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 4977 handled: 1) 0.703 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 67 handled: 0) 0.736 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 8549 handled: 1) ^C# Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1492063332-5745-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-13perf stat: Fix bug in handling events in error stateStephane Eranian2-5/+11
(This is a patch has been sitting in the Intel CQM/CMT driver series for a while, despite not depend on it. Sending it now independently since the series is being discarded.) When an event is in error state, read() returns 0 instead of sizeof() buffer. In certain modes, such as interval printing, ignoring the 0 return value may cause bogus count deltas to be computed and thus invalid results printed. This patch fixes this problem by modifying read_counters() to mark the event as not scaled (scaled = -1) to force the printout routine to show <NOT COUNTED>. Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170412182301.44406-1-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-12perf tools: Pass PYTHON config to feature detectionDavid Carrillo-Cisneros2-20/+13
( This is a rebased version of https://lkml.org/lkml/2017/2/7/662 ) Python's CC and link Makefile variables were not passed to feature detection, causing feature detection to use system's Python rather than PYTHON_CONFIG's one. This created a mismatch between the detected Python support and the one actually used by perf when PYTHON_CONFIG is specified. Fix it by moving Python's variable initialization to before feature detection and pass FLAGS_PYTHON_EMBED to Python's feature detection's build target. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170412064919.92449-2-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-12kprobes/x86: Consolidate insn decoder users for copying codeMasami Hiramatsu3-39/+36
Consolidate x86 instruction decoder users on the path of copying original code for kprobes. Kprobes decodes the same instruction a maximum of 3 times when preparing the instruction buffer: - The first time for getting the length of the instruction, - the 2nd for adjusting displacement, - and the 3rd for checking whether the instruction is boostable or not. For each time, the actual decoding target address is slightly different (1st is original address or recovered instruction buffer, 2nd and 3rd are pointing to the copied buffer), but all have the same instruction. Thus, this patch also changes the target address to the copied buffer at first and reuses the decoded "insn" for displacement adjusting and checking boostability. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076389643.22469.13151892839998777373.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Use probe_kernel_read() instead of memcpy()Masami Hiramatsu2-4/+13
Use probe_kernel_read() for avoiding unexpected faults while copying kernel text in __recover_probed_insn(), __recover_optprobed_insn() and __copy_instruction(). Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076382624.22469.10091613887942958518.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Set kprobes pages read-onlyMasami Hiramatsu2-0/+7
Set the pages which is used for kprobes' singlestep buffer and optprobe's trampoline instruction buffer to readonly. This can prevent unexpected (or unintended) instruction modification. This also passes rodata_test as below. Without this patch, rodata_test shows a warning: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x7a9/0xa20 x86/mm: Found insecure W+X mapping at address ffffffffa0000000/0xffffffffa0000000 With this fix, no W+X pages are found: x86/mm: Checked W+X mappings: passed, no W+X pages found. rodata_test: all tests were successful Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076375592.22469.14174394514338612247.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Make boostable flag booleanMasami Hiramatsu3-11/+10
Make arch_specific_insn.boostable to boolean, since it has only 2 states, boostable or not. So it is better to use boolean from the viewpoint of code readability. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076368566.22469.6322906866458231844.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Do not modify singlestep buffer while resumingMasami Hiramatsu1-22/+20
Do not modify singlestep execution buffer (kprobe.ainsn.insn) while resuming from single-stepping, instead, modifies the buffer to add a jump back instruction at preparing buffer. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076361560.22469.1610155860343077495.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Use instruction decoder for boosterMasami Hiramatsu1-23/+16
Use x86 instruction decoder for checking whether the probed instruction is able to boost or not, instead of hand-written code. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076354563.22469.13379472209338986858.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Fix the description of __copy_instruction()Masami Hiramatsu1-5/+5
Fix the description comment of __copy_instruction() function since it has already been changed to return the length of the copied instruction. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076347582.22469.3775133607244923462.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-12kprobes/x86: Fix kprobe-booster not to boost far call instructionsMasami Hiramatsu1-0/+2
Fix the kprobe-booster not to boost far call instruction, because a call may store the address in the single-step execution buffer to the stack, which should be modified after single stepping. Currently, this instruction will be filtered as not boostable in resume_execution(), so this is not a critical issue. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076340615.22469.14066273186134229909.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11perf annotate: Use stripped line instead of raw disassemble lineTaeung Song1-2/+2
When parsing disassemble lines for source line number, use a stripped line instead of raw line. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1491612748-1605-3-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf annotate: Refactor the code to parse disassemble lines with {l,r}trim()Taeung Song1-35/+7
When parsing disassemble lines, use ltrim() and rtrim() to strip them, not using just while loop and isspace(). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1491612748-1605-2-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf tools: Do not print missing features in pipe-modeDavid Carrillo-Cisneros1-0/+3
Pipe-mode has no perf.data header, hence no upfront knowledge of presend and missing features, hence, do not print missing features in pipe-mode. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170410201432.24807-8-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf session: Don't rely on evlist in pipe modeDavid Carrillo-Cisneros1-3/+13
Session sets a number parameters that rely on evlist. These parameters are not used in pipe-mode and should not be set, since evlist is unavailable. Fix that. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170410201432.24807-6-davidcc@google.com [ Check if file != NULL in perf_session__new(), like when used by builtin-top.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf annotate: Process attr and build_id recordsDavid Carrillo-Cisneros1-0/+2
perf annotate did not get some love for pipe-mode, and did not have .attr and .buil_id setup (while record and inject did. Fix that. It can easily be reproduced by: perf record -o - noploop | perf annotate that in my system shows: 0xd8 [0x28]: failed to process type: 9 Committer Testing: Before: $ perf record -o - stress -t 2 -c 2 | perf annotate --stdio stress: info: [11060] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd 0x4470 [0x28]: failed to process type: 9 $ stress: info: [11060] successful run completed in 2s $ After: $ perf record -o - stress -t 2 -c 2 | perf annotate --stdio stress: info: [11871] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd stress: info: [11871] successful run completed in 2s [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] no symbols found in /usr/bin/stress, maybe install a debug package? Percent | Source code & Disassembly of libc-2.24.so for cycles:uhH (6117 samples) --------------------------------------------------------------------------------------- : : Disassembly of section .text: : : 000000000003b050 <random_r>: : __random_r(): 10.56 : 3b050: test %rdi,%rdi 0.00 : 3b053: je 3b0d0 <random_r+0x80> 0.34 : 3b055: test %rsi,%rsi 0.00 : 3b058: je 3b0d0 <random_r+0x80> 0.46 : 3b05a: mov 0x18(%rdi),%eax 12.44 : 3b05d: mov 0x10(%rdi),%r8 0.18 : 3b061: test %eax,%eax 0.00 : 3b063: je 3b0b0 <random_r+0x60> <SNIP> Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170410201432.24807-5-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf tools: Describe pipe mode in perf.data-file-fomat.txtDavid Carrillo-Cisneros1-2/+17
Add a minimal description of pipe's data format. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170410201432.24807-4-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf inject: Copy events when reordering events in pipe modeDavid Carrillo-Cisneros2-1/+3
__perf_session__process_pipe_events reuses the same memory buffer to process all events in the pipe. When reordering is needed (e.g. -b option), events are not immediately flushed, but kept around until reordering is possible, causing memory corruption. The problem is usually observed by a "Unknown sample error" output. It can easily be reproduced by: perf record -o - noploop | perf inject -b > output Committer testing: Before: $ perf record -o - stress -t 2 -c 2 | perf inject -b > /dev/null stress: info: [8297] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd stress: info: [8297] successful run completed in 2s [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] Warning: Found 1 unknown events! Is this an older tool processing a perf.data file generated by a more recent tool? If that is not the case, consider reporting to linux-kernel@vger.kernel.org. $ After: $ perf record -o - stress -t 2 -c 2 | perf inject -b > /dev/null stress: info: [9027] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd stress: info: [9027] successful run completed in 2s [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] no symbols found in /usr/bin/stress, maybe install a debug package? no symbols found in /usr/bin/stress, maybe install a debug package? $ Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170410201432.24807-3-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf inject: Don't proceed if perf_session__process_event() failsDavid Carrillo-Cisneros1-0/+2
All paths following perf_session__process_event() in __cmd_inject() are useless if __cmd_inject() is to fail, some depend on a correct session->evlist. First commit to add code that depends on session->evlist without checking error was commmit e558a5bd8b ("perf inject: Work with files"). It has grown since then. Change __cmd_inject() to fail immediately after perf_session__process_event() fails. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Vagin <avagin@openvz.org> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: e558a5bd8b74 ("perf inject: Work with files") Link: http://lkml.kernel.org/r/20170410201432.24807-2-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf annotate s390: Implement jump types for perf annotateChristian Borntraeger2-0/+32
Implement simple detection for all kind of jumps and branches. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-s390 <linux-s390@vger.kernel.org> Cc: stable@kernel.org # v4.10+ Link: http://lkml.kernel.org/r/1491465112-45819-3-git-send-email-borntraeger@de.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf annotate s390: Fix perf annotate error -95 (4.10 regression)Christian Borntraeger1-0/+6
since 4.10 perf annotate exits on s390 with an "unknown error -95". Turns out that commit 786c1b51844d ("perf annotate: Start supporting cross arch annotation") added a hard requirement for architecture support when objdump is used but only provided x86 and arm support. Meanwhile power was added so lets add s390 as well. While at it make sure to implement the branch and jump types. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-s390 <linux-s390@vger.kernel.org> Cc: stable@kernel.org # v4.10+ Fixes: 786c1b51844 "perf annotate: Start supporting cross arch annotation" Link: http://lkml.kernel.org/r/1491465112-45819-2-git-send-email-borntraeger@de.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf string: Simplify ltrim() implementationArnaldo Carvalho de Melo1-5/+1
We don't need to use strlen(), a var, or check for the end explicitely, isspace('\0') is false: [acme@jouet c]$ cat ltrim.c #include <ctype.h> #include <stdio.h> static char *ltrim(char *s) { while (isspace(*s)) ++s; return s; } int main(void) { printf("ltrim(\"\")='%s'\n", ltrim("")); return 0; } [acme@jouet c]$ ./ltrim ltrim("")='' [acme@jouet c]$ Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/n/tip-w3nk0x3pai2vojk2ab6kdvaw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf tools: Refactor the code to strip command name with {l,r}trim()Taeung Song1-9/+2
After reading command name from /proc/<pid>/status, use ltrim() and rtrim() to strip command name, not using just while loop, isspace() and etc. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1491575061-704-6-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf pmu: Refactor wordwrap() with ltrim()Taeung Song1-2/+1
Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1491575061-704-5-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf ui browser: Refactor the code to parse color configs with ltrim()Taeung Song1-1/+1
When parsing {fore, back} ground color configs, use ltrim() instead of just while loop and isspace(). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1491575061-704-4-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf stat: Refactor the code to strip csv output with ltrim()Taeung Song1-8/+2
To strip csv output, use ltrim() instead of just while loop and isspace() at print_metric_{only}_csv(). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1491575061-704-3-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf evsel: Return exact sub event which failed with EPERM for wildcardsJin Yao1-1/+7
The kernel has a special check for a specific irq_vectors trace event. TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0); The perf-record fails for this irq_vectors event when it is present, like when using a wildcard: root@skl:/tmp# perf record -a -e irq_vectors:* sleep 2 Error: You may not have permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN To make this setting permanent, edit /etc/sysctl.conf too, e.g.: kernel.perf_event_paranoid = -1 This patch prints out the exact sub event that failed with EPERM for wildcards to help in understanding what went wrong when this event is present: After the patch: root@skl:/tmp# perf record -a -e irq_vectors:* sleep 2 Error: No permission to enable irq_vectors:irq_work_exit event. You may not have permission to collect system-wide stats. ...... Committer notes: So we have a lot of irq_vectors events: [root@jouet ~]# perf list irq_vectors:* List of pre-defined events (to be used in -e): irq_vectors:call_function_entry [Tracepoint event] irq_vectors:call_function_exit [Tracepoint event] irq_vectors:call_function_single_entry [Tracepoint event] irq_vectors:call_function_single_exit [Tracepoint event] irq_vectors:deferred_error_apic_entry [Tracepoint event] irq_vectors:deferred_error_apic_exit [Tracepoint event] irq_vectors:error_apic_entry [Tracepoint event] irq_vectors:error_apic_exit [Tracepoint event] irq_vectors:irq_work_entry [Tracepoint event] irq_vectors:irq_work_exit [Tracepoint event] irq_vectors:local_timer_entry [Tracepoint event] irq_vectors:local_timer_exit [Tracepoint event] irq_vectors:reschedule_entry [Tracepoint event] irq_vectors:reschedule_exit [Tracepoint event] irq_vectors:spurious_apic_entry [Tracepoint event] irq_vectors:spurious_apic_exit [Tracepoint event] irq_vectors:thermal_apic_entry [Tracepoint event] irq_vectors:thermal_apic_exit [Tracepoint event] irq_vectors:threshold_apic_entry [Tracepoint event] irq_vectors:threshold_apic_exit [Tracepoint event] irq_vectors:x86_platform_ipi_entry [Tracepoint event] irq_vectors:x86_platform_ipi_exit [Tracepoint event] # And some may be sampled: [root@jouet ~]# perf record -e irq_vectors:local* sleep 20s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data (2 samples) ] [root@jouet ~]# perf report -D | egrep 'stats:|events:' Aggregated stats: TOTAL events: 155 MMAP events: 144 COMM events: 2 EXIT events: 1 SAMPLE events: 2 MMAP2 events: 4 FINISHED_ROUND events: 1 TIME_CONV events: 1 irq_vectors:local_timer_entry stats: TOTAL events: 1 SAMPLE events: 1 irq_vectors:local_timer_exit stats: TOTAL events: 1 SAMPLE events: 1 [root@jouet ~]# But, as shown in the tracepoint definition at the start of this message, some, like "irq_vectors:irq_work_exit", may not be sampled, just counted, i.e. if we try to sample, as when using 'perf record', we get an error: [root@jouet ~]# perf record -e irq_vectors:irq_work_exit Error: You may not have permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, <SNIP> The error message is misleading, this patch will help in pointing out what is the event causing such an error, but the error message needs improvement, i.e. we need to figure out a way to check if a tracepoint is counting only, like this one, when all we can do is to count it with 'perf stat', at most printing the delta using interval printing, as in: [root@jouet ~]# perf stat -I 5000 -e irq_vectors:irq_work_* # time counts unit events 5.000168871 0 irq_vectors:irq_work_entry 5.000168871 0 irq_vectors:irq_work_exit 10.000676730 0 irq_vectors:irq_work_entry 10.000676730 0 irq_vectors:irq_work_exit 15.001122415 0 irq_vectors:irq_work_entry 15.001122415 0 irq_vectors:irq_work_exit 20.001298051 0 irq_vectors:irq_work_entry 20.001298051 0 irq_vectors:irq_work_exit 25.001485020 1 irq_vectors:irq_work_entry 25.001485020 1 irq_vectors:irq_work_exit 30.001658706 0 irq_vectors:irq_work_entry 30.001658706 0 irq_vectors:irq_work_exit ^C 32.045711878 0 irq_vectors:irq_work_entry 32.045711878 0 irq_vectors:irq_work_exit [root@jouet ~]# But at least, when we use a wildcard, this patch helps a bit. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1491566932-503-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf script: Use strtok_r() when parsing output field listArnaldo Carvalho de Melo1-2/+2
Just avoiding non-reentrant functions. Cc: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/n/tip-eqytykipd74epzl9aexvppcg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf callchains: Switch from strtok() to strtok_r() when parsing optionsArnaldo Carvalho de Melo1-2/+2
Trying to keep everything reentrant. Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-rdce0p2k9e1b4qnrb8ki9mtf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-11perf/amd/uncore: Fix pr_fmt() prefixBorislav Petkov1-2/+5
Make it "perf/amd/uncore: ", i.e., something more specific than "perf: ". Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170410122047.3026-4-bp@alien8.de [ Changed it to perf/amd/uncore/ ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11perf/amd/uncore: Clean up per-family setupBorislav Petkov1-38/+21
Fam16h is the same as the default one, remove it. Turn the switch-case into a simple if-else. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170410122047.3026-3-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11perf/amd/uncore: Do feature check first, before assignmentsBorislav Petkov1-5/+4
... and save some unnecessary work. Remove now unused label while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170410122047.3026-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-09Linux 4.11-rc6Linus Torvalds1-1/+1
2017-04-08mm/mempolicy.c: fix error handling in set_mempolicy and mbind.Chris Salls1-12/+8
In the case that compat_get_bitmap fails we do not want to copy the bitmap to the user as it will contain uninitialized stack data and leak sensitive data. Signed-off-by: Chris Salls <salls@cs.ucsb.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvecLiping Zhang1-0/+2
Currently, inputting the following command will succeed but actually the value will be truncated: # echo 0x12ffffffff > /proc/sys/net/ipv4/tcp_notsent_lowat This is not friendly to the user, so instead, we should report error when the value is larger than UINT_MAX. Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08MAINTAINERS: separate out kernfs maintainershipTejun Heo1-2/+9
Separate out kernfs from driver core and add myself as a co-maintainer. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08sysfs: be careful of error returns from ops->show()NeilBrown1-2/+4
ops->show() can return a negative error code. Commit 65da3484d9be ("sysfs: correctly handle short reads on PREALLOC attrs.") (in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors would look like large numbers. As a result, if an error is returned, sysfs_kf_read() will return the value of 'count', typically 4096. Commit 17d0774f8068 ("sysfs: correctly handle read offset on PREALLOC attrs") (in v4.8) extended this error to use the unsigned large 'len' as a size for memmove(). Consequently, if ->show returns an error, then the first read() on the sysfs file will return 4096 and could return uninitialized memory to user-space. If the application performs a subsequent read, this will trigger a memmove() with extremely large count, and is likely to crash the machine is bizarre ways. This bug can currently only be triggered by reading from an md sysfs attribute declared with __ATTR_PREALLOC() during the brief period between when mddev_put() deletes an mddev from the ->all_mddevs list, and when mddev_delayed_delete() - which is scheduled on a workqueue - completes. Before this, an error won't be returned by the ->show() After this, the ->show() won't be called. I can reproduce it reliably only by putting delay like usleep_range(500000,700000); early in mddev_delayed_delete(). Then after creating an md device md0 run echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state The bug can be triggered without the usleep. Fixes: 65da3484d9be ("sysfs: correctly handle short reads on PREALLOC attrs.") Fixes: 17d0774f8068 ("sysfs: correctly handle read offset on PREALLOC attrs") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08Documentation: stable-kernel-rules: fix stable-tag formatJohan Hovold1-1/+1
A patch documenting how to specify which kernels a particular fix should be backported to (seemingly) inadvertently added a minus sign after the kernel version. This particular stable-tag format had never been used prior to this patch, and was neither present when the patch in question was first submitted (it was added in v2 without any comment). Drop the minus sign to avoid any confusion. Fixes: fdc81b7910ad ("stable_kernel_rules: Add clause about specification of kernel versions to patch.") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08staging: android: ashmem: lseek failed due to no FMODE_LSEEK.Shuxiao Zhang1-0/+1
vfs_llseek will check whether the file mode has FMODE_LSEEK, no return failure. But ashmem can be lseek, so add FMODE_LSEEK to ashmem file. Comment From Greg Hackmann: ashmem_llseek() passes the llseek() call through to the backing shmem file. 91360b02ab48 ("ashmem: use vfs_llseek()") changed this from directly calling the file's llseek() op into a VFS layer call. This also adds a check for the FMODE_LSEEK bit, so without that bit ashmem_llseek() now always fails with -ESPIPE. Fixes: 91360b02ab48 ("ashmem: use vfs_llseek()") Signed-off-by: Shuxiao Zhang <zhangshuxiao@xiaomi.com> Tested-by: Greg Hackmann <ghackmann@google.com> Cc: stable <stable@vger.kernel.org> # 3.18+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08mm: move pcp and lru-pcp draining into single wqMichal Hocko4-26/+32
We currently have 2 specific WQ_RECLAIM workqueues in the mm code. vmstat_wq for updating pcp stats and lru_add_drain_wq dedicated to drain per cpu lru caches. This seems more than necessary because both can run on a single WQ. Both do not block on locks requiring a memory allocation nor perform any allocations themselves. We will save one rescuer thread this way. On the other hand drain_all_pages() queues work on the system wq which doesn't have rescuer and so this depend on memory allocation (when all workers are stuck allocating and new ones cannot be created). Initially we thought this would be more of a theoretical problem but Hugh Dickins has reported: : 4.11-rc has been giving me hangs after hours of swapping load. At : first they looked like memory leaks ("fork: Cannot allocate memory"); : but for no good reason I happened to do "cat /proc/sys/vm/stat_refresh" : before looking at /proc/meminfo one time, and the stat_refresh stuck : in D state, waiting for completion of flush_work like many kworkers. : kthreadd waiting for completion of flush_work in drain_all_pages(). This worker should be using WQ_RECLAIM as well in order to guarantee a forward progress. We can reuse the same one as for lru draining and vmstat. Link: http://lkml.kernel.org/r/20170307131751.24936-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Yang Li <pku.leo@gmail.com> Tested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08mailmap: update Yakir Yang email addressJeffy Chen1-0/+1
Set current email address to replace previous employers email addresses. Link: http://lkml.kernel.org/r/1491450722-6633-1-git-send-email-jeffy.chen@rock-chips.com Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()David Rientjes1-0/+2
We got need_resched() warnings in swap_cgroup_swapoff() because swap_cgroup_ctrl[type].length is particularly large. Reschedule when needed. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704061315270.80559@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08dax: fix radix tree insertion raceRoss Zwisler1-13/+22
While running generic/340 in my test setup I hit the following race. It can happen with kernels that support FS DAX PMDs, so v4.10 thru v4.11-rc5. Thread 1 Thread 2 -------- -------- dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() 'entry' is NULL, can't call lock_slot() spin_unlock_irq() radix_tree_preload() dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() ... lock_slot() spin_unlock_irq() dax_pmd_insert_mapping() <inserts a PMD mapping> spin_lock_irq() __radix_tree_insert() fails with -EEXIST <fall back to 4k fault, and die horribly when inserting a 4k entry where a PMD exists> The issue is that we have to drop mapping->tree_lock while calling radix_tree_preload(), but since we didn't have a radix tree entry to lock (unlike in the pmd_downgrade case) we have no protection against Thread 2 coming along and inserting a PMD at the same index. For 4k entries we handled this with a special-case response to -EEXIST coming from the __radix_tree_insert(), but this doesn't save us for PMDs because the -EEXIST case can also mean that we collided with a 4k entry in the radix tree at a different index, but one that is covered by our PMD range. So, correctly handle both the 4k and 2M collision cases by explicitly re-checking the radix tree for an entry at our index once we reacquire mapping->tree_lock. This patch has made it through a clean xfstests run with the current v4.11-rc5 based linux/master, and it also ran generic/340 500 times in a loop. It used to fail within the first 10 iterations. Link: http://lkml.kernel.org/r/20170406212944.2866-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [4.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08mm, thp: fix setting of defer+madvise thp defrag modeDavid Rientjes1-6/+6
Setting thp defrag mode of "defer+madvise" actually sets "defer" in the kernel due to the name similarity and the out-of-order way the string is checked in defrag_store(). Check the string in the correct order so that TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG is set appropriately for "defer+madvise". Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option") Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704051814420.137626@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>