Age | Commit message (Collapse) | Author | Files | Lines |
|
The recent new patch "perf tools: Add new 'perf data' command" (commit
2245bf14 in acme's git repo perf/core) has caused a building error when
compiling the source code of perf:
cc1: warnings being treated as errors
builtin-data.c:89: error: missing initializer
builtin-data.c:89: error: (near initialization for ‘data_cmds[1].summary’)
make[2]: *** [builtin-data.o] Error 1
make[2]: *** Waiting for unfinished jobs....
LD bench/perf-in.o
LD tests/perf-in.o
make[1]: *** [perf-in.o] Error 2
make: *** [all] Error 2
This patch fixes the building error above.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1425038026-27604-1-git-send-email-yunlong.song@huawei.com
[ .name == NULL ends the loop, use it instead of seting all fields to NULL ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The minus operator has higher precedence than ?: Add parentheses around
?: fix this.
Before this patch:
$ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
$ perf probe -l -k ../vmlinux
kprobes:myprobe (on do_sys_open)
After this patch:
$ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
$ perf probe -l -k ../vmlinux
kprobes:myprobe (on do_sys_open@linux.git/fs/open.c)
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1425034373-14511-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the symbol names.
Actually, perf diff once intended to compare the symbol names. The
commit as below can look for a pair by name.
604c5c92972d (perf diff: Change the default sort order to "dso,symbol")
However, at that time, perf diff used a global list of dsos. That means
the binaries which has same name can only be loaded once. That's a
problem for comparing different binaries.
For example, we have an old binary and an updated binary. They very
likely have same name and most of the functions, so only dsos from old
binary will be loaded. When processing the data from updated binary,
perf still use the symbol information from old binary. That's wrong.
Then the commit as below used IP to replace symbol name.
9c443dfdd31e ("perf diff: Fix support for all --sort combinations")
>From that time, perf diff starts to compare the symbol address.
The global dsos is discarded from a patch in 2010.
a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance
from host")
However, at that time, perf diff already compared by address. So perf
diff cannot work for different binaries as well.
This patch actually rolls back the perf diff to original design. The
document is also changed, so everybody knows the original design is to
compare the symbol names.
Here are some examples:
The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.
example_v1.c
noinline void f3(void)
{
volatile int i;
for (i = 0; i < 10000;) {
if(i%2)
i++;
else
i++;
}
}
noinline void f2(void)
{
volatile int a = 100, b, c;
for (b = 0; b < 10000; b++)
c = a * b;
}
noinline void f1(void)
{
f2();
f3();
}
int main()
{
int i;
for (i = 0; i < 100000; i++)
f1();
}
example_v2.c
noinline void f2(void)
{
volatile int a = 100, b, c;
for (b = 0; b < 10000; b++)
c = a * b;
}
noinline void f3(void)
{
volatile int i;
for (i = 0; i < 10000;) {
if(i%2)
i++;
else
i++;
}
}
noinline void f1(void)
{
f2();
f3();
}
int main()
{
int i;
for (i = 0; i < 100000; i++)
f1();
}
[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]
[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]
Old perf diff result:
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
Event 'cycles'
Baseline Delta Shared Object Symbol
........ ....... ................ ...............................
[kernel.vmlinux] [k] __perf_event_task_sched_out
0.00% [kernel.vmlinux] [k] apic_timer_interrupt
[kernel.vmlinux] [k] idle_cpu
[kernel.vmlinux] [k] intel_pstate_timer_func
[kernel.vmlinux] [k] native_read_msr_safe
0.00% [kernel.vmlinux] [k] native_read_tsc
0.00% [kernel.vmlinux] [k] native_write_msr_safe
[kernel.vmlinux] [k] ntp_tick_length
0.00% [kernel.vmlinux] [k] rb_erase
0.00% [kernel.vmlinux] [k] tick_sched_timer
0.00% [kernel.vmlinux] [k] unmap_single_vma
0.00% [kernel.vmlinux] [k] update_wall_time
0.00% example [.] f1
46.24% example [.] f2
53.71% -7.55% example [.] f3
+53.81% example [.] f3
0.02% example [.] main
New perf diff result:
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
[kernel.vmlinux] [k] __perf_event_task_sched_out
0.00% [kernel.vmlinux] [k] apic_timer_interrupt
[kernel.vmlinux] [k] idle_cpu
[kernel.vmlinux] [k] intel_pstate_timer_func
[kernel.vmlinux] [k] native_read_msr_safe
0.00% [kernel.vmlinux] [k] native_read_tsc
0.00% [kernel.vmlinux] [k] native_write_msr_safe
[kernel.vmlinux] [k] ntp_tick_length
0.00% [kernel.vmlinux] [k] rb_erase
0.00% [kernel.vmlinux] [k] tick_sched_timer
0.00% [kernel.vmlinux] [k] unmap_single_vma
0.00% [kernel.vmlinux] [k] update_wall_time
0.00% example [.] f1
46.24% -0.08% example [.] f2
53.71% +0.11% example [.] f3
0.02% example [.] main
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add new buildid cache if the update target file is not cached.
This can happen when an old binary is replaced by new one after caching
the old one. In this case, user sees his operation just failed.
But it does not look straight, since user just pass the binary "path",
not "build-id".
----
# ./perf buildid-cache --add ./perf
(update ./perf to new binary)
# ./perf buildid-cache --update ./perf
./perf wasn't in the cache
#
----
This patch adds given new binary to cache if the new binary is
not cached. So we'll not see the above error.
----
# ./perf buildid-cache --add ./perf
(update ./perf to new binary)
# ./perf buildid-cache --update ./perf
#
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150226065440.23912.1494.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We could end up returning 0 (Ok) with a NULL raw_path. Fix it.
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Link: http://lkml.kernel.org/n/tip-l0kcbcg5f4nnzqt01cv42vec@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix get_real_path to free allocated memory when comp_dir is used for
complementing path and getting an error.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150226082504.28125.74506.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Recent linux kernel provides a blacklist of the functions which can not
be probed. perf probe can now check this blacklist before setting new
events and indicate better error message for users.
Without this patch,
----
# perf probe --add vmalloc_fault
Added new event:
Failed to write event: Invalid argument
Error: Failed to add events.
----
With this patch
----
# perf probe --add vmalloc_fault
Added new event:
Warning: Skipped probing on blacklisted function: vmalloc_fault
----
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150219143113.14434.5387.stgit@localhost.localdomain
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On Sparc64 perf-trace is failing in many spots due to extended load
instructions being used on misaligned accesses.
(gdb) run trace ls
Starting program: /tmp/perf/perf trace ls
[Thread debugging using libthread_db enabled]
Detaching after fork from child process 169460.
<ls output removed>
Program received signal SIGBUS, Bus error.
0x000000000014f4dc in tp_field__u64 (field=0x4cc700, sample=0x7feffffa098) at builtin-trace.c:61
warning: Source file is more recent than executable.
61 TP_UINT_FIELD(64);
(gdb) bt
0 0x000000000014f4dc in tp_field__u64 (field=0x4cc700, sample=0x7feffffa098) at builtin-trace.c:61
1 0x0000000000156ad4 in trace__sys_exit (trace=0x7feffffc268, evsel=0x4cc580, event=0xfffffc0104912000,
sample=0x7feffffa098) at builtin-trace.c:1701
2 0x0000000000158c14 in trace__run (trace=0x7feffffc268, argc=1, argv=0x7fefffff360) at builtin-trace.c:2160
3 0x000000000015b78c in cmd_trace (argc=1, argv=0x7fefffff360, prefix=0x0) at builtin-trace.c:2609
4 0x0000000000107d94 in run_builtin (p=0x4549c8, argc=2, argv=0x7fefffff360) at perf.c:341
5 0x0000000000108140 in handle_internal_command (argc=2, argv=0x7fefffff360) at perf.c:400
6 0x0000000000108308 in run_argv (argcp=0x7feffffef2c, argv=0x7feffffef20) at perf.c:444
7 0x0000000000108728 in main (argc=2, argv=0x7fefffff360) at perf.c:559
(gdb) p *sample
$1 = {ip = 4391276, pid = 169472, tid = 169472, time = 6303014583281250, addr = 0, id = 72082,
stream_id = 18446744073709551615, period = 1, weight = 0, transaction = 0, cpu = 73, raw_size = 36,
data_src = 84410401, flags = 0, insn_len = 0, raw_data = 0xfffffc010491203c, callchain = 0x0,
branch_stack = 0x0, user_regs = {abi = 0, mask = 0, regs = 0x0, cache_regs = 0x7feffffa098, cache_mask = 0},
intr_regs = {abi = 0, mask = 0, regs = 0x0, cache_regs = 0x7feffffa098, cache_mask = 0}, user_stack = {
offset = 0, size = 0, data = 0x0}, read = {time_enabled = 0, time_running = 0, {group = {nr = 0,
values = 0x0}, one = {value = 0, id = 0}}}}
(gdb) p *field
$2 = {offset = 16, {integer = 0x14f4a8 <tp_field__u64>, pointer = 0x14f4a8 <tp_field__u64>}}
sample->raw_data is guaranteed to not be 8-byte aligned because it is preceded
by the size as a u3. So accessing raw data with an extended load instruction causes
the SIGBUS. Resolve by using memcpy to a temporary variable of appropriate size.
Signed-off-by: David Ahern <david.ahern@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424376022-140608-1-git-send-email-david.ahern@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some of the tracers bring their own id or pid fields and we can end up
having two of them. This patch adds a "perf_" prefix to the 'generic'
fields so we avoid a clash of the member names.
The change is visible in the babeltrace output:
Before:
$ babeltrace ./ctf-data/
[03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
[03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
...
Now:
$ babeltrace ./ctf-data/
[03:19:13.962131936] (+0.000001935) cycles: { }, { perf_ip = 0xFFFFFFFF8105443A, perf_tid = 20714, perf_pid = 20714, perf_period = 8 }
[03:19:13.962133732] (+0.000001796) cycles: { }, { perf_ip = 0xFFFFFFFF8105443A, perf_tid = 20714, perf_pid = 20714, perf_period = 114 }
...
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-5-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adding 'perf data convert' to convert perf data file into different
format. This patch adds support for CTF format conversion.
To convert perf.data into CTF run:
$ perf data convert --to-ctf=./ctf-data/
[ perf data convert: Converted 'perf.data' into CTF data './ctf-data/' ]
[ perf data convert: Converted and wrote 11.268 MB (100230 samples) ]
The command will create CTF metadata out of perf.data file (or one
specified via -i option) and then convert all sample events into single
CTF stream.
Each sample_type bit is translated into separated CTF event field apart
from following exceptions:
PERF_SAMPLE_RAW - added in next patch
PERF_SAMPLE_READ - TODO
PERF_SAMPLE_CALLCHAIN - TODO
PERF_SAMPLE_BRANCH_STACK - TODO
PERF_SAMPLE_REGS_USER - TODO
PERF_SAMPLE_STACK_USER - TODO
$ perf --debug=data-convert=2 data convert ...
The converted CTF data could be analyzed by CTF tools, like babletrace
or tracecompass [1].
$ babeltrace ./ctf-data/
[03:19:13.962125533] (+?.?????????) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
[03:19:13.962130001] (+0.000004468) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
[03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
[03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
[03:19:13.962135557] (+0.000001825) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 2087 }
[03:19:13.962137627] (+0.000002070) cycles: { }, { ip = 0xFFFFFFFF81361938, tid = 20714, pid = 20714, period = 37582 }
[03:19:13.962161091] (+0.000023464) cycles: { }, { ip = 0xFFFFFFFF8124218F, tid = 20714, pid = 20714, period = 600246 }
[03:19:13.962517569] (+0.000356478) cycles: { }, { ip = 0xFFFFFFFF811A75DB, tid = 20714, pid = 20714, period = 1325731 }
[03:19:13.969518008] (+0.007000439) cycles: { }, { ip = 0x34080917B2, tid = 20714, pid = 20714, period = 1144298 }
The following members to the ctf-environment were decided to be added to
distinguish and specify perf CTF data:
- domain
It says "kernel" because it contains a kernel trace (not to be
confused with a user space like lttng-ust does)
- tracer_name
It says perf. This can be used to distinguish between lttng and perf
CTF based trace.
- version
The kernel version from stream. In addition to release, this is what
it looks like on a Debian kernel:
release = "3.14-1-amd64";
version = "3.14.0";
[1] http://projects.eclipse.org/projects/tools.tracecompass
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-4-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adding new 'perf data' command to provide operations over data files.
The 'perf data convert' sub command is coming in following patch, but
there's possibility for other useful commands like 'perf data ls' (to
display perf data file in directory in ls style).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-3-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adding feature check for babeltrace library [1], which will be used for
perf data file CTF [2] conversion in following patches.
The babeltrace library is now automatically detected as standard
feature. It's possible to specify LIBBABELTRACE_DIR make variable to
specify location of installed libbabeltrace, like:
$ make LIBBABELTRACE_DIR=/opt/libbabeltrace/
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libunwind: [ on ]
... libbabeltrace: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... DWARF post unwind library: libunwind
NOTE The installation of the [1] to to used by above make:
$ git clone git://git.efficios.com/babeltrace.git
$ cd babeltrace
$ vim README
$ ./bootstrap
$ ./configure --prefix=/opt/libbabeltrace
$ make prefix=/opt/libbabeltrace
$ sudo make install prefix=/opt/libbabeltrace
Please make sure that the /opt/libbabeltrace/lib directory is in your
LD_LIBRARY_PATH:
$ export LD_LIBRARY_PATH=/opt/libbabeltrace/lib
[1] babeltrace - http://www.efficios.com/babeltrace
[2] Common Trace Format - http://www.efficios.com/ctf
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-2-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ Added missing babeltrace build instructions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add an option to perf record to record running/enabled time for read
events, similar to what stat does.
This is useful to understand multiplexing problems.
Right now the report support is not great, but at least report -D
already supports it.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1424819620-16043-1-git-send-email-andi@firstfloor.org
[ Fixed the Documentation entry to match the OPT_BOOLEAN one ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-fmto8ft6jrtwz09dxn5d4z8w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
[root@ssdandy ~]# perf trace --filter-pids 16348
0.000 ( 0.000 ms): tuned/1027 ... [continued]: select()) = 0 Timeout
793.770 ( 0.000 ms): lsmd/895 ... [continued]: select()) = 0 Timeout
793.775 (793.724 ms): tuned/1027 select(tvp: 0x7f7655556e50) ...
perf: Segmentation fault
Obtained 15 stack frames.
perf(dump_stack+0x2e) [0x4ed330]
perf(sighandler_dump_stack+0x2e) [0x4ed40f]
/lib64/libc.so.6(+0x35640) [0x7fa2d5b69640]
perf() [0x4c2d35]
perf(machine__findnew_thread+0x39) [0x4c2ed6]
perf() [0x454a4d]
perf() [0x455f87]
perf() [0x456556]
perf(cmd_trace+0xa7e) [0x4580af]
perf() [0x4867bd]
perf() [0x486a1c]
perf() [0x486b68]
perf(main+0x23b) [0x486ec9]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fa2d5b55af5]
perf() [0x41bd91]
[ root@ssdandy ~]#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-v38cbxcnm2yf5qn9u4y4n9ab@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To use in stdio based tools, like 'trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-79kjmerlw6d88csyx1afzwvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To figure out if ordered_events are being used when doing a flush
operation, it is enough to check if there were in fact some events
queued, i.e. look at oe->nr_events.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1c5r404vy766kt5nflv88uag@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
.. after extensive statistical analysis of my G+ polling, I've come to
the inescapable conclusion that internet polls are bad.
Big surprise.
But "Hurr durr I'ma sheep" trounced "I like online polls" by a 62-to-38%
margin, in a poll that people weren't even supposed to participate in.
Who can argue with solid numbers like that? 5,796 votes from people who
can't even follow the most basic directions?
In contrast, "v4.0" beat out "v3.20" by a slimmer margin of 56-to-44%,
but with a total of 29,110 votes right now.
Now, arguably, that vote spread is only about 3,200 votes, which is less
than the almost six thousand votes that the "please ignore" poll got, so
it could be considered noise.
But hey, I asked, so I'll honor the votes.
|
|
All it wants is session->evlist.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6w9663gka3jb1j1rfxxd5jcq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Further untangling perf_session from plain event delivery routines.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cvz8e6pwyogs4w14582iis9w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pxxm1liohog3d6i826x8sud8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For tools that don't deal with perf.data files, thus do not need to
use perf_session.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-kglq67gvauq9tak02a4se00r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Start to untangle session from delivering samples, as there are
tools that want to use ordered_events and don't use perf_session at all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rn4pk3pjxd78sgzrkn19tktp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Because we need to use ordered_events in some cases, so we will need to
first have them in a queue, order that queue, and then process the
event.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cmkw9zgoh0z4r218957ftp1a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Forgot to do it when adding the feature.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mx152b6x9cgknhw91vsyjlnd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When tracing in X we get event loops due to the tracing activity, i.e.
updates to a gnome-terminal that generate syscalls for X.org, etc.
To get a more useful view of what is happening, syscall wise, system
wide, we need to filter those, like in:
# ps ax|egrep '981|2296|1519' | grep -v egrep
981 tty1 Ss+ 5:40 /usr/bin/Xorg :0 -background none ...
1519 ? Sl 2:22 /usr/bin/gnome-shell
2296 ? Sl 4:16 /usr/libexec/gnome-terminal-server
#
# trace -e write --filter-pids 981,2296,1519
0.385 ( 0.021 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
0.922 ( 0.014 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
5006.525 ( 0.029 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
5007.235 ( 0.023 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
5177.646 ( 0.018 ms): rtkit-daemon/782 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7f7eea70be88, count: 8) = 8
8314.497 ( 0.004 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af7b0, count: 8) = 8
8314.518 ( 0.002 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af0e0, count: 8) = 8
^C#
When this option is used the tracer pid is also filtered.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-f5qmiyy7c0uxdm21ncatpeek@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need to filter multiple pids in trace, i.e. trace itself,
gnome-terminal, X.org, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-frtpkg7qapqwf7asa35wf8am@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To avoid tracing the tracer.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-shmwd1khzpaobr3i0j1ygapg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To filter out events for a certain pid, for instance, when tracing
system wide, so that the tracer itself doesn't creates an event loop.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-byoia9dzu4gmkdv87etnd9zf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When printing just events, i.e. '--no-sys --ev some:events' it makes no
sense to waste screen space.
Before:
# trace --no-sys --ev probe:*
84481.704 ( ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/services")
84481.892 ( ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/services")
84482.230 ( ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/resolv.conf")
84482.481 ( ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/hosts")
85097.725 ( ): probe:vfs_getname:(ffffffff811ed023) pathname="/root"
#
After:
# trace --no-sys --ev probe:*
0.000 probe:vfs_getname:(ffffffff811ed023) pathname="/root")
1.711 probe:vfs_getname:(ffffffff811ed023) pathname="/etc/localtime")
2.103 probe:vfs_getname:(ffffffff811ed023) pathname="/etc/localtime")
^C#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-jhryxgnam8zecq0q0wsy6pyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
X-Coverup: just ask spender
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
use_pde()/unuse_pde() in ->follow_link()/->put_link() resp.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
As it is, we have debugfs_remove() racing with symlink traversals.
Supply ->evict_inode() and do freeing there - inode will remain
pinned until we are done with the symlink body.
And rip the idiocy with checking if dentry is positive right after
we'd verified debugfs_positive(), which is a stronger check...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
I've noticed significant locking contention in memory reclaimer around
sb_lock inside grab_super_passive(). Grab_super_passive() is called from
two places: in icache/dcache shrinkers (function super_cache_scan) and
from writeback (function __writeback_inodes_wb). Both are required for
progress in memory allocator.
Grab_super_passive() acquires sb_lock to increment sb->s_count and check
sb->s_instances. It seems sb->s_umount locked for read is enough here:
super-block deactivation always runs under sb->s_umount locked for write.
Protecting super-block itself isn't a problem: in super_cache_scan() sb
is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers
are still active. Inside writeback super-block comes from inode from bdi
writeback list under wb->list_lock.
This patch removes locking sb_lock and checks s_instances under s_umount:
generic_shutdown_super() unlinks it under sb->s_umount locked for write.
New variant is called trylock_super() and since it only locks semaphore,
callers must call up_read(&sb->s_umount) instead of drop_super(sb) when
they're done.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
rather than d_is_dir() when checking a dir watch and give an error on fake
directories.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack
thereof) in cachefiles:
(1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as
it doesn't want to deal with automounts in its cache.
(2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in
cachefiles.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Convert the following where appropriate:
(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.
In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).
Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.
However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.
There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.
The following perl+coccinelle script was used:
use strict;
my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}
my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );
my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);
foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}
[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid
of direct references to d_inode outside of the VFS.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use d_is_positive() rather than testing dentry->d_inode in Smack to get rid of
direct references to d_inode outside of the VFS.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use d_is_dir() rather than d_inode and S_ISDIR(). Note that this will include
fake directories such as automount triggers.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use d_is_positive(dentry) or d_is_negative(dentry) rather than testing
dentry->d_inode as the dentry may cover another layer that has an inode when
the top layer doesn't or may hold a 0,0 chardev that's actually a whiteout.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
mediated_filesystem() should use dentry->d_sb not dentry->d_inode->i_sb and
should avoid file_inode() also since it is really dealing with the path.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular
files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and
socket files).
d_is_reg() and d_is_special() are added to detect these subtypes and
d_is_file() is left as the union of the two.
This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to
use d_is_reg(dentry) instead.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is
a virtual dentry that covers another one in a lower layer that should be used
instead. This may be recorded on medium if directory integration is stored
there.
The flag can be set with d_set_fallthru() and tested with d_is_fallthru().
Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add DCACHE_WHITEOUT_TYPE and provide a d_is_whiteout() accessor function. A
d_is_miss() accessor is also added for ordinary cache misses and
d_is_negative() is modified to indicate either an ordinary miss or an enforced
miss (whiteout).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Introduce some function for getting the inode (and also the dentry) in an
environment where layered/unioned filesystems are in operation.
The problem is that we have places where we need *both* the union dentry and
the lower source or workspace inode or dentry available, but we can only have
a handle on one of them. Therefore we need to derive the handle to the other
from that.
The idea is to introduce an extra field in struct dentry that allows the union
dentry to refer to and pin the lower dentry.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The use of READ_ONCE() causes lots of warnings witht he pending paravirt
spinlock fixes, because those ends up having passing a member to a
'const' structure to READ_ONCE().
There should certainly be nothing wrong with using READ_ONCE() with a
const source, but the helper function __read_once_size() would cause
warnings because it would drop the 'const' qualifier, but also because
the destination would be marked 'const' too due to the use of 'typeof'.
Use a union of types in READ_ONCE() to avoid this issue.
Also make sure to use parenthesis around the macro arguments to avoid
possible operator precedence issues.
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.
Like other functions in the throttle code, just checking for a NULL
stats_cpu prevents the following oops caused by that race.
[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300 Not tainted (3.19.0)
[ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 42008884 XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018
And here is one code that allows to easily reproduce this, although this
has first been found by running docker.
void run(pid_t pid)
{
int n;
int status;
int fd;
char *buffer;
buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
fd = open(CGPATH "/test/tasks", O_WRONLY);
write(fd, buffer, n);
close(fd);
if (fork() > 0) {
fd = open("/dev/sda", O_RDONLY | O_DIRECT);
read(fd, buffer, 512);
close(fd);
wait(&status);
} else {
fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
n = read(fd, buffer, BUFFER_SIZE);
close(fd);
}
free(buffer);
exit(0);
}
void test(void)
{
int status;
mkdir(CGPATH "/test", 0666);
if (fork() > 0)
wait(&status);
else
run(getpid());
rmdir(CGPATH "/test");
}
int main(int argc, char **argv)
{
int i;
for (i = 0; i < NR_TESTS; i++)
test();
return 0;
}
Reported-by: Ricardo Marin Matinata <rmm@br.ibm.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Commit e9de688dac65 ("irqchip: mips-gic: Support local interrupts")
updated several platforms. This is a copy paste error.
Signed-off-by: Niklas Cassel <niklass@axis.com>
Reviewed-by: Andrew Bresticker <abrestic@chromium.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9245/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|