With commit d3119bc985fb645 ("LoongArch: Fix callchain parse error with
kernel tracepoint events"), perf can parse the kernel callchain, but the
result is incomplete and sometimes wrong. The reason is that LoongArch's
unwinders (guess, prologue and orc) don't really need fp (i.e., regs[22]),
and they expect sp (i.e., regs[3]) to hold the frame address rather than
the current stack pointer.
Fix that by removing the assignment of regs[22] and instead assigning
__builtin_frame_address(0) to regs[3].
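The register capture described above can be sketched in userspace as
follows. This is an illustration only, not the kernel's code: struct
sketch_regs, SKETCH_FETCH_CALLER_REGS and sketch_capture are hypothetical
names, and the struct is a simplified stand-in for the kernel's
struct pt_regs. It assumes GCC or Clang, which provide
__builtin_frame_address() and __builtin_return_address().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct pt_regs. */
struct sketch_regs {
    unsigned long regs[32];  /* regs[3] models sp, regs[22] models fp */
    unsigned long csr_era;   /* models era, the pc to start unwinding from */
};

/*
 * Written as a macro so that __builtin_frame_address(0) is evaluated in
 * the frame of the function taking the snapshot, not in a helper's frame.
 * Only era and regs[3] are filled; regs[22] is deliberately left alone.
 */
#define SKETCH_FETCH_CALLER_REGS(r, ip) do {                      \
    (r)->csr_era = (unsigned long)(ip);                           \
    (r)->regs[3] = (unsigned long)__builtin_frame_address(0);     \
} while (0)

/* Example caller: snapshot the state an unwinder would start from. */
static void sketch_capture(struct sketch_regs *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    SKETCH_FETCH_CALLER_REGS(r, __builtin_return_address(0));
}
```

Calling sketch_capture() on a zeroed structure leaves regs[22] at zero and
stores the frame address in regs[3], which mirrors the assignment this
commit message describes.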
Without fix:
    Children      Self  Command        Shared Object      Symbol
    ........  ........  .............  .................  ................
      33.91%    33.91%  swapper        [kernel.vmlinux]   [k] __schedule
            |
            |--33.04%--__schedule
            |
             --0.87%--__arch_cpu_idle
                      __schedule
With this fix:
    Children      Self  Command        Shared Object      Symbol
    ........  ........  .............  .................  ................
      31.16%    31.16%  swapper        [kernel.vmlinux]   [k] __schedule
            |
            |--20.63%--smpboot_entry
            |          cpu_startup_entry
            |          schedule_idle
            |          __schedule
            |
             --10.53%--start_kernel
                       cpu_startup_entry
                       schedule_idle
                       __schedule
Fixes: d3119bc985fb645 ("LoongArch: Fix callchain parse error with kernel tracepoint events")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
To fix perf's callchain parse error on LoongArch, we implement
perf_arch_fetch_caller_regs(), which fills the registers needed for
callchain unwinding: sp, fp, and era. This is similar to the following
commits:
commit b3eac0265bf6:
("arm: perf: Fix callchain parse error with kernel tracepoint events")
commit 5b09a094f2fb:
("arm64: perf: Fix callchain parse error with kernel tracepoint events")
commit 9a7e8ec0d4cc:
("riscv: perf: Fix callchain parse error with kernel tracepoint events")
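A minimal userspace sketch of capturing the three registers named above
(sp, fp and era) might look like this. All names here (struct demo_regs,
DEMO_FETCH_CALLER_REGS, demo_capture) are hypothetical, and the stack
pointer is only approximated by the address of a local variable, since
portable userspace C cannot read the sp register directly. It assumes GCC
or Clang for the two builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct pt_regs. */
struct demo_regs {
    unsigned long regs[32];  /* regs[3] models sp, regs[22] models fp */
    unsigned long csr_era;   /* models era, the pc to start unwinding from */
};

/*
 * A macro, so that the frame-address builtin and the stack marker live
 * in the frame of the function that wants the snapshot.
 * Assumption: the address of a fresh local approximates the stack
 * pointer; kernel code would read the real sp register instead.
 */
#define DEMO_FETCH_CALLER_REGS(r, ip) do {                        \
    volatile char demo_sp_marker = 0;                             \
    (r)->csr_era = (unsigned long)(ip);                           \
    (r)->regs[3] = (unsigned long)&demo_sp_marker;                \
    (r)->regs[22] = (unsigned long)__builtin_frame_address(0);    \
} while (0)

/* Example caller: fill sp, fp and era for a later unwind. */
static void demo_capture(struct demo_regs *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    DEMO_FETCH_CALLER_REGS(r, __builtin_return_address(0));
}
```

After demo_capture() returns, all three fields are populated, which is the
minimum an unwinder needs to begin walking the stack from the tracepoint.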
Test with commands:
perf record -e sched:sched_switch -g --call-graph dwarf
perf report
Without this patch:
    Children      Self  Command          Shared Object      Symbol
    ........  ........  .............    .................  ....................
      43.41%    43.41%  swapper          [unknown]          [k] 0000000000000000
      10.94%    10.94%  loong-container  [unknown]          [k] 0000000000000000
            |
            |--5.98%--0x12006ba38
            |
            |--2.56%--0x12006bb84
            |
             --2.40%--0x12006b6b8
With this patch, the callchain can be parsed correctly:
    Children      Self  Command          Shared Object      Symbol
    ........  ........  .............    .................  ....................
      47.57%    47.57%  swapper          [kernel.vmlinux]   [k] __schedule
            |
            ---__schedule
      26.76%    26.76%  loong-container  [kernel.vmlinux]   [k] __schedule
            |
            |--13.78%--0x12006ba38
            |          |
            |          |--9.19%--__schedule
            |          |
            |           --4.59%--handle_syscall
            |                    do_syscall
            |                    sys_futex
            |                    do_futex
            |                    futex_wait
            |                    futex_wait_queue_me
            |                    hrtimer_start_range_ns
            |                    __schedule
            |
            |--8.38%--0x12006bb84
            |         handle_syscall
            |         do_syscall
            |         sys_epoll_pwait
            |         do_epoll_wait
            |         schedule_hrtimeout_range_clock
            |         hrtimer_start_range_ns
            |         __schedule
            |
             --4.59%--0x12006b6b8
                      handle_syscall
                      do_syscall
                      sys_nanosleep
                      hrtimer_nanosleep
                      do_nanosleep
                      hrtimer_start_range_ns
                      __schedule
Cc: stable@vger.kernel.org
Fixes: b37042b2bb7cd751f0 ("LoongArch: Add perf events support")
Reported-by: Youling Tang <tangyouling@kylinos.cn>
Suggested-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The perf events infrastructure of LoongArch is very similar to that of the
old MIPS-based Loongson, so most of the code is derived from MIPS.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add some other common headers for basic LoongArch support.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|