aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/trace/ftrace.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-11-27Merge tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-45/+568
Pull tracing updates from Steven Rostedt: "New tracing features: - New PERMANENT flag to ftrace_ops when attaching a callback to a function. As /proc/sys/kernel/ftrace_enabled when set to zero will disable all attached callbacks in ftrace, this has a detrimental impact on live kernel tracing, as it disables all that it patched. If a ftrace_ops is registered to ftrace with the PERMANENT flag set, it will prevent ftrace_enabled from being disabled, and if ftrace_enabled is already disabled, it will prevent a ftrace_ops with PREMANENT flag set from being registered. - New register_ftrace_direct(). As eBPF would like to register its own trampolines to be called by the ftrace nop locations directly, without going through the ftrace trampoline, this function has been added. This allows for eBPF trampolines to live along side of ftrace, perf, kprobe and live patching. It also utilizes the ftrace enabled_functions file that keeps track of functions that have been modified in the kernel, to allow for security auditing. - Allow for kernel internal use of ftrace instances. Subsystems in the kernel can now create and destroy their own tracing instances which allows them to have their own tracing buffer, and be able to record events without worrying about other users from writing over their data. - New seq_buf_hex_dump() that lets users use the hex_dump() in their seq_buf usage. - Notifications now added to tracing_max_latency to allow user space to know when a new max latency is hit by one of the latency tracers. - Wider spread use of generic compare operations for use of bsearch and friends. - More synthetic event fields may be defined (32 up from 16) - Use of xarray for architectures with sparse system calls, for the system call trace events. This along with small clean ups and fixes" * tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (51 commits) tracing: Enable syscall optimization for MIPS tracing: Use xarray for syscall trace events tracing: Sample module to demonstrate kernel access to Ftrace instances. tracing: Adding new functions for kernel access to Ftrace instances tracing: Fix Kconfig indentation ring-buffer: Fix typos in function ring_buffer_producer ftrace: Use BIT() macro ftrace: Return ENOTSUPP when DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not configured ftrace: Rename ftrace_graph_stub to ftrace_stub_graph ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimization ftrace: Add helper find_direct_entry() to consolidate code ftrace: Add another check for match in register_ftrace_direct() ftrace: Fix accounting bug with direct->count in register_ftrace_direct() ftrace/selftests: Fix spelling mistake "wakeing" -> "waking" tracing: Increase SYNTH_FIELDS_MAX for synthetic_events ftrace/samples: Add a sample module that implements modify_ftrace_direct() ftrace: Add modify_ftrace_direct() tracing: Add missing "inline" in stub function of latency_fsnotify() tracing: Remove stray tab in TRACE_EVAL_MAP_FILE's help text tracing: Use seq_buf_hex_dump() to dump buffers ...
2019-11-18ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimizationSteven Rostedt (VMware)1-32/+88
If a direct ftrace callback is at a location that does not have any other ftrace helpers attached to it, it is possible to simply just change the text to call the new caller (if the architecture supports it). But this requires special architecture code. Currently, modify_ftrace_direct() uses a trick to add a stub ftrace callback to the location forcing it to call the ftrace iterator. Then it can change the direct helper to call the new function in C, and then remove the stub. Removing the stub will have the location now call the new location that the direct helper is using. The new helper function does the registering the stub trick, but is a weak function, allowing an architecture to override it to do something a bit more direct. Link: https://lore.kernel.org/r/20191115215125.mbqv7taqnx376yed@ast-mbp.dhcp.thefacebook.com Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-15ftrace: Add helper find_direct_entry() to consolidate codeSteven Rostedt (VMware)1-30/+29
Both unregister_ftrace_direct() and modify_ftrace_direct() needs to normalize the ip passed in to match the rec->ip, as it is acceptable to have the ip on the ftrace call site but not the start. There are also common validity checks with the record found by the ip, these should be done for both unregister_ftrace_direct() and modify_ftrace_direct(). Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-15ftrace: Add another check for match in register_ftrace_direct()Steven Rostedt (VMware)1-1/+6
As an instruction pointer passed into register_ftrace_direct() may just exist on the ftrace call site, but may not be the start of the call site itself, register_ftrace_direct() still needs to update test if a direct call exists on the normalized site, as only one direct call is allowed at any one time. Fixes: 763e34e74bb7d ("ftrace: Add register_ftrace_direct()") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-15ftrace: Fix accounting bug with direct->count in register_ftrace_direct()Steven Rostedt (VMware)1-2/+1
The direct->count wasn't being updated properly, where it only was updated when the first entry was added, but should be updated every time. Fixes: 013bf0da04748 ("ftrace: Add ftrace_find_direct_func()") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-14ftrace: Add modify_ftrace_direct()Steven Rostedt (VMware)1-0/+78
Add a new function modify_ftrace_direct() that will allow a user to update an existing direct caller to a new trampoline, without missing hits due to unregistering one and then adding another. Link: https://lore.kernel.org/r/20191109022907.6zzo6orhxpt5n2sv@ast-mbp.dhcp.thefacebook.com Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-14tracing: Use generic type for comparator functionAndy Shevchenko1-6/+6
Comparator function type, cmp_func_t, is defined in the types.h, use it in the code. Link: http://lkml.kernel.org/r/20191007135656.37734-3-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-13ftrace: Add information on number of page groups allocatedSteven Rostedt (VMware)1-0/+14
Looking for ways to shrink the size of the dyn_ftrace structure, knowing the information about how many pages and the number of groups of those pages, is useful in working out the best ways to save on memory. This adds one info print on how many groups of pages were used to allocate the ftrace dyn_ftrace structures, and also shows the number of pages and groups in the dyn_ftrace_total_info (which is used for debugging). Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-13ftrace/x86: Add a counter to test function_graph with directSteven Rostedt (VMware)1-0/+4
As testing for direct calls from the function graph tracer adds a little overhead (which is a lot when tracing every function), add a counter that can be used to test if function_graph tracer needs to test for a direct caller or not. It would have been nicer if we could use a static branch, but the static branch logic fails when used within the function graph tracer trampoline. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-13ftrace: Add ftrace_find_direct_func()Steven Rostedt (VMware)1-1/+78
As function_graph tracer modifies the return address to insert a trampoline to trace the return of a function, it must be aware of a direct caller, as when it gets called, the function's return address may not be at on the stack where it expects. It may have to see if that return address points to the a direct caller and adjust if it is. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-13ftrace: Add register_ftrace_direct()Steven Rostedt (VMware)1-5/+264
Add the start of the functionality to allow other trampolines to use the ftrace mcount/fentry/nop location. This adds two new functions: register_ftrace_direct() and unregister_ftrace_direct() Both take two parameters: the first is the instruction address of where the mcount/fentry/nop exists, and the second is the trampoline to have that location called. This will handle cases where ftrace is already used on that same location, and will make it still work, where the registered direct called trampoline will get called after all the registered ftrace callers are handled. Currently, it will not allow for IP_MODIFY functions to be called at the same locations, which include some kprobes and live kernel patching. At this point, no architecture supports this. This is only the start of implementing the framework. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-08ftrace: Separate out functionality from ftrace_location_range()Steven Rostedt (VMware)1-15/+23
Create a new function called lookup_rec() from the functionality of ftrace_location_range(). The difference between lookup_rec() is that it returns the record that it finds, where as ftrace_location_range() returns only if it found a match or not. The lookup_rec() is static, and can be used for new functionality where ftrace needs to find a record of a specific address. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-08ftrace: Separate out the copying of a ftrace_hash from __ftrace_hash_move()Steven Rostedt (VMware)1-12/+17
Most of the functionality of __ftrace_hash_move() can be reused, but not all of it. That is, __ftrace_hash_move() is used to simply make a new hash from an existing one, using the same size as the original. Creating a dup_hash(), where we can specify a new size will be useful when we want to create a hash with a default size, or simply copy the old one. Signed-off-by: Steven Rostedt (VMWare) <rostedt@goodmis.org>
2019-11-06ftrace: add ftrace_init_nop()Mark Rutland1-3/+3
Architectures may need to perform special initialization of ftrace callsites, and today they do so by special-casing ftrace_make_nop() when the expected branch address is MCOUNT_ADDR. In some cases (e.g. for patchable-function-entry), we don't have an mcount-like symbol and don't want a synthetic MCOUNT_ADDR, but we may need to perform some initialization of callsites. To make it possible to separate initialization from runtime modification, and to handle cases without an mcount-like symbol, this patch adds an optional ftrace_init_nop() function that architectures can implement, which does not pass a branch address. Where an architecture does not provide ftrace_init_nop(), we will fall back to the existing behaviour of calling ftrace_make_nop() with MCOUNT_ADDR. At the same time, ftrace_code_disable() is renamed to ftrace_nop_initialize() to make it clearer that it is intended to intialize a callsite into a disabled state, and is not for disabling a callsite that has been runtime enabled. The kerneldoc description of rec arguments is updated to cover non-mcount callsites. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com>
2019-11-04ftrace: Introduce PERMANENT ftrace_ops flagMiroslav Benes1-2/+21
Livepatch uses ftrace for redirection to new patched functions. It means that if ftrace is disabled, all live patched functions are disabled as well. Toggling global 'ftrace_enabled' sysctl thus affect it directly. It is not a problem per se, because only administrator can set sysctl values, but it still may be surprising. Introduce PERMANENT ftrace_ops flag to amend this. If the FTRACE_OPS_FL_PERMANENT is set on any ftrace ops, the tracing cannot be disabled by disabling ftrace_enabled. Equally, a callback with the flag set cannot be registered if ftrace_enabled is disabled. Link: http://lkml.kernel.org/r/20191016113316.13415-2-mbenes@suse.cz Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12tracing: Add locked_down checks to the open calls of files created for tracefsSteven Rostedt (VMware)1-1/+22
Added various checks on open tracefs calls to see if tracefs is in lockdown mode, and if so, to return -EPERM. Note, the event format files (which are basically standard on all machines) as well as the enabled_functions file (which shows what is currently being traced) are not lockde down. Perhaps they should be, but it seems counter intuitive to lockdown information to help you know if the system has been modified. Link: http://lkml.kernel.org/r/CAHk-=wj7fGPKUspr579Cii-w_y60PtRaiDgKuxVtBAMK0VNNkA@mail.gmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12tracing: Add tracing_check_open_get_tr()Steven Rostedt (VMware)1-3/+4
Currently, most files in the tracefs directory test if tracing_disabled is set. If so, it should return -ENODEV. The tracing_disabled is called when tracing is found to be broken. Originally it was done in case the ring buffer was found to be corrupted, and we wanted to prevent reading it from crashing the kernel. But it's also called if a tracing selftest fails on boot. It's a one way switch. That is, once it is triggered, tracing is disabled until reboot. As most tracefs files can also be used by instances in the tracefs directory, they need to be carefully done. Each instance has a trace_array associated to it, and when the instance is removed, the trace_array is freed. But if an instance is opened with a reference to the trace_array, then it requires looking up the trace_array to get its ref counter (as there could be a race with it being deleted and the open itself). Once it is found, a reference is added to prevent the instance from being removed (and the trace_array associated with it freed). Combine the two checks (tracing_disabled and trace_array_get()) into a single helper function. This will also make it easier to add lockdown to tracefs later. Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12ftrace: Get a reference counter for the trace_array on filter filesSteven Rostedt (VMware)1-9/+18
The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for an instance now. They need to take a reference to the instance otherwise there could be a race between accessing the files and deleting the instance. It wasn't until the :mod: caching where these file operations started referencing the trace_array directly. Cc: stable@vger.kernel.org Fixes: 673feb9d76ab3 ("ftrace: Add :mod: caching infrastructure to trace_array") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-09-20Merge tag 'trace-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-5/+1
Pull tracing updates from Steven Rostedt: - Addition of multiprobes to kprobe and uprobe events (allows for more than one probe attached to the same location) - Addition of adding immediates to probe parameters - Clean up of the recordmcount.c code. This brings us closer to merging recordmcount into objtool, and reuse code. - Other small clean ups * tag 'trace-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) selftests/ftrace: Update kprobe event error testcase tracing/probe: Reject exactly same probe event tracing/probe: Fix to allow user to enable events on unloaded modules selftests/ftrace: Select an existing function in kprobe_eventname test tracing/kprobe: Fix NULL pointer access in trace_porbe_unlink() tracing: Make sure variable reference alias has correct var_ref_idx tracing: Be more clever when dumping hex in __print_hex() ftrace: Simplify ftrace hash lookup code in clear_func_from_hash() tracing: Add "gfp_t" support in synthetic_events tracing: Rename tracing_reset() to tracing_reset_cpu() tracing: Document the stack trace algorithm in the comments tracing/arm64: Have max stack tracer handle the case of return address after data recordmcount: Clarify what cleanup() does recordmcount: Remove redundant cleanup() calls recordmcount: Kernel style formatting recordmcount: Kernel style function signature formatting recordmcount: Rewrite error/success handling selftests/ftrace: Add syntax error test for multiprobe selftests/ftrace: Add syntax error test for immediates selftests/ftrace: Add a testcase for kprobe multiprobe event ...
2019-09-17ftrace: Simplify ftrace hash lookup code in clear_func_from_hash()Changbin Du1-5/+1
Function ftrace_lookup_ip() will check empty hash table. So we don't need extra check outside. Link: http://lkml.kernel.org/r/20190910143336.13472-1-changbin.du@gmail.com Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-09-16Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull scheduler updates from Ingo Molnar: - MAINTAINERS: Add Mark Rutland as perf submaintainer, Juri Lelli and Vincent Guittot as scheduler submaintainers. Add Dietmar Eggemann, Steven Rostedt, Ben Segall and Mel Gorman as scheduler reviewers. As perf and the scheduler is getting bigger and more complex, document the status quo of current responsibilities and interests, and spread the review pain^H^H^H^H fun via an increase in the Cc: linecount generated by scripts/get_maintainer.pl. :-) - Add another series of patches that brings the -rt (PREEMPT_RT) tree closer to mainline: split the monolithic CONFIG_PREEMPT dependencies into a new CONFIG_PREEMPTION category that will allow the eventual introduction of CONFIG_PREEMPT_RT. Still a few more hundred patches to go though. - Extend the CPU cgroup controller with uclamp.min and uclamp.max to allow the finer shaping of CPU bandwidth usage. - Micro-optimize energy-aware wake-ups from O(CPUS^2) to O(CPUS). - Improve the behavior of high CPU count, high thread count applications running under cpu.cfs_quota_us constraints. - Improve balancing with SCHED_IDLE (SCHED_BATCH) tasks present. - Improve CPU isolation housekeeping CPU allocation NUMA locality. - Fix deadline scheduler bandwidth calculations and logic when cpusets rebuilds the topology, or when it gets deadline-throttled while it's being offlined. - Convert the cpuset_mutex to percpu_rwsem, to allow it to be used from setscheduler() system calls without creating global serialization. Add new synchronization between cpuset topology-changing events and the deadline acceptance tests in setscheduler(), which were broken before. - Rework the active_mm state machine to be less confusing and more optimal. - Rework (simplify) the pick_next_task() slowpath. - Improve load-balancing on AMD EPYC systems. - ... and misc cleanups, smaller fixes and improvements - please see the Git log for more details. * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) sched/psi: Correct overly pessimistic size calculation sched/fair: Speed-up energy-aware wake-ups sched/uclamp: Always use 'enum uclamp_id' for clamp_id values sched/uclamp: Update CPU's refcount on TG's clamp changes sched/uclamp: Use TG's clamps to restrict TASK's clamps sched/uclamp: Propagate system defaults to the root group sched/uclamp: Propagate parent clamps sched/uclamp: Extend CPU's cgroup controller sched/topology: Improve load balancing on AMD EPYC systems arch, ia64: Make NUMA select SMP sched, perf: MAINTAINERS update, add submaintainers and reviewers sched/fair: Use rq_lock/unlock in online_fair_sched_group cpufreq: schedutil: fix equation in comment sched: Rework pick_next_task() slow-path sched: Allow put_prev_task() to drop rq->lock sched/fair: Expose newidle_balance() sched: Add task_struct pointer to sched_class::set_curr_task sched: Rework CPU hotplug task selection sched/{rt,deadline}: Fix set_next_task vs pick_next_task sched: Fix kerneldoc comment for ia64_set_curr_task ...
2019-08-30ftrace: Check for successful allocation of hashNaveen N. Rao1-0/+5
In register_ftrace_function_probe(), we are not checking the return value of alloc_and_copy_ftrace_hash(). The subsequent call to ftrace_match_records() may end up dereferencing the same. Add a check to ensure this doesn't happen. Link: http://lkml.kernel.org/r/26e92574f25ad23e7cafa3cf5f7a819de1832cbe.1562249521.git.naveen.n.rao@linux.vnet.ibm.com Cc: stable@vger.kernel.org Fixes: 1ec3a81a0cf42 ("ftrace: Have each function probe use its own ftrace_ops") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-08-30ftrace: Check for empty hash and comment the race with registering probesSteven Rostedt (VMware)1-1/+9
The race between adding a function probe and reading the probes that exist is very subtle. It needs a comment. Also, the issue can also happen if the probe has has the EMPTY_HASH as its func_hash. Cc: stable@vger.kernel.org Fixes: 7b60f3d876156 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-08-30ftrace: Fix NULL pointer dereference in t_probe_next()Naveen N. Rao1-0/+4
LTP testsuite on powerpc results in the below crash: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc00000000029d800 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV ... CPU: 68 PID: 96584 Comm: cat Kdump: loaded Tainted: G W NIP: c00000000029d800 LR: c00000000029dac4 CTR: c0000000001e6ad0 REGS: c0002017fae8ba10 TRAP: 0300 Tainted: G W MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28022422 XER: 20040000 CFAR: c00000000029d90c DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0 ... NIP [c00000000029d800] t_probe_next+0x60/0x180 LR [c00000000029dac4] t_mod_start+0x1a4/0x1f0 Call Trace: [c0002017fae8bc90] [c000000000cdbc40] _cond_resched+0x10/0xb0 (unreliable) [c0002017fae8bce0] [c0000000002a15b0] t_start+0xf0/0x1c0 [c0002017fae8bd30] [c0000000004ec2b4] seq_read+0x184/0x640 [c0002017fae8bdd0] [c0000000004a57bc] sys_read+0x10c/0x300 [c0002017fae8be30] [c00000000000b388] system_call+0x5c/0x70 The test (ftrace_set_ftrace_filter.sh) is part of ftrace stress tests and the crash happens when the test does 'cat $TRACING_PATH/set_ftrace_filter'. The address points to the second line below, in t_probe_next(), where filter_hash is dereferenced: hash = iter->probe->ops.func_hash->filter_hash; size = 1 << hash->size_bits; This happens due to a race with register_ftrace_function_probe(). A new ftrace_func_probe is created and added into the func_probes list in trace_array under ftrace_lock. However, before initializing the filter, we drop ftrace_lock, and re-acquire it after acquiring regex_lock. If another process is trying to read set_ftrace_filter, it will be able to acquire ftrace_lock during this window and it will end up seeing a NULL filter_hash. Fix this by just checking for a NULL filter_hash in t_probe_next(). If the filter_hash is NULL, then this probe is just being added and we can simply return from here. Link: http://lkml.kernel.org/r/05e021f757625cbbb006fad41380323dbe4e3b43.1562249521.git.naveen.n.rao@linux.vnet.ibm.com Cc: stable@vger.kernel.org Fixes: 7b60f3d876156 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-07-31tracing: Use CONFIG_PREEMPTIONThomas Gleixner1-1/+1
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality which today depends on CONFIG_PREEMPT. Switch the conditionals in the tracer over to CONFIG_PREEMPTION. This is the first step to make the tracer work on RT. The other small tweaks are submitted separately. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190726212124.409766323@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-18Merge tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-23/+25
Pull tracing updates from Steven Rostedt: "The main changes in this release include: - Add user space specific memory reading for kprobes - Allow kprobes to be executed earlier in boot The rest are mostly just various clean ups and small fixes" * tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) tracing: Make trace_get_fields() global tracing: Let filter_assign_type() detect FILTER_PTR_STRING tracing: Pass type into tracing_generic_entry_update() ftrace/selftest: Test if set_event/ftrace_pid exists before writing ftrace/selftests: Return the skip code when tracing directory not configured in kernel tracing/kprobe: Check registered state using kprobe tracing/probe: Add trace_event_call accesses APIs tracing/probe: Add probe event name and group name accesses APIs tracing/probe: Add trace flag access APIs for trace_probe tracing/probe: Add trace_event_file access APIs for trace_probe tracing/probe: Add trace_event_call register API for trace_probe tracing/probe: Add trace_probe init and free functions tracing/uprobe: Set print format when parsing command tracing/kprobe: Set print format right after parsed command kprobes: Fix to init kprobes in subsys_initcall tracepoint: Use struct_size() in kmalloc() ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS ftrace: Enable trampoline when rec count returns back to one tracing/kprobe: Do not run kprobe boot tests if kprobe_event is on cmdline tracing: Make a separate config for trace event self tests ...
2019-06-28ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()Petr Mladek1-9/+1
The commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") causes a possible deadlock between register_kprobe() and ftrace_run_update_code() when ftrace is using stop_machine(). The existing dependency chain (in reverse order) is: -> #1 (text_mutex){+.+.}: validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 __mutex_lock+0x88/0x908 mutex_lock_nested+0x32/0x40 register_kprobe+0x254/0x658 init_kprobes+0x11a/0x168 do_one_initcall+0x70/0x318 kernel_init_freeable+0x456/0x508 kernel_init+0x22/0x150 ret_from_fork+0x30/0x34 kernel_thread_starter+0x0/0xc -> #0 (cpu_hotplug_lock.rw_sem){++++}: check_prev_add+0x90c/0xde0 validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 cpus_read_lock+0x62/0xd0 stop_machine+0x2e/0x60 arch_ftrace_update_code+0x2e/0x40 ftrace_run_update_code+0x40/0xa0 ftrace_startup+0xb2/0x168 register_ftrace_function+0x64/0x88 klp_patch_object+0x1a2/0x290 klp_enable_patch+0x554/0x980 do_one_initcall+0x70/0x318 do_init_module+0x6e/0x250 load_module+0x1782/0x1990 __s390x_sys_finit_module+0xaa/0xf0 system_call+0xd8/0x2d0 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); It is similar problem that has been solved by the commit 2d1e38f56622b9b ("kprobes: Cure hotplug lock ordering issues"). Many locks are involved. To be on the safe side, text_mutex must become a low level lock taken after cpu_hotplug_lock.rw_sem. This can't be achieved easily with the current ftrace design. For example, arm calls set_all_modules_text_rw() already in ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c. This functions is called: + outside stop_machine() from ftrace_run_update_code() + without stop_machine() from ftrace_module_enable() Fortunately, the problematic fix is needed only on x86_64. It is the only architecture that calls set_all_modules_text_rw() in ftrace path and supports livepatching at the same time. Therefore it is enough to move text_mutex handling from the generic kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c: ftrace_arch_code_modify_prepare() ftrace_arch_code_modify_post_process() This patch basically reverts the ftrace part of the problematic commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race"). And provides x86_64 specific-fix. Some refactoring of the ftrace code will be needed when livepatching is implemented for arm or nds32. These architectures call set_all_modules_text_rw() and use stop_machine() at the same time. Link: http://lkml.kernel.org/r/20190627081334.12793-1-pmladek@suse.com Fixes: 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") Acked-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Petr Mladek <pmladek@suse.com> [ As reviewed by Miroslav Benes <mbenes@suse.cz>, removed return value of ftrace_run_update_code() as it is a void function. ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-06-14ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper()Wei Li1-2/+5
The mapper may be NULL when called from register_ftrace_function_probe() with probe->data == NULL. This issue can be reproduced as follow (it may be covered by compiler optimization sometime): / # cat /sys/kernel/debug/tracing/set_ftrace_filter #### all functions enabled #### / # echo foo_bar:dump > /sys/kernel/debug/tracing/set_ftrace_filter [ 206.949100] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 206.952402] Mem abort info: [ 206.952819] ESR = 0x96000006 [ 206.955326] Exception class = DABT (current EL), IL = 32 bits [ 206.955844] SET = 0, FnV = 0 [ 206.956272] EA = 0, S1PTW = 0 [ 206.956652] Data abort info: [ 206.957320] ISV = 0, ISS = 0x00000006 [ 206.959271] CM = 0, WnR = 0 [ 206.959938] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000419f3a000 [ 206.960483] [0000000000000000] pgd=0000000411a87003, pud=0000000411a83003, pmd=0000000000000000 [ 206.964953] Internal error: Oops: 96000006 [#1] SMP [ 206.971122] Dumping ftrace buffer: [ 206.973677] (ftrace buffer empty) [ 206.975258] Modules linked in: [ 206.976631] Process sh (pid: 281, stack limit = 0x(____ptrval____)) [ 206.978449] CPU: 10 PID: 281 Comm: sh Not tainted 5.2.0-rc1+ #17 [ 206.978955] Hardware name: linux,dummy-virt (DT) [ 206.979883] pstate: 60000005 (nZCv daif -PAN -UAO) [ 206.980499] pc : free_ftrace_func_mapper+0x2c/0x118 [ 206.980874] lr : ftrace_count_free+0x68/0x80 [ 206.982539] sp : ffff0000182f3ab0 [ 206.983102] x29: ffff0000182f3ab0 x28: ffff8003d0ec1700 [ 206.983632] x27: ffff000013054b40 x26: 0000000000000001 [ 206.984000] x25: ffff00001385f000 x24: 0000000000000000 [ 206.984394] x23: ffff000013453000 x22: ffff000013054000 [ 206.984775] x21: 0000000000000000 x20: ffff00001385fe28 [ 206.986575] x19: ffff000013872c30 x18: 0000000000000000 [ 206.987111] x17: 0000000000000000 x16: 0000000000000000 [ 206.987491] x15: ffffffffffffffb0 x14: 0000000000000000 [ 206.987850] x13: 000000000017430e x12: 0000000000000580 [ 206.988251] x11: 0000000000000000 x10: cccccccccccccccc [ 206.988740] x9 : 0000000000000000 x8 : ffff000013917550 [ 206.990198] x7 : ffff000012fac2e8 x6 : ffff000012fac000 [ 206.991008] x5 : ffff0000103da588 x4 : 0000000000000001 [ 206.991395] x3 : 0000000000000001 x2 : ffff000013872a28 [ 206.991771] x1 : 0000000000000000 x0 : 0000000000000000 [ 206.992557] Call trace: [ 206.993101] free_ftrace_func_mapper+0x2c/0x118 [ 206.994827] ftrace_count_free+0x68/0x80 [ 206.995238] release_probe+0xfc/0x1d0 [ 206.995555] register_ftrace_function_probe+0x4a8/0x868 [ 206.995923] ftrace_trace_probe_callback.isra.4+0xb8/0x180 [ 206.996330] ftrace_dump_callback+0x50/0x70 [ 206.996663] ftrace_regex_write.isra.29+0x290/0x3a8 [ 206.997157] ftrace_filter_write+0x44/0x60 [ 206.998971] __vfs_write+0x64/0xf0 [ 206.999285] vfs_write+0x14c/0x2f0 [ 206.999591] ksys_write+0xbc/0x1b0 [ 206.999888] __arm64_sys_write+0x3c/0x58 [ 207.000246] el0_svc_common.constprop.0+0x408/0x5f0 [ 207.000607] el0_svc_handler+0x144/0x1c8 [ 207.000916] el0_svc+0x8/0xc [ 207.003699] Code: aa0003f8 a9025bf5 aa0103f5 f946ea80 (f9400303) [ 207.008388] ---[ end trace 7b6d11b5f542bdf1 ]--- [ 207.010126] Kernel panic - not syncing: Fatal exception [ 207.011322] SMP: stopping secondary CPUs [ 207.013956] Dumping ftrace buffer: [ 207.014595] (ftrace buffer empty) [ 207.015632] Kernel Offset: disabled [ 207.017187] CPU features: 0x002,20006008 [ 207.017985] Memory Limit: none [ 207.019825] ---[ end Kernel panic - not syncing: Fatal exception ]--- Link: http://lkml.kernel.org/r/20190606031754.10798-1-liwei391@huawei.com Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-06-14module: Fix livepatch/ftrace module text permissions raceJosh Poimboeuf1-1/+9
It's possible for livepatch and ftrace to be toggling a module's text permissions at the same time, resulting in the following panic: BUG: unable to handle page fault for address: ffffffffc005b1d9 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3ea0c067 P4D 3ea0c067 PUD 3ea0e067 PMD 3cc13067 PTE 3b8a1061 Oops: 0003 [#1] PREEMPT SMP PTI CPU: 1 PID: 453 Comm: insmod Tainted: G O K 5.2.0-rc1-a188339ca5 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 RIP: 0010:apply_relocate_add+0xbe/0x14c Code: fa 0b 74 21 48 83 fa 18 74 38 48 83 fa 0a 75 40 eb 08 48 83 38 00 74 33 eb 53 83 38 00 75 4e 89 08 89 c8 eb 0a 83 38 00 75 43 <89> 08 48 63 c1 48 39 c8 74 2e eb 48 83 38 00 75 32 48 29 c1 89 08 RSP: 0018:ffffb223c00dbb10 EFLAGS: 00010246 RAX: ffffffffc005b1d9 RBX: 0000000000000000 RCX: ffffffff8b200060 RDX: 000000000000000b RSI: 0000004b0000000b RDI: ffff96bdfcd33000 RBP: ffffb223c00dbb38 R08: ffffffffc005d040 R09: ffffffffc005c1f0 R10: ffff96bdfcd33c40 R11: ffff96bdfcd33b80 R12: 0000000000000018 R13: ffffffffc005c1f0 R14: ffffffffc005e708 R15: ffffffff8b2fbc74 FS: 00007f5f447beba8(0000) GS:ffff96bdff900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffc005b1d9 CR3: 000000003cedc002 CR4: 0000000000360ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: klp_init_object_loaded+0x10f/0x219 ? preempt_latency_start+0x21/0x57 klp_enable_patch+0x662/0x809 ? virt_to_head_page+0x3a/0x3c ? kfree+0x8c/0x126 patch_init+0x2ed/0x1000 [livepatch_test02] ? 0xffffffffc0060000 do_one_initcall+0x9f/0x1c5 ? kmem_cache_alloc_trace+0xc4/0xd4 ? do_init_module+0x27/0x210 do_init_module+0x5f/0x210 load_module+0x1c41/0x2290 ? fsnotify_path+0x3b/0x42 ? strstarts+0x2b/0x2b ? kernel_read+0x58/0x65 __do_sys_finit_module+0x9f/0xc3 ? __do_sys_finit_module+0x9f/0xc3 __x64_sys_finit_module+0x1a/0x1c do_syscall_64+0x52/0x61 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The above panic occurs when loading two modules at the same time with ftrace enabled, where at least one of the modules is a livepatch module: CPU0 CPU1 klp_enable_patch() klp_init_object_loaded() module_disable_ro() ftrace_module_enable() ftrace_arch_code_modify_post_process() set_all_modules_text_ro() klp_write_object_relocations() apply_relocate_add() *patches read-only code* - BOOM A similar race exists when toggling ftrace while loading a livepatch module. Fix it by ensuring that the livepatch and ftrace code patching operations -- and their respective permissions changes -- are protected by the text_mutex. Link: http://lkml.kernel.org/r/ab43d56ab909469ac5d2520c5d944ad6d4abd476.1560474114.git.jpoimboe@redhat.com Reported-by: Johannes Erdfelt <johannes@erdfelt.com> Fixes: 444d13ff10fb ("modules: add ro_after_init support") Acked-by: Jessica Yu <jeyu@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-06-14tracing: avoid build warning with HAVE_NOP_MCOUNTVasily Gorbik1-3/+2
Selecting HAVE_NOP_MCOUNT enables -mnop-mcount (if gcc supports it) and sets CC_USING_NOP_MCOUNT. Reuse __is_defined (which is suitable for testing CC_USING_* defines) to avoid conditional compilation and fix the following gcc 9 warning on s390: kernel/trace/ftrace.c:2514:1: warning: ‘ftrace_code_disable’ defined but not used [-Wunused-function] Link: http://lkml.kernel.org/r/patch.git-1a82d13f33ac.your-ad-here.call-01559732716-ext-6629@work.hours Fixes: 2f4df0017baed ("tracing: Add -mcount-nop option support") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-25ftrace: Enable trampoline when rec count returns back to oneCheng Jian1-13/+15
Custom trampolines can only be enabled if there is only a single ops attached to it. If there's only a single callback registered to a function, and the ops has a trampoline registered for it, then we can call the trampoline directly. This is very useful for improving the performance of ftrace and livepatch. If more than one callback is registered to a function, the general trampoline is used, and the custom trampoline is not restored back to the direct call even if all the other callbacks were unregistered and we are back to one callback for the function. To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented to one, and the ops that left has a trampoline. Testing After this patch : insmod livepatch_unshare_files.ko cat /sys/kernel/debug/tracing/enabled_functions unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0 echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter echo function > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150 echo nop > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0 Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@huawei.com Signed-off-by: Cheng Jian <cj.chengjian@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-25ftrace: Make enable and update parameters bool when applicableSteven Rostedt (VMware)1-10/+10
The code modification functions have "enable" and "update" variables that are sometimes "int" but used as "bool". Remove the ambiguity and make them "bool" when they are only used for true or false values. Link: http://lkml.kernel.org/r/e1429923d9eda92a3cf5ee9e33c7eacce539781d.1558115654.git.naveen.n.rao@linux.vnet.ibm.com Reported-by: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-15Merge tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-5/+4
Pull tracing updates from Steven Rostedt: "The major changes in this tracing update includes: - Removal of non-DYNAMIC_FTRACE from 32bit x86 - Removal of mcount support from x86 - Emulating a call from int3 on x86_64, fixes live kernel patching - Consolidated Tracing Error logs file Minor updates: - Removal of klp_check_compiler_support() - kdb ftrace dumping output changes - Accessing and creating ftrace instances from inside the kernel - Clean up of #define if macro - Introduction of TRACE_EVENT_NOP() to disable trace events based on config options And other minor fixes and clean ups" * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) x86: Hide the int3_emulate_call/jmp functions from UML livepatch: Remove klp_check_compiler_support() ftrace/x86: Remove mcount support ftrace/x86_32: Remove support for non DYNAMIC_FTRACE tracing: Simplify "if" macro code tracing: Fix documentation about disabling options using trace_options tracing: Replace kzalloc with kcalloc tracing: Fix partial reading of trace event's id file tracing: Allow RCU to run between postponed startup tests tracing: Fix white space issues in parse_pred() function tracing: Eliminate const char[] auto variables ring-buffer: Fix mispelling of Calculate tracing: probeevent: Fix to make the type of $comm string tracing: probeevent: Do not accumulate on ret variable tracing: uprobes: Re-enable $comm support for uprobe events ftrace/x86_64: Emulate call function while updating in breakpoint handler x86_64: Allow breakpoints to emulate call instructions x86_64: Add gap to int3 to allow for call emulation tracing: kdb: Allow ftdump to skip all but the last few entries tracing: Add trace_total_entries() / trace_total_entries_cpu() ...
2019-05-08tracing: Eliminate const char[] auto variablesRasmus Villemoes1-1/+1
Automatic const char[] variables cause unnecessary code generation. For example, the this_mod variable leads to 3f04: 48 b8 5f 5f 74 68 69 73 5f 6d movabs $0x6d5f736968745f5f,%rax # __this_m 3f0e: 4c 8d 44 24 02 lea 0x2(%rsp),%r8 3f13: 48 8d 7c 24 10 lea 0x10(%rsp),%rdi 3f18: 48 89 44 24 02 mov %rax,0x2(%rsp) 3f1d: 4c 89 e9 mov %r13,%rcx 3f20: b8 65 00 00 00 mov $0x65,%eax # e 3f25: 48 c7 c2 00 00 00 00 mov $0x0,%rdx 3f28: R_X86_64_32S .rodata.str1.1+0x18d 3f2c: be 48 00 00 00 mov $0x48,%esi 3f31: c7 44 24 0a 6f 64 75 6c movl $0x6c75646f,0xa(%rsp) # odul 3f39: 66 89 44 24 0e mov %ax,0xe(%rsp) i.e., the string gets built on the stack at runtime. Similar code can be found for the other instances I'm replacing here. Putting the string in .rodata reduces the combined .text+.rodata size and saves time and stack space at runtime. The simplest fix, and what I've done for the this_mod case, is to just make the variable static. However, for the "<faulted>" case where the same string is used twice, that prevents the linker from merging those two literals, so instead use a macro - that also keeps the two instances automatically in sync (instead of only the compile-time strlen expression). Finally, for the two runs of spaces, it turns out that the "build these strings on the stack" is not the worst part of what gcc does - it turns print_func_help_header_irq() into "if (tgid) { /* print_event_info + five seq_printf calls */ } else { /* print event_info + another five seq_printf */}". Taking inspiration from a suggestion from Al Viro, use %.*s to make snprintf either stop after the first two spaces or print the whole string. As a bonus, the seq_printfs now fit on single lines (at least, they are not longer than the existing ones in the function just above), making it easier to see that the ascii art lines up. x86-64 defconfig + CONFIG_FUNCTION_TRACER: $ scripts/stackdelta /tmp/stackusage.{0,1} ./kernel/trace/ftrace.c ftrace_mod_callback 152 136 -16 ./kernel/trace/trace.c trace_default_header 56 32 -24 ./kernel/trace/trace.c tracing_mark_raw_write 96 72 -24 ./kernel/trace/trace.c tracing_mark_write 104 80 -24 bloat-o-meter add/remove: 1/0 grow/shrink: 0/4 up/down: 14/-375 (-361) Function old new delta this_mod - 14 +14 ftrace_mod_callback 577 542 -35 tracing_mark_raw_write 444 374 -70 tracing_mark_write 616 540 -76 trace_default_header 600 406 -194 Link: http://lkml.kernel.org/r/20190320081757.6037-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-19kprobes: Mark ftrace mcount handler functions nokprobeMasami Hiramatsu1-1/+5
Mark ftrace mcount handler functions nokprobe since probing on these functions with kretprobe pushes return address incorrectly on kretprobe shadow stack. Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com> Tested-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-11ftrace: Do not process STUB functions in ftrace_ops_list_func()Steven Rostedt (VMware)1-0/+3
The function_graph tracer has a stub function and its ops flag has the FTRACE_OPS_FL_STUB set. As the function graph does not use the ftrace_ops->func pointer but instead is called by a separate part of the ftrace trampoline. The function_graph tracer still requires to pass in a ftrace_ops that may also hold the hash of the functions to call. But there's no reason to test that hash in the function tracing portion. Instead of testing to see if we should call the stub function, just test if the ops has FTRACE_OPS_FL_STUB set, and just skip it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-10ftrace: Remove ASSIGN_OPS_HASH() macro from ftrace.cSteven Rostedt (VMware)1-4/+0
The ASSIGN_OPS_HASH() macro was moved to fgraph.c where it was used, but for some reason it wasn't removed from ftrace.c, as it is no longer referenced there. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-03-26ftrace: Fix warning using plain integer as NULL & spelling correctionsHariprasad Kelam1-6/+6
Changed 0 --> NULL to avoid sparse warning Corrected spelling mistakes reported by checkpatch.pl Sparse warning below: sudo make C=2 CF=-D__CHECK_ENDIAN__ M=kernel/trace CHECK kernel/trace/ftrace.c kernel/trace/ftrace.c:3007:24: warning: Using plain integer as NULL pointer kernel/trace/ftrace.c:4758:37: warning: Using plain integer as NULL pointer Link: http://lkml.kernel.org/r/20190323183523.GA2244@hari-Inspiron-1545 Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-15ftrace: Allow enabling of filters via index of available_filter_functionsSteven Rostedt (VMware)1-0/+30
Enabling of large number of functions by echoing in a large subset of the functions in available_filter_functions can take a very long time. The process requires testing all functions registered by the function tracer (which is in the 10s of thousands), and doing a kallsyms lookup to convert the ip address into a name, then comparing that name with the string passed in. When a function causes the function tracer to crash the system, a binary bisect of the available_filter_functions can be done to find the culprit. But this requires passing in half of the functions in available_filter_functions over and over again, which makes it basically a O(n^2) operation. With 40,000 functions, that ends up bing 1,600,000,000 opertions! And enabling this can take over 20 minutes. As a quick speed up, if a number is passed into one of the filter files, instead of doing a search, it just enables the function at the corresponding line of the available_filter_functions file. That is: # echo 50 > set_ftrace_filter # cat set_ftrace_filter x86_pmu_commit_txn # head -50 available_filter_functions | tail -1 x86_pmu_commit_txn This allows setting of half the available_filter_functions to take place in less than a second! # time seq 20000 > set_ftrace_filter real 0m0.042s user 0m0.005s sys 0m0.015s # wc -l set_ftrace_filter 20000 set_ftrace_filter Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-31Merge tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-437/+53
Pull tracing updates from Steven Rostedt: - Rework of the kprobe/uprobe and synthetic events to consolidate all the dynamic event code. This will make changes in the future easier. - Partial rewrite of the function graph tracing infrastructure. This will allow for multiple users of hooking onto functions to get the callback (return) of the function. This is the ground work for having kprobes and function graph tracer using one code base. - Clean up of the histogram code that will facilitate adding more features to the histograms in the future. - Addition of str_has_prefix() and a few use cases. There currently is a similar function strstart() that is used in a few places, but only returns a bool and not a length. These instances will be removed in the future to use str_has_prefix() instead. - A few other various clean ups as well. * tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits) tracing: Use the return of str_has_prefix() to remove open coded numbers tracing: Have the historgram use the result of str_has_prefix() for len of prefix tracing: Use str_has_prefix() instead of using fixed sizes tracing: Use str_has_prefix() helper for histogram code string.h: Add str_has_prefix() helper function tracing: Make function ‘ftrace_exports’ static tracing: Simplify printf'ing in seq_print_sym tracing: Avoid -Wformat-nonliteral warning tracing: Merge seq_print_sym_short() and seq_print_sym_offset() tracing: Add hist trigger comments for variable-related fields tracing: Remove hist trigger synth_var_refs tracing: Use hist trigger's var_ref array to destroy var_refs tracing: Remove open-coding of hist trigger var_ref management tracing: Use var_refs[] for hist trigger reference checking tracing: Change strlen to sizeof for hist trigger static strings tracing: Remove unnecessary hist trigger struct field tracing: Fix ftrace_graph_get_ret_stack() to use task and not current seq_buf: Use size_t for len in seq_buf_puts() seq_buf: Make seq_buf_puts() null-terminate the buffer arm64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack ...
2018-12-26Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-12/+12
Pull RCU updates from Ingo Molnar: "The biggest RCU changes in this cycle were: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) rcutorture: Don't do busted forward-progress testing rcutorture: Use 100ms buckets for forward-progress callback histograms rcutorture: Recover from OOM during forward-progress tests rcutorture: Print forward-progress test age upon failure rcutorture: Print time since GP end upon forward-progress failure rcutorture: Print histogram of CB invocation at OOM time rcutorture: Print GP age upon forward-progress failure rcu: Print per-CPU callback counts for forward-progress failures rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings rcutorture: Dump grace-period diagnostics upon forward-progress OOM rcutorture: Prepare for asynchronous access to rcu_fwd_startat torture: Remove unnecessary "ret" variables rcutorture: Affinity forward-progress test to avoid housekeeping CPUs rcutorture: Break up too-long rcu_torture_fwd_prog() function rcutorture: Remove cbflood facility torture: Bring any extra CPUs online during kernel startup rcutorture: Add call_rcu() flooding forward-progress tests rcutorture/formal: Replace synchronize_sched() with synchronize_rcu() tools/kernel.h: Replace synchronize_sched() with synchronize_rcu() net/decnet: Replace rcu_barrier_bh() with rcu_barrier() ...
2018-12-11tracing: Fix memory leak of instance function hash filtersSteven Rostedt (VMware)1-0/+1
The following commands will cause a memory leak: # cd /sys/kernel/tracing # mkdir instances/foo # echo schedule > instance/foo/set_ftrace_filter # rmdir instances/foo The reason is that the hashes that hold the filters to set_ftrace_filter and set_ftrace_notrace are not freed if they contain any data on the instance and the instance is removed. Found by kmemleak detector. Cc: stable@vger.kernel.org Fixes: 591dffdade9f ("ftrace: Allow for function tracing instance to filter functions") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-10ftrace: Allow ftrace_replace_code() to be schedulableSteven Rostedt (VMware)1-3/+16
The function ftrace_replace_code() is the ftrace engine that does the work to modify all the nops into the calls to the function callback in all the functions being traced. The generic version which is normally called from stop machine, but an architecture can implement a non stop machine version and still use the generic ftrace_replace_code(). When an architecture does this, ftrace_replace_code() may be called from a schedulable context, where it can allow the code to be preemptible, and schedule out. In order to allow an architecture to make ftrace_replace_code() schedulable, a new command flag is added called: FTRACE_MAY_SLEEP Which can be or'd to the command that is passed to ftrace_modify_all_code() that calls ftrace_replace_code() and will have it call cond_resched() in the loop that modifies the nops into the calls to the ftrace trampolines. Link: http://lkml.kernel.org/r/20181204192903.8193-1-anders.roxell@linaro.org Link: http://lkml.kernel.org/r/20181205183303.828422192@goodmis.org Reported-by: Anders Roxell <anders.roxell@linaro.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-08function_graph: Have profiler use new helper ftrace_graph_get_ret_stack()Steven Rostedt (VMware)1-10/+11
The ret_stack processing is going to change, and that is going to break anything that is accessing the ret_stack directly. One user is the function graph profiler. By using the ftrace_graph_get_ret_stack() helper function, the profiler can access the ret_stack entry without relying on the implementation details of the stack itself. Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-08fgraph: Add new fgraph_ops structure to enable function graph hooksSteven Rostedt (VMware)1-3/+7
Currently the registering of function graph is to pass in a entry and return function. We need to have a way to associate those functions together where the entry can determine to run the return hook. Having a structure that contains both functions will facilitate the process of converting the code to be able to do such. This is similar to the way function hooks are enabled (it passes in ftrace_ops). Instead of passing in the functions to use, a single structure is passed in to the registering function. The unregister function is now passed in the fgraph_ops handle. When we allow more than one callback to the function graph hooks, this will let the system know which one to remove. Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-08fgraph: Move function graph specific code into fgraph.cSteven Rostedt (VMware)1-361/+7
To make the function graph infrastructure more managable, the code needs to be in its own file (fgraph.c). Move the code that is specific for managing the function graph infrastructure out of ftrace.c and into fgraph.c Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-08ftrace: Create new ftrace_internal.h headerSteven Rostedt (VMware)1-62/+14
In order to move function graph infrastructure into its own file (fgraph.h) it needs to access various functions and variables in ftrace.c that are currently static. Create a new file called ftrace-internal.h that holds the function prototypes and the extern declarations of the variables needed by fgraph.c as well, and make them global in ftrace.c such that they can be used outside that file. Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-04Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcuIngo Molnar1-12/+12
Pull RCU changes from Paul E. McKenney: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-27function_graph: Have profiler use curr_ret_stack and not depthSteven Rostedt (VMware)1-2/+2
The profiler uses trace->depth to find its entry on the ret_stack, but the depth may not match the actual location of where its entry is (if an interrupt were to preempt the processing of the profiler for another function, the depth and the curr_ret_stack will be different). Have it use the curr_ret_stack as the index to find its ret_stack entry instead of using the depth variable, as that is no longer guaranteed to be the same. Cc: stable@kernel.org Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-27function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stackSteven Rostedt (VMware)1-0/+3
Currently, the depth of the ret_stack is determined by curr_ret_stack index. The issue is that there's a race between setting of the curr_ret_stack and calling of the callback attached to the return of the function. Commit 03274a3ffb44 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") moved the calling of the callback to after the setting of the curr_ret_stack, even stating that it was safe to do so, when in fact, it was the reason there was a barrier() there (yes, I should have commented that barrier()). Not only does the curr_ret_stack keep track of the current call graph depth, it also keeps the ret_stack content from being overwritten by new data. The function profiler, uses the "subtime" variable of ret_stack structure and by moving the curr_ret_stack, it allows for interrupts to use the same structure it was using, corrupting the data, and breaking the profiler. To fix this, there needs to be two variables to handle the call stack depth and the pointer to where the ret_stack is being used, as they need to change at two different locations. Cc: stable@kernel.org Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>