aboutsummaryrefslogtreecommitdiffstats
path: root/include/trace
AgeCommit message (Collapse)AuthorFilesLines
2017-11-14Merge branch 'for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds1-8/+33
Pull btrfs updates from David Sterba: "There are some new user features and the usual load of invisible enhancements or cleanups. New features: - extend mount options to specify zlib compression level, -o compress=zlib:9 - v2 of ioctl "extent to inode mapping", addressing a usecase where we want to retrieve more but inaccurate results and do the postprocessing in userspace, aiding defragmentation or deduplication tools - populate compression heuristics logic, do data sampling and try to guess compressibility by: looking for repeated patterns, counting unique byte values and distribution, calculating Shannon entropy; this will need more benchmarking and possibly fine tuning, but the base should be good enough - enable indexing for btrfs as lower filesystem in overlayfs - speedup page cache readahead during send on large files Internal enhancements: - more sanity checks of b-tree items when reading them from disk - more EINVAL/EUCLEAN fixups, missing BLK_STS_* conversion, other errno or error handling fixes - remove some homegrown IO-related logic, that's been obsoleted by core block layer changes (batching, plug/unplug, own counters) - add ref-verify, optional debugging feature to verify extent reference accounting - simplify code handling outstanding extents, make it more clear where and how the accounting is done - make delalloc reservations per-inode, simplify the code and make the logic more straightforward - extensive cleanup of delayed refs code Notable fixes: - fix send ioctl on 32bit with 64bit kernel" * 'for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (102 commits) btrfs: Fix bug for misused dev_t when lookup in dev state hash table. Btrfs: heuristic: add Shannon entropy calculation Btrfs: heuristic: add byte core set calculation Btrfs: heuristic: add byte set calculation Btrfs: heuristic: add detection of repeated data patterns Btrfs: heuristic: implement sampling logic Btrfs: heuristic: add bucket and sample counters and other defines Btrfs: compression: separate heuristic/compression workspaces btrfs: move btrfs_truncate_block out of trans handle btrfs: don't call btrfs_start_delalloc_roots in flushoncommit btrfs: track refs in a rb_tree instead of a list btrfs: add a comp_refs() helper btrfs: switch args for comp_*_refs btrfs: make the delalloc block rsv per inode btrfs: add tracepoints for outstanding extents mods Btrfs: rework outstanding_extents btrfs: increase output size for LOGICAL_INO_V2 ioctl btrfs: add a flags argument to LOGICAL_INO and call it LOGICAL_INO_V2 btrfs: add a flag to iterate_inodes_from_logical to find all extent refs for uncompressed extents btrfs: send: remove unused code ...
2017-11-13Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+201
Pull irq core updates from Thomas Gleixner: "A rather large update for the interrupt core code and the irq chip drivers: - Add a new bitmap matrix allocator and supporting changes, which is used to replace the x86 vector allocator which comes with separate pull request. This allows to replace the convoluted nested loop allocation function in x86 with a facility which supports the recently added property of managed interrupts proper and allows to switch to a best effort vector reservation scheme, which addresses problems with vector exhaustion. - A large update to the ARM GIC-V3-ITS driver adding support for range selectors. - New interrupt controllers: - Meson and Meson8 GPIO - BCM7271 L2 - Socionext EXIU If you expected that this will stop at some point, I have to disappoint you. There are new ones posted already. Sigh! - STM32 interrupt controller support for new platforms. - A pile of fixes, cleanups and updates to the MIPS GIC driver - The usual small fixes, cleanups and updates all over the place. Most visible one is to move the irq chip drivers Kconfig switches into a separate Kconfig menu" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) genirq: Fix type of shifting literal 1 in __setup_irq() irqdomain: Drop pointless NULL check in virq_debug_show_one genirq/proc: Return proper error code when irq_set_affinity() fails irq/work: Use llist_for_each_entry_safe irqchip: mips-gic: Print warning if inherited GIC base is used irqchip/mips-gic: Add pr_fmt and reword pr_* messages irqchip/stm32: Move the wakeup on interrupt mask irqchip/stm32: Fix initial values irqchip/stm32: Add stm32h7 support dt-bindings/interrupt-controllers: Add compatible string for stm32h7 irqchip/stm32: Add multi-bank management irqchip/stm32: Select GENERIC_IRQ_CHIP irqchip/exiu: Add support for Socionext Synquacer EXIU controller dt-bindings: Add description of Socionext EXIU interrupt controller irqchip/gic-v3-its: Fix VPE activate callback return value irqchip: mips-gic: Make IPI bitmaps static irqchip: mips-gic: Share register writes in gic_set_type() irqchip: mips-gic: Remove gic_vpes variable irqchip: mips-gic: Use num_possible_cpus() to reserve IPIs irqchip: mips-gic: Configure EIC when CPUs come online ...
2017-11-13afs: Protect call->state changes against signalsDavid Howells1-0/+30
Protect call->state changes against the call being prematurely terminated due to a signal. What can happen is that a signal causes afs_wait_for_call_to_complete() to abort an afs_call because it's not yet complete whilst afs_deliver_to_call() is delivering data to that call. If the data delivery causes the state to change, this may overwrite the state of the afs_call, making it not-yet-complete again - but no further notifications will be forthcoming from AF_RXRPC as the rxrpc call has been aborted and completed, so kAFS will just hang in various places waiting for that call or on page bits that need clearing by that call. A tracepoint to monitor call state changes is also provided. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13afs: Trace page dirty/cleanDavid Howells1-0/+39
Add a trace event that logs the dirtying and cleaning of pages attached to AFS inodes. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13afs: Fix directory read/modify raceDavid Howells1-0/+21
Because parsing of the directory wasn't being done under any sort of lock, the pages holding the directory content can get invalidated whilst the parsing is ongoing. Further, the directory page check function gets called outside of the page lock, so if the page gets cleared or updated, this may return reports of bad magic numbers in the directory page. Also, the directory may change size whilst checking and parsing are ongoing, so more care needs to be taken here. Fix this by: (1) Perform the page check from the page filling function before we set PageUptodate and drop the page lock. (2) Check for the file having shrunk and the page having been abandoned before checking the page contents. (3) Lock the page whilst parsing it for the directory iterator. Whilst we're at it, add a tracepoint to report check failure. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13afs: Trace the sending of pagesDavid Howells1-0/+61
Add a pair of tracepoints to log the sending of pages for an FS.StoreData or FS.StoreData64 operation. Tracepoint afs_send_pages notes each set of pages added to the operation. There may be several of these per operation as we get up at most 8 contiguous pages in one go because the bvec we're using is on the stack. Tracepoint afs_sent_pages notes the end of adding data from a whole run of pages to the operation and the completion of the request phase. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13afs: Trace the initiation and completion of client callsDavid Howells1-0/+142
Add tracepoints to trace the initiation and completion of client calls within the kafs filesystem. The afs_make_vl_call tracepoint watches calls to the volume location database server. The afs_make_fs_call tracepoint watches calls to the file server. The afs_call_done tracepoint watches for call completion. Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-10net: allow per netns sysctl_rmem and sysctl_wmem for protosEric Dumazet1-1/+1
As we want to gradually implement per netns sysctl_rmem and sysctl_wmem on per protocol basis, add two new fields in struct proto, and two new helpers : sk_get_wmem0() and sk_get_rmem0() First user will be TCP. Then UDP and SCTP can be easily converted, while DECNET probably wont get this support. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar90-0/+90
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06f2fs: trace checkpoint reason in fsync()Chao Yu1-6/+18
This patch slightly changes need_do_checkpoint to return the detail info that indicates why we need do checkpoint, then caller could print it with trace message. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller90-0/+90
Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03dax: Implement dax_finish_sync_fault()Jan Kara1-0/+2
Implement a function that filesystems can call to finish handling of synchronous page faults. It takes care of syncing appropriare file range and insertion of page table entry. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-03dax: Inline dax_pmd_insert_mapping() into the callsiteJan Kara1-1/+0
dax_pmd_insert_mapping() has only one callsite and we will need to further fine tune what it does for synchronous faults. Just inline it into the callsite so that we don't have to pass awkward bools around. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-03tcp: add tracepoint trace_tcp_retransmit_synack()Song Liu1-0/+56
This tracepoint can be used to trace synack retransmits. It maintains pointer to struct request_sock. We cannot simply reuse trace_tcp_retransmit_skb() here, because the sk here is the LISTEN socket. The IP addresses and ports should be extracted from struct request_sock. Note that, like many other tracepoints, this patch uses IS_ENABLED in TP_fast_assign macro, which triggers sparse warning like: ./include/trace/events/tcp.h:274:1: error: directive in argument list ./include/trace/events/tcp.h:281:1: error: directive in argument list However, there is no good solution to avoid these warnings. To the best of our knowledge, these warnings are harmless. Signed-off-by: Song Liu <songliubraving@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02Merge tag 'v4.14-rc3' into irq/irqchip-4.15Marc Zyngier1-7/+12
Required merge to get mainline irqchip updates. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman90-0/+90
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01btrfs: add tracepoints for outstanding extents modsJosef Bacik1-0/+21
This is handy for tracing problems with modifying the outstanding extents counters. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30btrfs: remove delayed_ref_node from ref_headJosef Bacik1-8/+5
This is just excessive information in the ref_head, and makes the code complicated. It is a relic from when we had the heads and the refs in the same tree, which is no longer the case. With this removal I've cleaned up a bunch of the cruft around this old assumption as well. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30btrfs: declare TRACE_DEFINE_ENUM for each of show_flush_state enumAnand Jain1-0/+7
So that perf can show the state symbol. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-26f2fs: trace f2fs_readdirChao Yu1-0/+29
This patch adds trace for f2fs_readdir. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: trace f2fs_lookupChao Yu1-0/+56
This patch adds trace for f2fs_lookup. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: trace f2fs_remove_discardChao Yu1-0/+7
This patch adds tracepoint to trace f2fs_remove_discard. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-25bpf: permit multiple bpf attachments for a single perf eventYonghong Song1-3/+3
This patch enables multiple bpf attachments for a kprobe/uprobe/tracepoint single trace event. Each trace_event keeps a list of attached perf events. When an event happens, all attached bpf programs will be executed based on the order of attachment. A global bpf_event_mutex lock is introduced to protect prog_array attaching and detaching. An alternative will be introduce a mutex lock in every trace_event_call structure, but it takes a lot of extra memory. So a global bpf_event_mutex lock is a good compromise. The bpf prog detachment involves allocation of memory. If the allocation fails, a dummy do-nothing program will replace to-be-detached program in-place. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24tcp: add tracepoint trace_tcp_set_state()Song Liu1-0/+76
This patch adds tracepoint trace_tcp_set_state. Besides usual fields (s/d ports, IP addresses), old and new state of the socket is also printed with TP_printk, with __print_symbolic(). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24tcp: add tracepoint trace_tcp_destroy_sockSong Liu1-0/+7
This patch adds trace event trace_tcp_destroy_sock. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24tcp: add tracepoint trace_tcp_receive_resetSong Liu1-0/+66
New tracepoint trace_tcp_receive_reset is added and called from tcp_reset(). This tracepoint is define with a new class tcp_event_sk. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24tcp: add tracepoint trace_tcp_send_resetSong Liu1-0/+11
New tracepoint trace_tcp_send_reset is added and called from tcp_v4_send_reset(), tcp_v6_send_reset() and tcp_send_active_reset(). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24tcp: mark trace event arguments sk and skb as constSong Liu1-4/+4
Some functions that we plan to add trace points require const sk and/or skb. So we mark these fields as const in the tracepoint. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24tcp: add trace event class tcp_event_sk_skbSong Liu1-1/+14
Introduce event class tcp_event_sk_skb for tcp tracepoints that have arguments sk and skb. Existing tracepoint trace_tcp_retransmit_skb() falls into this class. This patch rewrites the definition of trace_tcp_retransmit_skb() with tcp_event_sk_skb. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21ipv6: let trace_fib6_table_lookup() dereference the fib tablePaolo Abeni1-3/+3
The perf traces for ipv6 routing code show a relevant cost around trace_fib6_table_lookup(), even if no trace is enabled. This is due to the fib6_table de-referencing currently performed by the caller. Let's the tracing code pay this overhead, passing to the trace helper the table pointer. This gives small but measurable performance improvement under UDP flood. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20tcp: Remove use of inet6_sk and add IPv6 checks to tracepointDavid Ahern1-3/+5
386fd5da401d ("tcp: Check daddr_cache before use in tracepoint") was the second version of the tracepoint fixup patch. This patch is the delta between v2 and v3. Specifically, remove the use of inet6_sk and check sk_family as requested by Eric and add IS_ENABLED(CONFIG_IPV6) around the use of sk_v6_rcv_saddr and sk_v6_daddr as done in sock_common (noted by Cong). Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18tcp: Check daddr_cache before use in tracepointDavid Ahern1-4/+4
Running perf in one window to capture tcp_retransmit_skb tracepoint: $ perf record -e tcp:tcp_retransmit_skb -a And causing a retransmission on an active TCP session (e.g., dropping packets in the receiver, changing MTU on the interface to 500 and back to 1500) triggers a panic: [ 58.543144] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 58.545300] IP: perf_trace_tcp_retransmit_skb+0xd0/0x145 [ 58.546770] PGD 0 P4D 0 [ 58.547472] Oops: 0000 [#1] SMP [ 58.548328] Modules linked in: vrf [ 58.549262] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc4+ #26 [ 58.551004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 58.554560] task: ffffffff81a0e540 task.stack: ffffffff81a00000 [ 58.555817] RIP: 0010:perf_trace_tcp_retransmit_skb+0xd0/0x145 [ 58.557137] RSP: 0018:ffff88003fc03d68 EFLAGS: 00010282 [ 58.558292] RAX: 0000000000000000 RBX: ffffe8ffffc0ec80 RCX: ffff880038543098 [ 58.559850] RDX: 0400000000000000 RSI: ffff88003fc03d70 RDI: ffff88003fc14b68 [ 58.561099] RBP: ffff88003fc03da8 R08: 0000000000000000 R09: ffffea0000d3224a [ 58.562005] R10: ffff88003fc03db8 R11: 0000000000000010 R12: ffff8800385428c0 [ 58.562930] R13: ffffe8ffffc0e478 R14: ffffffff81a93a40 R15: ffff88003d4f0c00 [ 58.563845] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 58.564873] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 58.565613] CR2: 0000000000000008 CR3: 000000003d68f004 CR4: 00000000000606f0 [ 58.566538] Call Trace: [ 58.566865] <IRQ> [ 58.567140] __tcp_retransmit_skb+0x4ab/0x4c6 [ 58.567704] ? tcp_set_ca_state+0x22/0x3f [ 58.568231] tcp_retransmit_skb+0x14/0xa3 [ 58.568754] tcp_retransmit_timer+0x472/0x5e3 [ 58.569324] ? tcp_write_timer_handler+0x1e9/0x1e9 [ 58.569946] tcp_write_timer_handler+0x95/0x1e9 [ 58.570548] tcp_write_timer+0x2a/0x58 Check that daddr_cache is non-NULL before de-referencing. Fixes: e086101b150a ("tcp: add a tracepoint for tcp retransmission") Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18tcp: Use pI6c in tcp tracepointDavid Ahern1-1/+1
The compact form for IPv6 addresses is more user friendly than the full version. For example: compact: 2001:db8:1::1 full: 2001:0db8:0001:0000:0000:0000:0000:0004i Update the tcp tracepoint to show the compact form. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18bpf: cpumap add tracepointsJesper Dangaard Brouer1-0/+70
This adds two tracepoint to the cpumap. One for the enqueue side trace_xdp_cpumap_enqueue() and one for the kthread dequeue side trace_xdp_cpumap_kthread(). To mitigate the tracepoint overhead, these are invoked during the enqueue/dequeue bulking phases, thus amortizing the cost. The obvious use-cases are for debugging and monitoring. The non-intuitive use-case is using these as a feedback loop to know the system load. One can imagine auto-scaling by reducing, adding or activating more worker CPUs on demand. V4: tracepoint remove time_limit info, instead add sched info V8: intro struct bpf_cpu_map_entry members cpu+map_id in this patch Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18bpf: XDP_REDIRECT enable use of cpumapJesper Dangaard Brouer1-2/+8
This patch connects cpumap to the xdp_do_redirect_map infrastructure. Still no SKB allocation are done yet. The XDP frames are transferred to the other CPU, but they are simply refcnt decremented on the remote CPU. This served as a good benchmark for measuring the overhead of remote refcnt decrement. If driver page recycle cache is not efficient then this, exposes a bottleneck in the page allocator. A shout-out to MST's ptr_ring, which is the secret behind is being so efficient to transfer memory pointers between CPUs, without constantly bouncing cache-lines between CPUs. V3: Handle !CONFIG_BPF_SYSCALL pointed out by kbuild test robot. V4: Make Generic-XDP aware of cpumap type, but don't allow redirect yet, as implementation require a separate upstream discussion. V5: - Fix a maybe-uninitialized pointed out by kbuild test robot. - Restrict bpf-prog side access to cpumap, open when use-cases appear - Implement cpu_map_enqueue() as a more simple void pointer enqueue V6: - Allow cpumap type for usage in helper bpf_redirect_map, general bpf-prog side restriction moved to earlier patch. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-17tracing, thermal: Hide cpu cooling trace events when not in useSteven Rostedt (VMware)1-0/+2
As trace events when defined create data structures and functions to process them, defining trace events when not using them is a waste of memory. The trace events thermal_power_cpu_get_power and thermal_power_cpu_limit are only used when CONFIG_CPU_THERMAL is set. Make those events only defined when that is set as well. Link: http://lkml.kernel.org/r/20171013102309.2c4ef81a@gandalf.local.home Cc: Eduardo Valentin <edubezval@gmail.com> Acked-by: Javi Merino <javi.merino@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-10-17tracing, thermal: Hide devfreq trace events when not in useSteven Rostedt (VMware)1-0/+2
As trace events when defined create data structures and functions to process them, defining trace events when not using them is a waste of memory. The trace events thermal_power_devfreq_get_power and thermal_power_devfreq_limit are only used when CONFIG_DEVFREQ_THERMAL is set. Make those events only defined when that is set as well. Link: http://lkml.kernel.org/r/20171013102150.0050cb74@gandalf.local.home Acked-by: Javi Merino <javi.merino@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-10-16tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_onSteven Rostedt (VMware)1-40/+0
Commit e941759c74 ("fence: dma-buf cross-device synchronization") added trace event fence_annotate_wait_on, but never used it. It was renamed to dma_fence_annotate_wait_on by commit f54d186700 ("dma-buf: Rename struct fence to dma_fence") but still not used. As defined trace events have data structures and functions created for them, it is a waste of memory if they are not used. Remove the unused trace event. Link: http://lkml.kernel.org/r/20171013100625.6c820059@gandalf.local.home Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-10-16tracing: bpf: Hide bpf trace events when they are not usedSteven Rostedt (VMware)1-1/+4
All the trace events defined in include/trace/events/bpf.h are only used when CONFIG_BPF_SYSCALL is defined. But this file gets included by include/linux/bpf_trace.h which is included by the networking code with CREATE_TRACE_POINTS defined. If a trace event is created but not used it still has data structures and functions created for its use, even though nothing is using them. To not waste space, do not define the BPF trace events in bpf.h unless CONFIG_BPF_SYSCALL is defined. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14tcp: add a tracepoint for tcp retransmissionCong Wang1-0/+68
We need a real-time notification for tcp retransmission for monitoring. Of course we could use ftrace to dynamically instrument this kernel function too, however we can't retrieve the connection information at the same time, for example perf-tools [1] reads /proc/net/tcp for socket details, which is slow when we have a lots of connections. Therefore, this patch adds a tracepoint for __tcp_retransmit_skb() and exposes src/dst IP addresses and ports of the connection. This also makes it easier to integrate into perf. Note, I expose both IPv4 and IPv6 addresses at the same time: for a IPv4 socket, v4 mapped address is used as IPv6 addresses, for a IPv6 socket, LOOPBACK4_IPV6 is already filled by kernel. Also, add sk and skb pointers as they are useful for BPF. 1. https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans Cc: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Brendan Gregg <bgregg@netflix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-13tracing, memcg, vmscan: Hide trace events when not in useSteven Rostedt (VMware)1-0/+4
When trace events are defined but not used they still create data structures and functions for their use, even though nothing may be using them. The trace events mm_vmscan_memcg_reclaim_begin, mm_vmscan_memcg_softlimit_reclaim_begin, mm_vmscan_memcg_reclaim_end, and mm_vmscan_memcg_softlimit_reclaim_end are not used if CONFIG_MEMCG is not defined. Do not create these trace events unless CONFIG_MEMCG is defined. Link: http://lkml.kernel.org/r/20171012184632.2bd247cd@gandalf.local.home Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-10-13tracing/xen: Hide events that are not used when X86_PAE is not definedSteven Rostedt (VMware)1-16/+19
TRACE_EVENTS() take up memory. If they are defined but not used, then they simply waste space. If their use case is behind a define, then the trace events should be as well. The trace events xen_mmu_set_pte_atomic, xen_mmu_pte_clear, and xen_mmu_pmd_clear are not used when CONFIG_X86_PAE is not defined. Link: http://lkml.kernel.org/r/20171010191256.3d6d72cb@gandalf.local.home Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-10-11SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_statusTrond Myklebust1-7/+10
There is no guarantee that either the request or the svc_xprt exist by the time we get round to printing the trace message. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-10-10tracing: Add support for preempt and irq enable/disable eventsJoel Fernandes1-0/+70
Preempt and irq trace events can be used for tracing the start and end of an atomic section which can be used by a trace viewer like systrace to graphically view the start and end of an atomic section and correlate them with latencies and scheduling issues. This also serves as a prelude to using synthetic events or probes to rewrite the preempt and irqsoff tracers, along with numerous benefits of using trace events features for these events. Link: http://lkml.kernel.org/r/20171006005432.14244-3-joelaf@google.com Link: http://lkml.kernel.org/r/20171010225137.17370-1-joelaf@google.com Cc: Peter Zilstra <peterz@infradead.org> Cc: kernel-team@android.com Signed-off-by: Joel Fernandes <joelaf@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-10-10sched/debug: Rename task-state printing helpersPeter Zijlstra1-1/+1
Steve requested better names for the new task-state helper functions. So introduce the concept of task-state index for the printing and rename __get_task_state() to task_state_index() and __task_state_to_char() to task_index_to_char(). Requested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170929115016.pzlqc7ss3ccystyg@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-04writeback: eliminate work item allocation in bd_start_writeback()Jens Axboe1-1/+0
Handle start-all writeback like we do periodic or kupdate style writeback - by marking the bdi_writeback as needing a full flush, and simply waking the thread. This eliminates the need to allocate and queue a specific work item just for this purpose. After this change, we truly only ever have one of them running at any point in time. We mark the need to start all flushes, and the writeback thread will clear it once it has processed the request. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-29sched/debug: Add explicit TASK_PARKED printingPeter Zijlstra1-1/+1
Currently TASK_PARKED is masqueraded as TASK_INTERRUPTIBLE, give it its own print state because it will not in fact get woken by regular wakeups and is a long-term state. This requires moving TASK_PARKED into the TASK_REPORT mask, and since that latter needs to be a contiguous bitmask, we need to shuffle the bits around a bit. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29sched/debug: Add explicit TASK_IDLE printingPeter Zijlstra1-3/+4
Markus reported that kthreads that idle using TASK_IDLE instead of TASK_INTERRUPTIBLE are reported in as TASK_UNINTERRUPTIBLE and things like htop mark those red. This is undesirable, so add an explicit state for TASK_IDLE. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29sched/tracing: Fix trace_sched_switch task-state printingPeter Zijlstra1-7/+11
Convert trace_sched_switch to use the common task-state helpers and fix the "X" and "Z" order, possibly they ended up in the wrong order because TASK_REPORT has them in the wrong order too. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-25genirq/matrix: Add tracepointsThomas Gleixner1-0/+201
Add tracepoints for the irq bitmap matrix allocator. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213153.279468022@linutronix.de