aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-05-24ahci: Add PCI ID for Cannon Lake PCH-LP AHCIMika Westerberg1-0/+1
This one should be using the default LPM policy for mobile chipsets so add the PCI ID to the driver list of supported revices. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org
2018-05-24Btrfs: fix error handling in btrfs_truncate()Omar Sandoval1-1/+2
Jun Wu at Facebook reported that an internal service was seeing a return value of 1 from ftruncate() on Btrfs in some cases. This is coming from the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items(). btrfs_truncate() uses two variables for error handling, ret and err. When btrfs_truncate_inode_items() returns non-zero, we set err to the return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we only set err if ret is an error (i.e., negative). To reproduce the issue: mount a filesystem with -o compress-force=zstd and the following program will encounter return value of 1 from ftruncate: int main(void) { char buf[256] = { 0 }; int ret; int fd; fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666); if (fd == -1) { perror("open"); return EXIT_FAILURE; } if (write(fd, buf, sizeof(buf)) != sizeof(buf)) { perror("write"); close(fd); return EXIT_FAILURE; } if (fsync(fd) == -1) { perror("fsync"); close(fd); return EXIT_FAILURE; } ret = ftruncate(fd, 128); if (ret) { printf("ftruncate() returned %d\n", ret); close(fd); return EXIT_FAILURE; } close(fd); return EXIT_SUCCESS; } Fixes: ddfae63cc8e0 ("btrfs: move btrfs_truncate_block out of trans handle") CC: stable@vger.kernel.org # 4.15+ Reported-by: Jun Wu <quark@fb.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-23RDMA/hns: Move the location for initializing tmp_lenoulijun1-1/+2
When posted work request, it need to compute the length of all sges of every wr and fill it into the msg_len field of send wqe. Thus, While posting multiple wr, tmp_len should be reinitialized to zero. Fixes: 8b9b8d143b46 ("RDMA/hns: Fix the endian problem for hns") Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23RDMA/hns: Bugfix for cq record db for kerneloulijun1-0/+1
When use cq record db for kernel, it needs to set the hr_cq->db_en to 1 and configure the dma address of record cq db of qp context. Fixes: 86188a8810ed ("RDMA/hns: Support cq record doorbell for kernel space") Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23IB/uverbs: Fix uverbs_attr_get_objJason Gunthorpe1-5/+5
The err pointer comes from uverbs_attr_get, not from the uobject member, which does not store an ERR_PTR. Fixes: be934cca9e98 ("IB/uverbs: Add device memory registration ioctl support") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-05-23RDMA/qedr: Fix doorbell bar mapping for dpi > 1Kalderon, Michal1-31/+29
Each user_context receives a separate dpi value and thus a different address on the doorbell bar. The qedr_mmap function needs to validate the address and map the doorbell bar accordingly. The current implementation always checked against dpi=0 doorbell range leading to a wrong mapping for doorbell bar. (It entered an else case that mapped the address differently). qedr_mmap should only be used for doorbells, so the else was actually wrong in the first place. This only has an affect on arm architecture and not an issue on a x86 based architecture. This lead to doorbells not occurring on arm based systems and left applications that use more than one dpi (or several applications run simultaneously ) to hang. Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs") Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23perf kcore_copy: Amend the offset of sections that remap kernel textAdrian Hunter1-2/+51
x86 PTI entry trampolines all map to the same physical page. If that is reflected in the program headers of /proc/kcore, then do the same for the copy of kcore. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-18-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Copy x86 PTI entry trampoline sectionsAdrian Hunter1-0/+42
Identify and copy any sections for x86 PTI entry trampolines. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-17-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Get rid of kernel_mapAdrian Hunter1-18/+52
In preparation to add more program headers, get rid of kernel_map and modules_map by moving ->kernel_map and ->modules_map to newly allocated entries in the ->phdrs list. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-16-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Iterate phdrsAdrian Hunter1-15/+10
In preparation to add more program headers, iterate phdrs instead of assuming there is only one for the kernel text and one for the modules. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-15-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Layout sectionsAdrian Hunter1-3/+22
In preparation to add more program headers, layout the relative offset of each section. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Calculate offset from phnumAdrian Hunter1-1/+5
In preparation to add more program headers, calculate offset from the number of phdrs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-13-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Keep a count of phdrsAdrian Hunter1-5/+4
In preparation to add more program headers, keep a count of phdrs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-12-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Keep phdr data in a listAdrian Hunter1-0/+9
Currently, kcore_copy makes 2 program headers, one for the kernel text (namely kernel_map) and one for the modules (namely modules_map). Now more program headers are needed, but treating each program header as a special case results in much more code. Instead, in preparation to add more program headers, change to keep program header data (phdr_data) in a list. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-11-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf annotate: Show group event string for stdioJin Yao1-1/+5
When we enable the group, for tui/stdio2, the output first line includes the group event string. While for stdio, it will show only one event. For example, perf record -e cycles,branches ./div perf annotate --group --stdio Percent | Source code & Disassembly of div for cycles (44407 samples) ...... The first line doesn't include the event 'branches'. With this patch, it will show the correct group even string. perf annotate --group --stdio Percent | Source code & Disassembly of div for cycles, branches (44407 samples) ...... Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526989115-14435-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Synthesize and process mmap events for x86 PTI entry trampolinesAdrian Hunter5-7/+140
Like the kernel text, the location of x86 PTI entry trampolines must be recorded in the perf.data file. Like the kernel, synthesize a mmap event for that, and add processing for it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Create maps for x86 PTI entry trampolinesAdrian Hunter5-19/+187
Create maps for x86 PTI entry trampolines, based on symbols found in kallsyms. It is also necessary to keep track of whether the trampolines have been mapped particularly when the kernel dso is kcore. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com [ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23mfd: cros_ec: Retry commands when EC is known to be busyBrian Norris2-4/+22
Commit 001dde9400d5 ("mfd: cros ec: spi: Fix "in progress" error signaling") pointed out some bad code, but its analysis and conclusion was not 100% correct. It *is* correct that we should not propagate result==EC_RES_IN_PROGRESS for transport errors, because this has a special meaning -- that we should follow up with EC_CMD_GET_COMMS_STATUS until the EC is no longer busy. This is definitely the wrong thing for many commands, because among other problems, EC_CMD_GET_COMMS_STATUS doesn't actually retrieve any RX data from the EC, so commands that expected some data back will instead start processing junk. For such commands, the right answer is to either propagate the error (and return that error to the caller) or resend the original command (*not* EC_CMD_GET_COMMS_STATUS). Unfortunately, commit 001dde9400d5 forgets a crucial point: that for some long-running operations, the EC physically cannot respond to commands any more. For example, with EC_CMD_FLASH_ERASE, the EC may be re-flashing its own code regions, so it can't respond to SPI interrupts. Instead, the EC prepares us ahead of time for being busy for a "long" time, and fills its hardware buffer with EC_SPI_PAST_END. Thus, we expect to see several "transport" errors (or, messages filled with EC_SPI_PAST_END). So we should really translate that to a retryable error (-EAGAIN) and continue sending EC_CMD_GET_COMMS_STATUS until we get a ready status. IOW, it is actually important to treat some of these "junk" values as retryable errors. Together with commit 001dde9400d5, this resolves bugs like the following: 1. EC_CMD_FLASH_ERASE now works again (with commit 001dde9400d5, we would abort the first time we saw EC_SPI_PAST_END) 2. Before commit 001dde9400d5, transport errors (e.g., EC_SPI_RX_BAD_DATA) seen in other commands (e.g., EC_CMD_RTC_GET_VALUE) used to yield junk data in the RX buffer; they will now yield -EAGAIN return values, and tools like 'hwclock' will simply fail instead of retrieving and re-programming undefined time values Fixes: 001dde9400d5 ("mfd: cros ec: spi: Fix "in progress" error signaling") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2018-05-22alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2Sinan Kaya1-7/+7
memory-barriers.txt has been updated with the following requirement. "When using writel(), a prior wmb() is not needed to guarantee that the cache coherent memory writes have completed before writing to the MMIO region." Current writeX() and iowriteX() implementations on alpha are not satisfying this requirement as the barrier is after the register write. Move mb() in writeX() and iowriteX() functions to guarantee that HW observes memory changes before performing register operations. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22alpha: simplify get_arch_dma_opsChristoph Hellwig2-5/+3
Remove the dma_ops indirection. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22alpha: use dma_direct_ops for jensenChristoph Hellwig3-33/+5
The generic dma_direct implementation does the same thing as the alpha pci-noop implementation, just with more bells and whistles. And unlike the current code it at least has a theoretical chance to actually compile. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22perf machine: Allow for extra kernel mapsAdrian Hunter5-12/+34
Identify extra kernel maps by name so that they can be distinguished from the kernel map and module maps. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Fix map_groups__split_kallsyms() for entry trampoline symbolsAdrian Hunter2-0/+21
When kernel symbols are derived from /proc/kallsyms only (not using vmlinux or /proc/kcore) map_groups__split_kallsyms() is used. However that function makes assumptions that are not true with entry trampoline symbols. For now, remove the entry trampoline symbols at that point, as they are no longer needed at that point. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Workaround missing maps for x86 PTI entry trampolinesAdrian Hunter3-5/+106
On x86_64 the PTI entry trampolines are not in the kernel map created by perf tools. That results in the addresses having no symbols and prevents annotation. It also causes Intel PT to have decoding errors at the trampoline addresses. Workaround that by creating maps for the trampolines. At present the kernel does not export information revealing where the trampolines are. Until that happens, the addresses are hardcoded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Add nr_cpus_avail()Adrian Hunter4-0/+20
Add a function to return the number of the machine's available CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22PM / core: Fix direct_complete handling for devices with no callbacksRafael J. Wysocki1-4/+3
Commit 08810a4119aa (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags) inadvertently prevented the power.direct_complete flag from being set for devices without PM callbacks and with disabled runtime PM which also prevents power.direct_complete from being set for their parents. That led to problems including a resume crash on HP ZBook 14u. Restore the previous behavior by causing power.direct_complete to be set for those devices again, but do that in a more direct way to avoid overlooking that case in the future. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199693 Fixes: 08810a4119aa (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags) Reported-by: Thomas Martitz <kugel@rockbox.org> Tested-by: Thomas Martitz <kugel@rockbox.org> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Johan Hovold <johan@kernel.org>
2018-05-21powerpc/64s: Add support for a store forwarding barrier at kernel entry/exitNicholas Piggin9-2/+356
On some CPUs we can prevent a vulnerability related to store-to-load forwarding by preventing store forwarding between privilege domains, by inserting a barrier in kernel entry and exit paths. This is known to be the case on at least Power7, Power8 and Power9 powerpc CPUs. Barriers must be inserted generally before the first load after moving to a higher privilege, and after the last store before moving to a lower privilege, HV and PR privilege transitions must be protected. Barriers are added as patch sections, with all kernel/hypervisor entry points patched, and the exit points to lower privilge levels patched similarly to the RFI flush patching. Firmware advertisement is not implemented yet, so CPU flush types are hard coded. Thanks to Michal Suchánek for bug fixes and review. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-21loop: clear wb_err in bd_inode when detaching backing fileJeff Layton1-0/+1
When a loop block device encounters a writeback error, that error will get propagated to the bd_inode's wb_err field. If we then detach the backing file from it, attach another and fsync it, we'll get back the writeback error that we had from the previous backing file. This is a bit of a grey area as POSIX doesn't cover loop devices, but it is somewhat counterintuitive. If we detach a backing file from the loopdev while there are still unreported errors, take it as a sign that we're no longer interested in the previous file, and clear out the wb_err in the loop blockdev. Reported-and-Tested-by: Theodore Y. Ts'o <tytso@mit.edu> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-21aio: fix io_destroy(2) vs. lookup_ioctx() raceAl Viro1-2/+2
kill_ioctx() used to have an explicit RCU delay between removing the reference from ->ioctx_table and percpu_ref_kill() dropping the refcount. At some point that delay had been removed, on the theory that percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was the wrong kind of RCU delay and it didn't care about rcu_read_lock() used by lookup_ioctx(). As the result, we could get ctx freed right under lookup_ioctx(). Tejun has fixed that in a6d7cff472e ("fs/aio: Add explicit RCU grace period when freeing kioctx"); however, that fix is not enough. Suppose io_destroy() from one thread races with e.g. io_setup() from another; CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2 has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the refcount, getting it to 0 and triggering a call of free_ioctx_users(), which proceeds to drop the secondary refcount and once that reaches zero calls free_ioctx_reqs(). That does INIT_RCU_WORK(&ctx->free_rwork, free_ioctx); queue_rcu_work(system_wq, &ctx->free_rwork); and schedules freeing the whole thing after RCU delay. In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the refcount from 0 to 1 and returned the reference to io_setup(). Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get freed until after percpu_ref_get(). Sure, we'd increment the counter before ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it has grabbed the reference, ctx is *NOT* going away until it gets around to dropping that reference. The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss. It's not costlier than what we currently do in normal case, it's safe to call since freeing *is* delayed and it closes the race window - either lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx() fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see the object in question at all. Cc: stable@kernel.org Fixes: a6d7cff472e "fs/aio: Add explicit RCU grace period when freeing kioctx" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21ext2: fix a block leakAl Viro1-10/+0
open file, unlink it, then use ioctl(2) to make it immutable or append only. Now close it and watch the blocks *not* freed... Immutable/append-only checks belong in ->setattr(). Note: the bug is old and backport to anything prior to 737f2e93b972 ("ext2: convert to use the new truncate convention") will need these checks lifted into ext2_setattr(). Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21nfsd: vfs_mkdir() might succeed leaving dentry negative unhashedAl Viro1-0/+22
That can (and does, on some filesystems) happen - ->mkdir() (and thus vfs_mkdir()) can legitimately leave its argument negative and just unhash it, counting upon the lookup to pick the object we'd created next time we try to look at that name. Some vfs_mkdir() callers forget about that possibility... Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashedAl Viro1-0/+10
That can (and does, on some filesystems) happen - ->mkdir() (and thus vfs_mkdir()) can legitimately leave its argument negative and just unhash it, counting upon the lookup to pick the object we'd created next time we try to look at that name. Some vfs_mkdir() callers forget about that possibility... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21unfuck sysfs_mount()Al Viro1-3/+3
new_sb is left uninitialized in case of early failures in kernfs_mount_ns(), and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb is not a solution - IS_ERR(root) is true in some cases when new_sb is true. Make sure new_sb is initialized (and matches the reality) in all cases and fix the condition for dropping kobj reference - we want it done precisely in those situations where the reference has not been transferred into a new super_block instance. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21kernfs: deal with kernfs_fill_super() failuresAl Viro1-0/+1
make sure that info->node is initialized early, so that kernfs_kill_sb() can list_del() it safely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21cramfs: Fix IS_ENABLED typoJoe Perches1-1/+1
There's an extra C here... Fixes: 99c18ce580c6 ("cramfs: direct memory access support") Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21befs_lookup(): use d_splice_alias()Al Viro1-12/+5
RTFS(Documentation/filesystems/nfs/Exporting) if you try to make something exportable. Fixes: ac632f5b6301 "befs: add NFS export support" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21affs_lookup: switch to d_splice_alias()Al Viro1-6/+5
Making something exportable takes more than providing ->s_export_ops. In particular, ->lookup() *MUST* use d_splice_alias() instead of d_add(). Reading Documentation/filesystems/nfs/Exporting would've been a good idea; as it is, exporting AFFS is badly (and exploitably) broken. Partially-Fixes: ed4433d72394 "fs/affs: make affs exportable" Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21affs_lookup(): close a race with affs_remove_link()Al Viro1-3/+7
we unlock the directory hash too early - if we are looking at secondary link and primary (in another directory) gets removed just as we unlock, we could have the old primary moved in place of the secondary, leaving us to look into freed entry (and leaving our dentry with ->d_fsdata pointing to a freed entry). Cc: stable@vger.kernel.org # 2.4.4+ Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21sr: pass down correctly sized SCSI sense bufferJens Axboe1-2/+8
We're casting the CDROM layer request_sense to the SCSI sense buffer, but the former is 64 bytes and the latter is 96 bytes. As we generally allocate these on the stack, we end up blowing up the stack. Fix this by wrapping the scsi_execute() call with a properly sized sense buffer, and copying back the bits for the CDROM layer. Cc: stable@vger.kernel.org Reported-by: Piotr Gabriel Kosinski <pg.kosinski@gmail.com> Reported-by: Daniel Shapira <daniel@twistlock.com> Tested-by: Kees Cook <keescook@chromium.org> Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-21perf annotate: Support '--group' optionJin Yao1-0/+7
With the '--group' option, even for non-explicit group, 'perf annotate' will enable the group output. For example, $ perf record -e cycles,branches ./div $ perf annotate main --stdio --group : Disassembly of section .text: : : 00000000004004b0 <main>: : main(): : : return i; : } : : int main(void) : { 0.00 0.00 : 4004b0: push %rbx : int i; : int flag; : volatile double x = 1212121212, y = 121212; : : s_randseed = time(0); 0.00 0.00 : 4004b1: xor %edi,%edi : srand(s_randseed); 0.00 0.00 : 4004b3: mov $0x77359400,%ebx : : return i; : } : But if without --group, there is only one event reported. $ perf annotate main --stdio : Disassembly of section .text: : : 00000000004004b0 <main>: : main(): : : return i; : } : : int main(void) : { 0.00 : 4004b0: push %rbx : int i; : int flag; : volatile double x = 1212121212, y = 121212; : : s_randseed = time(0); 0.00 : 4004b1: xor %edi,%edi : srand(s_randseed); 0.00 : 4004b3: mov $0x77359400,%ebx : : return i; : } Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526914666-31839-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21perf report: Use perf_evlist__force_leader to support '--group'Jin Yao1-11/+2
Since we created a new function perf_evlist__force_leader(), remove the old code and use that new evlist method. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526914666-31839-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21perf evlist: Introduce force_leader() methodJin Yao2-0/+18
For non-explicit group (e.g. those created with -e '{eventA,eventB}'), 'perf report' supports a option '--group' which can enable group output. We also need to support 'perf annotate' with the same '--group'. Create a new function perf_evlist__force_leader() which contains common code to force setting the group leader. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526914666-31839-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21libata: blacklist Micron 500IT SSD with MU01 firmwareSudip Mukherjee1-0/+2
While whitelisting Micron M500DC drives, the tweaked blacklist entry enabled queued TRIM from M500IT variants also. But these do not support queued TRIM. And while using those SSDs with the latest kernel we have seen errors and even the partition table getting corrupted. Some part from the dmesg: [ 6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133 [ 6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA [ 6.741026] ata1.00: supports DRM functions and may not be fully accessible [ 6.759887] ata1.00: configured for UDMA/133 [ 6.762256] scsi 0:0:0:0: Direct-Access ATA Micron_M500IT_MT MU01 PQ: 0 ANSI: 5 and then for the error: [ 120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen [ 120.860338] ata1.00: irq_stat 0x40000008 [ 120.860342] ata1.00: failed command: SEND FPDMA QUEUED [ 120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout) [ 120.860353] ata1.00: status: { DRDY } [ 120.860543] ata1: hard resetting link [ 121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 121.166376] ata1.00: supports DRM functions and may not be fully accessible [ 121.186238] ata1.00: supports DRM functions and may not be fully accessible [ 121.204445] ata1.00: configured for UDMA/133 [ 121.204454] ata1.00: device reported invalid CHS sector 0 [ 121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 [ 121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current] [ 121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4 [ 121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00 [ 121.204559] print_req_error: I/O error, dev sda, sector 272512 After few reboots with these errors, and the SSD is corrupted. After blacklisting it, the errors are not seen and the SSD does not get corrupted any more. Fixes: 243918be6393 ("libata: Do not blacklist Micron M500DC") Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2018-05-20Linux 4.17-rc6Linus Torvalds1-1/+1
2018-05-19net: ip6_gre: fix tunnel metadata device sharing.William Tu1-22/+79
Currently ip6gre and ip6erspan share single metadata mode device, using 'collect_md_tun'. Thus, when doing: ip link add dev ip6gre11 type ip6gretap external ip link add dev ip6erspan12 type ip6erspan external RTNETLINK answers: File exists simply fails due to the 2nd tries to create the same collect_md_tun. The patch fixes it by adding a separate collect md tunnel device for the ip6erspan, 'collect_md_tun_erspan'. As a result, a couple of places need to refactor/split up in order to distinguish ip6gre and ip6erspan. First, move the collect_md check at ip6gre_tunnel_{unlink,link} and create separate function {ip6gre,ip6ersapn}_tunnel_{link_md,unlink_md}. Then before link/unlink, make sure the link_md/unlink_md is called. Finally, a separate ndo_uninit is created for ip6erspan. Tested it using the samples/bpf/test_tunnel_bpf.sh. Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode") Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19bpf: Prevent memory disambiguation attackAlexei Starovoitov2-3/+57
Detect code patterns where malicious 'speculative store bypass' can be used and sanitize such patterns. 39: (bf) r3 = r10 40: (07) r3 += -216 41: (79) r8 = *(u64 *)(r7 +0) // slow read 42: (7a) *(u64 *)(r10 -72) = 0 // verifier inserts this instruction 43: (7b) *(u64 *)(r8 +0) = r3 // this store becomes slow due to r8 44: (79) r1 = *(u64 *)(r6 +0) // cpu speculatively executes this load 45: (71) r2 = *(u8 *)(r1 +0) // speculatively arbitrary 'load byte' // is now sanitized Above code after x86 JIT becomes: e5: mov %rbp,%rdx e8: add $0xffffffffffffff28,%rdx ef: mov 0x0(%r13),%r14 f3: movq $0x0,-0x48(%rbp) fb: mov %rdx,0x0(%r14) ff: mov 0x0(%rbx),%rdi 103: movzbq 0x0(%rdi),%rsi Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-05-19ARM: fix kill( ,SIGFPE) breakageRussell King2-14/+1
Commit 7771c6645700 ("signal/arm: Document conflicts with SI_USER and SIGFPE") broke the siginfo structure for userspace triggered signals, causing the strace testsuite to regress. Fix this by eliminating the FPE_FIXME definition (which is at the root of the breakage) and use FPE_FLTINV instead for the case where the hardware appears to be reporting nonsense. Fixes: 7771c6645700 ("signal/arm: Document conflicts with SI_USER and SIGFPE") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-05-19mmap: relax file size limit for regular filesLinus Torvalds1-1/+1
Commit be83bbf80682 ("mmap: introduce sane default mmap limits") was introduced to catch problems in various ad-hoc character device drivers doing mmap and getting the size limits wrong. In the process, it used "known good" limits for the normal cases of mapping regular files and block device drivers. It turns out that the "s_maxbytes" limit was less "known good" than I thought. In particular, /proc doesn't set it, but exposes one regular file to mmap: /proc/vmcore. As a result, that file got limited to the default MAX_INT s_maxbytes value. This went unnoticed for a while, because apparently the only thing that needs it is the s390 kernel zfcpdump, but there might be other tools that use this too. Vasily suggested just changing s_maxbytes for all of /proc, which isn't wrong, but makes me nervous at this stage. So instead, just make the new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't affect anything else. It wasn't the regular file case I was worried about. I'd really prefer for maxsize to have been per-inode, but that is not how things are today. Fixes: be83bbf80682 ("mmap: introduce sane default mmap limits") Reported-by: Vasily Gorbik <gor@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-19x86/MCE/AMD: Cache SMCA MISC block addressesBorislav Petkov1-15/+14
... into a global, two-dimensional array and service subsequent reads from that cache to avoid rdmsr_on_cpu() calls during CPU hotplug (IPIs with IRQs disabled). In addition, this fixes a KASAN slab-out-of-bounds read due to wrong usage of the bank->blocks pointer. Fixes: 27bd59502702 ("x86/mce/AMD: Get address from already initialized block") Reported-by: Johannes Hirte <johannes.hirte@datenkhaos.de> Tested-by: Johannes Hirte <johannes.hirte@datenkhaos.de> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20180414004230.GA2033@probook
2018-05-19ARM: 8772/1: kprobes: Prohibit kprobes on get_user functionsMasami Hiramatsu2-0/+20
Since do_undefinstr() uses get_user to get the undefined instruction, it can be called before kprobes processes recursive check. This can cause an infinit recursive exception. Prohibit probing on get_user functions. Fixes: 24ba613c9d6c ("ARM kprobes: core code") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>