aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-10-13Revert "vmalloc: back off when the current task is killed"Johannes Weiner1-6/+0
This reverts commits 5d17a73a2ebe ("vmalloc: back off when the current task is killed") and 171012f56127 ("mm: don't warn when vmalloc() fails due to a fatal signal"). Commit 5d17a73a2ebe ("vmalloc: back off when the current task is killed") made all vmalloc allocations from a signal-killed task fail. We have seen crashes in the tty driver from this, where a killed task exiting tries to switch back to N_TTY, fails n_tty_open because of the vmalloc failing, and later crashes when dereferencing tty->disc_data. Arguably, relying on a vmalloc() call to succeed in order to properly exit a task is not the most robust way of doing things. There will be a follow-up patch to the tty code to fall back to the N_NULL ldisc. But the justification to make that vmalloc() call fail like this isn't convincing, either. The patch mentions an OOM victim exhausting the memory reserves and thus deadlocking the machine. But the OOM killer is only one, improbable source of fatal signals. It doesn't make sense to fail allocations preemptively with plenty of memory in most cases. The patch doesn't mention real-life instances where vmalloc sites would exhaust memory, which makes it sound more like a theoretical issue to begin with. But just in case, the OOM access to memory reserves has been restricted on the allocator side in cd04ae1e2dc8 ("mm, oom: do not rely on TIF_MEMDIE for memory reserves access"), which should take care of any theoretical concerns on that front. Revert this patch, and the follow-up that suppresses the allocation warnings when we fail the allocations due to a signal. Link: http://lkml.kernel.org/r/20171004185906.GB2136@cmpxchg.org Fixes: 171012f56127 ("mm: don't warn when vmalloc() fails due to a fatal signal") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alan Cox <alan@llwyncelyn.cymru> Cc: Christoph Hellwig <hch@lst.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13mm/cma.c: take __GFP_NOWARN into account in cma_alloc()Boris Brezillon1-1/+1
cma_alloc() unconditionally prints an INFO message when the CMA allocation fails. Make this message conditional on the non-presence of __GFP_NOWARN in gfp_mask. This patch aims at removing INFO messages that are displayed when the VC4 driver tries to allocate buffer objects. From the driver perspective an allocation failure is acceptable, and the driver can possibly do something to make following allocation succeed (like flushing the VC4 internal cache). Link: http://lkml.kernel.org/r/20171004125447.15195-1-boris.brezillon@free-electrons.com Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Acked-by: Laura Abbott <labbott@redhat.com> Cc: Jaewon Kim <jaewon31.kim@samsung.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Eric Anholt <eric@anholt.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13scripts/kallsyms.c: ignore symbol type 'n'Guenter Roeck1-1/+1
gcc on aarch64 may emit synbols of type 'n' if the kernel is built with '-frecord-gcc-switches'. In most cases, those symbols are reported with nm as 000000000000000e n $d and with objdump as 0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line 000000000000000e l .GCC.command.line 0000000000000000 $d Those symbols are detected in is_arm_mapping_symbol() and ignored. However, if "--prefix-symbols=<prefix>" is configured as well, the situation is different. For example, in efi/libstub, arm64 images are built with '--prefix-alloc-sections=.init --prefix-symbols=__efistub_'. In combination with '-frecord-gcc-switches', the symbols are now reported by nm as: 000000000000000e n __efistub_$d and by objdump as: 0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line 000000000000000e l .GCC.command.line 0000000000000000 __efistub_$d Those symbols are no longer ignored and included in the base address calculation. This results in a base address of 000000000000000e, which in turn causes kallsyms to abort with kallsyms failure: relative symbol value 0xffffff900800a000 out of range in relative mode The problem is seen in little endian arm64 builds with CONFIG_EFI enabled and with '-frecord-gcc-switches' set in KCFLAGS. Explicitly ignore symbols of type 'n' since those are clearly debug symbols. Link: http://lkml.kernel.org/r/1507136063-3139-1-git-send-email-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13userfaultfd: selftest: exercise -EEXIST only in background transferAndrea Arcangeli1-5/+20
I was stress testing some backports and with high load, after some time, the latest version of the selftest showed some false positive in connection with the uffdio_copy_retry. This seems to fix it while still exercising -EEXIST in the background transfer once in a while. The fork child will quit after the last UFFDIO_COPY is run, so a repeated UFFDIO_COPY may not return -EEXIST. This change restricts the -EEXIST stress to the background transfer where the memory can't go away from under it. Also updated uffdio_zeropage, so the interface is consistent. Link: http://lkml.kernel.org/r/20171004171541.1495-2-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13mm: only display online cpus of the numa nodeZhen Lei1-2/+10
When I execute numactl -H (which reads /sys/devices/system/node/nodeX/cpumap and displays cpumask_of_node for each node), I get different result on X86 and arm64. For each numa node, the former only displayed online CPUs, and the latter displayed all possible CPUs. Unfortunately, both Linux documentation and numactl manual have not described it clear. I sent a mail to ask for help, and Michal Hocko replied that he preferred to print online cpus because it doesn't really make much sense to bind anything on offline nodes. Will said: "I suspect the vast majority (if not all) code that reads this file was developed for x86, so having the same behaviour for arm64 sounds like something we should do ASAP before people try to special case with things like #ifdef __aarch64__. I'd rather have this in 4.14 if possible." Link: http://lkml.kernel.org/r/1506678805-15392-2-git-send-email-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tianhong Ding <dingtianhong@huawei.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Libin <huawei.libin@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13mm: remove unnecessary WARN_ONCE in page_vma_mapped_walk().Zi Yan1-2/+1
A non present pmd entry can appear after pmd_lock is taken in page_vma_mapped_walk(), even if THP migration is not enabled. The WARN_ONCE is unnecessary. Link: http://lkml.kernel.org/r/20171003142606.12324-1-zi.yan@sent.com Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path") Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13mm/mempolicy: fix NUMA_INTERLEAVE_HIT counterAndrey Ryabinin1-2/+5
Commit 3a321d2a3dde ("mm: change the call sites of numa statistics items") separated NUMA counters from zone counters, but the NUMA_INTERLEAVE_HIT call site wasn't updated to use the new interface. So alloc_page_interleave() actually increments NR_ZONE_INACTIVE_FILE instead of NUMA_INTERLEAVE_HIT. Fix this by using __inc_numa_state() interface to increment NUMA_INTERLEAVE_HIT. Link: http://lkml.kernel.org/r/20171003191003.8573-1-aryabinin@virtuozzo.com Fixes: 3a321d2a3dde ("mm: change the call sites of numa statistics items") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Kemi Wang <kemi.wang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13include/linux/of.h: provide of_n_{addr,size}_cells wrappers for !CONFIG_OFArnd Bergmann1-0/+10
The pci-rcar driver is enabled for compile tests, and this has shown that the driver cannot build without CONFIG_OF, following the inclusion of commit f8f2fe7355fb ("PCI: rcar: Use new OF interrupt mapping when possible"): drivers/pci/host/pcie-rcar.c: In function 'pci_dma_range_parser_init': drivers/pci/host/pcie-rcar.c:1039:2: error: implicit declaration of function 'of_n_addr_cells' [-Werror=implicit-function-declaration] parser->pna = of_n_addr_cells(node); ^ As pointed out by Ben Dooks and Geert Uytterhoeven, this is actually supposed to build fine, which we can achieve if we make the declaration of of_irq_parse_and_map_pci conditional on CONFIG_OF and provide an empty inline function otherwise, as we do for a lot of other of interfaces. This lets us build the rcar_pci driver again without CONFIG_OF for build testing. All platforms using this driver select OF, so this doesn't change anything for the users. [akpm@linux-foundation.org: be consistent with surrounding code] Link: http://lkml.kernel.org/r/20170911200805.3363318-1-arnd@arndb.de Fixes: c25da4778803 ("PCI: rcar: Add Renesas R-Car PCIe driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Magnus Damm <damm@opensource.se> Cc: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13mm/madvise.c: add description for MADV_WIPEONFORK and MADV_KEEPONFORKYang Shi1-1/+6
mm/madvise.c has a brief description about all MADV_ flags. Add a description for the newly added MADV_WIPEONFORK and MADV_KEEPONFORK. Although man page has the similar information, but it'd better to keep the consistent with other flags. Link: http://lkml.kernel.org/r/1506117328-88228-1-git-send-email-yang.s@alibaba-inc.com Signed-off-by: Yang Shi <yang.s@alibaba-inc.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13lib/Kconfig.debug: kernel hacking menu: runtime testing: keep tests togetherRandy Dunlap1-72/+71
Expand the "Runtime testing" menu by including more entries inside it instead of after it. This is just Kconfig symbol movement. This causes the (arch-independent) Runtime tests to be presented (listed) all in one place instead of in multiple places. Link: http://lkml.kernel.org/r/c194e5c4-2042-bf94-a2d8-7aa13756e257@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13mm/migrate: fix indexing bug (off by one) and avoid out of bound accessMark Hairgrove1-1/+2
Index was incremented before last use and thus the second array could dereference to an invalid address (not mentioning the fact that it did not properly clear the entry we intended to clear). Link: http://lkml.kernel.org/r/1506973525-16491-1-git-send-email-jglisse@redhat.com Fixes: 8315ada7f095bf ("mm/migrate: allow migrate_vma() to alloc new page on empty entry") Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13iommu/amd: Finish TLB flush in amd_iommu_unmap()Joerg Roedel1-0/+1
The function only sends the flush command to the IOMMU(s), but does not wait for its completion when it returns. Fix that. Fixes: 601367d76bd1 ('x86/amd-iommu: Remove iommu_flush_domain function') Cc: stable@vger.kernel.org # >= 2.6.33 Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-10-13powerpc/perf: Fix IMC initialization crashAnju T Sudhakar1-1/+2
Panic observed with latest firmware, and upstream kernel: NIP init_imc_pmu+0x8c/0xcf0 LR init_imc_pmu+0x2f8/0xcf0 Call Trace: init_imc_pmu+0x2c8/0xcf0 (unreliable) opal_imc_counters_probe+0x300/0x400 platform_drv_probe+0x64/0x110 driver_probe_device+0x3d8/0x580 __driver_attach+0x14c/0x1a0 bus_for_each_dev+0x8c/0xf0 driver_attach+0x34/0x50 bus_add_driver+0x298/0x350 driver_register+0x9c/0x180 __platform_driver_register+0x5c/0x70 opal_imc_driver_init+0x2c/0x40 do_one_initcall+0x64/0x1d0 kernel_init_freeable+0x280/0x374 kernel_init+0x24/0x160 ret_from_kernel_thread+0x5c/0x74 While registering nest imc at init, cpu-hotplug callback nest_pmu_cpumask_init() makes an OPAL call to stop the engine. And if the OPAL call fails, imc_common_cpuhp_mem_free() is invoked to cleanup memory and cpuhotplug setup. But when cleaning up the attribute group, we are dereferencing the attribute element array without checking whether the backing element is not NULL. This causes the kernel panic. Add a check for the backing element prior to dereferencing the attribute element, to handle the failing case gracefully. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> [mpe: Trim change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-12scripts: fix faddr2line to work on last symbolNeilBrown1-2/+3
If faddr2line is given a function name which is the last one listed by "nm -n", it will fail because it never finds the next symbol. So teach the awk script to catch that possibility, and use 'size' to provide the end point of the last function. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-12device property: preserve usecount for node passed to of_fwnode_graph_get_port_parent()Niklas Söderlund1-1/+1
Using CONFIG_OF_DYNAMIC=y uncovered an imbalance in the usecount of the node being passed to of_fwnode_graph_get_port_parent(). Preserve the usecount by using of_get_parent() instead of of_get_next_parent() which don't decrement the usecount of the node passed to it. Fixes: 3b27d00e7b6d7c88 ("device property: Move fwnode graph ops to firmware specific locations") Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-12drivers: of: increase MAX_RESERVED_REGIONS to 32Stewart Smith1-1/+1
There are two types of memory reservations firmware can ask the kernel to make in the device tree: static and dynamic. See Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt If you have greater than 16 entries in /reserved-memory (as we do on POWER9 systems) you would get this scary looking error message: [ 0.000000] OF: reserved mem: not enough space all defined regions. This is harmless if all your reservations are static (which with OPAL on POWER9, they are). It is not harmless if you have any dynamic reservations after the 16th. In the first pass over the fdt to find reservations, the child nodes of /reserved-memory are added to a static array in of_reserved_mem.c so that memory can be reserved in a 2nd pass. The array has 16 entries. This is why, on my dual socket POWER9 system, I get that error 4 times with 20 static reservations. We don't have a problem on ppc though, as in arch/powerpc/kernel/prom.c we look at the new style /reserved-ranges property to do reservations, and this logic was introduced in 0962e8004e974 (well before any powernv system shipped). A Google search shows up no occurances of that exact error message, so we're probably safe in that no machine that people use has memory not being reserved when it should be. The simple fix is to bump the length of the array to 32 which "should be enough for everyone(TM)". The simple fix of not recording static allocations in the array would cause problems for devices with "memory-region" properties. A more future-proof fix is likely possible, although more invasive and this simple fix is perfectly suitable in the meantime while a more future-proof fix is developed. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-12of: do not leak console optionsSergey Senozhatsky1-2/+6
Do not strdup() console options. It seems that the only reason for it to be strdup()-ed was a compilation warning: printk, UART and console drivers, for some reason, expect char pointer instead of const char pointer. So we can just pass `of_stdout_options', but need to cast it to char pointer. A better fix would be to change printk, console drivers and UART to accept const char `options'; but that will take time - there are lots of drivers to update. The patch also fixes a possible memory leak: add_preferred_console() can fail, but we don't kfree() options. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-12powerpc/perf: Add ___GFP_NOWARN flag to alloc_pages_node()Anju T Sudhakar1-4/+4
Stack trace output during a stress test: [ 4.310049] Freeing initrd memory: 22592K [ 4.310646] rtas_flash: no firmware flash support [ 4.313341] cpuhp/64: page allocation failure: order:0, mode:0x14480c0(GFP_KERNEL|__GFP_ZERO|__GFP_THISNODE), nodemask=(null) [ 4.313465] cpuhp/64 cpuset=/ mems_allowed=0 [ 4.313521] CPU: 64 PID: 392 Comm: cpuhp/64 Not tainted 4.11.0-39.el7a.ppc64le #1 [ 4.313588] Call Trace: [ 4.313622] [c000000f1fb1b8e0] [c000000000c09388] dump_stack+0xb0/0xf0 (unreliable) [ 4.313694] [c000000f1fb1b920] [c00000000030ef6c] warn_alloc+0x12c/0x1c0 [ 4.313753] [c000000f1fb1b9c0] [c00000000030ff68] __alloc_pages_nodemask+0xea8/0x1000 [ 4.313823] [c000000f1fb1bbb0] [c000000000113a8c] core_imc_mem_init+0xbc/0x1c0 [ 4.313892] [c000000f1fb1bc00] [c000000000113cdc] ppc_core_imc_cpu_online+0x14c/0x170 [ 4.313962] [c000000f1fb1bc90] [c000000000125758] cpuhp_invoke_callback+0x198/0x5d0 [ 4.314031] [c000000f1fb1bd00] [c00000000012782c] cpuhp_thread_fun+0x8c/0x3d0 [ 4.314101] [c000000f1fb1bd60] [c0000000001678d0] smpboot_thread_fn+0x290/0x2a0 [ 4.314169] [c000000f1fb1bdc0] [c00000000015ee78] kthread+0x168/0x1b0 [ 4.314229] [c000000f1fb1be30] [c00000000000b368] ret_from_kernel_thread+0x5c/0x74 [ 4.314313] Mem-Info: [ 4.314356] active_anon:0 inactive_anon:0 isolated_anon:0 core_imc_mem_init() at system boot use alloc_pages_node() to get memory and alloc_pages_node() throws this stack dump when tried to allocate memory from a node which has no memory behind it. Add a ___GFP_NOWARN flag in allocation request as a fix. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Venkat R.B <venkatb3@in.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-12powerpc/perf: Fix for core/nest imc call trace on cpuhotplugAnju T Sudhakar1-0/+28
Nest/core pmu units are enabled only when it is used. A reference count is maintained for the events which uses the nest/core pmu units. Currently in *_imc_counters_release function a WARN() is used for notification of any underflow of ref count. The case where event ref count hit a negative value is, when perf session is started, followed by offlining of all cpus in a given core. i.e. in cpuhotplug offline path ppc_core_imc_cpu_offline() function set the ref->count to zero, if the current cpu which is about to offline is the last cpu in a given core and make an OPAL call to disable the engine in that core. And on perf session termination, perf->destroy (core_imc_counters_release) will first decrement the ref->count for this core and based on the ref->count value an opal call is made to disable the core-imc engine. Now, since cpuhotplug path already clears the ref->count for core and disabled the engine, perf->destroy() decrementing again at event termination make it negative which in turn fires the WARN_ON. The same happens for nest units. Add a check to see if the reference count is alreday zero, before decrementing the count, so that the ref count will not hit a negative value. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-12MAINTAINERS: Add Paul Mackerras as maintainer for KVM/powerpcThomas Huth1-1/+1
Paul is handling almost all of the powerpc related KVM patches nowadays, so he should be mentioned in the MAINTAINERS file accordingly. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-12KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exitHaozhong Zhang1-1/+1
When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the guest CR4. Before this CR4 loading, the guest CR4 refers to L2 CR4. Because these two CR4's are in different levels of guest, we should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which is used to handle guest writes to its CR4, checks the guest change to CR4 and may fail if the change is invalid. The failure may cause trouble. Consider we start a L1 guest with non-zero L1 PCID in use, (i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0) and a L2 guest with L2 PCID disabled, (i.e. L2 CR4.PCIDE == 0) and following events may happen: 1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4 into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e. vcpu->arch.cr4) is left to the value of L2 CR4. 2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit, kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID, because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1 CR3.PCID != 0, L0 KVM will inject GP to L1 guest. Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1") Cc: qemu-stable@nongnu.org Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-12iommu/exynos: Remove initconst attribute to avoid potential kernel oopsMarek Szyprowski1-1/+1
Exynos SYSMMU registers standard platform device with sysmmu_of_match table, what means that this table is accessed every time a new platform device is registered in a system. This might happen also after the boot, so the table must not be attributed as initconst to avoid potential kernel oops caused by access to freed memory. Fixes: 6b21a5db3642 ("iommu/exynos: Support for device tree") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-10-11ACPI: properties: Fix __acpi_node_get_property_reference() return codesSakari Ailus1-4/+6
Fix more return codes for device property: Align return codes of __acpi_node_get_property_reference(). In particular, what was missed previously: -EPROTO could be returned in certain cases, now -EINVAL; -EINVAL was returned if the property was not found, now -ENOENT; -EINVAL was returned also if the index was higher than the number of entries in a package, now -ENOENT. Reported-by: Hyungwoo Yang <hyungwoo.yang@intel.com> Fixes: 3e3119d3088f (device property: Introduce fwnode_property_get_reference_args) Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Hyungwoo Yang <hyungwoo.yang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-11ACPI: properties: Align return codes of __acpi_node_get_property_reference()Sakari Ailus2-10/+13
acpi_fwnode_get_reference_args(), the function implementing ACPI support for fwnode_property_get_reference_args(), returns directly error codes from __acpi_node_get_property_reference(). The latter uses different error codes than the OF implementation. In particular, the OF implementation uses -ENOENT to indicate that the property is not found, a reference entry is empty and there are no more references. Document and align the error codes for property for fwnode_property_get_reference_args() so that they match with of_parse_phandle_with_args(). Fixes: 3e3119d3088f (device property: Introduce fwnode_property_get_reference_args) Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-11remoteproc: imx_rproc: fix return value check in imx_rproc_addr_init()Wei Yongjun1-3/+2
In case of error, the function devm_ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-10-11xfs: handle error if xfs_btree_get_bufs failsEric Sandeen1-0/+8
Jason reported that a corrupted filesystem failed to replay the log with a metadata block out of bounds warning: XFS (dm-2): _xfs_buf_find: Block out of range: block 0x80270fff8, EOFS 0x9c40000 _xfs_buf_find() and xfs_btree_get_bufs() return NULL if that happens, and then when xfs_alloc_fix_freelist() calls xfs_trans_binval() on that NULL bp, we oops with: BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8 We don't handle _xfs_buf_find errors very well, every caller higher up the stack gets to guess at why it failed. But we should at least handle it somehow, so return EFSCORRUPTED here. Reported-by: Jason L Tibbitts III <tibbs@math.uh.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-11xfs: reinit btree pointer on attr tree inactivation walkBrian Foster1-0/+2
xfs_attr3_root_inactive() walks the attr fork tree to invalidate the associated blocks. xfs_attr3_node_inactive() recursively descends from internal blocks to leaf blocks, caching block address values along the way to revisit parent blocks, locate the next entry and descend down that branch of the tree. The code that attempts to reread the parent block is unsafe because it assumes that the local xfs_da_node_entry pointer remains valid after an xfs_trans_brelse() and re-read of the parent buffer. Under heavy memory pressure, it is possible that the buffer has been reclaimed and reallocated by the time the parent block is reread. This means that 'btree' can point to an invalid memory address, lead to a random/garbage value for child_fsb and cause the subsequent read of the attr fork to go off the rails and return a NULL buffer for an attr fork offset that is most likely not allocated. Note that this problem can be manufactured by setting XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers, creating a file with a multi-level attr fork and removing it to trigger inactivation. To address this problem, reinit the node/btree pointers to the parent buffer after it has been re-read. This ensures btree points to a valid record and allows the walk to proceed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-11xfs: Fix bool initialization/comparisonThomas Meyer5-8/+8
Bool initializations should use true and false. Bool tests don't need comparisons. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-11xfs: don't change inode mode if ACL update failsDave Chinner1-6/+16
If we get ENOSPC half way through setting the ACL, the inode mode can still be changed even though the ACL does not exist. Reorder the operation to only change the mode of the inode if the ACL is set correctly. Whilst this does not fix the problem with crash consistency (that requires attribute addition to be a deferred op) it does prevent ENOSPC and other non-fatal errors setting an xattr to be handled sanely. This fixes xfstests generic/449. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-11xfs: move more RT specific code under CONFIG_XFS_RTDave Chinner3-0/+27
Various utility functions and interfaces that iterate internal devices try to reference the realtime device even when RT support is not compiled into the kernel. Make sure this code is excluded from the CONFIG_XFS_RT=n build, and where appropriate stub functions to return fatal errors if they ever get called when RT support is not present. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-11xfs: Don't log uninitialised fields in inode structuresDave Chinner3-58/+50
Prevent kmemcheck from throwing warnings about reading uninitialised memory when formatting inodes into the incore log buffer. There are several issues here - we don't always log all the fields in the inode log format item, and we never log the inode the di_next_unlinked field. In the case of the inode log format item, this is exacerbated by the old xfs_inode_log_format structure padding issue. Hence make the padded, 64 bit aligned version of the structure the one we always use for formatting the log and get rid of the 64 bit variant. This means we'll always log the 64-bit version and so recovery only needs to convert from the unpadded 32 bit version from older 32 bit kernels. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-119p: set page uptodate when required in write_end()Alexander Levin1-3/+7
Commit 77469c3f570 prevented setting the page as uptodate when we wrote the right amount of data, fix that. Fixes: 77469c3f570 ("9p: saner ->write_end() on failing copy into non-uptodate page") Reviewed-by: Jan Kara <jack@suse.com> Signed-off-by: Alexander Levin <alexander.levin@verizon.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-11ALSA: caiaq: Fix stray URB at probe error pathTakashi Iwai1-3/+9
caiaq driver doesn't kill the URB properly at its error path during the probe, which may lead to a use-after-free error later. This patch addresses it. Reported-by: Johan Hovold <johan@kernel.org> Reviewed-by: Johan Hovold <johan@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-10-11HID: hid-elecom: extend to fix descriptor for HUGE trackballAlex Manoussakis4-4/+14
In addition to DEFT, Elecom introduced a larger trackball called HUGE, in both wired (M-HT1URBK) and wireless (M-HT1DRBK) versions. It has the same buttons and behavior as the DEFT. This patch adds the two relevant USB IDs to enable operation of the three Fn buttons on the top of the device. Cc: Diego Elio Petteno <flameeyes@flameeyes.eu> Signed-off-by: Alex Manoussakis <amanou@gnu.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-10-11HID: usbhid: fix out-of-bounds bugJaejoong Kim1-1/+11
The hid descriptor identifies the length and type of subordinate descriptors for a device. If the received hid descriptor is smaller than the size of the struct hid_descriptor, it is possible to cause out-of-bounds. In addition, if bNumDescriptors of the hid descriptor have an incorrect value, this can also cause out-of-bounds while approaching hdesc->desc[n]. So check the size of hid descriptor and bNumDescriptors. BUG: KASAN: slab-out-of-bounds in usbhid_parse+0x9b1/0xa20 Read of size 1 at addr ffff88006c5f8edf by task kworker/1:2/1261 CPU: 1 PID: 1261 Comm: kworker/1:2 Not tainted 4.14.0-rc1-42251-gebb2c2437d80 #169 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x292/0x395 lib/dump_stack.c:52 print_address_description+0x78/0x280 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 kasan_report+0x22f/0x340 mm/kasan/report.c:409 __asan_report_load1_noabort+0x19/0x20 mm/kasan/report.c:427 usbhid_parse+0x9b1/0xa20 drivers/hid/usbhid/hid-core.c:1004 hid_add_device+0x16b/0xb30 drivers/hid/hid-core.c:2944 usbhid_probe+0xc28/0x1100 drivers/hid/usbhid/hid-core.c:1369 usb_probe_interface+0x35d/0x8e0 drivers/usb/core/driver.c:361 really_probe drivers/base/dd.c:413 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523 device_add+0xd0b/0x1660 drivers/base/core.c:1835 usb_set_configuration+0x104e/0x1870 drivers/usb/core/message.c:1932 generic_probe+0x73/0xe0 drivers/usb/core/generic.c:174 usb_probe_device+0xaf/0xe0 drivers/usb/core/driver.c:266 really_probe drivers/base/dd.c:413 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523 device_add+0xd0b/0x1660 drivers/base/core.c:1835 usb_new_device+0x7b8/0x1020 drivers/usb/core/hub.c:2457 hub_port_connect drivers/usb/core/hub.c:4903 hub_port_connect_change drivers/usb/core/hub.c:5009 port_event drivers/usb/core/hub.c:5115 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119 worker_thread+0x221/0x1850 kernel/workqueue.c:2253 kthread+0x3a1/0x470 kernel/kthread.c:231 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431 Cc: stable@vger.kernel.org Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Jaejoong Kim <climbbb.kim@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-10-11livepatch: unpatch all klp_objects if klp_module_coming failsJoe Lawrence1-23/+37
When an incoming module is considered for livepatching by klp_module_coming(), it iterates over multiple patches and multiple kernel objects in this order: list_for_each_entry(patch, &klp_patches, list) { klp_for_each_object(patch, obj) { which means that if one of the kernel objects fails to patch, klp_module_coming()'s error path needs to unpatch and cleanup any kernel objects that were already patched by a previous patch. Reported-by: Miroslav Benes <mbenes@suse.cz> Suggested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-10-11ALSA: seq: Fix use-after-free at creating a portTakashi Iwai2-3/+10
There is a potential race window opened at creating and deleting a port via ioctl, as spotted by fuzzing. snd_seq_create_port() creates a port object and returns its pointer, but it doesn't take the refcount, thus it can be deleted immediately by another thread. Meanwhile, snd_seq_ioctl_create_port() still calls the function snd_seq_system_client_ev_port_start() with the created port object that is being deleted, and this triggers use-after-free like: BUG: KASAN: use-after-free in snd_seq_ioctl_create_port+0x504/0x630 [snd_seq] at addr ffff8801f2241cb1 ============================================================================= BUG kmalloc-512 (Tainted: G B ): kasan: bad access detected ----------------------------------------------------------------------------- INFO: Allocated in snd_seq_create_port+0x94/0x9b0 [snd_seq] age=1 cpu=3 pid=4511 ___slab_alloc+0x425/0x460 __slab_alloc+0x20/0x40 kmem_cache_alloc_trace+0x150/0x190 snd_seq_create_port+0x94/0x9b0 [snd_seq] snd_seq_ioctl_create_port+0xd1/0x630 [snd_seq] snd_seq_do_ioctl+0x11c/0x190 [snd_seq] snd_seq_ioctl+0x40/0x80 [snd_seq] do_vfs_ioctl+0x54b/0xda0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x16/0x75 INFO: Freed in port_delete+0x136/0x1a0 [snd_seq] age=1 cpu=2 pid=4717 __slab_free+0x204/0x310 kfree+0x15f/0x180 port_delete+0x136/0x1a0 [snd_seq] snd_seq_delete_port+0x235/0x350 [snd_seq] snd_seq_ioctl_delete_port+0xc8/0x180 [snd_seq] snd_seq_do_ioctl+0x11c/0x190 [snd_seq] snd_seq_ioctl+0x40/0x80 [snd_seq] do_vfs_ioctl+0x54b/0xda0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x16/0x75 Call Trace: [<ffffffff81b03781>] dump_stack+0x63/0x82 [<ffffffff81531b3b>] print_trailer+0xfb/0x160 [<ffffffff81536db4>] object_err+0x34/0x40 [<ffffffff815392d3>] kasan_report.part.2+0x223/0x520 [<ffffffffa07aadf4>] ? snd_seq_ioctl_create_port+0x504/0x630 [snd_seq] [<ffffffff815395fe>] __asan_report_load1_noabort+0x2e/0x30 [<ffffffffa07aadf4>] snd_seq_ioctl_create_port+0x504/0x630 [snd_seq] [<ffffffffa07aa8f0>] ? snd_seq_ioctl_delete_port+0x180/0x180 [snd_seq] [<ffffffff8136be50>] ? taskstats_exit+0xbc0/0xbc0 [<ffffffffa07abc5c>] snd_seq_do_ioctl+0x11c/0x190 [snd_seq] [<ffffffffa07abd10>] snd_seq_ioctl+0x40/0x80 [snd_seq] [<ffffffff8136d433>] ? acct_account_cputime+0x63/0x80 [<ffffffff815b515b>] do_vfs_ioctl+0x54b/0xda0 ..... We may fix this in a few different ways, and in this patch, it's fixed simply by taking the refcount properly at snd_seq_create_port() and letting the caller unref the object after use. Also, there is another potential use-after-free by sprintf() call in snd_seq_create_port(), and this is moved inside the lock. This fix covers CVE-2017-15265. Reported-and-tested-by: Michael23 Yu <ycqzsy@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-10-10bio_copy_user_iov(): don't ignore ->iov_offsetAl Viro1-2/+2
Since "block: support large requests in blk_rq_map_user_iov" we started to call it with partially drained iter; that works fine on the write side, but reads create a copy of iter for completion time. And that needs to take the possibility of ->iov_iter != 0 into account... Cc: stable@vger.kernel.org #v4.5+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-10more bio_map_user_iov() leak fixesAl Viro1-5/+9
we need to take care of failure exit as well - pages already in bio should be dropped by analogue of bio_unmap_pages(), since their refcounts had been bumped only once per reference in bio. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-10fix unbalanced page refcounting in bio_map_user_iovVitaly Mayatskikh1-0/+8
bio_map_user_iov and bio_unmap_user do unbalanced pages refcounting if IO vector has small consecutive buffers belonging to the same page. bio_add_pc_page merges them into one, but the page reference is never dropped. Cc: stable@vger.kernel.org Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-10direct-io: Prevent NULL pointer access in submit_page_sectionAndreas Gruenbacher1-1/+2
In the code added to function submit_page_section by commit b1058b981, sdio->bio can currently be NULL when calling dio_bio_submit. This then leads to a NULL pointer access in dio_bio_submit, so check for a NULL bio in submit_page_section before trying to submit it instead. Fixes xfstest generic/250 on gfs2. Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-10seccomp: make function __get_seccomp_filter staticColin Ian King1-1/+1
The function __get_seccomp_filter is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: symbol '__get_seccomp_filter' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Fixes: 66a733ea6b61 ("seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2017-10-10remoteproc: qcom: fix RPMSG_QCOM_GLINK_SMEM dependenciesArnd Bergmann1-0/+2
When RPMSG_QCOM_GLINK_SMEM=m and one driver causes the qcom_common.c file to be compiled as built-in, we get a link error: drivers/remoteproc/qcom_common.o: In function `glink_subdev_remove': qcom_common.c:(.text+0x130): undefined reference to `qcom_glink_smem_unregister' qcom_common.c:(.text+0x130): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `qcom_glink_smem_unregister' drivers/remoteproc/qcom_common.o: In function `glink_subdev_probe': qcom_common.c:(.text+0x160): undefined reference to `qcom_glink_smem_register' qcom_common.c:(.text+0x160): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `qcom_glink_smem_register' Out of the three PIL driver instances, QCOM_ADSP_PIL already has a Kconfig dependency to prevent this from happening, but the other two do not. This adds the same dependency there. Fixes: eea07023e6d9 ("remoteproc: qcom: adsp: Allow defining GLINK edge") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-10-10remoteproc: imx_rproc: fix a couple off by one bugsDan Carpenter1-2/+2
The priv->mem[] array has IMX7D_RPROC_MEM_MAX elements so the > should be >= to avoid writing one element beyond the end of the array. Fixes: a0ff4aa6f010 ("remoteproc: imx_rproc: add a NXP/Freescale imx_rproc driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-10-10rpmsg: glink: Fix memory leak in qcom_glink_alloc_intent()Dan Carpenter1-3/+8
We need to free "intent" and "intent->data" on a couple error paths. Fixes: 933b45da5d1d ("rpmsg: glink: Add support for TX intents") Acked-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-10-10rpmsg: glink: Unlock on error in qcom_glink_request_intent()Dan Carpenter1-1/+2
If qcom_glink_tx() fails, then we need to unlock before returning the error code. Fixes: 27b9c5b66b23 ("rpmsg: glink: Request for intents when unavailable") Acked-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-10-10iommu/amd: Do not disable SWIOTLB if SME is activeTom Lendacky1-4/+6
When SME memory encryption is active it will rely on SWIOTLB to handle DMA for devices that cannot support the addressing requirements of having the encryption mask set in the physical address. The IOMMU currently disables SWIOTLB if it is not running in passthrough mode. This is not desired as non-PCI devices attempting DMA may fail. Update the code to check if SME is active and not disable SWIOTLB. Fixes: 2543a786aa25 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-10-11crypto: shash - Fix zero-length shash ahash digest crashHerbert Xu1-3/+5
The shash ahash digest adaptor function may crash if given a zero-length input together with a null SG list. This is because it tries to read the SG list before looking at the length. This patch fixes it by checking the length first. Cc: <stable@vger.kernel.org> Reported-by: Stephan Müller<smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Stephan Müller <smueller@chronox.de>
2017-10-10quota: Generate warnings for DQUOT_SPACE_NOFAIL allocationsJan Kara1-11/+16
Eryu has reported that since commit 7b9ca4c61bc2 "quota: Reduce contention on dq_data_lock" test generic/233 occasionally fails. This is caused by the fact that since that commit we don't generate warning and set grace time for quota allocations that have DQUOT_SPACE_NOFAIL set (these are for example some metadata allocations in ext4). We need these allocations to behave regularly wrt warning generation and grace time setting so fix the code to return to the original behavior. Reported-and-tested-by: Eryu Guan <eguan@redhat.com> CC: stable@vger.kernel.org Fixes: 7b9ca4c61bc278b771fb57d6290a31ab1fc7fdac Signed-off-by: Jan Kara <jack@suse.cz>
2017-10-10KVM: MMU: always terminate page walks at level 1Ladi Prosek2-8/+9
is_last_gpte() is not equivalent to the pseudo-code given in commit 6bb69c9b69c31 ("KVM: MMU: simplify last_pte_bitmap") because an incorrect value of last_nonleaf_level may override the result even if level == 1. It is critical for is_last_gpte() to return true on level == 1 to terminate page walks. Otherwise memory corruption may occur as level is used as an index to various data structures throughout the page walking code. Even though the actual bug would be wherever the MMU is initialized (as in the previous patch), be defensive and ensure here that is_last_gpte() returns the correct value. This patch is also enough to fix CVE-2017-12188. Fixes: 6bb69c9b69c315200ddc2bc79aee14c0184cf5b2 Cc: stable@vger.kernel.org Cc: Andy Honig <ahonig@google.com> Signed-off-by: Ladi Prosek <lprosek@redhat.com> [Panic if walk_addr_generic gets an incorrect level; this is a serious bug and it's not worth a WARN_ON where the recovery path might hide further exploitable issues; suggested by Andrew Honig. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>