Age | Commit message (Collapse) | Author | Files | Lines |
|
This reverts commits 5d17a73a2ebe ("vmalloc: back off when the current
task is killed") and 171012f56127 ("mm: don't warn when vmalloc() fails
due to a fatal signal").
Commit 5d17a73a2ebe ("vmalloc: back off when the current task is
killed") made all vmalloc allocations from a signal-killed task fail.
We have seen crashes in the tty driver from this, where a killed task
exiting tries to switch back to N_TTY, fails n_tty_open because of the
vmalloc failing, and later crashes when dereferencing tty->disc_data.
Arguably, relying on a vmalloc() call to succeed in order to properly
exit a task is not the most robust way of doing things. There will be a
follow-up patch to the tty code to fall back to the N_NULL ldisc.
But the justification to make that vmalloc() call fail like this isn't
convincing, either. The patch mentions an OOM victim exhausting the
memory reserves and thus deadlocking the machine. But the OOM killer is
only one, improbable source of fatal signals. It doesn't make sense to
fail allocations preemptively with plenty of memory in most cases.
The patch doesn't mention real-life instances where vmalloc sites would
exhaust memory, which makes it sound more like a theoretical issue to
begin with. But just in case, the OOM access to memory reserves has
been restricted on the allocator side in cd04ae1e2dc8 ("mm, oom: do not
rely on TIF_MEMDIE for memory reserves access"), which should take care
of any theoretical concerns on that front.
Revert this patch, and the follow-up that suppresses the allocation
warnings when we fail the allocations due to a signal.
Link: http://lkml.kernel.org/r/20171004185906.GB2136@cmpxchg.org
Fixes: 171012f56127 ("mm: don't warn when vmalloc() fails due to a fatal signal")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alan Cox <alan@llwyncelyn.cymru>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
cma_alloc() unconditionally prints an INFO message when the CMA
allocation fails. Make this message conditional on the non-presence of
__GFP_NOWARN in gfp_mask.
This patch aims at removing INFO messages that are displayed when the
VC4 driver tries to allocate buffer objects. From the driver
perspective an allocation failure is acceptable, and the driver can
possibly do something to make following allocation succeed (like
flushing the VC4 internal cache).
Link: http://lkml.kernel.org/r/20171004125447.15195-1-boris.brezillon@free-electrons.com
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Eric Anholt <eric@anholt.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
gcc on aarch64 may emit synbols of type 'n' if the kernel is built with
'-frecord-gcc-switches'. In most cases, those symbols are reported with
nm as
000000000000000e n $d
and with objdump as
0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line
000000000000000e l .GCC.command.line 0000000000000000 $d
Those symbols are detected in is_arm_mapping_symbol() and ignored.
However, if "--prefix-symbols=<prefix>" is configured as well, the
situation is different. For example, in efi/libstub, arm64 images are
built with
'--prefix-alloc-sections=.init --prefix-symbols=__efistub_'.
In combination with '-frecord-gcc-switches', the symbols are now reported
by nm as:
000000000000000e n __efistub_$d
and by objdump as:
0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line
000000000000000e l .GCC.command.line 0000000000000000 __efistub_$d
Those symbols are no longer ignored and included in the base address
calculation. This results in a base address of 000000000000000e, which
in turn causes kallsyms to abort with
kallsyms failure:
relative symbol value 0xffffff900800a000 out of range in relative mode
The problem is seen in little endian arm64 builds with CONFIG_EFI
enabled and with '-frecord-gcc-switches' set in KCFLAGS.
Explicitly ignore symbols of type 'n' since those are clearly debug
symbols.
Link: http://lkml.kernel.org/r/1507136063-3139-1-git-send-email-linux@roeck-us.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I was stress testing some backports and with high load, after some time,
the latest version of the selftest showed some false positive in
connection with the uffdio_copy_retry. This seems to fix it while still
exercising -EEXIST in the background transfer once in a while.
The fork child will quit after the last UFFDIO_COPY is run, so a
repeated UFFDIO_COPY may not return -EEXIST. This change restricts the
-EEXIST stress to the background transfer where the memory can't go away
from under it.
Also updated uffdio_zeropage, so the interface is consistent.
Link: http://lkml.kernel.org/r/20171004171541.1495-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When I execute numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
and displays cpumask_of_node for each node), I get different result
on X86 and arm64. For each numa node, the former only displayed online
CPUs, and the latter displayed all possible CPUs. Unfortunately, both
Linux documentation and numactl manual have not described it clear.
I sent a mail to ask for help, and Michal Hocko replied that he
preferred to print online cpus because it doesn't really make much sense
to bind anything on offline nodes.
Will said:
"I suspect the vast majority (if not all) code that reads this file was
developed for x86, so having the same behaviour for arm64 sounds like
something we should do ASAP before people try to special case with
things like #ifdef __aarch64__. I'd rather have this in 4.14 if
possible."
Link: http://lkml.kernel.org/r/1506678805-15392-2-git-send-email-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tianhong Ding <dingtianhong@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Libin <huawei.libin@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A non present pmd entry can appear after pmd_lock is taken in
page_vma_mapped_walk(), even if THP migration is not enabled. The
WARN_ONCE is unnecessary.
Link: http://lkml.kernel.org/r/20171003142606.12324-1-zi.yan@sent.com
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 3a321d2a3dde ("mm: change the call sites of numa statistics
items") separated NUMA counters from zone counters, but the
NUMA_INTERLEAVE_HIT call site wasn't updated to use the new interface.
So alloc_page_interleave() actually increments NR_ZONE_INACTIVE_FILE
instead of NUMA_INTERLEAVE_HIT.
Fix this by using __inc_numa_state() interface to increment
NUMA_INTERLEAVE_HIT.
Link: http://lkml.kernel.org/r/20171003191003.8573-1-aryabinin@virtuozzo.com
Fixes: 3a321d2a3dde ("mm: change the call sites of numa statistics items")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Kemi Wang <kemi.wang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The pci-rcar driver is enabled for compile tests, and this has shown that
the driver cannot build without CONFIG_OF, following the inclusion of
commit f8f2fe7355fb ("PCI: rcar: Use new OF interrupt mapping when possible"):
drivers/pci/host/pcie-rcar.c: In function 'pci_dma_range_parser_init':
drivers/pci/host/pcie-rcar.c:1039:2: error: implicit declaration of function 'of_n_addr_cells' [-Werror=implicit-function-declaration]
parser->pna = of_n_addr_cells(node);
^
As pointed out by Ben Dooks and Geert Uytterhoeven, this is actually
supposed to build fine, which we can achieve if we make the declaration
of of_irq_parse_and_map_pci conditional on CONFIG_OF and provide an
empty inline function otherwise, as we do for a lot of other of
interfaces.
This lets us build the rcar_pci driver again without CONFIG_OF for build
testing. All platforms using this driver select OF, so this doesn't
change anything for the users.
[akpm@linux-foundation.org: be consistent with surrounding code]
Link: http://lkml.kernel.org/r/20170911200805.3363318-1-arnd@arndb.de
Fixes: c25da4778803 ("PCI: rcar: Add Renesas R-Car PCIe driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Magnus Damm <damm@opensource.se>
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
mm/madvise.c has a brief description about all MADV_ flags. Add a
description for the newly added MADV_WIPEONFORK and MADV_KEEPONFORK.
Although man page has the similar information, but it'd better to keep
the consistent with other flags.
Link: http://lkml.kernel.org/r/1506117328-88228-1-git-send-email-yang.s@alibaba-inc.com
Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Expand the "Runtime testing" menu by including more entries inside it
instead of after it. This is just Kconfig symbol movement.
This causes the (arch-independent) Runtime tests to be presented
(listed) all in one place instead of in multiple places.
Link: http://lkml.kernel.org/r/c194e5c4-2042-bf94-a2d8-7aa13756e257@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Index was incremented before last use and thus the second array could
dereference to an invalid address (not mentioning the fact that it did
not properly clear the entry we intended to clear).
Link: http://lkml.kernel.org/r/1506973525-16491-1-git-send-email-jglisse@redhat.com
Fixes: 8315ada7f095bf ("mm/migrate: allow migrate_vma() to alloc new page on empty entry")
Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The function only sends the flush command to the IOMMU(s),
but does not wait for its completion when it returns. Fix
that.
Fixes: 601367d76bd1 ('x86/amd-iommu: Remove iommu_flush_domain function')
Cc: stable@vger.kernel.org # >= 2.6.33
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Panic observed with latest firmware, and upstream kernel:
NIP init_imc_pmu+0x8c/0xcf0
LR init_imc_pmu+0x2f8/0xcf0
Call Trace:
init_imc_pmu+0x2c8/0xcf0 (unreliable)
opal_imc_counters_probe+0x300/0x400
platform_drv_probe+0x64/0x110
driver_probe_device+0x3d8/0x580
__driver_attach+0x14c/0x1a0
bus_for_each_dev+0x8c/0xf0
driver_attach+0x34/0x50
bus_add_driver+0x298/0x350
driver_register+0x9c/0x180
__platform_driver_register+0x5c/0x70
opal_imc_driver_init+0x2c/0x40
do_one_initcall+0x64/0x1d0
kernel_init_freeable+0x280/0x374
kernel_init+0x24/0x160
ret_from_kernel_thread+0x5c/0x74
While registering nest imc at init, cpu-hotplug callback
nest_pmu_cpumask_init() makes an OPAL call to stop the engine. And if
the OPAL call fails, imc_common_cpuhp_mem_free() is invoked to cleanup
memory and cpuhotplug setup.
But when cleaning up the attribute group, we are dereferencing the
attribute element array without checking whether the backing element
is not NULL. This causes the kernel panic.
Add a check for the backing element prior to dereferencing the
attribute element, to handle the failing case gracefully.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
[mpe: Trim change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
If faddr2line is given a function name which is the last one listed by
"nm -n", it will fail because it never finds the next symbol.
So teach the awk script to catch that possibility, and use 'size' to
provide the end point of the last function.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Using CONFIG_OF_DYNAMIC=y uncovered an imbalance in the usecount of the
node being passed to of_fwnode_graph_get_port_parent(). Preserve the
usecount by using of_get_parent() instead of of_get_next_parent() which
don't decrement the usecount of the node passed to it.
Fixes: 3b27d00e7b6d7c88 ("device property: Move fwnode graph ops to firmware specific locations")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
There are two types of memory reservations firmware can ask the kernel
to make in the device tree: static and dynamic.
See Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
If you have greater than 16 entries in /reserved-memory (as we do on
POWER9 systems) you would get this scary looking error message:
[ 0.000000] OF: reserved mem: not enough space all defined regions.
This is harmless if all your reservations are static (which with OPAL on
POWER9, they are).
It is not harmless if you have any dynamic reservations after the 16th.
In the first pass over the fdt to find reservations, the child nodes of
/reserved-memory are added to a static array in of_reserved_mem.c so that
memory can be reserved in a 2nd pass. The array has 16 entries. This is why,
on my dual socket POWER9 system, I get that error 4 times with 20 static
reservations.
We don't have a problem on ppc though, as in arch/powerpc/kernel/prom.c
we look at the new style /reserved-ranges property to do reservations,
and this logic was introduced in 0962e8004e974 (well before any powernv
system shipped).
A Google search shows up no occurances of that exact error message, so we're
probably safe in that no machine that people use has memory not being reserved
when it should be.
The simple fix is to bump the length of the array to 32 which "should be
enough for everyone(TM)". The simple fix of not recording static allocations
in the array would cause problems for devices with "memory-region" properties.
A more future-proof fix is likely possible, although more invasive and this
simple fix is perfectly suitable in the meantime while a more future-proof
fix is developed.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Do not strdup() console options. It seems that the only reason for
it to be strdup()-ed was a compilation warning: printk, UART and
console drivers, for some reason, expect char pointer instead of
const char pointer. So we can just pass `of_stdout_options', but
need to cast it to char pointer. A better fix would be to change
printk, console drivers and UART to accept const char `options';
but that will take time - there are lots of drivers to update.
The patch also fixes a possible memory leak: add_preferred_console()
can fail, but we don't kfree() options.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Stack trace output during a stress test:
[ 4.310049] Freeing initrd memory: 22592K
[ 4.310646] rtas_flash: no firmware flash support
[ 4.313341] cpuhp/64: page allocation failure: order:0, mode:0x14480c0(GFP_KERNEL|__GFP_ZERO|__GFP_THISNODE), nodemask=(null)
[ 4.313465] cpuhp/64 cpuset=/ mems_allowed=0
[ 4.313521] CPU: 64 PID: 392 Comm: cpuhp/64 Not tainted 4.11.0-39.el7a.ppc64le #1
[ 4.313588] Call Trace:
[ 4.313622] [c000000f1fb1b8e0] [c000000000c09388] dump_stack+0xb0/0xf0 (unreliable)
[ 4.313694] [c000000f1fb1b920] [c00000000030ef6c] warn_alloc+0x12c/0x1c0
[ 4.313753] [c000000f1fb1b9c0] [c00000000030ff68] __alloc_pages_nodemask+0xea8/0x1000
[ 4.313823] [c000000f1fb1bbb0] [c000000000113a8c] core_imc_mem_init+0xbc/0x1c0
[ 4.313892] [c000000f1fb1bc00] [c000000000113cdc] ppc_core_imc_cpu_online+0x14c/0x170
[ 4.313962] [c000000f1fb1bc90] [c000000000125758] cpuhp_invoke_callback+0x198/0x5d0
[ 4.314031] [c000000f1fb1bd00] [c00000000012782c] cpuhp_thread_fun+0x8c/0x3d0
[ 4.314101] [c000000f1fb1bd60] [c0000000001678d0] smpboot_thread_fn+0x290/0x2a0
[ 4.314169] [c000000f1fb1bdc0] [c00000000015ee78] kthread+0x168/0x1b0
[ 4.314229] [c000000f1fb1be30] [c00000000000b368] ret_from_kernel_thread+0x5c/0x74
[ 4.314313] Mem-Info:
[ 4.314356] active_anon:0 inactive_anon:0 isolated_anon:0
core_imc_mem_init() at system boot use alloc_pages_node() to get memory
and alloc_pages_node() throws this stack dump when tried to allocate
memory from a node which has no memory behind it. Add a ___GFP_NOWARN
flag in allocation request as a fix.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Venkat R.B <venkatb3@in.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Nest/core pmu units are enabled only when it is used. A reference count is
maintained for the events which uses the nest/core pmu units. Currently in
*_imc_counters_release function a WARN() is used for notification of any
underflow of ref count.
The case where event ref count hit a negative value is, when perf session is
started, followed by offlining of all cpus in a given core.
i.e. in cpuhotplug offline path ppc_core_imc_cpu_offline() function set the
ref->count to zero, if the current cpu which is about to offline is the last
cpu in a given core and make an OPAL call to disable the engine in that core.
And on perf session termination, perf->destroy (core_imc_counters_release) will
first decrement the ref->count for this core and based on the ref->count value
an opal call is made to disable the core-imc engine.
Now, since cpuhotplug path already clears the ref->count for core and disabled
the engine, perf->destroy() decrementing again at event termination make it
negative which in turn fires the WARN_ON. The same happens for nest units.
Add a check to see if the reference count is alreday zero, before decrementing
the count, so that the ref count will not hit a negative value.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Paul is handling almost all of the powerpc related KVM patches nowadays,
so he should be mentioned in the MAINTAINERS file accordingly.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the
guest CR4. Before this CR4 loading, the guest CR4 refers to L2
CR4. Because these two CR4's are in different levels of guest, we
should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which
is used to handle guest writes to its CR4, checks the guest change to
CR4 and may fail if the change is invalid.
The failure may cause trouble. Consider we start
a L1 guest with non-zero L1 PCID in use,
(i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0)
and
a L2 guest with L2 PCID disabled,
(i.e. L2 CR4.PCIDE == 0)
and following events may happen:
1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4
into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because
of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e.
vcpu->arch.cr4) is left to the value of L2 CR4.
2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit,
kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID,
because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1
CR3.PCID != 0, L0 KVM will inject GP to L1 guest.
Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1")
Cc: qemu-stable@nongnu.org
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Exynos SYSMMU registers standard platform device with sysmmu_of_match
table, what means that this table is accessed every time a new platform
device is registered in a system. This might happen also after the boot,
so the table must not be attributed as initconst to avoid potential kernel
oops caused by access to freed memory.
Fixes: 6b21a5db3642 ("iommu/exynos: Support for device tree")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Fix more return codes for device property: Align return codes of
__acpi_node_get_property_reference().
In particular, what was missed previously:
-EPROTO could be returned in certain cases, now -EINVAL;
-EINVAL was returned if the property was not found, now -ENOENT;
-EINVAL was returned also if the index was higher than the number of
entries in a package, now -ENOENT.
Reported-by: Hyungwoo Yang <hyungwoo.yang@intel.com>
Fixes: 3e3119d3088f (device property: Introduce fwnode_property_get_reference_args)
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Hyungwoo Yang <hyungwoo.yang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
acpi_fwnode_get_reference_args(), the function implementing ACPI
support for fwnode_property_get_reference_args(), returns directly
error codes from __acpi_node_get_property_reference(). The latter
uses different error codes than the OF implementation. In particular,
the OF implementation uses -ENOENT to indicate that the property is
not found, a reference entry is empty and there are no more
references.
Document and align the error codes for property for
fwnode_property_get_reference_args() so that they match with
of_parse_phandle_with_args().
Fixes: 3e3119d3088f (device property: Introduce fwnode_property_get_reference_args)
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In case of error, the function devm_ioremap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
Jason reported that a corrupted filesystem failed to replay
the log with a metadata block out of bounds warning:
XFS (dm-2): _xfs_buf_find: Block out of range: block 0x80270fff8, EOFS 0x9c40000
_xfs_buf_find() and xfs_btree_get_bufs() return NULL if
that happens, and then when xfs_alloc_fix_freelist() calls
xfs_trans_binval() on that NULL bp, we oops with:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8
We don't handle _xfs_buf_find errors very well, every
caller higher up the stack gets to guess at why it failed.
But we should at least handle it somehow, so return
EFSCORRUPTED here.
Reported-by: Jason L Tibbitts III <tibbs@math.uh.edu>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
associated blocks. xfs_attr3_node_inactive() recursively descends
from internal blocks to leaf blocks, caching block address values
along the way to revisit parent blocks, locate the next entry and
descend down that branch of the tree.
The code that attempts to reread the parent block is unsafe because
it assumes that the local xfs_da_node_entry pointer remains valid
after an xfs_trans_brelse() and re-read of the parent buffer. Under
heavy memory pressure, it is possible that the buffer has been
reclaimed and reallocated by the time the parent block is reread.
This means that 'btree' can point to an invalid memory address, lead
to a random/garbage value for child_fsb and cause the subsequent
read of the attr fork to go off the rails and return a NULL buffer
for an attr fork offset that is most likely not allocated.
Note that this problem can be manufactured by setting
XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
creating a file with a multi-level attr fork and removing it to
trigger inactivation.
To address this problem, reinit the node/btree pointers to the
parent buffer after it has been re-read. This ensures btree points
to a valid record and allows the walk to proceed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Bool initializations should use true and false. Bool tests don't need
comparisons.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If we get ENOSPC half way through setting the ACL, the inode mode
can still be changed even though the ACL does not exist. Reorder the
operation to only change the mode of the inode if the ACL is set
correctly.
Whilst this does not fix the problem with crash consistency (that requires
attribute addition to be a deferred op) it does prevent ENOSPC and other
non-fatal errors setting an xattr to be handled sanely.
This fixes xfstests generic/449.
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Various utility functions and interfaces that iterate internal
devices try to reference the realtime device even when RT support is
not compiled into the kernel.
Make sure this code is excluded from the CONFIG_XFS_RT=n build,
and where appropriate stub functions to return fatal errors if
they ever get called when RT support is not present.
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Prevent kmemcheck from throwing warnings about reading uninitialised
memory when formatting inodes into the incore log buffer. There are
several issues here - we don't always log all the fields in the
inode log format item, and we never log the inode the
di_next_unlinked field.
In the case of the inode log format item, this is exacerbated
by the old xfs_inode_log_format structure padding issue. Hence make
the padded, 64 bit aligned version of the structure the one we always
use for formatting the log and get rid of the 64 bit variant. This
means we'll always log the 64-bit version and so recovery only needs
to convert from the unpadded 32 bit version from older 32 bit
kernels.
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Commit 77469c3f570 prevented setting the page as uptodate when we wrote
the right amount of data, fix that.
Fixes: 77469c3f570 ("9p: saner ->write_end() on failing copy into non-uptodate page")
Reviewed-by: Jan Kara <jack@suse.com>
Signed-off-by: Alexander Levin <alexander.levin@verizon.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
caiaq driver doesn't kill the URB properly at its error path during
the probe, which may lead to a use-after-free error later. This patch
addresses it.
Reported-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In addition to DEFT, Elecom introduced a larger trackball called HUGE, in
both wired (M-HT1URBK) and wireless (M-HT1DRBK) versions. It has the same
buttons and behavior as the DEFT. This patch adds the two relevant USB IDs
to enable operation of the three Fn buttons on the top of the device.
Cc: Diego Elio Petteno <flameeyes@flameeyes.eu>
Signed-off-by: Alex Manoussakis <amanou@gnu.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
The hid descriptor identifies the length and type of subordinate
descriptors for a device. If the received hid descriptor is smaller than
the size of the struct hid_descriptor, it is possible to cause
out-of-bounds.
In addition, if bNumDescriptors of the hid descriptor have an incorrect
value, this can also cause out-of-bounds while approaching hdesc->desc[n].
So check the size of hid descriptor and bNumDescriptors.
BUG: KASAN: slab-out-of-bounds in usbhid_parse+0x9b1/0xa20
Read of size 1 at addr ffff88006c5f8edf by task kworker/1:2/1261
CPU: 1 PID: 1261 Comm: kworker/1:2 Not tainted
4.14.0-rc1-42251-gebb2c2437d80 #169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x292/0x395 lib/dump_stack.c:52
print_address_description+0x78/0x280 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351
kasan_report+0x22f/0x340 mm/kasan/report.c:409
__asan_report_load1_noabort+0x19/0x20 mm/kasan/report.c:427
usbhid_parse+0x9b1/0xa20 drivers/hid/usbhid/hid-core.c:1004
hid_add_device+0x16b/0xb30 drivers/hid/hid-core.c:2944
usbhid_probe+0xc28/0x1100 drivers/hid/usbhid/hid-core.c:1369
usb_probe_interface+0x35d/0x8e0 drivers/usb/core/driver.c:361
really_probe drivers/base/dd.c:413
driver_probe_device+0x610/0xa00 drivers/base/dd.c:557
__device_attach_driver+0x230/0x290 drivers/base/dd.c:653
bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463
__device_attach+0x26e/0x3d0 drivers/base/dd.c:710
device_initial_probe+0x1f/0x30 drivers/base/dd.c:757
bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523
device_add+0xd0b/0x1660 drivers/base/core.c:1835
usb_set_configuration+0x104e/0x1870 drivers/usb/core/message.c:1932
generic_probe+0x73/0xe0 drivers/usb/core/generic.c:174
usb_probe_device+0xaf/0xe0 drivers/usb/core/driver.c:266
really_probe drivers/base/dd.c:413
driver_probe_device+0x610/0xa00 drivers/base/dd.c:557
__device_attach_driver+0x230/0x290 drivers/base/dd.c:653
bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463
__device_attach+0x26e/0x3d0 drivers/base/dd.c:710
device_initial_probe+0x1f/0x30 drivers/base/dd.c:757
bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523
device_add+0xd0b/0x1660 drivers/base/core.c:1835
usb_new_device+0x7b8/0x1020 drivers/usb/core/hub.c:2457
hub_port_connect drivers/usb/core/hub.c:4903
hub_port_connect_change drivers/usb/core/hub.c:5009
port_event drivers/usb/core/hub.c:5115
hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
worker_thread+0x221/0x1850 kernel/workqueue.c:2253
kthread+0x3a1/0x470 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Cc: stable@vger.kernel.org
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Jaejoong Kim <climbbb.kim@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
When an incoming module is considered for livepatching by
klp_module_coming(), it iterates over multiple patches and multiple
kernel objects in this order:
list_for_each_entry(patch, &klp_patches, list) {
klp_for_each_object(patch, obj) {
which means that if one of the kernel objects fails to patch,
klp_module_coming()'s error path needs to unpatch and cleanup any kernel
objects that were already patched by a previous patch.
Reported-by: Miroslav Benes <mbenes@suse.cz>
Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
There is a potential race window opened at creating and deleting a
port via ioctl, as spotted by fuzzing. snd_seq_create_port() creates
a port object and returns its pointer, but it doesn't take the
refcount, thus it can be deleted immediately by another thread.
Meanwhile, snd_seq_ioctl_create_port() still calls the function
snd_seq_system_client_ev_port_start() with the created port object
that is being deleted, and this triggers use-after-free like:
BUG: KASAN: use-after-free in snd_seq_ioctl_create_port+0x504/0x630 [snd_seq] at addr ffff8801f2241cb1
=============================================================================
BUG kmalloc-512 (Tainted: G B ): kasan: bad access detected
-----------------------------------------------------------------------------
INFO: Allocated in snd_seq_create_port+0x94/0x9b0 [snd_seq] age=1 cpu=3 pid=4511
___slab_alloc+0x425/0x460
__slab_alloc+0x20/0x40
kmem_cache_alloc_trace+0x150/0x190
snd_seq_create_port+0x94/0x9b0 [snd_seq]
snd_seq_ioctl_create_port+0xd1/0x630 [snd_seq]
snd_seq_do_ioctl+0x11c/0x190 [snd_seq]
snd_seq_ioctl+0x40/0x80 [snd_seq]
do_vfs_ioctl+0x54b/0xda0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x16/0x75
INFO: Freed in port_delete+0x136/0x1a0 [snd_seq] age=1 cpu=2 pid=4717
__slab_free+0x204/0x310
kfree+0x15f/0x180
port_delete+0x136/0x1a0 [snd_seq]
snd_seq_delete_port+0x235/0x350 [snd_seq]
snd_seq_ioctl_delete_port+0xc8/0x180 [snd_seq]
snd_seq_do_ioctl+0x11c/0x190 [snd_seq]
snd_seq_ioctl+0x40/0x80 [snd_seq]
do_vfs_ioctl+0x54b/0xda0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x16/0x75
Call Trace:
[<ffffffff81b03781>] dump_stack+0x63/0x82
[<ffffffff81531b3b>] print_trailer+0xfb/0x160
[<ffffffff81536db4>] object_err+0x34/0x40
[<ffffffff815392d3>] kasan_report.part.2+0x223/0x520
[<ffffffffa07aadf4>] ? snd_seq_ioctl_create_port+0x504/0x630 [snd_seq]
[<ffffffff815395fe>] __asan_report_load1_noabort+0x2e/0x30
[<ffffffffa07aadf4>] snd_seq_ioctl_create_port+0x504/0x630 [snd_seq]
[<ffffffffa07aa8f0>] ? snd_seq_ioctl_delete_port+0x180/0x180 [snd_seq]
[<ffffffff8136be50>] ? taskstats_exit+0xbc0/0xbc0
[<ffffffffa07abc5c>] snd_seq_do_ioctl+0x11c/0x190 [snd_seq]
[<ffffffffa07abd10>] snd_seq_ioctl+0x40/0x80 [snd_seq]
[<ffffffff8136d433>] ? acct_account_cputime+0x63/0x80
[<ffffffff815b515b>] do_vfs_ioctl+0x54b/0xda0
.....
We may fix this in a few different ways, and in this patch, it's fixed
simply by taking the refcount properly at snd_seq_create_port() and
letting the caller unref the object after use. Also, there is another
potential use-after-free by sprintf() call in snd_seq_create_port(),
and this is moved inside the lock.
This fix covers CVE-2017-15265.
Reported-and-tested-by: Michael23 Yu <ycqzsy@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Since "block: support large requests in blk_rq_map_user_iov" we
started to call it with partially drained iter; that works fine
on the write side, but reads create a copy of iter for completion
time. And that needs to take the possibility of ->iov_iter != 0
into account...
Cc: stable@vger.kernel.org #v4.5+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
we need to take care of failure exit as well - pages already
in bio should be dropped by analogue of bio_unmap_pages(),
since their refcounts had been bumped only once per reference
in bio.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
bio_map_user_iov and bio_unmap_user do unbalanced pages refcounting if
IO vector has small consecutive buffers belonging to the same page.
bio_add_pc_page merges them into one, but the page reference is never
dropped.
Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
In the code added to function submit_page_section by commit b1058b981,
sdio->bio can currently be NULL when calling dio_bio_submit. This then
leads to a NULL pointer access in dio_bio_submit, so check for a NULL
bio in submit_page_section before trying to submit it instead.
Fixes xfstest generic/250 on gfs2.
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The function __get_seccomp_filter is local to the source and does
not need to be in global scope, so make it static.
Cleans up sparse warning:
symbol '__get_seccomp_filter' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Fixes: 66a733ea6b61 ("seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
When RPMSG_QCOM_GLINK_SMEM=m and one driver causes the qcom_common.c file
to be compiled as built-in, we get a link error:
drivers/remoteproc/qcom_common.o: In function `glink_subdev_remove':
qcom_common.c:(.text+0x130): undefined reference to `qcom_glink_smem_unregister'
qcom_common.c:(.text+0x130): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `qcom_glink_smem_unregister'
drivers/remoteproc/qcom_common.o: In function `glink_subdev_probe':
qcom_common.c:(.text+0x160): undefined reference to `qcom_glink_smem_register'
qcom_common.c:(.text+0x160): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `qcom_glink_smem_register'
Out of the three PIL driver instances, QCOM_ADSP_PIL already has a
Kconfig dependency to prevent this from happening, but the other two
do not. This adds the same dependency there.
Fixes: eea07023e6d9 ("remoteproc: qcom: adsp: Allow defining GLINK edge")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
The priv->mem[] array has IMX7D_RPROC_MEM_MAX elements so the > should
be >= to avoid writing one element beyond the end of the array.
Fixes: a0ff4aa6f010 ("remoteproc: imx_rproc: add a NXP/Freescale imx_rproc driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
We need to free "intent" and "intent->data" on a couple error paths.
Fixes: 933b45da5d1d ("rpmsg: glink: Add support for TX intents")
Acked-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
If qcom_glink_tx() fails, then we need to unlock before returning the
error code.
Fixes: 27b9c5b66b23 ("rpmsg: glink: Request for intents when unavailable")
Acked-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
When SME memory encryption is active it will rely on SWIOTLB to handle
DMA for devices that cannot support the addressing requirements of
having the encryption mask set in the physical address. The IOMMU
currently disables SWIOTLB if it is not running in passthrough mode.
This is not desired as non-PCI devices attempting DMA may fail. Update
the code to check if SME is active and not disable SWIOTLB.
Fixes: 2543a786aa25 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The shash ahash digest adaptor function may crash if given a
zero-length input together with a null SG list. This is because
it tries to read the SG list before looking at the length.
This patch fixes it by checking the length first.
Cc: <stable@vger.kernel.org>
Reported-by: Stephan Müller<smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Stephan Müller <smueller@chronox.de>
|
|
Eryu has reported that since commit 7b9ca4c61bc2 "quota: Reduce
contention on dq_data_lock" test generic/233 occasionally fails. This is
caused by the fact that since that commit we don't generate warning and
set grace time for quota allocations that have DQUOT_SPACE_NOFAIL set
(these are for example some metadata allocations in ext4). We need these
allocations to behave regularly wrt warning generation and grace time
setting so fix the code to return to the original behavior.
Reported-and-tested-by: Eryu Guan <eguan@redhat.com>
CC: stable@vger.kernel.org
Fixes: 7b9ca4c61bc278b771fb57d6290a31ab1fc7fdac
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
is_last_gpte() is not equivalent to the pseudo-code given in commit
6bb69c9b69c31 ("KVM: MMU: simplify last_pte_bitmap") because an incorrect
value of last_nonleaf_level may override the result even if level == 1.
It is critical for is_last_gpte() to return true on level == 1 to
terminate page walks. Otherwise memory corruption may occur as level
is used as an index to various data structures throughout the page
walking code. Even though the actual bug would be wherever the MMU is
initialized (as in the previous patch), be defensive and ensure here
that is_last_gpte() returns the correct value.
This patch is also enough to fix CVE-2017-12188.
Fixes: 6bb69c9b69c315200ddc2bc79aee14c0184cf5b2
Cc: stable@vger.kernel.org
Cc: Andy Honig <ahonig@google.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
[Panic if walk_addr_generic gets an incorrect level; this is a serious
bug and it's not worth a WARN_ON where the recovery path might hide
further exploitable issues; suggested by Andrew Honig. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|