aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-06-20arm64: mm: select CONFIG_ARCH_PROC_KCORE_TEXTArd Biesheuvel1-0/+3
To avoid issues with the /proc/kcore code getting confused about the kernels block mappings in the VMALLOC region, enable the existing facility that describes the [_text, _end) interval as a separate KCORE_TEXT region, which supersedes the KCORE_VMALLOC region that it intersects with on arm64. Reported-by: Tan Xiaojun <tanxiaojun@huawei.com> Tested-by: Tan Xiaojun <tanxiaojun@huawei.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-20fs/proc: kcore: use kcore_list type to check for vmalloc/module addressArd Biesheuvel1-1/+1
Instead of passing each start address into is_vmalloc_or_module_addr() to decide whether it falls into either the VMALLOC or the MODULES region, we can simply check the type field of the current kcore_list entry, since it will be set to KCORE_VMALLOC based on exactly the same conditions. As a bonus, when reading the KCORE_TEXT region on architectures that have one, this will avoid using vread() on the region if it happens to intersect with a KCORE_VMALLOC region. This is due the fact that the KCORE_TEXT region is the first one to be added to the kcore region list. Reported-by: Tan Xiaojun <tanxiaojun@huawei.com> Tested-by: Tan Xiaojun <tanxiaojun@huawei.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-20drivers/char: kmem: disable on arm64Ard Biesheuvel1-0/+2
As it turns out, arm64 deviates from other architectures in the way it maps the VMALLOC region: on most (all?) other architectures, it resides strictly above the kernel's direct mapping of DRAM, but on arm64, this is the other way around. For instance, for a 48-bit VA configuration, we have modules : 0xffff000000000000 - 0xffff000008000000 ( 128 MB) vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000 (129022 GB) ... vmemmap : 0xffff7e0000000000 - 0xffff800000000000 ( 2048 GB maximum) 0xffff7e0000000000 - 0xffff7e0003ff0000 ( 63 MB actual) memory : 0xffff800000000000 - 0xffff8000ffc00000 ( 4092 MB) This has mostly gone unnoticed until now, but it does appear that it breaks an assumption in the kmem read/write code, which does something like if (p < (unsigned long) high_memory) { ... use straight copy_[to|from]_user() using p as virtual address ... } ... if (count > 0) { ... use vread/vwrite for accesses past high_memory ... } The first condition will inadvertently hold for the VMALLOC region if VMALLOC_START < PAGE_OFFSET [which is the case on arm64], but the read or write will subsequently fail the virt_addr_valid() check, resulting in a -ENXIO return value. Given how kmem seems to be living in borrowed time anyway, and given the fact that nobody noticed that the read/write interface is broken on arm64 in the first place, let's not bother trying to fix it, but simply disable the /dev/kmem interface entirely for arm64. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15arm64: Export save_stack_trace_tsk()Dustin Brown1-0/+1
The kernel watchdog is a great debugging tool for finding tasks that consume a disproportionate amount of CPU time in contiguous chunks. One can imagine building a similar watchdog for arbitrary driver threads using save_stack_trace_tsk() and print_stack_trace(). However, this is not viable for dynamically loaded driver modules on ARM platforms because save_stack_trace_tsk() is not exported for those architectures. Export save_stack_trace_tsk() for the ARM64 architecture to align with x86 and support various debugging use cases such as arbitrary driver thread watchdog timers. Signed-off-by: Dustin Brown <dustinb@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15ACPI/IORT: Remove iort_node_match()Lorenzo Pieralisi2-17/+0
Commit 316ca8804ea8 ("ACPI/IORT: Remove linker section for IORT entries probing") removed the linker section for IORT entries probing. Since those IORT entries were the only iort_node_match() interface users, the iort_node_match() became obsolete and can then be removed. Remove the ACPI IORT iort_node_match() interface from the kernel. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15ARM64/irqchip: Update ACPI_IORT symbol selection logicLorenzo Pieralisi2-1/+1
ACPI IORT is an ACPI addendum to describe the connection topology of devices with IOMMUs and interrupt controllers on ARM64 ACPI systems. Currently the ACPI IORT Kbuild symbol is selected whenever the Kbuild symbol ARM_GIC_V3_ITS is enabled, which in turn is selected by ARM64 Kbuild defaults. This makes the logic behind ACPI_IORT selection a bit twisted and not easy to follow. On ARM64 systems enabling ACPI the kbuild symbol ACPI_IORT should always be selected in that it is a kernel layer provided to the ARM64 arch code to parse and enable ACPI firmware bindings. Make the ACPI_IORT selection explicit in ARM64 Kbuild and remove the selection from ARM_GIC_V3_ITS entry, making the ACPI_IORT selection logic clearer to follow. Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15arm64/dma-mapping: Remove extraneous null-pointer checksOlav Haugan2-11/+0
The current null-pointer check in __dma_alloc_coherent and __dma_free_coherent is not needed anymore since the __dma_alloc/__dma_free functions won't be called if !dev (dummy ops will be called instead). Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: kconfig: allow support for memory failure handlingJonathan (Zhixiong) Zhang1-0/+1
Declare ARCH_SUPPORTS_MEMORY_FAILURE, as arm64 does support memory failure recovery attempt. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> (Dropped changes to ACPI APEI Kconfig and updated commit log) Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: mm: Update perf accounting to handle poison faultsPunit Agrawal1-32/+36
Re-organise the perf accounting for fault handling in preparation for enabling handling of hardware poison faults in subsequent commits. The change updates perf accounting to be inline with the behaviour on x86. With this update, the perf fault accounting - * Always report PERF_COUNT_SW_PAGE_FAULTS * Doesn't report anything else for VM_FAULT_ERROR (which includes hwpoison faults) * Reports PERF_COUNT_SW_PAGE_FAULTS_MAJ if it's a major fault (indicated by VM_FAULT_MAJOR) * Otherwise, reports PERF_COUNT_SW_PAGE_FAULTS_MIN Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handlingJonathan (Zhixiong) Zhang1-3/+19
Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar to VM_FAULT_OOM, the only difference is that a different si_code (BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is initialized. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> (fix new __do_user_fault call-site) Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entriesPunit Agrawal2-20/+11
When memory failure is enabled, a poisoned hugepage pte is marked as a swap entry. huge_pte_offset() does not return the poisoned page table entries when it encounters PUD/PMD hugepages. This behaviour of huge_pte_offset() leads to error such as below when munmap is called on poisoned hugepages. [ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074. Fix huge_pte_offset() to return the poisoned pte which is then appropriately handled by the generic layer code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: David Woods <dwoods@mellanox.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: ftrace: fix building without CONFIG_MODULESWill Deacon1-6/+10
When CONFIG_MODULES is disabled, we cannot dereference a module pointer: arch/arm64/kernel/ftrace.c: In function 'ftrace_make_call': arch/arm64/kernel/ftrace.c:107:36: error: dereferencing pointer to incomplete type 'struct module' trampoline = (unsigned long *)mod->arch.ftrace_trampoline; Also, the within_module() function is not defined: arch/arm64/kernel/ftrace.c: In function 'ftrace_make_nop': arch/arm64/kernel/ftrace.c:171:8: error: implicit declaration of function 'within_module'; did you mean 'init_module'? [-Werror=implicit-function-declaration] This addresses both by adding replacing the IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) checks with #ifdef versions. Fixes: e71a4e1bebaf ("arm64: ftrace: add support for far branches to dynamic ftrace") Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: fault: Print info about page table structure when dumping pteWill Deacon1-1/+3
Whilst debugging a remote crash, I noticed that show_pte is unhelpful when it comes to describing the structure of the page table being walked. This is easily fixed by printing out the page table (swapper vs user), page size and virtual address size when displaying the PGD address. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: mm: print file name of faulting vmaKristina Martsenko1-1/+3
Print out the name of the file associated with the vma that faulted. This is usually the executable or shared library name. We already print out the task name, but also printing the library name is useful for pinpointing bugs to libraries. Also print the base address and size of the vma, which together with the PC (printed by __show_regs) gives the offset into the library. Fault prints now look like: test[2361]: unhandled level 2 translation fault (11) at 0x00000012, esr 0x92000006, in libfoo.so[ffffa0145000+1000] This is already done on x86, for more details see commit 03252919b798 ("x86: print which shared library/executable faulted in segfault etc. messages v3"). Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: mm: don't print out page table entries on EL0 faultsKristina Martsenko1-1/+0
When we take a fault from EL0 that can't be handled, we print out the page table entries associated with the faulting address. This allows userspace to print out any current page table entries, including kernel (TTBR1) entries. Exposing kernel mappings like this could pose a security risk, so don't print out page table information on EL0 faults. (But still print it out for EL1 faults.) This also follows the same behaviour as x86, printing out page table entries on kernel mode faults but not user mode faults. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: mm: print out correct page table entriesKristina Martsenko2-12/+26
When we take a fault that can't be handled, we print out the page table entries associated with the faulting address. In some cases we currently print out the wrong entries. For a faulting TTBR1 address, we sometimes print out TTBR0 table entries instead, and for a faulting TTBR0 address we sometimes print out TTBR1 table entries. Fix this by choosing the tables based on the faulting address. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [will: zero-extend addrs to 64-bit, don't walk swapper w/ TTBR0 addr] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-07arm64: ftrace: add support for far branches to dynamic ftraceArd Biesheuvel7-2/+84
Currently, dynamic ftrace support in the arm64 kernel assumes that all core kernel code is within range of ordinary branch instructions that occur in module code, which is usually the case, but is no longer guaranteed now that we have support for module PLTs and address space randomization. Since on arm64, all patching of branch instructions involves function calls to the same entry point [ftrace_caller()], we can emit the modules with a trampoline that has unlimited range, and patch both the trampoline itself and the branch instruction to redirect the call via the trampoline. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: minor clarification to smp_wmb() comment] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-07arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()Ard Biesheuvel1-3/+43
When turning branch instructions into NOPs, we attempt to validate the action by comparing the old value at the call site with the opcode of a direct relative branch instruction pointing at the old target. However, these call sites are statically initialized to call _mcount(), and may be redirected via a PLT entry if the module is loaded far away from the kernel text, leading to false negatives and spurious errors. So skip the validation if CONFIG_ARM64_MODULE_PLTS is configured. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-06arm64, vdso: Define vdso_{start,end} as arrayKees Cook1-5/+5
Adjust vdso_{start|end} to be char arrays to avoid compile-time analysis that flags "too large" memcmp() calls with CONFIG_FORTIFY_SOURCE. Cc: Jisheng Zhang <jszhang@marvell.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-05arm64: cpufeature: Fix CPU_OUT_OF_SPEC taint for uniform systemsWill Deacon1-2/+4
Commit 3fde2999fac5 ("arm64: cpufeature: Don't dump useless backtrace on CPU_OUT_OF_SPEC") changed the cpufeature detection code to use add_taint instead of WARN_TAINT_ONCE when detecting a heterogeneous system with mismatched feature support. Unfortunately, this resulted in all systems getting the taint, regardless of any feature mismatch. This patch fixes the problem by conditionalising the taint on detecting a feature mismatch. Acked-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-01arm64: kernel: restrict /dev/mem read() calls to linear regionArd Biesheuvel1-6/+13
When running lscpu on an AArch64 system that has SMBIOS version 2.0 tables, it will segfault in the following way: Unable to handle kernel paging request at virtual address ffff8000bfff0000 pgd = ffff8000f9615000 [ffff8000bfff0000] *pgd=0000000000000000 Internal error: Oops: 96000007 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1284 Comm: lscpu Not tainted 4.11.0-rc3+ #103 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 task: ffff8000fa78e800 task.stack: ffff8000f9780000 PC is at __arch_copy_to_user+0x90/0x220 LR is at read_mem+0xcc/0x140 This is caused by the fact that lspci issues a read() on /dev/mem at the offset where it expects to find the SMBIOS structure array. However, this region is classified as EFI_RUNTIME_SERVICE_DATA (as per the UEFI spec), and so it is omitted from the linear mapping. So let's restrict /dev/mem read/write access to those areas that are covered by the linear region. Reported-by: Alexander Graf <agraf@suse.de> Fixes: 4dffbfc48d65 ("arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30ARM64/PCI: Set root bus NUMA node on ACPI systemsLorenzo Pieralisi1-0/+3
PCI core requires the NUMA node for the struct pci_host_bridge.dev to be set by using the pcibus_to_node(struct pci_bus*) API, that on ARM64 systems relies on the struct pci_host_bridge->bus.dev NUMA node. The struct pci_host_bridge.dev NUMA node is then propagated through the PCI device hierarchy as PCI devices (and bridges) are enumerated under it. Therefore, in order to set-up the PCI NUMA hierarchy appropriately, the struct pci_host_bridge->bus.dev NUMA node must be set before core code calls pcibus_to_node(struct pci_bus*) on it so that PCI core can retrieve the NUMA node for the struct pci_host_bridge.dev device and can propagate it through the PCI bus tree. On ARM64 ACPI based systems the struct pci_host_bridge->bus.dev NUMA node can be set-up in pcibios_root_bridge_prepare() by parsing the root bridge ACPI device firmware binding. Add code to the pcibios_root_bridge_prepare() that, when booting with ACPI, parse the root bridge ACPI device companion NUMA binding and set the corresponding struct pci_host_bridge->bus.dev NUMA node appropriately. Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Robert Richter <rrichter@cavium.com> Tested-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usageWill Deacon1-4/+4
FUTEX_OP_OPARG_SHIFT instructs the futex code to treat the 12-bit oparg field as a shift value, potentially leading to a left shift value that is negative or with an absolute value that is significantly larger then the size of the type. UBSAN chokes with: ================================================================================ UBSAN: Undefined behaviour in ./arch/arm64/include/asm/futex.h:60:13 shift exponent -1 is negative CPU: 1 PID: 1449 Comm: syz-executor0 Not tainted 4.11.0-rc4-00005-g977eb52-dirty #11 Hardware name: linux,dummy-virt (DT) Call trace: [<ffff200008094778>] dump_backtrace+0x0/0x538 arch/arm64/kernel/traps.c:73 [<ffff200008094cd0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 [<ffff200008c194a8>] __dump_stack lib/dump_stack.c:16 [inline] [<ffff200008c194a8>] dump_stack+0x120/0x188 lib/dump_stack.c:52 [<ffff200008cc24b8>] ubsan_epilogue+0x18/0x98 lib/ubsan.c:164 [<ffff200008cc3098>] __ubsan_handle_shift_out_of_bounds+0x250/0x294 lib/ubsan.c:421 [<ffff20000832002c>] futex_atomic_op_inuser arch/arm64/include/asm/futex.h:60 [inline] [<ffff20000832002c>] futex_wake_op kernel/futex.c:1489 [inline] [<ffff20000832002c>] do_futex+0x137c/0x1740 kernel/futex.c:3231 [<ffff200008320504>] SYSC_futex kernel/futex.c:3281 [inline] [<ffff200008320504>] SyS_futex+0x114/0x268 kernel/futex.c:3249 [<ffff200008084770>] el0_svc_naked+0x24/0x28 ================================================================================ syz-executor1 uses obsolete (PF_INET,SOCK_PACKET) sock: process `syz-executor0' is using obsolete setsockopt SO_BSDCOMPAT This patch attempts to fix some of this by: * Making encoded_op an unsigned type, so we can shift it left even if the top bit is set. * Casting to signed prior to shifting right when extracting oparg and cmparg * Consider only the bottom 5 bits of oparg when using it as a left-shift value. Whilst I think this catches all of the issues, I'd much prefer to remove this stuff, as I think it's unused and the bugs are copy-pasted between a bunch of architectures. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: check return value of of_flat_dt_get_machine_nameKefeng Wang1-0/+3
It's useless to print machine name and setup arch-specific system identifiers if of_flat_dt_get_machine_name() return NULL, especially when ACPI-based boot. Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: cpufeature: Don't dump useless backtrace on CPU_OUT_OF_SPECWill Deacon1-2/+2
Unfortunately, it turns out that mismatched CPU features in big.LITTLE systems are starting to appear in the wild. Whilst we should continue to taint the kernel with CPU_OUT_OF_SPEC for features that differ in ways that we can't fix up, dumping a useless backtrace out of the cpufeature code is pointless and irritating. This patch removes the backtrace from the taint. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: mm: explicity include linux/vmalloc.hTobias Klauser1-0/+1
arm64's mm/mmu.c uses vm_area_add_early, struct vm_area and other definitions but relies on implict inclusion of linux/vmalloc.h which means that changes in other headers could break the build. Thus, add an explicit include. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: Add dump_backtrace() in show_regsKefeng Wang3-3/+3
Generic code expects show_regs() to dump the stack, but arm64's show_regs() does not. This makes it hard to debug softlockups and other issues that result in show_regs() being called. This patch updates arm64's show_regs() to dump the stack, as common code expects. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> [will: folded in bug_handler fix from mrutland] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: Call __show_regs directlyKefeng Wang2-3/+3
Generic code expects show_regs() to also dump the stack, but arm64's show_reg() does not do this. Some arm64 callers of show_regs() *only* want the registers dumped, without the stack. To enable generic code to work as expected, we need to make show_regs() dump the stack. Where we only want the registers dumped, we must use __show_regs(). This patch updates code to use __show_regs() where only registers are desired. A subsequent patch will modify show_regs(). Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30arm64: Preventing READ_IMPLIES_EXEC propagationDong Bo1-0/+6
Like arch/arm/, we inherit the READ_IMPLIES_EXEC personality flag across fork(). This is undesirable for a number of reasons: * ELF files that don't require executable stack can end up with it anyway * We end up performing un-necessary I-cache maintenance when mapping what should be non-executable pages * Restricting what is executable is generally desirable when defending against overflow attacks This patch clears the personality flag when setting up the personality for newly spwaned native tasks. Given that semi-recent AArch64 toolchains emit a non-executable PT_GNU_STACK header, userspace applications can already not rely on READ_IMPLIES_EXEC so shouldn't be adversely affected by this change. Cc: <stable@vger.kernel.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dong Bo <dongbo4@huawei.com> [will: added comment to compat code, rewrote commit message] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-28Linux 4.12-rc3Linus Torvalds1-1/+1
2017-05-28Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermalLinus Torvalds4-17/+11
Pull thermal SoC management fixes from Eduardo Valentin: - fixes to TI SoC driver, Broadcom, qoriq - small sparse warning fix on thermal core * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: broadcom: ns-thermal: default on iProc SoCs ti-soc-thermal: Fix a typo in a comment line ti-soc-thermal: Delete error messages for failed memory allocations in ti_bandgap_build() ti-soc-thermal: Use devm_kcalloc() in ti_bandgap_build() thermal: core: make thermal_emergency_poweroff static thermal: qoriq: remove useless call for of_thermal_get_trip_points()
2017-05-27Merge tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds14-49/+162
Pull tty/serial fixes from Greg KH: "Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger than normal, which is why I had them bake in linux-next for a few weeks and didn't send them to you for -rc2. They revert a few of the serdev patches from 4.12-rc1, and bring things back to how they were in 4.11, to try to make things a bit more stable there. Rob and Johan both agree that this is the way forward, so this isn't people squabbling over semantics. Other than that, just a few minor serial driver fixes that people have had problems with. All of these have been in linux-next for a few weeks with no reported issues" * tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: altera_uart: call iounmap() at driver remove serial: imx: ensure UCR3 and UFCR are setup correctly MAINTAINERS/serial: Change maintainer of jsm driver serial: enable serdev support tty/serdev: add serdev registration interface serdev: Restore serdev_device_write_buf for atomic context serial: core: fix crash in uart_suspend_port tty: fix port buffer locking tty: ehv_bytechan: clean up init error handling serial: ifx6x60: fix use-after-free on module unload serial: altera_jtaguart: adding iounmap() serial: exar: Fix stuck MSIs serial: efm32: Fix parity management in 'efm32_uart_console_get_options()' serdev: fix tty-port client deregistration Revert "tty_port: register tty ports with serdev bus" drivers/tty: 8250: only call fintek_8250_probe when doing port I/O
2017-05-27Merge tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds6-6/+12
Pull powerpc fixes from Michael Ellerman: "Fix running SPU programs on Cell, and a few other minor fixes. Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas Piggin" * tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions powerpc/spufs: Fix hash faults for kernel regions powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call selftests/powerpc: Fix TM resched DSCR test with some compilers
2017-05-27Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds9-37/+81
Pull x86 fixes from Thomas Gleixner: "A series of fixes for X86: - The final fix for the end-of-stack issue in the unwinder - Handle non PAT systems gracefully - Prevent access to uninitiliazed memory - Move early delay calaibration after basic init - Fix Kconfig help text - Fix a cross compile issue - Unbreak older make versions" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/timers: Move simple_udelay_calibration past init_hypervisor_platform x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives() x86/PAT: Fix Xorg regression on CPUs that don't support PAT x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation x86/build: Permit building with old make versions x86/unwind: Add end-of-stack check for ftrace handlers Revert "x86/entry: Fix the end of the stack for newly forked tasks" x86/boot: Use CROSS_COMPILE prefix for readelf
2017-05-27Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-8/+16
Pull timer fixlet from Thomas Gleixner: "Silence dmesg spam by making the posix cpu timer printks depend on print_fatal_signals" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-timers: Make signal printks conditional
2017-05-27Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds3-8/+8
Pull RAS fixes from Thomas Gleixner: "Two fixlets for RAS: - Export memory_error() so the NFIT module can utilize it - Handle memory errors in NFIT correctly" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: acpi, nfit: Fix the memory error check in nfit_handle_mce() x86/MCE: Export memory_error()
2017-05-27Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds18-47/+176
Pull perf tooling fixes from Thomas Gleixner: - Synchronization of tools and kernel headers - A series of fixes for perf report addressing various failures: * Handle invalid maps proper * Plug a memory leak * Handle frames and callchain order correctly - Fixes for handling inlines and children mode * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/include: Sync kernel ABI headers with tooling headers perf tools: Put caller above callee in --children mode perf report: Do not drop last inlined frame perf report: Always honor callchain order for inlined nodes perf script: Add --inline option for debugging perf report: Fix off-by-one for non-activation frames perf report: Fix memory leak in addr2line when called by addr2inlines perf report: Don't crash on invalid maps in `-g srcline` mode
2017-05-27Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-6/+18
Pull locking fix from Thomas Gleixner: "A fix for a state leak which was introduced in the recent rework of futex/rtmutex interaction" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
2017-05-27Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-5/+12
Pull kthread fix from Thomas Gleixner: "A single fix which prevents a use after free when kthread fork fails" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kthread: Fix use-after-free if kthread fork fails
2017-05-27Merge tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds6-9/+47
Pull ftrace fixes from Steven Rostedt: "There's been a few memory issues found with ftrace. One was simply a memory leak where not all was being freed that should have been in releasing a file pointer on set_graph_function. Then Thomas found that the ftrace trampolines were marked for read/write as well as execute. To shrink the possible attack surface, he added calls to set them to ro. Which also uncovered some other issues with freeing module allocated memory that had its permissions changed. Kprobes had a similar issue which is fixed and a selftest was added to trigger that issue again" * tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/ftrace: Make sure that ftrace trampolines are not RWX x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range() selftests/ftrace: Add a testcase for many kprobe events kprobes/x86: Fix to set RWX bits correctly before releasing trampoline ftrace: Fix memory leak in ftrace_graph_release()
2017-05-26x86/ftrace: Make sure that ftrace trampolines are not RWXThomas Gleixner1-6/+14
ftrace use module_alloc() to allocate trampoline pages. The mapping of module_alloc() is RWX, which makes sense as the memory is written to right after allocation. But nothing makes these pages RO after writing to them. Add proper set_memory_rw/ro() calls to protect the trampolines after modification. Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-26x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()Steven Rostedt (VMware)1-1/+1
With function tracing starting in early bootup and having its trampoline pages being read only, a bug triggered with the following: kernel BUG at arch/x86/mm/pageattr.c:189! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 task: ffffffffb4222500 task.stack: ffffffffb4200000 RIP: 0010:change_page_attr_set_clr+0x269/0x302 RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046 RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000 RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60 RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001 R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000 R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0 Call Trace: change_page_attr_clear+0x1f/0x21 set_memory_ro+0x1e/0x20 arch_ftrace_update_trampoline+0x207/0x21c ? ftrace_caller+0x64/0x64 ? 0xffffffffc029b000 ftrace_startup+0xf4/0x198 register_ftrace_function+0x26/0x3c function_trace_init+0x5e/0x73 tracer_init+0x1e/0x23 tracing_set_tracer+0x127/0x15a register_tracer+0x19b/0x1bc init_function_trace+0x90/0x92 early_trace_init+0x236/0x2b3 start_kernel+0x200/0x3f5 x86_64_start_reservations+0x29/0x2b x86_64_start_kernel+0x17c/0x18f secondary_startup_64+0x9f/0x9f ? secondary_startup_64+0x9f/0x9f Interrupts should not be enabled at this early in the boot process. It is also fine to leave interrupts enabled during this time as there's only one CPU running, and on_each_cpu() means to only run on the current CPU. If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with interrupts disabled. Don't trigger a BUG_ON() in that case. Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-26selftests/ftrace: Add a testcase for many kprobe eventsMasami Hiramatsu1-0/+21
Add a testcase to test kprobes via ftrace interface with many concurrent kprobe events. This tries to add many kprobe events (up to 256) on kernel functions. To avoid making ftrace-based kprobes (kprobes on fentry), it skips first N bytes (on x86 N=5, on ppc or arm N=4) of function entry. After that, it enables all those events, disable it, and remove it. Since the unoptimization buffer reclaiming will be delayed, after removing events, it will wait enough time. Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-26kprobes/x86: Fix to set RWX bits correctly before releasing trampolineMasami Hiramatsu2-1/+10
Fix kprobes to set(recover) RWX bits correctly on trampoline buffer before releasing it. Releasing readonly page to module_memfree() crash the kernel. Without this fix, if kprobes user register a bunch of kprobes in function body (since kprobes on function entry usually use ftrace) and unregister it, kernel hits a BUG and crash. Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: d0381c81c2f7 ("kprobes/x86: Set kprobes pages read-only") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-26ftrace: Fix memory leak in ftrace_graph_release()Luis Henriques1-1/+1
ftrace_hash is being kfree'ed in ftrace_graph_release(), however the ->buckets field is not. This results in a memory leak that is easily captured by kmemleak: unreferenced object 0xffff880038afe000 (size 8192): comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0 [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80 [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100 [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0 [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0 [<ffffffff81140df7>] vfs_open+0x47/0x60 [<ffffffff81150f95>] path_openat+0x2a5/0x1020 [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0 [<ffffffff811411df>] do_sys_open+0x12f/0x200 [<ffffffff811412ce>] SyS_open+0x1e/0x20 [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com Cc: stable@vger.kernel.org Fixes: b9b0c831bed2 ("ftrace: Convert graph filter to use hash tables") Signed-off-by: Luis Henriques <lhenriques@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-26Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds5-17/+20
Pull input layer fixes from Dmitry Torokhov: "Just a few fixups to a couple of drivers" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elan_i2c - ignore signals when finishing updating firmware Input: elan_i2c - clear INT before resetting controller Input: atmel_mxt_ts - add T100 as a readable object Input: edt-ft5x06 - increase allowed data range for threshold parameter
2017-05-26Merge tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-ledsLinus Torvalds1-1/+1
Pull LED fix from Jacek Anaszewski: "A single LED fix for 4.12-rc3. leds-pca955x driver uses only i2c_smbus API and thus it should pass I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality" * tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: pca955x: Correct I2C Functionality
2017-05-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds57-254/+701
Pull networking fixes from David Miller: 1) Fix state pruning in bpf verifier wrt. alignment, from Daniel Borkmann. 2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide Caratti. 3) Fix bit field definitions for rss_hash_type of descriptors in mlx5 driver, from Jesper Brouer. 4) Defer slave->link updates until bonding is ready to do a full commit to the new settings, from Nithin Sujir. 5) Properly reference count ipv4 FIB metrics to avoid use after free situations, from Eric Dumazet and several others including Cong Wang and Julian Anastasov. 6) Fix races in llc_ui_bind(), from Lin Zhang. 7) Fix regression of ESP UDP encapsulation for TCP packets, from Steffen Klassert. 8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap. 9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter Dawson. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) ipv4: add reference counting to metrics net: ethernet: ax88796: don't call free_irq without request_irq first ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets sctp: fix ICMP processing if skb is non-linear net: llc: add lock_sock in llc_ui_bind to avoid a race condition bonding: Don't update slave->link until ready to commit test_bpf: Add a couple of tests for BPF_JSGE. bpf: add various verifier test cases bpf: fix wrong exposure of map_flags into fdinfo for lpm bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data bpf: properly reset caller saved regs after helper call and ld_abs/ind bpf: fix incorrect pruning decision when alignment must be tracked arp: fixed -Wuninitialized compiler warning tcp: avoid fastopen API to be used on AF_UNSPEC net: move somaxconn init from sysctl code net: fix potential null pointer dereference geneve: fix fill_info when using collect_metadata virtio-net: enable TSO/checksum offloads for Q-in-Q vlans be2net: Fix offload features for Q-in-Q packets vlan: Fix tcp checksum offloads in Q-in-Q vlans ...
2017-05-26Merge tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds6-74/+66
Pull XFS fixes from Darrick Wong: "A few miscellaneous bug fixes & cleanups: - Fix indlen block reservation accounting bug when splitting delalloc extent - Fix warnings about unused variables that appeared in -rc1. - Don't spew errors when bmapping a local format directory - Fix an off-by-one error in a delalloc eof assertion - Make fsmap only return inode information for CAP_SYS_ADMIN - Fix a potential mount time deadlock recovering cow extents - Fix unaligned memory access in _btree_visit_blocks - Fix various SEEK_HOLE/SEEK_DATA bugs" * tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff() xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff() xfs: Fix missed holes in SEEK_HOLE implementation xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff() xfs: fix unaligned access in xfs_btree_visit_blocks xfs: avoid mount-time deadlock in CoW extent recovery xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN xfs: bad assertion for delalloc an extent that start at i_size xfs: fix warnings about unused stack variables xfs: BMAPX shouldn't barf on inline-format directories xfs: fix indlen accounting error on partial delalloc conversion
2017-05-26ipv4: add reference counting to metricsEric Dumazet5-23/+45
Andrey Konovalov reported crashes in ipv4_mtu() I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152 : 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time run following loop : while : do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 done Cong Wang attempted to add back rt->fi in commit 82486aa6f1b9 ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested to add a refcount to the metrics themselves, being a standalone object (in particular, no reference to other objects) I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dst need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: 2860583fe840 ("ipv4: Kill rt->fi") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>