aboutsummaryrefslogtreecommitdiffstats
path: root/arch (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-10-09Revert commit 1a8b6d76dc5b ("net:add one common config...")Ding Tianhong2-4/+0
The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to indicate that Relaxed Ordering Attributes (RO) should not be used for Transaction Layer Packets (TLP) targeted toward these affected Root Port, it will clear the bit4 in the PCIe Device Control register, so the PCIe device drivers could query PCIe configuration space to determine if it can send TLPs to Root Port with the Relaxed Ordering Attributes set. With this new flag we don't need the config ARCH_WANT_RELAX_ORDER to control the Relaxed Ordering Attributes for the ixgbe drivers just like the commit 1a8b6d76dc5b ("net:add one common config...") did, so revert this commit. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06ARC: [plat-hsdk]: Add reset controller node to manage ethernet resetEugeniy Paltsev2-0/+10
DW ethernet controller on HSDK hangs sometimes after SW reset, so add reset node to make possible to reset DW ethernet controller HW. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-06arm64: Ensure fpsimd support is ready before userspace is activeSuzuki K Poulose1-1/+1
We register the pm/hotplug callbacks for FPSIMD as late_initcall, which happens after the userspace is active (from initramfs via populate_rootfs, a rootfs_initcall). Make sure we are ready even before the userspace could potentially use it, by promoting to a core_initcall. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-10-06arm64: Ensure the instruction emulation is ready for userspaceSuzuki K Poulose2-2/+2
We trap and emulate some instructions (e.g, mrs, deprecated instructions) for the userspace. However the handlers for these are registered as late_initcalls and the userspace could be up and running from the initramfs by that time (with populate_rootfs, which is a rootfs_initcall()). This could cause problems for the early applications ending up in failure like : [ 11.152061] modprobe[93]: undefined instruction: pc=0000ffff8ca48ff4 This patch promotes the specific calls to core_initcalls, which are guaranteed to be completed before we hit userspace. Cc: stable@vger.kernel.org Cc: Dave Martin <dave.martin@arm.com> Cc: Matthias Brugger <mbrugger@suse.com> Cc: James Morse <james.morse@arm.com> Reported-by: Matwey V. Kornilov <matwey.kornilov@gmail.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-10-06powerpc/tm: Fix illegal TM state in signal handlerGustavo Romero1-1/+12
Currently it's possible that on returning from the signal handler through the restore_tm_sigcontexts() code path (e.g. from a signal caught due to a `trap` instruction executed in the middle of an HTM block, or a deliberately constructed sigframe) an illegal TM state (like TS=10 TM=0, i.e. "T0") is set in SRR1 and when `rfid` sets implicitly the MSR register from SRR1 register on return to userspace it causes a TM Bad Thing exception. That illegal state can be set (a) by a malicious user that disables the TM bit by tweaking the bits in uc_mcontext before returning from the signal handler or (b) by a sufficient number of context switches occurring such that the load_tm counter overflows and TM is disabled whilst in the signal handler. This commit fixes the illegal TM state by ensuring that TM bit is always enabled before we return from restore_tm_sigcontexts(). A small comment correction is made as well. Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc/64s: Use emergency stack for kernel TM Bad Thing program checksCyril Bur1-1/+23
When using transactional memory (TM), the CPU can be in one of six states as far as TM is concerned, encoded in the Machine State Register (MSR). Certain state transitions are illegal and if attempted trigger a "TM Bad Thing" type program check exception. If we ever hit one of these exceptions it's treated as a bug, ie. we oops, and kill the process and/or panic, depending on configuration. One case where we can trigger a TM Bad Thing, is when returning to userspace after a system call or interrupt, using RFID. When this happens the CPU first restores the user register state, in particular r1 (the stack pointer) and then attempts to update the MSR. However the MSR update is not allowed and so we take the program check with the user register state, but the kernel MSR. This tricks the exception entry code into thinking we have a bad kernel stack pointer, because the MSR says we're coming from the kernel, but r1 is pointing to userspace. To avoid this we instead always switch to the emergency stack if we take a TM Bad Thing from the kernel. That way none of the user register values are used, other than for printing in the oops message. This is the fix for CVE-2017-1000255. Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Rewrite change log & comments, tweak asm slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-06powerpc/powernv: Increase memory block size to 1GB on radixAnton Blanchard1-1/+9
Memory hot unplug on PowerNV radix hosts is broken. Our memory block size is 256MB but since we map the linear region with very large pages, each pte we tear down maps 1GB. A hot unplug of one 256MB memory block results in 768MB of memory getting unintentionally unmapped. At this point we are likely to oops. Fix this by increasing our memory block size to 1GB on PowerNV radix hosts. Fixes: 4b5d62ca17a1 ("powerpc/mm: add radix__remove_section_mapping()") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-05KVM: add X86_LOCAL_APIC dependencyArnd Bergmann1-0/+1
The rework of the posted interrupt handling broke building without support for the local APIC: ERROR: "boot_cpu_physical_apicid" [arch/x86/kvm/kvm-intel.ko] undefined! That configuration is probably not particularly useful anyway, so we can avoid the randconfig failures by adding a Kconfig dependency. Fixes: 8b306e2f3c41 ("KVM: VMX: avoid double list add with VT-d posted interrupts") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-10-05x86/kvm: Move kvm_fastop_exception to .fixup sectionJosh Poimboeuf1-2/+4
When compiling the kernel with the '-frecord-gcc-switches' flag, objtool complains: arch/x86/kvm/emulate.o: warning: objtool: .GCC.command.line+0x0: special: can't find new instruction And also the kernel fails to link. The problem is that the 'kvm_fastop_exception' code gets placed into the throwaway '.GCC.command.line' section instead of '.text'. Exception fixup code is conventionally placed in the '.fixup' section, so put it there where it belongs. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-10-04arm64: Use larger stacks when KASAN is selectedMark Rutland1-3/+6
AddressSanitizer instrumentation can significantly bloat the stack, and with GCC 7 this can result in stack overflows at boot time in some configurations. We can avoid this by doubling our stack size when KASAN is in use, as is already done on x86 (and has been since KASAN was introduced). Regardless of other patches to decrease KASAN's stack utilization, kernels built with KASAN will always require more stack space than those built without, and we should take this into account. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-10-04kvm/x86: Avoid async PF preempting the kernel incorrectlyBoqun Feng3-7/+13
Currently, in PREEMPT_COUNT=n kernel, kvm_async_pf_task_wait() could call schedule() to reschedule in some cases. This could result in accidentally ending the current RCU read-side critical section early, causing random memory corruption in the guest, or otherwise preempting the currently running task inside between preempt_disable and preempt_enable. The difficulty to handle this well is because we don't know whether an async PF delivered in a preemptible section or RCU read-side critical section for PREEMPT_COUNT=n, since preempt_disable()/enable() and rcu_read_lock/unlock() are both no-ops in that case. To cure this, we treat any async PF interrupting a kernel context as one that cannot be preempted, preventing kvm_async_pf_task_wait() from choosing the schedule() path in that case. To do so, a second parameter for kvm_async_pf_task_wait() is introduced, so that we know whether it's called from a context interrupting the kernel, and the parameter is set properly in all the callsites. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: stable@vger.kernel.org Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-10-04ARM: dts: stm32: use right pinctrl compatible for stm32f469Alexandre Torgue7-297/+537
Currently, same stm32f429-pinctrl driver is used for stm32f429 and stm32f469. As pin map is different between those 2 MCUs, a stm32f469-pinctrl driver has been recently added. This patch -allows to use stm32f469-pinctrl driver for stm32f469 boards -reworks stm32 devicetree files to fit with stm32f429 / stm32f469 In the same time it fixes an issue when only MACH_STM32F469 flag is selected in menuconfig. Fixes: d28bcd53fa90 ("ARM: stm32: Introduce MACH_STM32F469 flag") Reported-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
2017-10-04powerpc/mm: Call flush_tlb_kernel_range with interrupts enabledGuenter Roeck1-1/+1
flush_tlb_kernel_range() may call smp_call_function_many() which expects interrupts to be enabled. This results in a traceback. WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1 task: cf830000 task.stack: cf82e000 NIP: c00a93c8 LR: c00a9634 CTR: 00000001 REGS: cf82fde0 TRAP: 0700 Not tainted (4.14.0-rc1-00009-g0666f56) MSR: 00021000 <CE,ME> CR: 24000082 XER: 00000000 GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001 GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000 GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000 NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc LR [c00a9634] smp_call_function+0x3c/0x50 Call Trace: [cf82fe90] [00000010] 0x10 (unreliable) [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50 [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38 [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c [cf82ff20] [c001484c] free_initmem+0x20/0x4c [cf82ff30] [c000316c] kernel_init+0x1c/0x108 [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78 Fixes: 3184cc4b6f6a ("powerpc/mm: Fix kernel RAM protection after freeing ...") Fixes: e611939fc8ec ("powerpc/mm: Ensure change_page_attr() doesn't ...") Cc: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/xive: Clear XIVE internal structures when a CPU is removedCédric Le Goater1-0/+8
Commit eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") introduced support for the XIVE exploitation mode of the P9 interrupt controller on the pseries platform. At that time, support for CPU removal was not complete on PowerVM and CPU hot unplug remained untested. It appears that some cleanups of the XIVE internal structures are required before releasing the CPU, without which the kernel crashes in a RTAS call doing the CPU isolation. These changes fix the crash by deconfiguring the IPI interrupt source and clearing the event queues of the CPU when it is removed. Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/xive: Fix IPI resetCédric Le Goater1-0/+4
When resetting an IPI, hw_ipi should also be set to zero. Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04ARM: dts: stm32: Fix STMPE1600 binding on stm32429i-eval boardAlexandre Torgue1-3/+1
To declare gpio interrupt line for STMPE1600, 2 possibilities are offered: -use gpio binding (and then the gpiolib interface inside driver) -use interrupt binding as each gpio-controller are also interrupt controller on stm32f429. In STMPE 1600 node both (gpio and interrupt) bindings are defined. This patch fixes this issue and use only interrupt binding. Fixes: c04b2e72af8d ("ARM: dts: stm32: Enable STMPE1600 gpio expander of STM32F429-EVAL board") Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
2017-10-04powerpc/watchdog: Make use of watchdog_nmi_probe()Thomas Gleixner1-9/+8
The rework of the core hotplug code triggers the WARN_ON in start_wd_cpu() on powerpc because it is called multiple times for the boot CPU. The first call is via: start_wd_on_cpu+0x80/0x2f0 watchdog_nmi_reconfigure+0x124/0x170 softlockup_reconfigure_threads+0x110/0x130 lockup_detector_init+0xbc/0xe0 kernel_init_freeable+0x18c/0x37c kernel_init+0x2c/0x160 ret_from_kernel_thread+0x5c/0xbc And then again via the CPU hotplug registration: start_wd_on_cpu+0x80/0x2f0 cpuhp_invoke_callback+0x194/0x620 cpuhp_thread_fun+0x7c/0x1b0 smpboot_thread_fn+0x290/0x2a0 kthread+0x168/0x1b0 ret_from_kernel_thread+0x5c/0xbc This can be avoided by setting up the cpu hotplug state with nocalls and move the initialization to the watchdog_nmi_probe() function. That initializes the hotplug callbacks without invoking the callback and the following core initialization function then configures the watchdog for the online CPUs (in this case CPU0) via softlockup_reconfigure_threads(). Reported-and-tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org
2017-10-04watchdog/core, powerpc: Lock cpus across reconfigurationThomas Gleixner1-4/+0
Instead of dropping the cpu hotplug lock after stopping NMI watchdog and threads and reaquiring for restart, the code and the protection rules become more obvious when holding cpu hotplug lock across the full reconfiguration. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710022105570.2114@nanos
2017-10-04watchdog/core, powerpc: Replace watchdog_nmi_reconfigure()Thomas Gleixner1-9/+14
The recent cleanup of the watchdog code split watchdog_nmi_reconfigure() into two stages. One to stop the NMI and one to restart it after reconfiguration. That was done by adding a boolean 'run' argument to the code, which is functionally correct but not necessarily a piece of art. Replace it by two explicit functions: watchdog_nmi_stop() and watchdog_nmi_start(). Fixes: 6592ad2fcc8f ("watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage") Requested-by: Linus 'Nursing his pet-peeve' Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Thomas 'Mopping up garbage' Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710021957480.2114@nanos
2017-10-03ARC: [plat-hsdk]: Temporary fix to set CPU frequency to 1GHzEugeniy Paltsev1-0/+42
Add temporary fix to HSDK platform code to setup CPU frequency to 1GHz on early boot. We can remove this fix when smart hsdk pll driver will be introduced, see discussion: https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg02689.html Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: fix allnoconfig build warningVineet Gupta1-1/+1
Reported-by: Dmitrii Kolesnichenko <dmitrii@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARCv2: boot log: identify HS48 cores (dual issue)Vineet Gupta2-4/+16
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: boot log: decontaminate ARCv2 ISA_CONFIG registerVineet Gupta2-8/+15
ARCv2 ISA_CONFIG and ARC700_BUILD build config registers are not compatible. cpuinfo_arc had isa info placeholder which was mashup of bits form both. Untangle this by defining it off of ARCv2 ISA info and it is fine even for ARC700 since former is a super set of latter (ARC700 buildonly has 2 bits for atomics and stack check). At runtime, we treat ARCv2 ISA info as a generic placeholder but populate it correctly depending on ARC700 or HS. This paves way for adding more HS specific bits in isa info which was colliding with the extra bits for arc700. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03arc: remove redundant UTS_MACHINE define in arch/arc/MakefileMasahiro Yamada1-2/+0
The top-level Makefile sets the default of UTS_MACHINE to $(ARCH). If ARCH and UTS_MACHINE match, arch/$(ARCH)/Makefile need not specify UTS_MACHINE explicitly. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: [plat-hsdk] use actual clk driver to manage cpu clkEugeniy Paltsev2-3/+11
With corresponding clk driver now merged upstream, switch to it. - core_clk now represent the PLL (vs. fixed clk before) - input_clk represent the clk signal src for PLL (basically xtal) Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: [*defconfig] Reenable soft lock-up detectorAlexey Brodkin7-7/+7
Commit 92e5aae45778 "kernel/watchdog: split up config options" introduced SOFTLOCKUP_DETECTOR which selects LOCKUP_DETECTOR instead of the latter to be selected itself. We need to adjust our defconfigs accordingly. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: [plat-axs10x] sdio: Temporary fix of sdio ciu frequencyEugeniy Paltsev1-1/+8
DW sdio controller has external ciu clock divider controlled via register in SDIO IP. It divides sdio_ref_clk (which comes from CGU) by 16 for default. So default mmcclk clock (which comes to sdk_in) is 25000000 Hz. So fix wrong current value (50000000 Hz) to actual 25000000 Hz. Note this is a preventive fix, in line with similar change for HSDK where this was actually needed. see: http://lists.infradead.org/pipermail/linux-snps-arc/2017-September/002924.html Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: [plat-hsdk] sdio: Temporary fix of sdio ciu frequencyEugeniy Paltsev1-1/+11
DW sdio controller has external ciu clock divider controlled via register in SDIO IP. Due to its unexpected default value (it should divide by 1 but it divides by 8) SDIO IP uses wrong ciu clock and works unstable So add temporary fix and change clock frequency from 100000000 to 12500000 Hz until we fix dw sdio driver itself. Fixes SNPS STAR 9001204800 Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARC: [plat-axs103] Add temporary quirk to reset ethernet IPEugeniy Paltsev1-0/+7
DW ethernet controller on AXS10x hangs sometimes after SW reset, so add temporary quirk to reset DW ethernet controller IP core. This quirk can be removed after axs10x reset driver (see http://patchwork.ozlabs.org/patch/800273/) or simple reset driver (see https://patchwork.kernel.org/patch/9903375/) will be available in upstream. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-10-03ARM: defconfig: update Gemini defconfigLinus Walleij1-1/+2
This updates the Gemini defconfig with drivers merged for v4.13 or v4.14: - ATA driver is merged - DMA driver is merged - RTC driver gets selected from default Kconfig Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2017-10-03ARM: defconfig: FRAMEBUFFER_CONSOLE can no longer be =mArnd Bergmann3-3/+3
It is no longer possible to load this at runtime, so let's change the few remaining users to have it built-in all the time. arch/arm/configs/zeus_defconfig:115:warning: symbol value 'm' invalid for FRAMEBUFFER_CONSOLE arch/arm/configs/viper_defconfig:116:warning: symbol value 'm' invalid for FRAMEBUFFER_CONSOLE arch/arm/configs/pxa_defconfig:474:warning: symbol value 'm' invalid for FRAMEBUFFER_CONSOLE Reported-by: kernelci.org bot <bot@kernelci.org> Fixes: 6104c37094e7 ("fbcon: Make fbcon a built-time depency for fbdev") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Olof Johansson <olof@lixom.net>
2017-10-03m32r: fix build failureSudip Mukherjee1-0/+9
The allmodconfig build of m32r is failing with the error: lib/mpi/mpih-div.o: In function 'mpihelp_divrem': mpih-div.c:(.text+0x40): undefined reference to 'abort' mpih-div.c:(.text+0x40): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol 'abort' The function 'abort' was never defined for the m32r architecture. Create 'abort' as is done in other arch like 'arm' and 'unicore32'. Link: http://lkml.kernel.org/r/1506727220-6108-1-git-send-email-sudip.mukherjee@codethink.co.uk Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03m32r: define CPU_BIG_ENDIANSudip Mukherjee1-0/+4
The build of m32r allmodconfig is giving lots of build warnings about: include/linux/byteorder/big_endian.h:7:2: warning: #warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN [-Wcpp] #warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN Define CPU_BIG_ENDIAN like the way CPU_LITTLE_ENDIAN is defined. Link: http://lkml.kernel.org/r/1505678083-10320-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03rapidio: remove global irq spinlocks from the subsystemIoan Nicu2-2/+23
Locking of config and doorbell operations should be done only if the underlying hardware requires it. This patch removes the global spinlocks from the rapidio subsystem and moves them to the mport drivers (fsl_rio and tsi721), only to the necessary places. For example, local config space read and write operations (lcread/lcwrite) are atomic in all existing drivers, so there should be no need for locking, while the cread/cwrite operations which generate maintenance transactions need to be synchronized with a lock. Later, each driver could chose to use a per-port lock instead of a global one, or even more granular locking. Link: http://lkml.kernel.org/r/20170824113023.GD50104@nokia.com Signed-off-by: Ioan Nicu <ioan.nicu.ext@nokia.com> Signed-off-by: Frank Kunz <frank.kunz@nokia.com> Acked-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7269: remove nonexistent GPIO_PH[0-7] to fix pinctrl registrationGeert Uytterhoeven1-3/+1
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. If enum values are defined, but never used, pinmux_pins[] contains (zero-filled) holes. Such entries are treated as pin zero, which was registered before, thus leading to pinctrl registration failures, as seen on sh7722: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 Remove GPIO_PH[0-7] from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-5-git-send-email-geert+renesas@glider.be Fixes: ef0fa5331a73e479 ("sh: Add pinmux for sh7269") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7264: remove nonexistent GPIO_PH[0-7] to fix pinctrl registrationGeert Uytterhoeven1-3/+1
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. If enum values are defined, but never used, pinmux_pins[] contains (zero-filled) holes. Such entries are treated as pin zero, which was registered before, thus leading to pinctrl registration failures, as seen on sh7722: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 Remove GPIO_PH[0-7] from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-4-git-send-email-geert+renesas@glider.be Fixes: 41797f75486d8ca3 ("sh: Add pinmux for sh7264") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7757: remove nonexistent GPIO_PT[JLNQ]7_RESV to fix pinctrl registrationGeert Uytterhoeven1-4/+4
Commit 3810e96056ff ("sh: modify pinmux for SH7757 2nd cut") renamed GPIO_PT[JLNQ]7 to GPIO_PT[JLNQ]7_RESV, and removed the existing users from the pinmux_pins[] array. However, pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. Hence entries were not really removed, but replaced by (zero-filled) holes. Such entries are treated as pin zero, which was registered before, thus leading to pinctrl registration failures, as seen on sh7722: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 Remove GPIO_PT[JLNQ]7_RESV from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-3-git-send-email-geert+renesas@glider.be Fixes: 3810e96056ffddf6 ("sh: modify pinmux for SH7757 2nd cut") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7722: remove nonexistent GPIO_PTQ7 to fix pinctrl registrationGeert Uytterhoeven1-1/+1
Patch series "sh: sh7722/sh7757i/sh7264/sh7269: Fix pinctrl registration", v2. Magnus Damm reported that on sh7722/Migo-R, pinctrl registration fails with: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. Apparently GPIO_PTQ7 was defined in the enum, but never used. If enum values are defined, but never used, pinmux_pins[] contains (zero-filled) holes. Hence such entries are treated as pin zero, which was registered before, and pinctrl registration fails. I can't see how this ever worked, as at the time of commit f5e25ae52fef ("sh-pfc: Add sh7722 pinmux support"), pinmux_gpios[] in drivers/pinctrl/sh-pfc/pfc-sh7722.c already had the hole, and drivers/pinctrl/core.c already had the check. Some scripting revealed a few more broken drivers: - sh7757 has four holes, due to nonexistent GPIO_PT[JLNQ]7_RESV. - sh7264 and sh7269 define GPIO_PH[0-7], but don't use it with PINMUX_GPIO(). Patch 1 fixes the issue on sh7722, and was tested. Patches 3-4 should fix the issue on the other 3 SoCs, but was untested due to lack of hardware. This patch (of 4): On sh7722/Migo-R, pinctrl registration fails with: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. As GPIO_PTQ7 is defined in the enum, but never used, pinmux_pins[] contains a (zero-filled) hole. Hence this entry is treated as pin zero, which was registered before, and pinctrl registration fails. According to the datasheet, port PTQ7 does not exist. Hence remove GPIO_PTQ7 from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-2-git-send-email-geert+renesas@glider.be Fixes: 8d7b5b0af7e070b9 ("sh: Add sh7722 pinmux code") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reported-by: Magnus Damm <magnus.damm@gmail.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03alpha: fix build failuresSudip Mukherjee1-0/+1
The build of alpha allmodconfig is giving error: arch/alpha/include/asm/mmu_context.h: In function 'ev5_switch_mm': arch/alpha/include/asm/mmu_context.h:160:2: error: implicit declaration of function 'task_thread_info'; did you mean 'init_thread_info'? [-Werror=implicit-function-declaration] The file 'mmu_context.h' needed an extra header file. Link: http://lkml.kernel.org/r/1505668810-7497-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03bpf: fix bpf_tail_call() x64 JITAlexei Starovoitov1-2/+2
- bpf prog_array just like all other types of bpf array accepts 32-bit index. Clarify that in the comment. - fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes - tighten corresponding check in the interpreter to stay consistent The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag in commit 96eabe7a40aa in 4.14. Before that the map_flags would stay zero and though JIT code is wrong it will check bounds correctly. Hence two fixes tags. All other JITs don't have this problem. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Fixes: 96eabe7a40aa ("bpf: Allow selecting numa node during map creation") Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03KVM: PPC: Book3S: Fix server always zero from kvmppc_xive_get_xive()Sam Bobroff2-4/+2
In KVM's XICS-on-XIVE emulation, kvmppc_xive_get_xive() returns the value of state->guest_server as "server". However, this value is not set by it's counterpart kvmppc_xive_set_xive(). When the guest uses this interface to migrate interrupts away from a CPU that is going offline, it sees all interrupts as belonging to CPU 0, so they are left assigned to (now) offline CPUs. This patch removes the guest_server field from the state, and returns act_server in it's place (that is, the CPU actually handling the interrupt, which may differ from the one requested). Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: stable@vger.kernel.org Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-10-03powerpc/4xx: Fix compile error with 64K pages on 40x, 44xChristian Lamparter1-3/+0
The mmu context on the 40x, 44x does not define pte_frag entry. This causes gcc abort the compilation due to: setup-common.c: In function ‘setup_arch’: setup-common.c:908: error: ‘mm_context_t’ has no ‘pte_frag’ This patch fixes the issue by removing the pte_frag initialization in setup-common.c. This is possible, because the compiler will do the initialization, since the mm_context is a sub struct of init_mm. init_mm is declared in mm_types.h as external linkage. According to C99 6.2.4.3: An object whose identifier is declared with external linkage [...] has static storage duration. C99 defines in 6.7.8.10 that: If an object that has static storage duration is not initialized explicitly, then: - if it has pointer type, it is initialized to a null pointer Fixes: b1923caa6e64 ("powerpc: Merge 32-bit and 64-bit setup_arch()") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-03powerpc: Fix action argument for cpufeatures-based TLB flushJeremy Kerr1-2/+2
Commit 41d0c2ecde19 ("powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9") introduced calls to __flush_tlb_power[89] from the cpufeatures code, specifying the number of sets to flush. However, these functions take an action argument, not a number of sets. This means we hit the BUG() in __flush_tlb_{206,300} when using cpufeatures-style configuration. This change passes TLB_INVAL_SCOPE_GLOBAL instead. Fixes: 41d0c2ecde19 ("powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-02arm64: fix misleading data abort decodingMark Rutland1-1/+1
Currently data_abort_decode() dumps the ISS field as a decimal value with a '0x' prefix, which is somewhat misleading. Fix it to print as hexadecimal, as was intended. Fixes: 1f9b8936f36f4a8e ("arm64: Decode information from ESR upon mem faults") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29x86/asm: Use register variable to get stack pointer valueAndrey Ryabinin5-18/+7
Currently we use current_stack_pointer() function to get the value of the stack pointer register. Since commit: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang") ... we have a stack register variable declared. It can be used instead of current_stack_pointer() function which allows to optimize away some excessive "mov %rsp, %<dst>" instructions: -mov %rsp,%rdx -sub %rdx,%rax -cmp $0x3fff,%rax -ja ffffffff810722fd <ist_begin_non_atomic+0x2d> +sub %rsp,%rax +cmp $0x3fff,%rax +ja ffffffff810722fa <ist_begin_non_atomic+0x2a> Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer and use it instead of the removed function. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29x86/mm: Disable branch profiling in mem_encrypt.cTom Lendacky1-0/+2
Some routines in mem_encrypt.c are called very early in the boot process, e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined the resulting branch profiling associated with the check to see if SME is active results in a kernel crash. Disable branch profiling for mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any header files. Reported-by: kernel test robot <lkp@01.org> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29arm64: fault: Route pte translation faults via do_translation_faultWill Deacon1-1/+1
We currently route pte translation faults via do_page_fault, which elides the address check against TASK_SIZE before invoking the mm fault handling code. However, this can cause issues with the path walking code in conjunction with our word-at-a-time implementation because load_unaligned_zeropad can end up faulting in kernel space if it reads across a page boundary and runs into a page fault (e.g. by attempting to read from a guard region). In the case of such a fault, load_unaligned_zeropad has registered a fixup to shift the valid data and pad with zeroes, however the abort is reported as a level 3 translation fault and we dispatch it straight to do_page_fault, despite it being a kernel address. This results in calling a sleeping function from atomic context: BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313 in_atomic(): 0, irqs_disabled(): 0, pid: 10290 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [...] [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144 [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c [<ffffff8e016977f0>] do_page_fault+0x140/0x330 [<ffffff8e01681328>] do_mem_abort+0x54/0xb0 Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0) [...] [<ffffff8e016844fc>] el1_da+0x18/0x78 [<ffffff8e017f399c>] path_parentat+0x44/0x88 [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8 [<ffffff8e017f5044>] filename_create+0x4c/0x128 [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8 [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28 Code: 36380080 d5384100 f9400800 9402566d (d4210000) ---[ end trace 2d01889f2bca9b9f ]--- Fix this by dispatching all translation faults to do_translation_faults, which avoids invoking the page fault logic for faults on kernel addresses. Cc: <stable@vger.kernel.org> Reported-by: Ankit Jain <ankijain@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29arm64: mm: Use READ_ONCE when dereferencing pointer to pte tableWill Deacon1-1/+1
On kernels built with support for transparent huge pages, different CPUs can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk and they must take care to use READ_ONCE to avoid value tearing or caching of stale values by the compiler. Unfortunately, these functions call into our pgtable macros, which don't use READ_ONCE, and compiler caching has been observed to cause the following crash during ext4 writeback: PC is at check_pte+0x20/0x170 LR is at page_vma_mapped_walk+0x2e0/0x540 [...] Process doio (pid: 2463, stack limit = 0xffff00000f2e8000) Call trace: [<ffff000008233328>] check_pte+0x20/0x170 [<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540 [<ffff000008234adc>] page_mkclean_one+0xac/0x278 [<ffff000008234d98>] rmap_walk_file+0xf0/0x238 [<ffff000008236e74>] rmap_walk+0x64/0xa0 [<ffff0000082370c8>] page_mkclean+0x90/0xa8 [<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8 [<ffff00000832f984>] mpage_submit_page+0x34/0x98 [<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170 [<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8 [<ffff00000833530c>] ext4_writepages+0x484/0xe30 [<ffff0000081f6ab4>] do_writepages+0x44/0xe8 [<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110 [<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8 [<ffff000008324310>] ext4_sync_file+0x80/0x4b8 [<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0 [<ffff0000082332b4>] SyS_msync+0x194/0x1e8 This is because page_vma_mapped_walk loads the PMD twice before calling pte_offset_map: the first time without READ_ONCE (where it gets all zeroes due to a concurrent pmdp_invalidate) and the second time with READ_ONCE (where it sees a valid table pointer due to a concurrent pmd_populate). However, the compiler inlines everything and caches the first value in a register, which is subsequently used in pte_offset_phys which returns a junk pointer that is later dereferenced when attempting to access the relevant pte. This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure that a stale value is not used. Whilst this is a point fix for a known failure (and simple to backport), a full fix moving all of our page table accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in page_vma_mapped_walk is in the works for a future kernel release. Cc: Jon Masters <jcm@redhat.com> Cc: Timur Tabi <timur@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()") Tested-by: Richard Ruigrok <rruigrok@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29kvm/x86: Handle async PF in RCU read-side critical sectionsBoqun Feng1-1/+2
Sasha Levin reported a WARNING: | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329 | rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline] | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329 | rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458 ... | CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS | 1.10.1-1ubuntu1 04/01/2014 | Call Trace: ... | RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline] | RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458 | RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002 | RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000 | RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410 | RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001 | R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08 | R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040 | __schedule+0x201/0x2240 kernel/sched/core.c:3292 | schedule+0x113/0x460 kernel/sched/core.c:3421 | kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158 | do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271 | async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069 | RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996 | RSP: 0018:ffff88003b2df520 EFLAGS: 00010283 | RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670 | RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140 | RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000 | R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8 | R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000 | vsnprintf+0x173/0x1700 lib/vsprintf.c:2136 | sprintf+0xbe/0xf0 lib/vsprintf.c:2386 | proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23 | get_link fs/namei.c:1047 [inline] | link_path_walk+0x1041/0x1490 fs/namei.c:2127 ... This happened when the host hit a page fault, and delivered it as in an async page fault, while the guest was in an RCU read-side critical section. The guest then tries to reschedule in kvm_async_pf_task_wait(), but rcu_preempt_note_context_switch() would treat the reschedule as a sleep in RCU read-side critical section, which is not allowed (even in preemptible RCU). Thus the WARN. To cure this, make kvm_async_pf_task_wait() go to the halt path if the PF happens in a RCU read-side critical section. Reported-by: Sasha Levin <levinsasha928@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresumeWanpeng Li1-1/+2
------------[ cut here ]------------ WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel] CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G W OE 4.13.0+ #17 RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel] Call Trace: ? emulator_read_emulated+0x15/0x20 [kvm] ? segmented_read+0xae/0xf0 [kvm] vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel] ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel] x86_emulate_instruction+0x733/0x810 [kvm] vmx_handle_exit+0x2f4/0xda0 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm] kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 A nested #PF is triggered during L0 emulating instruction for L2. However, it doesn't consider we should not break L1's vmlauch/vmresme. This patch fixes it by queuing the #PF exception instead ,requesting an immediate VM exit from L2 and keeping the exception for L1 pending for a subsequent nested VM exit. This should actually work all the time, making vmx_inject_page_fault_nested totally unnecessary. However, that's not working yet, so this patch can work around the issue in the meanwhile. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>