aboutsummaryrefslogtreecommitdiffstats
path: root/arch (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-10-03sh: sh7269: remove nonexistent GPIO_PH[0-7] to fix pinctrl registrationGeert Uytterhoeven1-3/+1
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. If enum values are defined, but never used, pinmux_pins[] contains (zero-filled) holes. Such entries are treated as pin zero, which was registered before, thus leading to pinctrl registration failures, as seen on sh7722: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 Remove GPIO_PH[0-7] from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-5-git-send-email-geert+renesas@glider.be Fixes: ef0fa5331a73e479 ("sh: Add pinmux for sh7269") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7264: remove nonexistent GPIO_PH[0-7] to fix pinctrl registrationGeert Uytterhoeven1-3/+1
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. If enum values are defined, but never used, pinmux_pins[] contains (zero-filled) holes. Such entries are treated as pin zero, which was registered before, thus leading to pinctrl registration failures, as seen on sh7722: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 Remove GPIO_PH[0-7] from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-4-git-send-email-geert+renesas@glider.be Fixes: 41797f75486d8ca3 ("sh: Add pinmux for sh7264") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7757: remove nonexistent GPIO_PT[JLNQ]7_RESV to fix pinctrl registrationGeert Uytterhoeven1-4/+4
Commit 3810e96056ff ("sh: modify pinmux for SH7757 2nd cut") renamed GPIO_PT[JLNQ]7 to GPIO_PT[JLNQ]7_RESV, and removed the existing users from the pinmux_pins[] array. However, pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. Hence entries were not really removed, but replaced by (zero-filled) holes. Such entries are treated as pin zero, which was registered before, thus leading to pinctrl registration failures, as seen on sh7722: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 Remove GPIO_PT[JLNQ]7_RESV from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-3-git-send-email-geert+renesas@glider.be Fixes: 3810e96056ffddf6 ("sh: modify pinmux for SH7757 2nd cut") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03sh: sh7722: remove nonexistent GPIO_PTQ7 to fix pinctrl registrationGeert Uytterhoeven1-1/+1
Patch series "sh: sh7722/sh7757i/sh7264/sh7269: Fix pinctrl registration", v2. Magnus Damm reported that on sh7722/Migo-R, pinctrl registration fails with: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. Apparently GPIO_PTQ7 was defined in the enum, but never used. If enum values are defined, but never used, pinmux_pins[] contains (zero-filled) holes. Hence such entries are treated as pin zero, which was registered before, and pinctrl registration fails. I can't see how this ever worked, as at the time of commit f5e25ae52fef ("sh-pfc: Add sh7722 pinmux support"), pinmux_gpios[] in drivers/pinctrl/sh-pfc/pfc-sh7722.c already had the hole, and drivers/pinctrl/core.c already had the check. Some scripting revealed a few more broken drivers: - sh7757 has four holes, due to nonexistent GPIO_PT[JLNQ]7_RESV. - sh7264 and sh7269 define GPIO_PH[0-7], but don't use it with PINMUX_GPIO(). Patch 1 fixes the issue on sh7722, and was tested. Patches 3-4 should fix the issue on the other 3 SoCs, but was untested due to lack of hardware. This patch (of 4): On sh7722/Migo-R, pinctrl registration fails with: sh-pfc pfc-sh7722: pin 0 already registered sh-pfc pfc-sh7722: error during pin registration sh-pfc pfc-sh7722: could not register: -22 sh-pfc: probe of pfc-sh7722 failed with error -22 pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array initializers, where the GPIO_* enums serve as indices. As GPIO_PTQ7 is defined in the enum, but never used, pinmux_pins[] contains a (zero-filled) hole. Hence this entry is treated as pin zero, which was registered before, and pinctrl registration fails. According to the datasheet, port PTQ7 does not exist. Hence remove GPIO_PTQ7 from the enum to fix this. Link: http://lkml.kernel.org/r/1505205657-18012-2-git-send-email-geert+renesas@glider.be Fixes: 8d7b5b0af7e070b9 ("sh: Add sh7722 pinmux code") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reported-by: Magnus Damm <magnus.damm@gmail.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03alpha: fix build failuresSudip Mukherjee1-0/+1
The build of alpha allmodconfig is giving error: arch/alpha/include/asm/mmu_context.h: In function 'ev5_switch_mm': arch/alpha/include/asm/mmu_context.h:160:2: error: implicit declaration of function 'task_thread_info'; did you mean 'init_thread_info'? [-Werror=implicit-function-declaration] The file 'mmu_context.h' needed an extra header file. Link: http://lkml.kernel.org/r/1505668810-7497-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03bpf: fix bpf_tail_call() x64 JITAlexei Starovoitov1-2/+2
- bpf prog_array just like all other types of bpf array accepts 32-bit index. Clarify that in the comment. - fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes - tighten corresponding check in the interpreter to stay consistent The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag in commit 96eabe7a40aa in 4.14. Before that the map_flags would stay zero and though JIT code is wrong it will check bounds correctly. Hence two fixes tags. All other JITs don't have this problem. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Fixes: 96eabe7a40aa ("bpf: Allow selecting numa node during map creation") Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03KVM: PPC: Book3S: Fix server always zero from kvmppc_xive_get_xive()Sam Bobroff2-4/+2
In KVM's XICS-on-XIVE emulation, kvmppc_xive_get_xive() returns the value of state->guest_server as "server". However, this value is not set by it's counterpart kvmppc_xive_set_xive(). When the guest uses this interface to migrate interrupts away from a CPU that is going offline, it sees all interrupts as belonging to CPU 0, so they are left assigned to (now) offline CPUs. This patch removes the guest_server field from the state, and returns act_server in it's place (that is, the CPU actually handling the interrupt, which may differ from the one requested). Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: stable@vger.kernel.org Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-10-03powerpc/4xx: Fix compile error with 64K pages on 40x, 44xChristian Lamparter1-3/+0
The mmu context on the 40x, 44x does not define pte_frag entry. This causes gcc abort the compilation due to: setup-common.c: In function ‘setup_arch’: setup-common.c:908: error: ‘mm_context_t’ has no ‘pte_frag’ This patch fixes the issue by removing the pte_frag initialization in setup-common.c. This is possible, because the compiler will do the initialization, since the mm_context is a sub struct of init_mm. init_mm is declared in mm_types.h as external linkage. According to C99 6.2.4.3: An object whose identifier is declared with external linkage [...] has static storage duration. C99 defines in 6.7.8.10 that: If an object that has static storage duration is not initialized explicitly, then: - if it has pointer type, it is initialized to a null pointer Fixes: b1923caa6e64 ("powerpc: Merge 32-bit and 64-bit setup_arch()") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-03powerpc: Fix action argument for cpufeatures-based TLB flushJeremy Kerr1-2/+2
Commit 41d0c2ecde19 ("powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9") introduced calls to __flush_tlb_power[89] from the cpufeatures code, specifying the number of sets to flush. However, these functions take an action argument, not a number of sets. This means we hit the BUG() in __flush_tlb_{206,300} when using cpufeatures-style configuration. This change passes TLB_INVAL_SCOPE_GLOBAL instead. Fixes: 41d0c2ecde19 ("powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-02arm64: fix misleading data abort decodingMark Rutland1-1/+1
Currently data_abort_decode() dumps the ISS field as a decimal value with a '0x' prefix, which is somewhat misleading. Fix it to print as hexadecimal, as was intended. Fixes: 1f9b8936f36f4a8e ("arm64: Decode information from ESR upon mem faults") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-10-01Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds13-46/+55
Pull x86 fixes from Thomas Gleixner: "This contains the following fixes and improvements: - Avoid dereferencing an unprotected VMA pointer in the fault signal generation code - Fix inline asm call constraints for GCC 4.4 - Use existing register variable to retrieve the stack pointer instead of forcing the compiler to create another indirect access which results in excessive extra 'mov %rsp, %<dst>' instructions - Disable branch profiling for the memory encryption code to prevent an early boot crash - Fix a sparse warning caused by casting the __user annotation in __get_user_asm_u64() away - Fix an off by one error in the loop termination of the error patch in the x86 sysfs init code - Add missing CPU IDs to various Intel specific drivers to enable the functionality on recent hardware - More (init) constification in the numachip code" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Use register variable to get stack pointer value x86/mm: Disable branch profiling in mem_encrypt.c x86/asm: Fix inline asm call constraints for GCC 4.4 perf/x86/intel/uncore: Correct num_boxes for IIO and IRP perf/x86/intel/rapl: Add missing CPU IDs perf/x86/msr: Add missing CPU IDs perf/x86/intel/cstate: Add missing CPU IDs x86: Don't cast away the __user in __get_user_asm_u64() x86/sysfs: Fix off-by-one error in loop termination x86/mm: Fix fault error path using unsafe vma pointer x86/numachip: Add const and __initconst to numachip2_clockevent
2017-10-01Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-2/+2
Pull timer fixes from Thomas Gleixner: "This adds a new timer wheel function which is required for the conversion of the timer callback function from the 'unsigned long data' argument to 'struct timer_list *timer'. This conversion has two benefits: 1) It makes struct timer_list smaller 2) Many callers hand in a pointer to the timer or to the structure containing the timer, which happens via type casting both at setup and in the callback. This change gets rid of the typecasts. Once the conversion is complete, which is planned for 4.15, the old setup function and the intermediate typecast in the new setup function go away along with the data field in struct timer_list. Merging this now into mainline allows a smooth queueing of the actual conversion in the affected maintainer trees without creating dependencies" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: um/time: Fixup namespace collision timer: Prepare to change timer callback argument type
2017-09-29Merge tag 'led_fixes-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-ledsLinus Torvalds1-3/+7
Pull LED fixes from Jacek Anaszewski: "Four fixes for the as3645a LED flash controller and one update to MAINTAINERS" * tag 'led_fixes-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: MAINTAINERS: Add entry for MediaTek PMIC LED driver as3645a: Unregister indicator LED on device unbind as3645a: Use integer numbers for parsing LEDs dt: bindings: as3645a: Use LED number to refer to LEDs as3645a: Use ams,input-max-microamp as documented in DT bindings
2017-09-29Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds3-2/+3
Pull arm64 fixes from Catalin Marinas: - SPsel register initialisation on reset as the architecture defines its state as unknown - Use READ_ONCE when dereferencing pmd_t pointers to avoid race conditions in page_vma_mapped_walk() (or fast GUP) with concurrent modifications of the page table - Avoid invoking the mm fault handling code for kernel addresses (check against TASK_SIZE) which would otherwise result in calling might_sleep() in atomic context * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: fault: Route pte translation faults via do_translation_fault arm64: mm: Use READ_ONCE when dereferencing pointer to pte table arm64: Make sure SPsel is always set
2017-09-29Merge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds2-11/+6
Pull xen fixes from Juergen Gross: - avoid a warning when compiling with clang - consider read-only bits in xen-pciback when writing to a BAR - fix a boot crash of pv-domains * tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping xen-pciback: relax BAR sizing write value check x86/xen: clean up clang build warning
2017-09-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-107/+116
Pull kvm fixes from Paolo Bonzini: "Mixed bugfixes. Perhaps the most interesting one is a latent bug that was finally triggered by PCID support" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm/x86: Handle async PF in RCU read-side critical sections KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume KVM: VMX: use cmpxchg64 KVM: VMX: simplify and fix vmx_vcpu_pi_load KVM: VMX: avoid double list add with VT-d posted interrupts KVM: VMX: extract __pi_post_block KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception KVM: nVMX: fix HOST_CR3/HOST_CR4 cache
2017-09-29x86/asm: Use register variable to get stack pointer valueAndrey Ryabinin5-18/+7
Currently we use current_stack_pointer() function to get the value of the stack pointer register. Since commit: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang") ... we have a stack register variable declared. It can be used instead of current_stack_pointer() function which allows to optimize away some excessive "mov %rsp, %<dst>" instructions: -mov %rsp,%rdx -sub %rdx,%rax -cmp $0x3fff,%rax -ja ffffffff810722fd <ist_begin_non_atomic+0x2d> +sub %rsp,%rax +cmp $0x3fff,%rax +ja ffffffff810722fa <ist_begin_non_atomic+0x2a> Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer and use it instead of the removed function. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29x86/mm: Disable branch profiling in mem_encrypt.cTom Lendacky1-0/+2
Some routines in mem_encrypt.c are called very early in the boot process, e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined the resulting branch profiling associated with the check to see if SME is active results in a kernel crash. Disable branch profiling for mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any header files. Reported-by: kernel test robot <lkp@01.org> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29arm64: fault: Route pte translation faults via do_translation_faultWill Deacon1-1/+1
We currently route pte translation faults via do_page_fault, which elides the address check against TASK_SIZE before invoking the mm fault handling code. However, this can cause issues with the path walking code in conjunction with our word-at-a-time implementation because load_unaligned_zeropad can end up faulting in kernel space if it reads across a page boundary and runs into a page fault (e.g. by attempting to read from a guard region). In the case of such a fault, load_unaligned_zeropad has registered a fixup to shift the valid data and pad with zeroes, however the abort is reported as a level 3 translation fault and we dispatch it straight to do_page_fault, despite it being a kernel address. This results in calling a sleeping function from atomic context: BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313 in_atomic(): 0, irqs_disabled(): 0, pid: 10290 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [...] [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144 [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c [<ffffff8e016977f0>] do_page_fault+0x140/0x330 [<ffffff8e01681328>] do_mem_abort+0x54/0xb0 Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0) [...] [<ffffff8e016844fc>] el1_da+0x18/0x78 [<ffffff8e017f399c>] path_parentat+0x44/0x88 [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8 [<ffffff8e017f5044>] filename_create+0x4c/0x128 [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8 [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28 Code: 36380080 d5384100 f9400800 9402566d (d4210000) ---[ end trace 2d01889f2bca9b9f ]--- Fix this by dispatching all translation faults to do_translation_faults, which avoids invoking the page fault logic for faults on kernel addresses. Cc: <stable@vger.kernel.org> Reported-by: Ankit Jain <ankijain@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29arm64: mm: Use READ_ONCE when dereferencing pointer to pte tableWill Deacon1-1/+1
On kernels built with support for transparent huge pages, different CPUs can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk and they must take care to use READ_ONCE to avoid value tearing or caching of stale values by the compiler. Unfortunately, these functions call into our pgtable macros, which don't use READ_ONCE, and compiler caching has been observed to cause the following crash during ext4 writeback: PC is at check_pte+0x20/0x170 LR is at page_vma_mapped_walk+0x2e0/0x540 [...] Process doio (pid: 2463, stack limit = 0xffff00000f2e8000) Call trace: [<ffff000008233328>] check_pte+0x20/0x170 [<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540 [<ffff000008234adc>] page_mkclean_one+0xac/0x278 [<ffff000008234d98>] rmap_walk_file+0xf0/0x238 [<ffff000008236e74>] rmap_walk+0x64/0xa0 [<ffff0000082370c8>] page_mkclean+0x90/0xa8 [<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8 [<ffff00000832f984>] mpage_submit_page+0x34/0x98 [<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170 [<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8 [<ffff00000833530c>] ext4_writepages+0x484/0xe30 [<ffff0000081f6ab4>] do_writepages+0x44/0xe8 [<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110 [<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8 [<ffff000008324310>] ext4_sync_file+0x80/0x4b8 [<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0 [<ffff0000082332b4>] SyS_msync+0x194/0x1e8 This is because page_vma_mapped_walk loads the PMD twice before calling pte_offset_map: the first time without READ_ONCE (where it gets all zeroes due to a concurrent pmdp_invalidate) and the second time with READ_ONCE (where it sees a valid table pointer due to a concurrent pmd_populate). However, the compiler inlines everything and caches the first value in a register, which is subsequently used in pte_offset_phys which returns a junk pointer that is later dereferenced when attempting to access the relevant pte. This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure that a stale value is not used. Whilst this is a point fix for a known failure (and simple to backport), a full fix moving all of our page table accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in page_vma_mapped_walk is in the works for a future kernel release. Cc: Jon Masters <jcm@redhat.com> Cc: Timur Tabi <timur@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()") Tested-by: Richard Ruigrok <rruigrok@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-29kvm/x86: Handle async PF in RCU read-side critical sectionsBoqun Feng1-1/+2
Sasha Levin reported a WARNING: | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329 | rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline] | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329 | rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458 ... | CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS | 1.10.1-1ubuntu1 04/01/2014 | Call Trace: ... | RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline] | RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458 | RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002 | RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000 | RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410 | RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001 | R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08 | R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040 | __schedule+0x201/0x2240 kernel/sched/core.c:3292 | schedule+0x113/0x460 kernel/sched/core.c:3421 | kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158 | do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271 | async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069 | RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996 | RSP: 0018:ffff88003b2df520 EFLAGS: 00010283 | RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670 | RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140 | RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000 | R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8 | R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000 | vsnprintf+0x173/0x1700 lib/vsprintf.c:2136 | sprintf+0xbe/0xf0 lib/vsprintf.c:2386 | proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23 | get_link fs/namei.c:1047 [inline] | link_path_walk+0x1041/0x1490 fs/namei.c:2127 ... This happened when the host hit a page fault, and delivered it as in an async page fault, while the guest was in an RCU read-side critical section. The guest then tries to reschedule in kvm_async_pf_task_wait(), but rcu_preempt_note_context_switch() would treat the reschedule as a sleep in RCU read-side critical section, which is not allowed (even in preemptible RCU). Thus the WARN. To cure this, make kvm_async_pf_task_wait() go to the halt path if the PF happens in a RCU read-side critical section. Reported-by: Sasha Levin <levinsasha928@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresumeWanpeng Li1-1/+2
------------[ cut here ]------------ WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel] CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G W OE 4.13.0+ #17 RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel] Call Trace: ? emulator_read_emulated+0x15/0x20 [kvm] ? segmented_read+0xae/0xf0 [kvm] vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel] ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel] x86_emulate_instruction+0x733/0x810 [kvm] vmx_handle_exit+0x2f4/0xda0 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm] kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 A nested #PF is triggered during L0 emulating instruction for L2. However, it doesn't consider we should not break L1's vmlauch/vmresme. This patch fixes it by queuing the #PF exception instead ,requesting an immediate VM exit from L2 and keeping the exception for L1 pending for a subsequent nested VM exit. This should actually work all the time, making vmx_inject_page_fault_nested totally unnecessary. However, that's not working yet, so this patch can work around the issue in the meanwhile. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29x86/asm: Fix inline asm call constraints for GCC 4.4Josh Poimboeuf1-2/+4
The kernel test bot (run by Xiaolong Ye) reported that the following commit: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang") is causing double faults in a kernel compiled with GCC 4.4. Linus subsequently diagnosed the crash pattern and the buggy commit and found that the issue is with this code: register unsigned int __asm_call_sp asm("esp"); #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp) Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC to produce the following bogus code: ffffffff8147461d: 89 e0 mov %esp,%eax ffffffff8147461f: 4c 89 f7 mov %r14,%rdi ffffffff81474622: 4c 89 fe mov %r15,%rsi ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx ffffffff8147462a: 89 c4 mov %eax,%esp ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled> Despite the absurdity of it backing up and restoring the stack pointer for no reason, the bug is actually the fact that it's only backing up and restoring the lower 32 bits of the stack pointer. The upper 32 bits are getting cleared out, corrupting the stack pointer. So change the '__asm_call_sp' register variable to be associated with the actual full-size stack pointer. This also requires changing the __ASM_SEL() macro to be based on the actual compiled arch size, rather than the CONFIG value, because CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso). Otherwise Clang fails to build the kernel because it complains about the use of a 64-bit register (RSP) in a 32-bit file. Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye@intel.com> Diagnosed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: LKP <lkp@01.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang") Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-29um/time: Fixup namespace collisionThomas Gleixner1-2/+2
The new timer_setup() function for struct timer_list collides with a private um function. Rename it. Fixes: 686fef928bba ("timer: Prepare to change timer callback argument type") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Weinberger <richard@nod.at> Cc: Jeff Dike <jdike@addtoit.com> Cc: user-mode-linux-devel@lists.sourceforge.net Cc: Kees Cook <keescook@chromium.org>
2017-09-29powerpc: Fix workaround for spurious MCE on POWER9Michael Neuling1-2/+2
In the recent commit d8bd9f3f0925 ("powerpc: Handle MCE on POWER9 with only DSISR bit 30 set") I screwed up the bit number. It should be bit 25 (IBM bit 38). Fixes: d8bd9f3f0925 ("powerpc: Handle MCE on POWER9 with only DSISR bit 30 set") Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-28KVM: VMX: use cmpxchg64Paolo Bonzini1-6/+6
This fixes a compilation failure on 32-bit systems. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-28xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mappingZhenzhong Duan1-9/+4
When bootup a PVM guest with large memory(Ex.240GB), XEN provided initial mapping overlaps with kernel module virtual space. When mapping in this space is cleared by xen_cleanhighmap(), in certain case there could be an 2MB mapping left. This is due to XEN initialize 4MB aligned mapping but xen_cleanhighmap() finish at 2MB boundary. When module loading is just on top of the 2MB space, got below warning: WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190() Call Trace: [<ffffffff81117083>] warn_alloc_failed+0xf3/0x160 [<ffffffff81146022>] __vmalloc_area_node+0x182/0x1c0 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80 [<ffffffff81145df7>] __vmalloc_node_range+0xa7/0x110 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80 [<ffffffff8103ca54>] module_alloc+0x64/0x70 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80 [<ffffffff810ac91e>] module_alloc_update_bounds+0x1e/0x80 [<ffffffff810ac9a7>] move_module+0x27/0x150 [<ffffffff810aefa0>] layout_and_allocate+0x120/0x1b0 [<ffffffff810af0a8>] load_module+0x78/0x640 [<ffffffff811ff90b>] ? security_file_permission+0x8b/0x90 [<ffffffff810af6d2>] sys_init_module+0x62/0x1e0 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b Then the mapping of 2MB is cleared, finally oops when the page in that space is accessed. BUG: unable to handle kernel paging request at ffff880022600000 IP: [<ffffffff81260877>] clear_page_c_e+0x7/0x10 PGD 1788067 PUD 178c067 PMD 22434067 PTE 0 Oops: 0002 [#1] SMP Call Trace: [<ffffffff81116ef7>] ? prep_new_page+0x127/0x1c0 [<ffffffff81117d42>] get_page_from_freelist+0x1e2/0x550 [<ffffffff81133010>] ? ii_iovec_copy_to_user+0x90/0x140 [<ffffffff81119c9d>] __alloc_pages_nodemask+0x12d/0x230 [<ffffffff81155516>] alloc_pages_vma+0xc6/0x1a0 [<ffffffff81006ffd>] ? pte_mfn_to_pfn+0x7d/0x100 [<ffffffff81134cfb>] do_anonymous_page+0x16b/0x350 [<ffffffff81139c34>] handle_pte_fault+0x1e4/0x200 [<ffffffff8100712e>] ? xen_pmd_val+0xe/0x10 [<ffffffff810052c9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff81139dab>] handle_mm_fault+0x15b/0x270 [<ffffffff81510c10>] do_page_fault+0x140/0x470 [<ffffffff8150d7d5>] page_fault+0x25/0x30 Call xen_cleanhighmap() with 4MB aligned for page tables mapping to fix it. The unnecessory call of xen_cleanhighmap() in DEBUG mode is also removed. -v2: add comment about XEN alignment from Juergen. References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> [boris: added 'xen/mmu' tag to commit subject] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-27KVM: VMX: simplify and fix vmx_vcpu_pi_loadPaolo Bonzini1-33/+35
The simplify part: do not touch pi_desc.nv, we can set it when the VCPU is first created. Likewise, pi_desc.sn is only handled by vmx_vcpu_pi_load, do not touch it in __pi_post_block. The fix part: do not check kvm_arch_has_assigned_device, instead check the SN bit to figure out whether vmx_vcpu_pi_put ran before. This matches what the previous patch did in pi_post_block. Cc: Huangweidong <weidong.huang@huawei.com> Cc: Gonglei <arei.gonglei@huawei.com> Cc: wangxin <wangxinxin.wang@huawei.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Longpeng (Mike) <longpeng2@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-27KVM: VMX: avoid double list add with VT-d posted interruptsPaolo Bonzini1-37/+25
In some cases, for example involving hot-unplug of assigned devices, pi_post_block can forget to remove the vCPU from the blocked_vcpu_list. When this happens, the next call to pi_pre_block corrupts the list. Fix this in two ways. First, check vcpu->pre_pcpu in pi_pre_block and WARN instead of adding the element twice in the list. Second, always do the list removal in pi_post_block if vcpu->pre_pcpu is set (not -1). The new code keeps interrupts disabled for the whole duration of pi_pre_block/pi_post_block. This is not strictly necessary, but easier to follow. For the same reason, PI.ON is checked only after the cmpxchg, and to handle it we just call the post-block code. This removes duplication of the list removal code. Cc: Huangweidong <weidong.huang@huawei.com> Cc: Gonglei <arei.gonglei@huawei.com> Cc: wangxin <wangxinxin.wang@huawei.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Longpeng (Mike) <longpeng2@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-27KVM: VMX: extract __pi_post_blockPaolo Bonzini1-33/+38
Simple code movement patch, preparing for the next one. Cc: Huangweidong <weidong.huang@huawei.com> Cc: Gonglei <arei.gonglei@huawei.com> Cc: wangxin <wangxinxin.wang@huawei.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Longpeng (Mike) <longpeng2@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-27arm64: Make sure SPsel is always setMarc Zyngier1-0/+1
When the kernel is entered at EL2 on an ARMv8.0 system, we construct the EL1 pstate and make sure this uses the the EL1 stack pointer (we perform an exception return to EL1h). But if the kernel is either entered at EL1 or stays at EL2 (because we're on a VHE-capable system), we fail to set SPsel, and use whatever stack selection the higher exception level has choosen for us. Let's not take any chance, and make sure that SPsel is set to one before we decide the mode we're going to run in. Cc: <stable@vger.kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-26arm64: dts: rockchip: add the grf clk for dw-mipi-dsi on rk3399Nickey Yang1-2/+2
The clk of grf must be enabled before writing grf register for rk3399. Signed-off-by: Nickey Yang <nickey.yang@rock-chips.com> [the grf clock is already part of the binding since march 2017] Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2017-09-26powerpc: Handle MCE on POWER9 with only DSISR bit 30 setMichael Neuling1-0/+13
On POWER9 DD2.1 and below, it's possible for a paste instruction to cause a Machine Check Exception (MCE) where only DSISR bit 30 (IBM 33) is set. This will result in the MCE handler seeing an unknown event, which triggers linux to crash. We change this by detecting unknown events caused by load/stores in the MCE handler and marking them as handled so that we no longer crash. An MCE that occurs like this is spurious, so we don't need to do anything in terms of servicing it. If there is something that needs to be serviced, the CPU will raise the MCE again with the correct DSISR so that it can be serviced properly. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com Acked-by: Balbir Singh <bsingharora@gmail.com> [mpe: Expand comment with details from change log, use normal bit #s] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-26Merge branch 'WIP.x86/fpu' into x86/fpu, because it's readyIngo Molnar15-315/+375
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVESEric Biggers1-1/+1
This is the canonical method to use. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-11-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate()Eric Biggers1-11/+5
Tighten the checks in copy_user_to_xstate(). Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-10-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()Eric Biggers1-7/+4
We now have this field in hdr.xfeatures. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-9-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Copy the full header in copy_user_to_xstate()Eric Biggers1-2/+5
This is in preparation to verify the full xstate header as supplied by user-space. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-8-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate()Eric Biggers1-10/+2
Tighten the checks in copy_kernel_to_xstate(). Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-7-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate()Eric Biggers1-6/+4
We have this information in the xstate_header. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-6-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Copy the full state_header in copy_kernel_to_xstate()Eric Biggers1-2/+4
This is in preparation to verify the full xstate header as supplied by user-space. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-5-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig()Eric Biggers1-7/+9
Tighten the checks in __fpu__restore_sig() and update comments. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-4-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set()Eric Biggers1-13/+6
Tighten the checks in xstateregs_set(). Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-3-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Introduce validate_xstate_header()Eric Biggers2-0/+28
Move validation of user-supplied xstate_header into a helper function, in preparation of calling it from both the ptrace and sigreturn syscall paths. The new function also considers it to be an error if *any* reserved bits are set, whereas before we were just clearing most of them silently. This should reduce the chance of bugs that fail to correctly validate user-supplied XSAVE areas. It also will expose any broken userspace programs that set the other reserved bits; this is desirable because such programs will lose compatibility with future CPUs and kernels if those bits are ever used for anything. (There shouldn't be any such programs, and in fact in the case where the compacted format is in use we were already validating xfeatures. But you never know...) Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170924105913.9157-2-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__prepare_[read|write]()Ingo Molnar3-10/+10
As per the new nomenclature we don't 'activate' the FPU state anymore, we initialize it. So drop the _activate_fpstate name from these functions, which were a bit of a mouthful anyway, and name them: fpu__prepare_read() fpu__prepare_write() Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Rename fpu__activate_curr() to fpu__initialize()Ingo Molnar5-8/+8
Rename this function to better express that it's all about initializing the FPU state of a task which goes hand in hand with the fpu::initialized field. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20170923130016.21448-33-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Simplify and speed up fpu__copy()Ingo Molnar1-12/+3
fpu__copy() has a preempt_disable()/enable() pair, which it had to do to be able to atomically unlazy the current task when doing an FNSAVE. But we don't unlazy tasks anymore, we always do direct saves/restores of FPU context. So remove both the unnecessary critical section, and update the comments. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20170923130016.21448-32-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Fix stale comments about lazy FPU logicIngo Molnar1-6/+3
We don't do any lazy restore anymore, what we have are two pieces of optimization: - no-FPU tasks that don't save/restore the FPU context (kernel threads are such) - cached FPU registers maintained via the fpu->last_cpu field. This means that if an FPU task context switches to a non-FPU task then we can maintain the FPU registers as an in-FPU copies (cache), and skip the restoration of them once we switch back to the original FPU-using task. Update all the comments that still referred to old 'lazy' and 'unlazy' concepts. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20170923130016.21448-31-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Rename fpu::fpstate_active to fpu::initializedIngo Molnar11-35/+35
The x86 FPU code used to have a complex state machine where both the FPU registers and the FPU state context could be 'active' (or inactive) independently of each other - which enabled features like lazy FPU restore. Much of this complexity is gone in the current code: now we basically can have FPU-less tasks (kernel threads) that don't use (and save/restore) FPU state at all, plus full FPU users that save/restore directly with no laziness whatsoever. But the fpu::fpstate_active still carries bits of the old complexity - meanwhile this flag has become a simple flag that shows whether the FPU context saving area in the thread struct is initialized and used, or not. Rename it to fpu::initialized to express this simplicity in the name as well. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20170923130016.21448-30-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26x86/fpu: Remove fpu__current_fpstate_write_begin/end()Ingo Molnar2-65/+0
These functions are not used anymore, so remove them. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20170923130016.21448-29-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>