path: root/include/asm-generic/error-injection.h
Age | Commit message | Author | Files | Lines
2019-04-03 | asm-generic/tlb: Remove CONFIG_HAVE_GENERIC_MMU_GATHER | Peter Zijlstra | 2 | -5/+0
Since all architectures are now using it, it is redundant. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb: Remove arch_tlb*_mmu() | Peter Zijlstra | 1 | -51/+42
Now that all architectures are converted to the generic code, remove the arch hooks. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | s390/tlb: Convert to generic mmu_gather | Martin Schwidefsky | 3 | -151/+42
No change in behavior intended. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: heiko.carstens@de.ibm.com Cc: linux@armlinux.org.uk Cc: npiggin@gmail.com Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/20180918125151.31744-3-schwidefsky@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb: Introduce CONFIG_HAVE_MMU_GATHER_NO_GATHER=y | Martin Schwidefsky | 3 | -49/+70
Add the Kconfig option HAVE_MMU_GATHER_NO_GATHER to the generic mmu_gather code. If the option is set the mmu_gather will not track individual pages for delayed page free anymore. A platform that enables the option needs to provide its own implementation of the __tlb_remove_page_size() function to free pages. No change in behavior intended. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: heiko.carstens@de.ibm.com Cc: linux@armlinux.org.uk Cc: npiggin@gmail.com Link: http://lkml.kernel.org/r/20180918125151.31744-2-schwidefsky@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
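As an illustration of what an architecture selecting the option has to supply, here is a minimal sketch of a __tlb_remove_page_size() that frees pages immediately instead of batching them; the body is an assumption modelled on the description above (free_page_and_swap_cache() comes from <linux/swap.h>), not a copy of any architecture's code:

    static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
                                              struct page *page, int page_size)
    {
            /* no gather list: hand the page back right away */
            free_page_and_swap_cache(page);

            /* returning false means the caller never needs an early flush */
            return false;
    }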
2019-04-03 | arch/tlb: Clean up simple architectures | Peter Zijlstra | 22 | -143/+16
For the architectures that do not implement their own tlb_flush() but do already use the generic mmu_gather, there are two options:

  1) the platform has an efficient flush_tlb_range() and asm-generic/tlb.h doesn't need any overrides at all.
  2) the platform lacks an efficient flush_tlb_range() and we select MMU_GATHER_NO_RANGE to minimize full invalidates.

Convert all 'simple' architectures to one of these two forms:

  alpha:      has no range invalidate -> 2
  arc:        already used flush_tlb_range() -> 1
  c6x:        has no range invalidate -> 2
  hexagon:    has an efficient flush_tlb_range() -> 1 (flush_tlb_mm() is in fact a full range invalidate, so no need to shoot down everything)
  m68k:       has inefficient flush_tlb_range() -> 2
  microblaze: has no flush_tlb_range() -> 2
  mips:       has efficient flush_tlb_range() -> 1 (even though it currently seems to use flush_tlb_mm())
  nds32:      already uses flush_tlb_range() -> 1
  nios2:      has inefficient flush_tlb_range() -> 2 (no limit on range iteration)
  openrisc:   has inefficient flush_tlb_range() -> 2 (no limit on range iteration)
  parisc:     already uses flush_tlb_range() -> 1
  sparc32:    already uses flush_tlb_range() -> 1
  unicore32:  has inefficient flush_tlb_range() -> 2 (no limit on range iteration)
  xtensa:     has efficient flush_tlb_range() -> 1

Note this also fixes a bug in the existing code for a number of platforms. Those platforms that did:

  tlb_end_vma() -> if (!full_mm) flush_tlb_*()
  tlb_flush()   -> if (full_mm) flush_tlb_mm()

missed the case of shift_arg_pages(), which doesn't have @fullmm set, nor calls into tlb_*vma(), but still frees page-tables and thus needs an invalidate. The new code handles this by detecting a non-empty range, and either issuing the matching range invalidate or a full invalidate, depending on the capabilities. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Helge Deller <deller@gmx.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: Ley Foon Tan <lftan@altera.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Piggin <npiggin@gmail.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | um/tlb: Convert to generic mmu_gather | Peter Zijlstra | 1 | -154/+2
Generic mmu_gather provides the simple flush_tlb_range() based range tracking mmu_gather UM needs. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | sh/tlb: Convert SH to generic mmu_gather | Peter Zijlstra | 2 | -129/+10
Generic mmu_gather provides everything SH needs (range tracking and cache coherency). No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rich Felker <dalias@libc.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | ia64/tlb: Convert to generic mmu_gather | Peter Zijlstra | 3 | -257/+47
Generic mmu_gather provides everything ia64 needs (range tracking). No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | arm/tlb: Convert to generic mmu_gather | Peter Zijlstra | 1 | -241/+13
Generic mmu_gather provides everything that ARM needs: - range tracking - RCU table free - VM_EXEC tracking - VIPT cache flushing The one notable curiosity is the 'funny' range tracking for classical ARM in __pte_free_tlb(). No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb, arch: Invert CONFIG_HAVE_RCU_TABLE_INVALIDATE | Peter Zijlstra | 7 | -8/+9
Make issuing a TLB invalidate for page-table pages the normal case. The reason is twofold: - too many invalidates is safer than too few, - most architectures use the linux page-tables natively and would thus require this. Make it an opt-out, instead of an opt-in. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb, ia64: Conditionally provide tlb_migrate_finish() | Peter Zijlstra | 1 | -0/+2
Needed for ia64 -- alternatively we drop the entire hook. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm() | Peter Zijlstra | 1 | -1/+40
When an architecture does not have (an efficient) flush_tlb_range(), but instead always uses full TLB invalidates, the current generic tlb_flush() is sub-optimal, for it will generate extra flushes in order to keep the range small. But if we cannot do range flushes, that is a moot concern. Optionally provide this simplified default. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
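A minimal sketch of that simplified default, assuming the usual mmu_gather fields (tlb->mm, tlb->end): any non-empty gather simply flushes the whole mm.

    static inline void tlb_flush(struct mmu_gather *tlb)
    {
            /* no usable range invalidate: a non-empty gather flushes everything */
            if (tlb->end)
                    flush_tlb_mm(tlb->mm);
    }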
2019-04-03 | asm-generic/tlb, arch: Provide generic tlb_flush() based on flush_tlb_range() | Peter Zijlstra | 5 | -12/+87
Provide a generic tlb_flush() implementation that relies on flush_tlb_range(). This is a little awkward because flush_tlb_range() assumes a VMA for range invalidation, but we no longer have one. Audit of all flush_tlb_range() implementations shows only vma->vm_mm and vma->vm_flags are used, and of the latter only VM_EXEC (I-TLB invalidates) and VM_HUGETLB (large TLB invalidate) are used. Therefore, track VM_EXEC and VM_HUGETLB in two more bits, and create a 'fake' VMA. This allows architectures that have a reasonably efficient flush_tlb_range() to not require any additional effort. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb, arch: Provide generic VIPT cache flush | Peter Zijlstra | 8 | -61/+11
The one obvious thing SH and ARM want is a sensible default for tlb_start_vma(). (also: https://lkml.org/lkml/2004/1/15/6 ) Avoid all VIPT architectures providing their own tlb_start_vma() implementation and rely on architectures to provide a no-op flush_cache_range() when it is not relevant. This patch makes tlb_start_vma() default to flush_cache_range(), which should be right and sufficient. The only exceptions that I found were (oddly):

  - m68k-mmu
  - sparc64
  - unicore

Those architectures appear to have flush_cache_range(), but their current tlb_start_vma() does not call it. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Miller <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb, arch: Provide CONFIG_HAVE_MMU_GATHER_PAGE_SIZE | Peter Zijlstra | 14 | -49/+39
Move the mmu_gather::page_size things into the generic code instead of PowerPC specific bits. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 | asm-generic/tlb: Provide a comment | Peter Zijlstra | 1 | -3/+116
Write a comment explaining some of this.. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-01 | debugfs: fix use-after-free on symlink traversal | Al Viro | 1 | -4/+9
The symlink body shouldn't be freed without an RCU delay. Switch debugfs to ->destroy_inode() and the use of call_rcu(); free both the inode and the symlink body in the callback. Similar to the solution for bpf, only here it's even more obvious that ->evict_inode() can be dropped. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-01 | ubifs: fix use-after-free on symlink traversal | Al Viro | 1 | -3/+1
free the symlink body after the same RCU delay we have for freeing the struct inode itself, so that traversal during RCU pathwalk wouldn't step into freed memory. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-01 | jffs2: fix use-after-free on symlink traversal | Al Viro | 2 | -6/+4
free the symlink body after the same RCU delay we have for freeing the struct inode itself, so that traversal during RCU pathwalk wouldn't step into freed memory. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-31 | Linux 5.1-rc3 | Linus Torvalds | 1 | -1/+1
2019-03-30 | leds: trigger: netdev: use memcpy in device_name_store | Rasmus Villemoes | 1 | -1/+2
If userspace doesn't end the input with a newline (which can easily happen if the write happens from a C program that does write(fd, iface, strlen(iface))), we may end up including garbage from a previous, longer value in the device_name. For example:

  # cat device_name

  # printf 'eth12' > device_name
  # cat device_name
  eth12
  # printf 'eth3' > device_name
  # cat device_name
  eth32

I highly doubt anybody is relying on this behaviour, so switch to simply copying the bytes (we've already checked that size is < IFNAMSIZ) and unconditionally zero-terminate it; of course, we also still have to strip a trailing newline. This is also preparation for future patches. Fixes: 06f502f57d0d ("leds: trigger: Introduce a NETDEV trigger") Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
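A hedged sketch of the store method after the change; the struct, lock and helper names follow the netdev trigger driver loosely and are assumptions here, not a verbatim copy:

    static ssize_t device_name_store(struct device *dev,
                                     struct device_attribute *attr,
                                     const char *buf, size_t size)
    {
            struct led_netdev_data *trigger_data = led_trigger_get_drvdata(dev);

            if (size >= IFNAMSIZ)
                    return -EINVAL;

            spin_lock_bh(&trigger_data->lock);

            memcpy(trigger_data->device_name, buf, size);
            trigger_data->device_name[size] = '\0';         /* always terminate */

            /* strip a trailing newline, if the write came from the shell */
            if (size > 0 && trigger_data->device_name[size - 1] == '\n')
                    trigger_data->device_name[size - 1] = '\0';

            spin_unlock_bh(&trigger_data->lock);
            return size;
    }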
2019-03-30 | leds: pca9532: fix a potential NULL pointer dereference | Kangjie Lu | 1 | -2/+6
In case of_match_device cannot find a match, return -EINVAL to avoid NULL pointer dereference. Fixes: fa4191a609f2 ("leds: pca9532: Add device tree support") Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
2019-03-29 | LSM: Revive CONFIG_DEFAULT_SECURITY_* for "make oldconfig" | Kees Cook | 1 | -0/+38
Commit 70b62c25665f636c ("LoadPin: Initialize as ordered LSM") removed CONFIG_DEFAULT_SECURITY_{SELINUX,SMACK,TOMOYO,APPARMOR,DAC} from security/Kconfig and changed CONFIG_LSM to provide a fixed ordering as a default value. That commit expected that existing users (upgrading from Linux 5.0 and earlier) will edit CONFIG_LSM value in accordance with their CONFIG_DEFAULT_SECURITY_* choice in their old kernel configs. But since users might forget to edit CONFIG_LSM value, this patch revives the choice (only for providing the default value for CONFIG_LSM) in order to make sure that CONFIG_LSM reflects CONFIG_DEFAULT_SECURITY_* from their old kernel configs. Note that since TOMOYO can be fully stacked against the other legacy major LSMs, when it is selected, it explicitly disables the other LSMs to avoid them also initializing since TOMOYO does not expect this currently. Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Fixes: 70b62c25665f636c ("LoadPin: Initialize as ordered LSM") Co-developed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-03-29 | fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links | YueHaibing | 1 | -1/+2
Syzkaller reports: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 5373 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 RIP: 0010:put_links+0x101/0x440 fs/proc/proc_sysctl.c:1599 Code: 00 0f 85 3a 03 00 00 48 8b 43 38 48 89 44 24 20 48 83 c0 38 48 89 c2 48 89 44 24 28 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 fe 02 00 00 48 8b 74 24 20 48 c7 c7 60 2a 9d 91 RSP: 0018:ffff8881d828f238 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8881e01b1140 RCX: ffffffff8ee98267 RDX: 0000000000000007 RSI: ffffc90001479000 RDI: ffff8881e01b1178 RBP: dffffc0000000000 R08: ffffed103ee27259 R09: ffffed103ee27259 R10: 0000000000000001 R11: ffffed103ee27258 R12: fffffffffffffff4 R13: 0000000000000006 R14: ffff8881f59838c0 R15: dffffc0000000000 FS: 00007f072254f700(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff8b286668 CR3: 00000001f0542002 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: drop_sysctl_table+0x152/0x9f0 fs/proc/proc_sysctl.c:1629 get_subdir fs/proc/proc_sysctl.c:1022 [inline] __register_sysctl_table+0xd65/0x1090 fs/proc/proc_sysctl.c:1335 br_netfilter_init+0xbc/0x1000 [br_netfilter] do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462e99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f072254ec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99 RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003 RBP: 00007f072254ec70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f072254f6bc R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004 Modules linked in: br_netfilter(+) dvb_usb_dibusb_mc_common dib3000mc dibx000_common dvb_usb_dibusb_common dvb_usb_dw2102 dvb_usb classmate_laptop palmas_regulator cn videobuf2_v4l2 v4l2_common snd_soc_bd28623 mptbase snd_usb_usx2y snd_usbmidi_lib snd_rawmidi wmi libnvdimm lockd sunrpc grace rc_kworld_pc150u rc_core rtc_da9063 sha1_ssse3 i2c_cros_ec_tunnel adxl34x_spi adxl34x nfnetlink lib80211 i5500_temp dvb_as102 dvb_core videobuf2_common videodev media videobuf2_vmalloc videobuf2_memops udc_core lnbp22 leds_lp3952 hid_roccat_ryos s1d13xxxfb mtd vport_geneve openvswitch nf_conncount nf_nat_ipv6 nsh geneve udp_tunnel ip6_udp_tunnel snd_soc_mt6351 sis_agp phylink snd_soc_adau1761_spi snd_soc_adau1761 snd_soc_adau17x1 snd_soc_core snd_pcm_dmaengine ac97_bus snd_compress snd_soc_adau_utils snd_soc_sigmadsp_regmap snd_soc_sigmadsp raid_class hid_roccat_konepure hid_roccat_common hid_roccat c2port_duramar2150 core mdio_bcm_unimac iptable_security iptable_raw iptable_mangle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim devlink vxcan 
batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel joydev mousedev ide_pci_generic piix aesni_intel aes_x86_64 ide_core crypto_simd atkbd cryptd glue_helper serio_raw ata_generic pata_acpi i2c_piix4 floppy sch_fq_codel ip_tables x_tables ipv6 [last unloaded: lm73] Dumping ftrace buffer: (ftrace buffer empty) ---[ end trace 770020de38961fd0 ]--- A new dir entry can be created in get_subdir() with its 'header->parent' set to NULL. Only after insert_header() succeeds is it set to 'dir'; otherwise 'header->parent' remains NULL and drop_sysctl_table() is called. However, in the error handling path of get_subdir(), drop_sysctl_table() is also called on 'new->header' regardless of the value of its parent pointer. Then put_links() is called, which triggers a NULL-ptr deref when it accesses members of header->parent. In fact we have multiple error paths that call drop_sysctl_table() there: upon failure of insert_links() we also call drop_sysctl_table(), and even in the successful case __register_sysctl_table() still always calls drop_sysctl_table(). This patch fixes it. Link: http://lkml.kernel.org/r/20190314085527.13244-1-yuehaibing@huawei.com Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> [3.4+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | fs: fs_parser: fix printk format warning | Randy Dunlap | 1 | -1/+1
Fix printk format warning (seen on i386 builds) by using ptrdiff format specifier (%t): fs/fs_parser.c:413:6: warning: format `%lu' expects argument of type `long unsigned int', but argument 3 has type `int' [-Wformat=] Link: http://lkml.kernel.org/r/19432668-ffd3-fbb2-af4f-1c8e48f6cc81@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
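For reference, a small user-space example of the ptrdiff length modifier the fix relies on; the kernel's printk accepts the same %t modifier:

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
            const char *opt = "option=value";
            const char *eq = opt + 6;       /* points at '=' */
            ptrdiff_t off = eq - opt;       /* a pointer difference is ptrdiff_t */

            /* %td matches ptrdiff_t on both 32-bit and 64-bit builds,
             * so there is no -Wformat warning either way */
            printf("offset = %td\n", off);
            return 0;
    }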
2019-03-29 | checkpatch: add %pt as a valid vsprintf extension | Alexandre Belloni | 1 | -1/+1
Commit 4d42c44727a0 ("lib/vsprintf: Print time and date in human readable format via %pt") introduced a new extension, %pt. Add it in the list of valid extensions. Link: http://lkml.kernel.org/r/20190314203719.29130-1-alexandre.belloni@bootlin.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate | Lars Persson | 1 | -3/+8
Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL and SIGSEGV that could not be traced back to a userspace code bug. They had all the magic signs of an I/D cache coherency issue. Now recently we noticed that the /proc/sys/vm/compact_memory interface was quite efficient at provoking this class of userspace crashes. Studying the code in mm/migrate.c there is a distinction made between migrating a page that is mapped at the instant of migration and one that is not mapped. Our problem turned out to be the non-mapped pages. For the non-mapped page the code performs a copy of the page content and all relevant meta-data of the page without doing the required D-cache maintenance. This leaves dirty data in the D-cache of the CPU and on the 1004K cores this data is not visible to the I-cache. A subsequent page-fault that triggers a mapping of the page will happily serve the process with potentially stale code. What about ARM then, this bug should have seen greater exposure? Well ARM became immune to this flaw back in 2010, see commit c01778001a4f ("ARM: 6379/1: Assume new page cache pages have dirty D-cache"). My proposed fix moves the D-cache maintenance inside move_to_new_page to make it common for both cases. Link: http://lkml.kernel.org/r/20190315083502.11849-1-larper@axis.com Fixes: 97ee0524614 ("flush cache before installing new page at migraton") Signed-off-by: Lars Persson <larper@axis.com> Reviewed-by: Paul Burton <paul.burton@mips.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
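A minimal sketch of the idea (not the actual mm/migrate.c hunk): do the D-cache maintenance right after copying the page contents, so a later mapping cannot fetch stale data on I/D-incoherent CPUs. The wrapper name is hypothetical.

    #include <linux/highmem.h>

    static void copy_page_for_migration(struct page *newpage, struct page *page)
    {
            copy_highpage(newpage, page);   /* copy data and metadata source */
            flush_dcache_page(newpage);     /* write back dirty lines before any mapping */
    }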
2019-03-29 | drivers/block/zram/zram_drv.c: fix idle/writeback string compare | Minchan Kim | 1 | -26/+6
Makoto report a below KASAN error: zram does out-of-bounds read. Because strscpy copies from source up to count bytes unconditionally. It could cause out-of-bounds read on next object in slab. To prevent it, use strlcpy which checks source's length automatically. BUG: KASAN: slab-out-of-bounds in strscpy+0x68/0x154 Read of size 8 at addr ffffffc0c3495a00 by task system_server/1314 .. Call trace: strscpy+0x68/0x154 idle_store+0xc4/0x34c dev_attr_store+0x50/0x6c sysfs_kf_write+0x98/0xb4 kernfs_fop_write+0x198/0x260 __vfs_write+0x10c/0x338 vfs_write+0x114/0x238 SyS_write+0xc8/0x168 __sys_trace_return+0x0/0x4 Allocated by task 1314: __kmalloc+0x280/0x318 kernfs_fop_write+0xac/0x260 __vfs_write+0x10c/0x338 vfs_write+0x114/0x238 SyS_write+0xc8/0x168 __sys_trace_return+0x0/0x4 Freed by task 2855: kfree+0x138/0x630 kernfs_put_open_node+0x10c/0x124 kernfs_fop_release+0xd8/0x114 __fput+0x130/0x2a4 ____fput+0x1c/0x28 task_work_run+0x16c/0x1c8 do_notify_resume+0x2bc/0x107c work_pending+0x8/0x10 The buggy address belongs to the object at ffffffc0c3495a00 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 0 bytes inside of 128-byte region [ffffffc0c3495a00, ffffffc0c3495a80) The buggy address belongs to the page: page:ffffffbf030d2500 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x4000000000010200(slab|head) page dumped because: kasan: bad access detected Memory state around the buggy address: ffffffc0c3495900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffc0c3495980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffffffc0c3495a00: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffffffc0c3495a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffffc0c3495b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Link: http://lkml.kernel.org/r/20190319231911.145968-1-minchan@kernel.org Cc: <stable@vger.kernel.org> [5.0] Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Makoto Wu <makotowu@google.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate() | Qian Cai | 1 | -1/+2
Due to has_unmovable_pages() taking an incorrect irqsave flag instead of the isolation flag in set_migratetype_isolate(), there are issues with HWPOSION and error reporting where dump_page() is not called when there is an unmovable page. Link: http://lkml.kernel.org/r/20190320204941.53731-1-cai@lca.pw Fixes: d381c54760dc ("mm: only report isolation failures when offlining memory") Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> [5.0.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm/memory_hotplug.c: fix notification in offline error path | Qian Cai | 1 | -1/+1
When start_isolate_page_range() returned -EBUSY in __offline_pages(), it calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an uninitialized "arg". As the result, it triggers warnings below. Also, it is only necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE. page:ffffea0001200000 count:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x3fffe000001000(reserved) raw: 003fffe000001000 ffffea0001200008 ffffea0001200008 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: unmovable page WARNING: CPU: 25 PID: 1665 at mm/kasan/common.c:665 kasan_mem_notifier+0x34/0x23b CPU: 25 PID: 1665 Comm: bash Tainted: G W 5.0.0+ #94 Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20 10/25/2017 RIP: 0010:kasan_mem_notifier+0x34/0x23b RSP: 0018:ffff8883ec737890 EFLAGS: 00010206 RAX: 0000000000000246 RBX: ff10f0f4435f1000 RCX: f887a7a21af88000 RDX: dffffc0000000000 RSI: 0000000000000020 RDI: ffff8881f221af88 RBP: ffff8883ec737898 R08: ffff888000000000 R09: ffffffffb0bddcd0 R10: ffffed103e857088 R11: ffff8881f42b8443 R12: dffffc0000000000 R13: 00000000fffffff9 R14: dffffc0000000000 R15: 0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560fbd31d730 CR3: 00000004049c6003 CR4: 00000000001606a0 Call Trace: notifier_call_chain+0xbf/0x130 __blocking_notifier_call_chain+0x76/0xc0 blocking_notifier_call_chain+0x16/0x20 memory_notify+0x1b/0x20 __offline_pages+0x3e2/0x1210 offline_pages+0x11/0x20 memory_block_action+0x144/0x300 memory_subsys_offline+0xe5/0x170 device_offline+0x13f/0x1e0 state_store+0xeb/0x110 dev_attr_store+0x3f/0x70 sysfs_kf_write+0x104/0x150 kernfs_fop_write+0x25c/0x410 __vfs_write+0x66/0x120 vfs_write+0x15a/0x4f0 ksys_write+0xd2/0x1b0 __x64_sys_write+0x73/0xb0 do_syscall_64+0xeb/0xb78 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f14f75cc3b8 RSP: 002b:00007ffe84d01d68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f14f75cc3b8 RDX: 0000000000000008 RSI: 0000563f8e433d70 RDI: 0000000000000001 RBP: 0000563f8e433d70 R08: 000000000000000a R09: 00007ffe84d018f0 R10: 000000000000000a R11: 0000000000000246 R12: 00007f14f789e780 R13: 0000000000000008 R14: 00007f14f7899740 R15: 0000000000000008 Link: http://lkml.kernel.org/r/20190320204255.53571-1-cai@lca.pw Fixes: 7960509329c2 ("mm, memory_hotplug: print reason for the offlining failure") Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> [5.0.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | ptrace: take into account saved_sigmask in PTRACE{GET,SET}SIGMASK | Andrei Vagin | 2 | -2/+31
There are a few system calls (pselect, ppoll, etc) which replace a task's sigmask while they are running in kernel space. When a task calls one of these syscalls, the kernel saves the current sigmask in task->saved_sigmask and sets the syscall sigmask. On syscall-exit-stop, ptrace traps the task before the saved_sigmask is restored, so PTRACE_GETSIGMASK returns the syscall sigmask and PTRACE_SETSIGMASK does nothing, because its sigmask is replaced by saved_sigmask when the task returns to user space. This patch fixes the problem: PTRACE_GETSIGMASK returns saved_sigmask if it is set, and PTRACE_SETSIGMASK drops the TIF_RESTORE_SIGMASK flag. Link: http://lkml.kernel.org/r/20181120060616.6043-1-avagin@gmail.com Fixes: 29000caecbe8 ("ptrace: add ability to get/set signal-blocked mask") Signed-off-by: Andrei Vagin <avagin@gmail.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
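A sketch of the reporting logic only, under the assumption that the "sigmask was swapped by a syscall" test is made available as a boolean; the function name and parameter are hypothetical, not the kernel/ptrace.c code:

    static void ptrace_report_sigmask(struct task_struct *child, sigset_t *mask,
                                      bool syscall_swapped_mask)
    {
            if (syscall_swapped_mask)
                    *mask = child->saved_sigmask;  /* pselect/ppoll parked the real mask here */
            else
                    *mask = child->blocked;        /* ordinary case */
    }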
2019-03-29 | fs/proc/kcore.c: make kcore_modules static | YueHaibing | 1 | -1/+1
Fix sparse warning: fs/proc/kcore.c:591:19: warning: symbol 'kcore_modules' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20190320135417.13272-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Mukesh Ojha <mojha@codeaurora.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Omar Sandoval <osandov@fb.com> Cc: James Morse <james.morse@arm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | include/linux/list.h: fix list_is_first() kernel-doc | Randy Dunlap | 1 | -1/+1
Fix typo of kernel-doc parameter notation (there should be no space between '@' and the parameter name). Also fixes bogus kernel-doc notation output formatting. Link: http://lkml.kernel.org/r/ddce8b80-9a8a-d52d-3546-87b2211c089a@infradead.org Fixes: 70b44595eafe9 ("mm, compaction: use free lists to quickly locate a migration source") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm/debug.c: fix __dump_page when mapping->host is not set | Oscar Salvador | 1 | -1/+1
While debugging something, I added a dump_page() into do_swap_page(), and I got the splat from below. The issue happens when dereferencing mapping->host in __dump_page(): ... else if (mapping) { pr_warn("%ps ", mapping->a_ops); if (mapping->host->i_dentry.first) { struct dentry *dentry; dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias); pr_warn("name:\"%pd\" ", dentry); } } ... Swap address space does not contain an inode information, and so mapping->host equals NULL. Although the dump_page() call was added artificially into do_swap_page(), I am not sure if we can hit this from any other path, so it looks worth fixing it. We can easily do that by checking mapping->host first. Link: http://lkml.kernel.org/r/20190318072931.29094-1-osalvador@suse.de Fixes: 1c6fb1d89e73c ("mm: print more information about mapping in __dump_page") Signed-off-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
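The guarded version of the snippet quoted above then looks roughly like this (sketch of the described check, not a verbatim copy of the patch):

    else if (mapping) {
            pr_warn("%ps ", mapping->a_ops);
            /* swap address spaces have no backing inode, so check host first */
            if (mapping->host && mapping->host->i_dentry.first) {
                    struct dentry *dentry;

                    dentry = container_of(mapping->host->i_dentry.first,
                                          struct dentry, d_u.d_alias);
                    pr_warn("name:\"%pd\" ", dentry);
            }
    }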
2019-03-29 | mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified | Yang Shi | 1 | -7/+33
When MPOL_MF_STRICT was specified and an existing page was already on a node that does not follow the policy, mbind() should return -EIO. But commit 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") broke the rule. And commit c8633798497c ("mm: mempolicy: mbind and migrate_pages support thp migration") didn't return the correct value for THP mbind() too. If MPOL_MF_STRICT is set, ignore vma_migratable() to make sure it reaches queue_pages_to_pte_range() or queue_pages_pmd() to check if an existing page was already on a node that does not follow the policy. And, non-migratable vma may be used, return -EIO too if MPOL_MF_MOVE or MPOL_MF_MOVE_ALL was specified. Tested with https://github.com/metan-ucw/ltp/blob/master/testcases/kernel/syscalls/mbind/mbind02.c [akpm@linux-foundation.org: tweak code comment] Link: http://lkml.kernel.org/r/1553020556-38583-1-git-send-email-yang.shi@linux.alibaba.com Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Oscar Salvador <osalvador@suse.de> Reported-by: Cyril Hrubis <chrubis@suse.cz> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: Rafael Aquini <aquini@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Rientjes <rientjes@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | include/linux/hugetlb.h: convert to use vm_fault_t | Souptick Joarder | 1 | -1/+7
kbuild produces the below warning: tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 5453a3df2a5eb49bc24615d4cf0d66b2aae05e5f commit 3d3539018d2c ("mm: create the new vm_fault_t type") reproduce: # apt-get install sparse git checkout 3d3539018d2cbd12e5af4a132636ee7fd8d43ef0 make ARCH=x86_64 allmodconfig make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' >> mm/memory.c:3968:21: sparse: incorrect type in assignment (different >> base types) @@ expected restricted vm_fault_t [usertype] ret @@ >> got e] ret @@ mm/memory.c:3968:21: expected restricted vm_fault_t [usertype] ret mm/memory.c:3968:21: got int This patch converts to return vm_fault_t type for hugetlb_fault() when CONFIG_HUGETLB_PAGE=n. Regarding the sparse warning, Luc said: : This is the expected behaviour. The constant 0 is magic regarding bitwise : types but ({ ...; 0; }) is not, it is just an ordinary expression of type : 'int'. : : So, IMHO, Souptick's patch is the right thing to do. Link: http://lkml.kernel.org/r/20190318162604.GA31553@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
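A hedged sketch of the CONFIG_HUGETLB_PAGE=n stub after the conversion; the argument list mirrors the real hugetlb_fault(), and the point is simply that a typed 'return 0' replaces the bare 0 of the old ({ ...; 0; }) macro:

    static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
                                           struct vm_area_struct *vma,
                                           unsigned long address,
                                           unsigned int flags)
    {
            BUG();          /* must never be reached without hugetlb support */
            return 0;       /* typed as vm_fault_t, which keeps sparse quiet */
    }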
2019-03-29 | iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging | Nicolas Boichat | 1 | -4/+15
IOMMUs using ARMv7 short-descriptor format require page tables (level 1 and 2) to be allocated within the first 4GB of RAM, even on 64-bit systems. For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is defined (e.g. on arm64 platforms). For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32. Note that we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc, as this is not strictly necessary, and would cause a warning in mm/sl*b.c, as we did not update GFP_SLAB_BUG_MASK. Also, print an error when the physical address does not fit in 32-bit, to make debugging easier in the future. Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32") Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Huaisheng Ye <yehs1@lenovo.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sasha Levin <Alexander.Levin@microsoft.com> Cc: Tomasz Figa <tfiga@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yong Wu <yong.wu@mediatek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm: add support for kmem caches in DMA32 zone | Nicolas Boichat | 5 | -2/+12
Patch series "iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables", v6. This is a followup to the discussion in [1], [2]. IOMMUs using ARMv7 short-descriptor format require page tables (level 1 and 2) to be allocated within the first 4GB of RAM, even on 64-bit systems. For L1 tables that are bigger than a page, we can just use __get_free_pages with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA). For L2 tables that only take 1KB, it would be a waste to allocate a full page, so we considered 3 approaches: 1. This series, adding support for GFP_DMA32 slab caches. 2. genalloc, which requires pre-allocating the maximum number of L2 page tables (4096, so 4MB of memory). 3. page_frag, which is not very memory-efficient as it is unable to reuse freed fragments until the whole page is freed. [3] This series is the most memory-efficient approach. stable@ note: We confirmed that this is a regression, and IOMMU errors happen on 4.19 and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue most likely starts from commit ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek platforms (and maybe others?). [1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html [2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html [3] https://patchwork.codeaurora.org/patch/671639/ This patch (of 3): IOMMUs using ARMv7 short-descriptor format require page tables to be allocated within the first 4GB of RAM, even on 64-bit systems. On arm64, this is done by passing GFP_DMA32 flag to memory allocation functions. For IOMMU L2 tables that only take 1KB, it would be a waste to allocate a full page using get_free_pages, so we considered 3 approaches: 1. This patch, adding support for GFP_DMA32 slab caches. 2. genalloc, which requires pre-allocating the maximum number of L2 page tables (4096, so 4MB of memory). 3. page_frag, which is not very memory-efficient as it is unable to reuse freed fragments until the whole page is freed. This change makes it possible to create a custom cache in DMA32 zone using kmem_cache_create, then allocate memory using kmem_cache_alloc. We do not create a DMA32 kmalloc cache array, as there are currently no users of kmalloc(..., GFP_DMA32). These calls will continue to trigger a warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK. This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32 kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and unnecessary). 
Link: http://lkml.kernel.org/r/20181210011504.122604-2-drinkcat@chromium.org Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Sasha Levin <Alexander.Levin@microsoft.com> Cc: Huaisheng Ye <yehs1@lenovo.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Yong Wu <yong.wu@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Tomasz Figa <tfiga@google.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
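A hedged usage sketch of the new flag described above (cache name, object size and function names are made up for illustration): create the cache with SLAB_CACHE_DMA32 and then allocate from it without passing GFP_DMA32.

    #include <linux/slab.h>

    static struct kmem_cache *l2_table_cache;

    static int __init l2_table_cache_init(void)
    {
            /* objects from this cache come from ZONE_DMA32 */
            l2_table_cache = kmem_cache_create("l2-table", 1024, 1024,
                                               SLAB_CACHE_DMA32, NULL);
            return l2_table_cache ? 0 : -ENOMEM;
    }

    static void *l2_table_alloc(gfp_t gfp)
    {
            /* no GFP_DMA32 here: the cache flag already pins the zone, and
             * passing it would trip GFP_SLAB_BUG_MASK */
            return kmem_cache_zalloc(l2_table_cache, gfp);
    }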
2019-03-29 | ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock | Darrick J. Wong | 1 | -18/+24
ocfs2_reflink_inodes_lock() can swap the inode1/inode2 variables so that we always grab cluster locks in order of increasing inode number. Unfortunately, we forget to swap the inode record buffer head pointers when we've done this, which leads to incorrect bookkeeping when we're trying to make the two inodes have the same refcount tree. This has the effect of causing filesystem shutdowns if you're trying to reflink data from inode 100 into inode 97, where inode 100 already has a refcount tree attached and inode 97 doesn't. The reflink code decides to copy the refcount tree pointer from 100 to 97, but uses inode 97's inode record to open the tree root (which it doesn't have) and blows up. This issue causes filesystem shutdowns and metadata corruption! Link: http://lkml.kernel.org/r/20190312214910.GK20533@magnolia Fixes: 29ac8e856cb369 ("ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm/hotplug: fix offline undo_isolate_page_range() | Qian Cai | 5 | -34/+45
Commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") introduced move_pfn_range_to_zone() which calls memmap_init_zone() during onlining a memory block. memmap_init_zone() will reset pagetype flags and makes migrate type to be MOVABLE. However, in __offline_pages(), it also call undo_isolate_page_range() after offline_isolated_pages() to do the same thing. Due to commit 2ce13640b3f4 ("mm: __first_valid_page skip over offline pages") changed __first_valid_page() to skip offline pages, undo_isolate_page_range() here just waste CPU cycles looping around the offlining PFN range while doing nothing, because __first_valid_page() will return NULL as offline_isolated_pages() has already marked all memory sections within the pfn range as offline via offline_mem_sections(). Also, after calling the "useless" undo_isolate_page_range() here, it reaches the point of no returning by notifying MEM_OFFLINE. Those pages will be marked as MIGRATE_MOVABLE again once onlining. The only thing left to do is to decrease the number of isolated pageblocks zone counter which would make some paths of the page allocation slower that the above commit introduced. Even if alloc_contig_range() can be used to isolate 16GB-hugetlb pages on ppc64, an "int" should still be enough to represent the number of pageblocks there. Fix an incorrect comment along the way. [cai@lca.pw: v4] Link: http://lkml.kernel.org/r/20190314150641.59358-1-cai@lca.pw Link: http://lkml.kernel.org/r/20190313143133.46200-1-cai@lca.pw Fixes: 2ce13640b3f4 ("mm: __first_valid_page skip over offline pages") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [4.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | fs/open.c: allow opening only regular files during execve() | Tetsuo Handa | 1 | -0/+6
syzbot is hitting lockdep warning [1] due to trying to open a fifo during an execve() operation. But we don't need to open non regular files during an execve() operation, for all files which we will need are the executable file itself and the interpreter programs like /bin/sh and ld-linux.so.2 . Since the manpage for execve(2) says that execve() returns EACCES when the file or a script interpreter is not a regular file, and the manpage for uselib(2) says that uselib() can return EACCES, and we use FMODE_EXEC when opening for execve()/uselib(), we can bail out if a non regular file is requested with FMODE_EXEC set. Since this deadlock followed by khungtaskd warnings is trivially reproducible by a local unprivileged user, and syzbot's frequent crash due to this deadlock defers finding other bugs, let's workaround this deadlock until we get a chance to find a better solution. [1] https://syzkaller.appspot.com/bug?id=b5095bfec44ec84213bac54742a82483aad578ce Link: http://lkml.kernel.org/r/1552044017-7890-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot <syzbot+e93a80c1bb7c5c56e522461c149f8bf55eab1b2b@syzkaller.appspotmail.com> Fixes: 8924feff66f35fe2 ("splice: lift pipe_lock out of splice_to_pipe()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> [4.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
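A minimal sketch of the described check (the helper name is hypothetical; the real test sits in the fs/open.c open path): refuse FMODE_EXEC opens of anything that is not a regular file.

    #include <linux/fs.h>

    static int may_open_for_exec(const struct path *path, fmode_t mode)
    {
            struct inode *inode = d_backing_inode(path->dentry);

            /* execve()/uselib() only ever need regular files */
            if ((mode & FMODE_EXEC) && !S_ISREG(inode->i_mode))
                    return -EACCES;
            return 0;
    }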
2019-03-29 | mailmap: add Changbin Du | Changbin Du | 1 | -0/+2
Add my email in the mailmap file to have a consistent shortlog output. Link: http://lkml.kernel.org/r/20190308142103.4929-1-changbin.du@gmail.com Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 | mm/debug.c: add a cast to u64 for atomic64_read() | Qian Cai | 1 | -1/+1
atomic64_read() on ppc64le returns "long int", so fix the same way as commit d549f545e690 ("drm/virtio: use %llu format string form atomic64_t") by adding a cast to u64, which makes it work on all arches. In file included from ./include/linux/printk.h:7, from ./include/linux/kernel.h:15, from mm/debug.c:9: mm/debug.c: In function 'dump_mm': ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 19 has type 'long int' [-Wformat=] #define KERN_SOH "A" /* ASCII Start Of Header */ ^~~~~~ ./include/linux/kern_levels.h:8:20: note: in expansion of macro 'KERN_SOH' #define KERN_EMERG KERN_SOH "0" /* system is unusable */ ^~~~~~~~ ./include/linux/printk.h:297:9: note: in expansion of macro 'KERN_EMERG' printk(KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~~~~~ mm/debug.c:133:2: note: in expansion of macro 'pr_emerg' pr_emerg("mm %px mmap %px seqnum %llu task_size %lu" ^~~~~~~~ mm/debug.c:140:17: note: format string is defined here "pinned_vm %llx data_vm %lx exec_vm %lx stack_vm %lx" ~~~^ %lx Link: http://lkml.kernel.org/r/20190310183051.87303-1-cai@lca.pw Fixes: 70f8a3ca68d3 ("mm: make mm->pinned_vm an atomic64 counter") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
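A hedged sketch of the pattern; mm->pinned_vm is an atomic64_t after the commit named in the Fixes tag, and the cast makes the %llx format correct on every architecture:

    #include <linux/mm_types.h>
    #include <linux/printk.h>

    static void dump_pinned_vm(struct mm_struct *mm)
    {
            /* atomic64_read() may return 'long' or 'long long' depending on
             * the arch, so cast to u64 before handing it to %llx */
            pr_emerg("pinned_vm %llx\n", (u64)atomic64_read(&mm->pinned_vm));
    }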
2019-03-29 | mm/memory.c: fix modifying of page protection by insert_pfn() | Jan Kara | 1 | -5/+6
Aneesh has reported that PPC triggers the following warning when exercising DAX code: IP set_pte_at+0x3c/0x190 LR insert_pfn+0x208/0x280 Call Trace: insert_pfn+0x68/0x280 dax_iomap_pte_fault.isra.7+0x734/0xa40 __xfs_filemap_fault+0x280/0x2d0 do_wp_page+0x48c/0xa40 __handle_mm_fault+0x8d0/0x1fd0 handle_mm_fault+0x140/0x250 __do_page_fault+0x300/0xd60 handle_page_fault+0x18 That is the WARN_ON in set_pte_at(), which is VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep)); The problem is that on some architectures set_pte_at() cannot cope with a situation where there is already some (different) valid entry present. Use ptep_set_access_flags() instead to modify the pfn, since it is built to deal with modifying an existing PTE. Link: http://lkml.kernel.org/r/20190311084537.16029-1-jack@suse.cz Fixes: b2770da64254 "mm: add vm_insert_mixed_mkwrite()" Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Chandan Rajendra <chandan@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
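A sketch of the replacement pattern under the assumptions above (the wrapper name is hypothetical; the generic ptep_set_access_flags() takes the VMA, address, PTE pointer, new entry and a dirty flag):

    static void upgrade_existing_pte(struct vm_area_struct *vma,
                                     unsigned long addr, pte_t *pte)
    {
            pte_t entry = pte_mkyoung(*pte);

            entry = maybe_mkwrite(pte_mkdirty(entry), vma);
            /* unlike set_pte_at(), this is defined for an already-valid PTE */
            if (ptep_set_access_flags(vma, addr, pte, entry, 1))
                    update_mmu_cache(vma, addr, pte);
    }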
2019-03-29 | kasan: fix variable 'tag' set but not used warning | Qian Cai | 1 | -1/+4
set_tag() compiles away when CONFIG_KASAN_SW_TAGS=n, so make arch_kasan_set_tag() a static inline function to fix warnings below. mm/kasan/common.c: In function '__kasan_kmalloc': mm/kasan/common.c:475:5: warning: variable 'tag' set but not used [-Wunused-but-set-variable] u8 tag; ^~~ Link: http://lkml.kernel.org/r/20190307185244.54648-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
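A hedged sketch of the shape of the fix: a static inline consumes its arguments even when it does nothing, unlike a macro that compiles away, so 'tag' no longer looks set-but-unused under CONFIG_KASAN_SW_TAGS=n.

    static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
    {
            /* no software tagging: the tag is accepted and ignored */
            return addr;
    }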
2019-03-29 | staging: vt6655: Remove vif check from vnt_interrupt | Malcolm Priestley | 1 | -2/+1
A check for vif is made in vnt_interrupt_work. There is a small chance of leaving interrupt disabled while vif is NULL and the work hasn't been scheduled. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> CC: stable@vger.kernel.org # v4.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-29 | staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir() | Gao Xiang | 1 | -20/+25
After commit 419d6efc50e9, the kernel can no longer be crashed in the namei path. However, a corrupted nameoff can still do harm during readdir in scenarios without dm-verity as well. Fix it now. Fixes: 3aa8ec716e52 ("staging: erofs: add directory operations") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-29 | iommu/amd: Reserve exclusion range in iova-domain | Joerg Roedel | 3 | -6/+12
If a device has an exclusion range specified in the IVRS table, this region needs to be reserved in the iova-domain of that device. This hasn't happened until now and can cause data corruption on data transferred with these devices. Treat exclusion ranges as reserved regions in the iommu-core to fix the problem. Fixes: be2a022c0dd0 ('x86, AMD IOMMU: add functions to parse IOMMU memory mapping requirements for devices') Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Gary R Hook <gary.hook@amd.com>
2019-03-29 | kconfig/[mn]conf: handle backspace (^H) key | Changbin Du | 3 | -3/+5
Backspace is not working on some terminal emulators which do not send the key code defined by terminfo. Terminals either send '^H' (8) or '^?' (127). But currently only '^?' is handled. Let's also handle '^H' for those terminals. Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
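A tiny illustration of the accepted key codes (the helper is hypothetical; mconf/nconf additionally match ncurses' KEY_BACKSPACE):

    #include <stdbool.h>

    static bool is_backspace(int key)
    {
            return key == 8 ||      /* ^H, sent by some terminal emulators */
                   key == 127;      /* ^?, what most terminfo entries define */
    }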
2019-03-29 | x86/realmode: Make set_real_mode_mem() static inline | Matteo Croce | 3 | -10/+7
Remove the unused @size argument and move it into a header file, so it can be inlined. [ bp: Massage. ] Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-efi <linux-efi@vger.kernel.org> Cc: platform-driver-x86@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190328114233.27835-1-mcroce@redhat.com
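A hedged sketch of the resulting header version, assuming the usual real_mode_header global from asm/realmode.h; the unused size parameter is gone and the body is trivial enough to inline:

    static inline void set_real_mode_mem(phys_addr_t mem)
    {
            /* the trampoline header lives at the start of the reserved block */
            real_mode_header = (struct real_mode_header *)__va(mem);
    }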