wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2023-07-01	mm: Update do_vmi_align_munmap() return semantics	Liam R. Howlett	3	-69/+57
	Since do_vmi_align_munmap() will always honor the downgrade request on the success, the callers no longer have to deal with confusing return codes. Since all callers that request downgrade actually want the lock to be dropped, change the downgrade to an unlock request. Note that the lock still needs to be held in read mode during the page table clean up to avoid races with a map request. Update do_vmi_align_munmap() to return 0 for success. Clean up the callers and comments to always expect the unlock to be honored on the success path. The error path will always leave the lock untouched. As part of the cleanup, the wrapper function do_vmi_munmap() and callers to the wrapper are also updated. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-mm/20230629191414.1215929-1-willy@infradead.org/ Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-01	mm: Always downgrade mmap_lock if requested	Matthew Wilcox (Oracle)	1	-13/+2
	Now that stack growth must always hold the mmap_lock for write, we can always downgrade the mmap_lock to read and safely unmap pages from the page table, even if we're next to a stack. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-01	xtensa: fix lock_mm_and_find_vma in case VMA not found	Max Filippov	1	-1/+6
	MMU version of lock_mm_and_find_vma releases the mm lock before returning when VMA is not found. Do the same in noMMU version. This fixes hang on an attempt to handle protection fault. Fixes: d85a143b69ab ("xtensa: fix NOMMU build with lock_mm_and_find_vma() conversion") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-30	xtensa: fix NOMMU build with lock_mm_and_find_vma() conversion	Linus Torvalds	2	-2/+14
	It turns out that xtensa has a really odd configuration situation: you can do a no-MMU config, but still have the page fault code enabled. Which doesn't sound all that sensible, but it turns out that xtensa can have protection faults even without the MMU, and we have this: config PFAULT bool "Handle protection faults" if EXPERT && !MMU default y help Handle protection faults. MMU configurations must enable it. noMMU configurations may disable it if used memory map never generates protection faults or faults are always fatal. If unsure, say Y. which completely violated my expectations of the page fault handling. End result: Guenter reports that the xtensa no-MMU builds all fail with arch/xtensa/mm/fault.c: In function ‘do_page_fault’: arch/xtensa/mm/fault.c:133:8: error: implicit declaration of function ‘lock_mm_and_find_vma’ because I never exposed the new lock_mm_and_find_vma() function for the no-MMU case. Doing so is simple enough, and fixes the problem. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: a050ba1e7422 ("mm/fault: convert remaining simple cases to lock_mm_and_find_vma()") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-30	sysctl: set variable sysctl_mount_point storage-class-specifier to static	Tom Rix	1	-1/+1
	smatch reports fs/proc/proc_sysctl.c:32:18: warning: symbol 'sysctl_mount_point' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-30	pid: Replace struct pid 1-element array with flex-array	Kees Cook	3	-4/+7
	For pid namespaces, struct pid uses a dynamically sized array member, "numbers". This was implemented using the ancient 1-element fake flexible array, which has been deprecated for decades. Replace it with a C99 flexible array, refactor the array size calculations to use struct_size(), and address elements via indexes. Note that the static initializer (which defines a single element) works as-is, and requires no special handling. Without this, CONFIG_UBSAN_BOUNDS (and potentially CONFIG_FORTIFY_SOURCE) will trigger bounds checks: https://lore.kernel.org/lkml/20230517-bushaltestelle-super-e223978c1ba6@brauner Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Xu <jeffxu@google.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Jeff Xu <jeffxu@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reported-by: syzbot+ac3b41786a2d0565b6d5@syzkaller.appspotmail.com [brauner: dropped unrelated changes and remove 0 with NULL cast] Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	csky: fix up lock_mm_and_find_vma() conversion	Linus Torvalds	1	-1/+1
	As already mentioned in my merge message for the 'expand-stack' branch, we have something like 24 different versions of the page fault path for all our different architectures, all just _slightly_ different due to various historical reasons (usually related to exactly when they branched off the original i386 version, and the details of the other architectures they had in their history). And a few of them had some silly mistake in the conversion. Most of the architectures call the faulting address 'address' in the fault path. But not all. Some just call it 'addr'. And if you end up doing a bit too much copy-and-paste, you end up with the wrong version in the places that do it differently. In this case it was csky. Fixes: a050ba1e7422 ("mm/fault: convert remaining simple cases to lock_mm_and_find_vma()") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	parisc: fix expand_stack() conversion	Linus Torvalds	1	-1/+1
	In commit 8d7071af8907 ("mm: always expand the stack with the mmap write lock held") I tried to deal with the remaining odd page fault handling cases. The oddest one is ia64, which has stacks that grow both up and down. And because ia64 was _so_ odd, I asked people to verify the end result. But a close second oddity is parisc, which is the only one that has a main stack growing up (our "CONFIG_STACK_GROWSUP" config option). But it looked obvious enough that I didn't worry about it. I should have worried a bit more. Not because it was particularly complex, but because I just used the wrong variable name. The previous vma isn't called "prev", it's called "prev_vma". Blush. Fixes: 8d7071af8907 ("mm: always expand the stack with the mmap write lock held") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	sparc32: fix lock_mm_and_find_vma() conversion	Linus Torvalds	1	-1/+1
	The sparc32 conversion to lock_mm_and_find_vma() in commit a050ba1e7422 ("mm/fault: convert remaining simple cases to lock_mm_and_find_vma()") missed the fact that we didn't actually have a 'regs' pointer available in the 'force_user_fault()' case. It's there in the regular page fault path ("do_sparc_fault()"), but not the window underflow/overflow paths. Which is all fine - we can just pass in a NULL pointer. The register state is only used to avoid deadlock with kernel faults, which is not the case for any of these register window faults. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: a050ba1e7422 ("mm/fault: convert remaining simple cases to lock_mm_and_find_vma()") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	objtool: Remove btrfs_assertfail() from the noreturn exceptions list	Ingo Molnar	1	-1/+0
	The objtool merge in commit 6f612579be9d ("Merge tag 'objtool-core ...") generated a semantic conflict that was not resolved. The btrfs_assertfail() entry was removed from the noreturn list in commit b831306b3b7d ("btrfs: print assertion failure report and stack trace from the same line") because btrfs_assertfail() was changed from a noreturn function into a macro. The noreturn list was then moved from check.c to noreturns.h in commit 6245ce4ab670 ("objtool: Move noreturn function list to separate file"), and should be removed from that post-merge as well. Do it explicitly. Cc: David Sterba <dsterba@suse.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	sysctl: fix unused proc_cap_handler() function warning	Arnd Bergmann	1	-1/+1
	Since usermodehelper_table() is marked static now, we get a warning about it being unused when SYSCTL is disabled: kernel/umh.c:497:12: error: 'proc_cap_handler' defined but not used [-Werror=unused-function] Just move it inside of the same #ifdef. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Fixes: 861dc0b46432 ("sysctl: move umh sysctl registration to its own file") Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested [mcgrof: adjust new commit ID for Fixes tag] Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-29	gup: avoid stack expansion warning for known-good case	Linus Torvalds	1	-0/+4
	In commit a425ac5365f6 ("gup: add warning if some caller would seem to want stack expansion") I added a temporary warning to catch any strange GUP users that would be impacted by the fact that GUP no longer extends the stack. But it turns out that the warning is most easily triggered through __access_remote_vm(), that already knows to expand the stack - it just does it after calling GUP. So the warning is easy to trigger by just running gdb (or similar) and accessing things remotely under the stack. This just adds a temporary extra "expand stack early" to avoid the warning for the already converted case - not because the warning is bad, but because getting the warning for this known good case would then hide any subsequent warnings for any actually interesting cases. Let's try to remember to revert this change when we remove the warnings. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	mm/khugepaged: fix regression in collapse_file()	Hugh Dickins	1	-4/+3
	There is no xas_pause(&xas) in collapse_file()'s main loop, at the points where it does xas_unlock_irq(&xas) and then continues. That would explain why, once two weeks ago and twice yesterday, I have hit the VM_BUG_ON_PAGE(page != xas_load(&xas), page) since "mm/khugepaged: fix iteration in collapse_file" removed the xas_set(&xas, index) just before it: xas.xa_node could be left pointing to a stale node, if there was concurrent activity on the file which transformed its xarray. I tried inserting xas_pause()s, but then even bootup crashed on that VM_BUG_ON_PAGE(): there appears to be a subtle "nextness" implicit in xas_pause(). xas_next() and xas_pause() are good for use in simple loops, but not in this one: xas_set() worked well until now, so use xas_set(&xas, index) explicitly at the head of the loop; and change that VM_BUG_ON_PAGE() not to need its own xas_set(), and not to interfere with the xa_state (which would probably stop the crashes from xas_pause(), but I trust that less). The user-visible effects of this bug (if VM_BUG_ONs are configured out) would be data loss and data leak - potentially - though in practice I expect it is more likely that a subsequent check (e.g. on mapping or on nr_none) would notice an inconsistency, and just abandon the collapse. Link: https://lore.kernel.org/linux-mm/f18e4b64-3f88-a8ab-56cc-d1f5f9c58d4@google.com/ Fixes: c8a8f3b4a95a ("mm/khugepaged: fix iteration in collapse_file") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Stevens <stevensd@chromium.org> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29	cifs: new dynamic tracepoint to track ses not found errors	Shyam Prasad N	3	-0/+23
	It is perfectly valid to not find session not found errors when a reconnect of a session happens when requests for the same session are happening in parallel. We had these log messages as VFS logs. My last change dumped these logs as FYI logs. This change just creates a new dynamic tracepoint to capture events of this type, just in case it is useful while debugging issues in the future. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-29	cifs: log session id when a matching ses is not found	Shyam Prasad N	2	-4/+4
	We do not log the session id in crypt_setup when a matching session is not found. Printing the session id helps debugging here. This change does just that. This change also changes this log to FYI, since it is normal to see then during a reconnect. Doing the same for a similar log in case of signed connections. The plan is to have a tracepoint for this event, so that we will be able to see this event if need be. That will be done as another change. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-29	LoongArch: Remove five DIE_* definitions in kdebug.h	Tiezhu Yang	1	-5/+0
	For now, DIE_PAGE_FAULT, DIE_BREAK, DIE_SSTEPBP, DIE_UPROBE and DIE_UPROBE_XOL are not used by any code, remove them. Tested-by: Jeff Xie <xiehuan09@gmail.com> Suggested-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add uprobes support	Tiezhu Yang	5	-5/+197
	Uprobes is the user-space counterpart to kprobes, this patch adds uprobes support for LoongArch. Here is a simple example with CONFIG_UPROBE_EVENTS=y: # cat test.c #include <stdio.h> int add(int a, int b) { return a + b; } int main() { return add(2, 7); } # gcc test.c -o /tmp/test # nm /tmp/test \| grep add 0000000120004194 T add # cd /sys/kernel/debug/tracing # echo > uprobe_events # echo "p:myuprobe /tmp/test:0x4194 %r4 %r5" > uprobe_events # echo "r:myuretprobe /tmp/test:0x4194 %r4" >> uprobe_events # echo 1 > events/uprobes/enable # echo 1 > tracing_on # /tmp/test # cat trace ... # TASK-PID CPU# \|\|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\|\| \| \| test-1060 [001] DNZff 1015.770620: myuprobe: (0x120004194) arg1=0x2 arg2=0x7 test-1060 [001] DNZff 1015.770930: myuretprobe: (0x1200041f0 <- 0x120004194) arg1=0x9 Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Use larch_insn_gen_break() for kprobes	Tiezhu Yang	2	-20/+15
	For now, we can use larch_insn_gen_break() to define KPROBE_BP_INSN and KPROBE_SSTEPBP_INSN. Because larch_insn_gen_break() returns instruction word, define kprobe_opcode_t as u32, then do some small changes related with type conversion, no functional change intended. Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add larch_insn_gen_break() to generate break insns	Tiezhu Yang	2	-0/+26
	There exist various break insns such as BRK_KPROBE_BP, BRK_KPROBE_SSTEPBP, BRK_UPROBE_BP and BRK_UPROBE_XOLBP, add larch_insn_gen_break() to generate break insns simpler, this is preparation for later patch. Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Check for AMO instructions in insns_not_supported()	Tiezhu Yang	2	-0/+32
	Like llsc instructions, the atomic memory access instructions shouldn't be supported for probing, so check for them in insns_not_supported(). Closes: https://lore.kernel.org/all/SY4P282MB351877A70A0333C790FE85A5C09C9@SY4P282MB3518.AUSP282.PROD.OUTLOOK.COM/ Tested-by: Jeff Xie <xiehuan09@gmail.com> Reported-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Move three functions from kprobes.c to inst.c	Tiezhu Yang	3	-44/+45
	The three functions insns_not_supported(), insns_need_simulation() and arch_simulate_insn() will be used for uprobes, move them from kprobes.c to inst.c, this is preparation for later patch, no functionality change. Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Replace kretprobe with rethook	Haoran Jiang	7	-28/+44
	This is an adaptation of commit f3a112c0c40d ("x86,rethook,kprobes: Replace kretprobe with rethook on x86") and commit b57c2f124098 ("riscv: add riscv rethook implementation") to LoongArch. Mainly refer to commit b57c2f124098 ("riscv: add riscv rethook implementation"). Replaces the kretprobe code with rethook on LoongArch. With this patch, kretprobe on LoongArch uses the rethook instead of kretprobe specific trampoline code. Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add jump-label implementation	Youling Tang	5	-1/+77
	Add support for jump labels based on the ARM64 version. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Select HAVE_DEBUG_KMEMLEAK to support kmemleak	Tiezhu Yang	2	-1/+2
	We can see that DEBUG_KMEMLEAK depends on HAVE_DEBUG_KMEMLEAK after commit b69ec42b1b19 ("Kconfig: clean up the long arch list for the DEBUG_KMEMLEAK config option"), just select HAVE_DEBUG_KMEMLEAK to support kmemleak on LoongArch. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Export some arch-specific pm interfaces	Yinbo Zhu	3	-6/+16
	Some PMC (Power Management Controllers) need to support DTS and will use the suspend interfaces thus this patch was to export such interfaces for their use. Signed-off-by: Yinbo Zhu <zhuyinbo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Introduce hardware page table walker	Huacai Chen	10	-7/+56
	Loongson-3A6000 and newer processors have hardware page table walker (PTW) support. PTW can handle all fastpaths of TLBI/TLBL/TLBS/TLBM exceptions by hardware, software only need to handle slowpaths (page faults). BTW, PTW doesn't append _PAGE_MODIFIED for page table entries, so we change pmd_dirty() and pte_dirty() to also check _PAGE_DIRTY for the "dirty" attribute. Signed-off-by: Liang Gao <gaoliang@loongson.cn> Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Support dbar with different hints	Huacai Chen	6	-81/+78
	Traditionally, LoongArch uses "dbar 0" (full completion barrier) for everything. But the full completion barrier is a performance killer, so Loongson-3A6000 and newer processors have made finer granularity hints available: Bit4: ordering or completion (0: completion, 1: ordering) Bit3: barrier for previous read (0: true, 1: false) Bit2: barrier for previous write (0: true, 1: false) Bit1: barrier for succeeding read (0: true, 1: false) Bit0: barrier for succeeding write (0: true, 1: false) Hint 0x700: barrier for "read after read" from the same address, which is needed by LL-SC loops on old models (dbar 0x700 behaves the same as nop if such reordering is disabled on new models). This patch makes use of the various new hints for different kinds of memory barriers. It brings performance improvements on Loongson-3A6000 series, while not affecting the existing models because all variants are treated as 'dbar 0' there. Why override queued_spin_unlock()? After commit 01e3b958efe85a26d9b ("drivers: Remove explicit invocations of mmiowb()") we need a completion barrier in queued_spin_unlock(), but the generic implementation use smp_store_release() which only provide an ordering barrier. Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add SMT (Simultaneous Multi-Threading) support	Huacai Chen	7	-15/+62
	Loongson-3A6000 has SMT (Simultaneous Multi-Threading) support, each physical core has two logical cores (threads). This patch add SMT probe and scheduler support via ACPI PPTT. If SCHED_SMT enabled, Loongson-3A6000 is treated as 4 cores, 8 threads; If SCHED_SMT disabled, Loongson-3A6000 is treated as 8 cores, 8 threads. Remove smp_num_siblings to support HMP (Heterogeneous Multi-Processing). Signed-off-by: Liupu Wang <wangliupu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add vector extensions support	Huacai Chen	11	-23/+1452
	Add LoongArch's vector extensions support, which including 128bit LSX (i.e., Loongson SIMD eXtension) and 256bit LASX (i.e., Loongson Advanced SIMD eXtension). Linux kernel doesn't use vector itself, it only handle exceptions and context save/restore. So it only needs a subset of these instructions: * Vector load/store: vld vst vldx vstx xvld xvst xvldx xvstx * 8bit-elements move: vpickve2gr.b xvpickve2gr.b vinsgr2vr.b xvinsgr2vr.b * 16bit-elements move: vpickve2gr.h xvpickve2gr.h vinsgr2vr.h xvinsgr2vr.h * 32bit-elements move: vpickve2gr.w xvpickve2gr.w vinsgr2vr.w xvinsgr2vr.w * 64bit-elements move: vpickve2gr.d xvpickve2gr.d vinsgr2vr.d xvinsgr2vr.d * Elements permute: vpermi.w vpermi.d xvpermi.w xvpermi.d xvpermi.q Introduce AS_HAS_LSX_EXTENSION and AS_HAS_LASX_EXTENSION to avoid non- vector toolchains complains unsupported instructions. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add support to clone a time namespace	Tiezhu Yang	7	-24/+121
	We can see that "Time namespaces are not supported" on LoongArch: (1) clone3 test # cd tools/testing/selftests/clone3 && make && ./clone3 ... # Time namespaces are not supported ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME # Totals: pass:17 fail:0 xfail:0 xpass:0 skip:1 error:0 (2) timens test # cd tools/testing/selftests/timens && make && ./timens ... 1..0 # SKIP Time namespaces are not supported On LoongArch the current kernel does not support CONFIG_TIME_NS which depends on GENERIC_VDSO_TIME_NS, select GENERIC_VDSO_TIME_NS to enable CONFIG_TIME_NS to build kernel/time/namespace.c. Additionally, it needs to define some arch-dependent functions for the timens, such as __arch_get_timens_vdso_data(), arch_get_vdso_data() and vdso_join_timens(). At the same time, modify the layout of vvar to use one page size for generic vdso data, expand another page size for timens vdso data and assign LOONGARCH_VDSO_DATA_SIZE (maybe exceeds a page size if expand in the future) for loongarch vdso data, at last add the callback function vvar_fault() and modify stack_top(). With this patch under CONFIG_TIME_NS: (1) clone3 test # cd tools/testing/selftests/clone3 && make && ./clone3 ... ok 18 [739] Result (0) matches expectation (0) # Totals: pass:18 fail:0 xfail:0 xpass:0 skip:0 error:0 (2) timens test # cd tools/testing/selftests/timens && make && ./timens ... # Totals: pass:10 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	Makefile: Add loongarch target flag for Clang compilation	WANG Xuerui	1	-0/+1
	The LoongArch kernel is 64-bit and built with the soft-float ABI, hence the loongarch64-linux-gnusf target. (The "libc" part can affect the codegen of libcalls: other arches do not use a bare-metal target, and currently the only fully supported libc on LoongArch is glibc anyway.) See: https://lore.kernel.org/loongarch/CAKwvOdnimxv8oJ4mVY74zqtt1x7KTMrWvn2_T9x22SFDbU6rHQ@mail.gmail.com/ Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Mark Clang LTO as working	WANG Xuerui	1	-0/+2
	Confirmed working with QEMU system emulation. Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation	WANG Xuerui	1	-1/+1
	This is a port of commit 08f6554ff90e ("mips: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation") to arch/loongarch, for fixing cross-compilation of Linux/LoongArch with Clang, where previously the `--target` flag would no longer be present for the CHECKFLAGS cc invocation leading to build failure. Reported-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1787#issuecomment-1608306002 Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: vDSO: Use CLANG_FLAGS instead of filtering out '--target='	WANG Xuerui	1	-4/+1
	This is a port of commit 76d7fff22be3e ("MIPS: VDSO: Use CLANG_FLAGS instead of filtering out '--target='") to arch/loongarch, for fixing cross-compilation with Clang. Reported-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1787#issuecomment-1608306002 Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Tweak CFLAGS for Clang compatibility	WANG Xuerui	3	-10/+19
	Now the arch code is mostly ready for LLVM/Clang consumption, it is time to re-organize the CFLAGS a little to actually enable the LLVM build. Namely, all -G0 switches from CFLAGS are removed, and -mexplicit-relocs and -mdirect-extern-access are now wrapped with cc-option (with the related asm/percpu.h definition guarded against toolchain combos that are known to not work). A build with !RELOCATABLE && !MODULE is confirmed working within a QEMU environment; support for the two features are currently blocked on LLVM/Clang, and will come later. Why -G0 can be removed: In GCC, -G stands for "small data threshold", that instructs the compiler to put data smaller than the specified threshold in a dedicated "small data" section (called .sdata on LoongArch and several other arches). However, benefiting from this would require ABI cooperation, which is not the case for LoongArch; and current GCC behave the same whether -G0 (equal to disabling this optimization) is given or not. So, remove -G0 from CFLAGS altogether for one less thing to care about. This also benefits LLVM/Clang compatibility where the -G switch is not supported. Why -mexplicit-relocs can now be conditionally applied without regressions: Originally -mexplicit-relocs is unconditionally added to CFLAGS in case of CONFIG_AS_HAS_EXPLICIT_RELOCS, because not having it (i.e. old GCC + new binutils) would not work: modules will have R_LARCH_ABS_* relocs inside, but given the rarity of such toolchain combo in the wild, it may not be worthwhile to support it, so support for such relocs in modules were not added back when explicit relocs support was upstreamed, and -mexplicit-relocs is unconditionally added to fail the build early. Now that Clang compatibility is desired, given Clang is behaving like -mexplicit-relocs from day one but without support for the CLI flag, we must ensure the flag is not passed in case of Clang. However, explicit compiler flavor checks can be more brittle than feature detection: in this case what actually matters is support for __attribute__((model)) when building modules. Given neither older GCC nor current Clang support this attribute, probing for the attribute support and #error'ing out would allow proper UX without checking for Clang, and also automatically work when Clang support for the attribute is to be added in the future. Why -mdirect-extern-access is now conditionally applied: This is actually a nice-to-have optimization that can reduce GOT accesses, but not having it is harmless either. Because Clang does not support the option currently, but might do so in the future, conditional application via cc-option ensures compatibility with both current and future Clang versions. Suggested-by: Xi Ruoyao <xry111@xry111.site> # cc-option changes Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Simplify the invtlb wrappers	WANG Xuerui	1	-24/+19
	The invtlb instruction has been supported by upstream LoongArch toolchains from day one, so ditch the raw opcode trickery and just use plain inline asm for it. While at it, also make the invtlb asm statements barriers, for proper modeling of the side effects. The functions are also marked as __always_inline instead of just "inline", because they cannot work at all if not inlined: the op argument will not be compile-time const in that case, thus failing to satisfy the "i" constraint. The signature of the other more specific invtlb wrappers contain unused arguments right now, but these are not removed right away in order for the patch to be focused. In the meantime, assertions are added to ensure no accidental misuse happens before the refactor. (The more specific wrappers cannot re-use the generic invtlb wrapper, because the ISA manual says $zero shall be used in case a particular op does not take the respective argument: re-using the generic wrapper would mean losing control over the register usage.) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Make the CPUCFG&CSR ops simple aliases of compiler built-ins	WANG Xuerui	3	-56/+15
	In addition to less visual clutter, this also makes Clang happy regarding the const-ness of arguments. In the original approach, all Clang gets to see is the incoming arguments whose const-ness cannot be proven without first being inlined; so Clang errors out here while GCC is fine. While at it, tweak several printk format strings because the return type of csr_read64 becomes effectively unsigned long, instead of unsigned long long. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Prepare for assemblers with proper FCSR class support	WANG Xuerui	3	-1/+18
	The GNU assembler (as of 2.40) mis-treats FCSR operands as GPRs, but the LLVM IAS does not. Probe for this and refer to FCSRs as "$fcsrNN" if support is present. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: extable: Also recognize ABI names of registers	WANG Rui	1	-0/+30
	When the kernel is compiled with LLVM, the register names being handled during exception fixup building are ABI names instead of bare $rNN style. Add mapping for the ABI names for LLVM compatibility. Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Calculate various sizes in the linker script	WANG Rui	3	-7/+16
	Taking the address delta between symbols in different sections is not supported by the LLVM IAS. Instead, do this in the linker script, so the same data can be properly referenced in assembly. Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> [chenhuacai: Fix build with !CONFIG_EFI_STUB] Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Add guard for the larch_insn_gen_xxx functions	WANG Rui	3	-5/+34
	Add guard for the larch_insn_gen_xxx functions to verify whether the immediate operand is within the acceptable range. Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Delete unnecessary debugfs checking	Dan Carpenter	1	-2/+0
	Debugfs functions are not supposed to be checked for errors. This is sort of unusual but it is described in the comments for the debugfs_create_dir() function. Also debugfs_create_dir() can never return NULL. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29	LoongArch: Set CPU#0 as the io master for FDT	Huacai Chen	1	-0/+1
	ACPI systems set io masters by parsing ACPI MADT, FDT systems have no MADT so we explicitly set CPU#0 as the io master. Otherwise CPU#0 will be considered as hotpluggable. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-28	ksmbd: avoid field overflow warning	Arnd Bergmann	1	-1/+1
	clang warns about a possible field overflow in a memcpy: In file included from fs/smb/server/smb_common.c:7: include/linux/fortify-string.h:583:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning] __write_overflow_field(p_size_field, size); It appears to interpret the "&out[baselen + 4]" as referring to a single byte of the character array, while the equivalen "out + baselen + 4" is seen as an offset into the array. I don't see that kind of warning elsewhere, so just go with the simple rework. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-28	modules: catch concurrent module loads, treat them as idempotent	Linus Torvalds	1	-2/+71
	This is the new-and-improved attempt at avoiding huge memory load spikes when the user space boot sequence tries to load hundreds (or even thousands) of redundant duplicate modules in parallel. See commit 9828ed3f695a ("module: error out early on concurrent load of the same module file") for background and an earlier failed attempt that was reverted. That earlier attempt just said "concurrently loading the same module is silly, just open the module file exclusively and return -ETXTBSY if somebody else is already loading it". While it is true that concurrent module loads of the same module is silly, the reason that earlier attempt then failed was that the concurrently loaded module would often be a prerequisite for another module. Thus failing to load the prerequisite would then cause cascading failures of the other modules, rather than just short-circuiting that one unnecessary module load. At the same time, we still really don't want to load the contents of the same module file hundreds of times, only to then wait for an eventually successful load, and have everybody else return -EEXIST. As a result, this takes another approach, and treats concurrent module loads from the same file as "idempotent" in the inode. So if one module load is ongoing, we don't start a new one, but instead just wait for the first one to complete and return the same return value as it did. So unlike the first attempt, this does not return early: the intent is not to speed up the boot, but to avoid a thundering herd problem in allocating memory (both physical and virtual) for a module more than once. Also note that this does change behavior: it used to be that when you had concurrent loads, you'd have one "winner" that would return success, and everybody else would return -EEXIST. In contrast, this idempotent logic goes all Oprah on the problem, and says "You are a winner! And you are a winner! We are ALL winners". But since there's no possible actual real semantic difference between "you loaded the module" and "somebody else already loaded the module", this is more of a feel-good change than an actual honest-to-goodness semantic change. Of course, any true Johnny-come-latelies that don't get caught in the concurrency filter will still return -EEXIST. It's no different from not even getting a seat at an Oprah taping. That's life. See the long thread on the kernel mailing list about this all, which includes some numbers for memory use before and after the patch. Link: https://lore.kernel.org/lkml/20230524213620.3509138-1-mcgrof@kernel.org/ Reviewed-by: Johan Hovold <johan@kernel.org> Tested-by: Johan Hovold <johan@kernel.org> Tested-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Rudi Heitbaum <rudi@heitbaum..com> Tested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-28	module: split up 'finit_module()' into init_module_from_file() helper	Linus Torvalds	1	-15/+27
	This will simplify the next step, where we can then key off the inode to do one idempotent module load. Let's do the obvious re-organization in one step, and then the new code in another. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-28	smb: client: improve DFS mount check	Paulo Alcantara	1	-2/+3
	Some servers may return error codes from REQ_GET_DFS_REFERRAL requests that are unexpected by the client, so to make it easier, assume non-DFS mounts when the client can't get the initial DFS referral of @ctx->UNC in dfs_mount_share(). Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-28	smb: client: fix shared DFS root mounts with different prefixes	Paulo Alcantara	8	-100/+118
	When having two DFS root mounts that are connected to same namespace, same mount options but different prefix paths, we can't really use the shared @server->origin_fullpath when chasing DFS links in them. Move the origin_fullpath field to cifs_tcon structure so when having shared DFS root mounts with different prefix paths, and we need to chase any DFS links, dfs_get_automount_devname() will pick up the correct full path out of the @tcon that will be used for the new mount. Before patch mount.cifs //dom/dfs/dir /mnt/1 -o ... mount.cifs //dom/dfs /mnt/2 -o ... # shared server, ses, tcon # server: origin_fullpath=//dom/dfs/dir # @server->origin_fullpath + '/dir/link1' $ ls /mnt/2/dir/link1 ls: cannot open directory '/mnt/2/dir/link1': No such file or directory After patch mount.cifs //dom/dfs/dir /mnt/1 -o ... mount.cifs //dom/dfs /mnt/2 -o ... # shared server & ses # tcon_1: origin_fullpath=//dom/dfs/dir # tcon_2: origin_fullpath=//dom/dfs # @tcon_2->origin_fullpath + '/dir/link1' $ ls /mnt/2/dir/link1 dir0 dir1 dir10 dir3 dir5 dir6 dir7 dir9 target2_file.txt tsub Fixes: 8e3554150d6c ("cifs: fix sharing of DFS connections") Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-28	x86/mem_encrypt: Remove stale mem_encrypt_init() declaration	Linus Torvalds	1	-1/+0
	The memory encryption initialization logic was moved from init/main.c into arch_cpu_finalize_init() in commit 439e17576eb4 ("init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()"), but a stale declaration for the init function was left in <linux/init.h>. And didn't cause any problems if you had X86_MEM_ENCRYPT enabled, which apparently everybody involved did have. See also commit 0a9567ac5e6a ("x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n build") in this whole sad saga of conflicting declarations for different situations. Reported-by: Matthew Wilcox <willy@infradead.org> Fixes: 439e17576eb4 init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init() Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-28	mm: fix __access_remote_vm() GUP failure case	Linus Torvalds	1	-0/+1
	Commit ca5e863233e8 ("mm/gup: remove vmas parameter from get_user_pages_remote()") removed the vma argument from GUP handling, and instead added a helper function (get_user_page_vma_remote()) that looks it up separately using 'vma_lookup()'. And then converted existing users that needed a vma to use the helper instead. However, the helper function intentionally acts exactly like the old get_user_pages_remote() did, and only fills in 'vma' on successful page lookup. Fine so far. However, __access_remote_vm() wants the vma even for the unsuccessful case, and used to do a vma = vma_lookup(mm, addr); explicitly to look it up when the get_user_page() failed. However, that conversion commit incorrectly removed that vma lookup, thinking that get_user_page_vma_remote() would have done it. Not so. So add the vma_lookup() back in. Fixes: ca5e863233e8 ("mm/gup: remove vmas parameter from get_user_pages_remote()") Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>