wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2023-08-16	kselftest/arm64: add pmull feature to hwcap test	Zeng Heng	1	-0/+13
	Add the pmull feature check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230815040915.3966955-4-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16	kselftest/arm64: add AES feature check to hwcap test	Zeng Heng	1	-0/+13
	Add the AES feature check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230815040915.3966955-3-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16	kselftest/arm64: add SHA1 and related features to hwcap test	Zeng Heng	1	-0/+39
	Add the SHA1 and related features check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230815040915.3966955-2-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16	kselftest/arm64: build BTI tests in output directory	Andre Przywara	2	-27/+20
	The arm64 BTI selftests are currently built in the source directory, then the generated binaries are copied to the output directory. This leaves the object files around in a potentially otherwise pristine source tree, tainting it for out-of-tree kernel builds. Prepend $(OUTPUT) to every reference to an object file in the Makefile, and remove the extra handling and copying. This puts all generated files under the output directory. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230815145931.2522557-1-andre.przywara@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16	kselftest/arm64: fix a memleak in zt_regs_run()	Ding Xiang	1	-0/+1
	If memcmp() does not return 0, "zeros" need to be freed to prevent memleak Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20230815074915.245528-1-dingxiang@cmss.chinamobile.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11	kselftest/arm64: Size sycall-abi buffers for the actual maximum VL	Mark Brown	1	-15/+23
	Our ABI opts to provide future proofing by defining a much larger SVE_VQ_MAX than the architecture actually supports. Since we use this define to control the size of our vector data buffers this results in a lot of overhead when we initialise which can be a very noticable problem in emulation, we fill buffers that are orders of magnitude larger than we will ever actually use even with virtual platforms that provide the full range of architecturally supported vector lengths. Define and use the actual architecture maximum to mitigate this. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230810-arm64-syscall-abi-perf-v1-1-6a0d7656359c@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11	kselftest/arm64: add lse and lse2 features to hwcap test	Zeng Heng	1	-0/+30
	Add the LSE and various features check in the set of hwcap tests. As stated in the ARM manual, the LSE2 feature allows for atomic access to unaligned memory. Therefore, for processors that only have the LSE feature, we register .sigbus_fn to test their ability to perform unaligned access. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-6-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11	kselftest/arm64: add test item that support to capturing the SIGBUS signal	Zeng Heng	1	-10/+23
	Some enhanced features, such as the LSE2 feature, do not result in SILLILL if LSE2 is missing and LSE is present, but will generate a SIGBUS exception when atomic access unaligned. Therefore, we add test item to test this type of features. Notice that testing for SIGBUS only makes sense after make sure that the instruction does not cause a SIGILL signal. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-5-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11	kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers	Zeng Heng	1	-43/+75
	Add macro definition functions DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers. Furthermore, there is no need to modify the default SIGILL handling function throughout the entire testing lifecycle in the main() function. It is reasonable to narrow the scope to the context of the sig_fn function only. This is a pre-patch for the subsequent SIGBUS handler patch. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-4-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11	kselftest/arm64: add crc32 feature to hwcap test	Zeng Heng	1	-0/+12
	Add the CRC32 feature check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-3-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11	kselftest/arm64: add float-point feature to hwcap test	Zeng Heng	1	-0/+12
	Add the FP feature check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-2-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	kselftest/arm64: Use the tools/include compiler.h rather than our own	Mark Brown	4	-27/+3
	The BTI test program started life as standalone programs outside the kselftest suite so provided it's own compiler.h. Now that we have updated the tools/include compiler.h to have all the definitions that we are using and the arm64 selftsets pull in tools/includes let's drop our custom version. __unreachable() is named unreachable() there requiring an update in the code. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-6-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton	Mark Brown	1	-1/+3
	We had open coded the definition of OPTIMIZER_HIDE_VAR() as a fix but now that we have the generic tools/include available and that has had a definition of OPTIMIZER_HIDE_VAR() we can switch to the define. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-5-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	kselftest/arm64: Make the tools/include headers available	Mark Brown	1	-0/+2
	Make the generic tools/include headers available to the arm64 selftests so we can reduce some duplication. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-4-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	tools include: Add some common function attributes	Mark Brown	1	-0/+12
	We don't have definitions of __always_unused or __noreturn in the tools version of compiler.h, add them so we can use them in kselftests. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-3-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	tools compiler.h: Add OPTIMIZER_HIDE_VAR()	Mark Brown	1	-0/+6
	Port over the definition of OPTIMIZER_HIDE_VAR() so we can use it in kselftests. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-2-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	kselftest/arm64: Exit streaming mode after collecting signal context	Mark Brown	1	-1/+24
	When we collect a signal context with one of the SME modes enabled we will have enabled that mode behind the compiler and libc's back so they may issue some instructions not valid in streaming mode, causing spurious failures. For the code prior to issuing the BRK to trigger signal handling we need to stay in streaming mode if we were already there since that's a part of the signal context the caller is trying to collect. Unfortunately this code includes a memset() which is likely to be heavily optimised and is likely to use FP instructions incompatible with streaming mode. We can avoid this happening by open coding the memset(), inserting a volatile assembly statement to avoid the compiler recognising what's being done and doing something in optimisation. This code is not performance critical so the inefficiency should not be an issue. After collecting the context we can simply exit streaming mode, avoiding these issues. Use a full SMSTOP for safety to prevent any issues appearing with ZA. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-1-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04	kselftest/arm64: add RCpc load-acquire to hwcap test	Zeng Heng	1	-0/+26
	Add the RCpc and various features check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230803133905.971697-1-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27	kselftest/arm64: Validate that changing one VL type does not affect another	Mark Brown	1	-1/+21
	On a system with both SVE and SME when we change one of the VLs this should not result in a change in the other VL. Add a check that this is in fact the case to vec-syscfg. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230720-arm64-fix-sve-sme-vl-change-v2-3-8eea06b82d57@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27	kselftest/arm64: Add a test case for SVE VL changes with SME active	Mark Brown	1	-3/+102
	We just fixed an issue where changing the SVE VL while SME was active could result in us attempting to save the streaming mode SVE vectors without any backing storage. Add a test case which provokes that issue, ideally we should also verify that the contents of ZA are unaffected by any of what we did. Note that since we need to keep streaming mode enabled we can't use any syscalls to trigger the issue, we have to sit in a loop in usersapce and hope to be preempted. The chosen numbers trigger with defconfig on all the virtual platforms for me, this won't be 100% on all systems but avoid an overcomplicated test implementation. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230720-arm64-fix-sve-sme-vl-change-v2-2-8eea06b82d57@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-07-23	Linux 6.5-rc3	Linus Torvalds	1	-1/+1

2023-07-24	kbuild: rust: avoid creating temporary files	Miguel Ojeda	2	-2/+9
	`rustc` outputs by default the temporary files (i.e. the ones saved by `-Csave-temps`, such as `.rcgu` files) in the current working directory when `-o` and `--out-dir` are not given (even if `--emit=x=path` is given, i.e. it does not use those for temporaries). Since out-of-tree modules are compiled from the `linux` tree, `rustc` then tries to create them there, which may not be accessible. Thus pass `--out-dir` explicitly, even if it is just for the temporary files. Similarly, do so for Rust host programs too. Reported-by: Raphael Nestler <raphael.nestler@gmail.com> Closes: https://github.com/Rust-for-Linux/linux/issues/1015 Reported-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs Fixes: 295d8398c67e ("kbuild: specify output names separately for each emission type from rustc") Cc: stable@vger.kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-07-23	tracing/histograms: Return an error if we fail to add histogram to hist_vars list	Mohamed Khalfella	1	-1/+2
	Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") added a check to fail histogram creation if save_hist_vars() failed to add histogram to hist_vars list. But the commit failed to set ret to failed return code before jumping to unregister histogram, fix it. Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com Cc: stable@vger.kernel.org Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-23	ring-buffer: Do not swap cpu_buffer during resize process	Chen Lin	2	-2/+15
	When ring_buffer_swap_cpu was called during resize process, the cpu buffer was swapped in the middle, resulting in incorrect state. Continuing to run in the wrong state will result in oops. This issue can be easily reproduced using the following two scripts: /tmp # cat test1.sh //#! /bin/sh for i in `seq 0 100000` do echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 done /tmp # cat test2.sh //#! /bin/sh for i in `seq 0 100000` do echo irqsoff > /sys/kernel/debug/tracing/current_tracer sleep 1 echo nop > /sys/kernel/debug/tracing/current_tracer sleep 1 done /tmp # ./test1.sh & /tmp # ./test2.sh & A typical oops log is as follows, sometimes with other different oops logs. [ 231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8 [ 231.713375] Modules linked in: [ 231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 231.716750] Hardware name: linux,dummy-virt (DT) [ 231.718152] Workqueue: events update_pages_handler [ 231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 231.721171] pc : rb_update_pages+0x378/0x3f8 [ 231.722212] lr : rb_update_pages+0x25c/0x3f8 [ 231.723248] sp : ffff800082b9bd50 [ 231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0 [ 231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a [ 231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000 [ 231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510 [ 231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002 [ 231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558 [ 231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001 [ 231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000 [ 231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208 [ 231.744196] Call trace: [ 231.744892] rb_update_pages+0x378/0x3f8 [ 231.745893] update_pages_handler+0x1c/0x38 [ 231.746893] process_one_work+0x1f0/0x468 [ 231.747852] worker_thread+0x54/0x410 [ 231.748737] kthread+0x124/0x138 [ 231.749549] ret_from_fork+0x10/0x20 [ 231.750434] ---[ end trace 0000000000000000 ]--- [ 233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 233.721696] Mem abort info: [ 233.721935] ESR = 0x0000000096000004 [ 233.722283] EC = 0x25: DABT (current EL), IL = 32 bits [ 233.722596] SET = 0, FnV = 0 [ 233.722805] EA = 0, S1PTW = 0 [ 233.723026] FSC = 0x04: level 0 translation fault [ 233.723458] Data abort info: [ 233.723734] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 233.724176] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 233.724589] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000 [ 233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 233.726720] Modules linked in: [ 233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 233.727777] Hardware name: linux,dummy-virt (DT) [ 233.728225] Workqueue: events update_pages_handler [ 233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 233.729054] pc : rb_update_pages+0x1a8/0x3f8 [ 233.729334] lr : rb_update_pages+0x154/0x3f8 [ 233.729592] sp : ffff800082b9bd50 [ 233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418 [ 233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003 [ 233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58 [ 233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001 [ 233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000 [ 233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c [ 233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0 [ 233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000 [ 233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000 [ 233.734418] Call trace: [ 233.734593] rb_update_pages+0x1a8/0x3f8 [ 233.734853] update_pages_handler+0x1c/0x38 [ 233.735148] process_one_work+0x1f0/0x468 [ 233.735525] worker_thread+0x54/0x410 [ 233.735852] kthread+0x124/0x138 [ 233.736064] ret_from_fork+0x10/0x20 [ 233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060) [ 233.736959] ---[ end trace 0000000000000000 ]--- After analysis, the seq of the error is as follows [1-5]: int ring_buffer_resize(struct trace_buffer buffer, unsigned long size, int cpu_id) { for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //1. get cpu_buffer, aka cpu_buffer(A) ... ... schedule_work_on(cpu, &cpu_buffer->update_pages_work); //2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to // update_pages_handler, do the update process, set 'update_done' in // complete(&cpu_buffer->update_done) and to wakeup resize process. //----> //3. Just at this moment, ring_buffer_swap_cpu is triggered, //cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer. //ring_buffer_swap_cpu is called as the 'Call trace' below. Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x18/0x28 dump_stack+0x12c/0x188 ring_buffer_swap_cpu+0x2f8/0x328 update_max_tr_single+0x180/0x210 check_critical_timing+0x2b4/0x2c8 tracer_hardirqs_on+0x1c0/0x200 trace_hardirqs_on+0xec/0x378 el0_svc_common+0x64/0x260 do_el0_svc+0x90/0xf8 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb8 el0_sync+0x180/0x1c0 //<---- / wait for all the updates to complete */ for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //4. get cpu_buffer, cpu_buffer(B) is used in the following process, //the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong. //for example, cpu_buffer(A)->update_done will leave be set 1, and will //not 'wait_for_completion' at the next resize round. if (!cpu_buffer->nr_pages_to_update) continue; if (cpu_online(cpu)) wait_for_completion(&cpu_buffer->update_done); cpu_buffer->nr_pages_to_update = 0; } ... } //5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong, //Continuing to run in the wrong state, then oops occurs. Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-23	tracing: Remove unused extern declaration tracing_map_set_field_descr()	YueHaibing	1	-4/+0
	Since commit 08d43a5fa063 ("tracing: Add lock-free tracing_map"), this is never used, so can be removed. Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-23	kbuild: flatten KBUILD_CFLAGS	Alexey Dobriyan	1	-5/+17
	Make it slightly easier to see which compiler options are added and removed (and not worry about column limit too!). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Nicolas Schier <n.schier@avm.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-07-23	gen_compile_commands: add assembly files to compilation database	Benjamin Gray	1	-1/+1
	Like C source files, tooling can find it useful to have the assembly source file compilation recorded. The .S extension appears to used across all architectures. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Fangrui Song <maskray@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-07-23	ext4: fix rbtree traversal bug in ext4_mb_use_preallocated	Ojaswin Mujoo	1	-27/+131
	During allocations, while looking for preallocations(PA) in the per inode rbtree, we can't do a direct traversal of the tree because ext4_mb_discard_group_preallocation() can paralelly mark the pa deleted and that can cause direct traversal to skip some entries. This was leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy our request and ultimately tried to create a new PA that would overlap with the missed one. To makes sure we handle that case while still keeping the performance of the rbtree, we make use of the fact that the only pa that could possibly overlap the original goal start is the one that satisfies the below conditions: 1. It must have it's logical start immediately to the left of (ie less than) original logical start. 2. It must not be deleted To find this pa we use the following traversal method: 1. Descend into the rbtree normally to find the immediate neighboring PA. Here we keep descending irrespective of if the PA is deleted or if it overlaps with our request etc. The goal is to find an immediately adjacent PA. 2. If the found PA is on right of original goal, use rb_prev() to find the left adjacent PA. 3. Check if this PA is deleted and keep moving left with rb_prev() until a non deleted PA is found. 4. This is the PA we are looking for. Now we can check if it can satisfy the original request and proceed accordingly. This approach also takes care of having deleted PAs in the tree. (While we are at it, also fix a possible overflow bug in calculating the end of a PA) [1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/ Cc: stable@kernel.org # 6.4 Fixes: 3872778664e3 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list") Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Tested-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23	ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()	Ojaswin Mujoo	1	-5/+9
	In ext4_mb_choose_next_group_best_avail(), we want the start order to be 1 less than goal length and the min_order to be, at max, 1 more than the original length. This commit fixes an off by one issue that arose due to the fact that 1 << fls(n) > (n). After all the processing: order = 1 order below goal len min_order = maximum of the three:- - order - trim_order - 1 order below B2C(s_stripe) - 1 order above original len Cc: stable@kernel.org Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)") Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23	ext4: correct inline offset when handling xattrs in inode body	Eric Whitney	1	-0/+14
	When run on a file system where the inline_data feature has been enabled, xfstests generic/269, generic/270, and generic/476 cause ext4 to emit error messages indicating that inline directory entries are corrupted. This occurs because the inline offset used to locate inline directory entries in the inode body is not updated when an xattr in that shared region is deleted and the region is shifted in memory to recover the space it occupied. If the deleted xattr precedes the system.data attribute, which points to the inline directory entries, that attribute will be moved further up in the region. The inline offset continues to point to whatever is located in system.data's former location, with unfortunate effects when used to access directory entries or (presumably) inline data in the inode body. Cc: stable@kernel.org Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-22	cifs: update internal module version number for cifs.ko	Steve French	1	-2/+2
	From 2.43 to 2.44 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-22	cifs: allow dumping keys for directories too	Shyam Prasad N	1	-4/+13
	Dumping the enc/dec keys is a session wide operation. And it should not matter if the ioctl was run on a regular file or a directory. Currently, we obtain the tcon pointer from the cifs file handle. But since there's no dir open call in cifs, this is not populated for dirs. This change allows dumping of session keys using ioctl even for directories. To do this, we'll now get the tcon pointer from the superblock, and not from the file handle. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-21	dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt	Rob Herring	1	-28/+0
	nxp,lpc1850-uart.txt binding is already covered by 8250.yaml, so remove it. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230707221607.1064888-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-07-21	dt-bindings: serial: Remove obsolete cavium-uart.txt	Rob Herring	1	-19/+0
	cavium-uart.txt binding is already covered by 8250.yaml, so remove it. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230707221602.1063972-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-07-21	loop: do not enforce max_loop hard limit by (new) default	Mauricio Faria de Oliveira	1	-2/+34
	Problem: The max_loop parameter is used for 2 different purposes: 1) initial number of loop devices to pre-create on init 2) maximum number of loop devices to add on access/open() Historically, its default value (zero) caused 1) to create non-zero number of devices (CONFIG_BLK_DEV_LOOP_MIN_COUNT), and no hard limit on 2) to add devices with autoloading. However, the default value changed in commit 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0") to CONFIG_BLK_DEV_LOOP_MIN_COUNT, for max_loop=0 not to pre-create devices. That does improve 1), but unfortunately it breaks 2), as the default behavior changed from no-limit to hard-limit. Example: For example, this userspace code broke for N >= CONFIG, if the user relied on the default value 0 for max_loop: mknod("/dev/loopN"); open("/dev/loopN"); // now fails with ENXIO Though affected users may "fix" it with (loop.)max_loop=0, this means to require a kernel parameter change on stable kernel update (that commit Fixes: an old commit in stable). Solution: The original semantics for the default value in 2) can be applied if the parameter is not set (ie, default behavior). This still keeps the intended function in 1) and 2) if set, and that commit's intended improvement in 1) if max_loop=0. Before 85c50197716c: - default: 1) CONFIG devices 2) no limit - max_loop=0: 1) CONFIG devices 2) no limit - max_loop=X: 1) X devices 2) X limit After 85c50197716c: - default: 1) CONFIG devices 2) CONFIG limit () - max_loop=0: 1) 0 devices () 2) no limit - max_loop=X: 1) X devices 2) X limit This commit: - default: 1) CONFIG devices 2) no limit () - max_loop=0: 1) 0 devices 2) no limit - max_loop=X: 1) X devices 2) X limit Future: The issue/regression from that commit only affects code under the CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard, thus the fix too is contained under it. Once that deprecated functionality/code is removed, the purpose 2) of max_loop (hard limit) is no longer in use, so the module parameter description can be changed then. Tests: Linux 6.4-rc7 CONFIG_BLK_DEV_LOOP_MIN_COUNT=8 CONFIG_BLOCK_LEGACY_AUTOLOAD=y - default (original) # ls -1 /dev/loop /dev/loop-control /dev/loop0 ... /dev/loop7 # ./test-loop open: /dev/loop8: No such device or address - default (patched) # ls -1 /dev/loop* /dev/loop-control /dev/loop0 ... /dev/loop7 # ./test-loop # - max_loop=0 (original & patched): # ls -1 /dev/loop* /dev/loop-control # ./test-loop # - max_loop=8 (original & patched): # ls -1 /dev/loop* /dev/loop-control /dev/loop0 ... /dev/loop7 # ./test-loop open: /dev/loop8: No such device or address - max_loop=0 (patched; CONFIG_BLOCK_LEGACY_AUTOLOAD is not set) # ls -1 /dev/loop* /dev/loop-control # ./test-loop open: /dev/loop8: No such device or address Fixes: 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0") Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230720143033.841001-3-mfo@canonical.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21	loop: deprecate autoloading callback loop_probe()	Mauricio Faria de Oliveira	1	-0/+4
	The 'probe' callback in __register_blkdev() is only used under the CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard. The loop_probe() function is only used for that callback, so guard it too, accordingly. See commit fbdee71bb5d8 ("block: deprecate autoloading based on dev_t"). Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230720143033.841001-2-mfo@canonical.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21	sbitmap: fix batching wakeup	David Jeffery	1	-8/+7
	Current code supposes that it is enough to provide forward progress by just waking up one wait queue after one completion batch is done. Unfortunately this way isn't enough, cause waiter can be added to wait queue just after it is woken up. Follows one example(64 depth, wake_batch is 8) 1) all 64 tags are active 2) in each wait queue, there is only one single waiter 3) each time one completion batch(8 completions) wakes up just one waiter in each wait queue, then immediately one new sleeper is added to this wait queue 4) after 64 completions, 8 waiters are wakeup, and there are still 8 waiters in each wait queue 5) after another 8 active tags are completed, only one waiter can be wakeup, and the other 7 can't be waken up anymore. Turns out it isn't easy to fix this problem, so simply wakeup enough waiters for single batch. Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20230721095715.232728-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21	drm/atomic: Fix potential use-after-free in nonblocking commits	Daniel Vetter	1	-1/+10
	This requires a bit of background. Properly done a modeset driver's unload/remove sequence should be drm_dev_unplug(); drm_atomic_helper_shutdown(); drm_dev_put(); The trouble is that the drm_dev_unplugged() checks are by design racy, they do not synchronize against all outstanding ioctl. This is because those ioctl could block forever (both for modeset and for driver specific ioctls), leading to deadlocks in hotunplug. Instead the code sections that touch the hardware need to be annotated with drm_dev_enter/exit, to avoid accessing hardware resources after the unload/remove has finished. To avoid use-after-free issues all the involved userspace visible objects are supposed to hold a reference on the underlying drm_device, like drm_file does. The issue now is that we missed one, the atomic modeset ioctl can be run in a nonblocking fashion, and in that case it cannot rely on the implied drm_device reference provided by the ioctl calling context. This can result in a use-after-free if an nonblocking atomic commit is carefully raced against a driver unload. Fix this by unconditionally grabbing a drm_device reference for any drm_atomic_state structures. Strictly speaking this isn't required for blocking commits and TEST_ONLY calls, but it's the simpler approach. Thanks to shanzhulig for the initial idea of grabbing an unconditional reference, I just added comments, a condensed commit message and fixed a minor potential issue in where exactly we drop the final reference. Reported-by: shanzhulig <shanzhulig@gmail.com> Suggested-by: shanzhulig <shanzhulig@gmail.com> Reviewed-by: Maxime Ripard <mripard@kernel.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-21	ia64: mmap: Consider pgoff when searching for free mapping	Helge Deller	1	-1/+1
	IA64 is the only architecture which does not consider the pgoff value when searching for a possible free memory region with vm_unmapped_area(). Adding this seems to have no negative side effect on IA64, so add it now to make IA64 consistent with all other architectures. Cc: stable@vger.kernel.org # 6.4 Signed-off-by: Helge Deller <deller@gmx.de> Tested-by: matoro <matoro_mailinglist_kernel@matoro.tk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-ia64@vger.kernel.org Link: https://lore.kernel.org/r/20230721152432.196382-3-deller@gmx.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21	io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()	Helge Deller	2	-30/+27
	The io_uring testcase is broken on IA-64 since commit d808459b2e31 ("io_uring: Adjust mapping wrt architecture aliasing requirements"). The reason is, that this commit introduced an own architecture independend get_unmapped_area() search algorithm which finds on IA-64 a memory region which is outside of the regular memory region used for shared userspace mappings and which can't be used on that platform due to aliasing. To avoid similar problems on IA-64 and other platforms in the future, it's better to switch back to the architecture-provided get_unmapped_area() function and adjust the needed input parameters before the call. Beside fixing the issue, the function now becomes easier to understand and maintain. This patch has been successfully tested with the io_uring testcase on physical x86-64, ppc64le, IA-64 and PA-RISC machines. On PA-RISC the LTP mmmap testcases did not report any regressions. Cc: stable@vger.kernel.org # 6.4 Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: matoro <matoro_mailinglist_kernel@matoro.tk> Fixes: d808459b2e31 ("io_uring: Adjust mapping wrt architecture aliasing requirements") Link: https://lore.kernel.org/r/20230721152432.196382-2-deller@gmx.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21	arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes	Mark Brown	1	-8/+25
	When we reconfigure the SVE vector length we discard the backing storage for the SVE vectors and then reallocate on next SVE use, leaving the SME specific state alone. This means that we do not enable SME traps if they were already disabled. That means that userspace code can enter streaming mode without trapping, putting the task in a state where if we try to save the state of the task we will fault. Since the ABI does not specify that changing the SVE vector length disturbs SME state, and since SVE code may not be aware of SME code in the process, we shouldn't simply discard any ZA state. Instead immediately reallocate the storage for SVE, and disable SME if we change the SVE vector length while there is no SME state active. Disabling SME traps on SVE vector length changes would make the overall code more complex since we would have a state where we have valid SME state stored but might get a SME trap. Fixes: 9e4ab6c89109 ("arm64/sme: Implement vector length configuration prctl()s") Reported-by: David Spickett <David.Spickett@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230720-arm64-fix-sve-sme-vl-change-v2-1-8eea06b82d57@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-07-20	blk-iocost: skip empty flush bio in iocost	Chengming Zhou	1	-0/+4
	The flush bio may have data, may have no data (empty flush), we couldn't calculate cost for empty flush bio. So we'd better just skip it for now. Another side effect is that empty flush bio's bio_end_sector() is 0, cause iocg->cursor reset to 0, may break the cost calculation of other bios. This isn't good enough, since flush bio still consume the device bandwidth, but flush request is special, can be merged randomly in the flush state machine, we don't know how to calculate cost for it for now. Its completion time also has flaws, which may include the pre-flush or post-flush completion time, but I don't know if we need to fix that and how to fix it. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20230720121441.1408522-1-chengming.zhou@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-20	net: phy: prevent stale pointer dereference in phy_init()	Vladimir Oltean	1	-7/+14
	mdio_bus_init() and phy_driver_register() both have error paths, and if those are ever hit, ethtool will have a stale pointer to the phy_ethtool_phy_ops stub structure, which references memory from a module that failed to load (phylib). It is probably hard to force an error in this code path even manually, but the error teardown path of phy_init() should be the same as phy_exit(), which is now simply not the case. Fixes: 55d8f053ce1b ("net: phy: Register ethtool PHY operations") Link: https://lore.kernel.org/netdev/ZLaiJ4G6TaJYGJyU@shell.armlinux.org.uk/ Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230720000231.1939689-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around fastopenq.max_qlen	Eric Dumazet	3	-4/+6
	This field can be read locklessly. Fixes: 1536e2857bd3 ("tcp: Add a TCP_FASTOPEN socket option to get a max backlog on its listner") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-12-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around icsk->icsk_user_timeout	Eric Dumazet	1	-3/+3
	This field can be read locklessly from do_tcp_getsockopt() Fixes: dca43c75e7e5 ("tcp: Add TCP_USER_TIMEOUT socket option.") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-11-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around tp->notsent_lowat	Eric Dumazet	2	-3/+7
	tp->notsent_lowat can be read locklessly from do_tcp_getsockopt() and tcp_poll(). Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-10-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around rskq_defer_accept	Eric Dumazet	1	-5/+6
	do_tcp_getsockopt() reads rskq_defer_accept while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-9-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around tp->linger2	Eric Dumazet	1	-4/+4
	do_tcp_getsockopt() reads tp->linger2 while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around icsk->icsk_syn_retries	Eric Dumazet	2	-4/+4
	do_tcp_getsockopt() and reqsk_timer_handler() read icsk->icsk_syn_retries while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20	tcp: annotate data-races around tp->keepalive_probes	Eric Dumazet	2	-4/+10
	do_tcp_getsockopt() reads tp->keepalive_probes while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>