If the kernel crashes in a context where printk() calls always
defer printing (such as in NMI or inside a printk_safe section)
then the final panic messages will be deferred to irq_work. But
if irq_work is not available, the messages will not get printed
unless explicitly flushed. The result is that the final
"end Kernel panic" banner does not get printed.
Add one final flush after the last printk() call to make sure
the final panic messages make it out as well.
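For illustration, the fix amounts to something like the following at the end of panic() (a sketch; the exact call site and flush mode are assumptions):
	/* Sketch: force one last flush after the final panic message, in
	 * case irq_work never runs to print the deferred output. */
	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
	console_flush_on_panic(CONSOLE_FLUSH_PENDING);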
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-14-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Commit 13fb0f74d702 ("printk: Avoid livelock with heavy printk
during panic") introduced a mechanism to silence non-panic CPUs
if too many messages are being dropped. Aside from trying to
work around the livelock bugs of legacy consoles, it was also
intended to avoid losing panic messages. However, if non-panic
CPUs are writing to the ringbuffer, then reacting to dropped
messages is too late.
Another motivation is that non-finalized messages already might
be skipped in panic(). In other words, random messages from
non-panic CPUs might already get lost. It is better to ignore
them all to avoid confusion.
To avoid losing panic CPU messages, silence non-panic CPUs
immediately on panic.
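Conceptually, the writer path can now bail out up front on non-panic CPUs (a simplified sketch, not the exact diff):
	/* Sketch: in the printk store path, refuse new messages from CPUs
	 * other than the panic CPU once a panic is in progress. */
	if (other_cpu_in_panic())
		return 0;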
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-13-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
The commit d51507098ff91 ("printk: disable optimistic spin
during panic") added checks to avoid becoming a console waiter
if a panic is in progress.
However, the transition to panic can occur while there is
already a waiter. The current owner should not pass the lock to
the waiter because it might get stopped or blocked anytime.
Also, the panic context might pass console lock ownership to an
already stopped waiter by mistake. It might happen when
console_flush_on_panic() ignores the current lock owner, for
example:
CPU0                                         CPU1
----                                         ----
console_lock_spinning_enable()
                                             console_trylock_spinning()
                                               [CPU1 now console waiter]
NMI: panic()
  panic_other_cpus_shutdown()
                                               [stopped as console waiter]
console_flush_on_panic()
  console_lock_spinning_enable()
  [print 1 record]
  console_lock_spinning_disable_and_check()
    [handover to stopped CPU1]
This results in panic() not flushing the panic messages.
Fix these problems by disabling all spinning operations
completely during panic().
Another advantage is that it prevents possible deadlocks caused
by "console_owner_lock". The panic() context does not need to
take it any longer. The lockless checks are safe because the
functions become NOPs when they see the panic in progress. All
operations manipulating the state are still synchronized by the
lock even when non-panic CPUs would notice the panic
synchronously.
The current owner might stay spinning. But non-panic CPUs
would get stopped anyway and the panic context will never start
spinning.
Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/r/20240207134103.1357162-12-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Normally a reader will stop once reaching a non-finalized
record. However, when a panic happens, writers from other CPUs
(or an interrupted context on the panic CPU) may have been
writing a record and were unable to finalize it. The panic CPU
will reserve/commit/finalize its panic records, but these will
be located after the non-finalized records. This results in
panic() not flushing the panic messages.
Extend _prb_read_valid() to skip over non-finalized records if
on the panic CPU.
Fixes: 896fbe20b4e2 ("printk: use the lockless ringbuffer")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-11-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Currently pr_flush() will only wait for records that were
available to readers at the time of the call (using
prb_next_seq()). But there may be more (non-finalized) records
that are followed by finalized records. pr_flush() should wait
for these to print as well, particularly because any trailing
finalized records may be the messages that the calling context
wants to ensure are printed.
Add a new ringbuffer function prb_next_reserve_seq() to return
the sequence number following the most recently reserved record.
This guarantees that pr_flush() will wait until all current
printk() messages (completed or in progress) have been printed.
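A rough sketch of the difference (simplified; the real pr_flush() also deals with console locking and timeouts):
	/* Before: only wait for records already visible to readers. */
	seq = prb_next_seq(prb);

	/* After: wait for everything reserved so far, including records
	 * that are still being written but will be finalized shortly. */
	seq = prb_next_reserve_seq(prb);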
Fixes: 3b604ca81202 ("printk: add pr_flush()")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-10-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
With the lockless ringbuffer, multiple CPUs/contexts are allowed to
write simultaneously into the buffer. This creates an ambiguity as
some writers will finalize sooner.
The documentation for the prb_read functions is not clear as it
refers to "not yet written" and "no data available". Clarify the
return values and language to be in terms of the reader: records
available for reading.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-9-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
There is already panic_in_progress() and other_cpu_in_panic(),
but checking if the current CPU is the panic CPU must still be
open coded.
Add this_cpu_in_panic() to complete the set.
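A minimal sketch of such a helper, assuming the existing @panic_cpu atomic is used the same way as in the other helpers:
	static bool this_cpu_in_panic(void)
	{
		return unlikely(atomic_read(&panic_cpu) == raw_smp_processor_id());
	}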
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-8-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Currently @suppress_panic_printk is checked along with
non-matching @panic_cpu and current CPU. This works
because @suppress_panic_printk is only set when
panic_in_progress() is true.
Rather than relying on the @suppress_panic_printk semantics,
use the concise helper function other_cpu_in_panic(). The
helper function exists to avoid open coding such tests.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-7-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
For empty line records, no data blocks are created. Instead,
these valid records are identified by special logical position
values (in fields of @prb_desc.text_blk_lpos).
Currently the macro NO_LPOS is used for empty line records.
This name is confusing because it does not imply _why_ there is
no data block.
Rename NO_LPOS to EMPTY_LINE_LPOS so that it is clear why there
is no data block.
Also add comments explaining the use of EMPTY_LINE_LPOS as well
as clarification of the values used to represent data-less
blocks.
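The rename itself is mechanical; roughly (the exact value is an assumption based on the existing data-less lpos encoding):
	/* A data-less block for a valid, empty line record. */
	#define EMPTY_LINE_LPOS	0x3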
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-6-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Commit f244b4dc53e5 ("printk: ringbuffer: Improve
prb_next_seq() performance") introduced an optimization for
prb_next_seq() by using best-effort to track recently finalized
records. However, the order of finalization does not
necessarily match the order of the records. The optimization
changed prb_next_seq() to return inconsistent results, possibly
yielding sequence numbers that are not available to readers
because they are preceded by non-finalized records or they are
not yet visible to the reader CPU.
Rather than simply best-effort tracking recently finalized
records, force the committing writer to read records and
increment the last "contiguous block" of finalized records. In
order to do this, the sequence number instead of the ID must be
stored because IDs cannot be directly compared.
A new memory barrier pair is introduced to guarantee that a
reader can always read the records up until the sequence number
returned by prb_next_seq() (unless the records have since
been overwritten in the ringbuffer).
This restores the original functionality of prb_next_seq()
while also keeping the optimization.
For 32bit systems, only the lower 32 bits of the sequence
number are stored. When reading the value, it is expanded to
the full 64bit sequence number using the 32bit seq macros,
which fold in the value returned by prb_first_seq().
Fixes: f244b4dc53e5 ("printk: ringbuffer: Improve prb_next_seq() performance")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-5-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Note: This change only applies to 32bit architectures. On 64bit
architectures the macros are NOPs.
Currently prb_next_seq() is used as the base for the 32bit seq
macros __u64seq_to_ulseq() and __ulseq_to_u64seq(). However, in
a follow-up commit, prb_next_seq() will need to make use of the
32bit seq macros.
Use prb_first_seq() as the base for the 32bit seq macros instead
because it is guaranteed to return 64bit sequence numbers without
relying on any 32bit seq macros.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-4-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Note: This change only applies to 32bit architectures. On 64bit
architectures the macros are NOPs.
__ulseq_to_u64seq() computes the upper 32 bits of the passed
argument value (@ulseq). The upper bits are derived from a base
value (@rb_next_seq) in a way that assumes @ulseq represents a
64bit number that is less than or equal to @rb_next_seq.
Until now this mapping has been correct for all call sites. However,
in a follow-up commit, values of @ulseq will be passed in that are
higher than the base value. This requires a change to how the 32bit
value is mapped to a 64bit sequence number.
Rather than mapping @ulseq such that the base value is the end of a
32bit block, map @ulseq such that the base value is in the middle of
a 32bit block. This allows supporting 31 bits before and after the
base value, which is deemed acceptable for the console sequence
number during runtime.
Here is an example to illustrate the previous and new mappings.
For a base value (@rb_next_seq) of 2 2000 0000...
Before this change the range of possible return values was:
1 2000 0001 to 2 2000 0000
__ulseq_to_u64seq(1fff ffff) => 2 1fff ffff
__ulseq_to_u64seq(2000 0000) => 2 2000 0000
__ulseq_to_u64seq(2000 0001) => 1 2000 0001
__ulseq_to_u64seq(9fff ffff) => 1 9fff ffff
__ulseq_to_u64seq(a000 0000) => 1 a000 0000
__ulseq_to_u64seq(a000 0001) => 1 a000 0001
After this change the range of possible return values are:
1 a000 0001 to 2 a000 0000
__ulseq_to_u64seq(1fff ffff) => 2 1fff ffff
__ulseq_to_u64seq(2000 0000) => 2 2000 0000
__ulseq_to_u64seq(2000 0001) => 2 2000 0001
__ulseq_to_u64seq(9fff ffff) => 2 9fff ffff
__ulseq_to_u64seq(a000 0000) => 2 a000 0000
__ulseq_to_u64seq(a000 0001) => 1 a000 0001
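The new mapping boils down to folding a signed 32bit distance into the 64bit base; roughly (a sketch, with @base standing in for @rb_next_seq):
	/* Sketch: treat the 32bit distance from the base as signed, so
	 * that @ulseq may be up to 2^31 before or after the base. */
	seq = base - (s32)((u32)base - ulseq);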
[ john.ogness: Rewrite commit message. ]
Reported-by: Francesco Dolcini <francesco@dolcini.it>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-3-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
The macros __seq_to_nbcon_seq() and __nbcon_seq_to_seq() are
used to provide support for atomic handling of sequence numbers
on 32bit systems. Until now this was only used by nbcon.c,
which is why they were located in nbcon.c and include nbcon in
the name.
In a follow-up commit this functionality is also needed by
printk_ringbuffer. Rather than duplicating the functionality,
relocate the macros to printk_ringbuffer.h.
Also, since the macros will be no longer nbcon-specific, rename
them to __u64seq_to_ulseq() and __ulseq_to_u64seq().
This does not result in any functional change.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-2-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
We consistently switched from kmalloc() to vmalloc() in module
decompression to prevent potential memory allocation failures with large
modules; however, vmalloc() is not as memory-efficient and fast as
kmalloc().
Since we don't know in general the size of the workspace required by the
decompression algorithm, it is more reasonable to use kvmalloc()
consistently, also considering that we don't have special memory
requirements here.
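The pattern is simply the following (a sketch; the exact call sites and size variable are assumptions):
	/* Sketch: let the allocator use kmalloc() when the workspace is
	 * small enough and transparently fall back to vmalloc() otherwise. */
	workspace = kvmalloc(workspace_size, GFP_KERNEL);
	if (!workspace)
		return -ENOMEM;
	/* use the workspace for decompression, then: */
	kvfree(workspace);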
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Use __generic_cmpxchg_local() for arch_cmpxchg_local() implementation
on SH architecture because it does not implement arch_cmpxchg_local().
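The fix is essentially the standard asm-generic fallback (a sketch of the idea, not necessarily the exact hunk):
	#include <asm-generic/cmpxchg-local.h>

	/* Sketch: fall back to the generic UP-safe cmpxchg for SH. */
	#define arch_cmpxchg_local(ptr, o, n)					\
		((__typeof__(*(ptr)))__generic_cmpxchg_local((ptr),		\
				(unsigned long)(o), (unsigned long)(n),		\
				sizeof(*(ptr))))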
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/169824660459.24340.14614817132696360531.stgit@devnote2
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
|
|
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct module_notes_attrs.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
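The annotation itself is a one-liner on the flexible array member; roughly (field names assumed from the current struct layout):
	struct module_notes_attrs {
		struct kobject *dir;
		unsigned int notes;
		struct bin_attribute attrs[] __counted_by(notes);
	};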
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Delete duplicated word in comment.
Signed-off-by: Zhu Mao <zhumao001@208suo.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
The return value of is_valid_name() is true or false,
so change its type to reflect that.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
The return value of is_mapping_symbol() is true or false,
so change its type to reflect that.
Suggested-by: Xi Zhang <zhangxi@kylinos.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Use a similar approach as commit a419beac4a07 ("module/decompress: use
vmalloc() for zstd decompression workspace") and replace kmalloc() with
vmalloc() also for the gzip module decompression workspace.
In this case the workspace is represented by struct inflate_workspace
that can be fairly large for kmalloc() and it can potentially lead to
allocation errors on certain systems:
$ pahole inflate_workspace
struct inflate_workspace {
struct inflate_state inflate_state; /* 0 9544 */
/* --- cacheline 149 boundary (9536 bytes) was 8 bytes ago --- */
unsigned char working_window[32768]; /* 9544 32768 */
/* size: 42312, cachelines: 662, members: 2 */
/* last cacheline: 8 bytes */
};
Considering that there is no need to use contiguous physical memory,
simply switch to vmalloc() to provide a more reliable in-kernel module
decompression.
Fixes: b1ae6dc41eaa ("module: add in-kernel support for decompressing")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Use glob include/linux/module*.h to capture all module changes.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Commit 9bbb9e5a3310 ("param: use ops in struct kernel_param, rather than
get and set fns directly") added the comment that module_param_call()
was deprecated, during a large scale refactoring to bring sanity to type
casting back then. In 2017 following more cleanups, it became useful
again as it wraps a common pattern of creating an ops struct for a
given get/set pair:
b2f270e87473 ("module: Prepare to convert all module_param_call() prototypes")
ece1996a21ee ("module: Do not paper over type mismatches in module_param_call()")
static const struct kernel_param_ops __param_ops_##name = \
{ .flags = 0, .set = _set, .get = _get }; \
__module_param_call(MODULE_PARAM_PREFIX, \
name, &__param_ops_##name, arg, perm, -1, 0)
__module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0)
Many users of module_param_cb() appear to be almost universally
open-coding the same thing that module_param_call() does now. Don't
discourage[1] people from using module_param_call(): clarify the comment
to show that module_param_cb() is useful if you repeatedly use the same
pair of get/set functions.
[1] https://lore.kernel.org/lkml/202308301546.5C789E5EC@keescook/
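For reference, a hypothetical one-off parameter is simpler with module_param_call(), while module_param_cb() pays off when the same ops are reused (names here are illustrative):
	static int debug;

	static int debug_set(const char *val, const struct kernel_param *kp)
	{
		/* validate and store the new value */
		return param_set_int(val, kp);
	}

	static int debug_get(char *buffer, const struct kernel_param *kp)
	{
		return param_get_int(buffer, kp);
	}

	module_param_call(debug, debug_set, debug_get, &debug, 0644);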
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Johan Hovold <johan@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Nick Desaulniers <ndesaulniers@gooogle.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: linux-modules@vger.kernel.org
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
vmap_area does not exist on no-MMU, therefore the GDB scripts fail to
load:
Traceback (most recent call last):
File "<...>/vmlinux-gdb.py", line 51, in <module>
import linux.vmalloc
File "<...>/scripts/gdb/linux/vmalloc.py", line 14, in <module>
vmap_area_ptr_type = vmap_area_type.get_type().pointer()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<...>/scripts/gdb/linux/utils.py", line 28, in get_type
self._type = gdb.lookup_type(self._name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
gdb.error: No struct type named vmap_area.
To fix this, disable the command and add an informative error message if
CONFIG_MMU is not defined, following the example of lx-slabinfo.
Link: https://lkml.kernel.org/r/20231031202235.2655333-2-ben.wolsieffer@hefring.com
Fixes: 852622bf3616 ("scripts/gdb/vmalloc: add vmallocinfo support")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
MOD_TEXT is only defined if CONFIG_MODULES=y, which leads to a loading
failure of the gdb scripts when the kernel is built without
CONFIG_MODULES=y:
Reading symbols from vmlinux...
Traceback (most recent call last):
File "/foo/vmlinux-gdb.py", line 25, in <module>
import linux.constants
File "/foo/scripts/gdb/linux/constants.py", line 14, in <module>
LX_MOD_TEXT = gdb.parse_and_eval("MOD_TEXT")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gdb.error: No symbol "MOD_TEXT" in current context.
Add a conditional check on CONFIG_MODULES to fix this error.
Link: https://lkml.kernel.org/r/20231031134848.119391-1-da.gomez@samsung.com
Fixes: b4aff7513df3 ("scripts/gdb: use mem instead of core_layout to get the module address")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Tested-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
He's no longer working at Collabora (and his email address there bounces).
Map it to his personal address.
Link: https://lkml.kernel.org/r/20231031014009.22765-2-bagasdotme@gmail.com
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Cc: Bjorn Andersson <quic_bjorande@quicinc.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Claudiu Beznea's Microchip email address is no longer valid.
Map it to a valid one.
Link: https://lkml.kernel.org/r/20231030063632.1707372-1-claudiu.beznea@tuxon.dev
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Cc: Bjorn Andersson <quic_bjorande@quicinc.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
On Ubuntu and probably other distros, ptrace permissions are tightened a
bit by default; i.e., /proc/sys/kernel/yama/ptrace_scope is set to 1.
This causes memfd_secret's ptrace attach test to fail with a permission
error. Set it to 0 prior to running the program.
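A minimal sketch of what the test setup does (assuming a direct write to the yama sysctl; the real patch may differ in detail):
	#include <stdio.h>

	/* Sketch: relax yama so the ptrace-attach test can run. */
	static void relax_yama_ptrace_scope(void)
	{
		FILE *f = fopen("/proc/sys/kernel/yama/ptrace_scope", "w");

		if (f) {
			fputs("0", f);
			fclose(f);
		}
	}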
Link: https://lkml.kernel.org/r/20231030-selftest-v1-1-743df68bb996@linux.dev
Signed-off-by: Itaru Kitayama <itaru.kitayama@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Map his SUSE address to his gmail address, as he left SUSE some time ago.
Link: https://lkml.kernel.org/r/20231030142454.22127-2-bagasdotme@gmail.com
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Benjamin Poirier <benjamin.poirier@gmail.com>
Cc: Bjorn Andersson <quic_bjorande@quicinc.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
csr_sscratch CSR holds current task_struct address when hart is in user
space. Trap handler on entry spills csr_sscratch into the "tp" (x4)
register and zeroes out csr_sscratch CSR. Trap handler on exit reloads
"tp" with the expected user mode value and places the current task_struct
address again in csr_sscratch CSR.
This patch assumes "tp" is pointing to task_struct. If the value in
csr_sscratch is numerically greater than "tp", then it assumes csr_sscratch
holds the correct address of the current task_struct. This logic holds when
- hart is in user space, "tp" will be less than csr_sscratch.
- hart is in kernel space but not in trap handler, "tp" will be more
than csr_sscratch (csr_sscratch being equal to 0).
- hart is executing trap handler
- "tp" is still pointing to user mode but csr_sscratch contains
ptr to task_struct. Thus numerically higher.
- "tp" is pointing to task_struct but csr_sscratch now contains
either 0 or numerically smaller value (transiently holds
user mode tp)
Link: https://lkml.kernel.org/r/20231026233837.612405-1-debug@rivosinc.com
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Tested-by: Hsieh-Tseng Shen <woodrow.shen@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Glenn Washburn <development@efficientek.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jeff Xie <xiehuan09@gmail.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Fix a spelling typo in comment.
Link: https://lkml.kernel.org/r/20231025072906.14285-1-chentao@kylinos.cn
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Check the ProtectionKey field in /proc/*/smaps output if the system
supports the protection keys feature.
[adobriyan@gmail.com: test support in the beginning of the program, use syscall, not glibc pkey_alloc(3) which may not compile]
Link: https://lkml.kernel.org/r/ac05efa7-d2a0-48ad-b704-ffdd5450582e@p183
Signed-off-by: Swarup Laxman Kotikalapudi <swarupkotikalapudi@gmail.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Swarup Laxman Kotikalapudi <swarupkotikalapudi@gmail.com>
Tested-by: Swarup Laxman Kotikalapudi <swarupkotikalapudi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
* fix embarrassing /proc/*/smaps test bug due to a typo in a variable name:
it tested only the first line of the output if vsyscall is enabled:
ffffffffff600000-ffffffffff601000 r-xp ...
so the test passed but checked only the VMA location and permissions.
* add "KSM" entry, unnoticed because of (1)
* swap "r-xp" and "--xp" vsyscall test strings,
also unnoticed because of (1)
Link: https://lkml.kernel.org/r/76f42cce-b1ab-45ec-b6b2-4c64f0dccb90@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Tested-by: Swarup Laxman Kotikalapudi <swarupkotikalapudi@mail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
./fs/proc/base.c:3829:2-3: Unneeded semicolon
Link: https://lkml.kernel.org/r/20231026005634.6581-1-yang.lee@linux.alibaba.com
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7057
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Rather than lock_task_sighand(), sig->stats_lock was specifically designed
for this type of use.
This way the "if (whole)" branch runs lockless in the likely case.
Link: https://lkml.kernel.org/r/20231023153405.GA4639@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Rather than while_each_thread() which should be avoided when possible.
This makes the code more clear and allows the next change.
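The replacement pattern, roughly:
	struct task_struct *t;

	/* Sketch: iterate all threads of @p without the old
	 * while_each_thread() restart pitfalls. */
	for_each_thread(p, t) {
		/* accumulate per-thread statistics */
	}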
Link: https://lkml.kernel.org/r/20231023153343.GA4629@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The BUG_ON() at ocfs2_num_free_extents() handles the case where the
l_tree_depth of a leaf extent block just read from disk is invalid. This
error is mostly caused by file system metadata corruption on the disk.
There is no need to call BUG_ON() to handle such errors. We can return
an error code instead, since the caller can deal with errors from
ocfs2_num_free_extents(). Also, we should make the file system read-only
to keep the damage from spreading.
Therefore, the BUG_ON() is removed and ocfs2_error() is called instead.
Link: https://lkml.kernel.org/r/20231018191811.412458-1-jindui71@gmail.com
Signed-off-by: Jia Rui <jindui71@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When user input is committed online, the DAMON sysfs interface ignores
the user input for the monitoring target regions. Such a request is
valid and useful for fixed monitoring target regions-based monitoring
ops like 'paddr' or 'fvaddr'.
Update the region boundaries as user specified, too. Note that the
monitoring results of the regions that overlap between the latest
monitoring target regions and the new target regions are preserved.
Treat empty monitoring target regions user request as a request to just
make no change to the monitoring target regions. Otherwise, users should
set the monitoring target regions same to current one for every online
input commit, and it could be challenging for dynamic monitoring target
regions update DAMON ops like 'vaddr'. If the user really needs to remove
all monitoring target regions, they can simply remove the target and then
create the target again with empty target regions.
Link: https://lkml.kernel.org/r/20231031170131.46972-1-sj@kernel.org
Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [5.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
damon_sysfs_set_targets(), which updates the targets of the context for
online commitment, does not remove targets that were removed from the
corresponding sysfs files. As a result, more targets than intended can
exist in the context and hence consume more memory and monitoring CPU
resources than expected.
Fix it by removing all targets of the context and filling them up again
using the user input. This could cause unnecessary memory dealloc and
realloc operations, but this is not a hot code path. Also, note that
damon_target is stateless, and hence no data is lost.
[sj@kernel.org: fix unnecessary monitoring results removal]
Link: https://lkml.kernel.org/r/20231028213353.45397-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20231022210735.46409-2-sj@kernel.org
Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: <stable@vger.kernel.org> [5.19.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We recently encountered a bug that makes all zswap store attempts fail.
Specifically, after:
"141fdeececb3 mm/zswap: delay the initialization of zswap"
if we build a kernel with zswap disabled by default, then enabled after
the swapfile is set up, the zswap tree will not be initialized. As a
result, all zswap store calls will be short-circuited. We have to perform
another swapon to get zswap working properly again.
Fortunately, this issue has since been fixed by the patch that kills
frontswap:
"42c06a0e8ebe mm: kill frontswap"
which performs zswap_swapon() unconditionally, i.e always initializing
the zswap tree.
This test adds a sanity check that ensures zswap storing works as
intended.
Link: https://lkml.kernel.org/r/20231020222009.2358953-1-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The "first" is spelled "fist".
Link: https://lkml.kernel.org/r/20231023095737.21823-1-yangqixiao@inspur.com
Signed-off-by: Tom Yang <yangqixiao@inspur.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
LKP reported smatch warning as below:
===================
smatch warnings:
mm/vmalloc.c:3689 vread_iter() error: we previously assumed 'vm' could be null (see line 3667)
......
06c8994626d1b7 @3667 size = vm ? get_vm_area_size(vm) : va_size(va);
......
06c8994626d1b7 @3689 else if (!(vm->flags & VM_IOREMAP))
^^^^^^^^^
Unchecked dereference
=====================
This is not a runtime bug because the possible null 'vm' in the
pointed place could only happen when flags == VMAP_BLOCK. However, the
case 'flags == VMAP_BLOCK' should never happen and has been detected
with WARN_ON. Please check vm_map_ram() implementation and the earlier
checking in vread_iter() at below:
~~~~~~~~~~~~~~~~~~~~~~~~~~
/*
* VMAP_BLOCK indicates a sub-type of vm_map_ram area, need
* be set together with VMAP_RAM.
*/
WARN_ON(flags == VMAP_BLOCK);
if (!vm && !flags)
continue;
~~~~~~~~~~~~~~~~~~~~~~~~~~
So add checking on whether 'vm' could be null when dereferencing it in
vread_iter(). This mutes smatch complaint.
Link: https://lkml.kernel.org/r/ZTCURc8ZQE+KrTvS@MiWiFi-R3L-srv
Link: https://lkml.kernel.org/r/ZS/2k6DIMd0tZRgK@MiWiFi-R3L-srv
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202310171600.WCrsOwFj-lkp@intel.com/
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Philip Li <philip.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
During a zswap store attempt, the compression algorithm could fail (e.g.
due to the page containing incompressible random data). This is not
tracked in any of existing zswap counters, making it hard to monitor for
and investigate. We have run into this problem several times in our
internal investigations on zswap store failures.
This patch adds a dedicated debugfs counter for compression algorithm
failures.
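The counter is a plain debugfs u64; a sketch (the counter and file names here are assumptions):
	static u64 zswap_reject_compress_fail;

	/* Sketch: expose the new failure counter next to the existing
	 * zswap debugfs statistics. */
	debugfs_create_u64("reject_compress_fail", 0444,
			   zswap_debugfs_root, &zswap_reject_compress_fail);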
Link: https://lkml.kernel.org/r/20231024234509.2680539-1-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Drop "the" from the title of the documentation article for UBSAN, as it is
redundant.
Also add SPDX-License-Identifier for ubsan.rst.
Link: https://lkml.kernel.org/r/5fb11a4743eea9d9232a5284dea0716589088fec.1698161845.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Setting softlockup_panic from do_sysctl_args() causes it to take effect
later in boot. The lockup detector is enabled before SMP is brought
online, but do_sysctl_args runs afterwards. If a user wants to set
softlockup_panic on boot and have it trigger should a softlockup occur
during onlining of the non-boot processors, they could do this prior to
commit f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot
parameters to sysctl aliases"). However, after this commit the value
of softlockup_panic is set too late to be of help for this type of
problem. Restore the prior behavior.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Cc: stable@vger.kernel.org
Fixes: f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
The code that checks for unknown boot options is unaware of the sysctl
alias facility, which maps bootparams to sysctl values. If a user sets
an old value that has a valid alias, a message about an invalid
parameter will be printed during boot, and the parameter will get passed
to init. Fix by checking for the existence of aliased parameters in the
unknown boot parameter code. If an alias exists, don't return an error
or pass the value to init.
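Conceptually, the unknown-parameter path gains an early check; a sketch with a hypothetical helper name:
	/* Sketch: if the "unknown" boot parameter is really a sysctl
	 * alias, treat it as handled instead of warning and passing it
	 * on to init. (Helper name is illustrative.) */
	if (sysctl_is_alias(param))
		return 0;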
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Cc: stable@vger.kernel.org
Fixes: 0a477e1ae21b ("kernel/sysctl: support handling command line aliases")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
* uninline simple_strntoull(),
gcc overinlines and this function is not performance critical
* reorder arguments, so that appending INT_MAX as 4th argument
generates very efficient tail call
Space savings:
add/remove: 1/0 grow/shrink: 0/3 up/down: 27/-179 (-152)
Function old new delta
simple_strntoll - 27 +27
simple_strtoull 15 10 -5
simple_strtoll 41 7 -34
vsscanf 1930 1790 -140
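With max_chars moved to the last position, the non-length-limited wrapper reduces to a tail call; roughly:
	unsigned long long simple_strtoull(const char *cp, char **endp,
					   unsigned int base)
	{
		/* Sketch: appending INT_MAX as the 4th argument lets the
		 * compiler emit a plain tail call. */
		return simple_strntoull(cp, endp, base, INT_MAX);
	}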
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/all/82a2af6e-9b6c-4a09-89d7-ca90cc1cdad1@p183/
|
|
lp55xx_write() can return an error code, add a check for this.
Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20231020091930.207870-1-suhui@nfschina.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Include headers of which we are direct users; there is no need
to rely on proxies.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231016161005.1471768-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
The initial ret is not used anywhere; drop the unneeded assignment.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231016161005.1471768-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Use a temporary variable for the struct device in gpio_led_probe() in
order to make the code neater.
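The change is the usual probe-function tidy-up; roughly:
	static int gpio_led_probe(struct platform_device *pdev)
	{
		struct device *dev = &pdev->dev;

		/* Sketch: the rest of the probe path now uses @dev instead
		 * of repeating &pdev->dev at every call site. */
		return 0;
	}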
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231016161005.1471768-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Lee Jones <lee@kernel.org>
|