linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-11-03	signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed	Eric W. Biederman	3	-1/+11
	As Andy pointed out that there are races between force_sig_info_to_task and sigaction[1] when force_sig_info_task. As Kees discovered[2] ptrace is also able to change these signals. In the case of seeccomp killing a process with a signal it is a security violation to allow the signal to be caught or manipulated. Solve this problem by introducing a new flag SA_IMMUTABLE that prevents sigaction and ptrace from modifying these forced signals. This flag is carefully made kernel internal so that no new ABI is introduced. Longer term I think this can be solved by guaranteeing short circuit delivery of signals in this case. Unfortunately reliable and guaranteed short circuit delivery of these signals is still a ways off from being implemented, tested, and merged. So I have implemented a much simpler alternative for now. [1] https://lkml.kernel.org/r/b5d52d25-7bde-4030-a7b1-7c6f8ab90660@www.fastmail.com [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook Cc: stable@vger.kernel.org Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation") Tested-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-29	signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV)	Eric W. Biederman	8	-9/+9
	Now that force_fatal_sig exists it is unnecessary and a bit confusing to use force_sigsegv in cases where the simpler force_fatal_sig is wanted. So change every instance we can to make the code clearer. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Link: https://lkml.kernel.org/r/877de7jrev.fsf@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-29	exit/r8188eu: Replace the macro thread_exit with a simple return 0	Eric W. Biederman	3	-4/+2
	The macro thread_exit is called is at the end of functions started with kthread_run. The code in kthread_run has arranged things so a kernel thread can just return and do_exit will be called. So just have rtw_cmd_thread and mp_xmit_packet_thread return instead of calling complete_and_exit. Link: https://lkml.kernel.org/r/20211020174406.17889-20-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	exit/rtl8712: Replace the macro thread_exit with a simple return 0	Eric W. Biederman	2	-2/+1
	The macro thread_exit is called is at the end of a function started with kthread_run. The code in kthread_run has arranged things so a kernel thread can just return and do_exit will be called. So just have the cmd_thread return instead of calling complete_and_exit. Link: https://lkml.kernel.org/r/20211020174406.17889-19-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	exit/rtl8723bs: Replace the macro thread_exit with a simple return 0	Eric W. Biederman	4	-5/+3
	Every place thread_exit is called is at the end of a function started with kthread_run. The code in kthread_run has arranged things so a kernel thread can just return and do_exit will be called. So just have the threads return instead of calling complete_and_exit. Link: https://lkml.kernel.org/r/20211020174406.17889-18-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/x86: In emulate_vsyscall force a signal instead of calling do_exit	Eric W. Biederman	1	-1/+2
	Directly calling do_exit with a signal number has the problem that all of the side effects of the signal don't happen, such as killing all of the threads of a process instead of just the calling thread. So replace do_exit(SIGSYS) with force_fatal_sig(SIGSYS) which causes the signal handling to take it's normal path and work as expected. Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20211020174406.17889-17-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig	Eric W. Biederman	1	-2/+2
	Modify the 32bit version of setup_rt_frame and setup_frame to act similar to the 64bit version of setup_rt_frame and fail with a signal instead of calling do_exit. Replacing do_exit(SIGILL) with force_fatal_signal(SIGILL) ensures that the process will be terminated cleanly when the stack frame is invalid, instead of just killing off a single thread and leaving the process is a weird state. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Link: https://lkml.kernel.org/r/20211020174406.17889-16-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails	Eric W. Biederman	1	-2/+4
	The function try_to_clear_window_buffer is only called from rtrap_32.c. After it is called the signal pending state is retested, and signals are handled if TIF_SIGPENDING is set. This allows try_to_clear_window_buffer to call force_fatal_signal and then rely on the signal being delivered to kill the process, without any danger of returning to userspace, or otherwise using possible corrupt state on failure. The functional difference between force_fatal_sig and do_exit is that do_exit will only terminate a single thread, and will never trigger a core-dump. A multi-threaded program for which a single thread terminates unexpectedly is hard to reason about. Calling force_fatal_sig does not give userspace a chance to catch the signal, but otherwise is an ordinary fatal signal exit, and it will trigger a coredump of the offending process if core dumps are enabled. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Link: https://lkml.kernel.org/r/20211020174406.17889-15-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	exit/syscall_user_dispatch: Send ordinary signals on failure	Eric W. Biederman	1	-4/+8
	Use force_fatal_sig instead of calling do_exit directly. This ensures the ordinary signal handling path gets invoked, core dumps as appropriate get created, and for multi-threaded processes all of the threads are terminated not just a single thread. When asked Gabriel Krisman Bertazi <krisman@collabora.com> said [1]: > ebiederm@xmission.com (Eric W. Biederman) asked: > > > Why does do_syscal_user_dispatch call do_exit(SIGSEGV) and > > do_exit(SIGSYS) instead of force_sig(SIGSEGV) and force_sig(SIGSYS)? > > > > Looking at the code these cases are not expected to happen, so I would > > be surprised if userspace depends on any particular behaviour on the > > failure path so I think we can change this. > > Hi Eric, > > There is not really a good reason, and the use case that originated the > feature doesn't rely on it. > > Unless I'm missing yet another problem and others correct me, I think > it makes sense to change it as you described. > > > Is using do_exit in this way something you copied from seccomp? > > I'm not sure, its been a while, but I think it might be just that. The > first prototype of SUD was implemented as a seccomp mode. If at some point it becomes interesting we could relax "force_fatal_sig(SIGSEGV)" to instead say "force_sig_fault(SIGSEGV, SEGV_MAPERR, sd->selector)". I avoid doing that in this patch to avoid making it possible to catch currently uncatchable signals. Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> [1] https://lkml.kernel.org/r/87mtr6gdvi.fsf@collabora.com Link: https://lkml.kernel.org/r/20211020174406.17889-14-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal: Implement force_fatal_sig	Eric W. Biederman	2	-9/+18
	Add a simple helper force_fatal_sig that causes a signal to be delivered to a process as if the signal handler was set to SIG_DFL. Reimplement force_sigsegv based upon this new helper. This fixes force_sigsegv so that when it forces the default signal handler to be used the code now forces the signal to be unblocked as well. Reusing the tested logic in force_sig_info_to_task that was built for force_sig_seccomp this makes the implementation trivial. This is interesting both because it makes force_sigsegv simpler and because there are a couple of buggy places in the kernel that call do_exit(SIGILL) or do_exit(SIGSYS) because there is no straight forward way today for those places to simply force the exit of a process with the chosen signal. Creating force_fatal_sig allows those places to be implemented with normal signal exits. Link: https://lkml.kernel.org/r/20211020174406.17889-13-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	exit/kthread: Have kernel threads return instead of calling do_exit	Eric W. Biederman	5	-9/+6
	In 2009 Oleg reworked[1] the kernel threads so that it is not necessary to call do_exit if you are not using kthread_stop(). Remove the explicit calls of do_exit and complete_and_exit (with a NULL completion) that were previously necessary. [1] 63706172f332 ("kthreads: rework kthread_stop()") Link: https://lkml.kernel.org/r/20211020174406.17889-12-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-29	signal/s390: Use force_sigsegv in default_trap_handler	Eric W. Biederman	1	-1/+1
	Reading the history it is unclear why default_trap_handler calls do_exit. It is not even menthioned in the commit where the change happened. My best guess is that because it is unknown why the exception happened it was desired to guarantee the process never returned to userspace. Using do_exit(SIGSEGV) has the problem that it will only terminate one thread of a process, leaving the process in an undefined state. Use force_sigsegv(SIGSEGV) instead which effectively has the same behavior except that is uses the ordinary signal mechanism and terminates all threads of a process and is generally well defined. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Fixes: ca2ab03237ec ("[PATCH] s390: core changes") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lkml.kernel.org/r/20211020174406.17889-11-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25	signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved.	Eric W. Biederman	1	-1/+3
	Update save_v86_state to always complete all of it's work except possibly some of the copies to userspace even if save_v86_state takes a fault. This ensures that the kernel is always in a sane state, even if userspace has done something silly. When save_v86_state takes a fault update it to force userspace to take a SIGSEGV and terminate the userspace application. As Andy pointed out in review of the first version of this change there are races between sigaction and the application terinating. Now that the code has been modified to always perform all save_v86_state's work (except possibly copying to userspace) those races do not matter from a kernel perspective. Forcing the userspace application to terminate (by resetting it's handler to SIGDFL) is there to keep everything as close to the current behavior as possible while removing the unique (and difficult to maintain) use of do_exit. If this new SIGSEGV happens during handle_signal the next time around the exit_to_user_mode_loop, SIGSEGV will be delivered to userspace. All of the callers of handle_vm86_trap and handle_vm86_fault run the exit_to_user_mode_loop before they return to userspace any signal sent to the current task during their execution will be delivered to the current task before that tasks exits to usermode. Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: H Peter Anvin <hpa@zytor.com> v1: https://lkml.kernel.org/r/20211020174406.17889-10-ebiederm@xmission.com Link: https://lkml.kernel.org/r/877de1xcr6.fsf_-_@disp2133 Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25	signal/vm86_32: Replace open coded BUG_ON with an actual BUG_ON	Eric W. Biederman	1	-4/+2
	The function save_v86_state is only called when userspace was operating in vm86 mode before entering the kernel. Not having vm86 state in the task_struct should never happen. So transform the hand rolled BUG_ON into an actual BUG_ON to make it clear what is happening. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: H Peter Anvin <hpa@zytor.com> Link: https://lkml.kernel.org/r/20211020174406.17889-9-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25	signal/sparc: In setup_tsb_params convert open coded BUG into BUG	Eric W. Biederman	1	-1/+1
	The function setup_tsb_params has exactly one caller tsb_grow. The function tsb_grow passes in a tsb_bytes value that is between 8192 and 1048576 inclusive, and is guaranteed to be a power of 2. The function setup_tsb_params verifies this property with a switch statement and then prints an error and causes the task to exit if this is not true. In practice that print statement can never be reached because tsb_grow never passes in a bad tsb_size. So if tsb_size ever gets a bad value that is a kernel bug. So replace the do_exit which is effectively an open coded version of BUG() with an actuall call to BUG(). Making it clearer that this is a case that can never, and should never happen. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Link: https://lkml.kernel.org/r/20211020174406.17889-8-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25	signal/powerpc: On swapcontext failure force SIGSEGV	Eric W. Biederman	2	-5/+10
	If the register state may be partial and corrupted instead of calling do_exit, call force_sigsegv(SIGSEGV). Which properly kills the process with SIGSEGV and does not let any more userspace code execute, instead of just killing one thread of the process and potentially confusing everything. Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 756f1ae8a44e ("PPC32: Rework signal code and add a swapcontext system call.") Fixes: 04879b04bf50 ("[PATCH] ppc64: VMX (Altivec) support & signal32 rework, from Ben Herrenschmidt") Link: https://lkml.kernel.org/r/20211020174406.17889-7-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25	signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)	Eric W. Biederman	1	-4/+6
	Today the sh code allocates memory the first time a process uses the fpu. If that memory allocation fails, kill the affected task with force_sig(SIGKILL) rather than do_group_exit(SIGKILL). Calling do_group_exit from an exception handler can potentially lead to dead locks as do_group_exit is not designed to be called from interrupt context. Instead use force_sig(SIGKILL) to kill the userspace process. Sending signals in general and force_sig in particular has been tested from interrupt context so there should be no problems. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: linux-sh@vger.kernel.org Fixes: 0ea820cf9bf5 ("sh: Move over to dynamically allocated FPU context.") Link: https://lkml.kernel.org/r/20211020174406.17889-6-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25	signal/mips: Update (_save\|_restore)_fp_context to fail with -EFAULT	Eric W. Biederman	2	-11/+2
	When an instruction to save or restore a register from the stack fails in _save_fp_context or _restore_fp_context return with -EFAULT. This change was made to r2300_fpu.S[1] but it looks like it got lost with the introduction of EX2[2]. This is also what the other implementation of _save_fp_context and _restore_fp_context in r4k_fpu.S does, and what is needed for the callers to be able to handle the error. Furthermore calling do_exit(SIGSEGV) from bad_stack is wrong because it does not terminate the entire process it just terminates a single thread. As the changed code was the only caller of arch/mips/kernel/syscall.c:bad_stack remove the problematic and now unused helper function. Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Maciej Rozycki <macro@orcam.me.uk> Cc: linux-mips@vger.kernel.org [1] 35938a00ba86 ("MIPS: Fix ISA I FP sigcontext access violation handling") [2] f92722dc4545 ("MIPS: Correct MIPS I FP sigcontext layout") Cc: stable@vger.kernel.org Fixes: f92722dc4545 ("MIPS: Correct MIPS I FP sigcontext layout") Acked-by: Maciej W. Rozycki <macro@orcam.me.uk> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Link: https://lkml.kernel.org/r/20211020174406.17889-5-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-20	signal/sparc32: Remove unreachable do_exit in do_sparc_fault	Eric W. Biederman	1	-1/+0
	The call to do_exit in do_sparc_fault immediately follows a call to unhandled_fault. The function unhandled_fault never returns. This means the call to do_exit can never be reached. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Fixes: 2.3.41 Link: https://lkml.kernel.org/r/20211020174406.17889-4-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-20	reboot: Remove the unreachable panic after do_exit in reboot(2)	Eric W. Biederman	1	-1/+0
	Link: https://lkml.kernel.org/r/20211020174406.17889-3-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-20	exit: Remove calls of do_exit after noreturn versions of die	Eric W. Biederman	11	-20/+9
	On nds32, openrisc, s390, sh, and xtensa the function die never returns. Mark die __noreturn so that no one expects die to return. Remove the do_exit calls after die as they will never be reached. Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: openrisc@lists.librecores.org Cc: Nick Hu <nickhu@andestech.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Fixes: 2.3.16 Fixes: 2.3.99-pre8 Fixes: 3f65ce4d141e ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 5") Fixes: 664eec400bf8 ("nds32: MMU fault handling and page table management") Fixes: 61e85e367535 ("OpenRISC: Memory management") Link: https://lkml.kernel.org/r/20211020174406.17889-2-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-20	exit/doublefault: Remove apparently bogus comment about rewind_stack_do_exit	Eric W. Biederman	1	-3/+0
	I do not see panic calling rewind_stack_do_exit anywhere, nor can I find anywhere in the history where doublefault_shim has called rewind_stack_do_exit. So I don't think this comment was ever actually correct. Cc: Andy Lutomirski <luto@kernel.org> Fixes: 7d8d8cfdee9a ("x86/doublefault/32: Rewrite the x86_32 #DF handler and unify with 64-bit") Link: https://lkml.kernel.org/r/20211020174406.17889-1-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-09-12	Linux 5.15-rc1	Linus Torvalds	1	-2/+2

2021-09-11	tools headers UAPI: Update tools's copy of drm.h headers	Arnaldo Carvalho de Melo	1	-2/+12
	Picking the changes from: 17ce9c61c71cbc0d ("drm: document DRM_IOCTL_MODE_RMFB") Doesn't result in any tooling changes: $ tools/perf/trace/beauty/drm_ioctl.sh > before $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h $ tools/perf/trace/beauty/drm_ioctl.sh > after $ diff -u before after Silencing these perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h' diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h Cc: Simon Ser <contact@emersion.fr> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	tools headers UAPI: Sync drm/i915_drm.h with the kernel sources	Arnaldo Carvalho de Melo	1	-81/+417
	To pick the changes in: b65a9489730a2494 ("drm/i915/userptr: Probe existence of backing struct pages upon creation") ee242ca704d38699 ("drm/i915/guc: Implement GuC priority management") 81340cf3bddded4f ("drm/i915/uapi: reject set_domain for discrete") 7961c5b60f23dff5 ("drm/i915: Add TTM offset argument to mmap.") aef7b67a79564f6c ("drm/i915/uapi: convert drm_i915_gem_userptr to kernel doc") e7737b67ab46ee0e ("drm/i915/uapi: reject caching ioctls for discrete") 3aa8c57fe25a9247 ("drm/i915/uapi: convert drm_i915_gem_set_domain to kernel doc") 289f5a72009b8f67 ("drm/i915/uapi: convert drm_i915_gem_caching to kernel doc") 4a766ae40ec83301 ("drm/i915: Drop the CONTEXT_CLONE API (v2)") 6ff6d61dd2a943bd ("drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP") fe4751c3d513ff4f ("drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE") 577729533cdc4e37 ("drm/i915: Document the Virtual Engine uAPI") c649432e86ca677d ("drm/i915: Fix busy ioctl commentary") That doesn't result in any changes to tooling as no new ioctl were added (at least not perceived by tools/perf/trace/beauty/drm_ioctl.sh). Addressing this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	tools headers UAPI: Sync linux/fs.h with the kernel sources	Arnaldo Carvalho de Melo	1	-0/+1
	To pick the change in: 7957d93bf32bc211 ("block: add ioctl to read the disk sequence number") It adds a new ioctl, but we are still not using that to generate tables for 'perf trace', so no changes in tooling. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h' diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h Cc: Jens Axboe <axboe@kernel.dk> Cc: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	tools headers UAPI: Sync linux/in.h copy with the kernel sources	Arnaldo Carvalho de Melo	1	-10/+32
	To pick the changes in: db243b796439c0ca ("net/ipv4/ipv6: Replace one-element arraya with flexible-array members") 2d3e5caf96b9449a ("net/ipv4: Replace one-element array with flexible-array member") That don't result in any change in tooling, the structs changed remains with the same layout. This addresses this build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h' diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h Cc: David S. Miller <davem@davemloft.net> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	perf tools: Add an option to build without libbfd	Ian Rogers	1	-22/+25
	Some distributions, like debian, don't link perf with libbfd. Add a build flag to make this configuration buildable and testable. This was inspired by: https://lore.kernel.org/linux-perf-users/20210910102307.2055484-1-tonyg@leastfixedpoint.com/T/#u Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: tony garnock-jones <tonyg@leastfixedpoint.com> Link: http://lore.kernel.org/lkml/20210910225756.729087-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	perf tools: Allow build-id with trailing zeros	Namhyung Kim	1	-0/+10
	Currently perf saves a build-id with size but old versions assumes the size of 20. In case the build-id is less than 20 (like for MD5), it'd fill the rest with 0s. I saw a problem when old version of perf record saved a binary in the build-id cache and new version of perf reads the data. The symbols should be read from the build-id cache (as the path no longer has the same binary) but it failed due to mismatch in the build-id. symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf. The build-id event in the data has 20 byte build-ids, but it saw a different size (16) when it reads the build-id of the elf file in the build-id cache. $ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf Displaying notes found in: .note.gnu.build-id Owner Data size Description GNU 0x00000010 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: 53e4c2f42a4c61a2d632d92a72afa08f Let's fix this by allowing trailing zeros if the size is different. Fixes: 39be8d0115b321ed ("perf tools: Pass build_id object to dso__build_id_equal()") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	perf tools: Fix hybrid config terms list corruption	Adrian Hunter	2	-9/+27
	A config terms list was spliced twice, resulting in a never-ending loop when the list was traversed. Fix by using list_splice_init() and copying and freeing the lists as necessary. This patch also depends on patch "perf tools: Factor out copy_config_terms() and free_config_terms()" Example on ADL: Before: # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname & # jobs [1]+ Running perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname # perf top -E 10 PerfTop: 4071 irqs/sec kernel: 6.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 24 CPUs) --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 97.60% perf [.] __evsel__get_config_term 0.25% [kernel] [k] kallsyms_expand_symbol.constprop.13 0.24% perf [.] kallsyms__parse 0.15% [kernel] [k] _raw_spin_lock 0.14% [kernel] [k] number 0.13% [kernel] [k] advance_transaction 0.08% [kernel] [k] format_decode 0.08% perf [.] map__process_kallsym_symbol 0.08% perf [.] rb_insert_color 0.08% [kernel] [k] vsnprintf exiting. # kill %1 After: # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname & Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.060 MB perf.data ] # perf script \| head perf-exec 604 [001] 1827.312293: psb: psb offs: 0 ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms]) Fixes: 30def61f64bac5 ("perf parse-events Create two hybrid cache events") Fixes: 94da591b1c7913 ("perf parse-events Create two hybrid raw events") Fixes: 9cbfa2f64c04d9 ("perf parse-events Create two hybrid hardware events") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https //lore.kernel.org/r/20210909125508.28693-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	perf tools: Factor out copy_config_terms() and free_config_terms()	Adrian Hunter	3	-13/+19
	Factor out copy_config_terms() and free_config_terms() so that they can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https //lore.kernel.org/r/20210909125508.28693-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields	Adrian Hunter	1	-1/+4
	Some fields are missing and text_poke is duplicated. Fix that up. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210911120550.12203-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11	perf tools: Ignore Documentation dependency file	Ian Rogers	1	-0/+1
	When building directly on the checked out repository the build process produces a file that should be ignored, so add it to .gitignore. Fixes: a81df63a5df3e195 ("perf doc: Fix doc.dep") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210910232249.739661-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	riscv: Move EXCEPTION_TABLE to RO_DATA segment	Jisheng Zhang	2	-3/+2
	_ex_table section is read-only, so move it to RO_DATA. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10	riscv: Enable BUILDTIME_TABLE_SORT	Jisheng Zhang	2	-0/+2
	Enable BUILDTIME_TABLE_SORT to sort the exception table at build time rather than during boot. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10	riscv: dts: microchip: mpfs-icicle: Fix serial console	Geert Uytterhoeven	1	-1/+5
	Currently, nothing is output on the serial console, unless "console=ttyS0,115200n8" or "earlycon" are appended to the kernel command line. Enable automatic console selection using chosen/stdout-path by adding a proper alias, and configure the expected serial rate. While at it, add aliases for the other three serial ports, which are provided on the same micro-USB connector as the first one. Fixes: 0fa6107eca4186ad ("RISC-V: Initial DTS for Microchip ICICLE board") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Bin Meng <bmeng.cn@gmail.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10	riscv: move the (z)install rules to arch/riscv/Makefile	Masahiro Yamada	2	-10/+5
	Currently, the (z)install targets in arch/riscv/Makefile descend into arch/riscv/boot/Makefile to invoke the shell script, but there is no good reason to do so. arch/riscv/Makefile can run the shell script directly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10	riscv: Improve stack randomisation on RV64	Kefeng Wang	2	-1/+4
	This enlarges the bits availiable for stack randomisation on RV64 from the default of 8MiB to 1GiB, to match arm64 and x86. Also, update the documentation to reflect our support for stack randomisation. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> [Palmer: commit text] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10	riscv: defconfig: enable NLS_CODEPAGE_437, NLS_ISO8859_1	Heinrich Schuchardt	1	-0/+2
	The EFI system partition uses the FAT file system. Many distributions add an entry in /etc/fstab for the ESP. We must ensure that mounting does not fail. The default code page for FAT is 437 (cf. CONFIG_FAT_DEFAULT_CODEPAGE). The default IO character set is "iso8859-1" (cf. CONFIG_NLS_ISO8859_1). So let's enable NLS_CODEPAGE_437 and NLS_ISO8859_1 in defconfig. Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10	riscv: defconfig: enable BLK_DEV_NVME	Heinrich Schuchardt	1	-0/+2
	NVMe is a non-volatile storage media attached via PCIe. As NVMe has much higher throughput than other block devices like SATA it is a must have for RISC-V. Enable CONFIG_BLK_DEV_NVME. The HiFive Unmatched is a board providing M.2 slots for NVMe drives. Enable CONFIG_PCIE_FU740. Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-11	Documentation: core-api/cpuhotplug: Rewrite the API section	Thomas Gleixner	2	-121/+590
	Dave stumbled over the incomplete and confusing documentation of the CPU hotplug API. Rewrite it, add the missing function documentations and correct the existing ones. Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210909123212.489059409@linutronix.de
2021-09-11	cpu/hotplug: Remove deprecated CPU-hotplug functions.	Sebastian Andrzej Siewior	1	-6/+0
	No users in tree use the deprecated CPU-hotplug functions anymore. Remove them. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-39-bigeasy@linutronix.de
2021-09-11	thermal: Replace deprecated CPU-hotplug functions.	Sebastian Andrzej Siewior	1	-2/+2
	The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-20-bigeasy@linutronix.de
2021-09-10	perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions	Arnaldo Carvalho de Melo	1	-0/+8
	The btf__get_from_id() function was deprecated in favour of btf__load_from_kernel_by_id(), but it is still avaiable, so use it to provide a weak function btf__load_from_kernel_by_id() for older libbpf when building perf with LIBBPF_DYNAMIC=1, i.e. using the system's libbpf package. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	tools include UAPI: Update linux/mount.h copy	Arnaldo Carvalho de Melo	1	-1/+2
	To pick the changes from: 9ffb14ef61bab83f ("move_mount: allow to add a mount into an existing group") That ends up adding support for the new MOVE_MOUNT_SET_GROUP move_mount flag. $ tools/perf/trace/beauty/move_mount_flags.sh > before $ cp include/uapi/linux/mount.h tools/include/uapi/linux/mount.h $ tools/perf/trace/beauty/move_mount_flags.sh > after $ diff -u before after --- before 2021-09-10 12:28:43.865279808 -0300 +++ after 2021-09-10 12:28:50.183429184 -0300 @@ -5,4 +5,5 @@ [ilog2(0x00000010) + 1] = "T_SYMLINKS", [ilog2(0x00000020) + 1] = "T_AUTOMOUNTS", [ilog2(0x00000040) + 1] = "T_EMPTY_PATH", + [ilog2(0x00000100) + 1] = "SET_GROUP", }; $ So now one can use it in --filter expressions for tracepoints. This silences this perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h' diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	perf beauty: Cover more flags in the move_mount syscall argument beautifier	Arnaldo Carvalho de Melo	1	-1/+1
	Previously the regext expected MOVE_MOUNT_[FT]_, but in the next patch a flag that doesn't match that expression will be added, MOVE_MOUNT_SET_GROUP To make this more future proof, take advantage of the fact that the only one we don't need to cover is MOVE_MOUNT__MASK and use MOVE_MOUNT_[^_]+__. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	tools headers UAPI: Sync linux/prctl.h with the kernel sources	Arnaldo Carvalho de Melo	1	-5/+7
	To pick the changes in: 433c38f40f6a81cf ("arm64: mte: change ASYNC and SYNC TCF settings into bitfields") e893bb1bb4d2eb63 ("x86, prctl: Hook L1D flushing in via prctl") That don't result in any changes in tooling: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after $ Just silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Balbir Singh <sblbir@amazon.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	tools include UAPI: Sync sound/asound.h copy with the kernel sources	Arnaldo Carvalho de Melo	1	-0/+1
	Picking the changes from: 81be10934949da8b ("ALSA: pcm: Add SNDRV_PCM_INFO_EXPLICIT_SYNC flag") Which entails no changes in the tooling side as it doesn't introduce new ioctls. To silence this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h' diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	tools headers UAPI: Sync linux/kvm.h with the kernel sources	Arnaldo Carvalho de Melo	1	-4/+7
	To pick the changes in: f95937ccf5bd5e0a ("KVM: stats: Support linear and logarithmic histogram statistics") f0376edb1ddcab19 ("KVM: arm64: Add ioctl to fetch/store tags in a guest") ea7fc1bb1cd1b92b ("KVM: arm64: Introduce MTE VM feature") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, so that will pick the new KVM_STATS_TYPE_LINEAR_HIST and KVM_STATS_TYPE_LOG_HIST defines. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Jing Zhang <jingzhangos@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10	tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources	Arnaldo Carvalho de Melo	1	-0/+1
	To pick the changes in: 61e5f69ef08379cd ("KVM: x86: implement KVM_GUESTDBG_BLOCKIRQ") That just rebuilds kvm-stat.c on x86, no change in functionality. This silences these perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>