aboutsummaryrefslogtreecommitdiffstats
path: root/arch/um (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-07-17arch: um: Fix build for statically linked UML w/ constructorsDavid Gow1-0/+1
If CONFIG_CONSTUCTORS is enabled on a statically linked (CONFIG_STATIC_LINK=y) build of UML, the build fails due to the .eh_frame section being both used and discarded: ERROR:root:`.eh_frame' referenced in section `.text' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o: defined in discarded section `.eh_frame' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o `.eh_frame' referenced in section `.text' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o: defined in discarded section `.eh_frame' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o Instead, keep the .eh_frame section, as we do in dyn.lds.S for dynamically linked UML. This can be reproduced with: ./tools/testing/kunit/kunit.py run --kconfig_add CONFIG_STATIC_LINK=y --kconfig_add CONFIG_GCOV_KERNEL=y --kconfig_add CONFIG_DEBUG_FS=y Signed-off-by: David Gow <davidgow@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-07-17um/drivers: Kconfig: Fix indentationJuerg Haefliger1-27/+27
The convention for indentation seems to be a single tab. Help text is further indented by an additional two whitespaces. Fix the lines that violate these rules. Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-07-17um: Kconfig: Fix indentationJuerg Haefliger1-1/+1
The convention for indentation seems to be a single tab. Help text is further indented by an additional two whitespaces. Fix the lines that violate these rules. Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-07-17Merge tag 'x86_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+4
Pull x86 fixes from Borislav Petkov: - Improve the check whether the kernel supports WP mappings so that it can accomodate a XenPV guest due to how the latter is setting up the PAT machinery - Now that the retbleed nightmare is public, here's the first round of fallout fixes: * Fix a build failure on 32-bit due to missing include * Remove an untraining point in espfix64 return path * other small cleanups * tag 'x86_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bugs: Remove apostrophe typo um: Add missing apply_returns() x86/entry: Remove UNTRAIN_RET from native_irq_return_ldt x86/bugs: Mark retbleed_strings static x86/pat: Fix x86_has_pat_wp() x86/asm/32: Fix ANNOTATE_UNRET_SAFE use on 32-bit
2022-07-14um: Replace to_phys() and to_virt() with less generic function namesGuenter Roeck3-7/+7
The UML function names to_virt() and to_phys() are exposed by UML headers, and are very generic and may be defined by drivers. As it turns out, commit 9409c9b6709e ("pmem: refactor pmem_clear_poison()") did exactly that. This results in build errors such as the following when trying to build um:allmodconfig: drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’: ./arch/um/include/asm/page.h:105:20: error: too few arguments to function ‘to_phys’ 105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt)) | ^~~~~~~ Use less generic function names for the um specific to_phys() and to_virt() functions to fix the problem and to avoid similar problems in the future. Fixes: 9409c9b6709e ("pmem: refactor pmem_clear_poison()") Cc: Dan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-14um: Use enum req_op where appropriateBart Van Assche1-2/+2
Improve static type checking by using type enum req_op instead of int where appropriate. Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-21-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14um: Add missing apply_returns()Peter Zijlstra1-0/+4
Implement apply_returns() stub for UM, just like all the other patching routines. Fixes: 15e67227c49a ("x86: Undo return-thunk damage") Reported-by: Randy Dunlap <rdunlap@infradead.org) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/Ys%2Ft45l%2FgarIrD0u@worktop.programming.kicks-ass.net
2022-06-28block: remove blk_cleanup_diskChristoph Hellwig1-2/+2
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16mm: avoid unnecessary page fault retires on shared memory typesPeter Xu1-0/+4
I observed that for each of the shared file-backed page faults, we're very likely to retry one more time for the 1st write fault upon no page. It's because we'll need to release the mmap lock for dirty rate limit purpose with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()). Then after that throttling we return VM_FAULT_RETRY. We did that probably because VM_FAULT_RETRY is the only way we can return to the fault handler at that time telling it we've released the mmap lock. However that's not ideal because it's very likely the fault does not need to be retried at all since the pgtable was well installed before the throttling, so the next continuous fault (including taking mmap read lock, walk the pgtable, etc.) could be in most cases unnecessary. It's not only slowing down page faults for shared file-backed, but also add more mmap lock contention which is in most cases not needed at all. To observe this, one could try to write to some shmem page and look at "pgfault" value in /proc/vmstat, then we should expect 2 counts for each shmem write simply because we retried, and vm event "pgfault" will capture that. To make it more efficient, add a new VM_FAULT_COMPLETED return code just to show that we've completed the whole fault and released the lock. It's also a hint that we should very possibly not need another fault immediately on this page because we've just completed it. This patch provides a ~12% perf boost on my aarch64 test VM with a simple program sequentially dirtying 400MB shmem file being mmap()ed and these are the time it needs: Before: 650.980 ms (+-1.94%) After: 569.396 ms (+-1.38%) I believe it could help more than that. We need some special care on GUP and the s390 pgfault handler (for gmap code before returning from pgfault), the rest changes in the page fault handlers should be relatively straightforward. Another thing to mention is that mm_account_fault() does take this new fault as a generic fault to be accounted, unlike VM_FAULT_RETRY. I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping them as-is. Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vineet Gupta <vgupta@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm part] Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Brian Cain <bcain@quicinc.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Weinberger <richard@nod.at> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Will Deacon <will@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Simek <monstr@monstr.eu> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: David Hildenbrand <david@redhat.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Rich Felker <dalias@libc.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Helge Deller <deller@gmx.de> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-10um: virt-pci: set device ready in probe()Vincent Whitchurch1-1/+6
Call virtio_device_ready() to make this driver work after commit b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the virtqueues in the probe function. (The virtio core sets the device ready when probe returns.) Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ") Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Johannes Berg <johannes@sipsolutions.net>
2022-06-03Merge tag 'ptrace_stop-cleanup-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds5-8/+10
Pull ptrace_stop cleanups from Eric Biederman: "While looking at the ptrace problems with PREEMPT_RT and the problems Peter Zijlstra was encountering with ptrace in his freezer rewrite I identified some cleanups to ptrace_stop that make sense on their own and move make resolving the other problems much simpler. The biggest issue is the habit of the ptrace code to change task->__state from the tracer to suppress TASK_WAKEKILL from waking up the tracee. No other code in the kernel does that and it is straight forward to update signal_wake_up and friends to make that unnecessary. Peter's task freezer sets frozen tasks to a new state TASK_FROZEN and then it stores them by calling "wake_up_state(t, TASK_FROZEN)" relying on the fact that all stopped states except the special stop states can tolerate spurious wake up and recover their state. The state of stopped and traced tasked is changed to be stored in task->jobctl as well as in task->__state. This makes it possible for the freezer to recover tasks in these special states, as well as serving as a general cleanup. With a little more work in that direction I believe TASK_STOPPED can learn to tolerate spurious wake ups and become an ordinary stop state. The TASK_TRACED state has to remain a special state as the registers for a process are only reliably available when the process is stopped in the scheduler. Fundamentally ptrace needs acess to the saved register values of a task. There are bunch of semi-random ptrace related cleanups that were found while looking at these issues. One cleanup that deserves to be called out is from commit 57b6de08b5f6 ("ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs"). This makes a change that is technically user space visible, in the handling of what happens to a tracee when a tracer dies unexpectedly. According to our testing and our understanding of userspace nothing cares that spurious SIGTRAPs can be generated in that case" * tag 'ptrace_stop-cleanup-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state ptrace: Always take siglock in ptrace_resume ptrace: Don't change __state ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs ptrace: Document that wait_task_inactive can't fail ptrace: Reimplement PTRACE_KILL by always sending SIGKILL signal: Use lockdep_assert_held instead of assert_spin_locked ptrace: Remove arch_ptrace_attach ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP signal: Replace __group_send_sig_info with send_signal_locked signal: Rename send_signal send_signal_locked
2022-06-03Merge tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds1-7/+8
Pull kthread updates from Eric Biederman: "This updates init and user mode helper tasks to be ordinary user mode tasks. Commit 40966e316f86 ("kthread: Ensure struct kthread is present for all kthreads") caused init and the user mode helper threads that call kernel_execve to have struct kthread allocated for them. This struct kthread going away during execve in turned made a use after free of struct kthread possible. Here, commit 343f4c49f243 ("kthread: Don't allocate kthread_struct for init and umh") is enough to fix the use after free and is simple enough to be backportable. The rest of the changes pass struct kernel_clone_args to clean things up and cause the code to make sense. In making init and the user mode helpers tasks purely user mode tasks I ran into two complications. The function task_tick_numa was detecting tasks without an mm by testing for the presence of PF_KTHREAD. The initramfs code in populate_initrd_image was using flush_delayed_fput to ensuere the closing of all it's file descriptors was complete, and flush_delayed_fput does not work in a userspace thread. I have looked and looked and more complications and in my code review I have not found any, and neither has anyone else with the code sitting in linux-next" * tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: sched: Update task_tick_numa to ignore tasks without an mm fork: Stop allowing kthreads to call execve fork: Explicitly set PF_KTHREAD init: Deal with the init process being a user mode process fork: Generalize PF_IO_WORKER handling fork: Explicity test for idle tasks in copy_thread fork: Pass struct kernel_clone_args into copy_thread kthread: Don't allocate kthread_struct for init and umh
2022-06-03Merge tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/umlLinus Torvalds14-51/+81
Pull UML updates from Richard Weinberger: - Various cleanups and fixes: xterm, serial line, time travel - Set ARCH_HAS_GCOV_PROFILE_ALL * tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Fix out-of-bounds read in LDT setup um: chan_user: Fix winch_tramp() return value um: virtio_uml: Fix broken device handling in time-travel um: line: Use separate IRQs per line um: Enable ARCH_HAS_GCOV_PROFILE_ALL um: Use asm-generic/dma-mapping.h um: daemon: Make default socket configurable um: xterm: Make default terminal emulator configurable
2022-05-27um: chan_user: Fix winch_tramp() return valueJohannes Berg1-4/+5
The previous fix here was only partially correct, it did result in returning a proper error value in case of error, but it also clobbered the pid that we need to return from this function (not just zero for success). As a result, it returned 0 here, but later this is treated as a pid and used to kill the process, but since it's now 0 we kill(0, SIGKILL), which makes UML kill itself rather than just the helper thread. Fix that and make it more obvious by using a separate variable for the pid. Fixes: ccf1236ecac4 ("um: fix error return code in winch_tramp()") Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: virtio_uml: Fix broken device handling in time-travelJohannes Berg1-10/+23
If a device implementation crashes, virtio_uml will mark it as dead by calling virtio_break_device() and scheduling the work that will remove it. This still seems like the right thing to do, but it's done directly while reading the message, and if time-travel is used, this is in the time-travel handler, outside of the normal Linux machinery. Therefore, we cannot acquire locks or do normal "linux-y" things because e.g. lockdep will be confused about the context. Move handling this situation out of the read function and into the actual IRQ handler and response handling instead, so that in the case of time-travel we don't call it in the wrong context. Chances are the system will still crash immediately, since the device implementation crashing may also cause the time- travel controller to go down, but at least all of that now happens without strange warnings from lockdep. Fixes: c8177aba37ca ("um: time-travel: rework interrupt handling in ext mode") Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: line: Use separate IRQs per lineJohannes Berg6-33/+29
Today, all possible serial lines (ssl*=) as well as all possible consoles (con*=) each share a single interrupt (with a fixed number) with others of the same type. Now, if you have two lines, say ssl0 and ssl1, and one of them is connected to an fd you cannot read (e.g. a file), but the other gets a read interrupt, then both of them get the interrupt since it's shared. Then, the read() call will return EOF, since it's a file being written and there's nothing to read (at least not at the current offset, at the end). Unfortunately, this is treated as a read error, and we close this line, losing all the possible output. It might be possible to work around this and make the IRQ sharing work, however, now that we have dynamically allocated IRQs that are easy to use, simply use that to achieve separating between the events; then there's no interrupt for that line and we never attempt the read in the first place, thus not closing the line. This manifested itself in the wifi hostap/hwsim tests where the parallel script communicates via one serial console and the kernel messages go to another (a file) and sending data on the communication console caused the kernel messages to stop flowing into the file. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: anton ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: Enable ARCH_HAS_GCOV_PROFILE_ALLVincent Whitchurch1-0/+1
Enable ARCH_HAS_GCOV_PROFILE_ALL so that CONFIG_GCOV_PROFILE_ALL can be selected on UML. I didn't need to explicitly disable GCOV on anything to get this to work on the configs I tested. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: Use asm-generic/dma-mapping.hJohannes Berg1-0/+1
If DMA (PCI over virtio) is enabled, then some drivers may enable CONFIG_DMA_OPS as well, and then we pull in the x86 definition of get_arch_dma_ops(), which uses the dma_ops symbol, which isn't defined. Since we don't have real DMA ops nor any kind of IOMMU fix this in the simplest possible way: pull in the asm-generic file instead of inheriting the x86 one. It's not clear why those drivers that do (e.g. VDPA) "select DMA_OPS", and if they'd even work with this, but chances are nobody will be wanting to do that anyway, so fixing the build failure is good enough. Reported-by: Randy Dunlap <rdunlap@infradead.org> Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: daemon: Make default socket configurableJohannes Berg2-1/+9
Even if daemon network is deprecated, some configurations may still use it (e.g. Debian), and not want to default to the /tmp/uml.ctl socket location. Allow configuring the default socket location. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Tested-by: Ritesh Raj Sarraf <ritesh@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: xterm: Make default terminal emulator configurableJohannes Berg3-3/+13
Make the default terminal emulator configurable so e.g. Debian can set it to x-terminal-emulator instead of the current default of xterm. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Tested-by: Ritesh Raj Sarraf <ritesh@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-26Merge tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds1-1/+0
Pull Kbuild updates from Masahiro Yamada: - Add HOSTPKG_CONFIG env variable to allow users to override pkg-config - Support W=e as a shorthand for KCFLAGS=-Werror - Fix CONFIG_IKHEADERS build to support toybox cpio - Add scripts/dummy-tools/pahole to ease distro packagers' life - Suppress false-positive warnings from checksyscalls.sh for W=2 build - Factor out the common code of arch/*/boot/install.sh into scripts/install.sh - Support 'kernel-install' tool in scripts/prune-kernel - Refactor module-versioning to link the symbol versions at the final link of vmlinux and modules - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in an arch-agnostic way - Refactor modpost, Makefiles * tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits) genksyms: adjust the output format to modpost kbuild: stop merging *.symversions kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS modpost: extract symbol versions from *.cmd files modpost: add sym_find_with_module() helper modpost: change the license of EXPORT_SYMBOL to bool type modpost: remove left-over cross_compile declaration kbuild: record symbol versions in *.cmd files kbuild: generate a list of objects in vmlinux modpost: move *.mod.c generation to write_mod_c_files() modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header scripts/prune-kernel: Use kernel-install if available kbuild: factor out the common installation code into scripts/install.sh modpost: split new_symbol() to symbol allocation and hash table addition modpost: make sym_add_exported() always allocate a new symbol modpost: make multiple export error modpost: dump Module.symvers in the same order of modules.order modpost: traverse the namespace_list in order modpost: use doubly linked list for dump_lists modpost: traverse unresolved symbols in order ...
2022-05-25Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds1-1/+2
Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-24Merge tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/randomLinus Torvalds1-7/+2
Pull random number generator updates from Jason Donenfeld: "These updates continue to refine the work began in 5.17 and 5.18 of modernizing the RNG's crypto and streamlining and documenting its code. New for 5.19, the updates aim to improve entropy collection methods and make some initial decisions regarding the "premature next" problem and our threat model. The cloc utility now reports that random.c is 931 lines of code and 466 lines of comments, not that basic metrics like that mean all that much, but at the very least it tells you that this is very much a manageable driver now. Here's a summary of the various updates: - The random_get_entropy() function now always returns something at least minimally useful. This is the primary entropy source in most collectors, which in the best case expands to something like RDTSC, but prior to this change, in the worst case it would just return 0, contributing nothing. For 5.19, additional architectures are wired up, and architectures that are entirely missing a cycle counter now have a generic fallback path, which uses the highest resolution clock available from the timekeeping subsystem. Some of those clocks can actually be quite good, despite the CPU not having a cycle counter of its own, and going off-core for a stamp is generally thought to increase jitter, something positive from the perspective of entropy gathering. Done very early on in the development cycle, this has been sitting in next getting some testing for a while now and has relevant acks from the archs, so it should be pretty well tested and fine, but is nonetheless the thing I'll be keeping my eye on most closely. - Of particular note with the random_get_entropy() improvements is MIPS, which, on CPUs that lack the c0 count register, will now combine the high-speed but short-cycle c0 random register with the lower-speed but long-cycle generic fallback path. - With random_get_entropy() now always returning something useful, the interrupt handler now collects entropy in a consistent construction. - Rather than comparing two samples of random_get_entropy() for the jitter dance, the algorithm now tests many samples, and uses the amount of differing ones to determine whether or not jitter entropy is usable and how laborious it must be. The problem with comparing only two samples was that if the cycle counter was extremely slow, but just so happened to be on the cusp of a change, the slowness wouldn't be detected. Taking many samples fixes that to some degree. This, combined with the other improvements to random_get_entropy(), should make future unification of /dev/random and /dev/urandom maybe more possible. At the very least, were we to attempt it again today (we're not), it wouldn't break any of Guenter's test rigs that broke when we tried it with 5.18. So, not today, but perhaps down the road, that's something we can revisit. - We attempt to reseed the RNG immediately upon waking up from system suspend or hibernation, making use of the various timestamps about suspend time and such available, as well as the usual inputs such as RDRAND when available. - Batched randomness now falls back to ordinary randomness before the RNG is initialized. This provides more consistent guarantees to the types of random numbers being returned by the various accessors. - The "pre-init injection" code is now gone for good. I suspect you in particular will be happy to read that, as I recall you expressing your distaste for it a few months ago. Instead, to avoid a "premature first" issue, while still allowing for maximal amount of entropy availability during system boot, the first 128 bits of estimated entropy are used immediately as it arrives, with the next 128 bits being buffered. And, as before, after the RNG has been fully initialized, it winds up reseeding anyway a few seconds later in most cases. This resulted in a pretty big simplification of the initialization code and let us remove various ad-hoc mechanisms like the ugly crng_pre_init_inject(). - The RNG no longer pretends to handle the "premature next" security model, something that various academics and other RNG designs have tried to care about in the past. After an interesting mailing list thread, these issues are thought to be a) mainly academic and not practical at all, and b) actively harming the real security of the RNG by delaying new entropy additions after a potential compromise, making a potentially bad situation even worse. As well, in the first place, our RNG never even properly handled the premature next issue, so removing an incomplete solution to a fake problem was particularly nice. This allowed for numerous other simplifications in the code, which is a lot cleaner as a consequence. If you didn't see it before, https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a thread worth skimming through. - While the interrupt handler received a separate code path years ago that avoids locks by using per-cpu data structures and a faster mixing algorithm, in order to reduce interrupt latency, input and disk events that are triggered in hardirq handlers were still hitting locks and more expensive algorithms. Those are now redirected to use the faster per-cpu data structures. - Rather than having the fake-crypto almost-siphash-based random32 implementation be used right and left, and in many places where cryptographically secure randomness is desirable, the batched entropy code is now fast enough to replace that. - As usual, numerous code quality and documentation cleanups. For example, the initialization state machine now uses enum symbolic constants instead of just hard coding numbers everywhere. - Since the RNG initializes once, and then is always initialized thereafter, a pretty heavy amount of code used during that initialization is never used again. It is now completely cordoned off using static branches and it winds up in the .text.unlikely section so that it doesn't reduce cache compactness after the RNG is ready. - A variety of functions meant for waiting on the RNG to be initialized were only used by vsprintf, and in not a particularly optimal way. Replacing that usage with a more ordinary setup made it possible to remove those functions. - A cleanup of how we warn userspace about the use of uninitialized /dev/urandom and uninitialized get_random_bytes() usage. Interestingly, with the change you merged for 5.18 that attempts to use jitter (but does not block if it can't), the majority of users should never see those warnings for /dev/urandom at all now, and the one for in-kernel usage is mainly a debug thing. - The file_operations struct for /dev/[u]random now implements .read_iter and .write_iter instead of .read and .write, allowing it to also implement .splice_read and .splice_write, which makes splice(2) work again after it was broken here (and in many other places in the tree) during the set_fs() removal. This was a bit of a last minute arrival from Jens that hasn't had as much time to bake, so I'll be keeping my eye on this as well, but it seems fairly ordinary. Unfortunately, read_iter() is around 3% slower than read() in my tests, which I'm not thrilled about. But Jens and Al, spurred by this observation, seem to be making progress in removing the bottlenecks on the iter paths in the VFS layer in general, which should remove the performance gap for all drivers. - Assorted other bug fixes, cleanups, and optimizations. - A small SipHash cleanup" * tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits) random: check for signals after page of pool writes random: wire up fops->splice_{read,write}_iter() random: convert to using fops->write_iter() random: convert to using fops->read_iter() random: unify batched entropy implementations random: move randomize_page() into mm where it belongs random: remove mostly unused async readiness notifier random: remove get_random_bytes_arch() and add rng_has_arch_random() random: move initialization functions out of hot pages random: make consistent use of buf and len random: use proper return types on get_random_{int,long}_wait() random: remove extern from functions in header random: use static branch for crng_ready() random: credit architectural init the exact amount random: handle latent entropy and command line from random_init() random: use proper jiffies comparison macro random: remove ratelimiting for in-kernel unseeded randomness random: move initialization out of reseeding hot path random: avoid initializing twice in credit race random: use symbolic constants for crng_init states ...
2022-05-24kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCSMasahiro Yamada1-1/+0
include/{linux,asm-generic}/export.h defines a weak symbol, __crc_* as a placeholder. Genksyms writes the version CRCs into the linker script, which will be used for filling the __crc_* symbols. The linker script format depends on CONFIG_MODULE_REL_CRCS. If it is enabled, __crc_* holds the offset to the reference of CRC. It is time to get rid of this complexity. Now that modpost parses text files (.*.cmd) to collect all the CRCs, it can generate C code that will be linked to the vmlinux or modules. Generate a new C file, .vmlinux.export.c, which contains the CRCs of symbols exported by vmlinux. It is compiled and linked to vmlinux in scripts/link-vmlinux.sh. Put the CRCs of symbols exported by modules into the existing *.mod.c files. No additional build step is needed for modules. As before, *.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal. No linker magic is used here. The new C implementation works in the same way, whether CONFIG_RELOCATABLE is enabled or not. CONFIG_MODULE_REL_CRCS is no longer needed. Previously, Kbuild invoked additional $(LD) to update the CRCs in objects, but this step is unneeded too. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nicolas Schier <nicolas@fjasle.eu> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
2022-05-13um: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld1-7/+2
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. This is accomplished by just including the asm-generic code like on other architectures, which means we can get rid of the empty stub function here. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-11ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEPEric W. Biederman5-8/+10
User mode linux is the last user of the PT_DTRACE flag. Using the flag to indicate single stepping is a little confusing and worse changing tsk->ptrace without locking could potentionally cause problems. So use a thread info flag with a better name instead of flag in tsk->ptrace. Remove the definition PT_DTRACE as uml is the last user. Cc: stable@vger.kernel.org Acked-by: Johannes Berg <johannes@sipsolutions.net> Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-3-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-05-08um: vector: switch to netif_napi_add_weight()Jakub Kicinski1-1/+2
UM's netdev driver uses a custom napi weight, switch to the new API for setting custom weight. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-07fork: Generalize PF_IO_WORKER handlingEric W. Biederman1-6/+4
Add fn and fn_arg members into struct kernel_clone_args and test for them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER). This allows any task that wants to be a user space task that only runs in kernel mode to use this functionality. The code on x86 is an exception and still retains a PF_KTHREAD test because x86 unlikely everything else handles kthreads slightly differently than user space tasks that start with a function. The functions that created tasks that start with a function have been updated to set ".fn" and ".fn_arg" instead of ".stack" and ".stack_size". These functions are fork_idle(), create_io_thread(), kernel_thread(), and user_mode_thread(). Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-05-07fork: Pass struct kernel_clone_args into copy_threadEric W. Biederman1-2/+5
With io_uring we have started supporting tasks that are for most purposes user space tasks that exclusively run code in kernel mode. The kernel task that exec's init and tasks that exec user mode helpers are also user mode tasks that just run kernel code until they call kernel execve. Pass kernel_clone_args into copy_thread so these oddball tasks can be supported more cleanly and easily. v2: Fix spelling of kenrel_clone_args on h8300 Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-05-03ubd: don't set the discard_alignment queue limitChristoph Hellwig1-1/+0
The discard_alignment queue limit is named a bit misleading means the offset into the block device at which the discard granularity starts. Setting it to the discard granularity as done by ubd is mostly harmless but also useless. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220418045314.360785-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17block: remove QUEUE_FLAG_DISCARDChristoph Hellwig1-2/+0
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-02Merge tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds1-4/+0
Pull Kbuild fixes from Masahiro Yamada: - Fix empty $(PYTHON) expansion. - Fix UML, which got broken by the attempt to suppress Clang warnings. - Fix warning message in modpost. * tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: restore the warning message for missing symbol versions Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS" kbuild: Remove '-mno-global-merge' kbuild: fix empty ${PYTHON} in scripts/link-vmlinux.sh kconfig: remove stale comment about removed kconfig_print_symbol()
2022-04-01Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-1/+0
Pull vfs updates from Al Viro: "Assorted bits and pieces" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: drop needless assignment in aio_read() clean overflow checks in count_mounts() a bit seq_file: fix NULL pointer arithmetic warning uml/x86: use x86 load_unaligned_zeropad() asm/user.h: killed unused macros constify struct path argument of finish_automount()/do_add_mount() fs: Remove FIXME comment in generic_write_checks()
2022-04-02Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS"Nathan Chancellor1-4/+0
This reverts commit 6580c5c18fb3df2b11c5e0452372f815deeff895. This patch is buggy, as noted in the patch linked below. The root cause has been solved by removing '-mno-global-merge' for the entire kernel. Link: https://lore.kernel.org/r/20220322173547.677760-1-nathan@kernel.org/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-03-31Merge tag 'for-linus-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/umlLinus Torvalds14-70/+102
Pull UML updates from Richard Weinberger: - Devicetree support (for testing) - Various cleanups and fixes: UBD, port_user, uml_mconsole - Maintainer update * tag 'for-linus-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: run_helper: Write error message to kernel log on exec failure on host um: port_user: Improve error handling when port-helper is not found um: port_user: Allow setting path to port-helper using UML_PORT_HELPER envvar um: port_user: Search for in.telnetd in PATH um: clang: Strip out -mno-global-merge from USER_CFLAGS docs: UML: Mention telnetd for port channel um: Remove unused timeval_to_ns() function um: Fix uml_mconsole stop/go um: Cleanup syscall_handler_t definition/cast, fix warning uml: net: vector: fix const issue um: Fix WRITE_ZEROES in the UBD Driver um: Migrate vector drivers to NAPI um: Fix order of dtb unflatten/early init um: fix and optimize xor select template for CONFIG64 and timetravel mode um: Document dtb command line option lib/logic_iomem: correct fallback config references um: Remove duplicated include in syscalls_64.c MAINTAINERS: Update UserModeLinux entry
2022-03-28Merge tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds2-5/+4
Pull ptrace cleanups from Eric Biederman: "This set of changes removes tracehook.h, moves modification of all of the ptrace fields inside of siglock to remove races, adds a missing permission check to ptrace.c The removal of tracehook.h is quite significant as it has been a major source of confusion in recent years. Much of that confusion was around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled making the semantics clearer). For people who don't know tracehook.h is a vestiage of an attempt to implement uprobes like functionality that was never fully merged, and was later superseeded by uprobes when uprobes was merged. For many years now we have been removing what tracehook functionaly a little bit at a time. To the point where anything left in tracehook.h was some weird strange thing that was difficult to understand" * tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ptrace: Remove duplicated include in ptrace.c ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE ptrace: Return the signal to continue with from ptrace_stop ptrace: Move setting/clearing ptrace_message into ptrace_stop tracehook: Remove tracehook.h resume_user_mode: Move to resume_user_mode.h resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume signal: Move set_notify_signal and clear_notify_signal into sched/signal.h task_work: Decouple TIF_NOTIFY_SIGNAL and task_work task_work: Call tracehook_notify_signal from get_signal on all architectures task_work: Introduce task_work_pending task_work: Remove unnecessary include from posix_timers.h ptrace: Remove tracehook_signal_handler ptrace: Remove arch_syscall_{enter,exit}_tracehook ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h ptrace/arm: Rename tracehook_report_syscall report_syscall ptrace: Move ptrace_report_syscall into ptrace.h
2022-03-28Merge tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds1-0/+1
Pull char/misc and other driver updates from Greg KH: "Here is the big set of char/misc and other small driver subsystem updates for 5.18-rc1. Included in here are merges from driver subsystems which contain: - iio driver updates and new drivers - fsi driver updates - fpga driver updates - habanalabs driver updates and support for new hardware - soundwire driver updates and new drivers - phy driver updates and new drivers - coresight driver updates - icc driver updates Individual changes include: - mei driver updates - interconnect driver updates - new PECI driver subsystem added - vmci driver updates - lots of tiny misc/char driver updates All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits) firmware: google: Properly state IOMEM dependency kgdbts: fix return value of __setup handler firmware: sysfb: fix platform-device leak in error path firmware: stratix10-svc: add missing callback parameter on RSU arm64: dts: qcom: add non-secure domain property to fastrpc nodes misc: fastrpc: Add dma handle implementation misc: fastrpc: Add fdlist implementation misc: fastrpc: Add helper function to get list and page misc: fastrpc: Add support to secure memory map dt-bindings: misc: add fastrpc domain vmid property misc: fastrpc: check before loading process to the DSP misc: fastrpc: add secure domain support dt-bindings: misc: add property to support non-secure DSP misc: fastrpc: Add support to get DSP capabilities misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP misc: fastrpc: separate fastrpc device from channel context dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells dt-bindings: nvmem: make "reg" property optional nvmem: brcm_nvram: parse NVRAM content into NVMEM cells nvmem: dt-bindings: Fix the error of dt-bindings check ...
2022-03-27Merge tag 'x86_core_for_5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+4
Pull x86 CET-IBT (Control-Flow-Integrity) support from Peter Zijlstra: "Add support for Intel CET-IBT, available since Tigerlake (11th gen), which is a coarse grained, hardware based, forward edge Control-Flow-Integrity mechanism where any indirect CALL/JMP must target an ENDBR instruction or suffer #CP. Additionally, since Alderlake (12th gen)/Sapphire-Rapids, speculation is limited to 2 instructions (and typically fewer) on branch targets not starting with ENDBR. CET-IBT also limits speculation of the next sequential instruction after the indirect CALL/JMP [1]. CET-IBT is fundamentally incompatible with retpolines, but provides, as described above, speculation limits itself" [1] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html * tag 'x86_core_for_5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) kvm/emulate: Fix SETcc emulation for ENDBR x86/Kconfig: Only allow CONFIG_X86_KERNEL_IBT with ld.lld >= 14.0.0 x86/Kconfig: Only enable CONFIG_CC_HAS_IBT for clang >= 14.0.0 kbuild: Fixup the IBT kbuild changes x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy x86: Remove toolchain check for X32 ABI capability x86/alternative: Use .ibt_endbr_seal to seal indirect calls objtool: Find unused ENDBR instructions objtool: Validate IBT assumptions objtool: Add IBT/ENDBR decoding objtool: Read the NOENDBR annotation x86: Annotate idtentry_df() x86,objtool: Move the ASM_REACHABLE annotation to objtool.h x86: Annotate call_on_stack() objtool: Rework ASM_REACHABLE x86: Mark __invalid_creds() __noreturn exit: Mark do_group_exit() __noreturn x86: Mark stop_this_cpu() __noreturn objtool: Ignore extra-symbol code objtool: Rename --duplicate to --lto ...
2022-03-24Merge tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linuxLinus Torvalds1-1/+1
Pull flexible-array transformations from Gustavo Silva: "Treewide patch that replaces zero-length arrays with flexible-array members. This has been baking in linux-next for a whole development cycle" * tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: treewide: Replace zero-length arrays with flexible-array members
2022-03-23Merge tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-genericLinus Torvalds1-4/+3
Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...
2022-03-22Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds1-0/+1
Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...
2022-03-21arch: Add pmd_pfn() where it is missingMike Rapoport1-0/+1
We need to use this function in common code, so define it for architectures and/or configrations that miss it. The result of pmd_pfn() will only be used if TRANSPARENT_HUGEPAGE is enabled, but a function or macro called pmd_pfn() must be defined, even on machines with two level page tables. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-21um: Allow builds with ClangKees Cook1-0/+1
Add SUBARCH target for Clang+um (which must go last, not alphabetically, so the other SUBARCHes are assigned). Remove open-coded "DEFINE" macro, instead using linux/kbuild.h's version which was updated to use Clang-friendly assembly in commit cf0c3e68aa81 ("kbuild: fix asm-offset generation to work with clang"). Redefine "DEFINE_LONGS" in terms of "COMMENT" and "DEFINE" so that the intended coment actually has useful content. Add a missed "break" to avoid implicit fall-through warnings. This lets me run KUnit tests with Clang: $ ./tools/testing/kunit/kunit.py run --make_options LLVM=1 ... Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: David Gow <davidgow@google.com> Cc: linux-um@lists.infradead.org Cc: linux-kbuild@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: kunit-dev@googlegroups.com Cc: llvm@lists.linux.dev Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/lkml/Yg2YubZxvYvx7%2Fnm@dev-arch.archlinux-ax161/ Tested-by: David Gow <davidgow@google.com> Link: https://lore.kernel.org/lkml/CABVgOSk=oFxsbSbQE-v65VwR2+mXeGXDDjzq8t7FShwjJ3+kUg@mail.gmail.com/ Signed-off-by: Kees Cook <keescook@chromium.org> --- v1: https://lore.kernel.org/lkml/20220217002843.2312603-1-keescook@chromium.org v2: https://lore.kernel.org/lkml/20220224055831.1854786-1-keescook@chromium.org v3: - use kbuild.h to avoid duplication (Masahiro) - fix intended comments (Masahiro) - use SUBARCH (Nathan)
2022-03-18parport_pc: Also enable driver for PCI systemsMaciej W. Rozycki1-0/+1
Nowadays PC-style parallel ports come in the form of PCI and PCIe option cards and there are some combined parallel/serial option cards as well that we handle in the parport subsystem. There is nothing in particular that would prevent them from being used in any system equipped with PCI or PCIe connectivity, except that we do not permit the PARPORT_PC config option to be selected for platforms for which ARCH_MIGHT_HAVE_PC_PARPORT has not been set for. The only PCI platforms that actually can't make use of PC-style parallel port hardware are those newer PCIe systems that have no support for I/O cycles in the host bridge, required by such parallel ports. Notably, this includes the s390 arch, which has port I/O accessors that cause compilation warnings (promoted to errors with `-Werror'), and there are other cases such as the POWER9 PHB4 device, though this one has variable port I/O accessors that depend on the particular system. Also it is not clear whether the serial port side of devices enabled by PARPORT_SERIAL uses port I/O or MMIO. Finally Super I/O solutions are always either ISA or platform devices. Make the PARPORT_PC option selectable also for PCI systems then, except for the s390 arch, however limit the availability of PARPORT_PC_SUPERIO to platforms that enable ARCH_MIGHT_HAVE_PC_PARPORT. Update platforms accordingly for the required <asm/parport.h> header. Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2202141955550.34636@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-15x86/alternative: Use .ibt_endbr_seal to seal indirect callsPeter Zijlstra1-0/+4
Objtool's --ibt option generates .ibt_endbr_seal which lists superfluous ENDBR instructions. That is those instructions for which the function is never indirectly called. Overwrite these ENDBR instructions with a NOP4 such that these function can never be indirect called, reducing the number of viable ENDBR targets in the kernel. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154319.822545231@infradead.org
2022-03-11um: run_helper: Write error message to kernel log on exec failure on hostGlenn Washburn1-0/+5
The best place to log errors from the host side is in the kernel log within the UML guest. Letting the user now that exec() failed and why is very helpful when the user is trying to determine why some aspect of UML is not working. For instance, when telneting into the UML instance, if the connection is established and then immediately dropped, this may be due to exec() failing because in.telnetd is not found. Signed-off-by: Glenn Washburn <development@efficientek.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-03-11um: port_user: Improve error handling when port-helper is not foundGlenn Washburn1-0/+12
Check if port-helper exists and is executable. If not, write an error message to the kernel log with information to help the user diagnose the issue and exit with an error. If UML_PORT_HELPER was not set, write a message suggesting that the user set it. This makes it easier to understand why telneting to the UML instance is failing and what can be done to fix it. Signed-off-by: Glenn Washburn <development@efficientek.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-03-11um: port_user: Allow setting path to port-helper using UML_PORT_HELPER envvarGlenn Washburn1-0/+4
This is useful when the uml-utilities user-space package has not been installed by the administrator and an unprivileged user wants to be able to telnet into a UML instance. The user can install the port-helper binary to a writable path and set UML_PORT_HELPER. Fallback to using hardcoded path to port-helper if environment variable is not set. Signed-off-by: Glenn Washburn <development@efficientek.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-03-11um: port_user: Search for in.telnetd in PATHGlenn Washburn1-1/+1
This allows in.telnetd to be run from non-standard installation locations and is especially useful when running a UML instance as an unprivileged user on a system where the administrator has not installed the in.telnetd binary. Signed-off-by: Glenn Washburn <development@efficientek.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-03-11um: clang: Strip out -mno-global-merge from USER_CFLAGSDavid Gow1-0/+4
The things built with USER_CFLAGS don't seem to recognise it as a compiler option, and print a warning: clang: warning: argument unused during compilation: '-mno-global-merge' [-Wunused-command-line-argument] Fixes: 744814d2fa ("um: Allow builds with Clang") Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Richard Weinberger <richard@nod.at>