linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-05-09	binder: add BINDER_GET_EXTENDED_ERROR ioctl	Carlos Llamas	1	-0/+16
	Provide a userspace mechanism to pull precise error information upon failed operations. Extending the current error codes returned by the interfaces allows userspace to better determine the course of action. This could be for instance, retrying a failed transaction at a later point and thus offloading the error handling from the driver. Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org> Acked-by: Todd Kjos <tkjos@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20220429235644.697372-3-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09	io_uring: support CQE32 in io_uring_cqe	Stefan Roesch	1	-0/+7
	This adds the big_cqe array to the struct io_uring_cqe to support large CQE's. Co-developed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20220426182134.136504-2-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-09	io_uring: add support for 128-byte SQEs	Jens Axboe	1	-0/+8
	Normal SQEs are 64-bytes in length, which is fine for all the commands we support. However, in preparation for supporting passthrough IO, provide an option for setting up a ring with 128-byte SQEs. We continue to use the same type for io_uring_sqe, it's marked and commented with a zero sized array pad at the end. This provides up to 80 bytes of data for a passthrough command - 64 bytes for the extra added data, and 16 bytes available at the end of the existing SQE. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-09	Merge branch 'for-5.19/io_uring-socket' into for-5.19/io_uring-passthrough	Jens Axboe	1	-2/+9
	* for-5.19/io_uring-socket: io_uring: use the text representation of ops in trace io_uring: rename op -> opcode io_uring: add io_uring_get_opcode io_uring: add type to op enum io_uring: add socket(2) support net: add __sys_socket_file() io_uring: fix trace for reduced sqe padding io_uring: add fgetxattr and getxattr support io_uring: add fsetxattr and setxattr support fs: split off do_getxattr from getxattr fs: split off setxattr_copy and do_setxattr function from setxattr
2022-05-09	Merge branch 'for-5.19/io_uring' into for-5.19/io_uring-passthrough	Jens Axboe	1	-0/+37
	* for-5.19/io_uring: (85 commits) io_uring: don't clear req->kbuf when buffer selection is done io_uring: eliminate the need to track provided buffer ID separately io_uring: move provided buffer state closer to submit state io_uring: move provided and fixed buffers into the same io_kiocb area io_uring: abstract out provided buffer list selection io_uring: never call io_buffer_select() for a buffer re-select io_uring: get rid of hashed provided buffer groups io_uring: always use req->buf_index for the provided buffer group io_uring: ignore ->buf_index if REQ_F_BUFFER_SELECT isn't set io_uring: kill io_rw_buffer_select() wrapper io_uring: make io_buffer_select() return the user address directly io_uring: kill io_recv_buffer_select() wrapper io_uring: use 'sr' vs 'req->sr_msg' consistently io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg io_uring: check IOPOLL/ioprio support upfront io_uring: replace smp_mb() with smp_mb__after_atomic() in io_sq_thread() io_uring: add IORING_SETUP_TASKRUN_FLAG io_uring: use TWA_SIGNAL_NO_IPI if IORING_SETUP_COOP_TASKRUN is used io_uring: set task_work notify method at init time io-wq: use __set_notify_signal() to wake workers ...
2022-05-09	rfkill: uapi: fix RFKILL_IOCTL_MAX_SIZE ioctl request definition	Gleb Fotengauer-Malinovskiy	1	-1/+1
	The definition of RFKILL_IOCTL_MAX_SIZE introduced by commit 54f586a91532 ("rfkill: make new event layout opt-in") is unusable since it is based on RFKILL_IOC_EXT_SIZE which has not been defined. Fix that by replacing the undefined constant with the constant which is intended to be used in this definition. Fixes: 54f586a91532 ("rfkill: make new event layout opt-in") Cc: stable@vger.kernel.org # 5.11+ Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Link: https://lore.kernel.org/r/20220506172454.120319-1-glebfm@altlinux.org [add commit message provided later by Dmitry] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-09	landlock: Add clang-format exceptions	Mickaël Salaün	1	-0/+4
	In preparation to a following commit, add clang-format on and clang-format off stanzas around constant definitions. This enables to keep aligned values, which is much more readable than packed definitions. Link: https://lore.kernel.org/r/20220506160513.523257-2-mic@digikod.net Cc: stable@vger.kernel.org Signed-off-by: Mickaël Salaün <mic@digikod.net>
2022-05-06	Merge tag 'tee-cleanup-for-v5.19' of https://git.linaro.org/people/jens.wiklander/linux-tee into arm/drivers	Arnd Bergmann	1	-4/+0
	TEE cleanup Removes the old and unused TEE_IOCTL_SHM_* flags Removes unused the unused tee_shm_va2pa() and tee_shm_pa2va() functions * tag 'tee-cleanup-for-v5.19' of https://git.linaro.org/people/jens.wiklander/linux-tee: tee: remove flags TEE_IOCTL_SHM_MAPPED and TEE_IOCTL_SHM_DMA_BUF tee: remove tee_shm_va2pa() and tee_shm_pa2va() Link: https://lore.kernel.org/r/20220506070328.GA1344495@jade Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-05-05	io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg	Jens Axboe	1	-0/+10
	If IORING_RECVSEND_POLL_FIRST is set for recv/recvmsg or send/sendmsg, then we arm poll first rather than attempt a receive or send upfront. This can be useful if we expect there to be no data (or space) available for the request, as we can then avoid wasting time on the initial issue attempt. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-05	Revert "Merge branch 'mlxsw-line-card-model'"	Jakub Kicinski	1	-5/+0
	This reverts commit 5e927a9f4b9f29d78a7c7d66ea717bb5c8bbad8e, reversing changes made to cfc1d91a7d78cf9de25b043d81efcc16966d55b3. The discussion is still ongoing so let's remove the uAPI until the discussion settles. Link: https://lore.kernel.org/all/20220425090021.32e9a98f@kernel.org/ Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220504154037.539442-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-2/+10
	tools/testing/selftests/net/forwarding/Makefile f62c5acc800e ("selftests/net/forwarding: add missing tests to Makefile") 50fe062c806e ("selftests: forwarding: new test, verify host mdb entries") https://lore.kernel.org/all/20220502111539.0b7e4621@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04	cfg80211: support disabling EHT mode	Muna Sinada	1	-0/+2
	Allow userspace to disable EHT mode during association. Signed-off-by: Muna Sinada <quic_msinada@quicinc.com> Signed-off-by: Aloka Dixit <quic_alokad@quicinc.com> Link: https://lore.kernel.org/r/20220323224636.20211-1-quic_alokad@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-04	mptcp: netlink: allow userspace-driven subflow establishment	Florian Westphal	1	-0/+3
	This allows userspace to tell kernel to add a new subflow to an existing mptcp connection. Userspace provides the token to identify the mptcp-level connection that needs a change in active subflows and the local and remote addresses of the new or the to-be-removed subflow. MPTCP_PM_CMD_SUBFLOW_CREATE requires the following parameters: { token, { loc_id, family, loc_addr4 \| loc_addr6 }, { family, rem_addr4 \| rem_addr6, rem_port } MPTCP_PM_CMD_SUBFLOW_DESTROY requires the following parameters: { token, { family, loc_addr4 \| loc_addr6, loc_port }, { family, rem_addr4 \| rem_addr6, rem_port } Acked-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04	mptcp: netlink: Add MPTCP_PM_CMD_REMOVE	Kishen Maloor	1	-0/+2
	This change adds a MPTCP netlink command for issuing a REMOVE_ADDR signal for an address over the chosen MPTCP connection from a userspace path manager. The command requires the following parameters: {token, loc_id}. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04	mptcp: netlink: Add MPTCP_PM_CMD_ANNOUNCE	Kishen Maloor	1	-0/+2
	This change adds a MPTCP netlink interface for issuing ADD_ADDR advertisements over the chosen MPTCP connection from a userspace path manager. The command requires the following parameters: { token, { loc_id, family, daddr4 \| daddr6 [, dport] } [, if_idx], flags[signal] }. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04	gpiolib: cdev: Add hardware timestamp clock type	Dipen Patel	1	-0/+3
	This patch adds new clock type for the GPIO controller which can timestamp gpio lines in using hardware means. To expose such functionalities to the userspace, code has been added where during line create or set config API calls, it checks for new clock type and if requested, calls HTE API. During line change event, the HTE subsystem pushes timestamp data to userspace through gpiolib-cdev. Signed-off-by: Dipen Patel <dipenp@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-05-04	Merge remote-tracking branch 'arm64/for-next/sme' into kvmarm-master/next	Marc Zyngier	2	-0/+11
	Merge arm64's SME branch to resolve conflicts with the WFxT branch. Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04	KVM: arm64: Implement PSCI SYSTEM_SUSPEND	Oliver Upton	1	-0/+2
	ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows software to request that a system be placed in the deepest possible low-power state. Effectively, software can use this to suspend itself to RAM. Unfortunately, there really is no good way to implement a system-wide PSCI call in KVM. Any precondition checks done in the kernel will need to be repeated by userspace since there is no good way to protect a critical section that spans an exit to userspace. SYSTEM_RESET and SYSTEM_OFF are equally plagued by this issue, although no users have seemingly cared for the relatively long time these calls have been supported. The solution is to just make the whole implementation userspace's problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND. Additionally, add a CAP to get buy-in from userspace for this new exit type. Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in. If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide explicit documentation of userspace's responsibilites for the exit and point to the PSCI specification to describe the actual PSCI call. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-8-oupton@google.com
2022-05-04	KVM: arm64: Add support for userspace to suspend a vCPU	Oliver Upton	1	-0/+2
	Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU is in a suspended state. In the suspended state the vCPU will block until a wakeup event (pending interrupt) is recognized. Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to userspace that KVM has recognized one such wakeup event. It is the responsibility of userspace to then make the vCPU runnable, or leave it suspended until the next wakeup event. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-7-oupton@google.com
2022-05-03	mptcp: expose server_side attribute in MPTCP netlink events	Kishen Maloor	1	-0/+1
	This change records the 'server_side' attribute of MPTCP_EVENT_CREATED and MPTCP_EVENT_ESTABLISHED events to inform their recipient about the Client/Server role of the running MPTCP application. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/246 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03	seccomp: Add wait_killable semantic to seccomp user notifier	Sargun Dhillon	1	-0/+2
	This introduces a per-filter flag (SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV) that makes it so that when notifications are received by the supervisor the notifying process will transition to wait killable semantics. Although wait killable isn't a set of semantics formally exposed to userspace, the concept is searchable. If the notifying process is signaled prior to the notification being received by the userspace agent, it will be handled as normal. One quirk about how this is handled is that the notifying process only switches to TASK_KILLABLE if it receives a wakeup from either an addfd or a signal. This is to avoid an unnecessary wakeup of the notifying task. The reasons behind switching into wait_killable only after userspace receives the notification are: * Avoiding unncessary work - Often, workloads will perform work that they may abort (request racing comes to mind). This allows for syscalls to be aborted safely prior to the notification being received by the supervisor. In this, the supervisor doesn't end up doing work that the workload does not want to complete anyways. * Avoiding side effects - We don't want the syscall to be interruptible once the supervisor starts doing work because it may not be trivial to reverse the operation. For example, unmounting a file system may take a long time, and it's hard to rollback, or treat that as reentrant. * Avoid breaking runtimes - Various runtimes do not GC when they are during a syscall (or while running native code that subsequently calls a syscall). If many notifications are blocked, and not picked up by the supervisor, this can get the application into a bad state. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220503080958.20220-2-sargun@sargun.me
2022-05-01	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-1/+9
	Pull kvm fixes from Paolo Bonzini: "ARM: - Take care of faults occuring between the PARange and IPA range by injecting an exception - Fix S2 faults taken from a host EL0 in protected mode - Work around Oops caused by a PMU access from a 32bit guest when PMU has been created. This is a temporary bodge until we fix it for good. x86: - Fix potential races when walking host page table - Fix shadow page table leak when KVM runs nested - Work around bug in userspace when KVM synthesizes leaf 0x80000021 on older (pre-EPYC) or Intel processors Generic (but affects only RISC-V): - Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: work around QEMU issue with synthetic CPUID leaves Revert "x86/mm: Introduce lookup_address_in_mm()" KVM: x86/mmu: fix potential races when walking host page table KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT KVM: x86/mmu: Do not create SPTEs for GFNs that exceed host.MAXPHYADDR KVM: arm64: Inject exception on out-of-IPA-range translation fault KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not set KVM: arm64: Handle host stage-2 faults from 32-bit EL0
2022-05-01	net: phy: Add 10BASE-T1L support in phy-c45	Alexandru Tachici	1	-0/+10
	This patch is needed because the BASE-T1 uses different registers for status, control and advertisement to those already employed in the existing phy-c45 functions. Where required, genphy_c45 functions will now check whether the device supports BASE-T1 and use the specific registers instead: 45.2.7.19 BASE-T1 AN control register, 45.2.7.20 BASE-T1 AN status, 45.2.7.21 BASE-T1 AN advertisement register, 45.2.7.22 BASE-T1 AN LP Base Page ability register, 45.2.1.185 BASE-T1 PMA/PMD control register. Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01	net: phy: Add BaseT1 auto-negotiation registers	Alexandru Tachici	1	-0/+40
	Added BASE-T1 AN advertisement register (Registers 7.514, 7.515, and 7.516) and BASE-T1 AN LP Base Page ability register (Registers 7.517, 7.518, and 7.519). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01	net: phy: Add 10-BaseT1L registers	Alexandru Tachici	1	-0/+25
	The 802.3gc specification defines the 10-BaseT1L link mode for ethernet trafic on twisted wire pair. PMA status register can be used to detect if the phy supports 2.4 V TX level and PCS control register can be used to enable/disable PCS level loopback. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01	ethtool: Add 10base-T1L link mode entry	Alexandru Tachici	1	-0/+1
	Add entry for the 10base-T1L full duplex mode. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30	io_uring: add IORING_SETUP_TASKRUN_FLAG	Jens Axboe	1	-0/+7
	If IORING_SETUP_COOP_TASKRUN is set to use cooperative scheduling for running task_work, then IORING_SETUP_TASKRUN_FLAG can be set so the application can tell if task_work is pending in the kernel for this ring. This allows use cases like io_uring_peek_cqe() to still function appropriately, or for the task to know when it would be useful to call io_uring_wait_cqe() to run pending events. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-7-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-30	io_uring: use TWA_SIGNAL_NO_IPI if IORING_SETUP_COOP_TASKRUN is used	Jens Axboe	1	-0/+8
	If this is set, io_uring will never use an IPI to deliver a task_work notification. This can be used in the common case where a single task or thread communicates with the ring, and doesn't rely on io_uring_cqe_peek(). This provides a noticeable win in performance, both from eliminating the IPI itself, but also from avoiding interrupting the submitting task unnecessarily. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-6-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-29	io_uring: return hint on whether more data is available after receive	Jens Axboe	1	-0/+2
	For now just use a CQE flag for this, with big CQE support we could return the actual number of bytes left. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-29	Merge branch 'for-5.19/io_uring-socket' into for-5.19/io_uring-net	Jens Axboe	1	-2/+21
	* for-5.19/io_uring-socket: (73 commits) io_uring: use the text representation of ops in trace io_uring: rename op -> opcode io_uring: add io_uring_get_opcode io_uring: add type to op enum io_uring: add socket(2) support net: add __sys_socket_file() io_uring: fix trace for reduced sqe padding io_uring: add fgetxattr and getxattr support io_uring: add fsetxattr and setxattr support fs: split off do_getxattr from getxattr fs: split off setxattr_copy and do_setxattr function from setxattr io_uring: return an error when cqe is dropped io_uring: use constants for cq_overflow bitfield io_uring: rework io_uring_enter to simplify return value io_uring: trace cqe overflows io_uring: add trace support for CQE overflow io_uring: allow re-poll if we made progress io_uring: support MSG_WAITALL for IORING_OP_SEND(MSG) io_uring: add support for IORING_ASYNC_CANCEL_ANY io_uring: allow IORING_OP_ASYNC_CANCEL with 'fd' key ...
2022-04-29	taskstats: version 12 with thread group and exe info	Dr. Thomas Orgis	2	-4/+23
	The task exit struct needs some crucial information to be able to provide an enhanced version of process and thread accounting. This change provides: 1. ac_tgid in additon to ac_pid 2. thread group execution walltime in ac_tgetime 3. flag AGROUP in ac_flag to indicate the last task in a thread group / process 4. device ID and inode of task's /proc/self/exe in ac_exe_dev and ac_exe_inode 5. tools/accounting/procacct as demonstrator When a task exits, taskstats are reported to userspace including the task's pid and ppid, but without the id of the thread group this task is part of. Without the tgid, the stats of single tasks cannot be correlated to each other as a thread group (process). The taskstats documentation suggests that on process exit a data set consisting of accumulated stats for the whole group is produced. But such an additional set of stats is only produced for actually multithreaded processes, not groups that had only one thread, and also those stats only contain data about delay accounting and not the more basic information about CPU and memory resource usage. Adding the AGROUP flag to be set when the last task of a group exited enables determination of process end also for single-threaded processes. My applicaton basically does enhanced process accounting with summed cputime, biggest maxrss, tasks per process. The data is not available with the traditional BSD process accounting (which is not designed to be extensible) and the taskstats interface allows more efficient on-the-fly grouping and summing of the stats, anyway, without intermediate disk writes. Furthermore, I do carry statistics on which exact program binary is used how often with associated resources, getting a picture on how important which parts of a collection of installed scientific software in different versions are, and how well they put load on the machine. This is enabled by providing information on /proc/self/exe for each task. I assume the two 64-bit fields for device ID and inode are more appropriate than the possibly large resolved path to keep the data volume down. Add the tgid to the stats to complete task identification, the flag AGROUP to mark the last task of a group, the group wallclock time, and inode-based identification of the associated executable file. Add tools/accounting/procacct.c as a simplified fork of getdelays.c to demonstrate process and thread accounting. [thomas.orgis@uni-hamburg.de: fix version number in comment] Link: https://lkml.kernel.org/r/20220405003601.7a5f6008@plasteblaster Link: https://lkml.kernel.org/r/20220331004106.64e5616b@plasteblaster Signed-off-by: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de> Reviewed-by: Ismael Luceno <ismael@iodev.co.uk> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux	Linus Torvalds	1	-1/+1
	Pull arm64 fix from Will Deacon: "Rename and reallocate the PT_ARM_MEMTAG_MTE ELF segment type. This is a fix to the MTE ELF ABI for a bug that was added during the most recent merge window as part of the coredump support. The issue is that the value assigned to the new PT_ARM_MEMTAG_MTE segment type has already been allocated to PT_AARCH64_UNWIND by the ELF ABI, so we've bumped the value and changed the name of the identifier to be better aligned with the existing one" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: elf: Fix the arm64 MTE ELF segment name and value
2022-04-29	Merge branch 'kvm-fixes-for-5.18-rc5' into HEAD	Paolo Bonzini	1	-2/+7
	Fixes for (relatively) old bugs, to be merged in both the -rc and next development trees. The merge reconciles the ABI fixes for KVM_EXIT_SYSTEM_EVENT between 5.18 and commit c24a950ec7d6 ("KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES", 2022-04-13).
2022-04-29	Merge branch 'kvm-fixes-for-5.18-rc5' into HEAD	Paolo Bonzini	1	-1/+9
	Fixes for (relatively) old bugs, to be merged in both the -rc and next development trees: * Fix potential races when walking host page table * Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT * Fix shadow page table leak when KVM runs nested
2022-04-29	KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT	Paolo Bonzini	1	-1/+9
	When KVM_EXIT_SYSTEM_EVENT was introduced, it included a flags member that at the time was unused. Unfortunately this extensibility mechanism has several issues: - x86 is not writing the member, so it would not be possible to use it on x86 except for new events - the member is not aligned to 64 bits, so the definition of the uAPI struct is incorrect for 32- on 64-bit userspace. This is a problem for RISC-V, which supports CONFIG_KVM_COMPAT, but fortunately usage of flags was only introduced in 5.18. Since padding has to be introduced, place a new field in there that tells if the flags field is valid. To allow further extensibility, in fact, change flags to an array of 16 values, and store how many of the values are valid. The availability of the new ndata field is tied to a system capability; all architectures are changed to fill in the field. To avoid breaking compilation of userspace that was using the flags field, provide a userspace-only union to overlap flags with data[0]. The new field is placed at the same offset for both 32- and 64-bit userspace. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Message-Id: <20220422103013.34832-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-28	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-1/+22
	include/linux/netdevice.h net/core/dev.c 6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats") 794c24e9921f ("net-core: rx_otherhost_dropped to core_stats") https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/ drivers/net/wan/cosa.c d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()") 89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards") https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28	elf: Fix the arm64 MTE ELF segment name and value	Catalin Marinas	1	-1/+1
	Unfortunately, the name/value choice for the MTE ELF segment type (PT_ARM_MEMTAG_MTE) was pretty poor: LOPROC+1 is already in use by PT_AARCH64_UNWIND, as defined in the AArch64 ELF ABI (https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst). Update the ELF segment type value to LOPROC+2 and also change the define to PT_AARCH64_MEMTAG_MTE to match the AArch64 ELF ABI namespace. The AArch64 ELF ABI document is updating accordingly (segment type not previously mentioned in the document). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 761b9b366cec ("elf: Introduce the ARM MTE ELF segment type") Cc: Will Deacon <will@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Machado <luis.machado@arm.com> Cc: Richard Earnshaw <Richard.Earnshaw@arm.com> Link: https://lore.kernel.org/r/20220425151833.2603830-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-04-27	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	Jakub Kicinski	1	-0/+12
	Daniel Borkmann says: ==================== pull-request: bpf-next 2022-04-27 We've added 85 non-merge commits during the last 18 day(s) which contain a total of 163 files changed, 4499 insertions(+), 1521 deletions(-). The main changes are: 1) Teach libbpf to enhance BPF verifier log with human-readable and relevant information about failed CO-RE relocations, from Andrii Nakryiko. 2) Add typed pointer support in BPF maps and enable it for unreferenced pointers (via probe read) and referenced ones that can be passed to in-kernel helpers, from Kumar Kartikeya Dwivedi. 3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel. 4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced the effective prog array before the rcu_read_lock, from Stanislav Fomichev. 5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic for USDT arguments under riscv{32,64}, from Pu Lehui. 6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire. 7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage so it can be shipped under Alpine which is musl-based, from Dominique Martinet. 8) Clean up {sk,task,inode} local storage trace RCU handling as they do not need to use call_rcu_tasks_trace() barrier, from KP Singh. 9) Improve libbpf API documentation and fix error return handling of various API functions, from Grant Seltzer. 10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data length of frags + frag_list may surpass old offset limit, from Liu Jian. 11) Various improvements to prog_tests in area of logging, test execution and by-name subtest selection, from Mykola Lysenko. 12) Simplify map_btf_id generation for all map types by moving this process to build time with help of resolve_btfids infra, from Menglong Dong. 13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read() helpers; the probing caused always to use old helpers, from Runqing Yang. 14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS tracing macros, from Vladimir Isaev. 15) Cleanup BPF selftests to remove old & unneeded rlimit code given kernel switched to memcg-based memory accouting a while ago, from Yafang Shao. 16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu. 17) Fix BPF selftests in two occasions to work around regressions caused by latest LLVM to unblock CI until their fixes are worked out, from Yonghong Song. 18) Misc cleanups all over the place, from various others. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Add libbpf's log fixup logic selftests libbpf: Fix up verifier log for unguarded failed CO-RE relos libbpf: Simplify bpf_core_parse_spec() signature libbpf: Refactor CO-RE relo human description formatting routine libbpf: Record subprog-resolved CO-RE relocations unconditionally selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests libbpf: Avoid joining .BTF.ext data with BPF programs by section name libbpf: Fix logic for finding matching program for CO-RE relocation libbpf: Drop unhelpful "program too large" guess libbpf: Fix anonymous type check in CO-RE logic bpf: Compute map_btf_id during build time selftests/bpf: Add test for strict BTF type check selftests/bpf: Add verifier tests for kptr selftests/bpf: Add C tests for kptr libbpf: Add kptr type tag macros to bpf_helpers.h bpf: Make BTF type match stricter for release arguments bpf: Teach verifier about kptr_get kfunc helpers bpf: Wire up freeing of referenced kptr bpf: Populate pairs of btf_id and destructor kfunc in btf bpf: Adapt copy_map_value for multiple offset case ... ==================== Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27	net: atm: remove support for ZeitNet ZN122x ATM devices	Jakub Kicinski	1	-47/+0
	This driver received nothing but automated fixes in the last 15 years. Since it's using virt_to_bus it's unlikely to be used on any modern platform. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-26	Merge tag 'for-5.18/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev	Linus Torvalds	1	-1/+1
	Pull fbdev fixes and updates from Helge Deller: "A bunch of outstanding fbdev patches - all trivial and small" * tag 'for-5.18/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: video: fbdev: clps711x-fb: Use syscon_regmap_lookup_by_phandle video: fbdev: mmp: replace usage of found with dedicated list iterator variable video: fbdev: sh_mobile_lcdcfb: Remove sh_mobile_lcdc_check_var() declaration video: fbdev: i740fb: Error out if 'pixclock' equals zero video: fbdev: i740fb: use memset_io() to clear screen video: fbdev: s3fb: Error out if 'pixclock' equals zero video: fbdev: arkfb: Error out if 'pixclock' equals zero video: fbdev: tridentfb: Error out if 'pixclock' equals zero video: fbdev: vt8623fb: Error out if 'pixclock' equals zero video: fbdev: kyro: Error out if 'lineclock' equals zero video: fbdev: neofb: Fix the check of 'var->pixclock' video: fbdev: imxfb: Fix missing of_node_put in imxfb_probe video: fbdev: omap: Make it CCF clk API compatible video: fbdev: aty/matrox/...: Prepare cleanup of powerpc's asm/prom.h video: fbdev: pm2fb: Fix a kernel-doc formatting issue linux/fb.h: Spelling s/palette/palette/ video: fbdev: sis: fix potential NULL dereference in sisfb_post_sis300() video: fbdev: pxafb: use if else instead video: fbdev: udlfb: properly check endpoint type video: fbdev: of: display_timing: Remove a redundant zeroing of memory
2022-04-26	io_uring: add type to op enum	Dylan Yudaken	1	-1/+1
	It is useful to have a type enum for opcodes, to allow the compiler to assert that every value is used in a switch statement. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220426082907.3600028-2-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-26	tee: remove flags TEE_IOCTL_SHM_MAPPED and TEE_IOCTL_SHM_DMA_BUF	Andrew Davis	1	-4/+0
	These look to be leftover from an early edition of this driver. Userspace does not need this information. Checking all users of this that I have access to I have verified no one is using them. They leak internal use flags out to userspace. Even more they are not correct anymore after a45ea4efa358. Lets drop these flags before someone does try to use them for something and they become ABI. Signed-off-by: Andrew Davis <afd@ti.com> Acked-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2022-04-25	bpf: Allow storing referenced kptr in map	Kumar Kartikeya Dwivedi	1	-0/+12
	Extending the code in previous commits, introduce referenced kptr support, which needs to be tagged using 'kptr_ref' tag instead. Unlike unreferenced kptr, referenced kptr have a lot more restrictions. In addition to the type matching, only a newly introduced bpf_kptr_xchg helper is allowed to modify the map value at that offset. This transfers the referenced pointer being stored into the map, releasing the references state for the program, and returning the old value and creating new reference state for the returned pointer. Similar to unreferenced pointer case, return value for this case will also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer must either be eventually released by calling the corresponding release function, otherwise it must be transferred into another map. It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear the value, and obtain the old value if any. BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future commit will permit using BPF_LDX for such pointers, but attempt at making it safe, since the lifetime of object won't be guaranteed. There are valid reasons to enforce the restriction of permitting only bpf_kptr_xchg to operate on referenced kptr. The pointer value must be consistent in face of concurrent modification, and any prior values contained in the map must also be released before a new one is moved into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg returns the old value, which the verifier would require the user to either free or move into another map, and releases the reference held for the pointer being moved in. In the future, direct BPF_XCHG instruction may also be permitted to work like bpf_kptr_xchg helper. Note that process_kptr_func doesn't have to call check_helper_mem_access, since we already disallow rdonly/wronly flags for map, which is what check_map_access_type checks, and we already ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc, so check_map_access is also not required. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
2022-04-25	fanotify: implement "evictable" inode marks	Amir Goldstein	1	-0/+1
	When an inode mark is created with flag FAN_MARK_EVICTABLE, it will not pin the marked inode to inode cache, so when inode is evicted from cache due to memory pressure, the mark will be lost. When an inode mark with flag FAN_MARK_EVICATBLE is updated without using this flag, the marked inode is pinned to inode cache. When an inode mark is updated with flag FAN_MARK_EVICTABLE but an existing mark already has the inode pinned, the mark update fails with error EEXIST. Evictable inode marks can be used to setup inode marks with ignored mask to suppress events from uninteresting files or directories in a lazy manner, upon receiving the first event, without having to iterate all the uninteresting files or directories before hand. The evictbale inode mark feature allows performing this lazy marks setup without exhausting the system memory with pinned inodes. This change does not enable the feature yet. Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiRDpuS=2uA6+ZUM7yG9vVU-u212tkunBmSnP_u=mkv=Q@mail.gmail.com/ Link: https://lore.kernel.org/r/20220422120327.3459282-15-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-04-25	PCI: Add PCI_EXP_SLTCTL_ASPL_DISABLE macro	Pali Rohár	1	-0/+1
	Add macro defining Auto Slot Power Limit Disable bit in Slot Control Register. Link: https://lore.kernel.org/r/20220412094946.27069-2-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2022-04-25	devlink: introduce line card info get message	Jiri Pirko	1	-0/+2
	Allow the driver to provide per line card info get op to fill-up info, similar to the "devlink dev info". Example: $ devlink lc info pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 versions: fixed: hw.revision 0 running: ini.version 4 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25	devlink: introduce line card devices support	Jiri Pirko	1	-0/+3
	Line card can contain one or more devices that makes sense to make visible to the user. For example, this can be a gearbox with flash memory, which could be updated. Provide the driver possibility to attach such devices to a line card and expose those to user. Example: $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G devices: device 0 device 1 device 2 device 3 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-24	io_uring: add socket(2) support	Jens Axboe	1	-0/+1
	Supports both regular socket(2) where a normal file descriptor is instantiated when called, or direct descriptors. Link: https://lore.kernel.org/r/20220412202240.234207-3-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-24	io_uring: add fgetxattr and getxattr support	Stefan Roesch	1	-0/+2
	This adds support to io_uring for the fgetxattr and getxattr API. Signed-off-by: Stefan Roesch <shr@fb.com> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20220323154420.3301504-5-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-24	io_uring: add fsetxattr and setxattr support	Stefan Roesch	1	-1/+5
	This adds support to io_uring for the fsetxattr and setxattr API. Signed-off-by: Stefan Roesch <shr@fb.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20220323154420.3301504-4-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>