aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include/uapi/linux (follow)
AgeCommit message (Collapse)AuthorFilesLines
2024-10-29xfrm: Add support for per cpu xfrm state handling.Steffen Klassert1-0/+2
Currently all flows for a certain SA must be processed by the same cpu to avoid packet reordering and lock contention of the xfrm state lock. To get rid of this limitation, the IETF standardized per cpu SAs in RFC 9611. This patch implements the xfrm part of it. We add the cpu as a lookup key for xfrm states and a config option to generate acquire messages for each cpu. With that, we can have on each cpu a SA with identical traffic selector so that flows can be processed in parallel on all cpus. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Tested-by: Antony Antony <antony.antony@secunet.com> Tested-by: Tobias Brunner <tobias@strongswan.org>
2024-10-29iommu: Add new flag to explictly request PASID capable domainJason Gunthorpe1-0/+8
Introduce new flag (IOMMU_HWPT_ALLOC_PASID) to domain_alloc_users() ops. If IOMMU supports PASID it will allocate domain. Otherwise return error. In error path check for -EOPNOTSUPP and try to allocate non-PASID domain so that DMA-API mode work fine for drivers which does not support PASID as well. Also modify __iommu_group_alloc_default_domain() to call iommu_paging_domain_alloc_flags() with appropriate flag when allocating paging domain. Signed-off-by: Jason Gunthorpe <jgg@ziepe.ca> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241028093810.5901-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29Merge tag 'amd-drm-next-6.13-2024-10-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-nextDave Airlie1-3/+4
amd-drm-next-6.13-2024-10-25: amdgpu: - SDMA queue reset support - SMU 13.0.6 updates - Add debugfs interface to help limit jpeg queue scheduling for testing - JPEG 4.0.3 updates - Initial runtime repartitioning support - GFX9 fixes - Misc code cleanups - Rework IP structures to better handle multiple instances of an IP - DML updates - DSC fixes - HDR fixes - Brightness control updates - Runtime pm cleanup - DMCUB fixes - DCN 3.5 updates - Struct drm_edid cleanup - Fetch EDID from _DDC if available - Ring noop optimizations - MES logging fixes - 3DLUT fixes - DCN 4.x fixes - SMU 13.x fixes - Fixes for set_soft_freq_range() - ACPI fixes - SMU 14.x updates - PSR-SU fixes - fdinfo cleanup - DCN documentation updates amdkfd: - Misc code cleanups - Increase event FIFO size - Copy wave state fixes for SDMA radeon: - Fix possible overflow in packet3 check - Late init connector fix - Always set GEM function pointer Documentation: - Update drm-memory documentation From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241025132336.2416913-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-10-28drm/amdkfd: flag per-queue reset support for gfx9Jonathan Kim1-1/+2
Flag KFD support for per-queue reset on GFX9 devices. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-28iommufd: Add IOMMU_IOAS_MAP_FILESteve Sistare1-0/+25
Define the IOMMU_IOAS_MAP_FILE ioctl interface, which allows a user to register memory by passing a memfd plus offset and length. Implement it using the memfd_pin_folios() kAPI. Link: https://patch.msgid.link/r/1729861919-234514-8-git-send-email-steven.sistare@oracle.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-10-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni2-9/+12
Cross-merge networking fixes after downstream PR. No conflicts and no adjacent changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov2-9/+15
Cross-merge bpf fixes after downstream PR. No conflicts. Adjacent changes in: include/linux/bpf.h include/uapi/linux/bpf.h kernel/bpf/btf.c kernel/bpf/helpers.c kernel/bpf/syscall.c kernel/bpf/verifier.c kernel/trace/bpf_trace.c mm/slab_common.c tools/include/uapi/linux/bpf.h tools/testing/selftests/bpf/Makefile Link: https://lore.kernel.org/all/20241024215724.60017-1-daniel@iogearbox.net/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-24firmware/psci: Add definitions for PSCI v1.3 specificationDavid Woodhouse1-0/+5
The v1.3 PSCI spec (https://developer.arm.com/documentation/den0022) adds the SYSTEM_OFF2 function. Add definitions for it and its hibernation type parameter. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Miguel Luis <miguel.luis@oracle.com> Link: https://lore.kernel.org/r/20241019172459.2241939-2-dwmw2@infradead.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-10-24riscv: Allow ptrace control of the tagged address ABISamuel Holland1-0/+1
This allows a tracer to control the ABI of the tracee, as on arm64. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20241016202814.4061541-7-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24riscv: Add support for userspace pointer maskingSamuel Holland1-1/+4
RISC-V supports pointer masking with a variable number of tag bits (which is called "PMLEN" in the specification) and which is configured at the next higher privilege level. Wire up the PR_SET_TAGGED_ADDR_CTRL and PR_GET_TAGGED_ADDR_CTRL prctls so userspace can request a lower bound on the number of tag bits and determine the actual number of tag bits. As with arm64's PR_TAGGED_ADDR_ENABLE, the pointer masking configuration is thread-scoped, inherited on clone() and fork() and cleared on execve(). Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20241016202814.4061541-5-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24bpf: Add the missing BPF_LINK_TYPE invocation for sockmapHou Tao1-0/+3
There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for sockmap link Also add comments for bpf_link_type to prevent missing updates in the future. Fixes: 699c23f02c65 ("bpf: Add bpf_link support for sk_msg and sk_skb progs") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com
2024-10-24thermal: netlink: Add the commands and the events for the thresholdsDaniel Lezcano1-7/+20
The thresholds exist but there is no notification neither action code related to them yet. These changes implement the netlink for the notifications when the thresholds are crossed, added, deleted or flushed as well as the commands which allows to get the list of the thresholds, flush them, add and delete. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241022155147.463475-3-daniel.lezcano@linaro.org [ rjw: Use the thermal_zone guard for locking, subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24pidfd: add ioctl to retrieve pid infoLuca Boccassi1-0/+50
A common pattern when using pid fds is having to get information about the process, which currently requires /proc being mounted, resolving the fd to a pid, and then do manual string parsing of /proc/N/status and friends. This needs to be reimplemented over and over in all userspace projects (e.g.: I have reimplemented resolving in systemd, dbus, dbus-daemon, polkit so far), and requires additional care in checking that the fd is still valid after having parsed the data, to avoid races. Having a programmatic API that can be used directly removes all these requirements, including having /proc mounted. As discussed at LPC24, add an ioctl with an extensible struct so that more parameters can be added later if needed. Start with returning pid/tgid/ppid and creds unconditionally, and cgroupid optionally. Signed-off-by: Luca Boccassi <luca.boccassi@gmail.com> Link: https://lore.kernel.org/r/20241010155401.2268522-1-luca.boccassi@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-23wifi: cfg80211: add monitor SKIP_TX flagFelix Fietkau1-0/+2
This can be used to indicate that the user is not interested in receiving locally sent packets on the monitor interface. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/f0c20f832eadd36c71fba9a2a16ba57d78389b6c.1728462320.git-series.nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-23wifi: cfg80211: report per wiphy radio antenna maskFelix Fietkau1-0/+3
With multi-radio devices, each radio typically gets a fixed set of antennas. In order to be able to disable specific antennas for some radios, user space needs to know which antenna mask bits are assigned to which radio. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/e0a26afa2c88eaa188ec96ec6d17ecac4e827641.1728462320.git-series.nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-23wifi: cfg80211: add option for vif allowed radiosFelix Fietkau1-0/+5
This allows users to prevent a vif from affecting radios other than the configured ones. This can be useful in cases where e.g. an AP is running on one radio, and triggering a scan on another radio should not disturb it. Changing the allowed radios list for a vif is supported, but only while it is down. While it is possible to achieve the same by always explicitly specifying a frequency list for scan requests and ensuring that the wrong channel/band is never accidentally set on an unrelated interface, this change makes multi-radio wiphy setups a lot easier to deal with for CLI users. By itself, this patch only enforces the radio mask for scanning requests and remain-on-channel. Follow-up changes build on this to limit configured frequencies. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/eefcb218780f71a1549875d149f1196486762756.1728462320.git-series.nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-22Revert "drm/amdkfd: SMI report dropped event count"Alex Deucher1-6/+0
This reverts commit a3ab2d45b9887ee609cd3bea39f668236935774c. The userspace side for this code is not ready yet so revert for now. Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Philip Yang <Philip.Yang@amd.com>
2024-10-22block: sed-opal: add ioctl IOC_OPAL_SET_SID_PWGreg Joyce1-0/+1
After a SED drive is provisioned, there is no way to change the SID password via the ioctl() interface. A new ioctl IOC_OPAL_SET_SID_PW will allow the password to be changed. The valid current password is required. Signed-off-by: Greg Joyce <gjoyce@linux.ibm.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Link: https://lore.kernel.org/r/20240829175639.6478-2-gjoyce@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-22ublk: support device recovery without I/O queueingUday Shankar1-0/+18
ublk currently supports the following behaviors on ublk server exit: A: outstanding I/Os get errors, subsequently issued I/Os get errors B: outstanding I/Os get errors, subsequently issued I/Os queue C: outstanding I/Os get reissued, subsequently issued I/Os queue and the following behaviors for recovery of preexisting block devices by a future incarnation of the ublk server: 1: ublk devices stopped on ublk server exit (no recovery possible) 2: ublk devices are recoverable using start/end_recovery commands The userspace interface allows selection of combinations of these behaviors using flags specified at device creation time, namely: default behavior: A + 1 UBLK_F_USER_RECOVERY: B + 2 UBLK_F_USER_RECOVERY|UBLK_F_USER_RECOVERY_REISSUE: C + 2 The behavior A + 2 is currently unsupported. Add support for this behavior under the new flag combination UBLK_F_USER_RECOVERY|UBLK_F_USER_RECOVERY_FAIL_IO. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241007182419.3263186-5-ushankar@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-21reiserfs: The last commitJan Kara2-52/+0
Deprecation period of reiserfs ends with the end of this year so it is time to remove it from the kernel. Acked-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
2024-10-21Merge 6.12-rc4 into usb-nextGreg Kroah-Hartman2-9/+12
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-21Merge 6.12-rc4 into char-misc-nextGreg Kroah-Hartman2-9/+12
We need the iio fixes from 6.12-rc4 in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-18Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds1-8/+5
Pull bpf fixes from Daniel Borkmann: - Fix BPF verifier to not affect subreg_def marks in its range propagation (Eduard Zingerman) - Fix a truncation bug in the BPF verifier's handling of coerce_reg_to_size_sx (Dimitar Kanaliev) - Fix the BPF verifier's delta propagation between linked registers under 32-bit addition (Daniel Borkmann) - Fix a NULL pointer dereference in BPF devmap due to missing rxq information (Florian Kauer) - Fix a memory leak in bpf_core_apply (Jiri Olsa) - Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for arrays of nested structs (Hou Tao) - Fix build ID fetching where memory areas backing the file were created with memfd_secret (Andrii Nakryiko) - Fix BPF task iterator tid filtering which was incorrectly using pid instead of tid (Jordan Rome) - Several fixes for BPF sockmap and BPF sockhash redirection in combination with vsocks (Michal Luczaj) - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri) - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility of an infinite BPF tailcall (Pu Lehui) - Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot be resolved (Thomas Weißschuh) - Fix a bug in kfunc BTF caching for modules where the wrong BTF object was returned (Toke Høiland-Jørgensen) - Fix a BPF selftest compilation error in cgroup-related tests with musl libc (Tony Ambardar) - Several fixes to BPF link info dumps to fill missing fields (Tyrone Wu) - Add BPF selftests for kfuncs from multiple modules, checking that the correct kfuncs are called (Simon Sundberg) - Ensure that internal and user-facing bpf_redirect flags don't overlap (Toke Høiland-Jørgensen) - Switch to use kvzmalloc to allocate BPF verifier environment (Rik van Riel) - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat under RT (Wander Lairson Costa) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits) lib/buildid: Handle memfd_secret() files in build_id_parse() selftests/bpf: Add test case for delta propagation bpf: Fix print_reg_state's constant scalar dump bpf: Fix incorrect delta propagation between linked registers bpf: Properly test iter/task tid filtering bpf: Fix iter/task tid filtering riscv, bpf: Make BPF_CMPXCHG fully ordered bpf, vsock: Drop static vsock_bpf_prot initialization vsock: Update msg_count on read_skb() vsock: Update rx_bytes on read_skb() bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock selftests/bpf: Add asserts for netfilter link info bpf: Fix link info netfilter flags to populate defrag flag selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx() selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx() bpf: Fix truncation bug in coerce_reg_to_size_sx() selftests/bpf: Assert link info uprobe_multi count & path_size if unset bpf: Fix unpopulated path_size when uprobe_multi fields unset selftests/bpf: Fix cross-compiling urandom_read selftests/bpf: Add test for kfunc module order ...
2024-10-18Merge tag 'block-6.12-20241018' of git://git.kernel.dk/linuxLinus Torvalds1-1/+7
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Fix target passthrough identifier (Nilay) - Fix tcp locking (Hannes) - Replace list with sbitmap for tracking RDMA rsp tags (Guixen) - Remove unnecessary fallthrough statements (Tokunori) - Remove ready-without-media support (Greg) - Fix multipath partition scan deadlock (Keith) - Fix concurrent PCI reset and remove queue mapping (Maurizio) - Fabrics shutdown fixes (Nilay) - Fix for a kerneldoc warning (Keith) - Fix a race with blk-rq-qos and wakeups (Omar) - Cleanup of checking for always-set tag_set (SurajSonawane2415) - Fix for a crash with CPU hotplug notifiers (Ming) - Don't allow zero-copy ublk on unprivileged device (Ming) - Use array_index_nospec() for CDROM (Josh) - Remove dead code in drbd (David) - Tweaks to elevator loading (Breno) * tag 'block-6.12-20241018' of git://git.kernel.dk/linux: cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed() nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function nvme: make keep-alive synchronous operation nvme-loop: flush off pending I/O while shutting down loop controller nvme-pci: fix race condition between reset and nvme_dev_disable() ublk: don't allow user copy for unprivileged device blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race nvme-multipath: defer partition scanning blk-mq: setup queue ->tag_set before initializing hctx elevator: Remove argument from elevator_find_get elevator: do not request_module if elevator exists drbd: Remove unused conn_lowest_minor nvme: disable CC.CRIME (NVME_CC_CRIME) nvme: delete unnecessary fallthru comment nvmet-rdma: use sbitmap to replace rsp free list block: Fix elevator_get_default() checking for NULL q->tag_set nvme: tcp: avoid race between queue_lock lock and destroy nvmet-passthru: clear EUID/NGUID/UUID while using loop target block: fix blk_rq_map_integrity_sg kernel-doc
2024-10-16fanotify: allow reporting errors on failure to open fdAmir Goldstein1-0/+1
When working in "fd mode", fanotify_read() needs to open an fd from a dentry to report event->fd to userspace. Opening an fd from dentry can fail for several reasons. For example, when tasks are gone and we try to open their /proc files or we try to open a WRONLY file like in sysfs or when trying to open a file that was deleted on the remote network server. Add a new flag FAN_REPORT_FD_ERROR for fanotify_init(). For a group with FAN_REPORT_FD_ERROR, we will send the event with the error instead of the open fd, otherwise userspace may not get the error at all. For an overflow event, we report -EBADF to avoid confusing FAN_NOFD with -EPERM. Similarly for pidfd open errors we report either -ESRCH or the open error instead of FAN_NOPIDFD and FAN_EPIDFD. In any case, userspace will not know which file failed to open, so add a debug print for further investigation. Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com> Link: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241003142922.111539-1-amir73il@gmail.com
2024-10-16ublk: don't allow user copy for unprivileged deviceMing Lei1-1/+7
UBLK_F_USER_COPY requires userspace to call write() on ublk char device for filling request buffer, and unprivileged device can't be trusted. So don't allow user copy for unprivileged device. Cc: stable@vger.kernel.org Fixes: 1172d5b8beca ("ublk: support user copy") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241016134847.2911721-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-16usb: gadget: uvc: configfs: Add frame-based frame format supportAkash Kumar1-0/+58
Add support for frame-based frame format, which can be used to support multiple formats like H264 or H265, in addition to MJPEG and YUV frames. The frame-based format is set to H264 by default, but it can be updated to other formats by modifying the GUID through the guid configfs attribute. Different structures are used for all three formats, as H264 has a different structure compared to MJPEG and uncompressed formats. These structures will be passed to the frame make function based on the active format, using a common frame structure with additional parameters needed only for frame-based formats. These parameters are handled at runtime in the UVC driver. Signed-off-by: Akash Kumar <quic_akakum@quicinc.com> Link: https://lore.kernel.org/r/20240927152138.31416-1-quic_akakum@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16media: raspberrypi: Add support for RP1-CFETomi Valkeinen2-0/+337
Add support for Raspberry Pi CFE. The CFE is a hardware block that contains: - MIPI D-PHY - MIPI CSI-2 receiver - Front End ISP (FE) The driver has been upported from the Raspberry Pi kernel commit 88a681df9623 ("ARM: dts: bcm2712-rpi: Add i2c<n>_pins labels"). Co-developed-by: Naushir Patuck <naush@raspberrypi.com> Signed-off-by: Naushir Patuck <naush@raspberrypi.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2024-10-16media: uapi: Add meta formats for PiSP FE config and statsTomi Valkeinen1-0/+2
Add two meta formats for PiSP FE: V4L2_META_FMT_RPI_FE_CFG and V4L2_META_FMT_RPI_FE_STATS. The former is used to provide configuration for the FE and the latter is used to read the statistics from the FE. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2024-10-15Merge tag 'batadv-next-pullrequest-20241015' of git://git.open-mesh.org/linux-mergePaolo Abeni1-13/+16
Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Add flex array to struct batadv_tvlv_tt_data, by Erick Archer - Use string choice helper to print booleans, by Sven Eckelmann - replace call_rcu by kfree_rcu for simple kmem_cache_free callback, by Julia Lawall * tag 'batadv-next-pullrequest-20241015' of git://git.open-mesh.org/linux-merge: batman-adv: replace call_rcu by kfree_rcu for simple kmem_cache_free callback batman-adv: Use string choice helper to print booleans batman-adv: Add flex array to struct batadv_tvlv_tt_data batman-adv: Start new development cycle ==================== Link: https://patch.msgid.link/20241015073946.46613-1-sw@simonwunderlich.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-15Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextPaolo Abeni1-0/+15
Daniel Borkmann says: ==================== pull-request: bpf-next 2024-10-14 The following pull-request contains BPF updates for your *net-next* tree. We've added 21 non-merge commits during the last 18 day(s) which contain a total of 21 files changed, 1185 insertions(+), 127 deletions(-). The main changes are: 1) Put xsk sockets on a struct diet and add various cleanups. Overall, this helps to bump performance by 12% for some workloads, from Maciej Fijalkowski. 2) Extend BPF selftests to increase coverage of XDP features in combination with BPF cpumap, from Alexis Lothoré (eBPF Foundation). 3) Extend netkit with an option to delegate skb->{mark,priority} scrubbing to its BPF program, from Daniel Borkmann. 4) Make the bpf_get_netns_cookie() helper available also to tc(x) BPF programs, from Mahe Tardy. 5) Extend BPF selftests covering a BPF program setting socket options per MPTCP subflow, from Geliang Tang and Nicolas Rybowski. bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (21 commits) xsk: Use xsk_buff_pool directly for cq functions xsk: Wrap duplicated code to function xsk: Carry a copy of xdp_zc_max_segs within xsk_buff_pool xsk: Get rid of xdp_buff_xsk::orig_addr xsk: s/free_list_node/list_node/ xsk: Get rid of xdp_buff_xsk::xskb_list_node selftests/bpf: check program redirect in xdp_cpumap_attach selftests/bpf: make xdp_cpumap_attach keep redirect prog attached selftests/bpf: fix bpf_map_redirect call for cpu map test selftests/bpf: add tcx netns cookie tests bpf: add get_netns_cookie helper to tc programs selftests/bpf: add missing header include for htons selftests/bpf: Extend netkit tests to validate skb meta data tools: Sync if_link.h uapi tooling header netkit: Add add netkit scrub support to rt_link.yaml netkit: Simplify netkit mode over to use NLA_POLICY_MAX netkit: Add option for scrubbing skb meta data bpf: Remove unused macro selftests/bpf: Add mptcp subflow subtest selftests/bpf: Add getsockopt to inspect mptcp subflow ... ==================== Link: https://patch.msgid.link/20241014211110.16562-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-14netdev-genl: Support setting per-NAPI config valuesJoe Damato1-0/+1
Add support to set per-NAPI defer_hard_irqs and gro_flush_timeout. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-7-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14netdev-genl: Dump gro_flush_timeoutJoe Damato1-0/+1
Support dumping gro_flush_timeout for a NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-5-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14netdev-genl: Dump napi_defer_hard_irqsJoe Damato1-0/+1
Support dumping defer_hard_irqs for a NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20241011184527.16393-3-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-12media: videodev2: Add flag to unconditionally enumerate pixel formatsBenjamin Gaignard1-0/+3
When the index is ORed with V4L2_FMTDESC_FLAG_ENUM_ALL the driver clears the flag and enumerate all the possible formats, ignoring any limitations from the current configuration. Drivers which do not support this flag yet always return an EINVAL. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> [hverkuil: improved doc when the new flag is not supported by the driver]
2024-10-12Merge tag 'v6.12-rc2' into test2Jonathan Cameron1-1/+1
Linux 6.12-rc2 Resolved movement of asm/unaligned.h to linux/unaligned.h
2024-10-10thermal: core: Add user thresholds supportDaniel Lezcano1-0/+2
The user thresholds mechanism is a way to have the userspace to tell the thermal framework to send a notification when a temperature limit is crossed. There is no id, no hysteresis, just the temperature and the direction of the limit crossing. That means we can be notified when a threshold is crossed the way up only, or the way down only or both ways. That allows to create hysteresis values if it is needed. A threshold can be added, deleted or flushed. The latter means all thresholds belonging to a thermal zone will be deleted. When a threshold is added: - if the same threshold (temperature and direction) exists, an error is returned - if a threshold is specified with the same temperature but a different direction, the specified direction is added - if there is no threshold with the same temperature then it is created When a threshold is deleted: - if the same threshold (temperature and direction) exists, it is deleted - if a threshold is specified with the same temperature but a different direction, the specified direction is removed - if there is no threshold with the same temperature, then an error is returned When the threshold are flushed: - All thresholds related to a thermal zone are deleted When a threshold is crossed: - the userspace does not need to know which threshold(s) have been crossed, it will be notified with the current temperature and the previous temperature - if multiple thresholds have been crossed between two updates only one notification will be send to the userspace, it is pointless to send a notification per thresholds crossed as the userspace can handle that easily when it has the temperature delta information Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240923100005.2532430-2-daniel.lezcano@linaro.org [ rjw: Subject edit, use BIT(0) and BIT(1) in symbol definitions ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-10bpf: Update bpf_override_return() commentMartin Kelly1-4/+0
The documentation says CONFIG_FUNCTION_ERROR_INJECTION is supported only on x86. This was presumably true at the time of writing, but it's now supported on many other architectures too. Drop this statement, since it's not correct anymore and it fits better in other documentation anyway. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Link: https://lore.kernel.org/r/20241010193301.995909-1-martin.kelly@crowdstrike.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-10bpf: fix argument type in bpf_loop documentationMatteo Croce1-1/+1
The `index` argument to bpf_loop() is threaded as an u64. This lead in a subtle verifier denial where clang cloned the argument in another register[1]. [1] https://github.com/systemd/systemd/pull/34650#issuecomment-2401092895 Signed-off-by: Matteo Croce <teknoraver@meta.com> Link: https://lore.kernel.org/r/20241010035652.17830-1-technoboy85@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-10netlink: spec: add shaper introspection supportPaolo Abeni1-0/+17
Allow the user-space to fine-grain query the shaping features supported by the NIC on each domain. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/3ddd10e450e3fe7d4b944c0d0b886d4483529ee6.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10netlink: spec: add shaper YAML specPaolo Abeni1-0/+78
Define the user-space visible interface to query, configure and delete network shapers via yaml definition. Add dummy implementations for the relevant NL callbacks. set() and delete() operations touch a single shaper creating/updating or deleting it. The group() operation creates a shaper's group, nesting multiple input shapers under the specified output shaper. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/7a33a1ff370bdbcd0cd3f909575c912cd56f41da.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09Fix misspelling of "accept*" in netAlexander Zubkov1-1/+1
Several files have "accept*" misspelled as "accpet*" in the comments. Fix all such occurrences. Signed-off-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008162756.22618-2-green@qrator.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ptp: Add support for the AMZNC10C 'vmclock' deviceDavid Woodhouse1-0/+182
The vmclock device addresses the problem of live migration with precision clocks. The tolerances of a hardware counter (e.g. TSC) are typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that counter against an external source of 'real' time, and track the precise frequency of the counter as it changes with environmental conditions. When a guest is live migrated, anything it knows about the frequency of the underlying counter becomes invalid. It may move from a host where the counter running at -50PPM of its nominal frequency, to a host where it runs at +50PPM. There will also be a step change in the value of the counter, as the correctness of its absolute value at migration is limited by the accuracy of the source and destination host's time synchronization. In its simplest form, the device merely advertises a 'disruption_marker' which indicates that the guest should throw away any NTP synchronization it thinks it has, and start again. Because the shared memory region can be exposed all the way to userspace through the /dev/vmclock0 node, applications can still use time from a fast vDSO 'system call', and check the disruption marker to be sure that their timestamp is indeed truthful. The structure also allows for the precise time, as known by the host, to be exposed directly to guests so that they don't have to wait for NTP to resync from scratch. The PTP driver consumes this information if present. Like the KVM PTP clock, this PTP driver can convert TSC-based cross timestamps into KVM clock values. Unlike the KVM PTP clock, it does so only when such is actually helpful. The values and fields are based on the nascent virtio-rtc specification, and the intent is that a version (hopefully precisely this version) of this structure will be included as an optional part of that spec. In the meantime, this driver supports the simple ACPI form of the device which is being shipped in certain commercial hypervisors (and submitted for inclusion in QEMU). Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-08media: v4l: Add luma 16-bit interlaced pixel formatDmitry Perchanov1-0/+1
The formats added by this patch are: V4L2_PIX_FMT_Y16I Interlaced lumina format primary use in RealSense Depth cameras with stereo stream for left and right image sensors. Signed-off-by: Dmitry Perchanov <dmitry.perchanov@intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://lore.kernel.org/r/568efbd75290e286b8ad9e7347b5f43745121020.camel@intel.com Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-10-07netkit: Add option for scrubbing skb meta dataDaniel Borkmann1-0/+15
Jordan reported that when running Cilium with netkit in per-endpoint-routes mode, network policy misclassifies traffic. In this direct routing mode of Cilium which is used in case of GKE/EKS/AKS, the Pod's BPF program to enforce policy sits on the netkit primary device's egress side. The issue here is that in case of netkit's netkit_prep_forward(), it will clear meta data such as skb->mark and skb->priority before executing the BPF program. Thus, identity data stored in there from earlier BPF programs (e.g. from tcx ingress on the physical device) gets cleared instead of being made available for the primary's program to process. While for traffic egressing the Pod via the peer device this might be desired, this is different for the primary one where compared to tcx egress on the host veth this information would be available. To address this, add a new parameter for the device orchestration to allow control of skb->mark and skb->priority scrubbing, to make the two accessible from BPF (and eventually leave it up to the program to scrub). By default, the current behavior is retained. For netkit peer this also enables the use case where applications could cooperate/signal intent to the BPF program. Note that struct netkit has a 4 byte hole between policy and bundle which is used here, in other words, struct netkit's first cacheline content used in fast-path does not get moved around. Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device") Reported-by: Jordan Rife <jrife@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Link: https://github.com/cilium/cilium/issues/34042 Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20241004101335.117711-1-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-10-07drm/amdkfd: SMI report dropped event countPhilip Yang1-0/+6
Add new SMI event to report the dropped event count. When the event kfifo is full, drop count is not zero, or no enough space left to store the event message, increase drop count. After reading event out from kfifo, if event was dropped, drop_count is not zero, generate a dropped event record and reset drop count to zero. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdkfd: Output migrate end event if migrate failedPhilip Yang1-3/+4
If page migration failed, also output migrate end event to match with migrate start event, with failure error_code added to the end of the migrate message macro. This will not break uAPI because application uses old message macro sscanf drop and ignore the error_code. Output GPU page fault restore end event if migration failed. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-05batman-adv: Add flex array to struct batadv_tvlv_tt_dataErick Archer1-13/+16
The "struct batadv_tvlv_tt_data" uses a dynamically sized set of trailing elements. Specifically, it uses an array of structures of type "batadv_tvlv_tt_vlan_data". So, use the preferred way in the kernel declaring a flexible array [1]. At the same time, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). In this case, it is important to note that the attribute used is specifically __counted_by_be since variable "num_vlan" is of type __be16. The following change to the "batadv_tt_tvlv_ogm_handler_v1" function: - tt_vlan = (struct batadv_tvlv_tt_vlan_data *)(tt_data + 1); - tt_change = (struct batadv_tvlv_tt_change *)(tt_vlan + num_vlan); + tt_change = (struct batadv_tvlv_tt_change *)((void *)tt_data + + flex_size); is intended to prevent the compiler from generating an "out-of-bounds" notification due to the __counted_by attribute. The compiler can do a pointer calculation using the vlan_data flexible array memory, or in other words, this may be calculated as an array offset, since it is the same as: &tt_data->vlan_data[num_vlan] Therefore, we go past the end of the array. In other "multiple trailing flexible array" situations, this has been solved by addressing from the base pointer, since the compiler either knows the full allocation size or it knows nothing about it (this case, since it came from a "void *" function argument). The order in which the structure batadv_tvlv_tt_data and the structure batadv_tvlv_tt_vlan_data are defined must be swap to avoid an incomplete type error. Also, avoid the open-coded arithmetic in memory allocator functions [2] using the "struct_size" macro and use the "flex_array_size" helper to clarify some calculations, when possible. Moreover, the new structure member also allow us to avoid the open-coded arithmetic on pointers in some situations. Take advantage of this. This code was detected with the help of Coccinelle, and audited and modified manually. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Erick Archer <erick.archer@outlook.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2024-10-05crypto: virtio - Drop sign/verify operationsLukas Wunner1-0/+1
The virtio crypto driver exposes akcipher sign/verify operations in a user space ABI. This blocks removal of sign/verify from akcipher_alg. Herbert opines: "I would say that this is something that we can break. Breaking it is no different to running virtio on a host that does not support these algorithms. After all, a software implementation must always be present. I deliberately left akcipher out of crypto_user because the API is still in flux. We should not let virtio constrain ourselves." https://lore.kernel.org/all/ZtqoNAgcnXnrYhZZ@gondor.apana.org.au/ "I would remove virtio akcipher support in its entirety. This API was never meant to be exposed outside of the kernel." https://lore.kernel.org/all/Ztqql_gqgZiMW8zz@gondor.apana.org.au/ Drop sign/verify support from virtio crypto. There's no strong reason to also remove encrypt/decrypt support, so keep it. A key selling point of virtio crypto is to allow guest access to crypto accelerators on the host. So far the only akcipher algorithm supported by virtio crypto is RSA. Dropping sign/verify merely means that the PKCS#1 padding is now always generated or verified inside the guest, but the actual signature generation/verification (which is an RSA decrypt/encrypt operation) may still use an accelerator on the host. Generating or verifying the PKCS#1 padding is cheap, so a hardware accelerator won't be of much help there. Which begs the question whether virtio crypto support for sign/verify makes sense at all. It would make sense for the sign operation if the host has a security chip to store asymmetric private keys. But the kernel doesn't even have an asymmetric_key_subtype yet for hardware-based private keys. There's at least one rudimentary driver for such chips (atmel-ecc.c for ATECC508A), but it doesn't implement the sign operation. The kernel would first have to grow support for a hardware asymmetric_key_subtype and at least one driver implementing the sign operation before exposure to guests via virtio makes sense. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: sig - Introduce sig_alg backendLukas Wunner1-0/+5
Commit 6cb8815f41a9 ("crypto: sig - Add interface for sign/verify") began a transition of asymmetric sign/verify operations from crypto_akcipher to a new crypto_sig frontend. Internally, the crypto_sig frontend still uses akcipher_alg as backend, however: "The link between sig and akcipher is meant to be temporary. The plan is to create a new low-level API for sig and then migrate the signature code over to that from akcipher." https://lore.kernel.org/r/ZrG6w9wsb-iiLZIF@gondor.apana.org.au/ "having a separate alg for sig is definitely where we want to be since there is very little that the two types actually share." https://lore.kernel.org/r/ZrHlpz4qnre0zWJO@gondor.apana.org.au/ Take the next step of that migration and augment the crypto_sig frontend with a sig_alg backend to which all algorithms can be moved. During the migration, there will briefly be signature algorithms that are still based on crypto_akcipher, whilst others are already based on crypto_sig. Allow for that by building a fork into crypto_sig_*() API calls (i.e. crypto_sig_maxsize() and friends) such that one of the two backends is selected based on the transform's cra_type. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>