aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include/uapi/linux
AgeCommit message (Collapse)AuthorFilesLines
2025-09-15Input: add INPUT_PROP_HAPTIC_TOUCHPADAngela Czubak1-0/+1
INPUT_PROP_HAPTIC_TOUCHPAD property is to be set for a device with simple haptic capabilities. Signed-off-by: Angela Czubak <aczubak@google.com> Co-developed-by: Jonathan Denose <jdenose@google.com> Signed-off-by: Jonathan Denose <jdenose@google.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-09-15Input: add FF_HAPTIC effect typeAngela Czubak1-1/+21
FF_HAPTIC effect type can be used to trigger haptic feedback with HID simple haptic usages. Signed-off-by: Angela Czubak <aczubak@google.com> Co-developed-by: Jonathan Denose <jdenose@google.com> Signed-off-by: Jonathan Denose <jdenose@google.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-09-13x86/kexec: carry forward the boot DTB on kexecBrian Mak1-0/+4
Currently, the kexec_file_load syscall on x86 does not support passing a device tree blob to the new kernel. Some embedded x86 systems use device trees. On these systems, failing to pass a device tree to the new kernel causes a boot failure. To add support for this, we copy the behavior of ARM64 and PowerPC and copy the current boot's device tree blob for use in the new kernel. We do this on x86 by passing the device tree blob as a setup_data entry in accordance with the x86 boot protocol. This behavior is gated behind the KEXEC_FILE_FORCE_DTB flag. Link: https://lkml.kernel.org/r/20250805211527.122367-3-makb@juniper.net Signed-off-by: Brian Mak <makb@juniper.net> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Dave Young <dyoung@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Rob Herring <robh@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13mm/huge_memory: respect MADV_COLLAPSE with PR_THP_DISABLE_EXCEPT_ADVISEDDavid Hildenbrand1-1/+1
Let's allow for making MADV_COLLAPSE succeed on areas that neither have VM_HUGEPAGE nor VM_NOHUGEPAGE when we have THP disabled unless explicitly advised (PR_THP_DISABLE_EXCEPT_ADVISED). MADV_COLLAPSE is a clear advice that we want to collapse. Note that we still respect the VM_NOHUGEPAGE flag, just like MADV_COLLAPSE always does. So consequently, MADV_COLLAPSE is now only refused on VM_NOHUGEPAGE with PR_THP_DISABLE_EXCEPT_ADVISED, including for shmem. Link: https://lkml.kernel.org/r/20250815135549.130506-4-usamaarif642@gmail.com Co-developed-by: Usama Arif <usamaarif642@gmail.com> Signed-off-by: Usama Arif <usamaarif642@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yafang <laoar.shao@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13prctl: extend PR_SET_THP_DISABLE to optionally exclude VM_HUGEPAGEDavid Hildenbrand1-0/+10
Patch series "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised", v5. This will allow individual processes to opt-out of THP = "always" into THP = "madvise", without affecting other workloads on the system. This has been extensively discussed on the mailing list and has been summarized very well by David in the first patch which also includes the links to alternatives, please refer to the first patch commit message for the motivation for this series. Patch 1 adds the PR_THP_DISABLE_EXCEPT_ADVISED flag to implement this, along with the MMF changes. Patch 2 is a cleanup patch for tva_flags that will allow the forced collapse case to be transmitted to vma_thp_disabled (which is done in patch 3). Patch 4 adds documentation for PR_SET_THP_DISABLE/PR_GET_THP_DISABLE. Patches 6-7 implement the selftests for PR_SET_THP_DISABLE for completely disabling THPs (old behaviour) and only enabling it at advise (PR_THP_DISABLE_EXCEPT_ADVISED). This patch (of 7): People want to make use of more THPs, for example, moving from the "never" system policy to "madvise", or from "madvise" to "always". While this is great news for every THP desperately waiting to get allocated out there, apparently there are some workloads that require a bit of care during that transition: individual processes may need to opt-out from this behavior for various reasons, and this should be permitted without needing to make all other workloads on the system similarly opt-out. The following scenarios are imaginable: (1) Switch from "none" system policy to "madvise"/"always", but keep THPs disabled for selected workloads. (2) Stay at "none" system policy, but enable THPs for selected workloads, making only these workloads use the "madvise" or "always" policy. (3) Switch from "madvise" system policy to "always", but keep the "madvise" policy for selected workloads: allocate THPs only when advised. (4) Stay at "madvise" system policy, but enable THPs even when not advised for selected workloads -- "always" policy. Once can emulate (2) through (1), by setting the system policy to "madvise"/"always" while disabling THPs for all processes that don't want THPs. It requires configuring all workloads, but that is a user-space problem to sort out. (4) can be emulated through (3) in a similar way. Back when (1) was relevant in the past, as people started enabling THPs, we added PR_SET_THP_DISABLE, so relevant workloads that were not ready yet (i.e., used by Redis) were able to just disable THPs completely. Redis still implements the option to use this interface to disable THPs completely. With PR_SET_THP_DISABLE, we added a way to force-disable THPs for a workload -- a process, including fork+exec'ed process hierarchy. That essentially made us support (1): simply disable THPs for all workloads that are not ready for THPs yet, while still enabling THPs system-wide. The quest for handling (3) and (4) started, but current approaches (completely new prctl, options to set other policies per process, alternatives to prctl -- mctrl, cgroup handling) don't look particularly promising. Likely, the future will use bpf or something similar to implement better policies, in particular to also make better decisions about THP sizes to use, but this will certainly take a while as that work just started. Long story short: a simple enable/disable is not really suitable for the future, so we're not willing to add completely new toggles. While we could emulate (3)+(4) through (1)+(2) by simply disabling THPs completely for these processes, this is a step backwards, because these processes can no longer allocate THPs in regions where THPs were explicitly advised: regions flagged as VM_HUGEPAGE. Apparently, that imposes a problem for relevant workloads, because "not THPs" is certainly worse than "THPs only when advised". Could we simply relax PR_SET_THP_DISABLE, to "disable THPs unless not explicitly advised by the app through MAD_HUGEPAGE"? *maybe*, but this would change the documented semantics quite a bit, and the versatility to use it for debugging purposes, so I am not 100% sure that is what we want -- although it would certainly be much easier. So instead, as an easy way forward for (3) and (4), add an option to make PR_SET_THP_DISABLE disable *less* THPs for a process. In essence, this patch: (A) Adds PR_THP_DISABLE_EXCEPT_ADVISED, to be used as a flag in arg3 of prctl(PR_SET_THP_DISABLE) when disabling THPs (arg2 != 0). prctl(PR_SET_THP_DISABLE, 1, PR_THP_DISABLE_EXCEPT_ADVISED). (B) Makes prctl(PR_GET_THP_DISABLE) return 3 if PR_THP_DISABLE_EXCEPT_ADVISED was set while disabling. Previously, it would return 1 if THPs were disabled completely. Now it returns the set flags as well: 3 if PR_THP_DISABLE_EXCEPT_ADVISED was set. (C) Renames MMF_DISABLE_THP to MMF_DISABLE_THP_COMPLETELY, to express the semantics clearly. Fortunately, there are only two instances outside of prctl() code. (D) Adds MMF_DISABLE_THP_EXCEPT_ADVISED to express "no THP except for VMAs with VM_HUGEPAGE" -- essentially "thp=madvise" behavior Fortunately, we only have to extend vma_thp_disabled(). (E) Indicates "THP_enabled: 0" in /proc/pid/status only if THPs are disabled completely Only indicating that THPs are disabled when they are really disabled completely, not only partially. For now, we don't add another interface to obtained whether THPs are disabled partially (PR_THP_DISABLE_EXCEPT_ADVISED was set). If ever required, we could add a new entry. The documented semantics in the man page for PR_SET_THP_DISABLE "is inherited by a child created via fork(2) and is preserved across execve(2)" is maintained. This behavior, for example, allows for disabling THPs for a workload through the launching process (e.g., systemd where we fork() a helper process to then exec()). For now, MADV_COLLAPSE will *fail* in regions without VM_HUGEPAGE and VM_NOHUGEPAGE. As MADV_COLLAPSE is a clear advise that user space thinks a THP is a good idea, we'll enable that separately next (requiring a bit of cleanup first). There is currently not way to prevent that a process will not issue PR_SET_THP_DISABLE itself to re-enable THP. There are not really known users for re-enabling it, and it's against the purpose of the original interface. So if ever required, we could investigate just forbidding to re-enable them, or make this somehow configurable. Link: https://lkml.kernel.org/r/20250815135549.130506-1-usamaarif642@gmail.com Link: https://lkml.kernel.org/r/20250815135549.130506-2-usamaarif642@gmail.com Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Usama Arif <usamaarif642@gmail.com> Tested-by: Usama Arif <usamaarif642@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Usama Arif <usamaarif642@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yafang <laoar.shao@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13mempolicy: clarify what zone reclaim meansJoshua Hahn1-3/+9
The zone_reclaim_mode API controls the reclaim behavior when a node runs out of memory. Contrary to its user-facing name, it is internally referred to as "node_reclaim_mode". This can be confusing. But because we cannot change the name of the API since it has been in place since at least 2.6, let's try to be more explicit about what the behavior of this API is. Change the description to clarify what zone reclaim entails, and be explicit about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the past already [1] [2]. While at it, also soften the warning about changing these bits. [joshua.hahnjy@gmail.com: remove the reference to the vm.zone_reclaim_mode sysctl as an ABI] Link: https://lkml.kernel.org/r/20250806134404.2000234-1-joshua.hahnjy@gmail.com Link: https://lkml.kernel.org/r/20250805205048.1518453-1-joshua.hahnjy@gmail.com Link: https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/ [1] Link: https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/ [2] Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Acked-by: SeongJae Park <sj@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Byungchul Park <byungchul@sk.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13Merge tag 'v6.17-rc3' into togregJonathan Cameron2-1/+2
Linux 6.17-rc3
2025-09-13iio: add power and energy measurement modifiersAntoniu Miclaus1-0/+4
Add new IIO modifiers to support power and energy measurement devices: Power modifiers: - IIO_MOD_ACTIVE: Real power consumed by the load - IIO_MOD_REACTIVE: Power that oscillates between source and load - IIO_MOD_APPARENT: Magnitude of complex power Signal quality modifiers: - IIO_MOD_RMS: Root Mean Square value Additionally adds: - IIO_CHAN_INFO_POWERFACTOR: Power factor channel info type for representing the ratio of active power to apparent power These modifiers enable proper representation of power measurement devices like energy meters and power analyzers. Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-09-13iio: add IIO_ALTCURRENT channel typeAntoniu Miclaus1-0/+1
Add support for IIO_ALTCURRENT channel type to distinguish AC current measurements from DC current measurements. This follows the same pattern as IIO_VOLTAGE and IIO_ALTVOLTAGE. Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-09-12Merge tag 'nf-next-25-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski1-0/+2
Florian Westphal says: ==================== netfilter: updates for net-next 1) Don't respond to ICMP_UNREACH errors with another ICMP_UNREACH error. 2) Support fetching the current bridge ethernet address. This allows a more flexible approach to packet redirection on bridges without need to use hardcoded addresses. From Fernando Fernandez Mancera. 3) Zap a few no-longer needed conditionals from ipvs packet path and convert to READ/WRITE_ONCE to avoid KCSAN warnings. From Zhang Tengfei. 4) Remove a no-longer-used macro argument in ipset, from Zhen Ni. * tag 'nf-next-25-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_reject: don't reply to icmp error messages ipvs: Use READ_ONCE/WRITE_ONCE for ipvs->enable netfilter: nft_meta_bridge: introduce NFT_META_BRI_IIFHWADDR support netfilter: ipset: Remove unused htable_bits in macro ahash_region selftest:net: fixed spelling mistakes ==================== Link: https://patch.msgid.link/20250911143819.14753-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0Petr Machata1-0/+3
The previous patches introduced a new option, BR_BOOLOPT_FDB_LOCAL_VLAN_0. When enabled, it has local FDB entries installed only on VLAN 0, instead of duplicating them across all VLANs. In this patch, add the corresponding UAPI toggle, and the code for turning the feature on and off. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/ea99bfb10f687fa58091e6e1c2f8acc33f47ca45.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11Merge tag 'wireless-next-2025-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-nextJakub Kicinski1-3/+48
Johannes Berg says: ==================== Plenty of things going on, notably: - iwlwifi: major cleanups/rework - brcmfmac: gets AP isolation support - mac80211: gets more S1G support * tag 'wireless-next-2025-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (94 commits) wifi: mwifiex: fix endianness handling in mwifiex_send_rgpower_table wifi: cfg80211: Remove the redundant wiphy_dev wifi: mac80211: fix incorrect comment wifi: cfg80211: update the time stamps in hidden ssid wifi: mac80211: Fix HE capabilities element check wifi: mac80211: add tx_handlers_drop statistics to ethtool wifi: mac80211: fix reporting of all valid links in sta_set_sinfo() wifi: iwlwifi: mld: CHANNEL_SURVEY_NOTIF is always supported wifi: iwlwifi: mld: remove support of iwl_esr_mode_notif version 1 wifi: iwlwifi: mld: remove support from of sta cmd version 1 wifi: iwlwifi: mld: remove support of roc cmd version 5 wifi: iwlwifi: mld: remove support of mac cmd ver 2 wifi: iwlwifi: mld: don't consider phy cmd version 5 wifi: iwlwifi: implement wowlan status notification API update wifi: iwlwifi: fw: Add ASUS to PPAG and TAS list wifi: iwlwifi: add kunit tests for nvm parse wifi: iwlwifi: api: add a flag to iwl_link_ctx_modify_flags wifi: iwlwifi: pcie: move ltr_enabled to the specific transport wifi: iwlwifi: pcie: move pm_support to the specific transport wifi: iwlwifi: rename iwl_finish_nic_init ... ==================== Link: https://patch.msgid.link/20250911100854.20445-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-11netfilter: nft_meta_bridge: introduce NFT_META_BRI_IIFHWADDR supportFernando Fernandez Mancera1-0/+2
Expose the input bridge interface ethernet address so it can be used to redirect the packet to the receiving physical device for processing. Tested with nft command line tool. table bridge nat { chain PREROUTING { type filter hook prerouting priority 0; policy accept; ether daddr de:ad:00:00:be:ef meta pkttype set host ether daddr set meta ibrhwdr accept } } Joint work with Pablo Neira. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>
2025-09-11tee: new ioctl to a register tee_shm from a dmabuf file descriptorEtienne Carriere1-0/+31
Add a userspace API to create a tee_shm object that refers to a dmabuf reference. Userspace registers the dmabuf file descriptor as in a tee_shm object. The registration is completed with a tee_shm returned file descriptor. Userspace is free to close the dmabuf file descriptor after it has been registered since all the resources are now held via the new tee_shm object. Closing the tee_shm file descriptor will eventually release all resources used by the tee_shm object when all references are released. The new IOCTL, TEE_IOC_SHM_REGISTER_FD, supports dmabuf references to physically contiguous memory buffers. Dmabuf references acquired from the TEE DMA-heap can be used as protected memory for Secure Video Path and such use cases. It depends on the TEE and the TEE driver if dmabuf references acquired by other means can be used. A new tee_shm flag is added to identify tee_shm objects built from a registered dmabuf, TEE_SHM_DMA_BUF. Signed-off-by: Etienne Carriere <etienne.carriere@foss.st.com> Signed-off-by: Olivier Masse <olivier.masse@nxp.com> Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2025-09-09media: include: update Hans Verkuil's email addressHans Verkuil2-2/+2
Replace hverkuil@xs4all.nl by hverkuil@kernel.org. Signed-off-by: Hans Verkuil <hverkuil@kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2025-09-09media: update Hans Verkuil's email addressHans Verkuil1-1/+1
Replace hansverk@cisco.com by hverkuil@kernel.org. Signed-off-by: Hans Verkuil <hverkuil@kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2025-09-09bonding: add support for per-port LACP actor priorityHangbin Liu1-0/+1
Introduce a new netlink attribute 'actor_port_prio' to allow setting the LACP actor port priority on a per-slave basis. This extends the existing bonding infrastructure to support more granular control over LACP negotiations. The priority value is embedded in LACPDU packets and will be used by subsequent patches to influence aggregator selection policies. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250902064501.360822-2-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-09ptp: Add ioctl commands to expose raw cycle counter valuesCarolina Jubran1-0/+4
Introduce two new ioctl commands, PTP_SYS_OFFSET_PRECISE_CYCLES and PTP_SYS_OFFSET_EXTENDED_CYCLES, to allow user space to access the raw free-running cycle counter from PTP devices. These ioctls are variants of the existing PRECISE and EXTENDED offset queries, but instead of returning device time in realtime, they return the raw cycle counter value. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://patch.msgid.link/1755008228-88881-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-08io_uring: introduce io_uring queryingPavel Begunkov2-0/+44
There are many parameters users might want to query about io_uring like available request types or the ring sizes. This patch introduces an interface for such slow path queries. It was written with several requirements in mind: - Can be used with or without an io_uring instance. Asking for supported setup flags before creating an instance as well as qeurying info about an already created ring are valid use cases. - Should be moderately fast. For example, users might use it to periodically retrieve ring attributes at runtime. As a consequence, it should be able to query multiple attributes in a single syscall. - Backward and forward compatible. - Should be reasobably easy to use. - Reduce the kernel code size for introducing new query types. It's implemented as a new registration opcode IORING_REGISTER_QUERY. The user passes one or more query strutctures linked together, each represented by struct io_uring_query_hdr. The header stores common control fields needed for processing and points to query type specific information. The header contains - The query type - The result field, which on return contains the error code for the query - Pointer to the query type specific information - The size of the query structure. The kernel will only populate up to the size, which helps with backward compatibility. The kernel can also reduce the size, so if the current kernel is older than the inteface the user tries to use, it'll get only the supported bits. - next_entry field is used to chain multiple queries. Apart from common registeration syscall failures, it can only immediately return an error code in case when the headers are incorrect or any other addresses and invalid. That usually mean that the userspace doesn't use the API right and should be corrected. All query type specific errors are returned in the header's result field. As an example, the patch adds a single query type for now, i.e. IO_URING_QUERY_OPCODES, which tells what register / request / etc. opcodes are supported, but there are particular plans to extend it. Note: there is a request probing interface via IORING_REGISTER_PROBE, but it's a mess. It requires the user to create a ring first, it only works for requests, and requires dynamic allocations. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-09-05fuse: add prune notificationMiklos Szeredi1-0/+8
Some fuse servers need to prune their caches, which can only be done if the kernel's own dentry/inode caches are pruned first to avoid dangling references. Add FUSE_NOTIFY_PRUNE, which takes an array of node ID's to try and get rid of. Inodes with active references are skipped. A similar functionality is already provided by FUSE_NOTIFY_INVAL_ENTRY with the FUSE_EXPIRE_ONLY flag. Differences in the interface are FUSE_NOTIFY_INVAL_ENTRY: - can only prune one dentry - dentry is determined by parent ID and name - if inode has multiple aliases (cached hard links), then they would have to be invalidated individually to be able to get rid of the inode FUSE_NOTIFY_PRUNE: - can prune multiple inodes - inodes determined by their node ID - aliases are taken care of automatically Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-09-05fuse: remove FUSE_NOTIFY_CODE_MAX from <uapi/linux/fuse.h>Miklos Szeredi1-1/+0
Constants that change value from version to version have no place in an interface definition. Hopefully this won't break anything. Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+2
Cross-merge networking fixes after downstream PR (net-6.17-rc5). No conflicts. Adjacent changes: include/net/sock.h c51613fa276f ("net: add sk->sk_drop_counters") 5d6b58c932ec ("net: lockless sock_i_ino()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-04PCI/AER: Print TLP Log for errors introduced since PCIe r1.1Lukas Wunner1-0/+7
When reporting an error, the AER driver prints the TLP Header / Prefix Log only for errors enumerated in the AER_LOG_TLP_MASKS macro. The macro was never amended since its introduction in 2006 with commit 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver"). At the time, PCIe r1.1 was the latest spec revision. Amend the macro with errors defined since then to avoid omitting the TLP Header / Prefix Log for newer errors. The order of the errors in AER_LOG_TLP_MASKS follows PCIe r1.1 sec 6.2.7 rather than 7.10.2, because only the former documents for which errors a TLP Header / Prefix is logged. Retain this order. The section number is still 6.2.7 in today's PCIe r7.0. For Completion Timeouts, the TLP Header / Prefix is only logged if the Completion Timeout Prefix / Header Log Capable bit is set in the AER Capabilities and Control register. Introduce a tlp_header_logged() helper to check whether the TLP Header / Prefix Log is populated and use it in the two places which currently match against AER_LOG_TLP_MASKS directly. For Uncorrectable Internal Errors, logging of the TLP Header / Prefix is optional per PCIe r7.0 sec 6.2.7. If needed, drivers could indicate through a flag whether devices are capable and tlp_header_logged() could then check that flag. pcitools introduced macros for newer errors with commit 144b0911cc0b ("ls-ecaps: extend decode support for more fields for AER CE and UE status"): https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b Unfortunately some of those macros are overly long: PCI_ERR_UNC_POISONED_TLP_EGRESS PCI_ERR_UNC_DMWR_REQ_EGRESS_BLOCKED PCI_ERR_UNC_IDE_CHECK PCI_ERR_UNC_MISR_IDE_TLP PCI_ERR_UNC_PCRC_CHECK PCI_ERR_UNC_TLP_XLAT_EGRESS_BLOCKED This seems unsuitable for <linux/pci_regs.h>, so shorten to: PCI_ERR_UNC_POISON_BLK PCI_ERR_UNC_DMWR_BLK PCI_ERR_UNC_IDE_CHECK PCI_ERR_UNC_MISR_IDE PCI_ERR_UNC_PCRC_CHECK PCI_ERR_UNC_XLAT_BLK Note that some of the existing macros in <linux/pci_regs.h> do not match exactly with pcitools (e.g. PCI_ERR_UNC_SDES versus PCI_ERR_UNC_SURPDN), so it does not seem mandatory for them to be identical. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/5f707caf1260bd8f15012bb032f7da9a9b898aba.1756712066.git.lukas@wunner.de
2025-09-04wifi: nl80211: strict checking attributes for NL80211_CMD_SET_BSSArend van Spriel1-1/+4
Assure user-space only modifies attributes for NL80211_CMD_SET_BSS that are supported by the driver. This stricter checking is only done when user-space commits to it by including NL80211_ATTR_BSS_PARAM. Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20250817190435.1495094-4-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-04wifi: nl80211: allow drivers to support subset of NL80211_CMD_SET_BSSArend van Spriel1-0/+4
The so-called fullmac devices rely on firmware functionality and/or API to change BSS parameters. Today there are limited drivers supporting the nl80211 primitive, but they only handle a subset of the bss parameters passed if any. The mac80211 driver does handle all parameters and stores their configured values. Some of the BSS parameters were already conditional by wiphy->features. For these the wiphy->bss_param_support and wiphy->features fields are silently aligned in wiphy_register(). Maybe better to issue a warning instead when they are misaligned. Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20250817190435.1495094-2-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-04wifi: nl80211: Add EHT fixed Tx rate supportMuna Sinada1-2/+39
Add new attributes to support EHT MCS/NSS Tx rates and EHT GI/LTF. Parse EHT fixed MCS/NSS Tx rates and EHT GI/LTF values passed by the userspace, validate and add as part of cfg80211_bitrate_mask. MCS mask is constructed by new function, eht_build_mcs_mask(). Max NSS supported for MCS rates of 7, 9, 11 and 13 is utilized to set MCS bitmask for each NSS. MCS rates 14, and 15 if supported, are set only for NSS = 0. Co-developed-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com> Signed-off-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com> Signed-off-by: Muna Sinada <muna.sinada@oss.qualcomm.com> Link: https://patch.msgid.link/20250815213011.2704803-1-muna.sinada@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-04wifi: mac80211: support block bitmap S1G TIM encodingLachlan Hodges1-1/+2
An S1G TIM PVB is encoded differently compared to a non-s1g TIM PVB. As the AP dictates which encoding mode it uses, here we only implement block bitmap encoding. This is the default encoding mode used by all current vendor implementations. Additionally, S1G has a maximum AID count of 8192, however we are limiting the current implementation to 1600. This has no resemblence to the standard and is purely an implementation detail. The reason for this is due to the TIM elements maximum length of 255. This allows for, at most, 25 encoded blocks for a PVB encoded with block bitmap. Support for the maximum of 8192 AIDs will require an implementation of page slicing to be added to mac80211. As a result, we perform extra validation on both the STA and AP side when receiving an AID as an S1G interface. Add support for block bitmap encoding for an S1G AP and limit the maximum AID count to 1600 for the current mac80211 implementations. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250725132221.258217-2-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-04media: uapi: v4l2-controls: Cleanup codec definitionsPaul Kocialkowski1-12/+11
Move some fields closer to where they are used, add missing tabs and remove an extra newline. Signed-off-by: Paul Kocialkowski <paulk@sys-base.io> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-09-04netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIXPhil Sutter1-0/+2
This new attribute is supposed to be used instead of NFTA_DEVICE_NAME for simple wildcard interface specs. It holds a NUL-terminated string representing an interface name prefix to match on. While kernel code to distinguish full names from prefixes in NFTA_DEVICE_NAME is simpler than this solution, reusing the existing attribute with different semantics leads to confusion between different versions of kernel and user space though: * With old kernels, wildcards submitted by user space are accepted yet silently treated as regular names. * With old user space, wildcards submitted by kernel may cause crashes since libnftnl expects NUL-termination when there is none. Using a distinct attribute type sanitizes these situations as the receiving part detects and rejects the unexpected attribute nested in *_HOOK_DEVS attributes. Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2025-09-02fuse: allow synchronous FUSE_INITMiklos Szeredi1-0/+1
FUSE_INIT has always been asynchronous with mount. That means that the server processed this request after the mount syscall returned. This means that FUSE_INIT can't supply the root inode's ID, hence it currently has a hardcoded value. There are other limitations such as not being able to perform getxattr during mount, which is needed by selinux. To remove these limitations allow server to process FUSE_INIT while initializing the in-core super block for the fuse filesystem. This can only be done if the server is prepared to handle this, so add FUSE_DEV_IOC_SYNC_INIT ioctl, which a) lets the server know whether this feature is supported, returning ENOTTY othewrwise. b) lets the kernel know to perform a synchronous initialization The implementation is slightly tricky, since fuse_dev/fuse_conn are set up only during super block creation. This is solved by setting the private data of the fuse device file to a special value ((struct fuse_dev *) 1) and waiting for this to be turned into a proper fuse_dev before commecing with operations on the device file. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-08-30audit: add record for multiple object contextsCasey Schaufler1-0/+1
Create a new audit record AUDIT_MAC_OBJ_CONTEXTS. An example of the MAC_OBJ_CONTEXTS record is: type=MAC_OBJ_CONTEXTS msg=audit(1601152467.009:1050): obj_selinux=unconfined_u:object_r:user_home_t:s0 When an audit event includes a AUDIT_MAC_OBJ_CONTEXTS record the "obj=" field in other records in the event will be "obj=?". An AUDIT_MAC_OBJ_CONTEXTS record is supplied when the system has multiple security modules that may make access decisions based on an object security context. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [PM: subj tweak, audit example readability indents] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-30audit: add record for multiple task security contextsCasey Schaufler1-0/+1
Replace the single skb pointer in an audit_buffer with a list of skb pointers. Add the audit_stamp information to the audit_buffer as there's no guarantee that there will be an audit_context containing the stamp associated with the event. At audit_log_end() time create auxiliary records as have been added to the list. Functions are created to manage the skb list in the audit_buffer. Create a new audit record AUDIT_MAC_TASK_CONTEXTS. An example of the MAC_TASK_CONTEXTS record is: type=MAC_TASK_CONTEXTS msg=audit(1600880931.832:113) subj_apparmor=unconfined subj_smack=_ When an audit event includes a AUDIT_MAC_TASK_CONTEXTS record the "subj=" field in other records in the event will be "subj=?". An AUDIT_MAC_TASK_CONTEXTS record is supplied when the system has multiple security modules that may make access decisions based on a subject security context. Refactor audit_log_task_context(), creating a new audit_log_subj_ctx(). This is used in netlabel auditing to provide multiple subject security contexts as necessary. Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [PM: subj tweak, audit example readability indents] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-3/+4
Cross-merge networking fixes after downstream PR (net-6.17-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/intel/idpf/idpf_txrx.c 02614eee26fb ("idpf: do not linearize big TSO packets") 6c4e68480238 ("idpf: remove obsolete stashing code") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-29Add RWF_NOSIGNAL flag for pwritev2Lauri Vasama1-1/+4
For a user mode library to avoid generating SIGPIPE signals (e.g. because this behaviour is not portable across operating systems) is cumbersome. It is generally bad form to change the process-wide signal mask in a library, so a local solution is needed instead. For I/O performed directly using system calls (synchronous or readiness based asynchronous) this currently involves applying a thread-specific signal mask before the operation and reverting it afterwards. This can be avoided when it is known that the file descriptor refers to neither a pipe nor a socket, but a conservative implementation must always apply the mask. This incurs the cost of two additional system calls. In the case of sockets, the existing MSG_NOSIGNAL flag can be used with send. For asynchronous I/O performed using io_uring, currently the only option (apart from MSG_NOSIGNAL for sockets), is to mask SIGPIPE entirely in the call to io_uring_enter. Thankfully io_uring_enter takes a signal mask, so only a single syscall is needed. However, copying the signal mask on every call incurs a non-zero performance penalty. Furthermore, this mask applies to all completions, meaning that if the non-signaling behaviour is desired only for some subset of operations, the desired signals must be raised manually from user-mode depending on the completed operation. Add RWF_NOSIGNAL flag for pwritev2. This flag prevents the SIGPIPE signal from being raised when writing on disconnected pipes or sockets. The flag is handled directly by the pipe filesystem and converted to the existing MSG_NOSIGNAL flag for sockets. Signed-off-by: Lauri Vasama <git@vasama.org> Link: https://lore.kernel.org/20250827133901.1820771-1-git@vasama.org Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-29media: aspeed: Allow to capture from SoC display (GFX)Jammy Huang1-0/+7
ASPEED BMC IC has 2 different display engines. Please find AST2600's datasheet to get detailed information. 1. VGA on PCIe 2. SoC Display (GFX) By default, video engine (VE) will capture video from VGA. This patch adds an option to capture video from GFX with standard ioctl, vidioc_s_input. An enum, aspeed_video_input, is added for this purpose. enum aspeed_video_input { VIDEO_INPUT_VGA = 0, VIDEO_INPUT_GFX, VIDEO_INPUT_MAX }; To test this feature, you will need to enable GFX first. Please refer to ASPEED's SDK_User_Guide, 6.3.x Soc Display driver, for more information. In your application, you will need to use v4l2 ioctl, VIDIOC_S_INPUT, as below to select before start streaming. int rc; struct v4l2_input input; input.index = VIDEO_INPUT_GFX; rc = ioctl(fd, VIDIOC_S_INPUT, &input); if (rc < 0) { ... } Link: https://github.com/AspeedTech-BMC/openbmc/releases Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> [hverkuil: split up three overly long lines]
2025-08-29media: uapi: Cleanup tab after define in headersPaul Kocialkowski2-24/+24
Some definitions use a tab after the define keyword instead of the usual single space. Replace it for better consistency. Signed-off-by: Paul Kocialkowski <paulk@sys-base.io> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-08-29media: uapi: Move colorimetry controls at the end of the filePaul Kocialkowski1-34/+34
The colorimetry controls class is defined after the stateless codec class at the top of the controls header. It is currently defined in the middle of stateless codec controls. Move the colorimetry controls after the stateless codec controls, at the end of the file. Signed-off-by: Paul Kocialkowski <paulk@sys-base.io> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-08-28uapi: wrap compiler_types.h in an ifdef instead of the implicit stripJakub Kicinski1-0/+2
The uAPI stddef header includes compiler_types.h, a kernel-only header, to make sure that kernel definitions of annotations like __counted_by() take precedence. There is a hack in scripts/headers_install.sh which strips includes of compiler.h and compiler_types.h when installing uAPI headers. While explicit handling makes sense for compiler.h, which is included all over the uAPI, compiler_types.h is only included by stddef.h (within the uAPI, obviously it's included in kernel code a lot). Remove the stripping from scripts/headers_install.sh and wrap the include of compiler_types.h in #ifdef __KERNEL__ instead. This should be equivalent functionally, but is easier to understand to a casual reader of the code. It also makes it easier to work with kernel headers directly from under tools/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250825201828.2370083-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-27io_uring/nop: add support for IORING_SETUP_CQE_MIXEDJens Axboe1-0/+1
This adds support for setting IORING_NOP_CQE32 as a flag for a NOP command, in which case a 32b CQE will be posted rather than a regular one. This is the default if the ring has been setup with IORING_SETUP_CQE32. If the ring has been setup with IORING_SETUP_CQE_MIXED, then 16b CQEs will be posted without this flag set, and 32b CQEs if this flag is set. For the latter case, sqe->off is what will be posted as cqe->big_cqe[0] and sqe->addr is what will be posted as cqe->big_cqe[1]. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-27io_uring: add support for IORING_SETUP_CQE_MIXEDJens Axboe1-0/+6
Normal rings support 16b CQEs for posting completions, while certain features require the ring to be configured with IORING_SETUP_CQE32, as they need to convey more information per completion. This, in turn, makes ALL the CQEs be 32b in size. This is somewhat wasteful and inefficient, particularly when only certain CQEs need to be of the bigger variant. This adds support for setting up a ring with mixed CQE sizes, using IORING_SETUP_CQE_MIXED. When setup in this mode, CQEs posted to the ring may be either 16b or 32b in size. If a CQE is 32b in size, then IORING_CQE_F_32 is set in the CQE flags to indicate that this is the case. If this flag isn't set, the CQE is the normal 16b variant. CQEs on these types of mixed rings may also have IORING_CQE_F_SKIP set. This can happen if the ring is one (small) CQE entry away from wrapping, and an attempt is made to post a 32b CQE. As CQEs must be contigious in the CQ ring, a 32b CQE cannot wrap the ring. For this case, a single dummy CQE is posted with the SKIP flag set. The application should simply ignore those. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-27fuse: add COPY_FILE_RANGE_64 that allows large copiesMiklos Szeredi1-1/+11
The FUSE protocol uses struct fuse_write_out to convey the return value of copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE interface supports a 64-bit size copies and there's no reason why copies should be limited to 32-bit. Introduce a new op COPY_FILE_RANGE_64, which is identical, except the number of bytes copied is returned in a 64-bit value. If the fuse server does not support COPY_FILE_RANGE_64, fall back to COPY_FILE_RANGE. Reported-by: Florian Weimer <fweimer@redhat.com> Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-08-27KVM: Allow and advertise support for host mmap() on guest_memfd filesFuad Tabba1-0/+2
Now that all the x86 and arm64 plumbing for mmap() on guest_memfd is in place, allow userspace to set GUEST_MEMFD_FLAG_MMAP and advertise support via a new capability, KVM_CAP_GUEST_MEMFD_MMAP. The availability of this capability is determined per architecture, and its enablement for a specific guest_memfd instance is controlled by the GUEST_MEMFD_FLAG_MMAP flag at creation time. Update the KVM API documentation to detail the KVM_CAP_GUEST_MEMFD_MMAP capability, the associated GUEST_MEMFD_FLAG_MMAP, and provide essential information regarding support for mmap in guest_memfd. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Shivank Garg <shivankg@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250729225455.670324-22-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-08-26devlink: Make health reporter burst period configurableShahar Shitrit1-0/+2
Enable configuration of the burst period — a time window starting from the first error recovery, during which the reporter allows recovery attempts for each reported error. This feature is helpful when a single underlying issue causes multiple errors, as it delays the start of the grace period to allow sufficient time for recovering all related errors. For example, if multiple TX queues time out simultaneously, a sufficient burst period could allow all affected TX queues to be recovered within that window. Without this period, only the first TX queue that reports a timeout will undergo recovery, while the remaining TX queues will be blocked once the grace period begins. Configuration example: $ devlink health set pci/0000:00:09.0 reporter tx burst_period 500 Configuration example with ynl: ./tools/net/ynl/pyynl/cli.py \ --spec Documentation/netlink/specs/devlink.yaml \ --do health-reporter-set --json '{ "bus-name": "auxiliary", "dev-name": "mlx5_core.eth.0", "port-index": 65535, "health-reporter-name": "tx", "health-reporter-burst-period": 500 }' Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250824084354.533182-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26vhost: Fix ioctl # for VHOST_[GS]ET_FORK_FROM_OWNERNamhyung Kim1-2/+2
The VHOST_[GS]ET_FEATURES_ARRAY ioctl already took 0x83 and it would result in a build error when the vhost uapi header is used for perf tool build like below. In file included from trace/beauty/ioctl.c:93: tools/perf/trace/beauty/generated/ioctl/vhost_virtio_ioctl_array.c: In function ‘ioctl__scnprintf_vhost_virtio_cmd’: tools/perf/trace/beauty/generated/ioctl/vhost_virtio_ioctl_array.c:36:18: error: initialized field overwritten [-Werror=override-init] 36 | [0x83] = "SET_FORK_FROM_OWNER", | ^~~~~~~~~~~~~~~~~~~~~ tools/perf/trace/beauty/generated/ioctl/vhost_virtio_ioctl_array.c:36:18: note: (near initialization for ‘vhost_virtio_ioctl_cmds[131]’) Fixes: 7d9896e9f6d02d8a ("vhost: Reintroduce kthread API and add mode selection") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Message-Id: <20250819063958.833770-1-namhyung@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
2025-08-25Merge 6.17-rc3 into char-misc-nextGreg Kroah-Hartman2-1/+2
We need the char/misc/iio fixes in here as well to build on. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-24io_uring: add UAPI definitions for mixed CQE postingsJens Axboe1-0/+10
This adds the CQE flags related to supporting a mixed CQ ring mode, where both normal (16b) and big (32b) CQEs may be posted. No functional changes in this patch. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-24io_uring: uring_cmd: add multishot supportMing Lei1-1/+5
Add UAPI flag IORING_URING_CMD_MULTISHOT for supporting multishot uring_cmd operations with provided buffer. This enables drivers to post multiple completion events from a single uring_cmd submission, which is useful for: - Notifying userspace of device events (e.g., interrupt handling) - Supporting devices with multiple event sources (e.g., multi-queue devices) - Avoiding the need for device poll() support when events originate from multiple sources device-wide The implementation adds two new APIs: - io_uring_cmd_select_buffer(): selects a buffer from the provided buffer group for multishot uring_cmd - io_uring_mshot_cmd_post_cqe(): posts a CQE after event data is pushed to the provided buffer Multishot uring_cmd must be used with buffer select (IOSQE_BUFFER_SELECT) and is mutually exclusive with IORING_URING_CMD_FIXED for now. The ublk driver will be the first user of this functionality: https://github.com/ming1/linux/commits/ublk-devel/ Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250821040210.1152145-3-ming.lei@redhat.com [axboe: fold in fix for !CONFIG_IO_URING] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-22Merge tag 'block-6.17-20250822' of git://git.kernel.dk/linuxLinus Torvalds1-1/+1
Pull block fixes from Jens Axboe: "A set of fixes for block that should go into this tree. A bit larger than what I usually have at this point in time, a lot of that is the continued fixing of the lockdep annotation for queue freezing that we recently added, which has highlighted a number of little issues here and there. This contains: - MD pull request via Yu: - Add a legacy_async_del_gendisk mode, to prevent a user tools regression. New user tools releases will not use such a mode, the old release with a new kernel now will have warning about deprecated behavior, and we prepare to remove this legacy mode after about a year later - The rename in kernel causing user tools build failure, revert the rename in mdp_superblock_s - Fix a regression that interrupted resync can be shown as recover from mdstat or sysfs - Improve file size detection for loop, particularly for networked file systems, by using getattr to get the size rather than the cached inode size. - Hotplug CPU lock vs queue freeze fix - Lockdep fix while updating the number of hardware queues - Fix stacking for PI devices - Silence bio_check_eod() for the known case of device removal where the size is truncated to 0 sectors" * tag 'block-6.17-20250822' of git://git.kernel.dk/linux: block: avoid cpu_hotplug_lock depedency on freeze_lock block: decrement block_rq_qos static key in rq_qos_del() block: skip q->rq_qos check in rq_qos_done_bio() blk-mq: fix lockdep warning in __blk_mq_update_nr_hw_queues block: tone down bio_check_eod loop: use vfs_getattr_nosec for accurate file size loop: Consolidate size calculation logic into lo_calculate_size() block: remove newlines from the warnings in blk_validate_integrity_limits block: handle pi_tuple_size in queue_limits_stack_integrity selftests: ublk: Use ARRAY_SIZE() macro to improve code md: fix sync_action incorrect display during resync md: add helper rdev_needs_recovery() md: keep recovery_cp in mdp_superblock_s md: add legacy_async_del_gendisk mode
2025-08-20ACPI: pfr_update: Fix the driver update version checkChen Yu1-0/+1
The security-version-number check should be used rather than the runtime version check for driver updates. Otherwise, the firmware update would fail when the update binary had a lower runtime version number than the current one. Fixes: 0db89fa243e5 ("ACPI: Introduce Platform Firmware Runtime Update device driver") Cc: 5.17+ <stable@vger.kernel.org> # 5.17+ Reported-by: "Govindarajulu, Hariganesh" <hariganesh.govindarajulu@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Link: https://patch.msgid.link/20250722143233.3970607-1-yu.c.chen@intel.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-19binder: introduce transaction reports via netlinkLi Li1-0/+37
Introduce a generic netlink multicast event to report binder transaction failures to userspace. This allows subscribers to monitor these events and take appropriate actions, such as stopping a misbehaving application that is spamming a service with huge amount of transactions. The multicast event contains full details of the failed transactions, including the sender/target PIDs, payload size and specific error code. This interface is defined using a YAML spec, from which the UAPI and kernel headers and source are auto-generated. Signed-off-by: Li Li <dualli@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20250727182932.2499194-4-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>