aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2026-03-30KVM: arm64: Allow userspace to create protected VMs when pKVM is enabledWill Deacon1-0/+5
Introduce a new VM type for KVM/arm64 to allow userspace to request the creation of a "protected VM" when the host has booted with pKVM enabled. For now, this feature results in a taint on first use as many aspects of a protected VM are not yet protected! Tested-by: Fuad Tabba <tabba@google.com> Tested-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Will Deacon <will@kernel.org> Link: https://patch.msgid.link/20260330144841.26181-32-will@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-30hvc/xen: Check console connection flagJason Andryuk1-0/+13
When the console out buffer is filled, __write_console() will return 0 as it cannot send any data. domU_write_console() will then spin in `while (len)` as len doesn't decrement until xenconsoled attaches. This would block a domU and nullify the parallelism of Hyperlaunch until dom0 userspace starts xenconsoled, which empties the buffer. Xen 4.21 added a connection field to the xen console page. This is set to XENCONSOLE_DISCONNECTED (1) when a domain is built, and xenconsoled will set it to XENCONSOLE_CONNECTED (0) when it connects. Update the hvc_xen driver to check the field. When the field is disconnected, drop the write with -ENOTCONN. We only drop the write when the field is XENCONSOLE_DISCONNECTED (1) to try for maximum compatibility. The Xen toolstack has historically zero initialized the console, so it should see XENCONSOLE_CONNECTED (0) by default. If an implemenation used uninitialized memory, only checking for XENCONSOLE_DISCONNECTED could have the lowest chance of not connecting. This lets the hyperlaunched domU boot without stalling. Once dom0 starts xenconsoled, xl console can be used to access the domU's hvc0. Paritally sync console.h from xen.git to bring in the new field. Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Link: https://patch.msgid.link/20260318235326.14568-1-jason.andryuk@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-30lib/linear_ranges: Add linear_range_get_selector_high_arrayAmit Sunil Dhamne1-0/+3
Add a helper function to find the selector for a given value in a linear range array. The selector should be such that the value it represents should be higher or equal to the given value. Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://patch.msgid.link/20260325-max77759-charger-v9-4-4486dd297adc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-30mfd: max77759: add register bitmasks and modify irq configs for chargerAmit Sunil Dhamne1-29/+137
Add register bitmasks for charger function. In addition split the charger IRQs further such that each bit represents an IRQ downstream of charger regmap irq chip. In addition populate the ack_base to offload irq ack to the regmap irq chip framework. Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Reviewed-by: André Draszik <andre.draszik@linaro.org> Link: https://patch.msgid.link/20260325-max77759-charger-v9-3-4486dd297adc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-30usb: translate ENOSPC for user spaceOliver Neukum1-0/+2
In case of insufficient bandwidth usb_submit_urb() returns -ENOSPC. Translating this to -EIO is not optimal. There are insufficient resources not an error. EBUSY is a better fit. Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20260325145537.372993-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-30USB: uapi: add BULK_MAX_PACKET_UPDATEOliver Neukum1-9/+11
The spec for Embedded USB2 Version 2.0 adds a new feature request. This needs to be added to uapi for monitoring. Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20260319144715.2957358-2-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-30usb: uapi: add usb 3.0 authentication declarationsOliver Neukum1-0/+13
This adds the USB authentication extensions to the uapi chapter 9 declarations, so that user space tools correctly operate on the descriptor and commands. This is necessary for sniffing and debugging in gadget mode to correctly work, even though the kernel does not use these requests in host mode. Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20260319144715.2957358-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-30dt-bindings: clock: qcom: Add SM8750 GPU clocksKonrad Dybcio1-0/+50
The SM8750 features a "traditional" GPU_CC block, much of which is controlled through the GMU microcontroller. GPU_CC block requires the MX and CX rail control and thus add the corresponding power-domains and require-opps. Additionally, there's an separate GX_CC block, where the GX GDSC is moved. Update the bindings to accommodate for SM8750 SoC. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260305-gpucc_sm8750_v2-v5-1-78292b40b053@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-30dt-bindings: clock: qcom: Add CMN PLL support for IPQ8074John Crispin1-0/+15
The CMN PLL block in the IPQ8074 SoC takes 48 MHz as the reference input clock. Its output clocks are the bias_pll_cc_clk (300 MHz) and bias_pll_nss_noc_clk (416.5 MHz) clocks used by the networking subsystem. Add the related compatible for IPQ8074 to the ipq9574-cmn-pll generic schema. Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260311183942.10134-4-ansuelsmth@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-30dt-bindings: clock: qcom: Add CMN PLL support for IPQ6018John Crispin1-0/+15
The CMN PLL block in the IPQ6018 SoC takes 48 MHz as the reference input clock. Its output clocks are the bias_pll_cc_clk (300 MHz) and bias_pll_nss_noc_clk (416.5 MHz) clocks used by the networking subsystem. Add the related compatible for IPQ6018 to the ipq9574-cmn-pll generic schema. Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260311183942.10134-2-ansuelsmth@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-30dt-bindings: arm: qcom,ids: Add SoC ID for SA8650PLei wang1-0/+1
Add unique ID for Qualcomm SA8650P SoC. Signed-off-by: Lei wang <quic_leiwan@quicinc.com> Signed-off-by: Radu Rendec <rrendec@redhat.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260321152307.9131-2-rrendec@redhat.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-30dax: export dax_dev_get()John Groves1-0/+1
famfs needs to look up a dax_device by dev_t when resolving fmap entries that reference character dax devices. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: John Groves <john@groves.net> Link: https://patch.msgid.link/0100019d311daab5-bb212f0b-4e05-4668-bf53-d76fab56be68-000000@email.amazonses.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2026-03-30dax: Add fs_dax_get() func to prepare dax for fs-dax usageJohn Groves1-4/+12
The fs_dax_get() function should be called by fs-dax file systems after opening a fsdev dax device. This adds holder_operations, which provides a memory failure callback path and effects exclusivity between callers of fs_dax_get(). fs_dax_get() is specific to fsdev_dax, so it checks the driver type (which required touching bus.[ch]). fs_dax_get() fails if fsdev_dax is not bound to the memory. This function serves the same role as fs_dax_get_by_bdev(), which dax file systems call after opening the pmem block device. This can't be located in fsdev.c because struct dax_device is opaque there. This will be called by fs/fuse/famfs.c in a subsequent commit. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: John Groves <john@groves.net> Link: https://patch.msgid.link/0100019d311d8750-75395c22-031b-4d5f-aebe-790dca656b87-000000@email.amazonses.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2026-03-30dax: Add dax_set_ops() for setting dax_operations at bind timeJohn Groves1-0/+1
Add a new dax_set_ops() function that allows drivers to set the dax_operations after the dax_device has been allocated. This is needed for fsdev_dax where the operations need to be set during probe and cleared during unbind. The fsdev driver uses devm_add_action_or_reset() for cleanup consistency, avoiding the complexity of mixing devm-managed resources with manual cleanup in a remove() callback. This ensures cleanup happens automatically in the correct reverse order when the device is unbound. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: John Groves <john@groves.net> Link: https://patch.msgid.link/0100019d311d65a0-b9c1419e-f3a0-4afd-b0bd-848f18ff5950-000000@email.amazonses.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2026-03-30dax: Factor out dax_folio_reset_order() helperJohn Groves1-0/+1
Both fs/dax.c:dax_folio_put() and drivers/dax/fsdev.c: fsdev_clear_folio_state() (the latter coming in the next commit after this one) contain nearly identical code to reset a compound DAX folio back to order-0 pages. Factor this out into a shared helper function. The new dax_folio_reset_order() function: - Clears the folio's mapping and share count - Resets compound folio state via folio_reset_order() - Clears PageHead and compound_head for each sub-page - Restores the pgmap pointer for each resulting order-0 folio - Returns the original folio order (for callers that need to advance by that many pages) Two intentional differences from the original dax_folio_put() logic: 1. folio->share is cleared unconditionally. This is correct because the DAX subsystem maintains the invariant that share != 0 only when mapping == NULL (enforced by dax_folio_make_shared()). dax_folio_put() ensures share has reached zero before calling this helper, so the unconditional clear is safe. 2. folio->pgmap is now explicitly restored for order-0 folios. For the dax_folio_put() caller this is a no-op (reads and writes back the same field). It is intentional for the upcoming fsdev_clear_folio_state() caller, which converts previously-compound folios and needs pgmap re-established for all pages regardless of order. This simplifies fsdev_clear_folio_state() from ~50 lines to ~15 lines. Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: John Groves <john@groves.net> Link: https://patch.msgid.link/0100019d311cc6b9-5be7428a-7f16-4774-8f90-a44b88ac5660-000000@email.amazonses.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2026-03-30powercap: intel_rapl: Remove unused AVERAGE_POWER primitiveKuppuswamy Sathyanarayanan1-1/+0
The AVERAGE_POWER primitive and RAPL_PRIMITIVE_DERIVED flag are not used anywhere in the code. Remove them to simplify the primitive handling logic. No functional changes. Co-developed-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20260313185333.2370733-2-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-30powercap: correct kernel-doc function parameter namesRandy Dunlap1-2/+2
Use the correct function parameter names in kernel-doc comments to avoid these warnings: Warning: include/linux/powercap.h:254 function parameter 'name' not described in 'powercap_register_control_type' Warning: include/linux/powercap.h:298 function parameter 'nr_constraints' not described in 'powercap_register_zone' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260312051444.685136-1-rdunlap@infradead.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-30crypto/ccp: Implement SNP x86 shutdownTycho Andersen (AMD)1-1/+4
The SEV firmware has support to disable SNP during an SNP_SHUTDOWN_EX command. Verify that this support is available and set the flag so that SNP is disabled when it is not being used. In cases where SNP is disabled, skip the call to amd_iommu_snp_disable(), as all of the IOMMU pages have already been made shared. Also skip the panic case, since snp_shutdown() does IPIs. Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://patch.msgid.link/20260324161301.1353976-7-tycho@kernel.org
2026-03-30Merge drm/drm-fixes into drm-misc-next-fixesMaxime Ripard60-179/+481
Boris needs 7.0-rc6 for a shmem helper fix. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-03-29SUNRPC: xdr.h: fix all kernel-doc warningsRandy Dunlap1-24/+24
Correct a function parameter name (s/page/folio/) and add function return value sections for multiple functions to eliminate kernel-doc warnings: Warning: include/linux/sunrpc/xdr.h:298 function parameter 'folio' not described in 'xdr_set_scratch_folio' Warning: include/linux/sunrpc/xdr.h:337 No description found for return value of 'xdr_stream_remaining' Warning: include/linux/sunrpc/xdr.h:357 No description found for return value of 'xdr_align_size' Warning: include/linux/sunrpc/xdr.h:374 No description found for return value of 'xdr_pad_size' Warning: include/linux/sunrpc/xdr.h:387 No description found for return value of 'xdr_stream_encode_item_present' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29svcrdma: Add Write chunk WRs to the RPC's Send WR chainChuck Lever1-3/+10
Previously, Write chunk RDMA Writes were posted via a separate ib_post_send() call with their own completion handler. Each Write chunk incurred a doorbell and generated a completion event. Link Write chunk WRs onto the RPC Reply's Send WR chain so that a single ib_post_send() call posts both the RDMA Writes and the Send WR. A single completion event signals that all operations have finished. This reduces both doorbell rate and completion rate, as well as eliminating the latency of a round-trip between the Write chunk completion and the subsequent Send WR posting. The lifecycle of Write chunk resources changes: previously, the svc_rdma_write_done() completion handler released Write chunk resources when RDMA Writes completed. With WR chaining, resources remain live until the Send completion. A new sc_write_info_list tracks Write chunk metadata attached to each Send context, and svc_rdma_write_chunk_release() frees these resources when the Send context is released. The svc_rdma_write_done() handler now handles only error cases. On success it returns immediately since the Send completion handles resource release. On failure (WR flush), it closes the connection to signal to the client that the RPC Reply is incomplete. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29svcrdma: Add fair queuing for Send Queue accessChuck Lever1-0/+10
When the Send Queue fills, multiple threads may wait for SQ slots. The previous implementation had no ordering guarantee, allowing starvation when one thread repeatedly acquires slots while others wait indefinitely. Introduce a ticket-based fair queuing system. Each waiter takes a ticket number and is served in FIFO order. This ensures forward progress for all waiters when SQ capacity is constrained. The implementation has two phases: 1. Fast path: attempt to reserve SQ slots without waiting 2. Slow path: take a ticket, wait for turn, then wait for slots The ticket system adds two atomic counters to the transport: - sc_sq_ticket_head: next ticket to issue - sc_sq_ticket_tail: ticket currently being served A dedicated wait queue (sc_sq_ticket_wait) handles ticket ordering, separate from sc_send_wait which handles SQ capacity. This separation ensures that send completions (the high-frequency wake source) wake only the current ticket holder rather than all queued waiters. Ticket handoff wakes only the ticket wait queue, and each ticket holder that exits via connection close propagates the wake to the next waiter in line. When a waiter successfully reserves slots, it advances the tail counter and wakes the next waiter. This creates an orderly handoff that prevents starvation while maintaining good throughput on the fast path when contention is low. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29SUNRPC: Optimize rq_respages allocation in svc_alloc_argChuck Lever1-0/+4
svc_alloc_arg() invokes alloc_pages_bulk() with the full rq_maxpages count (~259 for 1MB messages) for the rq_respages array, causing a full-array scan despite most slots holding valid pages. svc_rqst_release_pages() NULLs only the range [rq_respages, rq_next_page) after each RPC, so only that range contains NULL entries. Limit the rq_respages fill in svc_alloc_arg() to that range instead of scanning the full array. svc_init_buffer() initializes rq_next_page to span the entire rq_respages array, so the first svc_alloc_arg() call fills all slots. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29SUNRPC: Track consumed rq_pages entriesChuck Lever1-0/+10
The rq_pages array holds pages allocated for incoming RPC requests. Two transport receive paths NULL entries in rq_pages to prevent svc_rqst_release_pages() from freeing pages that the transport has taken ownership of: - svc_tcp_save_pages() moves partial request data pages to svsk->sk_pages during multi-fragment TCP reassembly. - svc_rdma_clear_rqst_pages() moves request data pages to head->rc_pages because they are targets of active RDMA Read WRs. A new rq_pages_nfree field in struct svc_rqst records how many entries were NULLed. svc_alloc_arg() uses it to refill only those entries rather than scanning the full rq_pages array. In steady state, the transport NULLs a handful of entries per RPC, so the allocator visits only those entries instead of the full ~259 slots (for 1MB messages). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29SUNRPC: Allocate a separate Reply page arrayChuck Lever1-24/+23
struct svc_rqst uses a single dynamically-allocated page array (rq_pages) for both the incoming RPC Call message and the outgoing RPC Reply message. rq_respages is a sliding pointer into rq_pages that each transport receive path must compute based on how many pages the Call consumed. This boundary tracking is a source of confusion and bugs, and prevents an RPC transaction from having both a large Call and a large Reply simultaneously. Allocate rq_respages as its own page array, eliminating the boundary arithmetic. This decouples Call and Reply buffer lifetimes, following the precedent set by rq_bvec (a separate dynamically- allocated array for I/O vectors). Each svc_rqst now pins twice as many pages as before. For a server running 16 threads with a 1MB maximum payload, the additional cost is roughly 16MB of pinned memory. The new dynamic svc thread count facility keeps this overhead minimal on an idle server. A subsequent patch in this series limits per-request repopulation to only the pages released during the previous RPC, avoiding a full-array scan on each call to svc_alloc_arg(). Note: We've considered several alternatives to maintaining a full second array. Each alternative reintroduces either boundary logic complexity or I/O-path allocation pressure. rq_next_page is initialized in svc_alloc_arg() and svc_process() during Reply construction, and in svc_rdma_recvfrom() as a precaution on error paths. Transport receive paths no longer compute it from the Call size. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD/export: Add sign_fh export optionBenjamin Coddington1-2/+2
In order to signal that filehandles on this export should be signed, add a "sign_fh" export option. Filehandle signing can help the server defend against certain filehandle guessing attacks. Setting the "sign_fh" export option sets NFSEXP_SIGN_FH. In a future patch NFSD uses this signal to append a MAC onto filehandles for that export. While we're in here, tidy a few stray expflags to more closely align to the export flag order. Link: https://lore.kernel.org/linux-nfs/cover.1772022373.git.bcodding@hammerspace.com Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: Add a key for signing filehandlesBenjamin Coddington1-0/+1
A future patch will enable NFSD to sign filehandles by appending a Message Authentication Code(MAC). To do this, NFSD requires a secret 128-bit key that can persist across reboots. A persisted key allows the server to accept filehandles after a restart. Enable NFSD to be configured with this key via the netlink interface. Link: https://lore.kernel.org/linux-nfs/cover.1772022373.git.bcodding@hammerspace.com Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: split cache_detail queue into request and reader listsJeff Layton1-1/+3
Replace the single interleaved queue (which mixed cache_request and cache_reader entries distinguished by a ->reader flag) with two dedicated lists: cd->requests for upcall requests and cd->readers for open file handles. Readers now track their position via a monotonically increasing sequence number (next_seqno) rather than by their position in the shared list. Each cache_request is assigned a seqno when enqueued, and a new cache_next_request() helper finds the next request at or after a given seqno. This eliminates the cache_queue wrapper struct entirely, simplifies the reader-skipping loops in cache_read/cache_poll/cache_ioctl/ cache_release, and makes the data flow easier to reason about. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: convert queue_wait from global to per-cache-detail waitqueueJeff Layton1-0/+2
The queue_wait waitqueue is currently a file-scoped global, so a wake_up for one cache_detail wakes pollers on all caches. Convert it to a per-cache-detail field so that only pollers on the relevant cache are woken. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: convert queue_lock from global spinlock to per-cache-detail lockJeff Layton1-0/+1
The global queue_lock serializes all upcall queue operations across every cache_detail instance. Convert it to a per-cache-detail spinlock so that different caches (e.g. auth.unix.ip vs nfsd.fh) no longer contend with each other on queue operations. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: Add XPT flags missing from SVC_XPRT_FLAG_LISTChuck Lever1-1/+3
Commit eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection limiting calculations.") and commit 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport") added new XPT flags but neglected to update the show_svc_xprt_flags() macro. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29Documentation: Add the RPC language description of NLM version 4Chuck Lever1-0/+233
In order to generate source code to encode and decode NLMv4 protocol elements, include a copy of the RPC language description of NLMv4 for xdrgen to process. The language description is an amalgam of RFC 1813 and the Open Group's XNFS specification: https://pubs.opengroup.org/onlinepubs/9629799/chap10.htm The C code committed here was generated from the new nlm4.x file using tools/net/sunrpc/xdrgen/xdrgen. The goals of replacing hand-written XDR functions with ones that are tool-generated are to improve memory safety and make XDR encoding and decoding less brittle to maintain. The xdrgen utility derives both the type definitions and the encode/decode functions directly from protocol specifications, using names and symbols familiar to anyone who knows those specs. Unlike hand-written code that can inadvertently diverge from the specification, xdrgen guarantees that the generated code matches the specification exactly. We would eventually like xdrgen to generate Rust code as well, making the conversion of the kernel's NFS stacks to use Rust just a little easier for us. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFSD: Enforce timeout on layout recall and integrate lease manager fencingDai Ngo1-0/+1
When a layout conflict triggers a recall, enforcing a timeout is necessary to prevent excessive nfsd threads from being blocked in __break_lease ensuring the server continues servicing incoming requests efficiently. This patch introduces a new function to lease_manager_operations: lm_breaker_timedout: Invoked when a lease recall times out and is about to be disposed of. This function enables the lease manager to inform the caller whether the file_lease should remain on the flc_list or be disposed of. For the NFSD lease manager, this function now handles layout recall timeouts. If the layout type supports fencing and the client has not been fenced, a fence operation is triggered to prevent the client from accessing the block device. While the fencing operation is in progress, the conflicting file_lease remains on the flc_list until fencing is complete. This guarantees that no other clients can access the file, and the client with exclusive access is properly blocked before disposal. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: Fix compilation error (`make W=1`) when dprintk() is no-opAndy Shevchenko2-5/+6
Clang compiler is not happy about set but unused variables: .../flexfilelayout/flexfilelayoutdev.c:56:9: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable] .../flexfilelayout/flexfilelayout.c:1505:6: error: variable 'err' set but not used [-Werror,-Wunused-but-set-variable] .../nfs4proc.c:9244:12: error: variable 'ptr' set but not used [-Werror,-Wunused-but-set-variable] Fix these by forwarding parameters of dprintk() to no_printk(). The positive side-effect is a format-string checker enabled even for the cases when dprintk() is no-op. Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Fixes: fc931582c260 ("nfs41: create_session operation") Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29sunrpc: Kill RPC_IFDEBUG()Andy Shevchenko1-2/+0
RPC_IFDEBUG() is used in only two places. In one the user of the definition is guarded by ifdeffery, in the second one it's implied due to dprintk() usage. Kill the macro and move the ifdeffery to the regular condition with the variable defined inside, while in the second case add the same conditional and move the respective code there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Make linux/lockd/nlm.h an internal headerChuck Lever2-60/+0
The NLM protocol constants and status codes in nlm.h are needed only by lockd's internal implementation. NFS client code and NFSD interact with lockd through the stable API in bind.h and have no direct use for protocol-level definitions. Exposing these definitions globally via bind.h creates unnecessary coupling between lockd internals and its consumers. Moving nlm.h from include/linux/lockd/ to fs/lockd/ clarifies the API boundary: bind.h provides the lockd service interface, while nlm.h remains available only to code within fs/lockd/ that implements the protocol. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Move xdr.h from include/linux/lockd/ to fs/lockd/Chuck Lever2-116/+2
The lockd subsystem unnecessarily exposes internal NLM XDR type definitions through the global include path. These definitions are not used by any code outside fs/lockd/, making them inappropriate for include/linux/lockd/. Moving xdr.h to fs/lockd/ narrows the API surface and clarifies that these types are internal implementation details. The comment in linux/lockd/bind.h stating xdr.h was needed for "xdr-encoded error codes" is stale: no lockd API consumers use those codes. Forward declarations for struct nfs_fh and struct file_lock are added to bind.h because their definitions were previously pulled in transitively through xdr.h. Additionally, nfs3proc.c and proc.c need explicit includes of filelock.h for FL_CLOSE and for accessing struct file_lock members, respectively. Built and tested with lockd client/server operations. No functional change. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Remove lockd/debug.hChuck Lever1-40/+0
The lockd include structure has unnecessary indirection. The header include/linux/lockd/debug.h is consumed only by fs/lockd/lockd.h, creating an extra compilation dependency and making the code harder to navigate. Fold the debug.h definitions directly into lockd.h and remove the now-redundant header. This reduces the include tree depth and makes the debug-related definitions easier to find when working on lockd internals. Build-tested with lockd built as module and built-in. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Relocate include/linux/lockd/lockd.hChuck Lever1-401/+0
Headers placed in include/linux/ form part of the kernel's internal API and signal to subsystem maintainers that other parts of the kernel may depend on them. By moving lockd.h into fs/lockd/, lockd becomes a more self-contained module whose internal interfaces are clearly distinguished from its public contract with the rest of the kernel. This relocation addresses a long-standing XXX comment in the header itself that acknowledged the file's misplacement. Future changes to lockd internals can now proceed with confidence that external consumers are not inadvertently coupled to implementation details. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Move share.h from include/linux/lockd/ to fs/lockd/Chuck Lever2-32/+2
The share.h header defines struct nlm_share and declares the DOS share management functions used by the NLM server to implement NLM_SHARE and NLM_UNSHARE operations. These interfaces are used exclusively within the lockd subsystem. A git grep search confirms no external code references them. Relocating this header from include/linux/lockd/ to fs/lockd/ narrows the public API surface of the lockd module. Out-of-tree code cannot depend on these internal interfaces after this change. Future refactoring of the share management implementation thus requires no consideration of external consumers. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Move xdr4.h from include/linux/lockd/ to fs/lockd/Chuck Lever3-49/+4
The xdr4.h header declares NLMv4-specific XDR encoder/decoder functions and error codes that are used exclusively within the lockd subsystem. Moving it from include/linux/lockd/ to fs/lockd/ clarifies the intended scope of these declarations and prevents external code from depending on lockd-internal interfaces. This change reduces the public API surface of the lockd module and makes it easier to refactor NLMv4 internals without risk of breaking out-of-tree consumers. The header's contents are implementation details of the NLMv4 wire protocol handling, not a contract with other kernel subsystems. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29NFS: Use nlmclnt_shutdown_rpc_clnt() to safely shut down NLMChuck Lever1-0/+1
A race condition exists in shutdown_store() when writing to the sysfs "shutdown" file concurrently with nlm_shutdown_hosts_net(). Without synchronization, the following sequence can occur: 1. shutdown_store() reads server->nlm_host (non-NULL) 2. nlm_shutdown_hosts_net() acquires nlm_host_mutex, calls rpc_shutdown_client(), sets h_rpcclnt to NULL, and potentially frees the host via nlm_gc_hosts() 3. shutdown_store() dereferences the now-stale or freed host Introduce nlmclnt_shutdown_rpc_clnt(), which acquires nlm_host_mutex before accessing h_rpcclnt. This synchronizes with nlm_shutdown_hosts_net() and ensures the rpc_clnt pointer remains valid during the shutdown operation. This change also improves API layering: NFS client code no longer needs to include the internal lockd header to access nlm_host fields. The new helper resides in bind.h alongside other public lockd interfaces. Reported-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Relocate nlmsvc_unlock API declarationsChuck Lever2-6/+7
The nlmsvc_unlock_all_by_sb() and nlmsvc_unlock_all_by_ip() functions are part of lockd's external API, consumed by other kernel subsystems. Their declarations currently reside in linux/lockd/lockd.h alongside internal implementation details, which blurs the boundary between lockd's public interface and its private internals. Moving these declarations to linux/lockd/bind.h groups them with other external API functions and makes the separation explicit. This clarifies which functions are intended for external use and reduces the risk of internal implementation details leaking into the public API surface. Build-tested with allyesconfig; no functional changes. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Have nlm_fopen() return errno valuesChuck Lever2-5/+5
The nlm_fopen() function is part of the API between nfsd and lockd. Currently its return value is an on-the-wire NLM status code. But that forces NFSD to include NLM wire protocol definitions despite having no other dependency on the NLM wire protocol. In addition, a CONFIG_LOCKD_V4 Kconfig symbol appears in the middle of NFSD source code. Refactor: Let's not use on-the-wire values as part of a high-level API between two Linux kernel modules. That's what we have errno for, right? And, instead of simply moving the CONFIG_LOCKD_V4 check, we can get rid of it entirely and let the decision of what actual NLM status code goes on the wire to be left up to NLM version-specific code. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Introduce nlm__int__deadlockChuck Lever1-0/+1
The use of CONFIG_LOCKD_V4 in combination with a later cast_status() in the NLMv3 code is difficult to reason about. Instead, replace the use of nlm_deadlock with an implementation-defined status value that version-specific code translates appropriately. The new approach establishes a translation boundary: generic lockd code returns nlm__int__deadlock when posix_lock_file() yields -EDEADLK. Version-specific handlers (svc4proc.c for NLMv4, svcproc.c for NLMv3) translate this internal status to the appropriate wire protocol value. NLMv4 maps to nlm4_deadlock; NLMv3 maps to nlm_lck_denied (since NLMv3 lacks a deadlock-specific status code). Later this modification will also remove the need to include NLMv4 headers in NLMv3 and generic code. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29lockd: Relocate and rename nlm_drop_replyChuck Lever2-2/+6
The nlm_drop_reply status code is internal to the kernel's lockd implementation and must never appear on the wire. Its previous location in xdr.h grouped it with legitimate NLM protocol status codes, obscuring this critical distinction. Relocate the definition to lockd.h with a comment block for internal status codes, and rename to nlm__int__drop_reply to make its internal-only nature explicit. This prepares for adding additional internal status codes in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd/sunrpc: move rq_cachetype into struct nfsd_thread_local_infoJeff Layton1-1/+0
The svc_rqst->rq_cachetype field is only accessed by nfsd. Move it into the nfsd_thread_local_info instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29nfsd/sunrpc: add svc_rqst->rq_private pointer and remove rq_lease_breakerJeff Layton1-1/+4
rq_lease_breaker has always been a NFSv4 specific layering violation in svc_rqst. The reason it's there though is that we need a place that is thread-local, and accessible from the svc_rqst pointer. Add a new rq_private pointer to struct svc_rqst. This is intended for use by the threads that are handling the service. sunrpc code doesn't touch it. In nfsd, define a new struct nfsd_thread_local_info. nfsd declares one of these on the stack and puts a pointer to it in rq_private. Add a new ntli_lease_breaker field to the new struct and convert all of the places that access rq_lease_breaker to use the new field instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29Merge tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds4-16/+5
Pull vfs fixes from Christian Brauner: - Fix netfs_limit_iter() hitting BUG() when an ITER_KVEC iterator reaches it via core dump writes to 9P filesystems. Add ITER_KVEC handling following the same pattern as the existing ITER_BVEC code. - Fix a NULL pointer dereference in the netfs unbuffered write retry path when the filesystem (e.g., 9P) doesn't set the prepare_write operation. - Clear I_DIRTY_TIME in sync_lazytime for filesystems implementing ->sync_lazytime. Without this the flag stays set and may cause additional unnecessary calls during inode deactivation. - Increase tmpfs size in mount_setattr selftests. A recent commit bumped the ext4 image size to 2 GB but didn't adjust the tmpfs backing store, so mkfs.ext4 fails with ENOSPC writing metadata. - Fix an invalid folio access in iomap when i_blkbits matches the folio size but differs from the I/O granularity. The cur_folio pointer would not get invalidated and iomap_read_end() would still be called on it despite the IO helper owning it. - Fix hash_name() docstring. - Fix read abandonment during netfs retry where the subreq variable used for abandonment could be uninitialized on the first pass or point to a deleted subrequest on later passes. - Don't block sync for filesystems with no data integrity guarantees. Add a SB_I_NO_DATA_INTEGRITY superblock flag replacing the per-inode AS_NO_DATA_INTEGRITY mapping flag so sync kicks off writeback but doesn't wait for flusher threads. This fixes a suspend-to-RAM hang on fuse-overlayfs where the flusher thread blocks when the fuse daemon is frozen. - Fix a lockdep splat in iomap when reads fail. iomap_read_end_io() invokes fserror_report() which calls igrab() taking i_lock in hardirq context while i_lock is normally held with interrupts enabled. Kick failed read handling to a workqueue. - Remove the redundant netfs_io_stream::front member and use stream->subrequests.next instead, fixing a potential issue in the direct write code path. * tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs: Fix the handling of stream->front by removing it iomap: fix lockdep complaint when reads fail writeback: don't block sync for filesystems with no data integrity guarantees netfs: Fix read abandonment during retry vfs: fix docstring of hash_name() iomap: fix invalid folio access when i_blkbits differs from I/O granularity selftests/mount_setattr: increase tmpfs size for idmapped mount tests fs: clear I_DIRTY_TIME in sync_lazytime netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
2026-03-30Merge tag 'drm-xe-next-2026-03-26-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-nextDave Airlie1-0/+69
Hi Dave and Sima, Here goes our late, final drm-xe-next PR towards 7.1. We just purgeable BO uAPI in today, hence the late pull. In the big things we have: - Add support for purgeable buffer objects Thanks, Matt UAPI Changes: - Add support for purgeable buffer objects (Arvind, Himal) Driver Changes: - Remove useless comment (Maarten) - Issue GGTT invalidation under lock in ggtt_node_remove (Brost, Fixes) - Fix mismatched include guards in header files (Shuicheng) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/acX4fWxPkZrrfwnT@gsse-cloud1.jf.intel.com