Merge drm/drm-next into drm-misc-next

Backmerging to get v6.15-rc1 into drm-misc-next. Also fixes a build issue when enabling CONFIG_DRM_SCHED_KUNIT_TEST. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
author: Thomas Zimmermann <tzimmermann@suse.de> 2025-04-07 13:47:40 +0200
committer: Thomas Zimmermann <tzimmermann@suse.de> 2025-04-07 14:35:48 +0200
commit: 1afba39f9305fe4061a4e70baa6ebab9d41459da (patch)
tree: 67e7ed686bd75adc559025870151c4de6d649ced /Documentation/gpu
parent: drm/bridge: samsung-dsim: use dynamic lifetime management (diff)
parent: Linux 6.15-rc1 (diff)
download: wireguard-linux-1afba39f9305fe4061a4e70baa6ebab9d41459da.tar.xz
wireguard-linux-1afba39f9305fe4061a4e70baa6ebab9d41459da.zip
9 files changed, 728 insertions, 9 deletions
diff --git a/Documentation/gpu/amdgpu/amdgpu-glossary.rst b/Documentation/gpu/amdgpu/amdgpu-glossary.rst
index 00a47ebb0b0f..1e9283e076ba 100644
--- a/Documentation/gpu/amdgpu/amdgpu-glossary.rst
+++ b/Documentation/gpu/amdgpu/amdgpu-glossary.rst
@@ -12,6 +12,9 @@ we have a dedicated glossary for Display Core at
       The number of CUs that are active on the system.  The number of active
       CUs may be less than SE * SH * CU depending on the board configuration.
 
+    CE
+      Constant Engine
+
     CP
       Command Processor
 
@@ -68,6 +71,9 @@ we have a dedicated glossary for Display Core at
     IB
       Indirect Buffer
 
+    IMU
+      Integrated Management Unit (Power Management support)
+
     IP
         Intellectual Property blocks
 
@@ -80,6 +86,12 @@ we have a dedicated glossary for Display Core at
     KIQ
       Kernel Interface Queue
 
+    MC
+      Memory Controller
+
+    ME
+      MicroEngine (Graphics)
+
     MEC
       MicroEngine Compute
 
@@ -92,6 +104,9 @@ we have a dedicated glossary for Display Core at
     MQD
       Memory Queue Descriptor
 
+    PFP
+      Pre-Fetch Parser (Graphics)
+
     PPLib
       PowerPlay Library - PowerPlay is the power management component.
 
@@ -99,7 +114,10 @@ we have a dedicated glossary for Display Core at
         Platform Security Processor
 
     RLC
-      RunList Controller
+      RunList Controller. This name is a remnant of past ages and doesn't have
+      much meaning today. It's a group of general-purpose helper engines for
+      the GFX block. It's involved in GFX power management and SR-IOV, among
+      other things.
 
     SDMA
       System DMA
@@ -110,14 +128,35 @@ we have a dedicated glossary for Display Core at
     SH
       SHader array
 
-    SMU
-      System Management Unit
+    SMU/SMC
+      System Management Unit / System Management Controller
+
+    SRLC
+      Save/Restore List Control
+
+    SRLG
+      Save/Restore List GPM_MEM
+
+    SRLS
+      Save/Restore List SRM_MEM
 
     SS
       Spread Spectrum
 
+    TA
+      Trusted Application
+
+    TOC
+      Table of Contents
+
+    UVD
+      Unified Video Decoder
+
     VCE
       Video Compression Engine
 
     VCN
       Video Codec Next
+
+    VPE
+      Video Processing Engine
diff --git a/Documentation/gpu/amdgpu/display/dc-glossary.rst b/Documentation/gpu/amdgpu/display/dc-glossary.rst
index 0b0ffd428dd2..7dc034e9e586 100644
--- a/Documentation/gpu/amdgpu/display/dc-glossary.rst
+++ b/Documentation/gpu/amdgpu/display/dc-glossary.rst
@@ -167,9 +167,6 @@ consider asking in the amdgfx and update this page.
     MALL
       Memory Access at Last Level
 
-    MC
-      Memory Controller
-
     MPC/MPCC
       Multiple pipes and plane combine
 
@@ -232,6 +229,3 @@ consider asking in the amdgfx and update this page.
 
     VRR
       Variable Refresh Rate
-
-    UVD
-      Unified Video Decoder
diff --git a/Documentation/gpu/drivers.rst b/Documentation/gpu/drivers.rst
index 74fc2cbf1b6f..78b80be17f21 100644
--- a/Documentation/gpu/drivers.rst
+++ b/Documentation/gpu/drivers.rst
@@ -25,6 +25,7 @@ GPU Driver Documentation
    panfrost
    panthor
    zynqmp
+   nova/index
 
 .. only::  subproject and html
 
diff --git a/Documentation/gpu/nova/core/guidelines.rst b/Documentation/gpu/nova/core/guidelines.rst
new file mode 100644
index 000000000000..a389d65d7982
--- /dev/null
+++ b/Documentation/gpu/nova/core/guidelines.rst
@@ -0,0 +1,24 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+==========
+Guidelines
+==========
+
+This documents contains the guidelines for nova-core. Additionally, all common
+guidelines of the Nova project do apply.
+
+Driver API
+==========
+
+One main purpose of nova-core is to implement the abstraction around the
+firmware interface of GSP and provide a firmware (version) independent API for
+2nd level drivers, such as nova-drm or the vGPU manager VFIO driver.
+
+Therefore, it is not permitted to leak firmware (version) specifics, through the
+driver API, to 2nd level drivers.
+
+Acceptance Criteria
+===================
+
+- To the extend possible, patches submitted to nova-core must be tested for
+  regressions with all 2nd level drivers.
diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst
new file mode 100644
index 000000000000..ca08377d3b73
--- /dev/null
+++ b/Documentation/gpu/nova/core/todo.rst
@@ -0,0 +1,446 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+=========
+Task List
+=========
+
+Tasks may have the following fields:
+
+- ``Complexity``: Describes the required familiarity with Rust and / or the
+  corresponding kernel APIs or subsystems. There are four different complexities,
+  ``Beginner``, ``Intermediate``, ``Advanced`` and ``Expert``.
+- ``Reference``: References to other tasks.
+- ``Link``: Links to external resources.
+- ``Contact``: The person that can be contacted for further information about
+  the task.
+
+Enablement (Rust)
+=================
+
+Tasks that are not directly related to nova-core, but are preconditions in terms
+of required APIs.
+
+FromPrimitive API
+-----------------
+
+Sometimes the need arises to convert a number to a value of an enum or a
+structure.
+
+A good example from nova-core would be the ``Chipset`` enum type, which defines
+the value ``AD102``. When probing the GPU the value ``0x192`` can be read from a
+certain register indication the chipset AD102. Hence, the enum value ``AD102``
+should be derived from the number ``0x192``. Currently, nova-core uses a custom
+implementation (``Chipset::from_u32`` for this.
+
+Instead, it would be desirable to have something like the ``FromPrimitive``
+trait [1] from the num crate.
+
+Having this generalization also helps with implementing a generic macro that
+automatically generates the corresponding mappings between a value and a number.
+
+| Complexity: Beginner
+| Link: https://docs.rs/num/latest/num/trait.FromPrimitive.html
+
+Generic register abstraction
+----------------------------
+
+Work out how register constants and structures can be automatically generated
+through generalized macros.
+
+Example:
+
+.. code-block:: rust
+
+	register!(BOOT0, 0x0, u32, pci::Bar<SIZE>, Fields [
+	   MINOR_REVISION(3:0, RO),
+	   MAJOR_REVISION(7:4, RO),
+	   REVISION(7:0, RO), // Virtual register combining major and minor rev.
+	])
+
+This could expand to something like:
+
+.. code-block:: rust
+
+	const BOOT0_OFFSET: usize = 0x00000000;
+	const BOOT0_MINOR_REVISION_SHIFT: u8 = 0;
+	const BOOT0_MINOR_REVISION_MASK: u32 = 0x0000000f;
+	const BOOT0_MAJOR_REVISION_SHIFT: u8 = 4;
+	const BOOT0_MAJOR_REVISION_MASK: u32 = 0x000000f0;
+	const BOOT0_REVISION_SHIFT: u8 = BOOT0_MINOR_REVISION_SHIFT;
+	const BOOT0_REVISION_MASK: u32 = BOOT0_MINOR_REVISION_MASK | BOOT0_MAJOR_REVISION_MASK;
+
+	struct Boot0(u32);
+
+	impl Boot0 {
+	   #[inline]
+	   fn read(bar: &RevocableGuard<'_, pci::Bar<SIZE>>) -> Self {
+	      Self(bar.readl(BOOT0_OFFSET))
+	   }
+
+	   #[inline]
+	   fn minor_revision(&self) -> u32 {
+	      (self.0 & BOOT0_MINOR_REVISION_MASK) >> BOOT0_MINOR_REVISION_SHIFT
+	   }
+
+	   #[inline]
+	   fn major_revision(&self) -> u32 {
+	      (self.0 & BOOT0_MAJOR_REVISION_MASK) >> BOOT0_MAJOR_REVISION_SHIFT
+	   }
+
+	   #[inline]
+	   fn revision(&self) -> u32 {
+	      (self.0 & BOOT0_REVISION_MASK) >> BOOT0_REVISION_SHIFT
+	   }
+	}
+
+Usage:
+
+.. code-block:: rust
+
+	let bar = bar.try_access().ok_or(ENXIO)?;
+
+	let boot0 = Boot0::read(&bar);
+	pr_info!("Revision: {}\n", boot0.revision());
+
+| Complexity: Advanced
+
+Delay / Sleep abstractions
+--------------------------
+
+Rust abstractions for the kernel's delay() and sleep() functions.
+
+FUJITA Tomonori plans to work on abstractions for read_poll_timeout_atomic()
+(and friends) [1].
+
+| Complexity: Beginner
+| Link: https://lore.kernel.org/netdev/20250228.080550.354359820929821928.fujita.tomonori@gmail.com/ [1]
+
+IRQ abstractions
+----------------
+
+Rust abstractions for IRQ handling.
+
+There is active ongoing work from Daniel Almeida [1] for the "core" abstractions
+to request IRQs.
+
+Besides optional review and testing work, the required ``pci::Device`` code
+around those core abstractions needs to be worked out.
+
+| Complexity: Intermediate
+| Link: https://lore.kernel.org/lkml/20250122163932.46697-1-daniel.almeida@collabora.com/ [1]
+| Contact: Daniel Almeida
+
+Page abstraction for foreign pages
+----------------------------------
+
+Rust abstractions for pages not created by the Rust page abstraction without
+direct ownership.
+
+There is active onging work from Abdiel Janulgue [1] and Lina [2].
+
+| Complexity: Advanced
+| Link: https://lore.kernel.org/linux-mm/20241119112408.779243-1-abdiel.janulgue@gmail.com/ [1]
+| Link: https://lore.kernel.org/rust-for-linux/20250202-rust-page-v1-0-e3170d7fe55e@asahilina.net/ [2]
+
+Scatterlist / sg_table abstractions
+-----------------------------------
+
+Rust abstractions for scatterlist / sg_table.
+
+There is preceding work from Abdiel Janulgue, which hasn't made it to the
+mailing list yet.
+
+| Complexity: Intermediate
+| Contact: Abdiel Janulgue
+
+ELF utils
+---------
+
+Rust implementation of ELF header representation to retrieve section header
+tables, names, and data from an ELF-formatted images.
+
+There is preceding work from Abdiel Janulgue, which hasn't made it to the
+mailing list yet.
+
+| Complexity: Beginner
+| Contact: Abdiel Janulgue
+
+PCI MISC APIs
+-------------
+
+Extend the existing PCI device / driver abstractions by SR-IOV, config space,
+capability, MSI API abstractions.
+
+| Complexity: Beginner
+
+Auxiliary bus abstractions
+--------------------------
+
+Rust abstraction for the auxiliary bus APIs.
+
+This is needed to connect nova-core to the nova-drm driver.
+
+| Complexity: Intermediate
+
+Debugfs abstractions
+--------------------
+
+Rust abstraction for debugfs APIs.
+
+| Reference: Export GSP log buffers
+| Complexity: Intermediate
+
+Vec extensions
+--------------
+
+Implement ``Vec::truncate`` and ``Vec::resize``.
+
+Currently this is used for some experimental code to parse the vBIOS.
+
+| Reference vBIOS support
+| Complexity: Beginner
+
+GPU (general)
+=============
+
+Parse firmware headers
+----------------------
+
+Parse ELF headers from the firmware files loaded from the filesystem.
+
+| Reference: ELF utils
+| Complexity: Beginner
+| Contact: Abdiel Janulgue
+
+Build radix3 page table
+-----------------------
+
+Build the radix3 page table to map the firmware.
+
+| Complexity: Intermediate
+| Contact: Abdiel Janulgue
+
+vBIOS support
+-------------
+
+Parse the vBIOS and probe the structures required for driver initialization.
+
+| Contact: Dave Airlie
+| Reference: Vec extensions
+| Complexity: Intermediate
+
+Initial Devinit support
+-----------------------
+
+Implement BIOS Device Initialization, i.e. memory sizing, waiting, PLL
+configuration.
+
+| Contact: Dave Airlie
+| Complexity: Beginner
+
+Boot Falcon controller
+----------------------
+
+Infrastructure to load and execute falcon (sec2) firmware images; handle the
+GSP falcon processor and fwsec loading.
+
+| Complexity: Advanced
+| Contact: Dave Airlie
+
+GPU Timer support
+-----------------
+
+Support for the GPU's internal timer peripheral.
+
+| Complexity: Beginner
+| Contact: Dave Airlie
+
+MMU / PT management
+-------------------
+
+Work out the architecture for MMU / page table management.
+
+We need to consider that nova-drm will need rather fine-grained control,
+especially in terms of locking, in order to be able to implement asynchronous
+Vulkan queues.
+
+While generally sharing the corresponding code is desirable, it needs to be
+evaluated how (and if at all) sharing the corresponding code is expedient.
+
+| Complexity: Expert
+
+VRAM memory allocator
+---------------------
+
+Investigate options for a VRAM memory allocator.
+
+Some possible options:
+  - Rust abstractions for
+    - RB tree (interval tree) / drm_mm
+    - maple_tree
+  - native Rust collections
+
+| Complexity: Advanced
+
+Instance Memory
+---------------
+
+Implement support for instmem (bar2) used to store page tables.
+
+| Complexity: Intermediate
+| Contact: Dave Airlie
+
+GPU System Processor (GSP)
+==========================
+
+Export GSP log buffers
+----------------------
+
+Recent patches from Timur Tabi [1] added support to expose GSP-RM log buffers
+(even after failure to probe the driver) through debugfs.
+
+This is also an interesting feature for nova-core, especially in the early days.
+
+| Link: https://lore.kernel.org/nouveau/20241030202952.694055-2-ttabi@nvidia.com/ [1]
+| Reference: Debugfs abstractions
+| Complexity: Intermediate
+
+GSP firmware abstraction
+------------------------
+
+The GSP-RM firmware API is unstable and may incompatibly change from version to
+version, in terms of data structures and semantics.
+
+This problem is one of the big motivations for using Rust for nova-core, since
+it turns out that Rust's procedural macro feature provides a rather elegant way
+to address this issue:
+
+1. generate Rust structures from the C headers in a separate namespace per version
+2. build abstraction structures (within a generic namespace) that implement the
+   firmware interfaces; annotate the differences in implementation with version
+   identifiers
+3. use a procedural macro to generate the actual per version implementation out
+   of this abstraction
+4. instantiate the correct version type one on runtime (can be sure that all
+   have the same interface because it's defined by a common trait)
+
+There is a PoC implementation of this pattern, in the context of the nova-core
+PoC driver.
+
+This task aims at refining the feature and ideally generalize it, to be usable
+by other drivers as well.
+
+| Complexity: Expert
+
+GSP message queue
+-----------------
+
+Implement low level GSP message queue (command, status) for communication
+between the kernel driver and GSP.
+
+| Complexity: Advanced
+| Contact: Dave Airlie
+
+Bootstrap GSP
+-------------
+
+Call the boot firmware to boot the GSP processor; execute initial control
+messages.
+
+| Complexity: Intermediate
+| Contact: Dave Airlie
+
+Client / Device APIs
+--------------------
+
+Implement the GSP message interface for client / device allocation and the
+corresponding client and device allocation APIs.
+
+| Complexity: Intermediate
+| Contact: Dave Airlie
+
+Bar PDE handling
+----------------
+
+Synchronize page table handling for BARs between the kernel driver and GSP.
+
+| Complexity: Beginner
+| Contact: Dave Airlie
+
+FIFO engine
+-----------
+
+Implement support for the FIFO engine, i.e. the corresponding GSP message
+interface and provide an API for chid allocation and channel handling.
+
+| Complexity: Advanced
+| Contact: Dave Airlie
+
+GR engine
+---------
+
+Implement support for the graphics engine, i.e. the corresponding GSP message
+interface and provide an API for (golden) context creation and promotion.
+
+| Complexity: Advanced
+| Contact: Dave Airlie
+
+CE engine
+---------
+
+Implement support for the copy engine, i.e. the corresponding GSP message
+interface.
+
+| Complexity: Intermediate
+| Contact: Dave Airlie
+
+VFN IRQ controller
+------------------
+
+Support for the VFN interrupt controller.
+
+| Complexity: Intermediate
+| Contact: Dave Airlie
+
+External APIs
+=============
+
+nova-core base API
+------------------
+
+Work out the common pieces of the API to connect 2nd level drivers, i.e. vGPU
+manager and nova-drm.
+
+| Complexity: Advanced
+
+vGPU manager API
+----------------
+
+Work out the API parts required by the vGPU manager, which are not covered by
+the base API.
+
+| Complexity: Advanced
+
+nova-core C API
+---------------
+
+Implement a C wrapper for the APIs required by the vGPU manager driver.
+
+| Complexity: Intermediate
+
+Testing
+=======
+
+CI pipeline
+-----------
+
+Investigate option for continuous integration testing.
+
+This can go from as simple as running KUnit tests over running (graphics) CTS to
+booting up (multiple) guest VMs to test VFIO use-cases.
+
+It might also be worth to consider the introduction of a new test suite directly
+sitting on top of the uAPI for more targeted testing and debugging. There may be
+options for collaboration / shared code with the Mesa project.
+
+| Complexity: Advanced
diff --git a/Documentation/gpu/nova/guidelines.rst b/Documentation/gpu/nova/guidelines.rst
new file mode 100644
index 000000000000..13ab13984a18
--- /dev/null
+++ b/Documentation/gpu/nova/guidelines.rst
@@ -0,0 +1,69 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+==========
+Guidelines
+==========
+
+This document describes the general project guidelines that apply to nova-core
+and nova-drm.
+
+Language
+========
+
+The Nova project uses the Rust programming language. In this context, all rules
+of the Rust for Linux project as documented in
+:doc:`../../rust/general-information` apply. Additionally, the following rules
+apply.
+
+- Unless technically necessary otherwise (e.g. uAPI), any driver code is written
+  in Rust.
+
+- Unless technically necessary, unsafe Rust code must be avoided. In case of
+  technical necessity, unsafe code should be isolated in a separate component
+  providing a safe API for other driver code to use.
+
+Style
+-----
+
+All rules of the Rust for Linux project as documented in
+:doc:`../../rust/coding-guidelines` apply.
+
+For a submit checklist, please also see the `Rust for Linux Submit checklist
+addendum <https://rust-for-linux.com/contributing#submit-checklist-addendum>`_.
+
+Documentation
+=============
+
+The availability of proper documentation is essential in terms of scalability,
+accessibility for new contributors and maintainability of a project in general,
+but especially for a driver running as complex hardware as Nova is targeting.
+
+Hence, adding documentation of any kind is very much encouraged by the project.
+
+Besides that, there are some minimum requirements.
+
+- Every non-private structure needs at least a brief doc comment explaining the
+  semantical sense of the structure, as well as potential locking and lifetime
+  requirements. It is encouraged to have the same minimum documentation for
+  non-trivial private structures.
+
+- uAPIs must be fully documented with kernel-doc comments; additionally, the
+  semantical behavior must be explained including potential special or corner
+  cases.
+
+- The APIs connecting the 1st level driver (nova-core) with 2nd level drivers
+  must be fully documented. This includes doc comments, potential locking and
+  lifetime requirements, as well as example code if applicable.
+
+- Abbreviations must be explained when introduced; terminology must be uniquely
+  defined.
+
+- Register addresses, layouts, shift values and masks must be defined properly;
+  unless obvious, the semantical sense must be documented. This only applies if
+  the author is able to obtain the corresponding information.
+
+Acceptance Criteria
+===================
+
+- Patches must only be applied if reviewed by at least one other person on the
+  mailing list; this also applies for maintainers.
diff --git a/Documentation/gpu/nova/index.rst b/Documentation/gpu/nova/index.rst
new file mode 100644
index 000000000000..2701b3f4af35
--- /dev/null
+++ b/Documentation/gpu/nova/index.rst
@@ -0,0 +1,30 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+=======================
+nova NVIDIA GPU drivers
+=======================
+
+The nova driver project consists out of two separate drivers nova-core and
+nova-drm and intends to supersede the nouveau driver for NVIDIA GPUs based on
+the GPU System Processor (GSP).
+
+The following documents apply to both nova-core and nova-drm.
+
+.. toctree::
+   :titlesonly:
+
+   guidelines
+
+nova-core
+=========
+
+The nova-core driver is the core driver for NVIDIA GPUs based on GSP. nova-core,
+as the 1st level driver, provides an abstraction around the GPUs hard- and
+firmware interfaces providing a common base for 2nd level drivers, such as the
+vGPU manager VFIO driver and the nova-drm driver.
+
+.. toctree::
+   :titlesonly:
+
+   core/guidelines
+   core/todo
diff --git a/Documentation/gpu/rfc/gpusvm.rst b/Documentation/gpu/rfc/gpusvm.rst
new file mode 100644
index 000000000000..bcf66a8137a6
--- /dev/null
+++ b/Documentation/gpu/rfc/gpusvm.rst
@@ -0,0 +1,112 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+===============
+GPU SVM Section
+===============
+
+Agreed upon design principles
+=============================
+
+* migrate_to_ram path
+	* Rely only on core MM concepts (migration PTEs, page references, and
+	  page locking).
+	* No driver specific locks other than locks for hardware interaction in
+	  this path. These are not required and generally a bad idea to
+	  invent driver defined locks to seal core MM races.
+	* An example of a driver-specific lock causing issues occurred before
+	  fixing do_swap_page to lock the faulting page. A driver-exclusive lock
+	  in migrate_to_ram produced a stable livelock if enough threads read
+	  the faulting page.
+	* Partial migration is supported (i.e., a subset of pages attempting to
+	  migrate can actually migrate, with only the faulting page guaranteed
+	  to migrate).
+	* Driver handles mixed migrations via retry loops rather than locking.
+* Eviction
+	* Eviction is defined as migrating data from the GPU back to the
+	  CPU without a virtual address to free up GPU memory.
+	* Only looking at physical memory data structures and locks as opposed to
+	  looking at virtual memory data structures and locks.
+	* No looking at mm/vma structs or relying on those being locked.
+	* The rationale for the above two points is that CPU virtual addresses
+	  can change at any moment, while the physical pages remain stable.
+	* GPU page table invalidation, which requires a GPU virtual address, is
+	  handled via the notifier that has access to the GPU virtual address.
+* GPU fault side
+	* mmap_read only used around core MM functions which require this lock
+	  and should strive to take mmap_read lock only in GPU SVM layer.
+	* Big retry loop to handle all races with the mmu notifier under the gpu
+	  pagetable locks/mmu notifier range lock/whatever we end up calling
+          those.
+	* Races (especially against concurrent eviction or migrate_to_ram)
+	  should not be handled on the fault side by trying to hold locks;
+	  rather, they should be handled using retry loops. One possible
+	  exception is holding a BO's dma-resv lock during the initial migration
+	  to VRAM, as this is a well-defined lock that can be taken underneath
+	  the mmap_read lock.
+	* One possible issue with the above approach is if a driver has a strict
+	  migration policy requiring GPU access to occur in GPU memory.
+	  Concurrent CPU access could cause a livelock due to endless retries.
+	  While no current user (Xe) of GPU SVM has such a policy, it is likely
+	  to be added in the future. Ideally, this should be resolved on the
+	  core-MM side rather than through a driver-side lock.
+* Physical memory to virtual backpointer
+	* This does not work, as no pointers from physical memory to virtual
+	  memory should exist. mremap() is an example of the core MM updating
+	  the virtual address without notifying the driver of address
+	  change rather the driver only receiving the invalidation notifier.
+	* The physical memory backpointer (page->zone_device_data) should remain
+	  stable from allocation to page free. Safely updating this against a
+	  concurrent user would be very difficult unless the page is free.
+* GPU pagetable locking
+	* Notifier lock only protects range tree, pages valid state for a range
+	  (rather than seqno due to wider notifiers), pagetable entries, and
+	  mmu notifier seqno tracking, it is not a global lock to protect
+          against races.
+	* All races handled with big retry as mentioned above.
+
+Overview of baseline design
+===========================
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
+   :doc: Overview
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
+   :doc: Locking
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
+   :doc: Migration
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
+   :doc: Partial Unmapping of Ranges
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
+   :doc: Examples
+
+Possible future design features
+===============================
+
+* Concurrent GPU faults
+	* CPU faults are concurrent so makes sense to have concurrent GPU
+	  faults.
+	* Should be possible with fined grained locking in the driver GPU
+	  fault handler.
+	* No expected GPU SVM changes required.
+* Ranges with mixed system and device pages
+	* Can be added if required to drm_gpusvm_get_pages fairly easily.
+* Multi-GPU support
+	* Work in progress and patches expected after initially landing on GPU
+	  SVM.
+	* Ideally can be done with little to no changes to GPU SVM.
+* Drop ranges in favor of radix tree
+	* May be desirable for faster notifiers.
+* Compound device pages
+	* Nvidia, AMD, and Intel all have agreed expensive core MM functions in
+	  migrate device layer are a performance bottleneck, having compound
+	  device pages should help increase performance by reducing the number
+	  of these expensive calls.
+* Higher order dma mapping for migration
+	* 4k dma mapping adversely affects migration performance on Intel
+	  hardware, higher order (2M) dma mapping should help here.
+* Build common userptr implementation on top of GPU SVM
+* Driver side madvise implementation and migration policies
+* Pull in pending dma-mapping API changes from Leon / Nvidia when these land
diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
index 476719771eef..396e535377fb 100644
--- a/Documentation/gpu/rfc/index.rst
+++ b/Documentation/gpu/rfc/index.rst
@@ -18,6 +18,10 @@ host such documentation:
 
 .. toctree::
 
+    gpusvm.rst
+
+.. toctree::
+
     i915_gem_lmem.rst
 
 .. toctree::
author	Thomas Zimmermann <tzimmermann@suse.de>	2025-04-07 13:47:40 +0200
committer	Thomas Zimmermann <tzimmermann@suse.de>	2025-04-07 14:35:48 +0200
commit	1afba39f9305fe4061a4e70baa6ebab9d41459da (patch)
tree	67e7ed686bd75adc559025870151c4de6d649ced /Documentation/gpu
parent	drm/bridge: samsung-dsim: use dynamic lifetime management (diff)
parent	Linux 6.15-rc1 (diff)
download	wireguard-linux-1afba39f9305fe4061a4e70baa6ebab9d41459da.tar.xz wireguard-linux-1afba39f9305fe4061a4e70baa6ebab9d41459da.zip