aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-04-11drm/xe/guc: Refactor GuC debugfs initializationMichal Wajdeczko1-63/+67
We don't have to drmm_kmalloc() local copy of debugfs_list to write there our pointer to the struct xe_guc as we can extract pointer to the struct xe_gt from the grandparent debugfs entry, in similar way to what we did for GT debugfs files. Note that there is no change in file/directory structure, just refactored how files are created and how functions are called. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250403142635.1821-2-michal.wajdeczko@intel.com
2025-04-10drm/xe: Allow to drop vram resizingLucas De Marchi2-3/+6
The default behavior if the LMEMBAR doesn't match the maximum possible size is to try to resize it. However the user might want to keep, even for testing the behavior with small BAR, whatever size was set via sysfs. Change the module parameter to int and check for negative value. Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250409-bar-resize-param-v1-1-75bf4df38aa0@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-10drm/xe/guc: Bump the recommended GuC version to 70.44.1John Harrison1-10/+10
A new workaround requires a newer GuC version. So, recommend that users install it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250403185619.1555853-6-John.C.Harrison@Intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-10drm/xe/guc: Enable w/a 16026508708John Harrison3-0/+8
The workaround is only relevant to SRIOV but does affect all platforms. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250403185619.1555853-2-John.C.Harrison@Intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-09drm/xe: Add page queue multiplierMatthew Brost1-2/+9
For an unknown reason the math to determine the PF queue size does is not correct - compute UMD applications are overflowing the PF queue which is fatal. A multippier of 8 fixes the problem. Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com
2025-04-09drm/xe: remove unused LE_COSShuicheng Lin1-1/+0
The LE_COS definition missed passing the value parameter to REG_FIELD_PREP. This didn't cause build errors because the entire macro was unused. The value for this field is universally "0" for every MOCS entry on the old Xe_LP platforms, and the whole field has been removed from Xe_HP onward. Just delete the line so that we don't have an unused definition. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250405171539.599850-1-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-04-08drm/xe: Enable configfs support for survivability modeRiana Tauro6-19/+108
Enable survivability mode if supported and configfs attribute is set. Enabling survivability mode manually is useful in cases where pcode does not detect failure, validation and for IFR (in-field-repair). To set configfs survivability mode attribute for a device echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode The card enters survivability mode if supported v2: add a log if survivability mode is enabled for unsupported platforms (Rodrigo) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250407051414.1651616-4-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-08drm/xe: Add documentation for survivability modeRiana Tauro2-11/+30
Add survivability mode document to pcode document as it is enabled when pcode detects a failure. v2: fix kernel-doc (Lucas) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250407051414.1651616-3-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-08drm/xe: Add configfs to enable survivability modeRiana Tauro6-0/+221
Registers a configfs subsystem called 'xe' that creates a directory in the mounted configfs directory (/sys/kernel/config) Userspace can then create the device that has to be configured under the xe directory mkdir /sys/kernel/config/xe/0000:03:00.0 The device created will have the following attributes to be configured /sys/kernel/config/xe/ .. 0000:03:00.0/ ... survivability_mode v2: fix kernel-doc fix return value (Lucas) v3: fix kernel-doc (Lucas) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250407051414.1651616-2-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-07drm/xe: Fix taking invalid lock on wedgeLucas De Marchi2-0/+14
If device wedges on e.g. GuC upload, the submission is not yet enabled and the state is not even initialized. Protect the wedge call so it does nothing in this case. It fixes the following splat: [] xe 0000:bf:00.0: [drm] device wedged, needs recovery [] ------------[ cut here ]------------ [] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [] WARNING: CPU: 48 PID: 312 at kernel/locking/mutex.c:564 __mutex_lock+0x8a1/0xe60 ... [] RIP: 0010:__mutex_lock+0x8a1/0xe60 [] mutex_lock_nested+0x1b/0x30 [] xe_guc_submit_wedge+0x80/0x2b0 [xe] Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20250402-warn-after-wedge-v1-1-93e971511fa5@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-07drm/xe: Ensure XE_BO_FLAG_CPU_ADDR_MIRROR has a unique valueMatt Roper1-10/+11
When XE_BO_FLAG_PINNED_NORESTORE and XE_BO_FLAG_PINNED_LATE_RESTORE were added, they were assigned BO flag values in the middle of the flag range, requiring renumbering of the higher flags. In both cases, XE_BO_FLAG_CPU_ADDR_MIRROR was overlooked during renumbering because it was defined below XE_BO_FLAG_GGTT_ALL and thus was not immediately visible in code diffs changing this area of the code; this resulted in XE_BO_FLAG_CPU_ADDR_MIRROR clashing with another flag. Assign XE_BO_FLAG_CPU_ADDR_MIRROR a unique value, and also move the definition of XE_BO_FLAG_GGTT_ALL down below all of the individual flags so that this kind of mistake is less likely in the future. Also, while we're at it, fix up some space vs tab whitespace inconsistency in these flag definitions. Fixes: 7f387e6012b6 ("drm/xe: add XE_BO_FLAG_PINNED_LATE_RESTORE") Fixes: 045448da87bf ("drm/xe: Add XE_BO_FLAG_PINNED_NORESTORE") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250404220053.1758356-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-04-07drm/xe: Allow scratch page under fault mode for certain platformOak Zeng2-2/+7
Normally scratch page is not allowed when a vm is operate under page fault mode, i.e., in the existing codes, DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE and DRM_XE_VM_CREATE_FLAG_FAULT_MODE are mutual exclusive. The reason is fault mode relies on recoverable page to work, while scratch page can mute recoverable page fault. On xe2 and xe3, out of bound prefetch can cause page fault and further system hang because xekmd can't resolve such page fault. SYCL and OCL language runtime requires out of bound prefetch to be silently dropped without causing any functional problem, thus the existing behavior doesn't meet language runtime requirement. At the same time, HW prefetching can cause page fault interrupt. Due to page fault interrupt overhead (i.e., need Guc and KMD involved to fix the page fault), HW prefetching can be slowed by many orders of magnitude. Fix those problems by allowing scratch page under fault mode for xe2 and xe3. With scratch page in place, HW prefetching could always hit scratch page instead of causing interrupt. A side effect is, scratch page could hide application program error. Application out of bound accesses are hided by scratch page mapping, instead of get reported to user. v2: Refine commit message (Thomas) v3: Move the scratch page flag check to after scratch page wa (Thomas) v4: drop NEEDS_SCRATCH macro (matt) Add a comment to DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE Signed-off-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250403165328.2438690-4-oak.zeng@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-04-07drm/xe: Clear scratch page on vm_bindOak Zeng3-41/+88
When a vm runs under fault mode, if scratch page is enabled, we need to clear the scratch page mapping on vm_bind for the vm_bind address range. Under fault mode, we depend on recoverable page fault to establish mapping in page table. If scratch page is not cleared, GPU access of address won't cause page fault because it always hits the existing scratch page mapping. When vm_bind with IMMEDIATE flag, there is no need of clearing as immediate bind can overwrite the scratch page mapping. So far only is xe2 and xe3 products are allowed to enable scratch page under fault mode. On other platform we don't allow scratch page under fault mode, so no need of such clearing. v2: Rework vm_bind pipeline to clear scratch page mapping. This is similar to a map operation, with the exception that PTEs are cleared instead of pointing to valid physical pages. (Matt, Thomas) TLB invalidation is needed after clear scratch page mapping as larger scratch page mapping could be backed by physical page and cached in TLB. (Matt, Thomas) v3: Fix the case of clearing huge pte (Thomas) Improve commit message (Thomas) v4: TLB invalidation on all LR cases, not only the clear on bind cases (Thomas) v5: Misc cosmetic changes (Matt) Drop pt_update_ops.invalidate_on_bind. Directly wire xe_vma_op.map.invalidata_on_bind to bind_op_prepare/commit (Matt) v6: checkpatch fix (Matt) v7: No need to check platform needs_scratch deciding invalidate_on_bind (Matt) v8: rebase v9: rebase v10: fix an error in xe_pt_stage_bind_entry, introduced in v9 rebase Signed-off-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250403165328.2438690-3-oak.zeng@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-04-07drm/xe: Introduced needs_scratch bit in device descriptorOak Zeng2-0/+7
On some platform, scratch page is needed for out of bound prefetch to work. Introduce a bit in device descriptor to specify whether this device needs scratch page to work. v2: introduce a needs_scratch bit in device info (Thomas, Jonathan) v3: drop NEEDS_SCRATCH macro (Matt) Signed-off-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250403165328.2438690-2-oak.zeng@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-04-04drm/xe/sriov: support non-contig VRAM provisioningMatthew Auld1-6/+14
Currently we can run into issues with provisioning VRAM region, due to requiring contig VRAM BO underneath. We sometimes see that allocation (multiple GB) can fail even when there is enough free space. We don't need CPU access to the buffer in the first place, so can forgo pin_map and therefore also the contig requirement. Keep the same behavior with save and restore during suspend/resume (which can now be done with blitter). We also need the VRAM to occupy the same pages so we don't need to re-program the LMTT, so should still remain pinned (also we don't want something to try evict it). With that covert over to plain pinned kernel object. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-16-matthew.auld@intel.com
2025-04-04drm/xe: allow non-contig VRAM kernel BOMatthew Auld1-1/+8
If the kernel bo doesn't care about vmap(), either directly or indirectly with save/restore then we don't need to force contig for such buffers. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-15-matthew.auld@intel.com
2025-04-04drm/xe: unconditionally apply PINNED for pin_map()Matthew Auld8-14/+10
Some users apply PINNED and some don't when using pin_map(). The pin in pin_map() should imply PINNED so just unconditionally apply it and clean up all users. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-14-matthew.auld@intel.com
2025-04-04drm/xe: add XE_BO_FLAG_PINNED_LATE_RESTOREMatthew Auld9-52/+93
With the idea of having more pinned objects using the blitter engine where possible, during suspend/resume, mark the pinned objects which can be done during the late phase once submission/migration has been setup. Start out simple with lrc and page-tables from userspace. v2: - s/early_restore/late_restore; early restore was way too bold with too many places being impacted at once. v3: - Split late vs early into separate lists, to align with newly added apply-to-pinned infra. v4: - Rebase. v5: - Make sure we restore the late phase kernel_bo_present in igpu. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-13-matthew.auld@intel.com
2025-04-04drm/xe/migrate: ignore CCS for kernel objectsMatthew Auld1-3/+6
For kernel BOs we don't clear the CCS state on creation, therefore we should be careful to ignore it when copying pages. In a future patch we opt for using the copy path here for kernel BOs, so this now needs to be considered. v2: - Drop bogus asserts (CI) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-12-matthew.auld@intel.com
2025-04-04drm/xe: Add XE_BO_FLAG_PINNED_NORESTOREMatthew Brost8-11/+22
Not all BOs need to be restored on resume / d3cold exit, add XE_BO_FLAG_PINNED_NO_RESTORE which skips restoring of BOs rather just allocates VRAM for the BO. This should slightly speedup resume / d3cold exit flows. Marking GuC ADS, GuC CT, GuC log, GuC PC, and SA as NORESTORE. v2: - s/WONTNEED/NORESTORE (Vivi) - Rebase on newly added g2g and backup object flow Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-11-matthew.auld@intel.com
2025-04-04drm/xe: use backup object for pinned save/restoreMatthew Auld3-152/+167
Currently we move pinned objects, relying on the fact that the lpfn/fpfn will force the placement to occupy the same pages when restoring. However this then limits all such pinned objects to be contig underneath. In addition it is likely a little fragile moving pinned objects in the first place. Rather than moving such objects rather copy the page contents to a secondary system memory object, that way the VRAM pages never move and remain pinned. This also opens the door for eventually having non-contig pinned objects that can also be saved/restored using blitter. v2: - Make sure to drop the fence ref. - Handle NULL bo->migrate. v3: - Ensure we add the copy fence to the BOs, otherwise backup_obj can be freed before pipelined copy finishes. v4: - Rebase on newly added apply-to-pinned infra. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1182 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-10-matthew.auld@intel.com
2025-04-04drm/xe/xe2hpg: Add Wa_16025250150Aradhya Bhatia2-0/+24
Add Wa_16025250150 for the Xe2_HPG (graphics version: 20.01) platforms. It is a permanent workaround, and applicable on all the steppings. Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250325134421.1489416-1-aradhya.bhatia@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-04-03drm/xe: Fix xe_pt_stage_bind_walk kerneldocThomas Hellström1-5/+9
The structure was missing a proper kerneldoc header and once that was added a number of typos and errors became obvious. Fix those. Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> Closes: https://lore.kernel.org/intel-xe/x53tcs5bjldw6lcorjemuheklxcmepdvr2u7lvt3hpqrzqoc4h@nsu6hs25taqj/ Fixes: b2d4b03b03a7 ("drm/xe: Make the PT code handle placement per PTE rather than per vma / range") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250402122924.25526-1-thomas.hellstrom@linux.intel.com
2025-04-02drm/xe/pmu: Add GT frequency eventsVinay Belgaumkar3-3/+52
Define PMU events for GT frequency (actual and requested). The instantaneous values for these frequencies will be displayed. Following PMU events are being added: xe_0000_00_02.0/gt-actual-frequency/ [Kernel PMU event] xe_0000_00_02.0/gt-requested-frequency/ [Kernel PMU event] Standard perf commands can be used to monitor GT frequency: $ perf stat -e xe_0000_00_02.0/gt-requested-frequency,gt=0/ -I1000 1.001229762 1483 Mhz xe_0000_00_02.0/gt-requested-frequency,gt=0/ 2.006175406 1483 Mhz xe_0000_00_02.0/gt-requested-frequency,gt=0/ v2: Use locks while storing/reading samples, keep track of multiple clients (Lucas) and other general cleanup. v3: Review comments (Lucas) and use event counts instead of mask for active events. v4: Add freq events to event_param_valid method (Riana) v5: Use instantaneous values instead of aggregating (Lucas) v6: Obtain fwake at init for freq events as well and use non fwake variant method for reading requested freq to avoid lockdep issues (Lucas) v7: Review comments (Rodrigo, Lucas) Cc: Riana Tauro <riana.tauro@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250331204827.2535393-1-vinay.belgaumkar@intel.com
2025-03-31drm/xe: Make PPHWSP size explicit in xe_gt_lrc_size()Gustavo Sousa1-7/+10
The context of each engine starts with a 4k memory space for the "Per-process HW status page" (PPHWSP). In xe_gt_lrc_size(), we have been implicitly accounting for that page in the switch statement on the engine class. Since the PPHWSP is common to all engines, let's extract that into it's own assignment. That makes the context structure more explicit in the code and aligns better with the descriptions in Bspec. Another advantage of keeping it separate is that now the sizes used in the switch statement match the sizes we calculate engine-specific context images, which have their own Bspec pages. Bspec: 67296, 60159, 45554 Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250328-explicit-pphwsp-size-in-xe_gt_lrc_size-v1-1-ceb9ce7c8bc1@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-31drm/xe: Invalidate L3 read-only cachelines for geometry streams tooKenneth Graunke2-4/+10
Historically, the Vertex Fetcher unit has not been an L3 client. That meant that, when a buffer containing vertex data was written to, it was necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any VF L2 cachelines associated with that buffer, so the new value would be properly read from memory. Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER have included an "L3 Bypass Enable" bit which userspace drivers can set to request that the vertex fetcher unit snoop L3. However, unlike most true L3 clients, the "VF Cache Invalidate" bit continues to only invalidate the VF L2 cache - and not any associated L3 lines. To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation Bit", which according to the docs, "controls the invalidation of the Geometry streams cached in L3 cache at the top of the pipe." In other words, the vertex and index buffer data that gets cached in L3 when "L3 Bypass Disable" is set. Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and whenever it issues a VF Cache Invalidate, it also issues a L3 Read Only Cache Invalidate so that both L2 and L3 vertex data is invalidated. xe is issuing VF cache invalidates too (which handles cases like CPU writes to a buffer between GPU batches). Because userspace may enable L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well. Fixes significant flickering in Firefox on Meteorlake, which was writing to vertex buffers via the CPU between batches; the missing L3 Read Only invalidates were causing the vertex fetcher to read stale data from L3. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4460 Fixes: 6ef3bb60557d ("drm/xe: enable lite restore") Cc: stable@vger.kernel.org # v6.13+ Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250330165923.56410-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-31drm/xe: Restore EIO errno return when GuC PC start failsRodrigo Vivi1-0/+1
Commit b4b05e53b550 ("drm/xe/guc_pc: Retry and wait longer for GuC PC start"), leads to the following Smatch static checker warning: drivers/gpu/drm/xe/xe_guc_pc.c:1073 xe_guc_pc_start() warn: missing error code here? '_dev_err()' failed. 'ret' = '0' Fixes: b4b05e53b550 ("drm/xe/guc_pc: Retry and wait longer for GuC PC start") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/intel-xe/1454a5f1-ee18-4df1-a6b2-a4a3dddcd1cb@stanley.mountain/ Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250328181752.26677-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-31drm/xe: Don't print error about hwconfig when using execlistsStuart Summers1-1/+2
This error message is only applicable for platforms using GuC submission - to warn the user if the GuC they are using or the platform they are running doesn't have this information to provide to userspace about the platform. When forcing execlist submission, which is something only used for debug, the user is running at their own risk and should understand the limitations of running without GuC. v2 (John/Lucas): Don't print an info message with execlists Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://lore.kernel.org/r/20250328154236.9216-1-stuart.summers@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-28drm/xe/guc: Re-word message about ADS size changesJohn Harrison1-2/+2
The error capture list in the ADS is initially allocated using a placeholder size. When the actual size is determinied later on, there is a debug print about the new size. However, the wording is such that some people see it as an unexpected thing and therefore a potential problem. So re-word it to be a little less concerning. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250325203211.3907890-1-John.C.Harrison@Intel.com
2025-03-28drm/xe: avoid plain 64-bit divisionArnd Bergmann1-2/+2
Building the xe driver for i386 results in a link time warning: x86_64-linux-ld: drivers/gpu/drm/xe/xe_migrate.o: in function `xe_migrate_vram': xe_migrate.c:(.text+0x1e15): undefined reference to `__udivdi3' Avoid this by using DIV_U64_ROUND_UP() instead of DIV_ROUND_UP(). The driver is unlikely to be used on 32=bit hardware, so the extra cost here is not too important. Fixes: 9c44fd5f6e8a ("drm/xe: Add migrate layer functions for SVM support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20250324210612.2927194-1-arnd@kernel.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-28drm/xe/guc: Reformat dead CT reason string to be devcoredump compatibleJohn Harrison1-1/+1
The dump on a dead CT tries to emulate the devcoredump formatting (it would use devcoredump code directly but that requires more re-work to happen - work in progress). So update the print of the dead CT reason code to match the format of the 'reason' string that was added to the actual devcoredump a little while ago. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250325203111.3907426-1-John.C.Harrison@Intel.com
2025-03-28drm/xe/d3cold: Set power state to D3Cold during s2idle/s3Badal Nilawar1-0/+1
According to pci core guidelines, pci_save_config is recommended when the driver explicitly needs to set the pci power state. As of now xe kmd is only doing pci_save_config while entering to s2idle/s3 state, which makes pci core think that device driver has already applied required pci power state. This leads to GPU remain in D0 state. To fix the issue setting the pci power state to D3Cold. Fixes:dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250327161914.432552-1-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-28drm/xe/hw_engine: define sysfs_ops on all directoriesTejas Upadhyay1-56/+52
Sysfs_ops needs to be defined on all directories which can have attr files with set/get method. Add sysfs_ops to even those directories which is currently empty but would have attr files with set/get method in future. Leave .default with default sysfs_ops as it will never have setter method. V2(Himal/Rodrigo): - use single sysfs_ops for all dir and attr with set/get - add default ops as ./default does not need runtime pm at all Fixes: 3f0e14651ab0 ("drm/xe: Runtime PM wake on every sysfs call") Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250327122647.886637-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-03-27drm/xe/xe3lpg: Apply Wa_14022293748, Wa_22019794406Julia Filipchuk1-0/+2
Extend Wa_14022293748, Wa_22019794406 to Xe3_LPG Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250325224310.1455499-1-julia.filipchuk@intel.com
2025-03-27drm/xe: Use local fence in error path of xe_migrate_clearMatthew Brost1-1/+1
The intent of the error path in xe_migrate_clear is to wait on locally generated fence and then return. The code is waiting on m->fence which could be the local fence but this is only stable under the job mutex leading to a possible UAF. Fix code to wait on local fence. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250311182915.3606291-1-matthew.brost@intel.com
2025-03-27drm/xe: Ensure fixed_slice_mode gets set after ccs_mode changeNiranjana Vishwanathapura1-6/+6
The RCU_MODE_FIXED_SLICE_CCS_MODE setting is not getting invoked in the gt reset path after the ccs_mode setting by the user. Add it to engine register update list (in hw_engine_setup_default_state()) which ensures it gets set in the gt reset and engine reset paths. v2: Add register update to engine list to ensure it gets updated after engine reset also. Fixes: 0d97ecce16bd ("drm/xe: Enable Fixed CCS mode setting") Cc: stable@vger.kernel.org Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250327185604.18230-1-niranjana.vishwanathapura@intel.com
2025-03-27drm/xe: Fix an out-of-bounds shift when invalidating TLBThomas Hellström1-2/+10
When the size of the range invalidated is larger than rounddown_pow_of_two(ULONG_MAX), The function macro roundup_pow_of_two(length) will hit an out-of-bounds shift [1]. Use a full TLB invalidation for such cases. v2: - Use a define for the range size limit over which we use a full TLB invalidation. (Lucas) - Use a better calculation of the limit. [1]: [ 39.202421] ------------[ cut here ]------------ [ 39.202657] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13 [ 39.202673] shift exponent 64 is too large for 64-bit type 'long unsigned int' [ 39.202688] CPU: 8 UID: 0 PID: 3129 Comm: xe_exec_system_ Tainted: G U 6.14.0+ #10 [ 39.202690] Tainted: [U]=USER [ 39.202690] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [ 39.202691] Call Trace: [ 39.202692] <TASK> [ 39.202695] dump_stack_lvl+0x6e/0xa0 [ 39.202699] ubsan_epilogue+0x5/0x30 [ 39.202701] __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6 [ 39.202705] xe_gt_tlb_invalidation_range.cold+0x1d/0x3a [xe] [ 39.202800] ? find_held_lock+0x2b/0x80 [ 39.202803] ? mark_held_locks+0x40/0x70 [ 39.202806] xe_svm_invalidate+0x459/0x700 [xe] [ 39.202897] drm_gpusvm_notifier_invalidate+0x4d/0x70 [drm_gpusvm] [ 39.202900] __mmu_notifier_release+0x1f5/0x270 [ 39.202905] exit_mmap+0x40e/0x450 [ 39.202912] __mmput+0x45/0x110 [ 39.202914] exit_mm+0xc5/0x130 [ 39.202916] do_exit+0x21c/0x500 [ 39.202918] ? lockdep_hardirqs_on_prepare+0xdb/0x190 [ 39.202920] do_group_exit+0x36/0xa0 [ 39.202922] get_signal+0x8f8/0x900 [ 39.202926] arch_do_signal_or_restart+0x35/0x100 [ 39.202930] syscall_exit_to_user_mode+0x1fc/0x290 [ 39.202932] do_syscall_64+0xa1/0x180 [ 39.202934] ? do_user_addr_fault+0x59f/0x8a0 [ 39.202937] ? lock_release+0xd2/0x2a0 [ 39.202939] ? do_user_addr_fault+0x5a9/0x8a0 [ 39.202942] ? trace_hardirqs_off+0x4b/0xc0 [ 39.202944] ? clear_bhb_loop+0x25/0x80 [ 39.202946] ? clear_bhb_loop+0x25/0x80 [ 39.202947] ? clear_bhb_loop+0x25/0x80 [ 39.202950] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 39.202952] RIP: 0033:0x7fa945e543e1 [ 39.202961] Code: Unable to access opcode bytes at 0x7fa945e543b7. [ 39.202962] RSP: 002b:00007ffca8fb4170 EFLAGS: 00000293 [ 39.202963] RAX: 000000000000003d RBX: 0000000000000000 RCX: 00007fa945e543e3 [ 39.202964] RDX: 0000000000000000 RSI: 00007ffca8fb41ac RDI: 00000000ffffffff [ 39.202964] RBP: 00007ffca8fb4190 R08: 0000000000000000 R09: 00007fa945f600a0 [ 39.202965] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 [ 39.202966] R13: 00007fa9460dd310 R14: 00007ffca8fb41ac R15: 0000000000000000 [ 39.202970] </TASK> [ 39.202970] ---[ end trace ]--- Fixes: 332dd0116c82 ("drm/xe: Add range based TLB invalidations") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> #v1 Link: https://lore.kernel.org/r/20250326151634.36916-1-thomas.hellstrom@linux.intel.com
2025-03-27drm/xe/migrate: Switch from drm to dev managed actionsAradhya Bhatia1-3/+3
Change the scope of the migrate subsystem to be dev managed instead of drm managed. The parent pci struct &device, that the xe struct &drm_device is a part of, gets removed when a hot unplug is triggered, which causes the underlying iommu group to get destroyed as well. The migrate subsystem, which handles the lifetime of the page-table tree (pt) BO, doesn't get a chance to keep the BO back during the hot unplug, as all the references to DRM haven't been put back. When all the references to DRM are indeed put back later, the migrate subsystem tries to put back the pt BO. Since the underlying iommu group has been already destroyed, a kernel NULL ptr dereference takes place while attempting to keep back the pt BO. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3914 Suggested-by: Thomas Hellstrom <thomas.hellstrom@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250326151929.1495972-1-aradhya.bhatia@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-03-27drm/xe: Make the PT code handle placement per PTE rather than per vma / rangeThomas Hellström2-62/+70
With SVM, ranges forwarded to the PT code for binding can, mostly due to races when migrating, point to both VRAM and system / foreign device memory. Make the PT code able to handle that by checking, for each PTE set up, whether it points to local VRAM or to system memory. v2: - Fix system memory GPU atomic access. v3: - Avoid the UAPI change. It needs more thought. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250326080551.40201-6-thomas.hellstrom@linux.intel.com
2025-03-27drm/xe/migrate: Allow xe_migrate_vram() also on non-pagefault capable devicesThomas Hellström1-2/+3
The drm_pagemap functionality does not depend on the device having recoverable pagefaults available. So allow xe_migrate_vram() also for such devices. Even if this will have little use in practice, it's beneficial for testin multi-device SVM, since a memory provider could be a non-pagefault capable gpu. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250326080551.40201-5-thomas.hellstrom@linux.intel.com
2025-03-27drm/xe/bo: Add a bo remove callbackThomas Hellström6-17/+140
On device unbind, migrate exported bos, including pagemap bos to system. This allows importers to take proper action without disruption. In particular, SVM clients on remote devices may continue as if nothing happened, and can chose a different placement. The evict_flags() placement is chosen in such a way that bos that aren't exported are purged. For pinned bos, we unmap DMA, but their pages are not freed yet since we can't be 100% sure they are not accessed. All pinned external bos (not just the VRAM ones) are put on the pinned.external list with this patch. But this only affects the xe_bo_pci_dev_remove_pinned() function since !VRAM bos are ignored by the suspend / resume functionality. As a follow-up we could look at removing the suspend / resume iteration over pinned external bos since we currently don't allow pinning external bos in VRAM, and other external bos don't need any special treatment at suspend / resume. v2: - Address review comments. (Matthew Auld). v3: - Don't introduce an external_evicted list (Matthew Auld) - Add a discussion around suspend / resume behaviour to the commit message. - Formatting fixes. v4: - Move dma-unmaps of pinned kernel bos to a dev managed callback to give subsystems using these bos a chance to clean them up. (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250326080551.40201-4-thomas.hellstrom@linux.intel.com
2025-03-27drm/xe/svm: Fix a potential bo UAFThomas Hellström1-2/+5
If drm_gpusvm_migrate_to_devmem() succeeds, if a cpu access happens to the range the bo may be freed before xe_bo_unlock(), causing a UAF. Since the reference is transferred, use xe_svm_devmem_release() to release the reference on drm_gpusvm_migrate_to_devmem() failure, and hold a local reference to protect the UAF. Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250326080551.40201-3-thomas.hellstrom@linux.intel.com
2025-03-27drm/xe: Introduce CONFIG_DRM_XE_GPUSVMThomas Hellström9-27/+98
Don't rely on CONFIG_DRM_GPUSVM because other drivers may enable it causing us to compile in SVM support unintentionally. Also take the opportunity to leave more code out of compilation if !CONFIG_DRM_XE_GPUSVM and !CONFIG_DRM_XE_DEVMEM_MIRROR v3: - Fixes for compilation errors on 32-bit. This changes the Kconfig logic a bit. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250326080551.40201-2-thomas.hellstrom@linux.intel.com
2025-03-26drm/xe/bmg: Add one additional PCI IDMatt Roper1-0/+1
One additional BMG PCI ID has been added to the spec; make sure our driver recognizes devices with this ID properly. Bspec: 68090 Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250325224709.4073080-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-03-26drm/xe: Add fault injection for xe_oa_alloc_regsNakshtra Goyal1-0/+1
Add fault injection for xe_oa_alloc_regs to allow it to fail while executing xe_oa_add_config_ioctl(). This need to be added as it cannot be reached by injecting error through IOCTL arguments. Signed-off-by: Nakshtra Goyal <nakshtra.goyal@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250227102339.2859726-1-nakshtra.goyal@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-03-25drm/xe/pf: Enable per-function engine activity statsRiana Tauro1-0/+20
Enable per-function engine activity stats when VF's are enabled and disable when VF's are disabled v2: fix commit message remove reset stats from pf config (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250311071759.2117211-4-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-25drm/xe/xe_pmu: Add PMU support for per-function engine activity statsRiana Tauro1-8/+31
Add PMU support for per-function engine activity stats. per-function engine activity is enabled when VF's are enabled. If 2 VF's are enabled, then the applicable function values are 0 - PF engine activity 1 - VF1 engine activity 2 - VF2 engine activity This can be read from perf tool as shown below ./perf stat -e xe_<bdf>/engine-active-ticks,gt=0,engine_class=0, engine_instance=0,function=1/ -I 1000 v2: fix documentation (Umesh) remove global for functions (Lucas, Michal) v3: fix commit message move function_id checks to same place (Michal) v4: fix comment (Umesh) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250311071759.2117211-3-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-25drm/xe: Add support for per-function engine activityRiana Tauro5-33/+192
Add support for function level engine activity stats. Engine activity stats are enabled when VF's are enabled v2: remove unnecessary initialization move offset to improve code readability (Umesh) remove global for function engine activity (Lucas) v3: fix commit message (Michal) v4: remove enable function parameter fix kernel-doc (Umesh) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250311071759.2117211-2-riana.tauro@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-25drm/xe: Remove extra spaces in xe_vm.cMaarten Lankhorst1-3/+3
There are extra spaces in xe_vm_bind_ioctl_validate_bo(), remove those. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250320211519.632432-1-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-03-24drm/xe/hw_engine_class_sysfs: Allow to inject error during probeFrancois Dugast1-0/+1
Allow fault injection in a function used during initialization by xe_hw_engine_class_sysfs_init() so that its error handling can be tested. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250314105050.636983-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>