Age | Commit message | Author | Files | Lines
2025-06-05 | drm/xe: Create LRC BO without VM | Niranjana Vishwanathapura | 2 | -28/+4
Specifying VM during lrc->bo creation requires VM's reference to be held for the lifetime of lrc->bo as it will use VM's dma reservation object. Using VM's dma reservation object for lrc->bo doesn't provide any advantage. Hence do not pass VM while creating lrc->bo. v2: Use xe_bo_unpin_map_no_vm (Matthew Brost) Fixes: 264eecdba211 ("drm/xe: Decouple xe_exec_queue and xe_lrc") Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250529052031.2429120-2-niranjana.vishwanathapura@intel.com (cherry picked from commit fbeaad071a98fef87deccee81d564de1c8e8e16d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/guc_submit: add back fix | Matthew Auld | 1 | -0/+11
Daniele noticed that the fix in commit 2d2be279f1ca ("drm/xe: fix UAF around queue destruction") looks to have been unintentionally removed as part of handling a conflict in some past merge commit. Add it back. Fixes: ac44ff7cec33 ("Merge tag 'drm-xe-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes") Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250603174213.1543579-2-matthew.auld@intel.com (cherry picked from commit 9d9fca62dc49d96f97045b6d8e7402a95f8cf92a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready | Daniele Ceraolo Spurio | 2 | -2/+11
The expected flow of operations when using PXP is to query the PXP status and wait for it to transition to "ready" before attempting to create an exec_queue. This flow is followed by the Mesa driver, but there is no guarantee that an incorrectly coded (or malicious) app will not attempt to create the queue first without querying the status. Therefore, we need to clarify what the expected behavior of the queue creation ioctl is in this scenario. Currently, the ioctl always fails with an -EBUSY code no matter the error, but for consistency it is better to distinguish between "failed to init" (-EIO) and "not ready" (-EBUSY), the same way the query ioctl does. Note that, while this is a change in the return code of an ioctl, the behavior of the ioctl in this particular corner case was not clearly spec'd, so no one should have been relying on it (and we know that Mesa, which is the only known userspace for this, didn't). v2: Minor rework of the doc (Rodrigo) Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-7-daniele.ceraolospurio@intel.com (cherry picked from commit 21784ca96025b62d95b670b7639ad70ddafa69b8) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
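The return-code split described above can be sketched as follows. This is only an illustration: the enum and function names are hypothetical and are not taken from the xe driver; only the mapping of "not ready" to -EBUSY and "failed to init" to -EIO comes from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical status values; the real driver's names may differ. */
enum pxp_status {
	PXP_STATUS_READY,
	PXP_STATUS_NOT_READY,   /* initialisation still in progress */
	PXP_STATUS_INIT_FAILED, /* initialisation failed for good */
};

/* Returns 0 if queue creation may proceed, otherwise a negative errno. */
int pxp_queue_create_check(enum pxp_status status)
{
	switch (status) {
	case PXP_STATUS_READY:
		return 0;
	case PXP_STATUS_NOT_READY:
		return -EBUSY; /* the caller may retry later */
	case PXP_STATUS_INIT_FAILED:
	default:
		return -EIO;   /* retrying will not help */
	}
}
```

A caller that polls the status first never sees either error; a caller that skips the query can now tell a transient condition from a permanent one.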
2025-06-05 | drm/xe/pxp: Use the correct define in the set_property_funcs array | Daniele Ceraolo Spurio | 1 | -1/+1
The define of the extension type was accidentally used instead of the one of the property itself. They're both zero, so no functional issue, but we should use the correct define for code correctness. Fixes: 41a97c4a1294 ("drm/xe/pxp/uapi: Add API to mark a BO as using PXP") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-6-daniele.ceraolospurio@intel.com (cherry picked from commit 1d891ee820fd0fbb4101eacb0d922b5050a24933) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/sched: stop re-submitting signalled jobs | Matthew Auld | 1 | -1/+9
Customer is reporting a really subtle issue where we get random DMAR faults, hangs and other nasties for kernel migration jobs when stressing stuff like s2idle/s3/s4. The explosions seem to happen somewhere after resuming the system with splats looking something like:

    PM: suspend exit
    rfkill: input handler disabled
    xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0
    xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1]
    xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out

The likely cause appears to be a race between suspend cancelling the worker that processes the free_job()'s, such that we still have pending jobs to be freed after the cancel. Following from this, on resume the pending_list will now contain at least one already complete job, but it looks like we call drm_sched_resubmit_jobs(), which will then call run_job() on everything still on the pending_list. But if the job was already complete, then all the resources tied to the job, like the bb itself, any memory that is being accessed, the iommu mappings etc. might be long gone since those are usually tied to the fence signalling.

This scenario can be seen in ftrace when running a slightly modified xe_pm IGT (kernel was only modified to inject artificial latency into free_job to make the race easier to hit):

    xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
    xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
    xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4
    xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3
    xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3
    xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3
    xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
    xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
    .....
    xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13

So the job_run() is clearly triggered twice for the same job, even though the first must have already signalled to completion during suspend. We can also see a CAT error after the re-submit. To prevent this only resubmit jobs on the pending_list that have not yet signalled.

v2:
  - Make sure to re-arm the fence callbacks with sched_start().
v3 (Matt B):
  - Stop using drm_sched_resubmit_jobs(), which appears to be deprecated and just open-code a simple loop such that we skip calling run_job() on anything already signalled.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: William Tseng <william.tseng@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20250528113328.289392-2-matthew.auld@intel.com (cherry picked from commit 38fafa9f392f3110d2de431432d43f4eef99cd1b) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
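The "open-coded loop that skips signalled jobs" idea from this commit can be sketched in plain C. This is a user-space model, not the driver code: struct job, run_job() and resubmit_pending() are hypothetical stand-ins for the scheduler job, the run_job() callback and the resume path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a scheduler job. */
struct job {
	bool signalled;   /* fence already signalled, i.e. job completed */
	int run_count;    /* number of times run_job() was called */
	struct job *next; /* next entry on the pending list */
};

static void run_job(struct job *j)
{
	j->run_count++;
}

/*
 * Walk the pending list and resubmit only jobs that have not completed.
 * Returns the number of jobs that were resubmitted.
 */
size_t resubmit_pending(struct job *pending_list)
{
	size_t resubmitted = 0;

	for (struct job *j = pending_list; j; j = j->next) {
		if (j->signalled)
			continue; /* its resources may already be gone */
		run_job(j);
		resubmitted++;
	}
	return resubmitted;
}
```

With a list holding one completed and one incomplete job, only the incomplete one is run again, which is the behavior the commit asks for.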
2025-06-05 | drm/xe: Rework eviction rejection of bound external bos | Thomas Hellström | 3 | -18/+105
For preempt_fence mode VM's we're rejecting eviction of shared bos during VM_BIND. However, since we do this in the move() callback, we're getting an eviction failure warning from TTM. The TTM callback intended for these things is eviction_valuable(). However, the latter doesn't pass in the struct ttm_operation_ctx needed to determine whether the caller needs this. Instead, attach the needed information to the vm under the vm->resv, until we've been able to update TTM to provide the needed information. And add sufficient lockdep checks to prevent misuse and races. v2: - Fix a copy-paste error in xe_vm_clear_validating() v3: - Fix kerneldoc errors. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 0af944f0e308 ("drm/xe: Reject BO eviction if BO is bound to current VM") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 9d5558649f68e2e84a87a909631b30e15ca0f8ec) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency | Arnd Bergmann | 1 | -1/+2
The XE driver can be built with or without VSEC support, but fails to link as built-in if vsec is in a loadable module: x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init': (.text+0x1e83e16): undefined reference to `intel_vsec_register' The normal fix for this is to add a 'depends on INTEL_VSEC || !INTEL_VSEC', forcing XE to be a loadable module as well, but that causes a circular dependency: symbol DRM_XE depends on INTEL_VSEC symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES symbol X86_PLATFORM_DEVICES is selected by DRM_XE The problem here is selecting a symbol from another subsystem, so change that as well and rephrase the 'select' into the corresponding dependency. Since X86_PLATFORM_DEVICES is 'default y', there is no change to defconfig builds here. Fixes: 0c45e76fcc62 ("drm/xe/vsec: Support BMG devices") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit e4931f8be347ec5f19df4d6d33aea37145378c42) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe: drop redundant conversion to bool | Raag Jadav | 1 | -1/+1
The result of integer comparison already evaluates to bool. No need for explicit conversion. No functional impact. Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/ Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 61761a6b57f2818983466d24aab60baab471ba21) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/hwmon: Move card reactive critical power under channel card | Karthik Poosa | 2 | -13/+13
Move power2/curr2_crit to channel 1, i.e. power1/curr1_crit, as this represents the entire card critical power/current. v2: Update the date of curr1_crit also in hwmon documentation. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 25e963a09e059ffdb15c09cc79cfded855b43668) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/hwmon: Add support to manage power limits though mailbox | Karthik Poosa | 8 | -106/+318
Add support to manage power limits using pcode mailbox commands for supported platforms. v2: - Address review comments. (Badal) - Use mailbox commands instead of registers to manage power limits for BMG. - Clamp the maximum power limit to GPU firmware default value. v3: - Clamp power limit in write also for platforms with mailbox support. v4: - Remove unnecessary debug prints. (Badal) v5: - Update description of variable pl1_on_boot to fix kernel-doc error. v6: - Improve commit message, refer to BIOS as GPU firmware. - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW. - Rectify drm_warn to drm_info. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 7596d839f6228757fe17a810da2d1c5f3305078c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/vm: move xe_svm_init() earlier | Matthew Auld | 1 | -7/+12
In xe_vm_close_and_put() we need to be able to call xe_svm_fini(), however during vm creation we can call this on the error path, before having actually initialised the svm state, leading to various splats followed by a fatal NPD. Fixes: 6fd979c2f331 ("drm/xe: Add SVM init / close / fini to faulting VMs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com (cherry picked from commit 4f296d77cf49fcb5f90b4674123ad7f3a0676165) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-05 | drm/xe/vm: move rebind_work init earlier | Matthew Auld | 1 | -4/+4
In xe_vm_close_and_put() we need to be able to call flush_work(rebind_work), however during vm creation we can call this on the error path, before having actually set up the worker, leading to a splat from flush_work(). It looks like we can simply move the worker init step earlier to fix this. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com (cherry picked from commit 96af397aa1a2d1032a6e28ff3f4bc0ab4be40e1d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-04 | MAINTAINERS: .mailmap: update Rob Clark's email address | Rob Clark | 2 | -3/+5
Remap historical email addresses to @oss.qualcomm.com. Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Patchwork: https://patchwork.freedesktop.org/patch/656974/
2025-06-04 | mailmap: Update entry for Akhil P Oommen | Akhil P Oommen | 1 | -1/+2
A new policy within Qualcomm requires me to use a new email address for all future contributions to the Linux kernel. Update the mailmap to map my old email addresses to the new one, i.e. akhilpo@oss.qualcomm.com. Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/lkml/20250603121508.296678-1-quic_akhilpo@quicinc.com Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-06-04 | MAINTAINERS: update my email address | Abhinav Kumar | 1 | -2/+2
My current email address will stop working soon. Use linux.dev email instead. Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Acked-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/655555/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-06-04 | MAINTAINERS: drop myself as maintainer | Abhinav Kumar | 1 | -1/+2
I will no longer regularly work on this platform. Hence will step down from maintainer duties. Also, add Jessica as a reviewer to the MSM DRM subsystem to help out with the reviews. Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Acked-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/655558/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-06-04 | drm/i915/display: Fix u32 overflow in SNPS PHY HDMI PLL setup | Dibin Moolakadan Subrahmanian | 1 | -8/+8
When configuring the HDMI PLL, calculations use DIV_ROUND_UP_ULL and DIV_ROUND_DOWN_ULL macros, which internally rely on do_div. However, do_div expects a 32-bit (u32) divisor, and at higher data rates, the divisor can exceed this limit. This leads to incorrect division results and ultimately misconfigured PLL values. This fix replaces do_div calls with div64_base64 calls where the divisor can exceed the u32 limit. Fixes: 5947642004bf ("drm/i915/display: Add support for SNPS PHY HDMI PLL algorithm for DG2") Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Dibin Moolakadan Subrahmanian <dibin.moolakadan.subrahmanian@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250528064557.4172149-1-dibin.moolakadan.subrahmanian@intel.com (cherry picked from commit ce924116e43ffbfa544d82976c4b9d11bcde9334) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
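The failure mode is easy to reproduce in user space. The sketch below does not use the kernel helpers; div_truncating() and div_full() are hypothetical names, with ordinary C division standing in for a helper that only honors the low 32 bits of the divisor and for a true 64-by-64 division.

```c
#include <assert.h>
#include <stdint.h>

/* Models a division helper whose divisor is only 32 bits wide. */
uint64_t div_truncating(uint64_t n, uint64_t base)
{
	uint32_t base32 = (uint32_t)base; /* upper 32 bits are silently lost */

	assert(base32 != 0);
	return n / base32;
}

/* Full 64-bit by 64-bit division. */
uint64_t div_full(uint64_t n, uint64_t base)
{
	assert(base != 0);
	return n / base;
}
```

For n = 50000000000 and a divisor of 5000000000 (above 2^32), the divisor truncates to 705032704, so div_truncating() yields 70 while div_full() yields the expected 10. For divisors that fit in 32 bits the two agree.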
2025-06-02 | drm/i915/guc: Handle race condition where wakeref count drops below 0 | Jesus Narvaez | 1 | -3/+14
There is a rare race condition when preparing for a reset where guc_lrc_desc_unpin() could be in the process of deregistering a context while a different thread is scrubbing outstanding contexts and it alters the context state and does a wakeref put. Then, if there is a failure with deregister_context(), a second wakeref put could occur. As a result the wakeref count could drop below 0 and fail an INTEL_WAKEREF_BUG_ON() check. Therefore if there is a failure with deregister_context(), undo the context state changes and do a wakeref put only if the context was set to be destroyed earlier. v2: Expand comment to better explain change. (Daniele) v3: Removed addition to the original comment. (Daniele) Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss") Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Mousumi Jana <mousumi.jana@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250528230551.1855177-1-jesus.narvaez@intel.com (cherry picked from commit f36a75aba1c3176d177964bca76f86a075d2943a) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-06-02 | drm/i915/psr: Fix using wrong mask in REG_FIELD_PREP | Jouni Högander | 1 | -2/+2
Wrong mask is used in PORT_ALPM_LFPS_CTL_FIRST_LFPS_HALF_CYCLE_DURATION and PORT_ALPM_LFPS_CTL_LAST_LFPS_HALF_CYCLE_DURATION. Fixes: 295099580f04 ("drm/i915/psr: Add missing ALPM AUX-Less register definitions") Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250526120512.1702815-12-jouni.hogander@intel.com (cherry picked from commit 8097128a40ff378761034ec72cdbf6f46e466dc0) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-06-02 | drm/i915/guc: Check if expecting reply before decrementing outstanding_submission_g2h | Jesus Narvaez | 1 | -1/+1
When sending a H2G message where a reply is expected in guc_submission_send_busy_loop(), outstanding_submission_g2h is incremented before the send. However, if there is an error sending the message, outstanding_submission_g2h is decremented without checking if a reply is expected. Therefore, check if reply is expected when there is a failure before decrementing outstanding_submission_g2h. Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss") Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Mousumi Jana <mousumi.jana@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250514225224.4142684-1-jesus.narvaez@intel.com (cherry picked from commit a6a26786f22a4ab0227bcf610510c4c9c2df0808) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
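A minimal model of the accounting rule, assuming a plain integer counter and a caller-supplied send function: struct guc_state, send_busy_loop() and the fake_send_*() helpers are hypothetical and only illustrate that the error path must undo exactly what the success path did.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state: count of replies we are still waiting for. */
struct guc_state {
	int outstanding_g2h;
};

/* Fake transports for illustration: one succeeds, one fails. */
int fake_send_ok(void)   { return 0; }
int fake_send_fail(void) { return -5; }

/*
 * Increment the counter only when a reply is expected, and on a send
 * failure decrement it only under the same condition.
 */
int send_busy_loop(struct guc_state *s, bool expect_reply, int (*send_fn)(void))
{
	int err;

	if (expect_reply)
		s->outstanding_g2h++;

	err = send_fn();
	if (err && expect_reply)
		s->outstanding_g2h--; /* undo only our own increment */

	return err;
}
```

Without the `expect_reply` check on the error path, a failed send that expected no reply would push the counter below its true value.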
2025-05-29 | drm/amdkfd: Map wptr BO to GART unconditionally | Lang Yu | 2 | -13/+13
Simulation C models don't run CP FW, so adev->mes.sched_version is not populated correctly. This causes a NULL dereference in amdgpu_amdkfd_free_gtt_mem(dev->adev, (void **)&pqn->q->wptr_bo_gart) and a warning on an unpinned BO in amdgpu_bo_gpu_offset(q->properties.wptr_bo). Compared with adding version checks here and there, always mapping the wptr BO to GART simplifies things. v2: Add NULL check in amdgpu_amdkfd_free_gtt_mem. (Philip) Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/mes: remove some unused functions | Alex Deucher | 2 | -67/+0
Nothing uses them so remove them. Leftover from MES bring up. Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/mes: add missing locking in helper functions | Alex Deucher | 1 | -0/+16
We need to take the MES lock. Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-05-29 | drm/amd: Export DMCUB version to sysfs | Mario Limonciello | 1 | -3/+5
For supported ASICs the DMCU version is exported, but for ASICs that support DMCUB no information is exported to sysfs. Add an attribute for DMCUB. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Link: https://lore.kernel.org/r/20250527155942.476354-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amd/pm: Enable static metrics table support | Asad Kamal | 1 | -0/+5
Enable static metrics support to fetch board voltage and pldm version for smu_v13_0_14 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amd/pm: Enable static metrics table support | Asad Kamal | 1 | -2/+4
Enable static metrics support to fetch board voltage and pldm version for other smu_v13_0_6 program Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amd/display: Constify struct timing_generator_funcs | Christophe JAILLET | 9 | -9/+9
'struct timing_generator_funcs' are not modified in these drivers. Constifying these structures moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amd/display: Add null pointer check for get_first_active_display() | Wentao Liang | 1 | -0/+3
The function mod_hdcp_hdcp1_enable_encryption() calls the function get_first_active_display(), but does not check its return value. The return value is a null pointer if the display list is empty. This will lead to a null pointer dereference in mod_hdcp_hdcp2_enable_encryption(). Add a null pointer check for get_first_active_display() and return MOD_HDCP_STATUS_DISPLAY_NOT_FOUND if the function returns null. Fixes: 2deade5ede56 ("drm/amd/display: Remove hdcp display state with mst fix") Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # v5.8
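The shape of this fix can be shown with a small stand-alone model. The types and names below are hypothetical simplifications, not the amdgpu HDCP module's definitions; the only point carried over from the commit is "check the lookup result and return a not-found status instead of dereferencing null".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
enum hdcp_status { STATUS_SUCCESS, STATUS_DISPLAY_NOT_FOUND };

struct display {
	int index;
	bool active;
};

/* Returns the first active display, or NULL if there is none. */
struct display *first_active_display(struct display *list, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (list[i].active)
			return &list[i];
	return NULL;
}

enum hdcp_status enable_encryption(struct display *list, size_t n, int *index_out)
{
	struct display *d = first_active_display(list, n);

	if (!d) /* empty list or no active display: bail out early */
		return STATUS_DISPLAY_NOT_FOUND;

	*index_out = d->index;
	return STATUS_SUCCESS;
}
```

With an empty list the function reports STATUS_DISPLAY_NOT_FOUND and leaves `*index_out` untouched.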
2025-05-29 | drm/amdgpu: Get mca address for old eeprom records | ganglxie | 1 | -0/+9
After getting the MCA address for old EEPROM records with 'address==0', the record can be parsed correctly in non-nps1 mode; otherwise it will be dropped. Signed-off-by: ganglxie <ganglxie@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu: handle old RAS eeprom data in non-nps1 mode | ganglxie | 3 | -2/+39
Get MCA address from PA in nps1, then convert MCA address to PA in specific nps mode. Signed-off-by: ganglxie <ganglxie@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | amd/amdkfd: fix a kfd_process ref leak | Yifan Zhang | 1 | -0/+1
Fix a kfd_process ref leak. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu: Add userq fence support to SDMAv6.0 | Arunpravin Paneer Selvam | 3 | -16/+41
Add userq fence support to SDMAv6.0 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdkfd: Identical code for different branches | Sunday Clement | 1 | -6/+1
This patch removes the if/else statement in the cik_event_interrupt_wq function because it is redundant: both branches result in identical outcomes. This improves code readability. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amd/pm: Optimize get gpu metrics data function | Asad Kamal | 3 | -7/+9
Optimize get gpu metrics data function for smu_v13_0_12 to allocate metrics structure only once v2: Free and alloc moved to same function(Kevin) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu: amdgpu_vram_mgr_new(): Clamp lpfn to total vram | John Olender | 1 | -1/+1
The drm_mm allocator tolerated being passed end > mm->size, but the drm_buddy allocator does not. Restore the pre-buddy-allocator behavior of allowing such placements. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3448 Signed-off-by: John Olender <john.olender@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
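The clamp itself is a one-liner; the sketch below shows it in isolation. clamp_lpfn() is a hypothetical helper, not the function touched by the patch, and treating lpfn == 0 as "no upper limit" is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * lpfn: requested upper bound of the placement, in pages.
 * Assumption for this sketch: lpfn == 0 means "no upper limit".
 * total_pages: size of the managed VRAM range, in pages.
 */
uint64_t clamp_lpfn(uint64_t lpfn, uint64_t total_pages)
{
	if (lpfn == 0 || lpfn > total_pages)
		return total_pages; /* never hand the allocator an end past the range */
	return lpfn;
}
```

A request whose end lies beyond total VRAM is thus accepted and bounded, rather than passed through to an allocator that rejects it.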
2025-05-29 | drm/amdgpu/vcn5.0.1: read back register after written | David (Ming Qiang) Wu | 1 | -0/+15
The addition of register read-back in VCN v5.0.1 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn5: read back register after written | David (Ming Qiang) Wu | 1 | -0/+20
The addition of register read-back in VCN v5.0.0 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn4.0.5: read back register after written | David (Ming Qiang) Wu | 1 | -0/+10
The addition of register read-back in VCN v4.0.5 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn4.0.3: read back register after written | David (Ming Qiang) Wu | 1 | -0/+16
The addition of register read-back in VCN v4.0.3 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn4: read back register after written | David (Ming Qiang) Wu | 1 | -0/+20
The addition of register read-back in VCN v4.0.0 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn3: read back register after written | David (Ming Qiang) Wu | 1 | -0/+20
The addition of register read-back in VCN v3.0 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn2.5: read back register after written | David (Ming Qiang) Wu | 1 | -0/+19
The addition of register read-back in VCN v2.5 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn2: read back register after written | David (Ming Qiang) Wu | 1 | -0/+21
The addition of register read-back in VCN v2.0 is intended to prevent potential race conditions. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | Revert "drm/amd/display: pause the workload setting in dm" | Fangzhi Zuo | 1 | -10/+1
This reverts commit 50f29ead1f1ba48983b6c5e3813b15e497714f55. Reason for revert: cause corruption on Dell U3224KB DP2 display. Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | drm/amdgpu/vcn1: read back register after written | David (Ming Qiang) Wu | 1 | -0/+21
V3: drop changes where readbacks are already implemented. This patch set is to add readbacks only. V2: use common register UVD_STATUS for readback (standard PCI MMIO behavior, i.e. the readback posts all writes so that they hit the hardware); add readback in ..._stop() for more coverage. Similar to the changes made for VCN v4.0.5, where a readback posts the writes to avoid a race with the doorbell, the addition of register readback support in other VCN versions is intended to prevent potential race conditions, even though such issues have not been observed yet. This change ensures consistency across different VCN variants and helps avoid similar issues. The overhead introduced is negligible. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
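The write-then-read-back pattern looks roughly like this. The sketch uses plain volatile pointers in place of the driver's register accessors, and reg_write_and_post() is a hypothetical name; on real hardware the pointers would refer to mapped MMIO registers, with the status register playing the role the commit gives to UVD_STATUS.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Write a value to a register, then read a status register from the same
 * device. On a bus with posted writes, the read cannot complete until the
 * earlier write has reached the device, so the write is known to have landed
 * when this function returns. The value read is returned to the caller.
 */
uint32_t reg_write_and_post(volatile uint32_t *reg,
			    const volatile uint32_t *status_reg,
			    uint32_t value)
{
	assert(reg && status_reg);

	*reg = value;       /* may still be sitting in a write buffer */
	return *status_reg; /* read back: flushes the posted write */
}
```

In user space the two pointers can simply refer to ordinary variables, which is enough to exercise the call sequence but not the bus ordering itself.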
2025-05-29 | drm/amd/display: Reuse Subvp debug option for FAMS | Aurabindo Pillai | 2 | -3/+6
FAMS is the successor to SubVP starting with DCN4x. Reuse the same debug option to disable FAMS for debugging purposes. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29 | Revert "drm/amd/display: more liberal vmin/vmax update for freesync" | Aurabindo Pillai | 1 | -11/+5
This reverts commit cfb2d41831ee5647a4ae0ea7c24971a92d5dfa0d since it causes regressions on certain configs. Revert until the issue can be isolated and debugged. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4238 Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-05-28 | drm/amd/display: Add some missing register headers for DCN401 | Aurabindo Pillai | 2 | -0/+42
Add some HDCP related register headers for future use. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-28 | drm/amd/amdgpu: Add GPIO resources required for amdisp | Pratap Nirujogi | 4 | -2/+67
ISP is a child device to GFX, and its device specific information is not available in ACPI. Adding the 2 GPIO resources required for ISP_v4_1_1 in amdgpu_isp driver. - GPIO 0 to allow sensor driver to enable and disable sensor module. - GPIO 85 to allow ISP driver to enable and disable ISP RGB streaming mode. Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-28 | drm/xe: Add missing documentation of rpa_freq | Rodrigo Vivi | 1 | -0/+3
While at it, already adjust the rpe_freq frequency, to highlight that both are calculated by PCODE at runtime. Fixes: c6aac2fa77a3 ("drm/xe: Introduce the RPa information") Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250521165146.39616-4-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 39578fa40420fb11dbe4f42225a347e945d8fd0e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>