aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/drivers/gpu/drm/amd (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-08-29Merge tag 'amd-drm-fixes-6.17-2025-08-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixesDave Airlie5-17/+28
amd-drm-fixes-6.17-2025-08-28: amdgpu: - UserQ fixes - Revert CSA fix - SR-IOV fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250828173904.75850-1-alexander.deucher@amd.com
2025-08-29Merge tag 'drm-misc-fixes-2025-08-28' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixesDave Airlie1-2/+32
Several nouveau fixes to remove unused code, fix an error path and be less restrictive with the formats it accepts. A fix for amdgpu to pin vmapped dma-buf, and a revert for tegra for a regression in the dma-buf / GEM code. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250828-hypersonic-colorful-squirrel-64f04b@houat
2025-08-27drm/amdgpu/userq: fix error handling of invalid doorbellAlex Deucher1-0/+1
If the doorbell is invalid, be sure to set the r to an error state so the function returns an error. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7e2a5b0a9a165a7c51274aa01b18be29491b4345) Cc: stable@vger.kernel.org
2025-08-27drm/amdgpu: update firmware version checks for user queue supportJesse.Zhang1-3/+3
The minimum firmware versions required for user queue functionality have been increased to address an issue where the queue privilege state was lost during queue connect operations. The problem occurred because the privilege state was being restored to its initial value at the beginning of the function, overwriting the state that was properly set during the queue connect case. This commit updates the minimum version requirements: - ME firmware from 2390 to 2420 - PFP firmware from 2530 to 2580 - MEC firmware from 2600 to 2650 - MES firmware remains at 120 These updated firmware versions contain the necessary fixes to properly maintain queue privilege state throughout connect operations. Fixes: 61ca97e9590c ("drm/amdgpu: Add fw minimum version check for usermode queue") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5f976c9939f0d5916d2b8ef3156a6d1799781df1) Cc: stable@vger.kernel.org
2025-08-27drm/amd/amdgpu: disable hwmon power1_cap* for gfx 11.0.3 on vf modeYang Wang1-8/+10
the PPSMC_MSG_GetPptLimit msg is not valid for gfx 11.0.3 on vf mode, so skiped to create power1_cap* hwmon sysfs node. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e82a8d441038d8cb10b63047a9e705c42479d156) Cc: stable@vger.kernel.org
2025-08-27Revert "drm/amdgpu: fix incorrect vm flags to map bo"Alex Deucher1-2/+2
This reverts commit b08425fa77ad2f305fe57a33dceb456be03b653f. Revert this to align with 6.17 because the fixes tag was wrong on this commit. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit be33e8a239aac204d7e9e673c4220ef244eb1ba3)
2025-08-27drm/amdgpu/gfx12: set MQD as appriopriate for queue typesAlex Deucher1-2/+6
Set the MQD as appropriate for the kernel vs user queues. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7b9110f2897957efd9715b52fc01986509729db3) Cc: stable@vger.kernel.org
2025-08-27drm/amdgpu/gfx11: set MQD as appriopriate for queue typesAlex Deucher1-2/+6
Set the MQD as appropriate for the kernel vs user queues. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 063d6683208722b1875f888a45084e3d112701ac) Cc: stable@vger.kernel.org
2025-08-23Merge tag 'drm-misc-fixes-2025-08-21' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixesDave Airlie3-3/+4
A bunch of fixes for 6.17: - analogix_dp: devm_drm_bridge_alloc() error handling fix - gaudi: Memory deallocation fix - gpuvm: Documentation warning fix - hibmc: Various misc fixes - nouveau: Memory leak fixes, typos - panic: u64 division handling on 32 bits architecture fix - rockchip: Kconfig fix, register caching fix - rust: memory layout and safety fixes - tests: Endianness fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250821-economic-dandelion-rooster-c57fa9@houat
2025-08-22drm/amdgpu: Pin buffers while vmap'ing exported dma-buf objectsThomas Zimmermann1-2/+32
Current dma-buf vmap semantics require that the mapped buffer remains in place until the corresponding vunmap has completed. For GEM-SHMEM, this used to be guaranteed by a pin operation while creating an S/G table in import. GEM-SHMEN can now import dma-buf objects without creating the S/G table, so the pin is missing. Leads to page-fault errors, such as the one shown below. [ 102.101726] BUG: unable to handle page fault for address: ffffc90127000000 [...] [ 102.157102] RIP: 0010:udl_compress_hline16+0x219/0x940 [udl] [...] [ 102.243250] Call Trace: [ 102.245695] <TASK> [ 102.2477V95] ? validate_chain+0x24e/0x5e0 [ 102.251805] ? __lock_acquire+0x568/0xae0 [ 102.255807] udl_render_hline+0x165/0x341 [udl] [ 102.260338] ? __pfx_udl_render_hline+0x10/0x10 [udl] [ 102.265379] ? local_clock_noinstr+0xb/0x100 [ 102.269642] ? __lock_release.isra.0+0x16c/0x2e0 [ 102.274246] ? mark_held_locks+0x40/0x70 [ 102.278177] udl_primary_plane_helper_atomic_update+0x43e/0x680 [udl] [ 102.284606] ? __pfx_udl_primary_plane_helper_atomic_update+0x10/0x10 [udl] [ 102.291551] ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170 [ 102.297208] ? lockdep_hardirqs_on+0x88/0x130 [ 102.301554] ? _raw_spin_unlock_irq+0x24/0x50 [ 102.305901] ? wait_for_completion_timeout+0x2bb/0x3a0 [ 102.311028] ? drm_atomic_helper_calc_timestamping_constants+0x141/0x200 [ 102.317714] ? drm_atomic_helper_commit_planes+0x3b6/0x1030 [ 102.323279] drm_atomic_helper_commit_planes+0x3b6/0x1030 [ 102.328664] drm_atomic_helper_commit_tail+0x41/0xb0 [ 102.333622] commit_tail+0x204/0x330 [...] [ 102.529946] ---[ end trace 0000000000000000 ]--- [ 102.651980] RIP: 0010:udl_compress_hline16+0x219/0x940 [udl] In this stack strace, udl (based on GEM-SHMEM) imported and vmap'ed a dma-buf from amdgpu. Amdgpu relocated the buffer, thereby invalidating the mapping. Provide a custom dma-buf vmap method in amdgpu that pins the object before mapping it's buffer's pages into kernel address space. Do the opposite in vunmap. Note that dma-buf vmap differs from GEM vmap in how it handles relocation. While dma-buf vmap keeps the buffer in place, GEM vmap requires the caller to keep the buffer in place. Hence, this fix is in amdgpu's dma-buf code instead of its GEM code. A discussion of various approaches to solving the problem is available at [1]. v3: - try (GTT | VRAM); drop CPU domain (Christian) v2: - only use mapable domains (Christian) - try pinning to domains in preferred order Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 660cd44659a0 ("drm/shmem-helper: Import dmabuf without mapping its sg_table") Reported-by: Thomas Zimmermann <tzimmermann@suse.de> Closes: https://lore.kernel.org/dri-devel/ba1bdfb8-dbf7-4372-bdcb-df7e0511c702@suse.de/ Cc: Shixiong Ou <oushixiong@kylinos.cn> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Link: https://lore.kernel.org/dri-devel/9792c6c3-a2b8-4b2b-b5ba-fba19b153e21@suse.de/ # [1] Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250821064031.39090-1-tzimmermann@suse.de
2025-08-20Merge drm/drm-fixes into drm-misc-fixesMaxime Ripard5-11/+33
Update drm-misc-fixes to -rc2. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-08-18drm/amd/display: Fix DP audio DTO1 clock source on DCE 6.Timur Kristóf1-15/+6
On DCE 6, DP audio was not working. However, it worked when an HDMI monitor was also plugged in. Looking at dce_aud_wall_dto_setup it seems that the main difference is that we use DTO1 when only DP is plugged in. When programming DTO1, it uses audio_dto_source_clock_in_khz which is set from get_dp_ref_freq_khz The dce60_get_dp_ref_freq_khz implementation looks incorrect, because DENTIST_DISPCLK_CNTL seems to be always zero on DCE 6, so it isn't usable. I compared dce60_get_dp_ref_freq_khz to the legacy display code, specifically dce_v6_0_audio_set_dto, and it turns out that in case of DCE 6, it needs to use the display clock. With that, DP audio started working on Pitcairn, Oland and Cape Verde. However, it still didn't work on Tahiti. Despite having the same DCE version, Tahiti seems to have a different audio device. After some trial and error I realized that it works with the default display clock as reported by the VBIOS, not the current display clock. The patch was tested on all four SI GPUs: * Pitcairn (DCE 6.0) * Oland (DCE 6.4) * Cape Verde (DCE 6.0) * Tahiti (DCE 6.0 but different) The testing was done on Samsung Odyssey G7 LS28BG700EPXEN on each of the above GPUs, at the following settings: * 4K 60 Hz * 1080p 60 Hz * 1080p 144 Hz Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 645cc7863da5de700547d236697dffd6760cf051) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Fix fractional fb divider in set_pixel_clock_v3Timur Kristóf1-1/+1
For later VBIOS versions, the fractional feedback divider is calculated as the remainder of dividing the feedback divider by a factor, which is set to 1000000. For reference, see: - calculate_fb_and_fractional_fb_divider - calc_pll_max_vco_construct However, in case of old VBIOS versions that have set_pixel_clock_v3, they only have 1 byte available for the fractional feedback divider, and it's expected to be set to the remainder from dividing the feedback divider by 10. For reference see the legacy display code: - amdgpu_pll_compute - amdgpu_atombios_crtc_program_pll This commit fixes set_pixel_clock_v3 by dividing the fractional feedback divider passed to the function by 100000. Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 027e7acc7e17802ebf28e1edb88a404836ad50d6) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Don't print errors for nonexistent connectorsTimur Kristóf2-5/+15
When getting the number of connectors, the VBIOS reports the number of valid indices, but it doesn't say which indices are valid, and not every valid index has an actual connector. If we don't find a connector on an index, that is not an error. Considering these are not actual errors, don't litter the logs. Fixes: 60df5628144b ("drm/amd/display: handle invalid connector indices") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 249d4bc5f1935f04bb45b3b63c0f8922565124f7)
2025-08-18drm/amd/display: Don't warn when missing DCE encoder capsTimur Kristóf1-4/+4
On some GPUs the VBIOS just doesn't have encoder caps, or maybe not for every encoder. This isn't really a problem and it's handled well, so let's not litter the logs with it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 33e0227ee96e62d034781e91f215e32fd0b1d512)
2025-08-18drm/amd/display: Fill display clock and vblank time in dce110_fill_display_configsTimur Kristóf3-11/+3
Also needed by DCE 6. This way the code that gathers this info can be shared between different DCE versions and doesn't have to be repeated. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8107432dff37db26fcb641b6cebeae8981cd73a0) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Find first CRTC and its line time in dce110_fill_display_configsTimur Kristóf1-10/+20
dce110_fill_display_configs is shared between DCE 6-11, and finding the first CRTC and its line time is relevant to DCE 6 too. Move the code to find it from DCE 11 specific code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4ab09785f8d5d03df052827af073d5c508ff5f63) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Adjust DCE 8-10 clock, don't overclock by 15%Timur Kristóf1-7/+5
Adjust the nominal (and performance) clocks for DCE 8-10, and set them to 625 MHz, which is the value used by the legacy display code in amdgpu_atombios_get_clock_info. This was tested with Hawaii, Tonga and Fiji. These GPUs can output 4K 60Hz (10-bit depth) at 625 MHz. The extra 15% clock was added as a workaround for a Polaris issue which uses DCE 11, and should not have been used on DCE 8-10 which are already hardcoded to the highest possible display clock. Unfortunately, the extra 15% was mistakenly copied and kept even on code paths which don't affect Polaris. This commit fixes that and also adds a check to make sure not to exceed the maximum DCE 8-10 display clock. Fixes: 8cd61c313d8b ("drm/amd/display: Raise dispclk value for Polaris") Fixes: dc88b4a684d2 ("drm/amd/display: make clk mgr soc specific") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1ae45b5d4f371af8ae51a3827d0ec9fe27eeb867)
2025-08-18drm/amd/display: Don't overclock DCE 6 by 15%Timur Kristóf1-5/+3
The extra 15% clock was added as a workaround for a Polaris issue which uses DCE 11, and should not have been used on DCE 6 which is already hardcoded to the highest possible display clock. Unfortunately, the extra 15% was mistakenly copied and kept even on code paths which don't affect Polaris. This commit fixes that and also adds a check to make sure not to exceed the maximum DCE 6 display clock. Fixes: 8cd61c313d8b ("drm/amd/display: Raise dispclk value for Polaris") Fixes: dc88b4a684d2 ("drm/amd/display: make clk mgr soc specific") Fixes: 3ecb3b794e2c ("drm/amd/display: dc/clk_mgr: add support for SI parts (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 427980c1cbd22bb256b9385f5ce73c0937562408) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Add null pointer check in mod_hdcp_hdcp1_create_session()Chenyuan Yang1-0/+3
The function mod_hdcp_hdcp1_create_session() calls the function get_first_active_display(), but does not check its return value. The return value is a null pointer if the display list is empty. This will lead to a null pointer dereference. Add a null pointer check for get_first_active_display() and return MOD_HDCP_STATUS_DISPLAY_NOT_FOUND if the function return null. This is similar to the commit c3e9826a2202 ("drm/amd/display: Add null pointer check for get_first_active_display()"). Fixes: 2deade5ede56 ("drm/amd/display: Remove hdcp display state with mst fix") Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5e43eb3cd731649c4f8b9134f857be62a416c893)
2025-08-18drm/amd/display: Fix Xorg desktop unresponsive on Replay panelTom Chung1-0/+19
[WHY & HOW] IPS & self-fresh feature can cause vblank counter resets between vblank disable and enable. It may cause system stuck due to wait the vblank counter. Call the drm_crtc_vblank_restore() during vblank enable to estimate missed vblanks by using timestamps and update the vblank counter in DRM. It can make the vblank counter increase smoothly and resolve this issue. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 34d66bc7ff10e146a4cec76cf286979740a10954) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Avoid a NULL pointer dereferenceMario Limonciello1-0/+3
[WHY] Although unlikely drm_atomic_get_new_connector_state() or drm_atomic_get_old_connector_state() can return NULL. [HOW] Check returns before dereference. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1e5e8d672fec9f2ab352be121be971877bff2af9) Cc: stable@vger.kernel.org
2025-08-18drm/amdgpu/swm14: Update power limit logicAlex Deucher1-5/+25
Take into account the limits from the vbios. Ported from the SMU13 code. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4352 Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 203cc7f1dd86f2c8de5c3c6182f19adac7c9c206) Cc: stable@vger.kernel.org
2025-08-18drm/amd/display: Revert Add HPO encoder support to ReplayGabe Teeger4-62/+5
This reverts commits: commit 1f26214d268b ("drm/amd/display: Add HPO encoder support to Replay") commit 3bfce48b109f ("drm/amd/display: Add support for Panel Replay on DP1 eDP (panel_inst=1)") due to visual confirm issue. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Gabe Teeger <gabe.teeger@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 92f68f6a1b297633159a3f3759e4dfc7e5b58abb)
2025-08-13Revert "drm/amdgpu: Use dma_buf from GEM object instance"Thomas Zimmermann3-3/+4
This reverts commit 515986100d176663d0a03219a3056e4252f729e6. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250715082635.34974-1-tzimmermann@suse.de
2025-08-12drm/amdgpu: fix task hang from failed job submission during process killLiu01 Tong2-4/+14
During process kill, drm_sched_entity_flush() will kill the vm entities. The following job submissions of this process will fail, and the resources of these jobs have not been released, nor have the fences been signalled, causing tasks to hang and timeout. Fix by check entity status in amdgpu_vm_ready() and avoid submit jobs to stopped entity. v2: add amdgpu_vm_ready() check before amdgpu_vm_clear_freed() in function amdgpu_cs_vm_handling(). Fixes: 1f02f2044bda ("drm/amdgpu: Avoid extra evict-restore process.") Signed-off-by: Liu01 Tong <Tong.Liu01@amd.com> Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f101c13a8720c73e67f8f9d511fbbeda95bcedb1)
2025-08-12drm/amdgpu: fix incorrect vm flags to map boJack Xiao1-2/+2
It should use vm flags instead of pte flags to specify bo vm attributes. Fixes: 7946340fa389 ("drm/amdgpu: Move csa related code to separate file") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b08425fa77ad2f305fe57a33dceb456be03b653f)
2025-08-12drm/amdgpu: fix vram reservation issueYiPeng Chai1-2/+1
The vram block allocation flag must be cleared before making vram reservation, otherwise reserving addresses within the currently freed memory range will always fail. Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu") Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d38eaf27de1b8584f42d6fb3f717b7ec44b3a7a1)
2025-08-12drm/amdgpu: Add PSP fw version check for fw reserve GFX commandFrank Min1-3/+16
The fw reserved GFX command is only supported starting from PSP fw version 0x3a0e14 and 0x3b0e0d. Older versions do not support this command. Add a version guard to ensure the command is only used when the running PSP fw meets the minimum version requirement. This ensures backward compatibility and safe operation across fw revisions. Fixes: a3b7f9c306e1 ("drm/amdgpu: reclaim psp fw reservation memory region") Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 065e23170a1e09bc9104b761183e59562a029619)
2025-08-08Merge tag 'drm-next-2025-08-08' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds20-119/+321
Pull drm fixes from Dave Airlie: "This is the fixes that built up in the merge window, mostly amdgpu and xe with one i915 display fix, seems like things are pretty good for rc1. i915: - DP LPFS fixes xe: - SRIOV: PF fixes and removal of need of module param - Fix driver unbind around Devcoredump - Mark xe driver as BROKEN if kernel page size is not 4kB amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix" * tag 'drm-next-2025-08-08' of https://gitlab.freedesktop.org/drm/kernel: (28 commits) drm/amdgpu: add missing vram lost check for LEGACY RESET drm/amdgpu/discovery: fix fw based ip discovery drm/amdkfd: Destroy KFD debugfs after destroy KFD wq amdgpu/amdgpu_discovery: increase timeout limit for IFWI init drm/amdgpu: Update SDMA firmware version check for user queue support drm/amdgpu: Add NULL check for asic_funcs drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value" drm/amd/display: fix a Null pointer dereference vulnerability drm/amd/display: Add primary plane to commits for correct VRR handling drm/amdgpu: update mmhub 3.3 client id mappings drm/amdgpu: update mmhub 3.0.1 client id mappings drm/amdgpu: Retain job->vm in amdgpu_job_prepare_job drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming. drm/amd/display: Don't overwrite dce60_clk_mgr drm/amdkfd: Fix checkpoint-restore on multi-xcc drm/amd: Restore cached manual clock settings during resume drm/amd: Restore cached power limit during resume drm/amdgpu: Update external revid for GC v9.5.0 drm/amdgpu: Update supported modes for GC v9.5.0 Mark xe driver as BROKEN if kernel page size is not 4kB ...
2025-08-06drm/amdgpu: add missing vram lost check for LEGACY RESETAlex Deucher1-0/+1
Legacy resets reset the memory controllers so VRAM contents may be unreliable after reset. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit aae94897b6661a2a4b1de2d328090fc388b3e0af) Cc: stable@vger.kernel.org
2025-08-06drm/amdgpu/discovery: fix fw based ip discoveryAlex Deucher2-36/+41
We only need the fw based discovery table for sysfs. No need to parse it. Additionally parsing some of the board specific tables may result in incorrect data on some boards. just load the binary and don't parse it on those boards. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4441 Fixes: 80a0e8282933 ("drm/amdgpu/discovery: optionally use fw based ip discovery") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 62eedd150fa11aefc2d377fc746633fdb1baeb55) Cc: stable@vger.kernel.org
2025-08-06drm/amdkfd: Destroy KFD debugfs after destroy KFD wqAmber Lin1-1/+1
Since KFD proc content was moved to kernel debugfs, we can't destroy KFD debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line debugfs_remove_recursive(entry->proc_dentry); tries to remove /sys/kernel/debug/kfd/proc/<pid> while /sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel NULL pointer. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0333052d90683d88531558dcfdbf2525cc37c233) Cc: stable@vger.kernel.org
2025-08-06amdgpu/amdgpu_discovery: increase timeout limit for IFWI initXaver Hugl1-2/+2
With a timeout of only 1 second, my rx 5700XT fails to initialize, so this increases the timeout to 2s. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3697 Signed-off-by: Xaver Hugl <xaver.hugl@kde.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9ed3d7bdf2dcdf1a1196630fab89a124526e9cc2) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: Update SDMA firmware version check for user queue supportJesse.Zhang1-1/+1
This commit fixes a firmware version check for enabling user queue support in SDMA v7.0. The previous version check (7836028) was incorrect and could lead to issues with PROTECTED_FENCE_SIGNAL commands causing register conflicts between MCU_DBG0 and MCU_DBG1. Fixes: 8c011408ed84 ("drm/amdgpu/sdma7: add ucode version checks for userq support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 92e2449241516c95aab95eea91faecd0fa2b7ed5) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: Add NULL check for asic_funcsLijo Lazar1-1/+2
If driver load fails too early, asic_funcs pointer remains unassigned. Add NULL check to sanitize unwind path. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 582bf7c5158dce16f7dc5b8345b7876bd8031224) Cc: stable@vger.kernel.org
2025-08-04drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value"Mario Limonciello1-4/+4
This reverts commit 66abb996999de0d440a02583a6e70c2c24deab45. This broke custom brightness curves but it wasn't obvious because of other related changes. Custom brightness curves are always from a 0-255 input signal. The correct fix was to fix the default value which was done by [1]. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4412 Link: https://lore.kernel.org/amd-gfx/0f094c4b-d2a3-42cd-824c-dc2858a5618d@kernel.org/T/#m69f875a7e69aa22df3370b3e3a9e69f4a61fdaf2 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6ec8a5cbec751625133461600d0d4950ffd3a214) Cc: stable@vger.kernel.org
2025-08-04drm/amd/display: fix a Null pointer dereference vulnerabilitySiyang Liu1-9/+10
[Why] A null pointer dereference vulnerability exists in the AMD display driver's (DC module) cleanup function dc_destruct(). When display control context (dc->ctx) construction fails (due to memory allocation failure), this pointer remains NULL. During subsequent error handling when dc_destruct() is called, there's no NULL check before dereferencing the perf_trace member (dc->ctx->perf_trace), causing a kernel null pointer dereference crash. [How] Check if dc->ctx is non-NULL before dereferencing. Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> (Updated commit text and removed unnecessary error message) Signed-off-by: Siyang Liu <Security@tencent.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9dd8e2ba268c636c240a918e0a31e6feaee19404) Cc: stable@vger.kernel.org
2025-08-04drm/amd/display: Add primary plane to commits for correct VRR handlingMichel Dänzer1-0/+9
amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for the primary plane. If a commit affects a CRTC but not its primary plane, it would previously not trigger a refresh cycle or affect LFC, violating current UAPI semantics. Fixes e.g. atomic commits affecting only the cursor plane being limited to the minimum refresh rate. Don't do this for the legacy cursor ioctls though, it would break the UAPI semantics for those. Suggested-by: Xaver Hugl <xaver.hugl@kde.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cc7bfba95966251b254cb970c21627124da3b7f4) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: update mmhub 3.3 client id mappingsAlex Deucher1-1/+104
Update the client id mapping so the correct clients get printed when there is a mmhub page fault. v2: fix typos spotted by David Wu. v3: fix additional typo spotted by David. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e932f4779a2d329841bb9ca70bb80a4bb2d707b6) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: update mmhub 3.0.1 client id mappingsAlex Deucher1-25/+32
Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2a2681eda73b99a2c1ee8cdb006099ea5d0c2505) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: Retain job->vm in amdgpu_job_prepare_jobYuanShang1-7/+0
The field job->vm is used in function amdgpu_job_run to get the page table re-generation counter and decide whether the job should be skipped. Specifically, function amdgpu_vm_generation checks if the VM is valid for this job to use. For instance, if a gfx job depends on a cancelled sdma job from entity vm->delayed, then the gfx job should be skipped. Fixes: 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare") Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ed76936c6b10b547c6df4ca75412331e9ef6d339) Cc: stable@vger.kernel.org
2025-08-04drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming.Timur Kristóf2-14/+25
Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only be used for DP. Make sure to initialize the correct amount of PLLs in DC for these DCE versions and use PLL0 only for DP. Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at initialization as opposed to DCE 6.1 and 7.x which use a different clock source for DFS. The following functions were used as reference from the old radeon driver implementation of DCE 6.x: - radeon_atom_pick_pll - atombios_crtc_set_disp_eng_pll Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 35222b5934ec8d762473592ece98659baf6bc48e) Cc: stable@vger.kernel.org
2025-08-04drm/amd/display: Don't overwrite dce60_clk_mgrTimur Kristóf1-1/+0
dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr with the dce_clk_mgr, causing incorrect behaviour on DCE6. Fix it by removing the extra dce_clk_mgr_construct. Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6) Cc: stable@vger.kernel.org
2025-08-04drm/amdkfd: Fix checkpoint-restore on multi-xccDavid Yat Sin3-16/+67
GPUs with multi-xcc have multiple MQDs per queue. This patch saves and restores all the MQDs within the partition. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a578f2a58c3ab38f0643b1b6e7534af860233cb1) Cc: stable@vger.kernel.org
2025-08-04drm/amd: Restore cached manual clock settings during resumeMario Limonciello1-0/+10
If the SCLK limits have been set before S3 they will not be restored. The limits are however cached in the driver and so they can be restored by running a commit sequence during resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-3-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4e9526924d09057a9ba854305e17eded900ced82) Cc: stable@vger.kernel.org
2025-08-04drm/amd: Restore cached power limit during resumeMario Limonciello1-0/+6
The power limit will be cached in smu->current_power_limit but if the ASIC goes into S3 this value won't be restored. Restore the value during SMU resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 26a609e053a6fc494403e95403bc6a2470383bec) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: Update external revid for GC v9.5.0Lijo Lazar1-0/+2
Use different external revid for GC v9.5.0 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 21c6764ed4bfaecad034bc4fd15dd64c5a436325) Cc: stable@vger.kernel.org
2025-08-04drm/amdgpu: Update supported modes for GC v9.5.0Lijo Lazar1-1/+4
For GC v9.5.0 SOCs, both CPX and QPX compute modes are also supported in NPS2 mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9d1ac25c7f830e0132aa816393b1e9f140e71148) Cc: stable@vger.kernel.org
2025-07-31Merge tag 'drm-next-2025-08-01' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds17-61/+88
Pull drm fixes from Dave Airlie: "Just a bunch of amdgpu and xe fixes. amdgpu: - DSC divide by 0 fix - clang fix - DC debugfs fix - Userq fixes - Avoid extra evict-restore with KFD - Backlight fix - Documentation fix - RAS fix - Add new kicker handling - DSC fix for DCN 3.1.4 - PSR fix - Atomic fix - DC reset fixes - DCN 3.0.1 fix - MMHUB client mapping fix xe: - Fix BMG probe on unsupported mailbox command - Fix OA static checker warning about null gt - Fix a NULL vs IS_ERR() bug in xe_i2c_register_adapter - Fix missing unwind goto in GuC/HuC - Don't register I2C devices if VF - Clear whole GuC g2h_fence during initialization - Avoid call kfree for drmm_kzalloc - Fix pci_dev reference leak on configfs - SRIOV: Disable CSC support on VF * tag 'drm-next-2025-08-01' of https://gitlab.freedesktop.org/drm/kernel: (24 commits) drm/xe/vf: Disable CSC support on VF drm/amdgpu: update mmhub 4.1.0 client id mappings drm/amd/display: Allow DCN301 to clear update flags drm/amd/display: Pass up errors for reset GPU that fails to init HW drm/amd/display: Only finalize atomic_obj if it was initialized drm/amd/display: Avoid configuring PSR granularity if PSR-SU not supported drm/amd/display: Disable dsc_power_gate for dcn314 by default drm/amdgpu: add kicker fws loading for gfx12/smu14/psp14 drm/amd/amdgpu: fix missing lock for cper.ring->rptr/wptr access drm/amd/display: Fix misuse of /** to /* in 'dce_i2c_hw.c' drm/amd/display: fix initial backlight brightness calculation drm/amdgpu: Avoid extra evict-restore process. drm/amdgpu: track whether a queue is a kernel queue in amdgpu_mqd_prop drm/amdgpu: check if hubbub is NULL in debugfs/amdgpu_dm_capabilities drm/amdgpu: Initialize data to NULL in imu_v12_0_program_rlc_ram() drm/amd/display: Fix divide by zero when calculating min ODM factor drm/xe/configfs: Fix pci_dev reference leak drm/xe/hw_engine_group: Avoid call kfree() for drmm_kzalloc() drm/xe/guc: Clear whole g2h_fence during initialization drm/xe/vf: Don't register I2C devices if VF ...