aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-04-08drm/amdgpu: Reset RAS table if header is invalidLijo Lazar1-5/+7
If a valid header is not found during RAS eeprom init, consider it as new and reset RAS table info. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: add loop bits for NPS2 page retirementTao Zhou2-0/+12
Support NPS2 RAS. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amd/amdgpu: decouple ASPM with pcie dpmKenneth Feng1-2/+0
ASPM doesn't need to be disabled if pcie dpm is disabled. So ASPM can be independantly enabled. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08amd/amdgpu: Init vcn hardware per instance for vcn 4.0.3Ruili Ji1-22/+29
Add interface for hardware init by vcn instance. v2: fix code format Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: Disable ACA on VFsVictor Skvortsov2-6/+8
VFs query RAS error counts directly from host with AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled, an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create() Likewise, VFs depend on host support to query CPERs, rather than ACA component. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: use "irq" in place of "interrupt" in DCE6/8 as in DCE10/11Alexandre Demers2-12/+12
"interrupt" becomes "irq" in: dce_vX_0_set_hpd_interrupt_state() dce_vX_0_set_crtc_interrupt_state() dce_vX_0_set_pageflip_interrupt_state() It is easier when going through the code to just change the DCE number in the functions' name to find and compare them across DCE versions. Also, it standardizes function mapping inside a given structure where .set and .process are both set to functions with a "_irq" suffix. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: fix typos in DCEsAlexandre Demers4-16/+15
In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and replace "type" by "hpd" for a uniform parameter naming usage across DCEs. In link_factory.c, there is a missing "p" to "types" Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/mes12: optimize MES pipe FW version fetchingAlex Deucher1-9/+12
Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)") Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: cancel gfx idle work in device suspend for s0ixAlex Deucher1-0/+7
This is normally handled in the gfx IP suspend callbacks, but for S0ix, those are skipped because we don't want to touch gfx. So handle it in device suspend. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Fixes: 963537ca2325 ("drm/amdgpu: add dynamic workload profile switching for gfx11") Fixes: 5f95a1549555 ("drm/amdgpu: add dynamic workload profile switching for gfx12") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/gfx12: dump full CP packet header FIFOsAlex Deucher1-6/+35
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/gfx11: dump full CP packet header FIFOsAlex Deucher1-10/+49
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/gfx10: dump full CP packet header FIFOsAlex Deucher1-13/+51
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx9.4.3: dump full CP packet header FIFOsAlex Deucher1-15/+37
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx9: dump full CP packet header FIFOsAlex Deucher1-13/+49
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Fix CPER error handling on VFsVictor Skvortsov1-4/+12
CPER read will loop infinitely if an error is encountered and the more bit is set. Add error checks to break upon failure. v2: added function pointer checks Suggested-by: Tony Yi <Tony.Yi@amd.com> Signed-off-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Fix the comment to avoid warningSunil Khatri1-0/+1
Fix the below comment warning drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:541: warning: Function parameter or struct member 'adev' not described in 'amdgpu_sdma_register_on_reset_callbacks' Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Fix xgmi v6.4.1 link status reportingLijo Lazar1-6/+18
Use the right register offsets for getting link status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Add basic validation for RAS headerLijo Lazar1-3/+19
If RAS header read from EEPROM is corrupted, it could result in trying to allocate huge memory for reading the records. Add some validation to header fields. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdkfd: Drop workaround for GC v9.4.3 revID 0Apurv Mishra2-13/+9
Remove workaround code for the early engineering samples GC v9.4.3 SOCs with revID 0 Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: huge sid.h cleanup, drop substituted defines.Alexandre Demers1-1209/+2
Now that we are using the proper defines, cleanup useless old "substituted" defines. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move si.c away from sid.hAlexandre Demers1-182/+183
Replace defines by the ones added earlier to GFX6, SMU6 and DCE6 Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: enable FW workaround for VCN 4_0_5Boyuan Zhang1-0/+4
Enabling VCN FW workaround for drm key injection through shared memory for vcn 4_0_5 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Add indirect L1_TLB_CNTL reg programming for VFsVictor Skvortsov5-21/+93
VFs on some IP versions are unable to access this register directly. This register must be programmed before PSP ring is setup, so use PSP VF mailbox directly. PSP will broadcast the register value to all VF assigned instances. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx12: Implement the gfx12 kgq pipe resetPrike Liang1-2/+67
Implement the GFX12 kgq pipe reset, and temporarily disable the GFX12 pipe reset until the CPFW fully support it. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx11: Implement the GFX11 KCQ pipe resetPrike Liang1-2/+134
Implement the GFX11 compute pipe reset. As the GFX11 CPFW still hasn't fully supported pipe reset yet, therefore disable the KCQ pipe reset temporarily. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx11: Implement the GFX11 KGQ pipe resetPrike Liang2-2/+72
Implement the kernel graphics queue pipe reset,and the driver will fallback to pipe reset when the queue reset fails. However, the ME FW hasn't fully supported pipe reset yet so disable the KGQ pipe reset temporarily. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx11: fix CSIB handlingAlex Deucher1-2/+0
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx10: fix CSIB handlingAlex Deucher1-2/+0
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx9: fix CSIB handlingAlex Deucher1-2/+0
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx8: fix CSIB handlingAlex Deucher1-2/+0
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx7: fix CSIB handlingAlex Deucher1-2/+0
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx6: fix CSIB handlingAlex Deucher1-2/+0
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx: assign the actual me0 queues per pipeAlex Deucher3-5/+5
Set the actual number of queues per pipe for ME0 (gfx). This way we will dump all of the queues properly in dev core dumps. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx: decouple the number of kgqs from the hwAlex Deucher4-9/+13
The driver currently sets up one kgq per pipe. As such adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere. This is fine for kernel queues, but when we enable user queues we need to know that actual number of queues per pipe. Decouple the kgq setup from the actual hardware count. For dev core dumps and user queues, we want to know the actual number of queues per pipe. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx: make amdgpu_gfx_me_queue_to_bit() staticAlex Deucher2-4/+2
It's not used outside of amdgpu_gfx.c. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUsSrinivasan Shanmugam1-0/+30
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide data isolation between GPU workloads. The cleaner shader is responsible for clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps prevent data leakage and ensures accurate computation results. This update extends cleaner shader support to GFX10.3.x GPUs, previously available for GFX10.3.0. It enhances security by clearing GPU memory between processes and maintains a consistent GPU state across KGD and KFD workloads. Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: drop some dead codeAlex Deucher1-61/+0
Drop the cgs smu firmware code for SI, it's not used. The smu firmware fetching for SI is done in si_dpm.c. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: continue cleaning up sid.h and si_enums.hAlexandre Demers2-92/+14
Remove more duplicated defines and move some in sid.h for coherence with CIK. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amd/amdgpu: Fix typoAnanta Srikar1-1/+1
Fixes a typo in the word "version" in an error message. Signed-off-by: Ananta Srikar <srikarananta01@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Replace deprecated function strcpy() with strscpy()Andres Urian Florez1-34/+34
Instead of using the strcpy() deprecated function to populate the fw_name, use the strscpy() function Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: add rebar parameterAlex Deucher3-0/+15
Add a new parameter to disable BAR resizing. Note that this only disables the driver from attempting to resize the BAR, The BIOS may have resized the BAR at boot. Some teams have found this useful in debugging P2P DMA issues on systems where the available MMIO space did not allow for all of the GPUs present to resize their BARs. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: cleanup DCE6 a bit moreAlexandre Demers2-6/+2
Use shifts already available in DCE6's defines, masks and shifts. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: keep removing sid.h dependency from si_dma.cAlexandre Demers2-5/+3
Move and rename DMA_SEM_INCOMPLETE_TIMER_CNTL and DMA_SEM_WAIT_FAIL_TIMER_CNTL in oss_1_0_d.h Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move si_dma.c away from sid.h and si_enums.hAlexandre Demers3-85/+63
Replace defines for the ones in oss_1_0_d.h and oss_1_0_sh_mask.h Taking the opportunity to add some comments taken from cik_sdma.c Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: make GFX6 easier to readAlexandre Demers1-4/+8
Just fix the style and add a comment for reading easiness Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move DCE6 away from sid.h and si_enums.h definesAlexandre Demers3-262/+176
This cleans up DCE6. I added some minor tweaks taken from CIK to exit early v2: minor fixes (Alex) Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6Alexandre Demers1-1/+1
It seems a copy-paste error: since we are working with mmGRPH_SECONDARY_SURFACE_ADDRESS, GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK should be used. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move si_ih.c away from sid.h definesAlexandre Demers1-8/+8
They are properly defined under oss_1_0_d.h Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: remove PACKET3 duplicated defines from si_enums.hAlexandre Demers1-123/+0
PACKET3 is already in sid.h, as it is done under cikd.h for CIK Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: use proper defines, shifts and masks in DCE6 codeAlexandre Demers4-35/+10
By replacing VGA_VSTATUS_CNTL by VGA_RENDER_CONTROL__VGA_VSTATUS_CNTL_MASK, we also need to fix its usage in GMC6. Note: VGA_VSTATUS_CNTL's binary value was inverted in dce_6_0_sh_mask.h, so we need to invert its value where it was used. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>