aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-04-30drm/amdgpu/mes: consolidate on a single mes reset callbackAlex Deucher1-4/+4
Use the legacy one as it covers both kernel queues and user queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-30drm/amdgpu/mes: remove more unused functionsAlex Deucher1-26/+0
These were leftover from mes bring up and are unused. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-30drm/amdgpu: Fix API status offset for MES queue resetJesse.Zhang1-1/+1
The mes_v11_0_reset_hw_queue and mes_v12_0_reset_hw_queue functions were using the wrong union type (MESAPI__REMOVE_QUEUE) when getting the offset for api_status. Since these functions handle queue reset operations, they should use MESAPI__RESET union instead. This fixes the polling of API status during hardware queue reset operations in the MES for both v11 and v12 versions. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-By: Shaoyun.liu <Shaoyun.liu@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-21drm/amdgpu/mes11: add conversion for priority levelsAlex Deucher1-2/+19
Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11drm/amdgpu: adjust enforce_isolation handlingAlex Deucher1-1/+1
Switch from a bool to an enum and allow more options for enforce isolation. There are now 3 modes of operation: - Disabled (0) - Enabled (serialization and cleaner shader) (1) - Enabled in legacy mode (no serialization or cleaner shader) (2) This provides better flexibility for more use cases. Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11drm/amdgpu/mes11: use the device value for enforce isolationAlex Deucher1-1/+1
Use the local setting rather than the global parameter. Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/mes: centralize gfx_hqd mask managementAlex Deucher1-13/+3
Move it to amdgpu_mes to align with the compute and sdma hqd masks. No functional change. v2: rebase on new changes v3: misc optimizations Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri<sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: Remove the MES self testArunpravin Paneer Selvam1-13/+1
Remove MES self test as this conflicts the userqueue fence interrupts. v2:(Christian) - remove the amdgpu_mes_self_test() function and any now unused code. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: fix MES GFX maskArvind Yadav1-2/+13
Current MES GFX mask prevents FW to enable oversubscription. This patch does the following: - Fixes the mask values and adds a description for the same - Removes the central mask setup and makes it IP specific, as it would be different when the number of pipes and queues are different. v2: squash in fix from Shashank Cc: Christian König <Christian.Koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amd/amdgpu: Fix typoAnanta Srikar1-1/+1
Fixes a typo in the word "version" in an error message. Signed-off-by: Ananta Srikar <srikarananta01@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/mes11: optimize MES pipe FW version fetchingAlex Deucher1-0/+4
Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: 028c3fb37e70 ("drm/amdgpu/mes11: initiate mes v11 support") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4083 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-13drm/amd/amdgpu: Fix MES init sequenceShaoyun Liu1-30/+29
When MES is been used , the set_hw_resource_1 API is required to initialize MES internal context correctly Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27drm/amdgpu/mes11: drop amdgpu_mes_suspend()/amdgpu_mes_resume() callsAlex Deucher1-13/+1
They are noops on GFX11 for most firmware versions. KFD already handles its own queues and they should already be unmapped at this point so even if this runs, it's not doing anything. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25drm/amdgpu/mes: keep enforce isolation up to dateAlex Deucher1-0/+4
Re-send the mes message on resume to make sure the mes state is up to date. Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES") Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun Liu <shaoyun.liu@amd.com> Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25drm/amdgpu: correct the name of mes_pipe structureLikun Gao1-7/+7
Correct the structure name admgpu_mes_pipe to amdgpu_mes_pipe. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17drm/amdgpu/mes11: allocate hw_resource_1 buffer onceAlex Deucher1-26/+24
Allocate the buffer at sw init time so we don't alloc and free it for every suspend/resume or reset cycle. Reviewed-by: Shaoyun.liu <shaouyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdgpu/mes: Add cleaner shader fence address handling in MES for GFX11Srinivasan Shanmugam1-6/+15
This commit introduces enhancements to the handling of the cleaner shader fence in the AMDGPU MES driver: - The MES (Microcode Execution Scheduler) now sends a PM4 packet to the KIQ (Kernel Interface Queue) to request the cleaner shader, ensuring that requests are handled in a controlled manner and avoiding the race conditions. - The CP (Compute Processor) firmware has been updated to use a private bus for accessing specific registers, avoiding unnecessary operations that could lead to issues in VF (Virtual Function) mode. - The cleaner shader fence memory address is now set correctly in the `mes_set_hw_res_pkt` structure, allowing for proper synchronization of the cleaner shader execution. Cc: lin cao <lin.cao@amd.com> Cc: Jingwen Chen <Jingwen.Chen2@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdgpu/mes11: fix set_hw_resources_1 calculationAlex Deucher1-1/+1
It's GPU page size not CPU page size. In most cases they are the same, but not always. This can lead to overallocation on systems with larger pages. Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdgpu: add support for GC IP version 11.5.3Tim Huang1-0/+2
This initializes GC IP version 11.5.3. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: reduce the mmio writes in kiq settingPrike Liang1-3/+1
There's no need to perform the two MMIO writes in the KIQ Setting registers programmed period, and reducing the MMIO writes will save the driver loading time. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-12drm/amd/amdgpu: limit single process inside MESShaoyun Liu1-0/+15
This is for MES to limit only one process for the user queues Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-11drm/amd/amdgpu: Increase MES log buffer to dump mes scratch datashaoyunl1-1/+11
MES internal scratch data is useful for mes debug, it can only located in VRAM, change the allocation type and increase size for mes 11 Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Acked-by: Feifei Xu <Feifei.Xu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-22drm/amdgpu: Clean the functions pointer set as NULLSunil Khatri1-2/+0
We dont need to set the functions to NULL which arent needed as global structure members are by default set to zero or NULL for pointers. Cc: Leo Liu <leo.liu@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in hw_finiSunil Khatri1-14/+13
Update the *handle to amdgpu_ip_block ptr for all functions pointers of hw_fini. Also update the ip_block ptr where ever needed as there were cyclic dependency of hw_fini on suspend and some followed clean up. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in hw_initSunil Khatri1-7/+13
Update the *handle to amdgpu_ip_block ptr for all functions pointers of hw_init. Also update the ip_block ptr where ever needed as there were cyclic dependency of hw_init on resume. v2: squash in isp fix Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in resumeSunil Khatri1-2/+2
Update the *handle to amdgpu_ip_block ptr for all functions pointers of resume. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in suspendSunil Khatri1-2/+2
Update the *handle to amdgpu_ip_block ptr for all functions pointers of suspend. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in sw_finiSunil Khatri1-2/+2
update the *handle to amdgpu_ip_block ptr for all functions pointers of sw_fini. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in sw_initSunil Khatri1-2/+2
update the *handle to amdgpu_ip_block ptr for all functions pointers of sw_init. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in late_initSunil Khatri1-2/+2
Update the ptr handle to amdgpu_ip_block ptr in all the functions of late_init function ptr. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in early_initSunil Khatri1-2/+2
update the handle ptr to amdgpu_ip_block ptr for all functions pointers on early_init. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-26drm/amdgpu/mes11: update mes_reset_queue function to support sdma queueJiadong Zhu1-1/+26
Reset sdma queue through mmio based on me_id and queue_id. v2: simplify callflows and register calculation. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-18drm/amdgpu/mes11: reduce timeoutAlex Deucher1-1/+1
The firmware timeout is 2s. Reduce the driver timeout to 2.1 seconds to avoid back pressure on queue submissions. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3627 Fixes: f7c161a4c250 ("drm/amdgpu: increase mes submission timeout") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-09-06drm/amdgpu/mes11: Indent an if statmentDan Carpenter1-1/+1
Indent the "break" statement one more tab. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-02drm/amdgpu/mes11: implement mmio queue reset for gfx11Jiadong Zhu1-0/+80
Implement queue reset for graphic and compute queue. v2: use amdgpu_gfx_rlc funcs to enter/exit safe mode. v3: use gfx_v11_0_request_gfx_index_mutex() v4: fix mutex handling Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29drm/amdgpu/mes: add mes mapping legacy queue switchJack Xiao1-15/+34
For mes11 old firmware has issue to map legacy queue, add a flag to switch mes to map legacy queue. Fixes: f9d8c5c7855d ("drm/amdgpu/gfx: enable mes to map legacy queue support") Reported-by: Andrew Worsley <amworsley@gmail.com> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11Mukul Joshi1-2/+30
Add implementation for MES Suspend and Resume APIs to unmap/map all queues for GFX11. Support for GFX12 will be added when the corresponding firmware support is in place. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16drm/amdgpu/mes11: add API for user queue resetAlex Deucher1-0/+21
Add API for resetting user queues. Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13drm/amdgpu/mes12: configure two pipes hardware resourcesJack Xiao1-5/+2
Configure two pipes with different hardware resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13drm/amdgpu/mes: add multiple mes ring instances supportJack Xiao1-17/+17
Add multiple mes ring instances in mes structure to support multiple mes pipes. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13drm/amdgpu/mes11: add API for legacy queue resetAlex Deucher1-0/+33
Add API for resetting kernel queues. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13drm/amdgpu/mes: fix mes ring buffer overflowJack Xiao1-4/+14
wait memory room until enough before writing mes packets to avoid ring buffer overflow. v2: squash in sched_hw_submission fix Fixes: de3246254156 ("drm/amdgpu: cleanup MES11 command submission") Fixes: fffe347e1478 ("drm/amdgpu: cleanup MES12 command submission") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27drm/amdgpu: increase mes log buffer size for gfx12Michael Chen1-0/+2
MES firmware requires larger log buffer for gfx12. Allocate proper buffer respectively for gfx11 and gfx12. Signed-off-by: Michael Chen <michael.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-12drm/amdgpu/mes11: update opcode stringsAlex Deucher1-0/+3
Add new packet. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-02drm/amdgpu: add firmware for GC IP v11.5.2Tim Huang1-0/+2
This patch is to add firmware for GC 11.5.2. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amdgpu: cleanup MES11 command submissionChristian König1-28/+48
The approach of having a separate WB slot for each submission doesn't really work well and for example breaks GPU reset. Use a status query packet for the fence update instead since those should always succeed we can use the fence of the original packet to signal the state of the operation. While at it cleanup the coding style. Fixes: eef016ba8986 ("drm/amdgpu/mes11: Use a separate fence per transaction") Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amdgpu/mes11: fix kiq ring ready flagJack Xiao1-1/+2
kiq ring test has overwitten ready flag, need disable after gfx hw init. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-30drm/amdgpu/mes11: increase waiting time for engine readyJack Xiao1-1/+1
mes schq engine require more waiting time for engine ready before packet submission. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-30drm/amdgpu/mes11: adjust mes initialization sequenceJack Xiao1-1/+8
Adjust mes queue initialization before kgq/kcq initialization to enable mes mapping legacy queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-30drm/amdgpu/mes11: add mes mapping legacy queue supportJack Xiao1-0/+26
Add mes11 map legacy queue packet submission. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>