aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2021-09-23drm/amdgpu: Updated RAS infrastructureJohn Clements7-52/+146
Update RAS infrastructure to support RAS query for MCA subblocks Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stageGuchun Chen1-4/+5
adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu will still use adev->rmmio for access after amdgpu_device_unmap_mmio. The patch is to move such SRIOV calling earlier to fini_early stage. Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings") Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Fix resume failures when device is goneAndrey Grodzovsky1-0/+4
Problem: When device goes into suspend and unplugged during it then all HW programming during resume fails leading to a bad SW during pci remove handling which follows. Because device is first resumed and only later removed we cannot rely on drm_dev_enter/exit here. Fix: Use a flag we use for PCIe error recovery to avoid accessing registres. This allows to successfully complete pm resume sequence and finish pci remove. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Fix MMIO access page faultAndrey Grodzovsky2-8/+17
Add more guards to MMIO access post device unbind/unplug Bug: https://bugs.archlinux.org/task/72092?project=1&order=dateopened&sort=desc&pagenum=1 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Fix crash on device remove/driver unloadAndrey Grodzovsky7-90/+105
Crash: BUG: unable to handle page fault for address: 00000000000010e1 RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu] Call Trace: pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu] amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu] amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu] vce_v4_0_hw_fini+0x95/0xa0 [amdgpu] amdgpu_device_fini_hw+0x232/0x30d [amdgpu] amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu] amdgpu_pci_remove+0x27/0x40 [amdgpu] pci_device_remove+0x3e/0xb0 device_release_driver_internal+0x103/0x1d0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x79/0xa0 pci_stop_and_remove_bus_device_locked+0x1b/0x30 remove_store+0x7b/0x90 dev_attr_store+0x17/0x30 sysfs_kf_write+0x4b/0x60 kernfs_fop_write_iter+0x151/0x1e0 Why: VCE/UVD had dependency on SMC block for their suspend but SMC block is the first to do HW fini due to some constraints How: Since the original patch was dealing with suspend issues move the SMC block dependency back into suspend hooks as was done in V1 of the original patches. Keep flushing idle work both in suspend and HW fini seuqnces since it's essential in both cases. Fixes: 859e4659273f1d ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend") Fixes: bf756fb833cbe8 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Fix uvd ib test timeout when use pre-allocated BOxinhui pan1-1/+8
Now we use same BO for create/destroy msg. So destroy will wait for the fence returned from create to be signaled. The default timeout value in destroy is 10ms which is too short. Lets wait both fences with the specific timeout. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Put drm_dev_enter/exit outside hot codepathxinhui pan3-14/+36
We hit soft hang while doing memory pressure test on one numa system. After a qucik look, this is because kfd invalid/valid userptr memory frequently with process_info lock hold. Looks like update page table mapping use too much cpu time. perf top says below, 75.81% [kernel] [k] __srcu_read_unlock 6.19% [amdgpu] [k] amdgpu_gmc_set_pte_pde 3.56% [kernel] [k] __srcu_read_lock 2.20% [amdgpu] [k] amdgpu_vm_cpu_update 2.20% [kernel] [k] __sg_page_iter_dma_next 2.15% [drm] [k] drm_dev_enter 1.70% [drm] [k] drm_prime_sg_to_dma_addr_array 1.18% [kernel] [k] __sg_alloc_table_from_pages 1.09% [drm] [k] drm_dev_exit So move drm_dev_enter/exit outside gmc code, instead let caller do it. They are gart_unbind, gart_map, vm_clear_bo, vm_update_pdes and gmc_init_pdb0. vm_bo_update_mapping already calls it. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Resolve nBIF RAS error harvesting bugJohn Clements1-6/+24
Set correct RAS nBIF error query register offsets on aldebaran Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Update PSP TA unload functionCandice Li1-10/+10
Update PSP TA unload function to use PSP TA context as input argument. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Conform ASD header/loading to generic TA systemsCandice Li2-44/+26
Update asd_context structure and add asd_initialize function to conform ASD header/loading to generic TA systems. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Demote TMZ unsupported log message from warning to infoPaul Menzel1-1/+1
As the user cannot do anything about the unsupported Trusted Memory Zone (TMZ) feature, do not warn about it, but make it informational, so demote the log level from warning to info. Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_countMichel Dänzer2-2/+2
This was unusual; normally, inline functions are declared static as well, and defined in a header file if used by multiple compilation units. The latter would be more involved in this case, so just drop the inline declaration for now. Fixes compile failure building for ppc64le on RHEL 8: In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32, from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available 90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here 1985 | max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count(); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)" Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-22Merge tag 'drm-misc-next-2021-09-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-nextDave Airlie4-13/+9
drm-misc-next for $kernel-version: UAPI Changes: Cross-subsystem Changes: - dma-buf: Avoid a warning with some allocations, Remove DMA_FENCE_TRACE macros Core Changes: - bridge: New helper to git rid of panels in drivers - fence: Improve dma_fence_add_callback documentation, Improve dma_fence_ops->wait documentation - ioctl: Unexport drm_ioctl_permit - lease: Documentation improvements - fourcc: Add new macro to determine the modifier vendor - quirks: Add the Steam Deck, Chuwi HiBook, Chuwi Hi10 Pro, Samsung Galaxy Book 10.6, KD Kurio Smart C15200 2-in-1, Lenovo Ideapad D330 - resv: Improve the documentation - shmem-helpers: Allocate WC pages on x86, Switch to vmf_insert_pfn - sched: Fix for a timer being canceled too soon, Avoid null pointer derefence if the fence is null in drm_sched_fence_free, Convert drivers to rely on its dependency tracking - ttm: Switch to kerneldoc, new helper to clear all DMA mappings, pool shrinker optitimization, Remove ttm_tt_destroy_common, Fix for unbinding on multiple drivers Driver Changes: - bochs: New PCI IDs - msm: Fence ordering impromevemnts - stm: Add layer alpha support, zpos - v3d: Fix for a Vulkan CTS failure - vc4: Conversion to the new bridge helpers - vgem: Use shmem helpers - virtio: Support mapping exported vram - zte: Remove obsolete driver - bridge: Probe improvements for it66121, enable DSI EOTP for anx7625, errors propagation improvements for anx7625 - panels: 60fps mode for otm8009a, New driver for Samsung S6D27A1 Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Thu 16 Sep 2021 17:30:50 AEST # gpg: using EDDSA key 5C1337A45ECA9AEB89060E9EE3EF0D6F671851C5 # gpg: Can't check signature: No public key From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210916073132.ptbbmjetm7v3ufq3@gilmour
2021-09-16drm/amdgpu: Demote TMZ unsupported log message from warning to infoPaul Menzel1-1/+1
As the user cannot do anything about the unsupported Trusted Memory Zone (TMZ) feature, do not warn about it, but make it informational, so demote the log level from warning to info. Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_countMichel Dänzer2-2/+2
This was unusual; normally, inline functions are declared static as well, and defined in a header file if used by multiple compilation units. The latter would be more involved in this case, so just drop the inline declaration for now. Fixes compile failure building for ppc64le on RHEL 8: In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32, from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available 90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here 1985 | max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count(); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)" Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16drm/amdgpu: move iommu_resume before ip init/resumeJames Zhu1-0/+12
Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-09-16drm/amdgpu: add amdgpu_amdkfd_resume_iommuJames Zhu2-0/+11
Add amdgpu_amdkfd_resume_iommu for amdgpu. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-09-16drm/amdkfd: separate kfd_iommu_resume from kfd_resumeJames Zhu1-0/+6
Separate kfd_iommu_resume from kfd_resume for fine-tuning of amdgpu device init/resume/reset/recovery sequence. v2: squash in fix for !CONFIG_HSA_AMD Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-09-14drm/amdgpu: use IS_ERR for debugfs APIsNirmoy Das2-8/+6
debugfs APIs returns encoded error so use IS_ERR for checking return value. v2: return PTR_ERR(ent) References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686 Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-By: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-09-14drm/amdgpu: fix use after free during BO moveChristian König1-9/+9
The memory backing old_mem is already freed at that point, move the check a bit more up. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: bfa3357ef9ab ("drm/ttm: allocate resource object instead of embedding it v2") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1699 Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-09-14drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10Ernst Sjöstrand1-1/+1
Seems like newer cards can have even more instances now. Found by UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29 index 8 is out of range for type 'uint32_t *[8]' Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697 Cc: stable@vger.kernel.org Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Fix a race of IB testxinhui pan1-2/+2
Direct IB submission should be exclusive. So use write lock. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: VCN avoid memory allocation during IB testxinhui pan1-53/+44
alloc extra msg from direct IB pool. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: VCE avoid memory allocation during IB testxinhui pan1-14/+13
alloc extra msg from direct IB pool. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: UVD avoid memory allocation during IB testxinhui pan4-53/+74
move BO allocation in sw_init. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Unify PSP TA contextCandice Li8-118/+141
Remove all TA binary structures and add the specific binary structure in struct ta_context. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: move iommu_resume before ip init/resumeJames Zhu1-0/+12
Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: add amdgpu_amdkfd_resume_iommuJames Zhu2-0/+11
Add amdgpu_amdkfd_resume_iommu for amdgpu. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdkfd: separate kfd_iommu_resume from kfd_resumeJames Zhu1-0/+6
Separate kfd_iommu_resume from kfd_resume for fine-tuning of amdgpu device init/resume/reset/recovery sequence. v2: squash in fix for !CONFIG_HSA_AMD Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Get atomicOps info from Host for sriov setupshaoyunl2-12/+16
The AtomicOp Requester Enable bit is reserved in VFs and the PF value applies to all associated VFs. so guest driver can not directly enable the atomicOps for VF, it depends on PF to enable it. In current design, amdgpu driver will get the enabled atomicOps bits through private pf2vf data Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Increase direct IB pool sizexinhui pan1-7/+2
Direct IB pool is used for vce/vcn IB extra msg too. Increase its size to AMDGPU_IB_POOL_SIZE. v2: Squash in unused variable removal Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Update RAS trigger error block supportJohn Clements1-0/+3
Added trigger error support for MP0/MP1/MPIO blocks Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Update RAS status printJohn Clements1-27/+10
Remove uncessary RAS status prints Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: refactor function to init no-psp fwLikun Gao1-85/+75
Refactor the code of amdgpu_ucode_init_single_fw to make it more readable as too many ucode need to handle on this function currently. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: cleanup debugfs for amdgpu ringsNirmoy Das3-23/+9
Use debugfs_create_file_size API for creating ring debugfs, and as its a NULL returning API, change the return type for amdgpu_debugfs_ring_init API as well. Also cleanup surrounding code. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: use IS_ERR for debugfs APIsNirmoy Das2-8/+6
debugfs APIs returns encoded error so use IS_ERR for checking return value. v2: return PTR_ERR(ent) References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686 Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-By: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14Merge drm/drm-next into drm-misc-nextMaxime Ripard107-1596/+2500
Kickstart new drm-misc-next cycle. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-10Merge tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds14-70/+129
Pull drm fixes from Dave Airlie: "Just an initial bunch of fixes for the merge window, amdgpu is most of them with a few ttm fixes and an fbdev avoid multiply overflow fix. core: - Make some dma-buf config options depend on DMA_SHARED_BUFFER - Handle multiplication overflow of fbdev xres/yres in the core ttm: - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed - Fix ttm deadlock if target BO isn't idle - ttm build fix - ttm docs fix dma-buf: - config option fixes fbdev: - limit resolutions to avoid int overflow i915: - stddef change. amdgpu: - Misc cleanups, typo fixes - EEPROM fix - Add some new PCI IDs - Scatter/Gather display support for Yellow Carp - PCIe DPM fix for RKL platforms - RAS fix amdkfd: - SVM fix vc4: - static function fix mgag200: - fix uninit var panfrost: - lock_region fixes" * tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm: (36 commits) drm/ttm: Fix a deadlock if the target BO is not idle during swap fbmem: don't allow too huge resolutions dma-buf: DMABUF_SYSFS_STATS should depend on DMA_SHARED_BUFFER dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers" dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER drm/amdkfd: drop process ref count when xnack disable drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode drm/amdgpu: fix fdinfo race with process exit drm/amdgpu: Fix a deadlock if previous GEM object allocation fails drm/amdgpu: stop scheduler when calling hw_fini (v2) drm/amdgpu: Clear RAS interrupt status on aldebaran drm/amd/display: Initialize lt_settings on instantiation drm/amd/display: cleanup idents after a revert drm/amd/display: Fix memory leak reported by coverity drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum" drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform drm/amdgpu: show both cmd id and name when psp cmd failed drm/amd/display: setup system context for APUs ...
2021-09-07drm/amdgpu: sdma: clean up identationColin Ian King1-4/+4
There is a statement that is indented incorrectly. Clean it up. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07drm/amdgpu: clean up inconsistent indentingColin Ian King1-4/+3
There are a couple of statements that are indented one character too deeply, clean these up. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07drm/amdgpu: remove unused amdgpu_bo_validateChristian König2-35/+0
Just drop some dead code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07drm/amdgpu: fix use after free during BO moveChristian König1-9/+9
The memory backing old_mem is already freed at that point, move the check a bit more up. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: bfa3357ef9ab ("drm/ttm: allocate resource object instead of embedding it v2") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1699 Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07drm/amdgpu: Create common PSP TA load functionCandice Li2-204/+93
Creat common PSP TA load function and update PSP ta_mem_context with size information. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-02drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10Ernst Sjöstrand1-1/+1
Seems like newer cards can have even more instances now. Found by UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29 index 8 is out of range for type 'uint32_t *[8]' Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697 Cc: stable@vger.kernel.org Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-02dma-buf: nuke DMA_FENCE_TRACE macros v2Christian König1-9/+1
Only the DRM GPU scheduler, radeon and amdgpu where using them and they depend on a non existing config option to actually emit some code. v2: keep the signal path as is for now Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210818105443.1578-1-christian.koenig@amd.com
2021-09-01drm/amdgpu:schedule vce/vcn encode based on prioritySatyajit Sahu1-0/+16
Schedule the encode job in VCE/VCN encode ring based on the priority set by UMD. Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01drm/amdgpu/vcn: set the priority for each encode ringSatyajit Sahu6-5/+28
VCN has multiple rings. Set the proper priority level for each encode ring while initializing. Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01drm/amdgpu/vce: set the priority for each ringSatyajit Sahu5-3/+24
VCE has multiple rings. Set the proper priority level for each ring while initializing. Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01drm/amd/amdgpu: add mpio to ras blockCandice Li3-0/+4
Add MPIO to RAS block Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01drm/amd/amdgpu: consolidate PSP TA unload functionCandice Li1-120/+40
Create common PSP TA unload function and replace all common TA unloading sequences. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>