aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-10-21Merge tag 'drm-misc-fixes-2022-10-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixesDave Airlie1-1/+8
drm-misc-fixes for v6.1-rc2: - Fix a buffer overflow in format_helper_test. - Set DDC pointer in drmm_connector_init. - Compiler fixes for panfrost. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c4d05683-8ebe-93b8-d24c-d1d2c68f12c4@linux.intel.com
2022-10-20drm/amdgpu: fix sdma doorbell init ordering on APUsAlex Deucher2-5/+21
Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") uncovered a bug in amdgpu that required a reordering of the driver init sequence to avoid accessing a special register on the GPU before it was properly set up leading to an PCI AER error. This reordering uncovered a different hw programming ordering dependency in some APUs where the SDMA doorbells need to be programmed before the GFX doorbells. To fix this, move the SDMA doorbell programming back into the soc15 common code, but use the actual doorbell range values directly rather than the values stored in the ring structure since those will not be initialized at this point. This is a partial revert, but with the doorbell assignment fixed so the proper doorbell index is set before it's used. Fixes: e3163bc8ffdfdb ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: skhan@linuxfoundation.org Cc: stable@vger.kernel.org
2022-10-20Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann364-10665/+73474
Backmerging to get v6.1-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-10-19drm/amdgpu: use DRM_SCHED_FENCE_DONT_PIPELINE for VM updatesChristian König1-1/+8
Make sure that we always have a CPU round trip to let the submission code correctly decide if a TLB flush is necessary or not. Signed-off-by: Christian König <christian.koenig@amd.com> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2113#note_1579296 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014081553.114899-2-christian.koenig@amd.com
2022-10-18drm/amdgpu: Fix for BO move issueArunpravin Paneer Selvam1-0/+3
A user reported a bug on CAPE VERDE system where uvd_v3_1 IP component failed to initialize as there is an issue with BO move code from one memory to other. In function amdgpu_mem_visible() called by amdgpu_bo_move(), when there are no blocks to compare or if we have a single block then break the loop. Fixes: 312b4dc11d4f ("drm/amdgpu: Fix VRAM BO swap issue") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: dequeue mes scheduler during finiYuBiao Wang1-3/+39
[Why] If mes is not dequeued during fini, mes will be in an uncleaned state during reload, then mes couldn't receive some commands which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes and before kiq fini. v2: Move the dequeue operation inside kiq_hw_fini. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: enable thermal alert on smu_v13_0_10Kenneth Feng1-6/+4
enable thermal alert on smu_v13_0_10 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Program GC registers through RLCG interface in gfx_v11/gmc_v11Yifan Zha3-9/+13
[Why] L1 blocks most of GC registers accessing by MMIO. [How] Use RLCG interface to program GC registers under SRIOV VF in full access time. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdkfd: Fix type of reset_type parameter in hqd_destroy() callbackNathan Chancellor1-2/+3
When booting a kernel compiled with CONFIG_CFI_CLANG on a machine with an RX 6700 XT, there is a CFI failure in kfd_destroy_mqd_cp(): [ 12.894543] CFI failure at kfd_destroy_mqd_cp+0x2a/0x40 [amdgpu] (target: hqd_destroy_v10_3+0x0/0x260 [amdgpu]; expected type: 0x8594d794) Clang's kernel Control Flow Integrity (kCFI) makes sure that all indirect call targets have a type that exactly matches the function pointer prototype. In this case, hqd_destroy()'s third parameter, reset_type, should have a type of 'uint32_t' but every implementation of this callback has a third parameter type of 'enum kfd_preempt_type'. Update the function pointer prototype to match reality so that there is no more CFI violation. Link: https://github.com/ClangBuiltLinux/linux/issues/1738 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/display: Increase frame size limit for display_mode_vba_util_32.oGuenter Roeck1-1/+1
Building 32-bit images may fail with the following error. drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_util_32.c: In function ‘dml32_UseMinimumDCFCLK’: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_util_32.c:3142:1: error: the frame size of 1096 bytes is larger than 1024 bytes This is seen when building i386:allmodconfig with any of the following compilers. gcc (Debian 12.2.0-3) 12.2.0 gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 The problem is not seen if the compiler supports GCC_PLUGIN_LATENT_ENTROPY because in that case CONFIG_FRAME_WARN is already set to 2048 even for 32-bit builds. dml32_UseMinimumDCFCLK() was introduced with commit dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321"). It declares a large number of local variables. Increase the frame size for the affected file to 2048, similar to other files in the same directory, to enable 32-bit build tests with affected compilers. Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321") Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Reported-by: Łukasz Bartosik <ukaszb@google.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: add SMU IP v13.0.4 IF version define to V7Tim Huang1-1/+1
The pmfw has changed the driver interface version, so keep same with the fw. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: update SMU IP v13.0.4 driver interface versionTim Huang1-2/+15
Update the SMU driver interface version to V7. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: Init pm_attr_list when dpm is disabledZhenGuo Yin1-2/+2
[Why] In SRIOV multi-vf, dpm is always disabled, and pm_attr_list won't be initialized. There will be a NULL pointer call trace after removing the dpm check condition in amdgpu_pm_sysfs_fini. BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:amdgpu_device_attr_remove_groups+0x20/0x90 [amdgpu] Call Trace: <TASK> amdgpu_pm_sysfs_fini+0x2f/0x40 [amdgpu] amdgpu_device_fini_hw+0xdf/0x290 [amdgpu] [How] List pm_attr_list should be initialized when dpm is disabled. Fixes: a6ad27cec585fe ("drm/amd/pm: Remove redundant check condition") Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: disable cstate feature for gpu reset scenarioEvan Quan3-0/+25
Suggested by PMFW team and same as what did for gfxoff feature. This can address some Mode1Reset failures observed on SMU13.0.0. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: fulfill SMU13.0.7 cstate control interfaceEvan Quan1-0/+12
Fulfill the functionality for cstate control. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: fulfill SMU13.0.0 cstate control interfaceEvan Quan1-0/+11
Fulfill the functionality for cstate control. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Add sriov vf ras support in amdgpu_ras_asic_supportedYiPeng Chai1-5/+9
V2: Add sriov vf ras support in amdgpu_ras_asic_supported. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Enable ras support for mp0 v13_0_0 and v13_0_10YiPeng Chai1-0/+10
V1: Enable ras support for CHIP_IP_DISCOVERY asic type. V2: 1. Change commit comment. 2. Enable ras support for mp0 v13_0_0 and v13_0_10. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Enable gmc soft reset on gmc_v11_0_3YiPeng Chai1-0/+1
Enable gmc soft reset on gmc_v11_0_3. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: skip mes self test for gc 11.0.3Likun Gao1-1/+2
Temporary disable mes self teset for gc 11.0.3. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: skip loading pptable from driver on secure board for smu_v13_0_10Kenneth Feng1-1/+2
skip loading pptable from driver on secure board since it's loaded from psp. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Guan Yu <Guan.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/amdgpu: enable gfx clock gating features on smu_v13_0_10Kenneth Feng2-1/+6
enable gfx clock gating features on smu_v13_0_10 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: remove the pptable id override on smu_v13_0_10Kenneth Feng1-3/+0
remove the pptable id override on smu_v13_0_10, and the id is fetched from vbios now. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amd/pm: temporarily disable thermal alert on smu_v13_0_10Kenneth Feng1-4/+6
temporarily disable thermal alert on smu_v13_0_10 due to kfd test fail. will enable it again after confirming the thermal hardware setting. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10 properly"Asher Song1-12/+13
This reverts commit 16fb4dca95daa9d8e037201166a58de8284f4268. Unfortunately, that commit causes fan monitors can't be read and written properly. Fixes: 16fb4dca95daa9 ("drm/amdgpu: getting fan speed pwm for vega10 properly") Signed-off-by: Asher Song <Asher.Song@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Refactor mode2 reset logic for v11.0.7Victor Zhao1-8/+17
- refactor mode2 on v11.0.7 to align with aldebaran - comment out using mode2 reset as default for now, will introduce another controller to replace previous reset_level_mask v2: squash in unused variable removal (Alex) Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18Revert "drm/amdgpu: let mode2 reset fallback to default when failure"Victor Zhao9-20/+2
This reverts commit dac6b80818ac2353631c5a33d140d8d5508e2957. This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with the original design of reset handler. Will redesign it. Fixes: dac6b80818ac23 ("drm/amdgpu: let mode2 reset fallback to default when failure") Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18Revert "drm/amdgpu: add debugfs amdgpu_reset_level"Victor Zhao4-17/+0
This reverts commit 5bd8d53f6fa53eab5433698d1362dae2aa53c1cc. This commit breaks the reset logic for aldebaran, revert it for now. Will move the mask inside the reset handler. Fixes: 5bd8d53f6fa53e ("drm/amdgpu: add debugfs amdgpu_reset_level") Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: set vm_update_mode=0 as default for Sienna Cichlid in SRIOV caseDanijel Slivka3-1/+15
For asic with VF MMIO access protection avoid using CPU for VM table updates. CPU pagetable updates have issues with HDP flush as VF MMIO access protection blocks write to mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL register during sriov runtime. v3: introduce virtualization capability flag AMDGPU_VF_MMIO_ACCESS_PROTECT which indicates that VF MMIO write access is not allowed in sriov runtime Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-14Merge tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds3-13/+19
Pull more MM updates from Andrew Morton: - fix a race which causes page refcounting errors in ZONE_DEVICE pages (Alistair Popple) - fix userfaultfd test harness instability (Peter Xu) - various other patches in MM, mainly fixes * tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits) highmem: fix kmap_to_page() for kmap_local_page() addresses mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page mm/selftest: uffd: explain the write missing fault check mm/hugetlb: use hugetlb_pte_stable in migration race check mm/hugetlb: fix race condition of uffd missing/minor handling zram: always expose rw_page LoongArch: update local TLB if PTE entry exists mm: use update_mmu_tlb() on the second thread kasan: fix array-bounds warnings in tests hmm-tests: add test for migrate_device_range() nouveau/dmem: evict device private memory during release nouveau/dmem: refactor nouveau_dmem_fault_copy_one() mm/migrate_device.c: add migrate_device_range() mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page() mm/memremap.c: take a pgmap reference on page allocation mm: free device private pages have zero refcount mm/memory.c: fix race when faulting a device private page mm/damon: use damon_sz_region() in appropriate place mm/damon: move sz_damon_region to damon_sz_region lib/test_meminit: add checks for the allocation functions ...
2022-10-14drm/amd/display: Fix build breakage with CONFIG_DEBUG_FS=nNathan Chancellor1-1/+1
After commit 8799c0be89eb ("drm/amd/display: Fix vblank refcount in vrr transition"), a build with CONFIG_DEBUG_FS=n is broken due to a misplaced brace, along the lines of: In file included from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_trace.h:39, from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:41: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: At top level: ./include/drm/drm_atomic.h:864:9: error: expected identifier or ‘(’ before ‘for’ 864 | for ((__i) = 0; \ | ^~~ drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8317:9: note: in expansion of macro ‘for_each_new_crtc_in_state’ 8317 | for_each_new_crtc_in_state(state, crtc, new_crtc_state, j) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ Move the brace within the #ifdef so that the file can be built with or without CONFIG_DEBUG_FS. Fixes: 8799c0be89eb ("drm/amd/display: Fix vblank refcount in vrr transition") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-13Merge tag 'drm-next-2022-10-14' of git://anongit.freedesktop.org/drm/drmLinus Torvalds100-995/+1617
Pull more drm updates from Dave Airlie: "Round of fixes for the merge window stuff, bunch of amdgpu and i915 changes, this should have the gcc11 warning fix, amongst other changes. amdgpu: - DC mutex fix - DC SubVP fixes - DCN 3.2.x fixes - DCN 3.1.x fixes - SDMA 6.x fixes - Enable DPIA for 3.1.4 - VRR fixes - VRAM BO swapping fix - Revert dirty fb helper change - SR-IOV suspend/resume fixes - Work around GCC array bounds check fail warning - UMC 8.10 fixes - Misc fixes and cleanups i915: - Round to closest in g4x+ HDMI clock readout - Update MOCS table for EHL - Fix PSR_IMR/IIR field handling - Fix watermark calculations for gen12+/DG2 modifiers - Reject excessive dotclocks early - Fix revocation of non-persistent contexts - Handle migration for dpt - Fix display problems after resume - Allow control over the flags when migrating - Consider DG2_RC_CCS_CC when migrating buffers" * tag 'drm-next-2022-10-14' of git://anongit.freedesktop.org/drm/drm: (110 commits) drm/amd/display: Add HUBP surface flip interrupt handler drm/i915/display: consider DG2_RC_CCS_CC when migrating buffers drm/i915: allow control over the flags when migrating drm/amd/display: Simplify bool conversion drm/amd/display: fix transfer function passed to build_coefficients() drm/amd/display: add a license to cursor_reg_cache.h drm/amd/display: make virtual_disable_link_output static drm/amd/display: fix indentation in dc.c drm/amd/display: make dcn32_split_stream_for_mpc_or_odm static drm/amd/display: fix build error on arm64 drm/amd/display: 3.2.207 drm/amd/display: Clean some DCN32 macros drm/amdgpu: Add poison mode query for umc v8_10_0 drm/amdgpu: Update umc v8_10_0 headers drm/amdgpu: fix coding style issue for mca notifier drm/amdgpu: define convert_error_address for umc v8.7 drm/amdgpu: define RAS convert_error_address API drm/amdgpu: remove check for CE in RAS error address query drm/i915: Fix display problems after resume drm/amd/display: fix array-bounds error in dc_stream_remove_writeback() [take 2] ...
2022-10-12mm: free device private pages have zero refcountAlistair Popple1-1/+1
Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page refcount") device private pages have no longer had an extra reference count when the page is in use. However before handing them back to the owning device driver we add an extra reference count such that free pages have a reference count of one. This makes it difficult to tell if a page is free or not because both free and in use pages will have a non-zero refcount. Instead we should return pages to the drivers page allocator with a zero reference count. Kernel code can then safely use kernel functions such as get_page_unless_zero(). Link: https://lkml.kernel.org/r/cf70cf6f8c0bdb8aaebdbfb0d790aea4c683c3c6.1664366292.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12mm/memory.c: fix race when faulting a device private pageAlistair Popple3-12/+18
Patch series "Fix several device private page reference counting issues", v2 This series aims to fix a number of page reference counting issues in drivers dealing with device private ZONE_DEVICE pages. These result in use-after-free type bugs, either from accessing a struct page which no longer exists because it has been removed or accessing fields within the struct page which are no longer valid because the page has been freed. During normal usage it is unlikely these will cause any problems. However without these fixes it is possible to crash the kernel from userspace. These crashes can be triggered either by unloading the kernel module or unbinding the device from the driver prior to a userspace task exiting. In modules such as Nouveau it is also possible to trigger some of these issues by explicitly closing the device file-descriptor prior to the task exiting and then accessing device private memory. This involves some minor changes to both PowerPC and AMD GPU code. Unfortunately I lack hardware to test either of those so any help there would be appreciated. The changes mimic what is done in for both Nouveau and hmm-tests though so I doubt they will cause problems. This patch (of 8): When the CPU tries to access a device private page the migrate_to_ram() callback associated with the pgmap for the page is called. However no reference is taken on the faulting page. Therefore a concurrent migration of the device private page can free the page and possibly the underlying pgmap. This results in a race which can crash the kernel due to the migrate_to_ram() function pointer becoming invalid. It also means drivers can't reliably read the zone_device_data field because the page may have been freed with memunmap_pages(). Close the race by getting a reference on the page while holding the ptl to ensure it has not been freed. Unfortunately the elevated reference count will cause the migration required to handle the fault to fail. To avoid this failure pass the faulting page into the migrate_vma functions so that if an elevated reference count is found it can be checked to see if it's expected or not. [mpe@ellerman.id.au: fix build] Link: https://lkml.kernel.org/r/87fsgbf3gh.fsf@mpe.ellerman.id.au Link: https://lkml.kernel.org/r/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com Link: https://lkml.kernel.org/r/d3e813178a59e565e8d78d9b9a4e2562f6494f90.1664366292.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Lyude Paul <lyude@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12drm/amd/display: Add HUBP surface flip interrupt handlerAurabindo Pillai1-0/+1
Add the hubp surface flip handler. This fixes some flip timeout issues. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-10-11drm/amd/display: Simplify bool conversionYang Li1-1/+1
The result of 'pwr_status == 0' is Boolean, and the question mark expression is redundant. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2354 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: fix transfer function passed to build_coefficients()Alex Deucher1-1/+1
The default argument should be enum TRANSFER_FUNCTION_SRGB rather than the current boolean value which improperly maps to TRANSFER_FUNCTION_BT709. Commit 9b3d76527f6e ("drm/amd/display: Revert adding degamma coefficients") looks to have improperly reverted commit d02097095916 ("drm/amd/display: Add regamma/degamma coefficients and set sRGB when TF is BT709") replacing the enum value with a boolean value. Cc: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Cc: Jaehyun Chung <jaehyun.chung@amd.com> Cc: Zeng Heng <zengheng4@huawei.com> Fixes: 9b3d76527f6e ("drm/amd/display: Revert adding degamma coefficients") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: add a license to cursor_reg_cache.hAlex Deucher1-0/+1
It's MIT. Fixes: b73353f7f3d434 ("drm/amd/display: Use the same cursor info across features") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: make virtual_disable_link_output staticAlex Deucher1-1/+1
It's not used outside of virtual_link_hwss.c. Fixes a -Wmissing-prototypes warning. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: fix indentation in dc.cAlex Deucher1-3/+3
Fixes a warning in dc.c. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: make dcn32_split_stream_for_mpc_or_odm staticAlex Deucher1-1/+1
It's not used outside of dcn32_fpu.c. Fixes: 20dad3813b3c15 ("drm/amd/display: Add a helper to map ODM/MPC/Multi-Plane resources") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: fix build error on arm64Yang Yingliang1-0/+4
dcn20_build_mapped_resource() and dcn20_acquire_dsc() is not defined, if CONFIG_DRM_AMD_DC_DCN is disabled. Fix the following build error on arm64: ERROR: modpost: "dcn20_build_mapped_resource" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: modpost: "dcn20_acquire_dsc" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Fixes: 20dad3813b3c ("drm/amd/display: Add a helper to map ODM/MPC/Multi-Plane resources") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: 3.2.207Aric Cyr1-1/+1
DC version 3.2.207 brings along the following: - PMFW z-state interface update - Cursor update refactor - Fixes to DSC validation, DCFCLK during Freesync, etc. - Code cleanup Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amd/display: Clean some DCN32 macrosRodrigo Siqueira1-10/+1
Some unused macros might mislead developers during the debug, which can be removed without any issue. This commit drops some unused references to SE_COMMON_MASK_SH_LIST_DCN32. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: Add poison mode query for umc v8_10_0Candice Li1-0/+26
Add poison mode query support on umc v8_10_0. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: Update umc v8_10_0 headersCandice Li2-0/+5
Add GeccCtrl offset and mask to umc v8_10_0 headers. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: fix coding style issue for mca notifierTao Zhou1-3/+3
Fix some issues found by checkpatch script. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: define convert_error_address for umc v8.7Tao Zhou1-22/+25
So the code can be simplified. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: define RAS convert_error_address APITao Zhou3-93/+64
Make the code reusable and remove redundant code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: remove check for CE in RAS error address queryTao Zhou4-90/+59
Only RAS UE error address is queried currently, no need to check CE status. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>