aboutsummaryrefslogtreecommitdiffstatshomepage
AgeCommit message (Collapse)AuthorFilesLines
2023-10-26drm/amd/display: add pipe resource management callbacks to DML2Wenjing Liu4-0/+30
[why] Need DML2 to support new pipe resource management APIs. Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Reduce default backlight min from 5 nits to 1 nitsSwapnil Patel1-2/+2
[Why & How] Currently set_default_brightness_aux function uses 5 nits as lower limit to check for valid default_backlight setting. However some newer panels can support even lower default settings Reviewed-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Swapnil Patel <swapnil.patel@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Update SDP VSC colorimetry from DP test automation requestGeorge Shen1-0/+6
[Why] Certain test equipment vendors check the SDP VSC for colorimetry against the value from the test request during certain DP link layer tests for YCbCr test cases. [How] Update SDP VSC with colorimetry from test automation request. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Add a check for idle power optimizationSung Joon Kim7-7/+34
[why] Need a helper function to check idle power is allowed so that dc doesn't access any registers that are power-gated. [how] Implement helper function to check idle power optimization. Enable a hook to check if detection is allowed. V2: Add function hooks for set and get idle states. Check if function hook was properly initialized. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Reviewed-by: Nicholas Choi <nicholas.choi@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Sung Joon Kim <sungkim@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Revert "Improve x86 and dmub ips handshake"Nicholas Kazlauskas11-129/+27
This reverts commit 1288d702080949f87688d49dfeeacc99f40adc9b. Causes intermittent hangs during reboot stress testing. Reviewed-by: Duncan Ma <duncan.ma@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Fix MST Multi-Stream Not Lighting Up on dcn35Fangzhi Zuo1-0/+6
dcn35 misses .enable_symclk_se hook that makes MST DSC not functional when having multiple FE clk to be enabled. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd: Explicitly disable ASPM when dynamic switching disabledMario Limonciello4-22/+10
Currently there are separate but related checks: * amdgpu_device_should_use_aspm() * amdgpu_device_aspm_support_quirk() * amdgpu_device_pcie_dynamic_switching_supported() Simplify into checking whether DPM was enabled or not in the auto case. This works because amdgpu_device_pcie_dynamic_switching_supported() populates that value. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd: Move AMD_IS_APU check for ASPM into top level functionMario Limonciello7-14/+6
There is no need for every ASIC driver to perform the same check. Move the duplicated code into amdgpu_device_should_use_aspm(). Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supportedMario Limonciello4-5/+5
Rather than individual ASICs checking for the quirk, set the quirk at the driver level. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26Revert "drm/amdkfd: Use partial migrations in GPU page faults"Philip Yang4-160/+85
This reverts commit dc427a473e5d119232ddb27530920d9796cdea70. The change prevents migrating the entire range to VRAM because retry fault restore_pages map the remaining system memory range to GPUs. It will work correctly to submit together with partial mapping to GPU patch later. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26Revert "drm/amdkfd:remove unused code"Philip Yang2-0/+63
This reverts commit f9caf6cdd5cc1f4006fd7b6b113658c0b0159f23. Needed for the next revert patch. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Fix copyright notice in DC codeStylon Wang35-3/+215
[Why & How] Fix incomplete copyright notice in DC code. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Fix copyright notice in DML2 codeStylon Wang22-1/+45
[Why & How] Fix incomplete copyright notice in DML2 code. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Add missing copyright notice in DMUBStylon Wang2-0/+38
[Why & How] Add missing/incomplete copyright notice in DMUB files Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu remove restriction of sriov max_pfn on Vega10Lin.Cao1-5/+2
Remove restriction of sriov max_pfn so that TBA and TMA can move to high 47 bits address. Regression test: change range alloc flag of libdrm as AMDGPU_VA_RANGE_HIGH and there is no flr occur when testing amdgpu_test of drm. Signed-off-by: Lin.Cao <lincao12@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdkfd: Address 'remap_list' not described in 'svm_range_add'Srinivasan Shanmugam1-0/+1
Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2073: warning: Function parameter or member 'remap_list' not described in 'svm_range_add' Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULLQu Huang1-0/+6
In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log: 1. Navigate to the directory: /sys/kernel/debug/dri/0 2. Execute command: cat amdgpu_regs_smc 3. Exception Log:: [4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000 [4005007.702562] #PF: supervisor instruction fetch in kernel mode [4005007.702567] #PF: error_code(0x0010) - not-present page [4005007.702570] PGD 0 P4D 0 [4005007.702576] Oops: 0010 [#1] SMP NOPTI [4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G OE 5.15.0-43-generic #46-Ubunt u [4005007.702590] RIP: 0010:0x0 [4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.702622] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.702626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 [4005007.702633] Call Trace: [4005007.702636] <TASK> [4005007.702640] amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu] [4005007.703002] full_proxy_read+0x5c/0x80 [4005007.703011] vfs_read+0x9f/0x1a0 [4005007.703019] ksys_read+0x67/0xe0 [4005007.703023] __x64_sys_read+0x19/0x20 [4005007.703028] do_syscall_64+0x5c/0xc0 [4005007.703034] ? do_user_addr_fault+0x1e3/0x670 [4005007.703040] ? exit_to_user_mode_prepare+0x37/0xb0 [4005007.703047] ? irqentry_exit_to_user_mode+0x9/0x20 [4005007.703052] ? irqentry_exit+0x19/0x30 [4005007.703057] ? exc_page_fault+0x89/0x160 [4005007.703062] ? asm_exc_page_fault+0x8/0x30 [4005007.703068] entry_SYSCALL_64_after_hwframe+0x44/0xae [4005007.703075] RIP: 0033:0x7f5e07672992 [4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 e c 28 48 89 54 24 [4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992 [4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003 [4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010 [4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [4005007.703105] </TASK> [4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_ iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac t tm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipm i_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solo mon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v 2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca [4005007.703184] CR2: 0000000000000000 [4005007.703188] ---[ end trace ac65a538d240da39 ]--- [4005007.800865] RIP: 0010:0x0 [4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.800891] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.800895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 Signed-off-by: Qu Huang <qu.huang@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: bypass RAS error reset in some conditionsTao Zhou1-1/+9
PMFW is responsible for RAS error reset in some conditions, driver can skip the operation. v2: add check for ras->in_recovery, it's set earlier than amdgpu_in_reset. v3: fix error in gpu reset check. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: enable RAS poison mode for APUTao Zhou1-1/+2
Enable it by default on APU platform. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu/vpe: correct queue stop programingLang Yu1-8/+10
Otherwise IB test would fail during GPU reset. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Fix DMUB errors introduced by DML2Rodrigo Siqueira1-4/+5
When DML 2 was introduced, it changed part of the generic sequence of DC, which caused issues on previous DCNs with DMUB support. This commit ensures the new sequence only works for new DCNs from 3.5 and above. Changes since V1: - Harry: Use the attribute using_dml2 instead of check the DCN version. Cc: Vitaly Prosyak <vprosyak@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2") Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd: Disable ASPM for VI w/ all Intel systemsMario Limonciello1-1/+1
Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: stable@vger.kernel.org # 5.15+ Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") Reported-and-tested-by: Paolo Gentili <paolo.gentili@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Remove power sequencing checkAgustin Gutierrez1-2/+1
[Why] Some ASICs keep backlight powered on after dpms off command has been issued. [How] The check for no edp power sequencing was never going to pass. The value is never changed from what it is set by design. Cc: stable@vger.kernel.org # 6.1+ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2765 Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: Set the DML2 attribute to false in all DCNs older than version 3.5Rodrigo Siqueira13-2/+15
When DML2 was introduced, it targeted only new DCN versions. For controlling which ASIC should use this new version of DML, it was introduced the using_dml2 attribute. To avoid ambiguities, this commit explicitly sets using_dml2 to false in all ASICs that do not support DML2. Cc: Vitaly Prosyak <vprosyak@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/pm: Fix the return value in default caseMa Jun1-3/+1
Fix the return value in default case and drop redundant 'break'. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: Add API to get full IP versionLijo Lazar1-0/+7
Fetch the full version of IP including variant and subrevision. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: add tmz support for GC IP v11.5.0Jiadong Zhu1-0/+1
Add tmz support for GC 11.5.0. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0Jiadong Zhu1-1/+2
PMFW will handle the features disablement properly for gpu reset case, driver involvement may cause some unexpected issues. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: modify if condition in nbio_v7_7.cLi Ma1-2/+2
remove unnecessary "enable" in if condition. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdkfd: Fix shift out-of-bounds issueJesse Zhang1-1/+1
[ 567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int' [ 567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G OE 6.2.0-34-generic #34~22.04.1-Ubuntu [ 567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [ 567.614504] Workqueue: events send_exception_work_handler [amdgpu] [ 567.614748] Call Trace: [ 567.614750] <TASK> [ 567.614753] dump_stack_lvl+0x48/0x70 [ 567.614761] dump_stack+0x10/0x20 [ 567.614763] __ubsan_handle_shift_out_of_bounds+0x156/0x310 [ 567.614769] ? srso_alias_return_thunk+0x5/0x7f [ 567.614773] ? update_sd_lb_stats.constprop.0+0xf2/0x3c0 [ 567.614780] svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu] [ 567.615047] ? srso_alias_return_thunk+0x5/0x7f [ 567.615052] svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [ 567.615286] do_swap_page+0x7b6/0xa30 [ 567.615291] ? srso_alias_return_thunk+0x5/0x7f [ 567.615294] ? __free_pages+0x119/0x130 [ 567.615299] handle_pte_fault+0x227/0x280 [ 567.615303] __handle_mm_fault+0x3c0/0x720 [ 567.615311] handle_mm_fault+0x119/0x330 [ 567.615314] ? lock_mm_and_find_vma+0x44/0x250 [ 567.615318] do_user_addr_fault+0x1a9/0x640 [ 567.615323] exc_page_fault+0x81/0x1b0 [ 567.615328] asm_exc_page_fault+0x27/0x30 [ 567.615332] RIP: 0010:__get_user_8+0x1c/0x30 Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: refine ras error kernel log printYang Wang2-39/+82
refine ras error kernel log to avoid user-ridden ambiguity. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdgpu: fix find ras error node errorYang Wang1-4/+3
the origin function might return the wrong node. Fixes: 5b1270beb380 ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/display: reprogram det size while seamless bootHugo Hu3-0/+33
[Why] During system boot in second screen only mode on a seamless boot system, there is a chance that the pipe's det size might not be reset. [How] Reset the det size while resetting the pipe during seamless boot. Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Hugo Hu <hugo.hu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amd/pm: record mca debug mode in RASTao Zhou1-0/+1
Call amdgpu_ras_set_mca_debug_mode when we set mca debug mode in smu v13_0_6. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-27usb: typec: altmodes/displayport: fixup drm internal api change vs new user.Dave Airlie1-1/+2
usb: typec: altmodes/displayport: Signal hpd low when exiting mode and drm: Add HPD state to drm_connector_oob_hotplug_event() sideswiped each other. Signal disconnected always. Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-10-26drm/amdgpu: move buffer funcs setting up a levelAlex Deucher11-84/+19
Rather than doing this in the IP code for the SDMA paging engine, move it up to the core device level init level. This should fix the scheduler init ordering. v2: drop extra parens v3: drop SDMA helpers v4: Added a Fixes tag because amdgpu dereferences an uninitialized scheduler without this patch, and this patch fixes this. (Luben) Tested-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20231025171928.3318505-1-alexander.deucher@amd.com Acked-by: Christian König <christian.koenig@amd.com> Fixes: 56e449603f0ac5 ("drm/sched: Convert the GPU scheduler to variable number of run-queues") Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
2023-10-26MAINTAINERS: Update the GPU Scheduler emailLuben Tuikov1-1/+1
Update the GPU Scheduler maintainer email. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@gmail.com> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org> Signed-off-by: Luben Tuikov <ltuikov89@gmail.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231026174438.18427-2-ltuikov89@gmail.com
2023-10-26drm/sched: Convert the GPU scheduler to variable number of run-queuesLuben Tuikov11-25/+98
The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com
2023-10-26MAINTAINERS: drm/ci: add entries for xfail filesHelen Koike1-0/+7
DRM CI keeps track of which tests are failing, flaking or being skipped by the ci in the expectations files. Add entries for those files to the corresponding driver maintainer, so they can be notified when they change. Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230919182249.153499-1-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: docs: add step about how to request privilegesHelen Koike1-2/+5
Clarify the procedure developer must follow to request privileges to run tests on Freedesktop gitlab CI. This measure was added to avoid untrusted people to misuse the infrastructure. Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-11-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: do not automatically retry on errorHelen Koike1-14/+0
Since the kernel doesn't use a bot like Mesa that requires tests to pass in order to merge the patches, leave it to developers and/or maintainers to manually retry. Suggested-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-10-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: export kernel configHelen Koike2-1/+2
Export the resultant kernel config, making it easier to verify if the resultant config was correctly generated. Suggested-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Helen Koike <helen.koike@collabora.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-9-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: increase i915 job timeout to 1h30mHelen Koike1-0/+5
With the new sharding, the default job timeout is not enough for i915 and their jobs are failing before completing. See below the current execution time: 🞋 job i915:tgl 8/8 has new status: success (37m3s) 🞋 job i915:tgl 7/8 has new status: success (19m43s) 🞋 job i915:tgl 6/8 has new status: success (21m47s) 🞋 job i915:tgl 5/8 has new status: success (18m16s) 🞋 job i915:tgl 4/8 has new status: success (21m43s) 🞋 job i915:tgl 3/8 has new status: success (17m59s) 🞋 job i915:tgl 2/8 has new status: success (22m15s) 🞋 job i915:tgl 1/8 has new status: success (18m52s) 🞋 job i915:cml 2/2 has new status: success (1h19m58s) 🞋 job i915:cml 1/2 has new status: success (55m45s) 🞋 job i915:whl 2/2 has new status: success (1h8m56s) 🞋 job i915:whl 1/2 has new status: success (54m3s) 🞋 job i915:kbl 3/3 has new status: success (37m43s) 🞋 job i915:kbl 2/3 has new status: success (36m37s) 🞋 job i915:kbl 1/3 has new status: success (34m52s) 🞋 job i915:amly 2/2 has new status: success (1h7m60s) 🞋 job i915:amly 1/2 has new status: success (59m18s) 🞋 job i915:glk 2/2 has new status: success (58m26s) 🞋 job i915:glk 1/2 has new status: success (50m23s) 🞋 job i915:apl 3/3 has new status: success (1h6m39s) 🞋 job i915:apl 2/3 has new status: success (1h4m45s) 🞋 job i915:apl 1/3 has new status: success (1h7m38s) (generated with ci_run_n_monitor.py script) The longest job is 1h19m58s, so adjust the timeout. Signed-off-by: Helen Koike <helen.koike@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-8-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: add subset-1-gfx to LAVA_TAGS and adjust shardsHelen Koike2-10/+15
The Collabora Lava farm added a tag called `subset-1-gfx` to half of devices the graphics community use. Lets use this tag so we don't occupy all the resources. This is particular important because Mesa3D shares the resources with DRM-CI and use them to do pre-merge tests, so it can block developers from getting their patches merged. Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-7-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: clean up xfails (specially flakes list)Helen Koike33-287/+162
Since the script that collected the list of the expectation files was bogus and placing test to the flakes list incorrectly, restart the expectation files with the correct script. This reduces a lot the number of tests in the flakes list. Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-6-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: uprev IGT and make sure core_getversion is runHelen Koike3-9/+26
IGT has recently merged a patch that makes code_getversion test to fails if the driver isn't loaded or if it isn't the expected one defined in variable IGT_FORCE_DRIVER. Without this test, jobs were passing when the driver didn't load or probe for some reason, giving the illusion that everything was ok. Uprev IGT to include this modification and include core_getversion test in all the shards. Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-5-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: add helper script update-xfails.pyHelen Koike2-0/+221
Add helper script that given a gitlab pipeline url, analyse which are the failures and flakes and update the xfails folder accordingly. Example: Trigger a pipeline in gitlab infrastructure, than re-try a few jobs more than once (so we can have data if failures are consistent across jobs with the same name or if they are flakes) and execute: update-xfails.py https://gitlab.freedesktop.org/helen.fornazier/linux/-/pipelines/970661 git diff should show you that it updated files in xfails folder. Signed-off-by: Helen Koike <helen.koike@collabora.com> Tested-by: Vignesh Raman <vignesh.raman@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-4-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: fix DEBIAN_ARCH and get amdgpu probingHelen Koike4-8/+8
amdgpu driver wasn't loading because amdgpu firmware wasn't being installed in the rootfs due to the wrong DEBIAN_ARCH variable. rename ARCH to DEBIAN_ARCH also, so we don't have the confusing DEBIAN_ARCH, KERNEL_ARCH and ARCH. Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-3-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: uprev mesa version: fix container build & crosvmHelen Koike4-3/+22
When building containers, some rust packages were installed without locking the dependencies version, which got updated and started giving errors like: error: failed to compile `bindgen-cli v0.62.0`, intermediate artifacts can be found at `/tmp/cargo-installkNKRwf` Caused by: package `rustix v0.38.13` cannot be built because it requires rustc 1.63 or newer, while the currently active rustc version is 1.60.0 A patch to Mesa was added fixing this error, so update it. Also, commit in linux kernel 6.6 rc3 broke booting in crosvm. Mesa has upreved crosvm to fix this issue. Signed-off-by: Helen Koike <helen.koike@collabora.com> [crosvm mesa update] Co-Developed-by: Vignesh Raman <vignesh.raman@collabora.com> Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com> [v1 container build uprev] Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Link: https://lore.kernel.org/r/20231024004525.169002-2-helen.koike@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-10-26drm/ci: Enable CONFIG_BACKLIGHT_CLASS_DEVICERob Clark2-0/+2
Dependency for CONFIG_DRM_PANEL_EDP. Missing this was causing the drm driver to not probe on devices that use panel-edp. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Helen Koike <helen.koike@collabora.com> Acked-by: Helen Koike <helen.koike@collabora.com> Link: https://lore.kernel.org/r/20231002164715.157298-1-robdclark@gmail.com Signed-off-by: Maxime Ripard <mripard@kernel.org>