linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2019-12-03	drm/amdgpu: fix calltrace during kmd unload(v3)	Monk Liu	1	-39/+1
	Issue: the kernel reports a warning from a double unpin of the CSB bo during driver unload. Why: we unpin it during hw_fini, and there is another unpin of the CSB bo in sw_fini. Fix: we don't actually need to pin/unpin it during hw_init/fini since it is created kernel-pinned; we only need to fill the CSB again during hw_init to prevent CSB/VRAM loss after S3. v2: get_csb in init_rlc so hw_init() restores the CSIB content even after reset or S3. v3: use bo_create_kernel instead of bo_create_reserved for the CSB, otherwise the bo_free_kernel() on the CSB is not aligned and would leave its internal reserve pending forever; take care of gfx7/8 as well. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22	drm/amdgpu: Update Arcturus golden registers	Jay Cornwall	1	-0/+1
	Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22	drm/amdgpu: disable gfxoff on original raven	Alex Deucher	1	-2/+7
	There are still combinations of sbios and firmware that are not stable. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08	drm/amdgpu: allow direct upload save restore list for raven2	changzhu	1	-1/+3
	Raven2 hits a modprobe atombios stuck problem if the gfx driver is not allowed to upload the save restore list directly. So temporarily allow direct upload of the save restore list for raven2. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06	drm/amdgpu/renoir: move gfxoff handling into gfx9 module	Alex Deucher	1	-0/+6
	To properly handle the option parsing ordering. Reviewed-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06	drm/amdgpu: change read of GPU clock counter on Vega10 VF	Eric Huang	1	-3/+16
	Using the unified VBIOS causes a performance drop in the SR-IOV environment. The fix is to switch to another register instead. Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06	drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9	changzhu	1	-0/+7
	Add a warning in gfx9 asking to update the firmware in case the firmware is too old to implement the dummy read in CP firmware. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06	drm/amdgpu: disallow direct upload save restore list from gfx driver	Hawking Zhang	1	-1/+2
	Directly uploading the save/restore list via mmio register writes breaks the security policy. Instead, the driver should pass the s&r list to psp. For all the ASICs that use rlc v2_1 headers, the driver actually uploads the s&r list twice, in the non-psp ucode front door loading phase and in the gfx pg initialization phase. The latter is not allowed. VG12 is the only exception where the driver still keeps the legacy approach for S&R list uploading. In theory, this can be eliminated if we have valid srcntl ucode for VG12. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Candice Li <Candice.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30	drm/amdgpu: fix no ACK from LDS read during stress test for Arcturus	Le Ma	1	-0/+1
	Set mmSQ_CONFIG.DISABLE_SMEM_SOFT_CLAUSE as W/R. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30	drm/amdgpu: bypass some cleanup work after err_event_athub (v2)	Le Ma	1	-2/+4
	The PSP loses its connection when err_event_athub occurs. This cleanup work can be skipped in BACO reset. v2: squash in missing include (Alex) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25	drm/amdgpu: remove unused parameter in amdgpu_gfx_kiq_free_ring	Nirmoy Das	1	-1/+1
	Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17	drm/amdgpu: fix S3 failed as RLC safe mode entry stucked in polloing gfx acq	Prike Liang	1	-5/+0
	Fix the gfx cgpg setting sequence to avoid an RLC deadlock at safe mode entry while polling for the gfx response. The patch fixes the VCN IB test failure and the DAL get display count failure. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17	drm/amdgpu: add GFX_PIPELINE capacity check for updating gfx cgpg	Prike Liang	1	-1/+2
	Before disabling gfx pipeline power gating, check the flag AMD_PG_SUPPORT_GFX_PIPELINE. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15	drm/amdgpu: add RAS support for VML2 and ATCL2	Dennis Li	1	-0/+167
	v1: Add codes to query the EDC count of VML2 & ATCL2 v2: Rename VML2/ATCL2 registers and drop their mask define v3: Add back the ECC mask for VML2 registers Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15	drm/amdgpu: change to query the actual EDC counter	Dennis Li	1	-325/+496
	For the potential request in the future, change to query the actual EDC counter. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03	drm/amdgpu: remove ih_info parameter of gfx_ras_late_init	Tao Zhou	1	-4/+1
	gfx_ras_late_init can get the info by itself Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03	drm/amdgpu: add common gfx_ras_fini function	Tao Zhou	1	-13/+1
	gfx_ras_fini can be shared among all generations of gfx Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03	drm/amdgpu: move gfx ecc functions to generic gfx file	Tao Zhou	1	-39/+2
	gfx ras ecc common functions could be reused among all gfx generations Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03	drm/amdgpu: update parameter of ras_ih_cb	Tao Zhou	1	-2/+2
	Change struct ras_err_data *err_data to void *err_data to align with the umc code, so that the callback's declaration in each ras block need not pay attention to the structure type. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03	drm/amdgpu: remove gfx9 NGG	Marek Olšák	1	-195/+0
	Never used. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03	drm/amdgpu: do not init mec2 jt for renoir	Hawking Zhang	1	-1/+2
	For ASICs like renoir/arct, the driver doesn't need to load mec2 jt. When mec1 jt is loaded, mec2 jt will be loaded automatically since the write is actually broadcast to both. We need more time to test other gfx9 ASICs, but for now we should be able to conclude that mec2 jt is not needed for renoir and arct. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16	drm/amdgpu: remove program of lbpw for renoir	Aaron Liu	1	-2/+0
	There is no LBPW on Renoir, so remove the programming of LBPW for Renoir. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16	drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10	Tianci.Yin	1	-9/+9
	add and_mask since the programming logic of golden setting changed Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function	Hawking Zhang	1	-34/+3
	amdgpu_gfx_ras_late_init is used to init the gfx specific ras debugfs/sysfs node and the gfx specific interrupt handler. It can be shared among gfx generations. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu: set ip specific ras interface pointer to NULL after free it	Hawking Zhang	1	-2/+5
	to prevent access to dangling pointers Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu: Avoid HW GPU reset for RAS.	Andrey Grodzovsky	1	-4/+6
	Problem: Under certain conditions, when some IP blocks take a RAS error, we can get into a situation where a GPU reset is not possible due to issues in RAS in SMU/PSP. Temporary fix until a proper solution in PSP/SMU is ready: When an uncorrectable error happens, the DF will unconditionally broadcast error event packets to all its clients/slaves upon receiving the fatal error event and freeze all its outbound queues, and the err_event_athub interrupt will be triggered. In such a case we use this interrupt to issue a GPU reset. The GPU reset code is modified for this case to avoid a HW reset: it only stops the schedulers, detaches all in-progress and not yet scheduled jobs' fences, sets an error code on them and signals them. It also rejects any new incoming job submissions from user space. All this is done to notify the applications of the problem. v2: Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c Remove print param from amdgpu_ras_query_error_count v3: Update based on previous bug fixing patch to properly call amdgpu_amdkfd_pre_reset for other XGMI hive members. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu: only apply gds clearing workaround when ras is supported	Hawking Zhang	1	-0/+4
	gds clearing workaround should only be applied on asics that support gfx ras Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu: fix memory leak when ras is not supported on specific ip block	Hawking Zhang	1	-1/+2
	free ras_if if ras is not supported Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2)	Hawking Zhang	1	-71/+21
	call helper function in late init phase to handle ras init for gfx ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13	drm/amdgpu: switch to new amdgpu_nbio structure	Hawking Zhang	1	-3/+3
	no functional change, just switch to new structures Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27	drm/amdgpu: fix GFXOFF on Picasso and Raven2	Aaron Liu	1	-7/+7
	For picasso (adev->pdev->device == 0x15d8) and raven2 (adev->rev_id >= 0x8), the firmware is sufficient to support gfxoff. In commit 98f58ada2d37e, the code returned directly for picasso and raven2, which left gfxoff disabled. Fixes: 98f58ada2d37 ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible") Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22	drm/amdgpu: update gc/sdma goldensetting for rn	Aaron Liu	1	-4/+3
	This patch updates gc/sdma goldensetting for renoir Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22	drm/amdgpu: add set_gfx_cgpg implement (v2)	Aaron Liu	1	-0/+5
	add set_gfx_cgpg implement v2: check if using sw_smu (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21	drm/amdgpu: remove duplicated include from gfx_v9_0.c	YueHaibing	1	-1/+0
	Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21	drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible	Alex Deucher	1	-0/+4
	We need to set certain power gating flags after we determine if the firmware version is sufficient to support gfxoff. Previously we set the pg flags in early init, but later we might have disabled gfxoff if the firmware versions didn't support it. Move adding the additional pg flags after we determine whether or not to support gfxoff. Fixes: 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series (v2)") Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-08-12	drm/amdgpu: update lbpw for renoir	Aaron Liu	1	-0/+1
	enable gfx_v9_0_init_lbpw for renoir Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: enable power gating for renoir	Aaron Liu	1	-0/+1
	enable gfx power gating for renoir Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: enable clock gating for renoir	Aaron Liu	1	-0/+1
	enable gfx&common clock gating for renoir Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: add gfx golden settings for renoir (v2)	Huang Rui	1	-0/+26
	This patch adds gfx golden settings for renoir real asic. v2: update settings (Alex) Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: set rlc funcs for renoir	Aaron Liu	1	-0/+1
	add gfx_v9_0_rlc_funcs for renoir Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: add gfx support for renoir	Huang Rui	1	-2/+24
	Add Renoir checks to gfx9 code. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: fix gfx9 soft recovery	Pierre-Eric Pelloux-Prayer	1	-1/+1
	The SOC15_REG_OFFSET() macro wasn't used, making the soft recovery fail. v2: use WREG32_SOC15 instead of WREG32 + SOC15_REG_OFFSET Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: increase CGCG gfx idle threshold for Arcturus	Le Ma	1	-2/+6
	Follow the hw spec, and no need to consider gfxoff on Arcturus Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: add gfx clock gating for Arcturus	Le Ma	1	-0/+4
	Add ARCTURUS case in gfx set clockgating function. No 3d clock on Arcturus. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12	drm/amdgpu: add check to avoid array bound issue	Guchun Chen	1	-0/+3
	sub_block_index can be passed in from user level, so add a check before accessing the array to prevent an out-of-bounds array index. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09	Merge tag 'v5.3-rc3' into drm-next-5.4	Alex Deucher	1	-0/+9
	Linux 5.3-rc3 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02	drm/amdgpu: disable MEC2 JT context init for Arcturus	John Clements	1	-5/+11
	We don't need to handle it like other asics. Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02	drm/amdgpu: removed duplicate line	John Clements	1	-1/+0
	Remove duplicate break. Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02	drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS	Tao Zhou	1	-1/+1
	A ce can also trigger the interrupt, and both ce and ue errors can even be found in one ras query, so distinguishing between ce and ue in the interrupt handler is unnecessary. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Suggested-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02	drm/amdkfd: Extend CU mask to 8 SEs (v3)	Jay Cornwall	1	-0/+4
	Following bitmap layout logic introduced by: "drm/amdgpu: support get_cu_info for Arcturus". v2: squash in fixup for gfx_v9_0.c (Alex) v3: squash in debug print output fix Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
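
Note on the 2019-08-12 entry "drm/amdgpu: add check to avoid array bound issue": the pattern it describes, validating a user-supplied index before using it to index an array, might look roughly like the sketch below. This is not the driver's code. The table contents, the function name lookup_sub_block, and the local ARRAY_SIZE definition are hypothetical and chosen only for illustration; only the name sub_block_index comes from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical table standing in for a per-block list of sub-blocks. */
static const char *const sub_block_names[] = {
	"sub_block_0",
	"sub_block_1",
	"sub_block_2",
};

/* Local helper for this sketch: number of elements in a fixed-size array. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Look up a sub-block name by an index that may come from user level.
 * Returns 0 and stores the name on success, or -EINVAL when the index
 * is past the end of the table. The check runs before the array access,
 * so an out-of-range index never reaches the table.
 */
static int lookup_sub_block(unsigned int sub_block_index, const char **name)
{
	if (sub_block_index >= ARRAY_SIZE(sub_block_names))
		return -EINVAL;

	*name = sub_block_names[sub_block_index];
	return 0;
}
```

Using an unsigned index means a single upper-bound comparison is enough, since a negative value cannot be represented.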