aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-06-13amdgpu: no need to check return value of debugfs_create functionsGreg Kroah-Hartman1-2/+2
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: xinhui pan <xinhui.pan@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Feifei Xu <Feifei.Xu@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-11drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported()Dan Carpenter1-0/+2
The "block" variable can be set by the user through debugfs, so it can be quite large which leads to shift wrapping here. This means we report a "block" as supported when it's not, and that leads to array overflows later on. This bug is not really a security issue in real life, because debugfs is generally root only. Fixes: 36ea1bd2d084 ("drm/amdgpu: add debugfs ctrl node") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24drm/amdgpu: ras support suspend/resumexinhui pan1-1/+3
add ras suspend function. rename ras_post_init to amdgpu_ras_resume. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24drm/amdgpu: add badpages sysfs interafcexinhui pan1-0/+1
add badpages node. it will output badpages list in format gpu pfn : gpu page size : flags example 0x00000000 : 0x00001000 : R 0x00000001 : 0x00001000 : R 0x00000002 : 0x00001000 : R 0x00000003 : 0x00001000 : R 0x00000004 : 0x00001000 : R 0x00000005 : 0x00001000 : R 0x00000006 : 0x00001000 : R 0x00000007 : 0x00001000 : P 0x00000008 : 0x00001000 : P 0x00000009 : 0x00001000 : P flags can be one of below characters R: reserved. P: pending for reserve. F: failed to reserve for some reasons. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24drm/amdgpu: handle ras resetxinhui pan1-0/+3
add another flag to allow IP do a gpu reset after device init. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24drm/amdgpu: Revert "drm/amdgpu: skip gpu reset when ras error occured"xinhui pan1-3/+0
Enable this now to reset the GPU on RAS errors. This reverts commit 138352e5752aa3e694951d70c8fe8730219f4edf. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-10drm/amdgpu: Introduce another ras enable functionxinhui pan1-0/+3
Many parts of the whole SW stack can program the ras enablement state during the boot. Now we handle that case by adding one function which check the ras flags and choose different code path. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-27drm/amdgpu: Fix amdgpu ras to ta enums conversionxinhui pan1-0/+56
Add helpes to transalte the two enums. And it will catch bugs easily. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19drm/amdgpu: add new ras workflow control flagsxinhui pan1-0/+3
add ras post init function. Do some initialization after all IP have finished their late init. Add new member flags which will control the ras work flow. For now, vbios enable ras for us on boot. That might change in the future. So there should be a flag from vbios to tell us if ras is enabled or not on boot. Looks like there is no such info now. Other bits of the flags are reserved to control other parts of ras. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19drm/amdgpu: add new member hw_supportedxinhui pan1-0/+3
Currently, it is not clear how ras is supported. Both software and hardware can set the supported. That is confusing. Fix it by adding new member hw_supported. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19drm/amdgpu: skip gpu reset when ras error occuredxinhui pan1-0/+3
gpu reset is not stable on vega20 A1. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19drm/amdgpu: add debugfs ctrl nodexinhui pan1-0/+9
allow userspace enable/disable ras Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19drm/amdgpu: add amdgpu_ras.c to support ras (v2)xinhui pan1-0/+217
add obj management. add feature control. add debugfs infrastructure. add sysfs infrastructure. add IH infrastructure. add recovery infrastructure. It is a framework. Other IPs need call amdgpu_ras_xxx function instead of psp_ras_xxx functions. v2: squash in warning fixes Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>