<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/drivers/gpu/drm/amd/amdgpu, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/drivers/gpu/drm/amd/amdgpu?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/drivers/gpu/drm/amd/amdgpu?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-11-10T20:31:38Z</updated>
<entry>
<title>Merge tag 'drm-misc-fixes-2022-11-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes</title>
<updated>2022-11-10T20:31:38Z</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2022-11-10T20:31:37Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=2e4b294576e32fb02562ad6839d6888ab7b45102'/>
<id>urn:sha1:2e4b294576e32fb02562ad6839d6888ab7b45102</id>
<content type='text'>
drm-misc-fixes for v6.1-rc5:
- HDMI fixes to vc4.
- Make panfrost's uapi header compile with C++.
- Add rotation quirks for 2 panels.
- Fix s/r in amdgpu_vram_mgr_new
- Handle 1 gb boundary correctly in panfrost mmu code.

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
From: Maarten Lankhorst &lt;maarten.lankhorst@linux.intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/e02de501-4b85-28a0-3f6e-751ca13f5f9d@linux.intel.com
</content>
</entry>
<entry>
<title>drm/amdgpu: Drop eviction lock when allocating PT BO</title>
<updated>2022-11-09T23:06:04Z</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2022-11-02T20:55:31Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e034a0d9aaee5c9129d5dfdfdfcab988a953412d'/>
<id>urn:sha1:e034a0d9aaee5c9129d5dfdfdfcab988a953412d</id>
<content type='text'>
Re-take the eviction lock immediately again after the allocation is
completed, to fix circular locking warning with drm_buddy allocator.

Move amdgpu_vm_eviction_lock/unlock/trylock to amdgpu_vm.h as they are
called from multiple files.

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Unlock bo_list_mutex after error handling</title>
<updated>2022-11-09T23:05:16Z</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2022-11-03T14:24:52Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=64f65135c41a75f933d3bca236417ad8e9eb75de'/>
<id>urn:sha1:64f65135c41a75f933d3bca236417ad8e9eb75de</id>
<content type='text'>
Get below kernel WARNING backtrace when pressing ctrl-C to kill kfdtest
application.

If amdgpu_cs_parser_bos returns error after taking bo_list_mutex, as
caller amdgpu_cs_ioctl will not unlock bo_list_mutex, this generates the
kernel WARNING.

Add unlock bo_list_mutex after amdgpu_cs_parser_bos error handling to
cleanup bo_list userptr bo.

 WARNING: kfdtest/2930 still has locks held!
 1 lock held by kfdtest/2930:
  (&amp;list-&gt;bo_list_mutex){+.+.}-{3:3}, at: amdgpu_cs_ioctl+0xce5/0x1f10 [amdgpu]
  stack backtrace:
   dump_stack_lvl+0x44/0x57
   get_signal+0x79f/0xd00
   arch_do_signal_or_restart+0x36/0x7b0
   exit_to_user_mode_prepare+0xfd/0x1b0
   syscall_exit_to_user_mode+0x19/0x40
   do_syscall_64+0x40/0x80

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: workaround for TLB seq race</title>
<updated>2022-11-09T22:58:17Z</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2022-11-02T13:55:13Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=77c092e054262b594614bad5e5f47e57c5d29639'/>
<id>urn:sha1:77c092e054262b594614bad5e5f47e57c5d29639</id>
<content type='text'>
It can happen that we query the sequence value before the callback
had a chance to run.

Workaround that by grabbing the fence lock and releasing it again.
Should be replaced by hw handling soon.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
CC: stable@vger.kernel.org # 5.19+
Fixes: 5255e146c99a6 ("drm/amdgpu: rework TLB flushing")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2113
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Acked-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Tested-by: Stefan Springer &lt;stefanspr94@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix the lpfn checking condition in drm buddy</title>
<updated>2022-11-08T13:00:18Z</updated>
<author>
<name>Ma Jun</name>
<email>Jun.Ma2@amd.com</email>
</author>
<published>2022-09-14T12:53:31Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e0b26b9482461e9528552f54fa662c2269f75b3f'/>
<id>urn:sha1:e0b26b9482461e9528552f54fa662c2269f75b3f</id>
<content type='text'>
Because the value of man-&gt;size is changed during suspend/resume process,
use mgr-&gt;mm.size instead of man-&gt;size here for lpfn checking.

Signed-off-by: Ma Jun &lt;Jun.Ma2@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20220914125331.2467162-1-Jun.Ma2@amd.com
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Disable GPU reset on SRIOV before remove pci.</title>
<updated>2022-11-02T21:16:25Z</updated>
<author>
<name>Gavin Wan</name>
<email>Gavin.Wan@amd.com</email>
</author>
<published>2022-10-26T17:45:25Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=8fe8ce896c1cc29d6bfebb3c7b3cc948f72cd32c'/>
<id>urn:sha1:8fe8ce896c1cc29d6bfebb3c7b3cc948f72cd32c</id>
<content type='text'>
The recent change brought a bug on SRIOV envrionment. It caused
unloading amdgpu failed on Guest VM. The reason is that the VF
FLR was requested while unloading amdgpu driver, but the VF FLR
of SRIOV sequence is wrong while removing PCI device.

For SRIOV, the guest driver should not trigger the whole XGMI hive
to do the reset. Host driver control how the device been reset.

Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2")
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Shaoyun Liu &lt;Shaoyun.Liu@amd.com&gt;
Signed-off-by: Gavin Wan &lt;Gavin.Wan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: disable GFXOFF during compute for GFX11</title>
<updated>2022-11-02T21:16:11Z</updated>
<author>
<name>Graham Sider</name>
<email>Graham.Sider@amd.com</email>
</author>
<published>2022-10-26T19:08:24Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a3e5ce56f3d260f2ec8e5242c33f57e60ae9eba7'/>
<id>urn:sha1:a3e5ce56f3d260f2ec8e5242c33f57e60ae9eba7</id>
<content type='text'>
Temporary workaround to fix issues observed in some compute applications
when GFXOFF is enabled on GFX11.

Signed-off-by: Graham Sider &lt;Graham.Sider@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org # 6.0.x
</content>
</entry>
<entry>
<title>drm/amd: Fail the suspend if resources can't be evicted</title>
<updated>2022-11-02T21:00:39Z</updated>
<author>
<name>Mario Limonciello</name>
<email>mario.limonciello@amd.com</email>
</author>
<published>2022-10-26T19:03:55Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=8d4de331f1b24a22d18e3c6116aa25228cf54854'/>
<id>urn:sha1:8d4de331f1b24a22d18e3c6116aa25228cf54854</id>
<content type='text'>
If a system does not have swap and memory is under 100% usage,
amdgpu will fail to evict resources.  Currently the suspend
carries on proceeding to reset the GPU:

```
[drm] evicting device resources failed
[drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block &lt;vcn_v3_0&gt; failed -12
[drm] free PSP TMR buffer
[TTM] Failed allocating page table
[drm] evicting device resources failed
amdgpu 0000:03:00.0: amdgpu: MODE1 reset
amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
```

At this point if the suspend actually succeeded I think that amdgpu
would have recovered because the GPU would have power cut off and
restored.  However the kernel fails to continue the suspend from the
memory pressure and amdgpu fails to run the "resume" from the aborted
suspend.

```
ACPI: PM: Preparing to enter system sleep state S3
SLUB: Unable to allocate memory on node -1, gfp=0xdc0(GFP_KERNEL|__GFP_ZERO)
  cache: Acpi-State, object size: 80, buffer size: 80, default order: 0, min order: 0
  node 0: slabs: 22, objs: 1122, free: 0
ACPI Error: AE_NO_MEMORY, Could not update object reference count (20210730/utdelete-651)

[drm:psp_hw_start [amdgpu]] *ERROR* PSP load kdb failed!
[drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
[drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block &lt;psp&gt; failed -62
amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -62
amdgpu 0000:03:00.0: PM: failed to resume async: error -62
```

To avoid this series of unfortunate events, fail amdgpu's suspend
when the memory eviction fails.  This will let the system gracefully
recover and the user can try suspend again when the memory pressure
is relieved.

Reported-by: post@davidak.de
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2223
Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: correct MES debugfs versions</title>
<updated>2022-11-02T20:57:32Z</updated>
<author>
<name>Graham Sider</name>
<email>Graham.Sider@amd.com</email>
</author>
<published>2022-10-25T18:42:13Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e542ca6e3e554bad53b2ea5741873b67f4585ea9'/>
<id>urn:sha1:e542ca6e3e554bad53b2ea5741873b67f4585ea9</id>
<content type='text'>
Use mes.sched_version, mes.kiq_version for debugfs as
mes.ucode_fw_version does not contain correct versioning information.

Signed-off-by: Graham Sider &lt;Graham.Sider@amd.com&gt;
Reviewed-by: Jack Xiao &lt;Jack.Xiao@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: set fb_modifiers_not_supported in vkms</title>
<updated>2022-11-02T20:56:13Z</updated>
<author>
<name>Yifan Zhang</name>
<email>yifan1.zhang@amd.com</email>
</author>
<published>2022-10-24T04:47:47Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=89b3554782e6b65894f0551e9e0a82ad02dac94d'/>
<id>urn:sha1:89b3554782e6b65894f0551e9e0a82ad02dac94d</id>
<content type='text'>
This patch to fix the gdm3 start failure with virual display:

/usr/libexec/gdm-x-session[1711]: (II) AMDGPU(0): Setting screen physical size to 270 x 203
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to make import prime FD as pixmap: 22
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument
/usr/libexec/gdm-x-session[1711]: (WW) AMDGPU(0): Failed to set mode on CRTC 0
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to enable any CRTC
gnome-shell[1840]: Running GNOME Shell (using mutter 42.2) as a X11 window and compositing manager
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument

vkms doesn't have modifiers support, set fb_modifiers_not_supported to bring the gdm back.

Signed-off-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Acked-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Tim Huang &lt;Tim.Huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
