amdgpu driver crashing when starting some games

Hello! I recently got an RX 6900 XT and I’m having an annoying issue where the GPU driver crashes when starting some games. It always happens like this:

  1. Start game
  2. Every monitor freezes but audio keeps playing; this lasts 5-10 seconds
  3. All monitors go black for 2-3 seconds and audio stops
  4. Monitors come back and I’m at the login screen

Anyone got any ideas?

NOTE: It does not happen every time; I’d say roughly 1 in 10 launches, if that. Still often enough to be a massive pain.

EDIT: I found these error messages using dmesg.

[ 45.298913] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:8 vmid:5 pasid:32779, for process r5apex.exe pid 3351 thread dxvk-submit pid 3457)
[ 45.298924] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x00008000041ff000 from client 0x1b (UTCL2)
[ 45.298928] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00500C10
[ 45.298931] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CPG (0x6)
[ 45.298933] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0
[ 45.298936] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0
[ 45.298938] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x1
[ 45.298941] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 45.298943] amdgpu 0000:08:00.0: amdgpu: RW: 0x0
[ 55.316967] [drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx_0.0.0 timeout, signaled seq=19999, emitted seq=20001
[ 55.317302] [drm:amdgpu_job_timedout [amdgpu]] ERROR Process information: process r5apex.exe pid 3351 thread dxvk-submit pid 3457
[ 55.317613] amdgpu 0000:08:00.0: amdgpu: GPU reset begin!
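
For anyone trying to collect the same information, here is a minimal sketch of a filter that reduces kernel log text to the amdgpu page fault, ring timeout, and GPU reset lines like the ones above. The function name amdgpu_fault_lines and the exact pattern are my own assumptions based on the messages in this log; other kernel versions may word these messages differently.

```shell
# Hypothetical helper: reads kernel log text on stdin and keeps only
# amdgpu lines that mention a page fault, a protection fault status,
# a ring timeout, or a GPU reset. The pattern is an assumption derived
# from the log excerpt in this thread.
amdgpu_fault_lines() {
    grep -E 'amdgpu.*(page fault|PROTECTION_FAULT|timeout|GPU reset)'
}
```

It could be fed from the running kernel log with `sudo dmesg | amdgpu_fault_lines`, or from the journal with `journalctl -k -b | amdgpu_fault_lines`. Since the crash drops back to the login screen without a reboot, the current boot's log should still contain the messages.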

Can you share the output of this command?

pacman -Q | grep -E "(amd|radeon|mesa)"

amd-ucode 20230804.7be2766d-2
lib32-mesa 1:23.1.6-4
lib32-vulkan-radeon 1:23.1.6-4
mesa 1:23.1.6-4
mesa-utils 9.0.0-2
vulkan-radeon 1:23.1.6-4
xf86-video-amdgpu 23.0.0-1