Lenovo ThinkPad 14s AMD Ryzen Pro 7 AI Gen 6: random crash and reboot

Hello all,

I have been using EndeavourOS KDE on my new laptop (Lenovo ThinkPad 14s AMD Ryzen 7 Pro AI) and it is great, apart from a strange behaviour that started happening lately.

The system is fully updated and right now I have both linux-lts 6.12.20-1 and linux 6.13.8.arch1-1 installed. Here is the problem: at random, the image freezes for a few seconds, then the screen turns black for a few more seconds. Then the image comes back but everything is frozen (i.e. keyboard and mouse, both external and trackpad). At this point one of two things happens: either the system unfreezes and I can use it again, or the screen goes black again, only this time the system reboots.

This problem first started when I installed the 6.13 kernel (I don't remember the exact point release when it started). So I just switched to the LTS kernel (6.12) and it worked fine. Until today, when the LTS kernel was updated to 6.12.20 and, bam, system freeze and all the symptoms above. I managed to look at journalctl right after the reboot and this is what was in there:

mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Dumping IP State
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Dumping IP State Completed
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring vcn_unified_0 timeout, signaled seq=2917, emitted seq=2919
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Process information: process RDD Process pid 2937 thread firefox:cs0 pid 14438
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: GPU reset begin!
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: MODE2 reset
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: GPU reset succeeded, trying to resume
mar 24 09:34:16 thor kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000900000).
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: SMU is resuming...
mar 24 09:34:16 thor kernel: amdgpu 0000:c3:00.0: amdgpu: SMU is resumed successfully!
mar 24 09:34:16 thor kernel: [drm] DMUB hardware initialized: version=0x09001700
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring jpeg_dec_0 uses VM inv eng 1 on hub 8
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: ring vpe uses VM inv eng 4 on hub 8
mar 24 09:34:17 thor kernel: amdgpu 0000:c3:00.0: amdgpu: GPU reset(1) succeeded!
mar 24 09:34:22 thor kernel: amdgpu 0000:c3:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
mar 24 09:34:22 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Failed to disable gfxoff!
mar 24 09:34:27 thor kernel: amdgpu 0000:c3:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
mar 24 09:34:27 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Failed to power gate VPE!
mar 24 09:34:27 thor kernel: [drm:vpe_set_powergating_state [amdgpu]] *ERROR* Dpm disable vpe failed, ret = -62.
mar 24 09:34:32 thor kernel: amdgpu 0000:c3:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001A SMN_C2PMSG_82:0x00000000
mar 24 09:34:32 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Failed to power gate JPEG!
mar 24 09:34:32 thor kernel: [drm:jpeg_v4_0_5_set_powergating_state [amdgpu]] *ERROR* Dpm disable jpeg failed, ret = -62.
mar 24 09:34:33 thor kernel: amdgpu 0000:c3:00.0: amdgpu: Dumping IP State

I am not sure what else I should post here but if someone needs more info I can post more. Has anyone encountered something like this?

Cheers

Can you file a report on the kernel Bugzilla?

I have never made one. What is the procedure?

https://bugzilla.kernel.org/
Probably the wrong place.

Arch Linux will not fix errors in the kernel amdgpu driver …
You can get the kernel config file with: zcat /proc/config.gz

I see. OK.

First log in,
then pick the component from the list (here the DRI driver, non-Intel),
and attach the report (journalctl) plus the kernel config (config.gz), the kernel version and the firmware version.
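
A full journal dump can be long, so it may help to also attach a trimmed copy with only the GPU-related kernel lines. The following is a minimal sketch, not an official procedure: `gpu_lines` is a hypothetical helper name, and the pattern simply assumes the log format shown in the first post (`kernel: amdgpu ...` and `kernel: [drm...`).

```shell
# Hypothetical helper: keep only amdgpu/drm kernel lines from a journal
# dump read on stdin. The pattern is an assumption based on the log above.
gpu_lines() {
    grep -E 'kernel: (amdgpu|\[drm)'
}

# Usage (kernel messages of the previous boot):
#   journalctl -k -b -1 --no-pager | gpu_lines > amdgpu-log.txt
```

The full journal is still worth attaching, since the developers may want the lines around the reset as well.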

amdgpu issues should be reported here, I guess

If you don’t mind, can you walk me through how to obtain all this info? I have never done something like this. Also, I assume I have to wait for the next crash to do this, yes?

Not really,
just run:

inxi -Fza > inxi.txt
sudo journalctl -b -1 > journalctl.txt
zcat /proc/config.gz > config.txt
pacman -Qs firmware > firmware.txt

and provide the files
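
If it is easier, the commands above can be wrapped in a small script that puts everything in one directory. This is only a sketch under some assumptions: the `collect` helper, the `OUTDIR` variable and the `amdgpu-report` directory name are made up for illustration, and the kernel config is saved as `config.txt` because `zcat` writes uncompressed text. `journalctl -b -1` may need `sudo` (or membership in the `systemd-journal` group) to see the full system log.

```shell
#!/bin/sh
# Hypothetical helper script: gather the requested files into one directory.
outdir="${OUTDIR:-amdgpu-report}"
mkdir -p "$outdir"

# Run a command only if its program exists; save its stdout to the named file.
collect() {
    file="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$outdir/$file" 2>/dev/null || echo "warning: '$*' exited non-zero" >&2
    else
        echo "skipped: $1 not found" >&2
    fi
}

collect inxi.txt       inxi -Fza
collect journalctl.txt journalctl -b -1        # log of the previous boot
collect config.txt     zcat /proc/config.gz    # kernel build configuration
collect firmware.txt   pacman -Qs firmware     # installed firmware packages
collect kernel.txt     uname -r                # running kernel version
```

Run it from the kernel that showed the crash, so that `uname -r` and the config match the journal, then attach the files from the output directory to the report.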