Desktop freezes on AMD notebook

Hi all,

I’m having issues on a new Lenovo notebook equipped with an AMD Ryzen 7 7840U with Radeon 780M Graphics. This is a very recent model.

When I’m working on the desktop, it sometimes flickers and freezes completely. Suddenly, 90% of the display goes black and I can only see the top bar, for example. The workaround is to close the lid and open it again. I’m having trouble identifying the source of the problem. At first I thought it was related to the AMD processor/graphics, then I suspected the network card, a Realtek rtw89, and after that power management. I don’t know.

What I do know is that this issue occurs across different Linux distros and DEs, so it is neither Arch/EndeavourOS-specific nor Cinnamon-related (Cinnamon is what I’m running here).

Here is the output from inxi:

inxi -Fazy
System:
  Kernel: 6.5.8-arch1-1 arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
    clocksource: tsc available: hpet,acpi_pm
    parameters: initrd=\45aed1c1421c4b4ea592eb0880bb1cb5\6.5.8-arch1-1\initrd
    nvme_load=YES nowatchdog rw root=UUID=7f618cba-dad6-41b1-98ac-644763599693
    systemd.machine_id=45aed1c1421c4b4ea592eb0880bb1cb5
  Desktop: Cinnamon v: 5.8.4 tk: GTK v: 3.24.38 vt: 7 dm: LightDM v: 1.32.0
    Distro: EndeavourOS base: Arch Linux
Machine:
  Type: Laptop System: LENOVO product: 82X3 v: Yoga Slim 6 14APU8
    serial: <superuser required> Chassis: type: 10 v: Yoga Slim 6 14APU8
    serial: <superuser required>
  Mobo: LENOVO model: LNVNB161216 v: SDK0T76463 WIN
    serial: <superuser required> UEFI: LENOVO v: M4CN29WW date: 06/21/2023
Battery:
  ID-1: BAT0 charge: 65.6 Wh (100.0%) condition: 65.6/65.0 Wh (100.9%)
    volts: 17.5 min: 15.5 model: Celxpert L22C4PF1 type: Li-poly serial: <filter>
    status: full cycles: 4
CPU:
  Info: model: AMD Ryzen 7 7840U with Radeon 780M Graphics bits: 64
    type: MT MCP arch: Zen 4 gen: 5 level: v4 note: check built: 2022+
    process: TSMC n5 (5nm) family: 0x19 (25) model-id: 0x74 (116) stepping: 1
    microcode: 0xA704101
  Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache:
    L1: 512 KiB desc: d-8x32 KiB; i-8x32 KiB L2: 8 MiB desc: 8x1024 KiB
    L3: 16 MiB desc: 1x16 MiB
  Speed (MHz): avg: 482 high: 1717
    min/max: 400/5132:6076:5447:5289:5605:5918:5760 scaling:
    driver: amd-pstate-epp governor: powersave cores: 1: 400 2: 400 3: 400
    4: 400 5: 400 6: 400 7: 400 8: 400 9: 400 10: 400 11: 400 12: 400 13: 1717
    14: 400 15: 400 16: 400 bogomips: 105441
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow mitigation: safe RET, no microcode
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Enhanced / Automatic IBRS, IBPB: conditional,
    STIBP: always-on, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Phoenix1 vendor: Lenovo driver: amdgpu v: kernel arch: RDNA-3
    code: Phoenix process: TSMC n4 (4nm) built: 2022+ pcie: gen: 4 speed: 16 GT/s
    lanes: 16 ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4, DP-5, DP-6,
    HDMI-A-1 bus-ID: 62:00.0 chip-ID: 1002:15bf class-ID: 0300 temp: 41.0 C
  Device-2: Chicony Integrated Camera driver: uvcvideo type: USB rev: 2.0
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 3-1:2 chip-ID: 04f2:b7b7
    class-ID: fe01 serial: <filter>
  Display: x11 server: X.Org v: 21.1.8 driver: X: loaded: amdgpu
    unloaded: modesetting alternate: fbdev,vesa dri: radeonsi gpu: amdgpu
    display-ID: :0 screens: 1
  Screen-1: 0 s-res: 2880x1800 s-dpi: 96 s-size: 762x476mm (30.00x18.74")
    s-diag: 898mm (35.37")
  Monitor-1: eDP-1 mapped: eDP model: AU Optronics 0x26a4 built: 2022
    res: 2880x1800 hz: 120 dpi: 243 gamma: 1.2 size: 301x188mm (11.85x7.4")
    diag: 355mm (14") ratio: 16:10 modes: max: 2880x1800 min: 640x480
  API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi
    device: 1 drv: swrast surfaceless: drv: radeonsi x11: drv: radeonsi
    inactive: gbm,wayland
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 23.2.1-arch1.2
    glx-v: 1.4 direct-render: yes renderer: AMD Radeon Graphics (gfx1103_r1 LLVM
    16.0.6 DRM 3.54 6.5.8-arch1-1) device-ID: 1002:15bf memory: 1.95 GiB
    unified: no
Audio:
  Device-1: AMD Rembrandt Radeon High Definition Audio vendor: Lenovo
    driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
    bus-ID: 62:00.1 chip-ID: 1002:1640 class-ID: 0403
  Device-2: AMD ACP/ACP3X/ACP6x Audio Coprocessor vendor: Lenovo
    driver: snd_pci_ps v: kernel alternate: snd_pci_acp3x, snd_rn_pci_acp3x,
    snd_pci_acp5x, snd_pci_acp6x, snd_acp_pci, snd_rpl_pci_acp6x,
    snd_sof_amd_renoir, snd_sof_amd_rembrandt pcie: gen: 4 speed: 16 GT/s
    lanes: 16 bus-ID: 62:00.5 chip-ID: 1022:15e2 class-ID: 0480
  Device-3: AMD Family 17h/19h HD Audio vendor: Lenovo driver: snd_hda_intel
    v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 62:00.6
    chip-ID: 1022:15e3 class-ID: 0403
  API: ALSA v: k6.5.8-arch1-1 status: kernel-api
    tools: alsactl,alsamixer,amixer
  Server-1: PipeWire v: 0.3.83 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Realtek vendor: Lenovo driver: rtw89_8852ce v: kernel pcie: gen: 2
    speed: 5 GT/s lanes: 1 port: 6000 bus-ID: 01:00.0 chip-ID: 10ec:c852
    class-ID: 0280
  IF: wlan0 state: up mac: <filter>
Bluetooth:
  Device-1: Realtek Bluetooth Radio driver: btusb v: 0.8 type: USB rev: 1.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-5:2 chip-ID: 0bda:5852
    class-ID: e001 serial: <filter>
  Report: btmgmt ID: hci0 rfk-id: 2 state: down bt-service: disabled
    rfk-block: hardware: no software: no address: N/A
Drives:
  Local Storage: total: 953.87 GiB used: 200.75 GiB (21.0%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: MZVL21T0HCLR-00BL2
    size: 953.87 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: CL1QGXA7 temp: 35.9 C
    scheme: GPT
Partition:
  ID-1: / raw-size: 65.43 GiB size: 63.85 GiB (97.59%) used: 8.83 GiB (13.8%)
    fs: ext4 dev: /dev/nvme0n1p3 maj-min: 259:3
  ID-2: /home raw-size: 97.66 GiB size: 95.56 GiB (97.86%)
    used: 4.04 GiB (4.2%) fs: ext4 dev: /dev/nvme0n1p6 maj-min: 259:6
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: yes
    compressor: zstd max-pool: 20%
  ID-1: swap-1 type: partition size: 31.26 GiB used: 0 KiB (0.0%)
    priority: -2 dev: /dev/nvme0n1p9 maj-min: 259:9
Sensors:
  System Temperatures: cpu: 44.8 C mobo: N/A gpu: amdgpu temp: 40.0 C
  Fan Speeds (rpm): N/A
Info:
  Processes: 324 Uptime: 1h 26m wakeups: 45679 Memory: total: 16 GiB note: est.
  available: 13.34 GiB used: 2.47 GiB (18.5%) Init: systemd v: 254
  default: graphical tool: systemctl Compilers: gcc: 13.2.1 Packages:
  pm: pacman pkgs: 896 libs: 254 tools: yay Shell: Bash v: 5.1.16
  running-in: gnome-terminal inxi: 3.3.30

I’m also providing the journalctl output from the last boot in which I hit this problem. I know that it occurred at least at 12:01:ss and at 12:03:ss. The only thing I see is plenty of Realtek network card errors. Here is the output:

journalctl --boot=-1 --priority=3 --catalog --no-pager
Oct 24 11:55:25 endbox kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.GP18.SATA], AE_NOT_FOUND (20230331/dswload2-162)
Oct 24 11:55:25 endbox kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230331/psobject-220)
Oct 24 11:56:10 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 11:56:10 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 11:59:34 endbox kernel: ucsi_acpi USBC000:00: failed to re-enable notifications (-110)
Oct 24 11:59:44 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 11:59:44 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:01:35 endbox kernel: ucsi_acpi USBC000:00: failed to re-enable notifications (-110)
Oct 24 12:02:50 endbox systemd-coredump[2518]: [🡕] Process 2512 (csd-datetime-me) of user 0 dumped core.
                                               
                                               Stack trace of thread 2512:
                                               #0  0x00007f7973cfbf41 g_type_check_instance_is_fundamentally_a (libgobject-2.0.so.0 + 0x3af41)
                                               #1  0x00007f7973ce369e g_object_unref (libgobject-2.0.so.0 + 0x2269e)
                                               #2  0x00007f7973bc992d n/a (libglib-2.0.so.0 + 0x5492d)
                                               #3  0x00007f7973bcdf50 n/a (libglib-2.0.so.0 + 0x58f50)
                                               #4  0x00007f7973bcf021 n/a (libglib-2.0.so.0 + 0x5a021)
                                               #5  0x00007f7973c2d2b7 n/a (libglib-2.0.so.0 + 0xb82b7)
                                               #6  0x00007f7973bcfb47 g_main_loop_run (libglib-2.0.so.0 + 0x5ab47)
                                               #7  0x000056051a0b018a n/a (csd-datetime-mechanism + 0x418a)
                                               #8  0x00007f797399dcd0 n/a (libc.so.6 + 0x27cd0)
                                               #9  0x00007f797399dd8a __libc_start_main (libc.so.6 + 0x27d8a)
                                               #10 0x000056051a0b02c5 n/a (csd-datetime-mechanism + 0x42c5)
                                               
                                               Stack trace of thread 2513:
                                               #0  0x00007f7973a8473d syscall (libc.so.6 + 0x10e73d)
                                               #1  0x00007f7973c28247 g_cond_wait (libglib-2.0.so.0 + 0xb3247)
                                               #2  0x00007f7973b9a1b4 n/a (libglib-2.0.so.0 + 0x251b4)
                                               #3  0x00007f7973c02a2e n/a (libglib-2.0.so.0 + 0x8da2e)
                                               #4  0x00007f7973c009a5 n/a (libglib-2.0.so.0 + 0x8b9a5)
                                               #5  0x00007f7973a029eb n/a (libc.so.6 + 0x8c9eb)
                                               #6  0x00007f7973a867cc n/a (libc.so.6 + 0x1107cc)
                                               
                                               Stack trace of thread 2514:
                                               #0  0x00007f7973a78f6f __poll (libc.so.6 + 0x102f6f)
                                               #1  0x00007f7973c2d206 n/a (libglib-2.0.so.0 + 0xb8206)
                                               #2  0x00007f7973bcd112 g_main_context_iteration (libglib-2.0.so.0 + 0x58112)
                                               #3  0x00007f7973bcd162 n/a (libglib-2.0.so.0 + 0x58162)
                                               #4  0x00007f7973c009a5 n/a (libglib-2.0.so.0 + 0x8b9a5)
                                               #5  0x00007f7973a029eb n/a (libc.so.6 + 0x8c9eb)
                                               #6  0x00007f7973a867cc n/a (libc.so.6 + 0x1107cc)
                                               
                                               Stack trace of thread 2516:
                                               #0  0x00007f7973a78f6f __poll (libc.so.6 + 0x102f6f)
                                               #1  0x00007f7973c2d206 n/a (libglib-2.0.so.0 + 0xb8206)
                                               #2  0x00007f7973bcfb47 g_main_loop_run (libglib-2.0.so.0 + 0x5ab47)
                                               #3  0x00007f7973e350bc n/a (libgio-2.0.so.0 + 0x1120bc)
                                               #4  0x00007f7973c009a5 n/a (libglib-2.0.so.0 + 0x8b9a5)
                                               #5  0x00007f7973a029eb n/a (libc.so.6 + 0x8c9eb)
                                               #6  0x00007f7973a867cc n/a (libc.so.6 + 0x1107cc)
                                               ELF object binary architecture: AMD x86-64
░░ Subject: Process 2512 (csd-datetime-me) dumped core
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ Documentation: man:core(5)
░░ 
░░ Process 2512 (csd-datetime-me) crashed and dumped core.
░░ 
░░ This usually indicates a programming error in the crashing program and
░░ should be reported to its vendor as a bug.
Oct 24 12:03:06 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:03:06 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:03:34 endbox kernel: ucsi_acpi USBC000:00: failed to re-enable notifications (-110)
Oct 24 12:04:05 endbox kernel: ucsi_acpi USBC000:00: failed to re-enable notifications (-110)
Oct 24 12:04:20 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:04:20 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:06:33 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:06:33 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:17:28 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:17:28 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:18:38 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:18:38 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:24:20 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:24:20 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:26:24 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:26:24 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:28:43 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:28:43 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:30:35 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:30:35 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:33:46 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:33:46 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:36:45 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:36:45 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:47:35 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:47:35 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:54:07 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:54:07 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:54:22 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:54:22 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:55:55 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:55:55 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:56:26 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:56:26 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 12:59:30 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 12:59:31 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:00:24 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 13:00:24 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:01:34 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 13:01:34 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:06:41 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 13:06:41 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:07:34 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 13:07:34 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:07:44 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 13:07:44 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:09:22 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 24 13:09:22 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 24 13:14:18 endbox cinnamon-screensaver-pam-helper[3549]: pam_unix(cinnamon-screensaver:auth): conversation failed
Oct 24 13:14:18 endbox cinnamon-screensaver-pam-helper[3549]: pam_unix(cinnamon-screensaver:auth): auth could not identify password for [l13]

Can anyone help here? Or give some clues on where to look? This is pretty annoying. Many thanks.

As a user, the best I could do is a small search for your recent and powerful chip.
There’s this that came up:
https://bbs.archlinux.org/viewtopic.php?pid=2127613

https://wiki.archlinux.org/title/AMDGPU

You might want to collect more info and exchange with more knowledgeable volunteers :wink:

Are all your temperatures OK?
$ watch sensors

My system froze at 15:09:ss. I ran journalctl -b to capture everything, and nothing is there. I closed the lid at 15:10:04, and the last error message before that came at 15:08:53. This is strange to me.

Oct 25 15:08:40 endbox kernel: rtw89_8852ce 0000:01:00.0: SER catches error: 0x1000
Oct 25 15:08:40 endbox kernel: rtw89_8852ce 0000:01:00.0: firmware failed to ack for leaving ps mode
Oct 25 15:08:40 endbox kernel: rtw89_8852ce 0000:01:00.0: SER catches error: 0x1001
Oct 25 15:08:40 endbox kernel: rtw89_8852ce 0000:01:00.0: SER catches error: 0x1002
Oct 25 15:08:40 endbox kernel: rtw89_8852ce 0000:01:00.0: c2h class 1 func 3 not support
Oct 25 15:08:53 endbox kernel: rtw89_8852ce 0000:01:00.0: no tx fwcmd resource
Oct 25 15:08:53 endbox kernel: rtw89_8852ce 0000:01:00.0: failed to send h2c
Oct 25 15:10:04 endbox systemd-logind[637]: Lid closed.
Oct 25 15:10:04 endbox systemd-logind[637]: The system will suspend now!
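
One way to narrow the search is to look at all kernel messages in a short window around the freeze, not only those at error priority, and then filter for graphics-related lines. This is a minimal sketch, not a definitive recipe: the helper names gpu_lines and kernel_window are made up for illustration, and the search pattern is only a guess at what GPU-related messages would contain.

```shell
# Hypothetical helper: keep only lines that mention the GPU driver or DRM.
# '|| true' keeps an empty result from counting as a failure.
gpu_lines() {
    grep -iE 'amdgpu|drm' || true
}

# Hypothetical helper: kernel messages of the current boot between two times.
# Usage: kernel_window SINCE UNTIL   (e.g. "15:05" "15:11")
kernel_window() {
    journalctl -k -b --since "$1" --until "$2" --no-pager
}
```

For the freeze described above, this could be called as kernel_window "15:05" "15:11" | gpu_lines on the same day. If nothing shows up, the unfiltered output of kernel_window is still worth reading.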

Sounds like GPU resets, and I guess you have the Gen 4?
Arch wiki page for the Gen 3 AMD model:
https://wiki.archlinux.org/title/Lenovo_ThinkPad_T14s_(AMD)_Gen_3

After many attempts at changing the configuration, I noticed that there is no crash if I don’t use the highest resolution. My monitor is capable of 2880 x 1800 (16:10) @ 120 Hz. I usually use it with 200% scaling. If I opt for 1920 x 1200 @ 120 Hz, even with 125% scaling, it doesn’t crash any more. I have no clue what causes this. Any ideas on where to look?

I run sway and for me it seems to happen when I use some specific xwayland applications.

Birthing pains. As usual it will take 6-12 months to iron out all the quirks that come with a new CPU after it hits the wild. Best bet is to just make sure you are rolling the latest kernel.

And after many failed attempts I was able to put an end to it.

I added the kernel parameter:

amdgpu.dcdebugmask=0x10

… and it’s gone, for good!

In the specific case of EndeavourOS, using Dracut and systemd-boot, I added that parameter to /etc/kernel/cmdline to make the fix permanent. If you use GRUB, you need to add it to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub instead and then regenerate the GRUB configuration.
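
For anyone who wants to script the systemd-boot case, here is a minimal sketch. It assumes the cmdline file is a single line of space-separated parameters; the function name add_kernel_param is made up for illustration. As far as I can tell, the 0x10 bit of dcdebugmask disables Panel Self Refresh (PSR) in the amdgpu display code, but treat that as an assumption and check the driver documentation for your kernel.

```shell
# Hypothetical helper: append a kernel parameter to a single-line cmdline
# file unless it is already present as a whole word.
# Usage: add_kernel_param FILE PARAM
add_kernel_param() {
    file=$1
    param=$2
    line=$(cat "$file")
    # Pad with spaces so only an exact, whole-parameter match counts.
    case " $line " in
        *" $param "*) return 0 ;;   # already there, nothing to do
    esac
    # Rewrite the file with the parameter appended (no leading space if empty).
    printf '%s\n' "${line:+$line }$param" > "$file"
}
```

On a real system this would be run as root against /etc/kernel/cmdline with amdgpu.dcdebugmask=0x10 as the parameter, after making a backup copy. The boot entries then need to be regenerated (on EndeavourOS I believe this is done with sudo reinstall-kernels, but check your distro's documentation). After a reboot, cat /proc/cmdline should list the parameter.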
