Random Crashes And Freezes With KDE Plasma

I’m on a custom-built PC with an RX 7800 XT and a Ryzen 9 7900X. Compatibility of all parts was checked with an online configurator, but I assembled it myself. I also did a fresh install of EndeavourOS, just to avoid old configs from my NVIDIA system.

It happens completely at random while using Steam, Firefox, Kate… Suddenly the system freezes, then after a few seconds I’m back at the Plasma login screen. Sometimes I even see some kernel messages just before the login prompt. And sometimes the PC crashes so hard that I have to restart it manually.

The crashes happen on both X11 and Wayland, with no system load as well as under heavy load.

Some errors from dmesg:

[  486.258693] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=113030, emitted seq=113032
[  486.258879] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 3266 thread kwin_x11:cs0 pid 3297
[  486.259021] amdgpu 0000:12:00.0: amdgpu: GPU reset begin!
[  486.324038] amdgpu 0000:12:00.0: amdgpu: MODE2 reset
[  486.331634] amdgpu 0000:12:00.0: amdgpu: GPU reset succeeded, trying to resume
[  486.331742] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[  486.331761] [drm] VRAM is lost due to GPU reset!
...
[  488.819260] amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
...
[ 1693.914412] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=117898, emitted seq=117900
[ 1693.914587] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 5806 thread kwin_x11:cs0 pid 5837
...
[ 1694.425861] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
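
For anyone wanting to pull the same kind of lines out of their own kernel log, here is a minimal sketch. The function name `filter_gpu_faults` is made up for this example, and the patterns are only the ones that appear in the excerpt above, so other amdgpu failures may not be caught.

```shell
# Keep only graphics-related error and reset lines from kernel log text on stdin.
# The patterns are taken from the excerpt above; this is not an exhaustive list.
filter_gpu_faults() {
    grep -E 'amdgpu|\[drm' | grep -E '\*ERROR\*|GPU reset|VRAM is lost'
}

# Typical use (not run here):
#   journalctl -k -b -1 --no-pager | filter_gpu_faults   # kernel log of the previous boot
#   sudo dmesg | filter_gpu_faults                       # kernel log of the current boot
```

The previous-boot form is the useful one after a hard freeze, since the current boot’s log starts clean; it assumes the journal is kept across reboots.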

Could someone please help me? I don’t even know where to begin looking for a solution.

Try installing the LTS kernel and making it the default.
Try a system update from the Welcome app.
Are you on an NVIDIA card?

I’ll try that, thanks. I’m on AMD now, and I did a fresh install of EndeavourOS (the install option without NVIDIA support).

With the LTS kernel the system crashes just as often, and in much the same way, as with the regular Arch kernel.

Something else I noticed: every time Steam shows that little message in the bottom right corner, KWin crashes due to a graphics reset.

What is the hardware info? Post the URL. It is better to see all of the info rather than just the processor and GPU.

inxi -Faz | eos-sendlog

Do you even have an NVIDIA GPU? You said you have an RX 7800 XT.

Yes, I’m on AMD. Sorry for the confusion.

This is my hardware: https://0x0.st/XOIA.txt

Your UEFI BIOS is quite dated. There are 7 newer updates with newer AMD AGESA versions. I would look at the list and update. The latest few add support for the newer AM5 processors (8000 series).

Edit: You should be installing with the default menu entry on the live ISO, not the NVIDIA one, unless you have an NVIDIA GPU. There are a number of things to set up for amdgpu if you want Vulkan, hardware acceleration, etc.

If you need help with this, just ask. The Arch Wiki provides information on that, but it’s not always helpful, as some people follow steps that are not needed or are already implemented.

https://wiki.archlinux.org/title/AMDGPU#Installation

Edit: You have newer high-end hardware. Use the default UEFI settings. Until you get to know how your hardware works and performs, this is best.

Great, thanks a lot! Yes, I did that, I installed the default ISO without NVIDIA.

Thank you, I’ll look into it!

You also have integrated graphics. How do you have the UEFI BIOS set for the dedicated GPU: Auto or Dedicated only?

I just checked: I had plugged the monitor into the CPU’s HDMI port (on the motherboard) and not into the GPU’s. I changed that, and now I see my dedicated graphics under “About this system”. The system feels more stable now, too.

If that was the problem, I’m really sorry to have wasted your time.

If you use the HDMI output on the motherboard, the picture comes from the onboard integrated graphics. You should try running it off the dedicated GPU and set the UEFI BIOS to use only the dedicated card.
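
A quick way to see which card a monitor is actually attached to is to read the connector status files the kernel exposes under /sys/class/drm. This is only a sketch: the function name `list_connected_outputs` is made up for this example, and it assumes the usual `cardN-CONNECTOR/status` layout.

```shell
# Print every DRM connector whose status file reads "connected".
# $1 is the sysfs DRM directory; it defaults to /sys/class/drm.
list_connected_outputs() {
    drm_root="${1:-/sys/class/drm}"
    for status_file in "$drm_root"/card*-*/status; do
        # Skip unreadable entries (also covers the case where the glob matched nothing).
        [ -r "$status_file" ] || continue
        if [ "$(cat "$status_file")" = "connected" ]; then
            # The directory name looks like card0-HDMI-A-1.
            basename "$(dirname "$status_file")"
        fi
    done
}

list_connected_outputs
```

The `cardN` prefix identifies the GPU. Running `readlink -f /sys/class/drm/card0/device` (or card1) should show the PCI address behind it, which can be compared with the bus IDs that inxi reports for the dedicated and the integrated GPU.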

Alright, did that, so far everything seems to work fine. Thank you!

Now show inxi -Ga

Graphics:
  Device-1: AMD Navi 32 [Radeon RX 7700 XT / 7800 XT] vendor: Sapphire
    driver: amdgpu v: kernel arch: RDNA-3 code: Navi-3x process: TSMC n5 (5nm)
    built: 2022+ pcie: gen: 4 speed: 16 GT/s lanes: 16 ports: active: HDMI-A-2
    empty: DP-1, DP-2, HDMI-A-1, Writeback-1 bus-ID: 03:00.0
    chip-ID: 1002:747e class-ID: 0300
  Device-2: AMD Raphael vendor: ASUSTeK driver: amdgpu v: kernel
    arch: RDNA-2 code: Navi-2x process: TSMC n7 (7nm) built: 2020-22 pcie:
    gen: 4 speed: 16 GT/s lanes: 16 ports: active: none
    empty: DP-3,HDMI-A-3,Writeback-2 bus-ID: 12:00.0 chip-ID: 1002:164e
    class-ID: 0300 temp: 41.0 C
  Display: wayland server: X.org v: 1.21.1.13 with: Xwayland v: 24.1.1
    compositor: kwin_wayland driver: X: loaded: amdgpu
    unloaded: modesetting,radeon alternate: fbdev,vesa dri: radeonsi
    gpu: amdgpu,amdgpu display-ID: 0
  Monitor-1: HDMI-A-2 res: 1920x1080 size: N/A modes: N/A
  API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi
    device: 1 drv: radeonsi device: 2 drv: swrast gbm: drv: kms_swrast
    surfaceless: drv: radeonsi wayland: drv: radeonsi x11: drv: radeonsi
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.1.5-arch1.1
    glx-v: 1.4 direct-render: yes renderer: AMD Radeon RX 7800 XT (radeonsi
    navi32 LLVM 18.1.8 DRM 3.57 6.10.2-arch1-1) device-ID: 1002:747e
    memory: 15.62 GiB unified: no display-ID: :1.0
  API: Vulkan v: 1.3.279 layers: 4 device: 0 type: discrete-gpu name: AMD
    Radeon RX 7800 XT (RADV NAVI32) driver: mesa radv v: 24.1.5-arch1.1
    device-ID: 1002:747e surfaces: xcb,xlib,wayland device: 1
    type: integrated-gpu name: AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO)
    driver: mesa radv v: 24.1.5-arch1.1 device-ID: 1002:164e
    surfaces: xcb,xlib,wayland

Looks good. There are some other things that can be done for gaming, plus accelerated video decoding and hardware acceleration. Those need a few extra packages, some configuration settings, and the tools to verify that everything is working.

It’s all in the link if it’s something you need. Just ask if you don’t understand something.

https://wiki.archlinux.org/title/AMDGPU#Installation

Is the part under “Experimental” necessary or recommended for gaming?

Also, do I need to implement SI and CSI support? Since I’m on the RX 7000 series, I don’t need support for the HD 7000 series, right? This is my first time using AMD, so I’m not yet familiar with its terminology.

No … I wouldn’t say so.

You mean SI and CIK? No, because your GPU isn’t one of those.

The only things I’m recommending, if you are gaming using Steam, are the 32-bit lib files for Mesa, vulkan-radeon and lib32-vulkan-radeon. Then you might want to set up hardware acceleration.

So, to run through it: if this is the case, I would install everything below, reboot once it is all done, and only then verify that it is working. If done properly, each of the verification commands should produce output.

lib32-mesa
vulkan-radeon
lib32-vulkan-radeon

For accelerated video (VA-API and VDPAU):

libva-mesa-driver
lib32-libva-mesa-driver
mesa-vdpau
lib32-mesa-vdpau
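
As a sketch, both package lists above can be installed in one pacman call. The package names are copied from this post and may differ on newer Arch releases. The variable and function names are made up for this example, and the lib32 packages assume the multilib repository is enabled in /etc/pacman.conf.

```shell
# Package names copied from the post above; they may change between Arch releases.
gaming_pkgs="lib32-mesa vulkan-radeon lib32-vulkan-radeon"
video_pkgs="libva-mesa-driver lib32-libva-mesa-driver mesa-vdpau lib32-mesa-vdpau"

# Install both groups; --needed skips packages that are already up to date.
install_gpu_pkgs() {
    # The variables are left unquoted on purpose so each name becomes its own argument.
    sudo pacman -S --needed $gaming_pkgs $video_pkgs
}
```

Calling `install_gpu_pkgs` runs the actual installation. Nothing is installed just by defining the function.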

To verify VA-API, you need to install

libva-utils

Then, once everything is done and you have rebooted, run

vainfo

To verify VDPAU, you need to install

vdpauinfo

Then, once everything is done and you have rebooted, run

vdpauinfo

Before verifying either of the above, you need to set the following in

/etc/environment

add

LIBVA_DRIVER_NAME=radeonsi
VDPAU_DRIVER=radeonsi
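
To double-check that both lines really ended up in the file and that the running session sees them, something along these lines could be used. It is a sketch: the helper name `has_env_line` is made up for this example, and it only looks for the exact lines shown above.

```shell
# Succeed if FILE ($1) contains LINE ($2) as a complete, uncommented line.
has_env_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null
}

for line in LIBVA_DRIVER_NAME=radeonsi VDPAU_DRIVER=radeonsi; do
    if has_env_line /etc/environment "$line"; then
        echo "ok:      $line"
    else
        echo "missing: $line"
    fi
done

# /etc/environment is read at login, so the current session only sees
# the values after logging in again (or rebooting).
echo "session: LIBVA_DRIVER_NAME=${LIBVA_DRIVER_NAME:-<unset>} VDPAU_DRIVER=${VDPAU_DRIVER:-<unset>}"
```

If the file shows "ok" but the session line still says `<unset>`, the variables have not been picked up by the current login yet.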

To check Vulkan, run

vulkaninfo

The vulkaninfo command comes from the vulkan-tools package. I think it gets pulled in when installing vulkan-radeon, so the command should work after doing all of the above. If it is missing, install vulkan-tools yourself.
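
Before rebooting, it can save a round trip to confirm that the three verification commands are actually installed. This is a small sketch with a made-up helper name, `missing_tools`. As far as I know, on Arch vainfo comes from libva-utils, vdpauinfo from the vdpauinfo package, and vulkaninfo from vulkan-tools.

```shell
# Print each named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Prints nothing when all three verification tools are available.
missing_tools vainfo vdpauinfo vulkaninfo
```

Any name it prints needs its package installed first. For a shorter Vulkan report, `vulkaninfo --summary` lists just the detected devices and drivers.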

So, just to clarify: you have to do everything first, then reboot. If you have done it all correctly, the commands to check and verify will give you output.

Edit: If anything isn’t clear just let me know.

[jon@pc-eos ~]$ vdpauinfo
display: :1   screen: 0
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
Error creating VDPAU device: 1
[jon@pc-eos ~]$ 

What does that mean? I followed all the steps before, and also rebooted the computer.