System soft lock on kernel 5.15

Hi everyone, I’m new to the forums but have been using EndeavourOS since Antergos dropped support.

I’m having an issue with the latest kernel version and decided to try my luck here before trying other places. If this isn’t the right place to ask, just point me in the right direction and I’ll gladly head that way :smiley:

As the title mentions, the system soft-locks after logging in ever since I upgraded to kernel 5.15 (any patch version). After a few minutes I can’t interact with the taskbar or open new windows, and any command I run in the terminal that requires superuser privileges just hangs.

Also, most of the time when logging in, the keyboard, the mouse, or both fail to register any input. Weirdly enough, SysRq key combinations still work even when the keyboard is not registering input.

My hardware is the following:

  • Ryzen 9 5950X
  • 64 GB Crucial Ballistix RAM
  • Vega 64 / RX 6900 XT (for VM passthrough)
  • Asus Dark Hero motherboard
  • 2x Corsair MP600 Force NVMe 1 TB disks

I’m using the linux-zen kernel with KDE Plasma 5.23.4 (X11) and LightDM.

Using kernel 5.14.16 makes the system usable again, so I have downgraded the linux-zen package and headers to that version. I still update the regular linux kernel package to test whether things improve with newer versions.

I managed to get the output of dmesg before the system locked. I can certainly see many errors, but I’m not sure where to go from there. Here is a link for the full log, in case it helps.

Feel free to ask for any other information that might be needed. So far I haven’t seen anyone else having this issue.

Thanks in advance!

Looks to be related to your network interface driver:

[  +0,000004] Workqueue: events rtl8125_esd_task [r8125]

This could be a regression in the driver itself, something like power saving, or another application is triggering a bug, e.g.:

[  +0,000000] task:P2P_DISCOVER    state:D stack:    0 pid: 4783 ppid:     1 flags:0x00000000

(Things like torrent traffic can play havoc with network adapters, particularly wifi.)
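To check whether other tasks are stuck in the same uninterruptible sleep (the `state:D` in the line above), something along these lines might help. It is only a rough sketch: `d_state_tasks` is a made-up helper name, and it assumes the procps `ps` with its usual `pid`, `stat` and `comm` columns.

```shell
# Print "PID COMMAND" for every task whose state starts with D
# (uninterruptible sleep). Reads "pid stat comm" lines on stdin.
# d_state_tasks is a hypothetical helper name, not an existing tool.
d_state_tasks() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# Feed it the live process list; the trailing "=" suppresses the headers.
ps -eo pid=,stat=,comm= | d_state_tasks
```

An occasional D-state task is normal during disk or network I/O; tasks that stay in the list for minutes are the ones worth comparing against the dmesg output.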

If 5.15 is having the issue then switching to 5.10 for the time being would get you a maintained kernel, and doing a kernel bisection between 5.14.16 and 5.15.0 could be useful to find the specific change that triggered the problem (which can then be reported to the kernel/driver developers).

Doesn’t the r8169 kernel module work on this ethernet?

No idea.

Can we check which driver is loaded?

lsmod | grep r816

If it’s r8168 then remove that package and see if the built-in r8169 works better (it probably will).
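If grepping for one module name turns up nothing, a more general check is to ask sysfs which driver each interface is bound to. This is a minimal sketch, assuming the usual `/sys/class/net/<iface>/device/driver` symlink layout; `list_nic_drivers` is a hypothetical helper name, and the optional argument exists only so a different directory can be passed in.

```shell
# Print "<interface> <driver>" for every network interface.
# list_nic_drivers is a hypothetical helper; the optional first argument
# overrides the sysfs directory and defaults to /sys/class/net.
list_nic_drivers() {
    net_dir="${1:-/sys/class/net}"
    for iface in "$net_dir"/*; do
        [ -e "$iface" ] || continue
        if [ -L "$iface/device/driver" ]; then
            # The symlink target's last path component is the driver name.
            driver=$(basename "$(readlink -f "$iface/device/driver")")
        else
            # Virtual interfaces (bridges, lo, ...) have no device driver.
            driver="(none)"
        fi
        printf '%s %s\n' "$(basename "$iface")" "$driver"
    done
}

list_nic_drivers
```

Physical ports should show a driver name such as r8169 or r8168, while bridges and loopback show `(none)`.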

Something I forgot to mention: I have a bridge network interface that connects my physical network interfaces and the virtual network interface of my Windows VM, so the VM sits on the same network as the host instead of behind a NAT.

As for

lsmod | grep r816

there is no output, but if I look for the previously mentioned driver (r8125), it’s properly loaded.

The only way I managed to get the 2.5 Gbps ethernet port to work was to install the mentioned driver (r8125) from the AUR. Unfortunately, I had already tested removing the package, but all I got was a broken bridge network :frowning:

So the driver is needed, but looks like it doesn’t work with kernel 5.15. Therefore, same options apply: use linux-lts until it’s fixed, report an issue to the upstream driver developers, bisect the kernel to find the offending change.

@jonathon I’m not sure it’s driver related, because the system locks even without it. For now, reverting to kernel 5.14 is working fine (I need it for the amdgpu hotplug fix that was added), but I also installed the LTS kernel to be on the safe side.

Although I’ve been daily-driving Linux for quite a few years, I’ve never bisected the kernel before, since it’s been quite stable (up until now). I will give it a try (Google surely knows how), but any pointer to the right documentation would be great :slight_smile:

In any case, thanks for the help, really appreciated :smiley:

Hmm. So… something else is causing a hang in the network… :thinking:

https://wiki.archlinux.org/index.php/Bisecting_bugs_with_Git

:wink:
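If the bisect workflow itself is unfamiliar, it may help to try it on a toy repository first. The sketch below is only an illustration, not the kernel procedure: `bisect_demo` is a made-up function that creates a throwaway repo of ten commits, where commit 7 flips a file from "good" to "bad", and lets `git bisect run` find that commit.

```shell
# Toy bisect in a throwaway repository; commit 7 of 10 is the "regression".
# bisect_demo is a hypothetical name; it runs in a subshell so that the
# cd and set -e do not leak into the calling shell.
bisect_demo() (
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q . >/dev/null 2>&1
    git config user.email "demo@example.invalid"
    git config user.name "Bisect Demo"

    i=1
    while [ "$i" -le 10 ]; do
        if [ "$i" -ge 7 ]; then echo bad > status; else echo good > status; fi
        echo "$i" > counter
        git add status counter
        git commit -q -m "commit $i" >/dev/null 2>&1
        i=$((i + 1))
    done

    # Newest commit is known bad, oldest commit is known good.
    git bisect start HEAD HEAD~9 >/dev/null 2>&1
    # The test command exits 0 for a good commit and 1 for a bad one.
    git bisect run grep -qx good status >/dev/null 2>&1
    # refs/bisect/bad now points at the first bad commit.
    git log -1 --format=%s refs/bisect/bad
    git bisect reset >/dev/null 2>&1
    cd /
    rm -rf "$repo"
)

bisect_demo
```

On the real kernel tree the idea is the same, except each step is a manual build and reboot followed by `git bisect good` or `git bisect bad`, depending on whether the lock-up appears. The mainline tags `v5.14` and `v5.15` would be the natural endpoints, assuming mainline 5.14 behaves like 5.14.16 here; linux-zen carries extra patches, so results on mainline may not match exactly.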

There are too many changes between the end of Linux 5.14 and the start of 5.15.2;
the biggest one is in the compile options.