Issues with nvidia 550 drivers

There seem to be some issues with the current nvidia drivers causing crashes and hard freezes in certain circumstances. Notably, they can cause a freeze during the update process, leaving the system in a broken state. I believe we have seen a few people hit this issue on the forum recently, though I don’t think anyone attributed it to nvidia.

There are a couple of references to the issue below though they are a bit hard to follow. It is something to be aware of if you are using these drivers.

I wonder, is it the same behaviour for the dkms package and the normal one?

Yup, one of those was me, RTX3070, didn’t attribute to nVidia, just bad luck.
dkms here.

I don’t seem to be experiencing the issue on the dkms version myself, but I’ll keep a lookout for it.

Yes, I’m on dkms as well, with an RTX4070, Plasma and attempting Wayland (with a few minor issues) and I’ve not experienced this issue yet. Although, I’ve read a bit about it here and other places online.

I’d agree, it seems initially most people had not associated it with Nvidia, but it’s starting to look like the new update was the source of the problem - again. When it works, it works quite well. But if there are problems, it’s a hot mess.

I’m using the 550.xx drivers and I don’t notice any issues using KDE with Wayland on a GTX 1060 during normal daily use.

My only issue, which is minor, is that under Wayland nvidia-settings has no options to change anything and does not show very much. In Xorg all the settings show up. I also have nvidia-dkms installed.

To be clear, the driver is causing kernel memory corruption so what that means is that the problems won’t be consistent. You may be fine for weeks with no issues and then suddenly have a strange problem or have a hard freeze.

550 is the first driver which causes no issues for me and it even fixed the bloody Xid 109 error with Alan Wake II with RT enabled. Eagerly waiting for 15th May :wink:.

God damn it NVIDIA, you dun goofed again.

I can reproduce this with a script that keeps spamming:

echo 3 > /proc/sys/vm/drop_caches
echo 1 > /proc/sys/vm/compact_memory

It seems to be a bug with a VFS structure that the novidea driver incorrectly writes to.
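
For anyone who wants to try this, here is a minimal sketch of what such a loop might look like. It is not the poster's actual script, which was not shared: the function names `vm_poke` and `vm_spam`, the default iteration count, and the optional directory argument are all assumptions made for illustration. The directory argument only exists so the loop can be dry-run against a scratch directory instead of the real /proc/sys/vm.

```shell
# Hypothetical reproducer sketch; not the original poster's script.

# Write the two values from the post into the VM control files.
# $1 = directory holding the files (defaults to the real /proc/sys/vm).
vm_poke() {
    local vm_dir="${1:-/proc/sys/vm}"
    echo 3 > "$vm_dir/drop_caches"      # drop page cache, dentries and inodes
    echo 1 > "$vm_dir/compact_memory"   # ask the kernel to compact memory
}

# Repeat vm_poke a fixed number of times.
# $1 = iteration count (assumed default: 100), $2 = optional directory.
vm_spam() {
    local count="${1:-100}"
    local vm_dir="${2:-/proc/sys/vm}"
    local i=0
    while [ "$i" -lt "$count" ]; do
        vm_poke "$vm_dir" || return 1
        i=$((i + 1))
    done
}

# Usage, as root, on a machine you can afford to freeze:
#   vm_spam 1000
```

Given that the thread describes this as kernel memory corruption, running it against the real files is only sensible on a test machine with nothing unsaved.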

I intend to make a detailed post about this, but currently the fix is to either:

  1. stop udevadm trigger from getting… triggered? executed? run? whatever…
sudo touch /etc/systemd/do-not-udevadm-trigger-on-update
  2. downgrade the nvidia drivers to the 535 or 545 series
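
The first option above could be wrapped in a small helper so it is safe to run more than once. This is only a sketch: the flag path is the one given in the post, while the function name `ensure_flag` and the optional path override (there for dry runs in a scratch directory) are made up for illustration.

```shell
# Flag path taken from the post above.
FLAG_FILE="/etc/systemd/do-not-udevadm-trigger-on-update"

# Create the flag file if it is missing and report what happened.
# $1 optionally overrides the path (hypothetical, for dry runs).
ensure_flag() {
    local flag="${1:-$FLAG_FILE}"
    if [ -e "$flag" ]; then
        echo "already present: $flag"
    else
        touch "$flag" && echo "created: $flag"
    fi
}

# Usage (the real path needs root, so run this from a root shell):
#   ensure_flag
```

Removing the file again should restore the normal update behaviour, assuming the flag works the way the post describes.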

I think downgrading is only an option if you are using the LTS kernel.

They should be able to: https://archive.archlinux.org/packages/n/nvidia/
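
As a rough sketch of how a package from that archive might be installed: the helper below only builds a download URL. The file name pattern (`nvidia-<pkgver>-<pkgrel>-x86_64.pkg.tar.zst`) is an assumption based on how the archive listing usually looks, `archive_url` is a made-up name, and the version and release numbers are placeholders you would have to look up in the listing yourself.

```shell
# Base URL from the post above.
ARCHIVE_BASE="https://archive.archlinux.org/packages/n/nvidia"

# Build the archive URL for one package file.
# $1 = pkgver, $2 = pkgrel (placeholders; check the archive listing).
# The file name pattern is an assumption, not something confirmed here.
archive_url() {
    printf '%s/nvidia-%s-%s-x86_64.pkg.tar.zst\n' "$ARCHIVE_BASE" "$1" "$2"
}

# Usage (pacman -U accepts a URL as well as a local file):
#   sudo pacman -U "$(archive_url <pkgver> <pkgrel>)"
```

The matching nvidia-utils package would presumably need the same treatment, and whether the older driver builds against the current kernel at all is a separate question.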

I thought that the older nvidia drivers were not compatible with the 6.8 kernel due to the symbol changes.

Oh. I see what you mean. So, it would possibly require downgrading the whole system to a specific date. That wouldn’t be a good idea in this case, if you’re right.

OR, of course, installing the LTS kernel… But then the current kernel would become unusable/unbootable, I imagine.

I found what I think you’re referring to: https://archlinux.org/news/nvidia-45528-is-incompatible-with-linux-59/

It’s nvidia 455.28 specifically that isn’t compatible with kernel 5.9 and up.

@ddnn @dalto From what I’ve seen elsewhere online, the 535 driver goes pretty far back as far as kernel support. Per Nvidia documentation, the 535 driver supports as far back as Ubuntu 20.04 (and RHEL/Fedora/SUSE releases from a similar timeframe), which shipped with the 5.4 kernel.

I’m happy to see evidence that I’m wrong, no problem here. But it looks like it supports quite a ways backwards.

The issue we are discussing is the opposite: whether you can use the older drivers with newer kernels.

My understanding is that the 6.8 kernel only works with 550 and newer but I haven’t tested that myself.
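
To see where a given machine stands, something like the sketch below could compare the installed driver series against that 550 threshold. The threshold itself is the untested claim from this post, the helper names `driver_major` and `driver_at_least` are made up, and the version strings in the comments are only sample inputs.

```shell
# Extract the major series from a version string, e.g. "550.78" -> "550".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed if the driver series is at least the given minimum.
# $1 = driver version string, $2 = minimum series (default 550, the
# untested threshold mentioned above).
driver_at_least() {
    local major
    major="$(driver_major "$1")"
    [ "$major" -ge "${2:-550}" ]
}

# Usage on a live system:
#   ver="$(modinfo -F version nvidia)"
#   echo "kernel $(uname -r), nvidia $ver"
#   driver_at_least "$ver" || echo "driver is older than the 550 series"
```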

Apologies, I misunderstood it.

It is possible that you are still entirely correct on this because the link I provided is specific to nvidia 455.28 (news from 2020 at that) rather than a 5XX driver. That’s just what I could find.

I can’t find anything related to that, but I’m probably not looking in the right place at the moment. However, at least the nvidia-all-tkg driver (https://github.com/Frogging-Family/nvidia-all) has been patched as far back as 525 for kernel 6.8 as of last week, so it should be possible to downgrade using the tkg drivers with the latest kernel.