Cannot boot using nvidia-dkms, have to install nvidia

Hi team, I have been running into issues installing Galileo on my system.

I have been unable to boot into my DM post-install and I tracked it down to nvidia-dkms. Upon booting I would reach “Initializing Graphical Interface” and then it would just flash but it wasn’t locked.

I switched to a TTY and, believing it to be an issue with nvidia (it was), uninstalled nvidia-dkms and installed nvidia.
Once I had installed nvidia and rebooted, it would finally enter my DM.

Long story short, I borked my install trying to install, decided to wipe and start over, and now the installer just hard locks on: dkms install nvidia/545.29.06 -k 6.6.2-arch-1

The system is so locked that a simple reset isn’t enough; I have to cold boot.

Any thoughts on this issue?


This appears to have been old kernel or driver code that was still kicking around; it went away after a boot into Windows followed by a full shutdown.

One issue that still remains though is I cannot boot to the desktop environment using nvidia-dkms, it only works with nvidia.

13900K
4090
GRUB, BTRFS
Cinnamon

I can boot the live USB and install but cannot reach the DE when booting after install, the only fix I have found is to install nvidia instead of nvidia-dkms.

I had this issue with Cassini as well, if I did an offline install I could load DE but after system update it would no longer load.

https://discovery.endeavouros.com/nvidia/nvidia-optional-enhancements-and-troubleshooting/2021/03/

It is this issue here

It isn’t a big problem to use nvidia instead of nvidia-dkms if that is working for you.

Just remember that you need to install nvidia-lts if you also use the lts kernel.
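
As a rough sketch of that rule: assuming the package names in the Arch repos at the time of this thread (nvidia for the stock linux kernel, nvidia-lts for linux-lts, and nvidia-dkms for anything else such as linux-zen), a small helper could look like the one below. The function name nvidia_pkg_for_kernel is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: suggest the NVIDIA driver package for a kernel package.
# Assumption: only linux and linux-lts have prebuilt driver packages;
# every other kernel (linux-zen, linux-hardened, ...) needs the dkms variant.
nvidia_pkg_for_kernel() {
    case "$1" in
        linux)     echo "nvidia" ;;
        linux-lts) echo "nvidia-lts" ;;
        *)         echo "nvidia-dkms" ;;
    esac
}

# Print a suggestion for every kernel package named on the command line.
for kernel in "$@"; do
    printf '%s -> %s\n' "$kernel" "$(nvidia_pkg_for_kernel "$kernel")"
done
```

Saved as a script and run with the kernel package names as arguments (for example linux linux-lts linux-zen), it prints one suggestion per kernel.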

Switching kernels is actually how I got into the first mess.

I wanted to switch to linux-zen and ran into issues, when building the kernel it hardlocked on the dkms module step for nvidia.

That issue persisted after restoring my snapshot and trying again, which led me to try re-installing, where surprisingly the hard-locking issue was still present.

I’ve performed another fresh install, this time with systemd-boot, and it’s the same: it can’t load into the DM with nvidia-dkms.

It’s weird because the Live USB uses the dkms driver. I want to use GRUB for the BTRFS Snapshots, that has saved my butt in the past :smiley:

sudo dmesg | grep microcode | eos-sendlog

Could be an issue with microcode or even the BIOS …

I just did 2 installs, one offline and one online, with an Nvidia card present and using the Nvidia option on boot. Both went through without any error or issue.

2023-11-30_18-23

This output leads more to something system related and not directly dkms or nvidia…

That's an Intel i7 CPU?

inxi -Fxxc0z | eos-sendlog would generate the full specs and a pastebin URL that you can post here for more investigation.

It’s an i9 13900K on an MSI Pro Z790-A with the latest BIOS revision.

I removed the 4090 and booted using the Intel iGPU, and I can swap back and forth between installing nvidia and nvidia-dkms and the kernel rebuilds fine.

Booting with the 4090 I must have nvidia installed and if I attempt to install nvidia-dkms it hard locks as shown in those screenshots.

Unfortunately I don’t have logs because I have wiped the partition since. I will need to replicate it again, but that shouldn’t be a problem.

As an aside is ibt=off still a thing?

I came across this when googling the black screen issue; it has something to do with Intel CPUs, Nvidia GPUs and newer kernels.

The current nvidia driver should not have this issue.
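
If you want to confirm whether ibt=off is actually on the running kernel's command line, something like this should do. It is only a sketch: has_ibt_off is a made-up helper name, and it only looks for the literal token ibt=off among the space-separated parameters.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line contains ibt=off.
# Padding with spaces makes the match work for the first and last token too.
has_ibt_off() {
    case " $1 " in
        *" ibt=off "*) return 0 ;;
        *)             return 1 ;;
    esac
}

# /proc/cmdline holds the parameters the running kernel was booted with.
if has_ibt_off "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "ibt=off is set"
else
    echo "ibt=off is not set"
fi
```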

So I have narrowed this down a little.
I just performed an offline install with systemd-boot and ext4, and I can boot with nvidia-dkms into KDE.

If I can still boot after updating that means it is likely Cinnamon causing the issue, very unlikely it’s GRUB or BTRFS which are the only other 2 options I changed compared to this current install.

One difference as well is the Cinnamon install uses LXDM, whereas KDE uses SDDM

Nope, hard locked same spot during update, and borked my boot loader.

Realistically it could also be the Nvidia driver update.


Cinnamon uses lightdm.
LXDM is only set for LXDE.

It could be related, but if it hard locks while you rebuild the Nvidia modules, this is nothing I would put on the driver package itself.
If you run an offline install it will not rebuild the Nvidia driver module (it is already built when the ISO is created), but it will be rebuilt when you update afterwards.

So it could also be an issue with the CPU, the RAM, or even the hard drive.

I mean you can try harder :wink:
Install offline, do not update after the first boot, and try rebuilding the Nvidia modules with the package version from the ISO.

sudo pacman -U https://archive.archlinux.org/packages/path/nvidia-dkms-545.29.02-2-x86_64.pkg.tar.zst

systemd-boot does not show the OS entry because Arch removes the initramfs images when rebuilding… you should be able to arch-chroot and rebuild them.
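
A rough outline of that rescue, written as a dry run that only prints the commands so nothing is touched by accident. Several things here are assumptions and not from this thread: the device names are placeholders (check yours with lsblk -f), the ESP is assumed to be mounted at /efi as on a default EndeavourOS systemd-boot install, and reinstall-kernels is assumed to be the EndeavourOS script that regenerates the initramfs and boot entries. On a mkinitcpio-based install the last step would be mkinitcpio -P instead.

```shell
#!/bin/sh
# Dry-run sketch: prints the rescue steps instead of executing them.
# $1 = root partition, $2 = EFI system partition (both placeholders here).
print_rescue_steps() {
    root_dev=$1
    esp_dev=$2
    echo "sudo mount $root_dev /mnt"      # mount the installed system
    echo "sudo mount $esp_dev /mnt/efi"   # assumption: ESP lives at /efi
    echo "sudo arch-chroot /mnt"          # enter the installed system
    echo "reinstall-kernels"              # assumption: EndeavourOS helper, run inside the chroot
}

# Example with placeholder device names; run the printed lines from the live USB.
print_rescue_steps /dev/sdX2 /dev/sdX1
```

On BTRFS the root mount would additionally need the right subvolume option, which is left out of this sketch.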

I have a sacrificial 256GB USB drive I can use to test an offline install, don’t want to mess with my clean setup lol.

You made me think of something though.

Throughout all these Online installs it would freeze at the same part, but not always.

When I could take a screen shot of the exact error it was throwing CPU microcode errors and the system locks like it was an unstable overclock.

That still doesn’t explain though why when an Online install worked it wouldn’t load the DM.

Have you actually tried the boot parameter? It could be something like a regression that brings this back as an issue on the current kernel…

If there is an issue causing the CPU / kernel to bail out in some circumstances… it can show up in many different ways… and the DM not loading can be caused by a faulty driver module build… one that is only half done / incomplete.
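
One way to spot a half-finished module build is to look at what dkms status reports for the running kernel. The sketch below assumes the dkms 3.x output format (module/version, kernel, arch: state); dkms_state_for is a made-up helper name. Anything other than "installed" (for example "added" or "built") would suggest the build did not complete.

```shell
#!/bin/sh
# Hypothetical helper: read `dkms status` output on stdin and print the state
# recorded for the nvidia module on the kernel release given as $1.
# Assumed line format: nvidia/<version>, <kernel>, <arch>: <state>
dkms_state_for() {
    awk -F': ' -v k="$1" '$1 ~ /^nvidia\// && index($1, ", " k ",") { print $2 }'
}

# Only query dkms when it is actually installed on this machine.
if command -v dkms >/dev/null 2>&1; then
    dkms status | dkms_state_for "$(uname -r)"
fi
```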

Not while I had this hard locking issue. At first my problem was only the DM not loading, that is when I tried it.

Between then and now the Nvidia driver package was updated to a newer revision than what is in the ISO, so now when I do an online install it also updates Nvidia.

The ISO version is 545.29.02, current is 545.29.06.

I had settled on accepting that I needed to install nvidia over nvidia-dkms, and it worked.

It was when I wanted to switch to Linux-zen that I remembered I should use dkms, leading me to discover this new issue.

Testing offline right now, pasting this in case I lose it LOL

Didn’t lock up, so that narrows it right down to nvidia-dkms 545.29.06, but now the question is why.


Spoke too soon, we’re back to the DM not loading, I can access tty though.

The eos-sendlog commands you provided earlier aren’t providing a url in tty, I’ll see if I can grab them from home.

eos-sendlog had an issue before version 23.23-1; this new version should work better.
But you can use any working pastebin-like service.

After a system update, an uninstall of nvidia-dkms and an install of nvidia I am back in, so I can provide the logs:

https://0x0.st/Hxxy.txt

https://0x0.st/Hxxw.txt

If I install nvidia-dkms at this point the system will lock. This is just a test USB so I am fine with that, but I will need to figure out how to rebuild the bootloader.