Solution to Regular Kernel Hangs on Boot

This is a solution to “Regular kernel hangs on boot 90% of the time, lts kernel always works”. I am posting it as a new topic because the original topic is closed and I can’t reply there.

Short answer: on my laptop, I added the following to the regular kernel’s boot command line: i915.force_probe=!8086 xe.force_probe=8086.

Explanation:

The problem turned out to be a conflict between the i915 driver and the xe driver, which was introduced in the 6.9 kernels. So it would also occur on the lts kernels as soon as they move to a 6.9+ kernel. To find the solution for your PC/laptop, these should be the steps:

Run lspci -v (boot the lts kernel if the regular one hangs) and find the VGA card. In my case, the entry looked like this (xe was not listed under “Kernel modules” when I booted with the 6.6 kernel):

00:02.0 VGA compatible controller: Intel Corporation Alder Lake-P GT2 [Iris Xe Graphics] (rev 0c) (prog-if 00 [VGA controller])
	DeviceName: VGA compatible controller
	Subsystem: CLEVO/KAPOK Computer Device 7716
	Flags: bus master, fast devsel, latency 0, IRQ 195, IOMMU group 0
	Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 90000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 1000 [size=64]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915, xe
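
A minimal sketch of how the device ID could be picked out automatically, assuming lspci -nn is available (it appends the numeric [vendor:device] pair to each line). The function name intel_gpu_device_ids is hypothetical, and the helper only filters text it is given, so it can be tried on saved output as well.

```shell
# Hypothetical helper: reads `lspci -nn` text on stdin and prints the
# 4-digit PCI device ID of every Intel (vendor 8086) graphics controller.
intel_gpu_device_ids() {
    # Keep only graphics-class lines, then pull out the [8086:xxxx] pair.
    grep -E 'VGA compatible controller|Display controller|3D controller' |
        grep -oE '\[8086:[0-9a-fA-F]{4}\]' |
        sed -E 's/^\[8086:([0-9a-fA-F]{4})\]$/\1/'
}

# Usage on a real system (not executed here):
#   lspci -nn | intel_gpu_device_ids
```

On a machine like the one above, this would be expected to print the same ID that setpci reports for the DEVICE_ID register.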

I ran setpci --dumpregs | grep _ID$ and the output was:

     00 W VENDOR_ID
     02 W DEVICE_ID
     2c W SUBSYSTEM_VENDOR_ID
     2e W SUBSYSTEM_ID
     40 W CB_SUBSYSTEM_VENDOR_ID
     42 W CB_SUBSYSTEM_ID
0028 00 - ECAP_HIER_ID

I ran setpci -s 00:02.0 02.w (the DEVICE_ID register) and the answer was 46a6, so I added i915.force_probe=!46a6 xe.force_probe=46a6 to the 6.11 boot command, with no luck. I then ran setpci -s 00:00.0 00.w, which returned 8086 (Intel’s vendor ID), added i915.force_probe=!8086 xe.force_probe=8086 instead, and I can now boot the regular 6.11 kernel successfully.
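
For anyone adapting this to their own hardware, here is a small sketch that builds the parameter pair from a device ID. The function name force_probe_params is made up for illustration. It only formats the string and does not guarantee that the resulting parameters fix the hang. It also accepts the 0x-prefixed form that sysfs uses in /sys/bus/pci/devices/<address>/device.

```shell
# Hypothetical helper: turn a 4-hex-digit PCI device ID into the pair of
# force_probe kernel parameters discussed in this thread.
force_probe_params() {
    # Normalise case so that "0X46A6" and "46a6" are treated alike.
    id=$(printf '%s' "$1" | tr 'A-FX' 'a-fx')
    id=${id#0x}    # accept the "0x46a6" form found in sysfs
    case "$id" in
        [0-9a-f][0-9a-f][0-9a-f][0-9a-f]) ;;
        *) echo "expected a 4-digit hex device ID, got: $1" >&2; return 1 ;;
    esac
    printf 'i915.force_probe=!%s xe.force_probe=%s\n' "$id" "$id"
}

# Usage on a real system (not executed here):
#   force_probe_params "$(cat /sys/bus/pci/devices/0000:00:02.0/device)"
```

The output is meant to be appended to the kernel command line in the boot loader entry for the regular kernel.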

I don’t know whether there is a better solution. Once the lts kernel also moves to 6.9+, I can blacklist the i915 module and force-load the xe module. If you have a better idea, please speak up. Thank you.

Have you tried uninstalling the Intel driver, xf86-video-intel?
Edit: Then it will use the xe kernel driver.

xf86-video-intel is not installed:

$ yay -Qs xf86-video-intel
$ yay -Ss xf86-video-intel
aur/xf86-video-intel-git 1:2.99.917+916+g31486f40-1 (+117 0.00) 
    X.org Intel i810/i830/i915/945G/G965+ video drivers
extra/xf86-video-intel 1:2.99.917+923+gb74b67f0-2 (724.5 KiB 2.2 MiB) [xorg-drivers] 
    X.org Intel i810/i830/i915/945G/G965+ video drivers
$

That’s odd, given that it shows: :thinking:

Kernel driver in use: i915

Check with pacman

pacman -Qi xf86-video-intel

The same here on my setup,

0000:00:02.0 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 3f19
	Flags: bus master, fast devsel, latency 0, IRQ 183, IOMMU group 1
	Memory at 601e000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 4000000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 4000 [size=64]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915, xe

xf86-video-intel is not installed

pacman -Qi xf86-video-intel
error: package 'xf86-video-intel' was not found

@ksbhaskar

Edit: From the link, assuming your hardware matches what it describes:

Via boot parameters like “i915.force_probe=!56a2 xe.force_probe=56a2” is enough to prevent loading of the i915 driver and loading instead the experimental Xe driver, assuming your PCI graphics ID is 0x56a2. Just adjust for your appropriate Intel PCI graphics ID and from there it’s easy to boot with this new kernel driver.
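
A quick way to confirm that such parameters actually reached the running kernel is to look at /proc/cmdline. This is only a sketch, and the function name force_probe_in_cmdline is hypothetical. It reads a command line on stdin so it can be tried on any text.

```shell
# Hypothetical helper: reads a kernel command line on stdin and prints
# each i915/xe force_probe parameter on its own line.
force_probe_in_cmdline() {
    tr ' ' '\n' | grep -E '^(i915|xe)\.force_probe='
}

# Usage on a real system (not executed here):
#   force_probe_in_cmdline < /proc/cmdline
```

If nothing is printed, the parameters were not passed by the boot loader entry that was used.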

@ricklinux it’s not installed:

$ pacman -Qi xf86-video-intel
error: package 'xf86-video-intel' was not found
$

The i915 driver is included with the 6.11 kernel, as is the xe driver, but the 6.6 kernel only has the i915 driver. So the 6.6 kernel has no conflict, but the 6.11 kernel does.

$ fd i915.ko.zst /lib/modules | grep /i915
/lib/modules/6.11.2-arch1-1/kernel/drivers/gpu/drm/i915/i915.ko.zst
/lib/modules/6.6.54-1-lts/kernel/drivers/gpu/drm/i915/i915.ko.zst
$ fd xe.ko.zst /lib/modules | grep /xe
/lib/modules/6.11.2-arch1-1/kernel/drivers/gpu/drm/xe/xe.ko.zst
$
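
The same check can be sketched without fd, using only a glob over the modules directory. The function name kernels_with_xe is hypothetical, and the directory layout is assumed to match the paths shown above.

```shell
# Hypothetical helper: for each kernel under a modules root (default
# /lib/modules), report whether an xe module file is present.
kernels_with_xe() {
    root=${1:-/lib/modules}
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        kver=$(basename "$dir")
        # Matches xe.ko, xe.ko.zst, xe.ko.xz and similar.
        if ls "$dir"kernel/drivers/gpu/drm/xe/xe.ko* >/dev/null 2>&1; then
            echo "$kver: xe module present"
        else
            echo "$kver: no xe module"
        fi
    done
}

# Usage on a real system (not executed here):
#   kernels_with_xe
```

Kernels reported with an xe module are the ones where both drivers are available for the GPU.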

Yeah… I’m just trying to understand. :face_with_spiral_eyes:

Yes @ricklinux, that was one of the links on my way to the solution. Then I had to figure out how to get the device ID, which I got from “How do I change the PCI device ID of my graphics card in the system? (to install Quadro driver on a GeForce)” (https://unix.stackexchange.com/questions/154952/how-do-i-change-the-pci-device-id-of-my-graphics-card-in-the-system-to-install). Of course, I went down many rabbit holes before finding these two pages, because I didn’t know where to start. What finally told me the problem had something to do with the video driver was that, when I blacklisted the i915 driver, the 6.11 kernel booted but HDMI was not working. It took many hours over two months of troubleshooting to find the workaround hack.

It’s difficult to understand some of these things. Linux has a lot of rabbit holes when something isn’t working. I can only scratch the surface and then get buried. :rofl:

Well, it seems I spoke too soon. After running yay -Syu to update the system, the workaround no longer works. So it’s back to the lts kernel until the lts kernel also fails to boot, at which point I might be up the creek without a paddle. Oh well…

Sounds like what has been happening to me with the latest kernel for some time now; the lts kernel works like a charm with or without.

I found this to overcome the problem on my Arch install:

I added MODULES=(vmd) to mkinitcpio.conf:

# in this array.  For instance:
#     MODULES=(usbhid xhci_hcd)
MODULES=(vmd)
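
A small sketch for checking whether a module is already listed before editing the file. The function name module_in_initramfs_config is hypothetical, and it assumes the simple unquoted, single-line MODULES=(...) form shown above in /etc/mkinitcpio.conf.

```shell
# Hypothetical helper: succeed if the named module appears in the active
# (uncommented) MODULES=(...) line of an mkinitcpio-style config file.
module_in_initramfs_config() {
    module=$1
    conf=${2:-/etc/mkinitcpio.conf}
    # Commented examples start with '#', so they do not match the pattern.
    grep -E '^[[:space:]]*MODULES=\(' "$conf" |
        tr -d '()' | sed 's/^[[:space:]]*MODULES=//' | tr ' ' '\n' |
        grep -qxF "$module"
}

# Usage on a real system (not executed here):
#   module_in_initramfs_config vmd && echo "vmd already listed"
```

After changing the MODULES line, the initramfs images have to be regenerated (for example with mkinitcpio -P) before the change takes effect on the next boot.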