Cannot boot any self-compiled kernel, stuck at "reached target graphical interface"

I tried to compile kernels myself (5.9.2, 5.10.186, 5.15.120) with the help of the Arch Wiki.

I had a hard time because of unusual errors, but eventually got a kernel that boots. However, with none of the kernels I compiled can I get into the desktop environment; I am always stuck at
[ ok ] reached target graphical interface
After that I logged in on a TTY to see what’s up, but I couldn’t really find anything.


I found out that vmmon belongs to VMware, which I don’t really care about, so I can safely ignore that.
About the others I have no clue, except the "unable to load firmware" one, which I think belongs to my Ethernet adapter.

On the older kernel (5.9.2) I get additional errors/warnings, EDAC amd64: Error: F0 not found, but since these are gone in newer versions I don’t think they are the root cause.

I installed xf86-video-fbdev and xf86-video-vesa, which let me use startx, but that only gave me three broken white consoles scattered around my screen.

Also here is an example log I generated with kernel 5.9.2

NOTE
I can start any precompiled kernel with no issue whatsoever, but not the ones I compiled myself.

Specs
DE: Plasma 5.27.6
CPU: AMD Ryzen 7 5800X (16) @ 3.800GHz
GPU: AMD ATI Radeon RX 6700XT
Memory: 32GB
MB: B550

Generally with KVM I use virtio drivers; they are more efficient and cheaper to use: virtio network, virtio SCSI, virtio screen, etc.

I don’t know what you are using for your session, I usually use virt-manager and can easily select those devices.

If you are using VMware then you should use any virtio / paravirtual drivers they have, i.e. vmmon, which I guess allows VMware to communicate with the VM.

Not sure what you mean by not starting. Given the log you posted, it seems the kernel started OK and it is just the programs started later that didn’t give you the GUI you wanted? Or am I misinterpreting your post?

Thank you for your answer. I think I phrased some sentences really badly :sweat_smile:
The problem is not VMware at all; I don’t really use that.

The problem is rather that the system is not booting into any GUI. It is just stuck at reached target graphical interface and cannot get past that. And whatever log I check, I cannot find a proper error that causes this issue.

Normal boot (happens on every precompiled kernel): start → kernel things → SDDM → Plasma Desktop
Self-compiled kernel: start → kernel things (it gets stuck on reached target graphical interface)

Do you maybe have a clue which error log I could provide to further narrow down the issue? :slight_smile:

EDIT:

At this stage it just hangs forever and won’t boot properly

Note: The EDAC amd64 errors only occur on this version and are fixed in a newer version (which I also tried and which has the same issue)

Boot Log of that boot

I think you have the title wrong. You are starting the kernel fine. You are getting stuck in the systemd boot process while it starts the graphical target.

If I were stuck on graphical, I would run systemctl set-default multi-user.target

That might not give you the graphical session you seem to want, but it would allow systemd to reach a booted state, and then you can experiment with startx etc., which it seems you were doing anyway, using Ctrl+Alt+F2 etc. to get another virtual console.

I’m still not sure if you are booting this on physical hardware or booting this in a KVM virtual machine, as is suggested by the *kvm kernel version?

It has been a long time since I built a kernel. Normally I would start with a known good configuration, i.e. /boot/config*, but that doesn’t seem to be around on my Manjaro install. /proc/config.gz does exist on both Manjaro and EOS, though, which should work. Not sure what you were using. If not that, then you might like to compare configurations; it might be instructive.
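For the configuration comparison, here is a minimal sketch that looks up a single option in two config files. The helper name config_value and the source tree path in the usage comment are made up for illustration; CONFIG_DRM_AMDGPU is the option that builds the amdgpu driver, which is a sensible first thing to check with an AMD card.

```sh
# Print how a single option is set in a kernel config file.
# config_value is a made-up helper name; $1 is the option, $2 the config file.
config_value() {
    case "$2" in
        *.gz) gzip -dc "$2" ;;   # /proc/config.gz is gzip-compressed
        *)    cat "$2" ;;
    esac | grep -E "^($1=|# $1 is not set)" || echo "$1 is not mentioned"
}

# Usage (the second path is an assumption, adjust it to your source tree):
#   config_value CONFIG_DRM_AMDGPU /proc/config.gz
#   config_value CONFIG_DRM_AMDGPU ~/linux-5.9.2/.config
```

If the two answers differ (for example =m on the working kernel and "is not set" on the self-built one), that option is a candidate to look at first.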

I would be looking at journalctl -kb and journalctl -xe -b.
lspci is also a good command to run to check which PCI devices are visible, especially on a virtual machine, where it would point out any deficiencies in your virtual machine configuration.
Really, I need to know if you are booting virtual or physical.

You need to get your head around how systemd works, as that is what is running the boot process.

systemd is looking for graphical target,

$ cat /lib/systemd/system/graphical.target
[Unit]
Description=Graphical Interface
Documentation=man:systemd.special(7)
Requires=multi-user.target
Wants=display-manager.service
Conflicts=rescue.service rescue.target
After=multi-user.target rescue.service rescue.target display-manager.service
AllowIsolate=yes
$

As you can see, it depends on display-manager.service, and you have to chase the systemd dependencies down with systemctl status thing and by checking /{lib,usr/lib,etc,run}/systemd/system/thing or some other related path. If you find something interesting with systemctl status thing then you can use journalctl -xe -u thing, or maybe without -u if it is not a unit.
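To make the dependency chase a bit more mechanical, here is a small sketch that pulls the Requires= and Wants= entries out of one unit file, so you know which unit to run systemctl status on next. The helper name unit_deps is made up, and it only reads the single file you give it; systemctl list-dependencies graphical.target shows the tree as systemd itself resolves it.

```sh
# List the units a unit file pulls in through Requires= and Wants=.
# unit_deps is a made-up helper name; $1 is the path to a unit file.
unit_deps() {
    awk -F= '$1 == "Requires" || $1 == "Wants" {
        n = split($2, units, " ")
        for (i = 1; i <= n; i++) print $1 ": " units[i]
    }' "$1"
}

# Usage:
#   unit_deps /lib/systemd/system/graphical.target
```

For the graphical.target shown above this prints the multi-user.target and display-manager.service entries, and display-manager.service is the one worth following.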

Thank you for your answer again and for helping me, I am quite new to this

This is running on my physical hardware and is not a VM

Yes, that makes sense. I am not that experienced with the kernel boot process yet and didn’t know this is no longer part of the kernel

About that: I am currently running the newest kernel, 6.4.3-273-tkg-pds.
What I did was copy over my config from /proc/config.gz and then run make olddefconfig to apply the defaults for whatever changed between kernel versions; at least that’s what I understood it to be doing.

About the tracing, I will try to get some more information about that now, and after my lunch I will edit my reply once I have some new information. Also thank you again for your reply, it’s greatly appreciated :smiley:

OK, Physical host, new kernel compiled using the old kernel configuration.

Still suggest you use systemctl to change the default boot target to multi-user.target.
You can always start normal graphics by using: exec sudo systemctl isolate graphical.target
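As a recap of the target switching, a sketch with the commands wrapped in small functions so nothing changes until you call one. The function names are made up; the systemctl subcommands (get-default, set-default, isolate) are the standard ones.

```sh
# Show which target the system boots into by default.
show_default() { systemctl get-default; }

# Boot to a text console by default (takes effect on the next boot).
default_to_console() { sudo systemctl set-default multi-user.target; }

# Start the graphical session by hand from that console.
start_graphics() { sudo systemctl isolate graphical.target; }

# Restore the usual behaviour once the display manager works again.
default_to_graphics() { sudo systemctl set-default graphical.target; }
```

Calling default_to_console once and rebooting gives you a login prompt on tty1, and start_graphics then shows whether the display manager comes up.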

Check that the following output is the same on the old and new kernel:

cd /var/tmp; what=old; arg=knn; sudo lspci -$arg | tee $what.lspci-$arg

and then when booted into the new kernel, change to what=new and rerun, then use diff -u *.lspci-$arg
This will verify that the same drivers are bound to the PCI devices on the bus, which is one step towards ensuring you have everything working.

Certainly still worth checking all the journalctl -kb output, though it is a little more difficult to cross-verify as there will be sequence and prefix differences, and probably minor value differences too. But graphics not starting is likely a PCI driver missing.

Nvidia, Intel or AMD VGA? lspci -knn will show which kernel drivers are available or in use. If you have a difference then you have a direction in which to hunt.
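If the full lspci captures are too noisy to compare by eye, a sketch like this reduces each one to a single "slot driver" line per device. The helper name drivers_in_use is made up, and it assumes the usual lspci -k layout: an unindented line per device followed by indented detail lines, one of which may read "Kernel driver in use:".

```sh
# Reduce `lspci -k` style output to one "slot driver" line per device.
# drivers_in_use is a made-up helper name; $1 is a saved lspci -knn capture.
drivers_in_use() {
    awk '
        /^[0-9a-f]/ {                      # unindented line: a new device starts
            if (slot != "") print slot, (drv == "" ? "(none)" : drv)
            slot = $1; drv = ""
        }
        /Kernel driver in use:/ { drv = $NF }
        END { if (slot != "") print slot, (drv == "" ? "(none)" : drv) }
    ' "$1"
}

# Usage, with the file names produced by the capture command above:
#   drivers_in_use /var/tmp/old.lspci-knn > /tmp/old.drivers
#   drivers_in_use /var/tmp/new.lspci-knn > /tmp/new.drivers
#   diff -u /tmp/old.drivers /tmp/new.drivers
```

A device that shows a driver name on the old kernel and (none) on the new one is the one to chase.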

That throws me straight into TTY1, is that supposed to happen?

Are you referring to the new kernel that I built or the newer kernel version?

Here’s the log; sadly I cannot find anything interesting in there

Since I own an AMD CPU and AMD GPU, I would say I have AMD VGA rather than Nvidia or Intel

lspci -knn

journalctl -kb

Here’s how it looks if I type startx

And here I tried to use sddm --test-mode to see any issues regarding that

Yes, tty1 was the intended result, allowing the system to complete boot-up.

As I said, the isolate command will do a normal graphical start.

The actual lspci output isn’t that interesting; what would be interesting is any difference between the precompiled kernel and your compiled kernel. So take both lspci outputs and run diff -u old new, and I bet if you do that you will find differences for device 2d:00.0.

The white boxes seem just like normal xterm windows, maybe with twm running as a window manager, maybe even focus follows mouse. Very old school, and probably some sort of fallback setup, though if you are using startx you should check your ~/.xinitrc; it will probably have xterms and maybe twm too. You should check that you can move the mouse around, get input focus and use the xterms as terminals.

Why the xterm/twm environment is running and why Plasma/KDE is not, I don’t know; probably because KDE is not in .xinitrc and xterm/twm is.

So it seems to me that X is starting, but not the version that sddm is happy with.

So cross-check 2d:00.0, and that should give you a driver to hunt down, whether it is a precompiled driver or you have to compile some special module for your hardware, maybe with dkms.

If you want to switch back from multi-user.target as the default then systemctl set-default graphical.target

Might be relevant too, or some hints: Error boot after nvidia update : Stuck on reached target graphical interface

If you find that you have already built the driver, then load it using modprobe and try sddm again. If it works, then you have to convince dracut/mkinitcpio to load it, although if it normally loads with the normal precompiled kernel, I don’t know why it isn’t loading now, unless it is something to do with secure boot and signed drivers.

Okay, that makes sense then. If I do the normal graphical start it just gives me a black screen with a blinking underscore at the top left

Also, for some reason the lspci command doesn’t give me any output; it only prints the help page. Could you check what I mistyped in the command? I think I typed it correctly

and since the files it creates are empty, of course they are the same

Yes, probably. I can interact with them normally and also launch applications such as Firefox, which appear normally, but I cannot close them, or at least I don’t know how

Might sound weird, but I think there is no .xinitrc in my home directory
I only see .Xauthority in my home directory

I also looked at this thread but it only suggests downgrading Nvidia packages and I have an AMD GPU
I also saw similar threads in the Arch forum but they were all about Nvidia GPUs

My mistake, it should be lspci -$arg

Since both files are empty, the diff is empty, so eos-sendlog is uninterested, as you found

OK, if there is no .xinitrc then xterm is probably a fallback default

You can’t close them because there are no window decorations, so there is no window manager running. To exit the xterms you would have to type exit, and there would be no way of starting other apps except by typing app & in the windows, or in another virtual console using DISPLAY=:0 app &

No worries, and goddamn, you were right I think


I think it’s not using amdgpu but rather snd_hda_intel
How do I fix that? :confused: By rebuilding the kernel, or can I just enable some modules?

When building kernels and dealing with drivers you need to become at least passingly familiar with things like the PCI bus and how devices integrate into it, so you can interpret things like the differences, and also be prepared to look up the drivers so you don’t think a sound driver is running your graphics card.

You are misinterpreting the diff output. diff is a very useful tool, so it is worth becoming familiar with it. The - lines are in the first file but not in the second, while + lines are in the second file but not in the first.
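A throwaway example of reading diff -u, using two made-up files so the direction of - and + is easy to see. Lines that appear in both files are printed with a leading space and are filtered out here.

```sh
tmp=$(mktemp -d)
printf '%s\n' 'line only in old' 'line in both' > "$tmp/old"
printf '%s\n' 'line in both' 'line only in new' > "$tmp/new"

# diff exits with status 1 when the files differ, which is not an error here.
# The grep keeps the changed lines and drops the ---/+++ file headers.
changes=$({ diff -u "$tmp/old" "$tmp/new" || true; } | grep '^[-+][^-+]')
printf '%s\n' "$changes"
# prints:
#   -line only in old
#   +line only in new

rm -rf "$tmp"
```

So a line without any marker, like a driver that is listed unchanged, is present in both captures.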

Both files have snd_hda_intel, which is a sound driver. It is bound to a 2d: device too, but not to 2d:00.0, so it is probably the audio function internal to the graphics card.

You need to explore /lib/modules and find the amdgpu module.
Then you need to understand how it got there.
Was it installed by pacman?
Was it built independently, and if so how and how do you replicate it?

Oof, yeah, that makes sense, thank you. So my main kernel (6.4.3) is using the amdgpu driver while my self-built kernel (5.9.2) is using no driver? Also, since there’s no “+” before Kernel driver in use: snd_hda_intel, does that mean both kernels use that for the audio controller?

Since both files contain that, it would probably mean that the audio controller should work as expected but not the GPU driver, right?

Could I use things like Ksearch to find that amdgpu module, or what is the preferred way of doing that?

A lot of packages, regarding amd:

❯ yay -Qe | grep amd
amd-ucode 20230625.ee91452d-4
opencl-amd 1:5.6.0-2
xf86-video-amdgpu 23.0.0-1

and these regarding Mesa:

lib32-libva-mesa-driver 23.1.3-1
lib32-mesa-vdpau 23.1.3-1
libva-mesa-driver 23.1.3-2
mesa-utils 9.0.0-2
mesa-vdpau 23.1.3-2

The kernel was built from source: first I copied over my stock EOS kernel configuration from my currently running kernel, ran make olddefconfig to get the default config for everything that changed between the kernel versions, and built the kernel as usual, or as explained here

When I was asking about built independently I was talking about the old kernel with the working amdgpu driver, not about the new kernel which I know you built. We need to know about the amdgpu driver and how it got there so we can do something similar but with your kernel.

While I use a window manager, I’m mostly a terminal guy, mainly use the graphics for web browser and lots of terminal sessions. I also don’t use KDE at all, so no idea about KDE specific stuff, or most desktop generic GUI stuff. I stick to terminal command line since it works everywhere, so you will see my suggestions run that way.

As for the sound driver, since it is loaded, I expect it would work, but would have to search for how to test, maybe IIRC aplay, but I wouldn’t worry about it specifically.

You need to find the amdgpu driver in /lib/modules and which package put it there.

find /lib/modules -type f -name 'amdgpu*'
pacman -Ql | grep /lib/modules.*amdgpu

You can probably use yay too I guess, but you need to search file paths, not just package names. For me, that package is the linux61 package. Also interesting is exactly which modules you have, and which are effectively missing in your compiled kernel, because you probably want everything to work.
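The two commands above can be combined into a per-file owner lookup. This is only a sketch: module_owner is a made-up helper name, and it leans on pacman -Qo, which reports the package owning a file. A module from a hand-built kernel would be expected to show up as not owned by any package, since make modules_install put it there rather than pacman.

```sh
# Report which package owns each file of a module.
# module_owner is a made-up helper name; $1 is the module name,
# $2 (optional) the module root, /lib/modules by default.
module_owner() {
    find "${2:-/lib/modules}" -type f -name "$1.ko*" 2>/dev/null |
    while IFS= read -r path; do
        if command -v pacman >/dev/null 2>&1; then
            # pacman -Qo names the owning package, or fails for unowned files
            pacman -Qo "$path" 2>/dev/null || echo "$path is not owned by any package"
        else
            echo "$path (pacman not found, cannot query the owner)"
        fi
    done
}

# Usage:
#   module_owner amdgpu
```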

cd /lib/modules
ls -l
for i in *; do (cd $i && find * -type f | sort > /tmp/m.$i); done
cd /tmp
ls -l m.*
wc -l m.*

All the kernel modules for a kernel are usually in a directory in /lib/modules, and since you can boot both the original kernel and the manual build, you have at least 2 directories here. We now have a list of every file for each kernel in /tmp/m.*, although there may be some uncertainty here because I don’t know your exact configuration, and you might have some external extra modules in a separate directory.

You can see from the wc -l output the count of lines, which is roughly the count of modules, in each directory, and in general I expect them to be the same for the same configuration and the same extra modules added. Since amdgpu is missing, probably not, yet.

That should give you a summary of modules for each kernel, and you can use diff to find the differences between the files (m.old and m.new stand for the two lists you want to compare), i.e.

diff -u m.old m.new

If you want to know what is in m.old but not in m.new, leaving out what is common to both, then you can use something like comm:

comm -23 m.old m.new

Read the comm manual to see the variations: -1 hides lines found only in file 1, -2 hides lines found only in file 2, -3 hides lines common to both. Since it deals with negatives it usually confuses me a bit until I get it right.
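One catch when comparing the lists directly: a packaged kernel may ship compressed modules (amdgpu.ko.zst) while a hand-built one may install plain amdgpu.ko, in which case every line looks different. A sketch that strips the compression suffix and keeps only the kernel/ subtree before calling comm; the function names are made up, and the suffix list (zst, xz, gz) is an assumption about what you are likely to meet.

```sh
# Keep only module paths and strip a trailing compression suffix,
# so amdgpu.ko and amdgpu.ko.zst compare equal.
normalise_modules() {
    # $1: a module list as written by the find loop above
    grep '^kernel/' "$1" | sed -E 's/\.ko\.(zst|xz|gz)$/.ko/' | LC_ALL=C sort -u
}

# Print modules present in the first list but absent from the second.
missing_modules() {
    # $1: list from the working kernel, $2: list from the self-built kernel
    list_a=$(mktemp) list_b=$(mktemp)
    normalise_modules "$1" > "$list_a"
    normalise_modules "$2" > "$list_b"
    # -2 hides lines only in the second list, -3 hides lines common to both
    LC_ALL=C comm -23 "$list_a" "$list_b"
    rm -f "$list_a" "$list_b"
}

# Usage:
#   missing_modules /tmp/m.6.4.3-arch1-2 /tmp/m.5.9.2
```

Expect a long list when the two kernels are several versions apart, since newer kernels carry drivers that did not exist yet in older ones; grep the result for the drivers you care about.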

Ohh I see, I’m sorry for misinterpreting. I have 3 kernels: 1. the standard linux kernel, 2. linux-tkg-pds, 3. Linux 5.9.2 (self-compiled) // edit: and some other failed attempts to build the kernel where I had weird errors I hadn’t fixed
And yes, I understand now. Thank you for explaining, and again thank you for your time and patience! :slight_smile:

No worries at all, I really like the terminal and I am eager to learn more commands I can use and master

At least hopefully one less thing to worry about :sweat_smile:

About the first command, find /lib/modules -type f -name 'amdgpu*':
it already finds something

/lib/modules/6.4.3-273-tkg-pds/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.zst
/lib/modules/6.4.3-273-tkg-pds/build/include/uapi/drm/amdgpu_drm.h
/lib/modules/5.9.2/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko
/lib/modules/6.4.3-arch1-2/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.zst
/lib/modules/6.4.3-arch1-2/build/include/uapi/drm/amdgpu_drm.h

This is the output, and I see that the 5.9.2 kernel doesn’t have the amdgpu_drm.h file. Could that be a problem?

The output of pacman -Ql | grep /lib/modules.*amdgpu is as follows:

[mixel@mixel-main-new modules]$ pacman -Ql | grep /lib/modules.*amdgpu
linux /usr/lib/modules/6.4.3-arch1-2/kernel/drivers/gpu/drm/amd/amdgpu/
linux /usr/lib/modules/6.4.3-arch1-2/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.zst
linux-headers /usr/lib/modules/6.4.3-arch1-2/build/drivers/gpu/drm/amd/amdgpu/
linux-headers /usr/lib/modules/6.4.3-arch1-2/build/drivers/gpu/drm/amd/amdgpu/Kconfig
linux-headers /usr/lib/modules/6.4.3-arch1-2/build/include/uapi/drm/amdgpu_drm.h
linux-tkg-pds /usr/lib/modules/6.4.3-273-tkg-pds/kernel/drivers/gpu/drm/amd/amdgpu/
linux-tkg-pds /usr/lib/modules/6.4.3-273-tkg-pds/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.zst
linux-tkg-pds-headers /usr/lib/modules/6.4.3-273-tkg-pds/build/drivers/gpu/drm/amd/amdgpu/
linux-tkg-pds-headers /usr/lib/modules/6.4.3-273-tkg-pds/build/drivers/gpu/drm/amd/amdgpu/Kconfig
linux-tkg-pds-headers /usr/lib/modules/6.4.3-273-tkg-pds/build/include/uapi/drm/amdgpu_drm.h

I don’t see any output regarding any kernel version 5.X.X

Here are the commands. I removed unnecessary entries, e.g. systemd-private-64775c5eb29d4703bcc27a97174992e9-earlyoom.service-Qqr3MA

[mixel@mixel-main-new modules]$ cd /lib/modules
[mixel@mixel-main-new modules]$ ls -l
total 52
drwxr-xr-x 3 root root 4096 18. Jul 23:47 5.4.249
drwxr-xr-x 3 root root 4096 19. Jul 21:54 5.9.2
drwxr-xr-x 3 root root 4096 30. Dez 2022  6.0.12-arch1-1
drwxr-xr-x 3 root root 4096 21. Okt 2022  6.0.1-arch2-1
drwxr-xr-x 3 root root 4096 31. Okt 2022  6.0.2-arch1-1
drwxr-xr-x 3 root root 4096 15. Nov 2022  6.0.6-arch1-1
drwxr-xr-x 3 root root 4096 12. Jan 2023  6.1.1-arch1-1
drwxr-xr-x 3 root root 4096 21. Jan 11:59 6.1.4-arch1-1
drwxr-xr-x 5 root root 4096 18. Jul 18:24 6.4.3-273-tkg-pds
drwxr-xr-x 5 root root 4096 18. Jul 18:24 6.4.3-arch1-2
[mixel@mixel-main-new modules]$ for i in *; do (cd $i && find * -type f | sort > /tmp/m.$i); done
[mixel@mixel-main-new modules]$ cd /tmp
[mixel@mixel-main-new tmp]$ ls -l m.*
-rw-r--r-- 1 mixel mixel    681 21. Jul 15:00 m.5.4.249
-rw-r--r-- 1 mixel mixel 202340 21. Jul 15:00 m.5.9.2
-rw-r--r-- 1 mixel mixel     52 21. Jul 15:00 m.6.0.12-arch1-1
-rw-r--r-- 1 mixel mixel      0 21. Jul 15:00 m.6.0.1-arch2-1
-rw-r--r-- 1 mixel mixel      0 21. Jul 15:00 m.6.0.2-arch1-1
-rw-r--r-- 1 mixel mixel      0 21. Jul 15:00 m.6.0.6-arch1-1
-rw-r--r-- 1 mixel mixel     52 21. Jul 15:00 m.6.1.1-arch1-1
-rw-r--r-- 1 mixel mixel     52 21. Jul 15:00 m.6.1.4-arch1-1
-rw-r--r-- 1 mixel mixel 907947 21. Jul 15:00 m.6.4.3-273-tkg-pds
-rw-r--r-- 1 mixel mixel 909177 21. Jul 15:00 m.6.4.3-arch1-2
[mixel@mixel-main-new tmp]$ wc -l m.*
     25 m.5.4.249
   5084 m.5.9.2
      2 m.6.0.12-arch1-1
      0 m.6.0.1-arch2-1
      0 m.6.0.2-arch1-1
      0 m.6.0.6-arch1-1
      2 m.6.1.1-arch1-1
      2 m.6.1.4-arch1-1
  23442 m.6.4.3-273-tkg-pds
  23474 m.6.4.3-arch1-2
  67666 total

Sadly I couldn’t get to the rest of your reply yet because I was busy today :confused:
I will do that once I am home again and have time, but as always thank you :).