Serious booting issues

About two months ago, I got rid of my desktop tower and decided to switch to small form factor computers in order to save space around my desk while staying practical for my actual needs. Also, the GPU in the tower was an RTX 3000 series, and being a Linux user, I figured it was time to move on from Nvidia in the hope of a better user experience. So far, this decision has made my life harder. Allow me to explain in detail, and hopefully someone can understand what’s going on here and help me make the right decision moving forward. Please note, I am a novice Linux user, but a rather quick learner and a great student when met with a great teacher.

I decided to buy two ‘Intel NUC 13th Gen’ devices in the bare-bones configuration, which doesn’t come with an SSD or RAM. I then bought 64 GB of 3200 MHz DDR4 G.Skill RAM and a 2 TB Samsung Evo Plus M.2 SSD for both devices. When the parts arrived, I installed them, updated the BIOS on both devices to the latest version, and turned off Secure Boot to avoid running into issues with Linux.

From the start, the plan was to use one of these NUC devices with Debian Stable and the other with EndeavourOS. As soon as I tried to install Linux on the first device, I noticed something wasn’t right. The ISO images were not being detected by the BIOS, meaning I couldn’t boot from the USB. I spent countless hours figuring this one out, and thankfully I did. The issue was that for some dumb reason, Intel decided not to support MBR partition tables on this device and only supports GPT partition tables. Most Linux distro images default to MBR as far as I can tell, so this was an interesting discovery. The only workaround I could find was to use ‘Ventoy’, which allows you to format the USB drive with a GPT partition table. I went ahead and did that, and it worked and allowed me to install Linux on both devices.
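
For anyone wanting to confirm which partition table a written USB stick actually carries, here is a minimal Python sketch. It assumes 512-byte logical sectors (so a GPT header would sit at byte offset 512), and the function name `detect_partition_table` is purely illustrative. Hybrid ISO images can carry both an MBR and a GPT, in which case this reports "gpt".

```python
def detect_partition_table(first_sectors: bytes) -> str:
    """Classify the partition table from the first 1024 bytes of a disk or image.

    Assumption: 512-byte logical sectors. On 4096-byte-sector devices the
    GPT header lives at offset 4096 and this sketch would not see it.
    """
    if len(first_sectors) < 1024:
        raise ValueError("need at least the first 1024 bytes")

    # A classic MBR ends with the boot signature 0x55 0xAA at offset 510.
    has_boot_signature = first_sectors[510:512] == b"\x55\xaa"
    # A GPT header starts with the ASCII signature "EFI PART" at LBA 1.
    has_gpt_header = first_sectors[512:520] == b"EFI PART"
    # The four 16-byte MBR partition entries start at offset 446;
    # the partition type byte is at offset 4 within each entry.
    mbr_types = [first_sectors[446 + 16 * i + 4] for i in range(4)]

    if has_gpt_header:
        return "gpt"
    if has_boot_signature and any(t != 0 for t in mbr_types):
        return "mbr"
    return "unknown"

# Usage (needs root; replace /dev/sdX with the actual USB device):
#   with open("/dev/sdX", "rb") as f:
#       print(detect_partition_table(f.read(1024)))
```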

Things worked fine for maybe a week at most before I started noticing issues. The EndeavourOS device stopped booting for some reason and would only reach a black screen with a tinge of light to it, meaning something is being output, but nothing visible shows up on my screen. About 90% of the time when this happens, I cannot reboot from the keyboard and am forced to do a hardware reset. The Debian device has a similar problem: it very often freezes as soon as GRUB prints “Loading Ramdisk”. To make matters even worse, once a device goes into this non-bootable state, not only can I not get back in, but 99% of my attempts with a Live USB of any distro do the exact same thing, which makes no sense. Why would a Live USB fail to boot and only reach a black screen, regardless of whether I choose ‘nomodeset’, recovery mode, or anything along those lines? I must have tried a good 10 distros, and none of them would boot from USB in any capacity.

Now here comes the interesting part. I did an RMA on one of the devices through Intel support, and the problem still occurs on the new device. I tried changing SSDs and it still occurs. I tried changing RAM and it still occurs. Today I decided to try one option on the Fedora Live ISO titled ‘test media kit and start live’ or something along those lines. I think it’s the second option down when loading their KDE Spin. Anyhow, I finally got in with that option. I quickly loaded up ‘KDE Partition Manager’, used the ‘check’ disk option on my OS SSD, and applied it. After I rebooted, I was able to boot in again, which was awesome. I went ahead and did the same on the other device, and I was able to get in on that one as well. Success, right? No. Not at all. While this does get me in temporarily, the same issue recurs on its own within a few boots or days. After ruling out some of these variables, I have no clue what my next move should be.

I’ve never run into such a bizarre issue. Does anyone have any idea what might be happening? Should I sell these devices and move over to the ‘Lenovo ThinkCentre’ mini PC instead? I hear Lenovo ThinkPads are very good for Linux, so I am wondering if their mini PCs would’ve been a better buy than these Intel NUCs. Regardless, a part of me really wants to understand why this is happening. I went through this forum and saw some other users having quasi-similar issues.

Thanks in advance for reading and helping!

It sounds like you have a lemon. I know you got a replacement unit, but could it be you got two lemons?

I was reading through some of the posts in the community forum here, and most of the folks with Intel NUC 13 + Linux issues are griping about the WiFi card. Not your thing!

Are you sure the computer is getting enough power? Those little NUCs typically don’t need much, but if it is on a power strip with a bunch of other devices or something it could cause booting issues when everything is starting up at once.

Check in the BIOS settings that the SATA controller is set to AHCI mode (not RAID, or anything else) and you have it in “UEFI only” mode or CSM is disabled.
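
On a boot that does succeed, you can double-check from Linux which firmware mode was actually used. A small Python sketch, relying on the fact that the kernel exposes `/sys/firmware/efi` only when booted through UEFI; the function name `booted_in_uefi_mode` is just made up for illustration:

```python
import os

def booted_in_uefi_mode(efi_dir: str = "/sys/firmware/efi") -> bool:
    """Return True if the running Linux system was started via UEFI.

    The kernel creates /sys/firmware/efi only on UEFI boots, so a missing
    directory suggests a legacy BIOS/CSM boot (or a non-Linux system).
    """
    return os.path.isdir(efi_dir)

print("UEFI" if booted_in_uefi_mode() else "legacy BIOS/CSM (or not Linux)")
```

If this reports a legacy boot on a machine that is supposed to be "UEFI only", the BIOS setting is probably not what you think it is.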

It could also be a bad BIOS update. This is from the ArchWiki article:

Note: Prior to BIOS update, review the community response to the latest BIOS version compatability with the specific NUC model, as some versions are known to cause new regressions.

Maybe try the BIOS release prior to the most recent, instead of the most recent.

Very interesting. Yeah, the power hint is a very good one.

Things I would try:

  • investigate your physical setup - physically change the location of the devices. Faulty power line, room too humid, too hot, not enough air, etc. Move them far away from each other as well (see 3rd point)
  • what do you connect to the NUC devices? Mouse, keyboard, screen - there is a good chance one of those things, or one of its cables, is faulty
  • does the failure happen when you boot both of them, or also when you boot only one of them? Just in case it is a one-in-a-million corner case of some kind of resonance
  • disable all non-essential stuff in BIOS; set everything on low
  • once you log in - does journalctl -b -1 tell any story?
  • try Windows on one of those devices for a few days
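
If you want to collect the previous boot's errors in a repeatable way, something like this Python sketch could help. It only wraps the `journalctl` call from the list above using the long-form options `--boot`, `--priority` and `--no-pager`; the helper names are invented for illustration. It also assumes the journal is persistent (e.g. `/var/log/journal` exists), otherwise there are no logs from earlier boots to read.

```python
import subprocess

def journal_cmd(boot_offset: int = -1, priority: str = "err") -> list[str]:
    """Build a journalctl command line for a given boot and minimum priority.

    boot_offset=-1 means the previous boot, 0 the current one.
    """
    return [
        "journalctl",
        f"--boot={boot_offset}",
        f"--priority={priority}",
        "--no-pager",
    ]

def read_journal(boot_offset: int = -1, priority: str = "err") -> str:
    """Run journalctl and return its output (or its error text if it failed)."""
    result = subprocess.run(
        journal_cmd(boot_offset, priority),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout or result.stderr

# Usage: print(read_journal())   # errors and worse from the previous boot
```

Keep in mind that a hard freeze may prevent the last messages from ever reaching the disk, so an empty or clean log does not rule anything out.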

I assume you run the standard kernel on EndeavourOS.

Good luck!