I’d like to ask for some help on what to do next.
It’s my first time building a new desktop. I finished putting all the pieces together and installed EOS on Aug. 9th. Since then, I’ve been experiencing sporadic hard reboots.
Here is my hardware info, generated with hw-probe: https://linux-hardware.org/?probe=296f4130cd
The problem
- I experienced two reboots on the installer ISO while it was downloading packages for the install. I didn’t pay proper attention to them because my mind was occupied with the download taking so long. (It was a bug in the ISO hot-patch which got fixed a few days after AFAIK.) I was AFK when the second reboot happened, and the ISO booted into the first option (non-NVIDIA, so I guess nouveau?). I installed EOS from that session, which took like 4 hours lol, but that’s ok.
- After the installation, the reboots have happened randomly in both TTY and Hyprland. I haven’t been able to identify any pattern in when they happen, whether in CPU/GPU workload, temperatures, networking, disk I/O, etc.
- When a reboot happens, it happens in an instant. There is no freezing or stuttering beforehand.
Output of `last reboot`
reboot system boot 6.10.6-arch1-1 Mon Aug 26 16:23 still running
reboot system boot 6.10.6-arch1-1 Mon Aug 26 15:52 - 16:21 (00:29)
reboot system boot 6.10.6-arch1-1 Sat Aug 24 23:00 - 23:37 (00:36)
reboot system boot 6.10.6-arch1-1 Sat Aug 24 16:46 - crash (06:13)
reboot system boot 6.10.6-arch1-1 Sat Aug 24 12:03 - crash (04:43)
reboot system boot 6.10.6-arch1-1 Sat Aug 24 10:51 - crash (01:12)
reboot system boot 6.10.6-arch1-1 Sat Aug 24 10:46 - crash (00:04)
reboot system boot 6.10.6-arch1-1 Sat Aug 24 01:10 - 01:15 (00:05)
reboot system boot 6.10.6-arch1-1 Fri Aug 23 22:42 - crash (02:28)
reboot system boot 6.10.6-arch1-1 Fri Aug 23 16:38 - 17:04 (00:26)
reboot system boot 6.10.6-arch1-1 Fri Aug 23 15:38 - 16:38 (00:59)
reboot system boot 6.10.6-arch1-1 Fri Aug 23 13:14 - crash (02:24)
reboot system boot 6.10.6-arch1-1 Thu Aug 22 20:35 - 13:14 (16:38)
reboot system boot 6.10.6-arch1-1 Thu Aug 22 13:04 - crash (07:31)
reboot system boot 6.10.6-arch1-1 Wed Aug 21 15:17 - 01:49 (10:32)
reboot system boot 6.10.6-arch1-1 Wed Aug 21 13:22 - 15:14 (01:51)
reboot system boot 6.10.5-arch1-1 Tue Aug 20 13:50 - 00:59 (11:09)
reboot system boot 6.10.5-arch1-1 Mon Aug 19 20:31 - 00:35 (04:04)
reboot system boot 6.10.5-arch1-1 Mon Aug 19 18:48 - 19:52 (01:03)
reboot system boot 6.10.5-arch1-1 Mon Aug 19 12:23 - crash (06:24)
reboot system boot 6.10.5-arch1-1 Sat Aug 17 13:43 - 15:28 (01:44)
reboot system boot 6.10.5-arch1-1 Fri Aug 16 13:45 - 01:10 (11:25)
reboot system boot 6.10.4-arch2-1 Thu Aug 15 22:47 - 03:04 (04:16)
reboot system boot 6.10.4-arch2-1 Thu Aug 15 13:14 - crash (09:33)
reboot system boot 6.10.4-arch2-1 Thu Aug 15 13:11 - 13:13 (00:02)
reboot system boot 6.10.4-arch2-1 Wed Aug 14 13:38 - 00:44 (11:06)
reboot system boot 6.10.4-arch2-1 Tue Aug 13 22:19 - 23:04 (00:45)
reboot system boot 6.10.4-arch2-1 Tue Aug 13 22:01 - 22:18 (00:17)
reboot system boot 6.10.4-arch2-1 Tue Aug 13 17:54 - 17:58 (00:04)
reboot system boot 6.10.4-arch2-1 Tue Aug 13 14:44 - crash (03:10)
reboot system boot 6.10.4-arch2-1 Tue Aug 13 11:42 - crash (03:01)
reboot system boot 6.10.3-arch1-2 Mon Aug 12 16:16 - 00:28 (08:11)
reboot system boot 6.10.3-arch1-2 Mon Aug 12 15:50 - 16:15 (00:24)
reboot system boot 6.10.3-arch1-2 Sun Aug 11 18:33 - 01:10 (06:36)
reboot system boot 6.10.3-arch1-2 Sat Aug 10 18:08 - 01:19 (07:11)
reboot system boot 6.10.3-arch1-2 Sat Aug 10 00:59 - 01:02 (00:03)
reboot system boot 6.10.3-arch1-2 Fri Aug 9 23:01 - crash (01:57)
wtmp begins Fri Aug 9 23:01:13 2024
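In case it helps to spot a pattern, the output above could be tallied per kernel version. This is only a minimal Python sketch: `tally_sessions` is a name made up for illustration, and it assumes the exact column layout shown above. As far as I understand, an entry with an end time only means a shutdown record was logged, so "clean" here is a rough label rather than proof of a normal shutdown.

```python
from collections import Counter


def tally_sessions(last_output: str) -> dict:
    """Count running / crashed / cleanly ended sessions per kernel version.

    Assumes lines shaped like the `last reboot` output above:
    reboot system boot <kernel> <weekday> <month> <day> <start> ...
    """
    tally = {}
    for line in last_output.splitlines():
        fields = line.split()
        # Skip anything that is not a boot record (e.g. the "wtmp begins" line).
        if len(fields) < 8 or fields[:3] != ["reboot", "system", "boot"]:
            continue
        kernel = fields[3]
        if "still" in fields:
            outcome = "running"   # current session
        elif "crash" in fields:
            outcome = "crash"     # no shutdown record was logged
        else:
            outcome = "clean"     # an end time was recorded
        tally.setdefault(kernel, Counter())[outcome] += 1
    return tally
```

Feeding it the text of `last reboot` (for example, pasted into a string or read from a saved file) gives one `Counter` per kernel, which makes it easier to see whether crashes cluster on a particular kernel version.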
Things I’ve tried so far
- `journalctl -e -b -1` does not show anything out of the ordinary. Sometimes, the last entry is over 30 minutes before the reboot.
- I’ve tried tweaking some BIOS settings. I actually found out that my MoBo was setting a wrong SPD profile for my RAM, so I set that to a non-XMP baseline profile (4800MHz, 40-40-40-77, 1.1V). I also turned off “Dynamic Boost” for the RAM.
- I’ve also tried turning off turbo boost for my Intel CPU, turning on “Power Loading”, which keeps additional load on the PSU when a lot of parts are idle, and changing the power button to activate only when long-pressed for 4 seconds.
- I got the `intel-ucode` package update on Aug. 15th and my MoBo BIOS update on Aug. 19th regarding the Intel CPU microcode bug requesting too much voltage.
- Ran `memtest86+` with 4 consecutive passes and no errors for almost 5 hours.
- Ran `stress` using `s-tui` for 10 minutes each, once with `sqrt()` only and once again combining all other options. Didn’t notice anything wrong, including temperature.
- Ran `vkmark` with default settings. Didn’t notice anything wrong, including temperature.
- Ran `unigine-valley`. The screen looked like 1 fps, probably b/c I haven’t figured out the right Hyprland settings, but the benchmark reported 71.5 to 617.5 fps with 319.2 fps on average.
- Tried running the demo of Tekken 8 from Steam. Ran fine with the CPU turbo boost both on and off.
Things I’m considering
- Maybe the PSU is faulty, given that the system journal doesn’t seem to have time to flush to the file system? Maybe I should look into what people call RMA? But I’ve also read that PSU problems tend to involve a few seconds of freezing or other malfunctions.
- Maybe I should try using the `nouveau` driver? Though I’m not sure if this is actually a driver problem.
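Since the journal sometimes ends long before the reboot, one more thing that might help pin down the exact moment the machine dies is a small heartbeat logger that forces a timestamp to disk every second. The following is only a sketch under my own assumptions: `write_heartbeat` and the log path are hypothetical names, and it relies on `os.fsync` actually reaching the disk, which can depend on the drive and file system.

```python
import os
import time
from datetime import datetime
from typing import Optional


def write_heartbeat(path: str, interval_s: float = 1.0,
                    beats: Optional[int] = None) -> None:
    """Append a timestamp to `path` every `interval_s` seconds.

    Each line is flushed and fsync'ed so that the last line on disk
    should be close to the moment of a sudden power loss or reset.
    `beats=None` means run until interrupted.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as f:
        while beats is None or count < beats:
            f.write(datetime.now().isoformat(timespec="seconds") + "\n")
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to write it to the disk
            count += 1
            time.sleep(interval_s)
```

Leaving something like `write_heartbeat("heartbeat.log")` running in a terminal and then comparing the last timestamp in the file with the next boot time in `last reboot` would show how long the gap really is.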
What I’d like to ask
I’m hoping someone can see something I’m not seeing or can offer general guidance on approaching this kind of problem. I’ve been doing my best to search the internet for information, but now I feel like I should get some help from people more experienced with computer hardware.
I’m aware that my problem might not be specific to EndeavourOS, but I’m posting the question here because I don’t really know where else I could go. If you happen to know a place that might be a better fit, I’d be happy to hear where that would be.