I am very frustrated. I have been having random reboots for months now, and everything I have done to track down the cause has failed. I'm looking for some advice…
PROBLEM: A reboot happens about once a day. I first thought it only happened while I was on the command line, but then it happened while I was using Firefox. The closest I have come to reproducing it is in System Profiler and Benchmark: if I click “CPU Blowfish”, it has crashed several times, which suggests it’s CPU related. The PC does not have to be under load for this to happen; I am not running any games and the fans are hardly spinning.
SYSTEM INFO:
Kernel Linux 6.6.10-arch1-1 (x86_64)
Version #1 SMP PREEMPT_DYNAMIC Fri, 05 Jan 2024 16:20:41 +0000
C Library GNU C Library / (GNU libc) 2.38
Distribution EndeavourOS Linux
Processor 13th Gen Intel(R) Core™ i9-13900K
Memory 65648MB (2125MB used)
GPU: NVIDIA GeForce RTX 3090/PCIe/SSE2 (BIOS 94.024b.00.0b)
Mother Board: Asus Prime Z790-A Wifi (BIOS 1604)
THINGS I HAVE TRIED:
- I updated to the latest motherboard BIOS and GPU VBIOS; neither stopped the reboots.
- I monitored temps while idle and under load; the CPU never got above 65 °C. It also didn’t reboot under load.
- I ran Memtest64+ for 30 min; all tests passed.
- I have searched every log I can find. The only things I have found are in journalctl -b and are as follows:
[ 0.000000] x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
I have turned this off from GRUB with “split_lock_detect=off”, but this did not stop the random reboots.
Jan 11 19:36:38 batcave kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20230628/dsfield-184)
Jan 11 19:36:38 batcave kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20230628/dswload2-477)
Jan 11 19:36:38 batcave kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20230628/psparse-529)
I haven’t figured out how to make these stop, but it’s my understanding that they can be ignored. There are about 20-25 of these messages.
- When I run “last reboot” right after a crash, I see 2 running kernels. I believe the second one is just the crashed boot that the OS never shut down.
reboot system boot 6.6.10-arch1-1 Sun Jan 14 13:25 still running
reboot system boot 6.6.10-arch1-1 Sun Jan 14 13:23 still running
My first reboot was on kernel 6.0.8-arch1-1.
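In case it helps anyone compare notes, the “last reboot” check above could be scripted roughly like this. It is only a minimal sketch: it assumes the output looks like the two lines I pasted, that “last” lists the newest boot first, and that my guess is right that any extra “still running” entry is a boot that never recorded a clean shutdown. The function name find_unclean_boots is just something I made up for illustration.

```python
def find_unclean_boots(last_output: str) -> list[tuple[str, str]]:
    """Return (kernel, boot_time) for every 'still running' boot except the newest.

    Assumption: the text comes from `last reboot`, newest entry first, with
    lines shaped like:
        reboot system boot 6.6.10-arch1-1 Sun Jan 14 13:25 still running
    """
    still_running = []
    for line in last_output.splitlines():
        fields = line.split()
        # Skip blank lines, the trailing "wtmp begins ..." line, and anything else
        if len(fields) < 7 or fields[:3] != ["reboot", "system", "boot"]:
            continue
        if fields[-2:] == ["still", "running"]:
            kernel = fields[3]
            boot_time = " ".join(fields[4:-2])
            still_running.append((kernel, boot_time))
    # The first entry is the boot we are in right now; the rest are suspects
    return still_running[1:]
```

To try it, the output of “last reboot” can be captured with subprocess.run(["last", "reboot"], capture_output=True, text=True).stdout and passed to the function. An empty list would mean every earlier boot recorded a shutdown.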
MY GUT:
When I first built the computer I was not having this issue; it seems to have “developed”. The only new hardware I have added is a Keychron Q6 keyboard. My power supply is a beefy Seasonic 1000 W PSU.
I think this is some kernel process that wakes up, maybe related to CPU usage, because “CPU Blowfish” seems to trigger it fairly consistently. But it’s not from load; it never gets that chance. It’s like it tries to start something and crashes.
Any help is appreciated. I feel like I am looking in the wrong logs, and that there has to be something somewhere that says what is happening…