Yesterday I replaced my Ryzen 3 2200G with a Ryzen 7 5700X and then reinstalled EndeavourOS (it was time to clean up my system anyway). Now, every couple of hours, the system crashes and reboots. This does not happen on Windows at all, so I suspect it is related to Linux rather than faulty hardware. Before this upgrade I very rarely experienced any crashes on EOS.
After the crashes I ran journalctl | grep -i "hardware err"
And received this:
Sep 19 07:19:18 netsu-pc kernel: mce: [Hardware Error]: Machine check events logged
Sep 19 07:19:18 netsu-pc kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 5: bea0000001000108
Sep 19 07:19:18 netsu-pc kernel: mce: [Hardware Error]: TSC 0
Sep 19 07:19:18 netsu-pc kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1726723156 SOCKET 0 APIC a microcode a20120a
Sep 19 13:59:19 netsu-pc kernel: mce: [Hardware Error]: Machine check events logged
Sep 19 13:59:19 netsu-pc kernel: mce: [Hardware Error]: CPU 13: Machine Check: 0 Bank 5: bea0000000000108
Sep 19 13:59:19 netsu-pc kernel: mce: [Hardware Error]: TSC 0 ADDR ffffff8e26f382 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Sep 19 13:59:19 netsu-pc kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1726747157 SOCKET 0 APIC b microcode a20120a
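To make the two status values in the log easier to read, here is a rough sketch that decodes them. It assumes the architectural x86 MCi_STATUS layout (flag bits 63 down to 57, MCA error code in the low 16 bits) and does not attempt any AMD-specific fields. The function name decode_mce_status is my own, not part of any tool.

```python
# Architectural MCi_STATUS flag bits (bit position -> short name).
# Assumption: only the generic x86 bits are decoded here; vendor-specific
# fields are left alone.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context corrupt
}


def decode_mce_status(status_hex):
    """Split a hex status string (as printed by the kernel) into flags and error code."""
    status = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


if __name__ == "__main__":
    # The two values from the journal output above.
    for raw in ("bea0000001000108", "bea0000000000108"):
        info = decode_mce_status(raw)
        print(raw, info["flags"], hex(info["error_code"]))
```

Both values decode to the same flags (VAL, UC, EN, MISCV, ADDRV, PCC) and the same error code 0x108. They differ only in bit 24, which this sketch does not interpret.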
Can confirm nothing is overheating (heard it may be the cause):
sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +54.2°C
Tccd1: +53.5°C
nouveau-pci-0800
Adapter: PCI adapter
fan1: 745 RPM
temp1: +35.0°C (high = +95.0°C, hyst = +3.0°C)
(crit = +105.0°C, hyst = +5.0°C)
(emerg = +135.0°C, hyst = +5.0°C)
I also tried adding processor.max_cstate=1
to GRUB_CMDLINE_LINUX_DEFAULT
in the GRUB config after reading about others with similar issues. At best it slightly reduced how often the crashes happen, but most likely it did little, if anything.
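Since GRUB only picks up changes to its defaults file after the config is regenerated, it is worth double-checking that the parameter actually reached the running kernel. A minimal sketch, assuming a Linux system where /proc/cmdline is readable (the helper name has_kernel_param is hypothetical):

```python
def has_kernel_param(cmdline, param):
    """Return True if param appears as a whole token on the kernel command line."""
    # Split on whitespace so "processor.max_cstate=1" does not match
    # a longer token such as "processor.max_cstate=10".
    return param in cmdline.split()


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as f:
        print(has_kernel_param(f.read(), "processor.max_cstate=1"))
```

If this prints False after a reboot, the setting never took effect and the test above says nothing about C-states either way.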
The crashes seem to happen most often when I switch applications (for example, from VSCode to Vivaldi with Alt+Tab). In fact, about 70% of the crashes seem to happen while I'm using a browser, but that may just be coincidence. The CPU and RAM are not stressed at all; usage stays low the whole time. This may also be a Budgie issue, since I've always used XFCE with i3 and decided to try Budgie with this new install.
Possibly Important Hardware Info:
CPU: Ryzen 7 5700X
GPU: NVIDIA GTX 1080 Ti
Software Info:
OS: EndeavourOS (Linux 6.6.51-1-lts and latest available downloadable ISO)
DE: Budgie
Common Software Open During Crashes: Vivaldi, Firefox, VSCode