My PC is frequently rebooting itself

Hyper is based on Electron? :thinking:

yes

Run memtest from any of the Linux bootable USBs that include it (AFAIR EOS has it too). Let it run for at least a couple of hours, to give the system a chance to heat up.
Depending on the distro you choose, you may have to set boot to legacy in the BIOS to run this.

Can you tell us more about your hardware (even the PSU), and whether you have overclocked or undervolted anything?

From my experience, random shutdowns are almost always a hardware issue: most often it's the HDD (check S.M.A.R.T.), second the CPU (overheating or power issues), and sometimes the RAM, but that's rare.


@locuaz
Have you tried running with mce=off as a kernel parameter in the default GRUB command line, then updating GRUB with sudo grub-mkconfig -o /boot/grub/grub.cfg?

Can you tell us more about your hardware (even the PSU), and whether you have overclocked or undervolted anything?

  • Memory: Corsair Vengeance LPX DDR4 2x8GB 3200MHz
  • Mainboard: ASUS CrossHair VI Hero
  • PSU: Antryx 650W Xtreme Pro
  • CPU cooler: Corsair H110i 280 mm
  • No OC/UV

From my experience, random shutdowns are almost always a hardware issue: most often it's the HDD (check S.M.A.R.T.), second the CPU (overheating or power issues), and sometimes the RAM, but that's rare.

Recently I bought 2 disks:

  • Samsung 970 EVO Plus Series - PCIe NVMe - M.2 Internal SSD
  • Western Digital HD WD Blue PC, 3 TB - Class 5400 RPM, SATA 6 Gb/s

I have started monitoring the temperature.
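For anyone who wants to log temperatures rather than watch them by eye, here is a minimal Python sketch. It assumes a Linux system that exposes sensors through the standard hwmon sysfs interface, where each `temp*_input` file holds millidegrees Celsius. The function name `read_temperatures` is made up for this example.

```python
from pathlib import Path

def read_temperatures(hwmon_root="/sys/class/hwmon"):
    """Return {(chip_name, sensor_file): degrees_celsius} from hwmon sysfs."""
    readings = {}
    for chip in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = chip / "name"
        # Fall back to the directory name if the chip exposes no "name" file.
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for sensor in sorted(chip.glob("temp*_input")):
            try:
                millidegrees = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors cannot be read; skip them
            readings[(chip_name, sensor.name)] = millidegrees / 1000.0
    return readings

# Print whatever sensors this machine exposes (nothing if the path is absent).
for (chip_name, sensor_name), celsius in read_temperatures().items():
    print(f"{chip_name} {sensor_name}: {celsius:.1f} C")
```

Running it on a timer and appending the output to a file gives a record of temperatures leading up to a reboot.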

Have you tried running with mce=off as a kernel parameter in the default GRUB command line, then updating GRUB with sudo grub-mkconfig -o /boot/grub/grub.cfg?

I've changed my default kernel to LTS; I'm waiting to see whether the same problem happens.

I've just had another reboot with the LTS kernel.

$ journalctl | grep mce                              
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: Machine check events logged
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffafb0ee40 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1663783406 SOCKET 0 APIC 8 microcode 8001138
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: Machine check events logged
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff955f8dae MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1663945955 SOCKET 0 APIC 9 microcode 8001138
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: bea0000000000108
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff959816fe MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1664667335 SOCKET 0 APIC 0 microcode 8001138
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: CPU 8: Machine Check: 0 Bank 5: bea0000000000108
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffc46121cc MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1665847776 SOCKET 0 APIC 1 microcode 8001138
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffba00503c MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1666234714 SOCKET 0 APIC 8 microcode 8001138
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: bea0000000000108
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff95a0503c MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1666847781 SOCKET 0 APIC 0 microcode 8001138
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 0: baa0000000060185
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: TSC 0 MISC d012000100000000 SYND 2d030000 IPID b000000000 
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667050812 SOCKET 0 APIC 0 microcode 8001138
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffa2d85ca0 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667085189 SOCKET 0 APIC 8 microcode 8001138
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffa2d85ca0 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667085189 SOCKET 0 APIC 9 microcode 8001138
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff8620803c MISC d0130fff00000000 SYND 4d000000 IPID 500b000000000 
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667236326 SOCKET 0 APIC 9 microcode 8001138
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: Machine check events logged
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffb98bab3c MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667399293 SOCKET 0 APIC 8 microcode 8001138
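
A log like this is easier to read once the events are grouped. Here is a minimal Python sketch that counts events per bank/status pair and per CPU. It assumes the line format shown above. The function name `summarize_mce` is made up for this example, and the sample lines are copied from the log.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108"
MCE_LINE = re.compile(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-f]+)")

def summarize_mce(lines):
    """Count machine check events per (bank, status) pair and per CPU."""
    by_signature = Counter()
    by_cpu = Counter()
    for line in lines:
        match = MCE_LINE.search(line)
        if match is None:
            continue  # skip the "events logged", TSC/ADDR and PROCESSOR lines
        cpu, bank, status = match.groups()
        by_signature[(int(bank), status)] += 1
        by_cpu[int(cpu)] += 1
    return by_signature, by_cpu

sample_lines = [
    "Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108",
    "Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 0: baa0000000060185",
    "Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108",
]
signatures, cpus = summarize_mce(sample_lines)
for (bank, status), count in signatures.most_common():
    print(f"bank {bank} status {status}: {count} event(s)")
```

Counting the full log above the same way gives ten events with bank 5 and status bea0000000000108, and one event with bank 0 and status baa0000000060185. They are spread over CPUs 0, 4, 8 and 12, so the same error signature keeps recurring rather than a different fault each time.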

I found this article dealing with a similar problem. There are two interesting comments:

That's not a good idea. If you boot with mce=off and the machine experiences an uncorrectable MCE which raises a machine check exception, the machine will immediately shutdown without the ability to even log the error to know what kind of error it was. Supplying "mce=off" on the kernel command line is almost never a good idea.
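
To see whether the events in the log are of the uncorrected kind this comment describes, the status word can be decoded. Here is a minimal Python sketch, assuming the usual x86 machine-check layout in which bits 63 to 57 of the status register are the VAL, OVER, UC, EN, MISCV, ADDRV and PCC flags and the low 16 bits are the error code. The function name `decode_status` is made up for this example, and the vendor-specific bits are not decoded.

```python
# Flag bits in the upper part of an x86 machine-check status register.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context corrupt
}

def decode_status(hex_status):
    """Return (list of set flag names, 16-bit error code) for a status word."""
    value = int(hex_status, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (value >> bit) & 1]
    error_code = value & 0xFFFF
    return flags, error_code

for status in ("bea0000000000108", "baa0000000060185"):
    flags, code = decode_status(status)
    print(f"{status}: {' '.join(flags)} code=0x{code:04x}")
```

Both status values from the log decode with UC and PCC set, which means uncorrected errors. The bank 0 value has ADDRV clear, which matches its log entry having no ADDR field.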

At long last, the solution that worked for me was to increase my RAM voltage in the BIOS from the default setting of 1.35v (with XMP profile enabled) to 1.38v. At 1.36v and 1.37v, I still experienced the occasional reboot.

I'm gonna increase my RAM voltage... crossing fingers.

https://wiki.archlinux.org/title/Ryzen#Random_reboots

https://wiki.gentoo.org/wiki/Ryzen#Random_reboots_with_mce_events


I've increased my RAM voltage, but my motherboard apparently does not have a Curve Optimizer feature; here are some images of my BIOS.

IMG_20221102_100505

I don't know whether increasing the CPU voltage refers to CPU Core Voltage or CPU SOC Voltage; both of them offer only these options:

  • Auto
  • Manual mode
  • Offset mode

IMG_20221102_103956
IMG_20221102_104028
IMG_20221102_104142
IMG_20221102_104148

First-gen Ryzen is very sensitive to RAM, and everything over the officially supported 2667 MHz is an overclock. It may be that the CPU can't handle your RAM.

I had a 1700 too, and 1.155 V is very low. I find it strange because it is set to auto; I even had the same motherboard, and on auto it was always around 1.35 V.

Have you tried idle=nomwait as a kernel parameter, then updating GRUB?
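
For reference, adding a kernel parameter on a GRUB system usually means editing the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub. Here is a minimal Python sketch of that edit as a text transformation. It assumes the value is wrapped in double quotes, and the function name `add_kernel_param` is made up for this example. It only transforms a string and does not touch any file.

```python
import re

# Assumes the usual form: GRUB_CMDLINE_LINUX_DEFAULT="param1 param2"
CMDLINE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")(.*)("[ \t]*)$', re.MULTILINE)

def add_kernel_param(grub_text, param):
    """Return grub_text with param added to GRUB_CMDLINE_LINUX_DEFAULT."""
    def append_param(match):
        params = match.group(2).split()
        if param not in params:  # keep the edit idempotent
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = CMDLINE.subn(append_param, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT line not found")
    return new_text

example = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet loglevel=3"\n'
print(add_kernel_param(example, "idle=nomwait"))
```

After the real file is changed (as root), the config still has to be regenerated with sudo grub-mkconfig -o /boot/grub/grub.cfg, and the parameter only takes effect after a reboot.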

Yes, I'm waiting for the next random reboot :grimacing:

@locuaz
I don't know what kind of user you are, but I don't change my UEFI settings much from the defaults, except for things that need to be enabled such as KVM, the XMP profile and maybe a couple of other items. I don't overclock, and I don't change my voltage settings or anything else; I have no need to. I rely on the UEFI settings and the current AMD AGESA to provide the best settings for the hardware in use.

I am an enthusiastic user :smiling_face_with_tear:, and like you I don't want to alter the BIOS or kernel, but these things happen, so what choice do I have?

Well, I am planning to buy a new PC, do you have any recommendations on a budget for a second world citizen?

I always build my own desktops, but that can be more expensive than some offers for prebuilt or custom-built units. It depends on what you are looking for, what the budget is, and what is being offered for that price. It also depends on what you can get in your area. I am lucky that I can get pretty much whatever is available. The newest Ryzens are out, but the newer boards are quite expensive and so are the latest components. I'm sure you can find something within your budget.

Isn't CPU Core voltage a bit low? It should be about 1.3V, or am I wrong?

I'm monitoring temperatures and stability these days. Are the checked temperatures normal?

image