Is Hyper based on Electron?
Yes.
Run memtest from any bootable Linux USB that includes it (AFAIR EOS has it too). Let it run for at least a couple of hours to give the system a chance to heat up.
Depending on the distro you choose, you may have to set the boot mode to legacy in the BIOS to run it.
Can you tell us more about your hardware (even the PSU), and whether you have overclocked or undervolted anything?
In my experience, random shutdowns are almost always a hardware issue: most often it's the HDD (check S.M.A.R.T.), second the CPU (overheating or power issues), and sometimes it's RAM, but that's rare.
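A quick way to run the S.M.A.R.T. check mentioned above is smartctl from the smartmontools package. The sketch below is only an illustration: the function names smart_verdict and check_disks are made up for this example, the device names are placeholders for whatever lsblk shows on your machine, and it only understands the ATA/NVMe-style "self-assessment test result" line.

```shell
# Reads `smartctl -H` output on stdin and succeeds only if the drive
# reports an overall verdict of PASSED (ATA/NVMe wording; an assumption).
smart_verdict() {
    grep -q 'self-assessment test result: PASSED'
}

# Prints a one-line verdict for every device passed as an argument.
# Needs root and smartmontools installed.
check_disks() {
    for dev in "$@"; do
        if sudo smartctl -H "$dev" | smart_verdict; then
            echo "$dev: PASSED"
        else
            echo "$dev: not PASSED, inspect with: sudo smartctl -a $dev"
        fi
    done
}

# Example with hypothetical device names:
#   check_disks /dev/sda /dev/nvme0
```

A PASSED verdict only means the drive does not consider itself failing; the full attribute list from smartctl -a is still worth a look.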
@locuaz
Have you tried running with mce=off as a kernel parameter in the default GRUB command line, and then updating GRUB with sudo grub-mkconfig -o /boot/grub/grub.cfg?
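For reference, adding a kernel parameter this way usually means editing GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub before regenerating the config. A minimal sketch, assuming the value sits on one line in single or double quotes; add_kernel_param is a hypothetical helper name, not a GRUB tool:

```shell
# Sketch: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# Usage: add_kernel_param PARAM [FILE]   (FILE defaults to /etc/default/grub)
# Assumes the value is wrapped in single or double quotes on a single line.
add_kernel_param() {
    local param="$1" file="${2:-/etc/default/grub}" line value
    line=$(grep -m1 '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file") || {
        echo "GRUB_CMDLINE_LINUX_DEFAULT not found in $file" >&2
        return 1
    }
    # Strip the quotes so existing parameters can be compared word by word.
    value=$(printf '%s' "${line#*=}" | tr -d "\"'")
    case " $value " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    # Keep a .bak copy and insert the parameter before the closing quote.
    sed -i.bak -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=[\"'].*)([\"'])[[:space:]]*\$|\1 ${param}\2|" "$file"
}
```

The function has to run as root to touch /etc/default/grub. Afterwards, regenerate the config with sudo grub-mkconfig -o /boot/grub/grub.cfg and reboot; cat /proc/cmdline shows whether the parameter is actually active.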
Can you tell us more about your hardware (even the PSU), and whether you have overclocked or undervolted anything?
- Memory: Corsair Vengeance LPX DDR4 2x8GB 3200MHz
- Mainboard: ASUS CrossHair VI Hero
- PSU: Antryx 650W Xtreme Pro
- CPU cooler: Corsair H110i 280 mm
- No OC/UV
In my experience, random shutdowns are almost always a hardware issue: most often it's the HDD (check S.M.A.R.T.), second the CPU (overheating or power issues), and sometimes it's RAM, but that's rare.
Recently I bought 2 disks:
- Samsung 970 EVO Plus Series - PCIe NVMe - M.2 Internal SSD
- Western Digital HD WD Blue PC, 3 TB - Class 5400 RPM, SATA 6 Gb/s
I have started monitoring the temperature.
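One hedged way to log temperatures over time without extra tools is to read the kernel's hwmon files under /sys/class/hwmon, which report millidegrees Celsius. read_temps is a made-up helper name, and which chips show up (k10temp, nvme, and so on) depends on the hardware and the loaded drivers:

```shell
# Sketch: print every hwmon temperature sensor as "chip sensor degrees".
# Usage: read_temps [BASE_DIR]   (BASE_DIR defaults to /sys/class/hwmon)
read_temps() {
    local base="${1:-/sys/class/hwmon}" f chip milli
    for f in "$base"/hwmon*/temp*_input; do
        [ -r "$f" ] || continue
        # Each hwmon directory carries a "name" file identifying the chip.
        chip=$(cat "${f%/*}/name" 2>/dev/null || echo unknown)
        milli=$(cat "$f")
        # Values are in millidegrees Celsius; print one decimal place.
        printf '%s %s %d.%d\n' "$chip" "${f##*/}" \
            $(( milli / 1000 )) $(( (milli % 1000) / 100 ))
    done
}

# Example: append a timestamped reading to a log once a minute.
#   while sleep 60; do date; read_temps; done >> temps.log
```

Having a log that ends right before a reboot makes it easier to see whether heat was involved.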
Have you tried running with mce=off as a kernel parameter in the default GRUB command line, and then updating GRUB with sudo grub-mkconfig -o /boot/grub/grub.cfg?
I've changed my default kernel to LTS, and I'm waiting to see whether the same problem happens.
I've just had another reboot with the LTS kernel.
$ journalctl | grep mce
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: Machine check events logged
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffafb0ee40 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Sep 21 13:03:33 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1663783406 SOCKET 0 APIC 8 microcode 8001138
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: Machine check events logged
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff955f8dae MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Sep 23 10:12:41 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1663945955 SOCKET 0 APIC 9 microcode 8001138
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: bea0000000000108
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff959816fe MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Oct 01 18:35:42 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1664667335 SOCKET 0 APIC 0 microcode 8001138
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: CPU 8: Machine Check: 0 Bank 5: bea0000000000108
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffc46121cc MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Oct 15 10:29:42 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1665847776 SOCKET 0 APIC 1 microcode 8001138
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffba00503c MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Oct 19 21:58:41 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1666234714 SOCKET 0 APIC 8 microcode 8001138
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: bea0000000000108
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff95a0503c MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Oct 27 00:16:28 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1666847781 SOCKET 0 APIC 0 microcode 8001138
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 0: baa0000000060185
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: TSC 0 MISC d012000100000000 SYND 2d030000 IPID b000000000
Oct 29 08:40:19 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667050812 SOCKET 0 APIC 0 microcode 8001138
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffa2d85ca0 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667085189 SOCKET 0 APIC 8 microcode 8001138
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffa2d85ca0 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Oct 29 18:13:16 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667085189 SOCKET 0 APIC 9 microcode 8001138
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: Machine check events logged
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: CPU 12: Machine Check: 0 Bank 5: bea0000000000108
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff8620803c MISC d0130fff00000000 SYND 4d000000 IPID 500b000000000
Oct 31 12:12:12 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667236326 SOCKET 0 APIC 9 microcode 8001138
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: Machine check events logged
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: bea0000000000108
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffb98bab3c MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Nov 02 09:28:18 antares kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1667399293 SOCKET 0 APIC 8 microcode 8001138
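To make a log like this easier to read, the repeated entries can be tallied and the top status bits decoded. This is a rough sketch with hypothetical helper names (summarize_mce, decode_mce_status); it assumes the journal line layout shown above and only decodes flag bits 63 to 57 of the MCA status register (VAL, OVER, UC, EN, MISCV, ADDRV, PCC), not the vendor-specific error code in the lower bits.

```shell
# Count machine check events per bank and status value.
# Reads journal lines on stdin, prints "COUNT x bank N status HEX".
summarize_mce() {
    awk '
        /Machine Check:/ {
            for (i = 1; i < NF; i++) {
                if ($i == "Bank") {
                    bank = $(i + 1); sub(/:$/, "", bank)
                    status = $(i + 2)
                }
            }
            count["bank " bank " status " status]++
        }
        END { for (k in count) printf "%d x %s\n", count[k], k }
    ' | sort -rn
}

# Decode the flag bits in the top byte of a 16-digit hex status value.
decode_mce_status() {
    local status="$1" top flags="" pair mask name
    [ "${#status}" -eq 16 ] || { echo "expected 16 hex digits" >&2; return 1; }
    top=$(( 0x$(printf '%.2s' "$status") ))   # bits 63..56 of the register
    # mask:name pairs for bits 63 (VAL) down to 57 (PCC)
    for pair in 128:VAL 64:OVER 32:UC 16:EN 8:MISCV 4:ADDRV 2:PCC; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $(( top & mask )) -ne 0 ]; then flags="$flags $name"; fi
    done
    echo "${flags# }"
}
```

Fed the journal output above (journalctl | grep mce | summarize_mce), this gives 10 events for bank 5 with status bea0000000000108 and 1 for bank 0 with status baa0000000060185. Both statuses decode with UC (uncorrected error) and PCC (processor context corrupt) set, which would be consistent with sudden reboots rather than clean shutdowns.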
I found this article dealing with a similar problem. There are two interesting comments:
That's not a good idea. If you boot with mce=off and the machine experiences an uncorrectable MCE which raises a machine check exception, the machine will immediately shutdown without the ability to even log the error to know what kind of error it was. Supplying "mce=off" on the kernel command line is almost never a good idea.
At long last, the solution that worked for me was to increase my RAM voltage in the BIOS from the default setting of 1.35v (with XMP profile enabled) to 1.38v. At 1.36v and 1.37v, I still experienced the occasional reboot.
I'm gonna increase my RAM voltage... crossing fingers.
https://wiki.archlinux.org/title/Ryzen#Random_reboots
https://wiki.gentoo.org/wiki/Ryzen#Random_reboots_with_mce_events
I've increased my RAM voltage, but my motherboard apparently does not have a Curve Optimizer feature. Here are some images of my BIOS.
I don't know whether increasing the CPU voltage refers to CPU Core Voltage or CPU SOC Voltage. Both of them offer only these options:
- Auto
- Manual mode
- Offset mode
First-gen Ryzen is very sensitive to RAM, and everything over the officially supported 2667MHz is OC. It may be that the CPU can't handle your RAM.
I had a 1700 too, but 1.155V is very low, and I find it strange because it is set to Auto. I even had the same MB, and on Auto it was always around 1.35V.
Have you tried idle=nomwait as a kernel parameter, and then updating GRUB?
Yes, I'm waiting for the next random reboot.
@locuaz
I don't know what kind of user you are, but I don't change my UEFI settings much from the defaults, except for things that need to be enabled such as KVM and the XMP profile, and maybe a couple of other items. I don't overclock, and I don't change my voltage settings or anything else. I have no need to do that. I rely on the UEFI settings and the current AMD AGESA to provide the best settings for the hardware in use.
I am an enthusiastic user, and like you I don't want to alter BIOS or kernel settings, but these things happen, so what choice do I have?
Well, I am planning to buy a new PC, do you have any recommendations on a budget for a second world citizen?
I always build my own desktops, but that can be more expensive than some offers for prebuilt or custom-built units. It depends on what you are looking for, what the budget is, and what is being offered for that price. It also depends on what you can get in your area. I am lucky that I can get pretty much whatever is available. The newest Ryzens are out, but the newer boards are quite expensive and so are the latest components. I'm sure you can find something that will be in your budget.
Isn't the CPU Core voltage a bit low? It should be about 1.3V, or am I wrong?
I'm monitoring temperatures and stability these days. Are these temperatures normal?