Random restarts (usually during night hours) - how to troubleshoot?

Noob question time.

I keep having random restarts, usually during night/early morning hours. I wake up and see that my PC has rebooted, or is stuck in some limbo with a black screen, in which case a forced shutdown and reboot usually works and the PC doesn’t crash again until the following night. This has been going on for long enough that I’ve gotten used to it. But now I need to travel and I need my PC to come back up while I’m away.

What’s the best way to start troubleshooting this? I’ve searched the internet for solutions, but haven’t come across anything that has helped.

Posting my logs below.

inxi -Fxxc0z | eos-sendlog: https://0x0.st/8g-c.txt

journalctl -k -b -2 | eos-sendlog: https://0x0.st/8g-A.txt

Is any more info needed for troubleshooting?

What is your firewall app?
Any router issue?

I use ufw for firewall. No issues with router that I know of.

Journal shows a hardware error.
The motherboard BIOS date is quite recent, but have you checked if there’s a new version?

Make sure you have removed or disabled firewalld, since it has been installed by default on EndeavourOS for a long time.

systemctl status firewalld will tell you if it is running.
If you use UFW you have to disable firewalld. If you don’t, the two will conflict and cause issues.
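
For reference, a minimal sketch of that check, assuming a systemd system where the unit is named firewalld. The helper name firewalld_state is made up for this post and is not part of any package:

```shell
# Print whether firewalld is running now and whether it starts at boot.
# firewalld_state is a hypothetical helper name.
firewalld_state() {
    # is-active / is-enabled print a one-word state and exit non-zero
    # when the unit is inactive or disabled, so the exit code is ignored.
    active=$(systemctl is-active firewalld 2>/dev/null || true)
    enabled=$(systemctl is-enabled firewalld 2>/dev/null || true)
    echo "active=${active:-unknown} enabled=${enabled:-unknown}"
}

# Usage:
#   firewalld_state
# If it reports active or enabled and you want to keep ufw:
#   sudo systemctl disable --now firewalld
```

The disable --now form stops the service immediately and also keeps it from starting on the next boot.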

The journal shows a ton of UFW log entries. But after those there is also a very aggressive L1 cache error, reported as “Uncorrected, software restartable error”. It could be a real hardware issue, or very aggressive performance settings for the CPU and/or RAM in the BIOS/firmware?
I also see zenpower and razermouse; these could cause issues too.
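
To get past the UFW noise, here is a rough sketch of a filter for the machine-check lines. The pattern is an assumption based on how the kernel usually tags these messages, and mce_lines is just a made-up name:

```shell
# Keep only lines that look like machine-check (MCE) reports.
# Reads log text on stdin; the pattern is a best guess, not an exhaustive list.
mce_lines() {
    grep -iE 'mce:|machine check|\[Hardware Error\]'
}

# Usage (kernel messages from two boots ago, same as the posted log):
#   journalctl -k -b -2 --no-pager | mce_lines
```

If this prints nothing for a boot that ended in a crash, the crash may not have been logged before the reset, which is itself a useful data point.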

Memtest could at least tell you whether the RAM hardware is faulty.

Also try the LTS kernel to see if it has the same issue. If it doesn’t, it could be an issue with zenpower or a kernel regression.
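
When comparing kernels it helps to note which sensor driver is actually loaded on each one. A small sketch, assuming the zenpower3-dkms module shows up as zenpower and the in-kernel alternative as k10temp (sensor_modules is a hypothetical helper name):

```shell
# Read `lsmod` output on stdin and keep only the AMD sensor drivers.
# Module names are assumptions: zenpower (DKMS) and k10temp (in-kernel).
sensor_modules() {
    grep -E '^(zenpower|k10temp)[[:space:]]'
}

# Usage:
#   uname -r                 # which kernel is running
#   lsmod | sensor_modules   # which sensor driver is loaded on it
```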

All kernels behave the same. Right now LTS doesn’t even boot to the GUI since the last update. Memtest was fine and all stress tests have completed without issues.

I thought the issue was with the NZXT H1 AIO, which tends to gunk up, but I’ve recently cleaned it and temps have not been high. I am planning to switch it to a Noctua air cooler though.

What’s most curious is the time of the crashes/reboots. Usually between 3 am and 5 am. I’ll have to check my Plex settings, because I do have it set to scan the library during “off hours”.
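
Since the crashes cluster in a fixed window, one way to look for a trigger is to pull only the warnings logged in that window. This is a sketch that assumes the journal is persistent; night_window_log is a made-up helper name and the 03:00 to 05:00 window is taken from the description above:

```shell
# Show warning-and-worse journal entries between 03:00 and 05:00 on a given day.
# $1 is the date as YYYY-MM-DD.
night_window_log() {
    journalctl --since "$1 03:00" --until "$1 05:00" -p warning --no-pager
}

# Usage:
#   journalctl --list-boots        # when each boot started and ended
#   systemctl list-timers --all    # scheduled jobs and when they fire
#   night_window_log 2024-01-15    # example date, replace with a crash night
```

Anything scheduled that reliably lands shortly before a crash (a library scan, a backup, a timer) is worth disabling for a night or two to see if the pattern changes.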

I can recommend Arctic fans. Very silent and a lot cheaper :wink:

There are MCE errors in the log, so this might be solved with some boot parameters. Anyone?
@ricklinux perhaps?
I don’t use an AMD CPU.

Error address: 0x00000005f1560410 indicates physical memory (RAM or cache).

Type: Load Store Unit, Cache Level: L1 looks very much like a CPU issue.
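
As a quick sanity check on that error address, here is the arithmetic for how far into the physical address space it sits. This is only a conversion; how a physical address maps to a particular DIMM depends on the board’s memory map, which this does not show. addr_to_mib is a made-up helper name:

```shell
# Convert a physical address (hex or decimal) to whole MiB.
addr_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Usage:
#   addr_to_mib 0x00000005f1560410   # prints 24341, i.e. about 23.8 GiB
```

Because the MCE itself names the L1 cache in the load/store unit, the address mostly tells you which line was being accessed at the time, not that the RAM at that location is bad.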

It also looks like a real kernel crash:

RIP: 0010:0xffffffff941430e2
Code: Unable to access opcode bytes at 0xffffffff941430b8.

RIP is the instruction pointer, and here it points at a kernel address, so this is a kernel crash.

https://www.amd.com/en/support/downloads/drivers.html/processors/ryzen/ryzen-5000-series/amd-ryzen-7-5700g.html
shows it is Zen 3.

@His_Turdness https://aur.archlinux.org/packages/zenpower3-dkms is this installed?

Edit: Silly me, I was browsing on my laptop when I checked. It has been a long week.

Yes, my main PC has zenpower3-dkms installed.

Arctic is great! Great value for money. I already have a 140mm BeQuiet Silent wings fan, which is really quiet and good quality. I’m going to use it in addition to the Noctua that comes with the cooler.

I have swapped the AIO cooler for the Noctua air cooler. The first night there was a crash, but that was at my old apartment. Since then we have moved to our temporary place (at the in-laws…), where I have a limited setup. The PC is only connected to a loaner display over a single HDMI connection. This display refuses to go to sleep. But at least the PC hasn’t crashed!

Can this be a “solution” to the problem? Maybe the problem was related to one of the connected cables? (I’d suspect it was the DisplayPort cable.) I will need to confirm this. Tonight I’m running the PC headless, with just an HDMI cable plugged into the PC but not into the display. Tomorrow I’ll try with the DisplayPort cable.