Hardware problems with my main PC

As per the thread title, I’m having problems, most likely hardware-related, on my main PC.

The first thing is that there’s no video signal when I boot up. I’ve swapped out various cables (DisplayPort, HDMI, DVI), but the problem still occurs.

After rebooting several times, the video signal eventually appears. Sometimes, when I turn the PC on, it shuts down immediately.

But I’ve had other problems: both Garuda and EndeavourOS triggered a kernel panic (the code I got when scanning the EOS kernel panic QR code was 13023317).

Furthermore, when I tried to run the memtest from the EndeavourOS flash drive, the PC rebooted after a few seconds, and then I had no video signal again.

All the hardware is “recent” (about 3 years old), except for an NVMe drive, a 4TB SSD, and a 6TB mechanical HDD carried over from my old PC, which are a few years older.

My configuration is this:

  • CPU: Ryzen 7 5700X
  • Motherboard: Asus Prime X570-P (AM4), BIOS/firmware version 5013
  • RAM: Corsair Vengeance RGB RS 64GB (2 x 32GB), DDR4 3600MHz
  • Graphics Card: SAPPHIRE RADEON RX6750XT GAMING OC NITRO+ 12GB GDDR6
  • Power Supply: Corsair RMX1000 1000W Modular ATX
  • Drives: 2 x 2TB NVMe, 3 x 2TB SATA3 SSDs, 1 x 4TB SSD, 2 x mechanical hard drives (one 6TB and one 8TB)

What do you think I could do? What could it be?

Are there any tests I could run from a live flash drive (like a command to “test” the power supply lines or something)?
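
On testing the power lines from a live flash drive: software can only read what the motherboard's sensor chip reports, so this is a rough sanity check, not a replacement for a multimeter or a PSU tester. A minimal Python sketch, assuming a Linux live session where the kernel exposes voltage inputs under /sys/class/hwmon (the function name read_voltages is made up for illustration, and on some boards the raw values need board-specific scaling before they match the real rails):

```python
from pathlib import Path


def read_voltages(root="/sys/class/hwmon"):
    """Return (chip, label, volts) for every voltage input found under root."""
    readings = []
    base = Path(root)
    if not base.is_dir():
        return readings
    for chip_dir in sorted(base.glob("hwmon*")):
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.is_file() else chip_dir.name
        for value_file in sorted(chip_dir.glob("in*_input")):
            channel = value_file.name[: -len("_input")]  # e.g. "in0"
            label_file = chip_dir / f"{channel}_label"
            label = label_file.read_text().strip() if label_file.is_file() else channel
            try:
                millivolts = int(value_file.read_text().strip())
            except (OSError, ValueError):
                continue  # skip inputs that cannot be read
            # hwmon voltage inputs are reported in millivolts
            readings.append((chip, label, millivolts / 1000.0))
    return readings


if __name__ == "__main__":
    for chip, label, volts in read_voltages():
        print(f"{chip:12s} {label:12s} {volts:6.3f} V")
```

If nothing is printed, the sensor driver for the board is probably not loaded. Readings that jump around a lot between runs would be a hint, not proof, that the PSU deserves a closer look.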

When it comes to computers and hardware, I generally “make do,” but I’m not a technician.

I don’t tinker with software or hardware in any depth.

Apart from installing the OS and fixing it myself in case of problems (most of the time), or assembling the PC myself, the rest is beyond my capabilities.

A kernel panic sounds like an unhealthy SSD or HDD (to me). The only S.M.A.R.T. health checker (it tests only drives) I personally trust is hdsentinel.

The video issue sounds like something else altogether, but maybe the power supply?

Separate problems with the SSD, HDD, mobo, power supply, and the Sapphire all at once seem statistically impossible.

1-2 components ruining the experience seems plausible.

I’ve had a lot of dying HDDs and a lot of this fits the bill…but not the video.

That’s a very vague list of spitballing as I consider a life of hardware woes…but the only way to succeed in this position is to eliminate everything you can. I mean, I’ve yanked mobo batteries and jumped reset contacts with screwdrivers to rule things out. You’ve got your work cut out for you.

An SSD/HDD health check would come first if this were me, but I’m not the sharpest tool in the shed.
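
For a drive health check from a live session, one alternative to hdsentinel is smartctl from smartmontools, which can emit JSON (version 7.0 or newer). A minimal Python sketch, assuming smartctl is installed and the script runs as root; the helper names summarize_smart and check_drive are made up for illustration, and only a handful of fields are pulled out:

```python
import json
import subprocess


def summarize_smart(report):
    """Pick a few health fields out of a parsed `smartctl --json` report."""
    nvme_log = report.get("nvme_smart_health_information_log", {})
    return {
        "model": report.get("model_name", "unknown"),
        "passed": report.get("smart_status", {}).get("passed"),
        "power_on_hours": report.get("power_on_time", {}).get("hours"),
        # The two fields below only exist for NVMe drives.
        "nvme_percentage_used": nvme_log.get("percentage_used"),
        "nvme_media_errors": nvme_log.get("media_errors"),
    }


def check_drive(device):
    """Run smartctl on one device (e.g. "/dev/sda") and summarize the result."""
    # smartctl uses its exit status as a bitmask, so a non-zero code
    # is not treated as a failure of the command here.
    result = subprocess.run(
        ["smartctl", "--json", "-i", "-H", "-A", device],
        capture_output=True,
        text=True,
    )
    return summarize_smart(json.loads(result.stdout))
```

Calling check_drive("/dev/sda") for each drive in turn gives a quick overview. A "passed" value of False, or a non-zero media error count, would point at that drive; a clean report does not fully clear it.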

There are 4 newer BIOS updates for this board, just F.Y.I., so you may want to update once you get it working properly. Maybe replace the CMOS battery. Also try re-seating the memory and GPU. Then try booting into the BIOS settings, set them to default, and see if it makes any difference. I assume you have EOS installed on one of the NVMe drives. You’ll need to test the memory at some point somehow. Hopefully you can get it to boot on the live ISO for that afterwards. You could also arch-chroot and make sure it’s updated if it boots on the live ISO.

I would check the CPU.

This is a hardware failure.

The HDDs are “storage” disks: the 6TB one, which comes from the old PC (there are files and data saved there), is formatted as NTFS, and the 8TB one is formatted as ext4. In short, they don’t have operating systems installed. They’re only used to store data (files, backups, and so on).

Yes, of course, I also have EOS installed, specifically on one of the two NVMe drives (I can’t remember which, the older one or the newer one).

I already checked the HDDs, NVMe drives, and SSDs some time ago; no errors were found, and the power-on hours were fine even for the older ones. None of them were “end of life.”

Then I’ll unplug everything I can. Maybe I’ll even start the PC with only one stick of RAM at a time, moving it between slots, to see if that’s the problem or not.

Changing the CMOS battery, re-seating the RAM modules and the GPU, unplugging everything, and even starting with one stick of RAM is a good place to start. CPUs rarely go bad unless they are damaged during installation or destroyed by incorrect power settings from overclocking. I have only ever had one bad CPU right out of the box, and it was an Intel decades ago. It was faulty and tests proved it, so they replaced it. Checking all your connections as you go should help.

I agree. The first place to start is to remove variables. Take it right back to the basics and see if the issue persists. At least then you can either rule in or rule out a bunch of components.

My suspicion is the PSU, or possibly bad or mis-configured RAM.

If it’s the PSU, the issues may no longer present after unplugging a bunch of things, so keep that in mind. It doesn’t necessarily mean those unplugged things were the issue.

@Sermor
Have you made any progress figuring out the issues you’ve been having with your hardware?

Not at the moment. I haven’t had much time to test, partly because I don’t have a suitable testing station right now (I have to unplug everything, open the PC, and then put it back in its place every time; in short, I don’t have another elevated station, at least not yet).

I’ll test more thoroughly this weekend; at least I’ll have more time to run all the necessary tests.

So, I did this:

I changed the CMOS battery and also reset it, but I didn’t touch anything else. Then I tried rebooting.

The first time, the video signal was still missing. I unplugged the PC for a while and then tried booting it, and after that, the video signal came back.

I went into the BIOS and reset everything, even the RAM settings with DOCP. When I rebooted, it booted normally, and I launched Arch, where I ran some tests.

Specifically, I ran the internal benchmark in Cyberpunk 2077, and then ran some LLMs in LM Studio, and everything worked fine.

After that, I rebooted three times in a row (booting to the desktop, then rebooting), and then I tried booting after leaving the PC unplugged for about 15 minutes, and everything worked fine.

I also ran the memtest with the RAM set to DOCP, and after about an hour of testing, it reported a pass with zero errors.

The only negative is that the NVMe where EndeavourOS is installed (a Samsung 980 EVO NVMe) is no longer detected, either by the BIOS or by gparted. Arch is also installed on an NVMe, a Western Digital Black, but that one works fine.

Suggestions for checking if the Samsung NVMe is actually dead? Any other tests I could run on my hardware to see if the problem is resolved (aside from the NVMe)?
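
One quick check from a live session is whether the kernel enumerated the drive at all, independently of gparted. A minimal Python sketch, assuming a Linux system that lists NVMe controllers under /sys/class/nvme (the function name list_nvme_controllers is made up for illustration):

```python
from pathlib import Path


def _read_attr(directory, name):
    """Return the stripped contents of a sysfs attribute, or "" if missing."""
    attr = directory / name
    try:
        return attr.read_text().strip() if attr.is_file() else ""
    except OSError:
        return ""


def list_nvme_controllers(root="/sys/class/nvme"):
    """Return one dict per NVMe controller the kernel currently knows about."""
    base = Path(root)
    if not base.is_dir():
        return []  # no NVMe controllers enumerated, or not a Linux system
    controllers = []
    for ctrl in sorted(base.glob("nvme*")):
        controllers.append(
            {
                "name": ctrl.name,
                "model": _read_attr(ctrl, "model"),
                "serial": _read_attr(ctrl, "serial"),
                "state": _read_attr(ctrl, "state"),
            }
        )
    return controllers


if __name__ == "__main__":
    found = list_nvme_controllers()
    if not found:
        print("No NVMe controllers detected")
    for ctrl in found:
        print(f"{ctrl['name']}: {ctrl['model']} (serial {ctrl['serial']}, state {ctrl['state']})")
```

If only the Western Digital shows up here as well, the Samsung is not being seen on the bus at all, and a SMART test cannot reach it. Re-seating it, trying the other M.2 slot, or testing it in another machine would be the next steps.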

So you aren’t able to boot from the NVMe drive that EOS is installed on? So I assume you did your tests with the NVMe that Arch is installed on? Are you using GRUB or systemd-boot? Is it just missing from the boot menu, or can’t you access that drive at all? So gparted doesn’t even see it? I myself use Western Digital Black.

Edit: So you didn’t even have to re-seat the ram or gpu or unplug anything else to get it working? Maybe the Samsung drive was the issue? If the system doesn’t see the drive you can’t really test it. I guess you could remove it and then try putting it back in and see if it gets recognized?

That could be the problem you started with. Or did you overclock your RAM? That could be the problem as well.

Either reseat it or test it in another PC to be sure. Or in an external enclosure.

These are good suggestions if you have the ability to do them. I don’t have any experience with the Samsung NVMe drives. Not sure if the OP has the 970 or 980, and/or the Pro version? There are also newer 990 versions. What I can tell you is that I have Western Digital Black drives, and they have been installed on, wiped, partitioned, erased, and re-partitioned till the cows come home. I’ve beaten the hell out of them with hard power-offs etc. and I’ve never had any issues with them.

I think the problem could be instability in the power lines during POST. In fact, both at startup after the CMOS reset and now, after several boot attempts with no video signal, I got the message that POST had failed.

I think I’ll try replacing the power supply first; it’s probably a worn component issue. I hope so, at least. Power supplies are still available at decent prices, unlike RAM.

Anyway, yes, EOS is on a Samsung NVMe, which if I remember correctly is a 980 Evo. Now it’s no longer detected by the BIOS or gparted.

Garuda is also on a SATA 3 SSD, which is a Samsung (an 870 or something similar) with a 4TB capacity. I had a kernel panic on that one too.

Since I have other SSDs from other brands, in addition to the other NVMe (Western Digital Black) that works, when I replace the power supply, I’ll remove the Samsung NVMe and also the Samsung SATA 3 SSD (just to be on the safe side).

The Samsung ones are excellent in terms of performance, but perhaps they’re a bit too “sensitive.”

Perhaps I’ll test them out on the old secondary PC I’ve rebuilt. That way I’ll see if the NVMe is truly dead.
