Updating past kernel 6.12.74-1 gives "Decompression Failed: ZSTD-compressed data is corrupt"

This is a bit of a weird one, I think, as I can’t really find anything related to my exact situation.

First off here are my specs (I’m going to remove the 2060):

I have an old gaming PC that I repurposed into a home server and installed EOS on. Everything was fine for a while, but suddenly, after a kernel update, I started getting boot errors. Please see the screenshot for the messages.

The system would try to boot twice before kicking me to the BIOS. The weird thing is that if I changed nothing in the BIOS and exited without saving, the system would boot. However, if I left the BIOS by saving and rebooting, the system would loop back to the above.

So, after choosing exit without saving in the BIOS and getting the system back up, I would find that a random BTRFS filesystem was corrupt and read-only (I have 3: my home partition, an NVMe drive and a mechanical drive). Generally I’d copy the data off the affected drive, reformat the drive and be free of corruption, until my next reboot, when the cycle would repeat.

I eventually found that the LTS kernel at the time (6.12.74-1) didn’t cause this issue, so I stuck with that kernel until EOS updated the LTS kernel past this point. I then added the working kernel and the relevant packages to IgnorePkg in pacman.conf.
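
For anyone wanting to reproduce the pin, a rough sketch follows. The package names linux-lts and linux-lts-headers are an assumption (the post does not list the exact packages), and ignored_packages is a hypothetical helper that only reads the file back to confirm what is pinned.

```shell
# Assumed pin in the [options] section of /etc/pacman.conf
# (the package names are a guess, adjust to the kernel actually in use):
#   IgnorePkg = linux-lts linux-lts-headers

# ignored_packages (hypothetical helper): print each package named on an
# uncommented IgnorePkg line, one per line. Defaults to /etc/pacman.conf.
ignored_packages() {
    local conf="${1:-/etc/pacman.conf}"
    awk -F= '/^[[:space:]]*IgnorePkg[[:space:]]*=/ {
        n = split($2, pkgs, " ")
        for (i = 1; i <= n; i++) print pkgs[i]
    }' "$conf"
}

# Usage (reads the real config when called without an argument):
#   ignored_packages
```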

Obviously this isn’t ideal, so I’d like to resolve this somehow. I’ve searched for weeks and have come up short, so I’m turning to you lovely lot in the hope that someone has a clue what’s going on or can point me in the general direction of a fix.

Thank you in advance!

Welcome to the community @kangy :waving_hand::smiley: :enos_flag:


(noted your drive names :grin:)

What brand and model power supply (PSU) are you using in that system? To power all of those components (and there’s a lot), you’d need at least a decent 700W PSU.

I’d suggest removing some things. The Nvidia GeForce RTX 2060 would be my pick as a simple thing to remove and test, but you might also try unplugging the power of the non-essential hard drives.

If the issue doesn’t persist while testing that, you might have a power issue.

Thanks!
I have a couple of the drives on my gaming PC with the same naming convention as well :laughing:

I can’t remember the exact model, but it’s a Corsair 750W Gold, I believe.
I have removed the 2060 tonight but I should have mentioned I had this issue when it was just the 2060 in the system and not the 3070 alongside it.

The bit that is throwing me off is that it works okay on the specific kernel version I’m on and doesn’t on any higher version.

I will retest with some of the HDDs and non-essential parts disconnected to cover that base though!

So it doesn’t seem to be a power issue; it happens with only the essentials connected.

You might try running some SMART tests on your drives, to try and rule out a hardware issue:

sudo smartctl -t short /dev/device
sudo smartctl -t long /dev/device

Replace device there with your actual devices, like sda, nvme0n1, etc. To get a list of your devices, use:

lsblk -l
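
To cover every drive in one pass, the commands above could be generated per device. This is only a sketch: smart_test_commands is a hypothetical helper name, and it prints the commands instead of running them so they can be reviewed first.

```shell
# smart_test_commands (hypothetical helper): print one smartctl self-test
# command per device. First argument is the test type (short or long),
# the rest are device names such as sda or nvme0n1.
smart_test_commands() {
    local kind="$1"
    shift
    local dev
    for dev in "$@"; do
        printf 'sudo smartctl -t %s /dev/%s\n' "$kind" "$dev"
    done
}

# Usage: generate short tests for every whole disk lsblk knows about.
# -d skips partitions, -n drops the header, -o picks the columns.
#   smart_test_commands short $(lsblk -dno NAME,TYPE | awk '$2 == "disk" {print $1}')
#
# Self-tests run in the background on the drive; once one has finished,
# the results can be read with:
#   sudo smartctl -a /dev/<device>
```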

Please share more info about the system.

Let’s start with this:

  • content of /etc/fstab
  • full content of the EFI partition
  • output of bootctl status
  • output of bootctl list

Sorry for the delay. I have run the SMART tests and all drives are showing as okay with no errors.

Here’s my fstab:

# <file system>             <mount point>  <type>  <options>  <dump>  <pass>
UUID=37630c35-c279-4c6f-b9cb-da9018ddec06   /home         btrfs   noatime,compress=zstd                  0 0
UUID=6393-F8E2                              /efi          vfat    fmask=0137,dmask=0027                  0 2
UUID=d6b21910-2c5f-457d-9add-6aa357a430da   /             btrfs   subvol=/@,noatime,compress=zstd        0 0
UUID=d6b21910-2c5f-457d-9add-6aa357a430da   /var/cache    btrfs   subvol=/@cache,noatime,compress=zstd   0 0
UUID=d6b21910-2c5f-457d-9add-6aa357a430da   /var/log      btrfs   subvol=/@log,noatime,compress=zstd     0 0
tmpfs                                       /tmp          tmpfs   noatime,mode=1777                      0 0
/dev/sda1                                   /mnt/Gogeta   btrfs   nofail,users                           0 0
/dev/sdb1                                   /mnt/Media    ext4    nofail,users                           0 0
/dev/sdc1                                   /mnt/Vegito   ext4    nofail,users                           0 0


/dev/nvme0n1p1                              /mnt/Broly    btrfs   defaults                               0 0

Is there a command I can run to fetch the EFI partition contents?
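
One possible way to get that listing, assuming the ESP is mounted at /efi as in the fstab above, is a sorted recursive file list. list_tree is a hypothetical helper name used only for illustration.

```shell
# list_tree (hypothetical helper): print every file below a directory,
# sorted, so the output can be pasted into a reply. Defaults to /efi.
list_tree() {
    local root="${1:-/efi}"
    find "$root" -type f | sort
}

# The ESP in the fstab above is mounted with dmask=0027, so a normal user
# cannot read it. Run the equivalent command as root instead:
#   sudo find /efi -type f | sort
```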

Here is my bootctl status:

System:
      Firmware: UEFI 2.70 (American Megatrends 5.14)
 Firmware Arch: x64
   Secure Boot: disabled (setup)
  TPM2 Support: yes
  Measured UKI: no
  Boot into FW: supported

Current Boot Loader:
       Product: systemd-boot 260.1-1-arch
     Features: ✓ Boot counting
               ✓ Menu timeout control
               ✓ One-shot menu timeout control
               ✓ Default entry control
               ✓ One-shot entry control
               ✓ Support for XBOOTLDR partition
               ✓ Support for passing random seed to OS
               ✓ Load drop-in drivers
               ✓ Support Type #1 sort-key field
               ✓ Support @saved pseudo-entry
               ✓ Support Type #1 devicetree field
               ✓ Enroll SecureBoot keys
               ✓ Retain SHIM protocols
               ✓ Menu can be disabled
               ✓ Multi-Profile UKIs are supported
               ✓ Loader reports network boot URL
               ✓ Support Type #1 uki field
               ✓ Support Type #1 uki-url field
               ✓ Loader reports active TPM2 PCR banks
     Partition: /dev/disk/by-partuuid/735e1854-ab74-4336-941f-25cce16b10dd
        Loader: └─/efi//EFI/SYSTEMD/SYSTEMD-BOOTX64.EFI
 Current Entry: 44614bce76004e25b0576dbd5312a371-6.12.74-1-lts.conf
 Default Entry: 44614bce76004e25b0576dbd5312a371-6.12.67-1-lts.conf

Random Seed:
 System Token: set
       Exists: yes

Available Boot Loaders on ESP:
          ESP: /efi (/dev/disk/by-partuuid/735e1854-ab74-4336-941f-25cce16b10dd)
         File: ├─/efi//EFI/systemd/systemd-bootx64.efi (systemd-boot 260.1-1-arch)
               └─/efi//EFI/BOOT/BOOTX64.EFI (systemd-boot 260.1-1-arch)

Boot Loaders Listed in EFI Variables:
        Title: Linux Boot Manager
           ID: 0x0000
       Status: active, boot-order
    Partition: /dev/disk/by-partuuid/735e1854-ab74-4336-941f-25cce16b10dd
         File: └─/efi//EFI/SYSTEMD/SYSTEMD-BOOTX64.EFI

        Title: UEFI OS
           ID: 0x0001
       Status: active, boot-order
    Partition: /dev/disk/by-partuuid/735e1854-ab74-4336-941f-25cce16b10dd
         File: └─/efi//EFI/BOOT/BOOTX64.EFI

Boot Loader Entry Locations:
          ESP: /efi (/dev/disk/by-partuuid/735e1854-ab74-4336-941f-25cce16b10dd, $BOOT)
       config: /efi//loader/loader.conf
        token: endeavouros

Default Boot Loader Entry:
         type: Boot Loader Specification Type #1 (.conf)
        title: EndeavourOS (6.12.74-1-lts)
           id: 44614bce76004e25b0576dbd5312a371-6.12.74-1-lts.conf
       source: /efi//loader/entries/44614bce76004e25b0576dbd5312a371-6.12.74-1-lts.conf (on the EFI System Partition)
     sort-key: endeavouros-6.12.74-1-lts
      version: 6.12.74-1-lts
   machine-id: 44614bce76004e25b0576dbd5312a371
        linux: /efi//44614bce76004e25b0576dbd5312a371/6.12.74-1-lts/linux
       initrd: /efi//44614bce76004e25b0576dbd5312a371/6.12.74-1-lts/initrd
      options: nvme_load=YES nowatchdog rw rootflags=subvol=/@ root=UUID=d6b21910-2c5f-457d-9add-6aa357a430da systemd.machine_id=44614bce76004e25b0576dbd5312a371

Here is my bootctl list:

        type: Boot Loader Specification Type #1 (.conf)
        title: EndeavourOS (6.12.74-1-lts) (default) (selected)
           id: 44614bce76004e25b0576dbd5312a371-6.12.74-1-lts.conf
       source: /efi//loader/entries/44614bce76004e25b0576dbd5312a371-6.12.74-1-lts.conf (on the EFI System Partition)
     sort-key: endeavouros-6.12.74-1-lts
      version: 6.12.74-1-lts
   machine-id: 44614bce76004e25b0576dbd5312a371
        linux: /efi//44614bce76004e25b0576dbd5312a371/6.12.74-1-lts/linux
       initrd: /efi//44614bce76004e25b0576dbd5312a371/6.12.74-1-lts/initrd
      options: nvme_load=YES nowatchdog rw rootflags=subvol=/@ root=UUID=d6b21910-2c5f-457d-9add-6aa357a430da systemd.machine_id=44614bce76004e25b0576dbd5312a371

         type: Boot Loader Specification Type #1 (.conf)
        title: EndeavourOS (6.12.74-1-lts-fallback)
           id: 44614bce76004e25b0576dbd5312a371-6.12.74-1-lts-fallback.conf
       source: /efi//loader/entries/44614bce76004e25b0576dbd5312a371-6.12.74-1-lts-fallback.conf (on the EFI System Partition)
     sort-key: endeavouros-6.12.74-1-lts-fallback
      version: 6.12.74-1-lts-fallback
   machine-id: 44614bce76004e25b0576dbd5312a371
        linux: /efi//44614bce76004e25b0576dbd5312a371/6.12.74-1-lts/linux
       initrd: /efi//44614bce76004e25b0576dbd5312a371/6.12.74-1-lts/initrd-fallback
      options: nvme_load=YES nowatchdog rw rootflags=subvol=/@ root=UUID=d6b21910-2c5f-457d-9add-6aa357a430da systemd.machine_id=44614bce76004e25b0576dbd5312a371

         type: Boot Loader Specification Type #1 (.conf)
        title: EndeavourOS (6.19.11-arch1-1)
           id: 44614bce76004e25b0576dbd5312a371-6.19.11-arch1-1.conf
       source: /efi//loader/entries/44614bce76004e25b0576dbd5312a371-6.19.11-arch1-1.conf (on the EFI System Partition)
     sort-key: endeavouros-6.19.11-arch1-1
      version: 6.19.11-arch1-1
   machine-id: 44614bce76004e25b0576dbd5312a371
        linux: /efi//44614bce76004e25b0576dbd5312a371/6.19.11-arch1-1/linux
       initrd: /efi//44614bce76004e25b0576dbd5312a371/6.19.11-arch1-1/initrd
      options: nvme_load=YES nowatchdog rw rootflags=subvol=/@ root=UUID=d6b21910-2c5f-457d-9add-6aa357a430da systemd.machine_id=44614bce76004e25b0576dbd5312a371

         type: Boot Loader Specification Type #1 (.conf)
        title: EndeavourOS (6.19.11-arch1-1-fallback)
           id: 44614bce76004e25b0576dbd5312a371-6.19.11-arch1-1-fallback.conf
       source: /efi//loader/entries/44614bce76004e25b0576dbd5312a371-6.19.11-arch1-1-fallback.conf (on the EFI System Partition)
     sort-key: endeavouros-6.19.11-arch1-1-fallback
      version: 6.19.11-arch1-1-fallback
   machine-id: 44614bce76004e25b0576dbd5312a371
        linux: /efi//44614bce76004e25b0576dbd5312a371/6.19.11-arch1-1/linux
       initrd: /efi//44614bce76004e25b0576dbd5312a371/6.19.11-arch1-1/initrd-fallback
      options: nvme_load=YES nowatchdog rw rootflags=subvol=/@ root=UUID=d6b21910-2c5f-457d-9add-6aa357a430da systemd.machine_id=44614bce76004e25b0576dbd5312a371

         type: Automatic
        title: Reboot Into Firmware Interface
           id: auto-reboot-to-firmware-setup
       source: /sys/firmware/efi/efivars/LoaderEntries-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f (on the EFI System Partition)

This is weird. From my point of view the 6.12.67 entry should not be there. Your default should be 6.12.74.

Your bootctl list shows this:

Quick question: Do you have “fast boot” enabled in the BIOS? If yes, please disable it.

And what is the output of efibootmgr -v ?

This is weird. From my point of view the 6.12.67 entry should not be there. Your default should be 6.12.74.

I’ll be honest, I don’t know why it’s there either.

efibootmgr -v shows this:

BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0001
Boot0000* Linux Boot Manager    HD(2,GPT,735e1854-ab74-4336-941f-25cce16b10dd,0x800,0x3e8000)/\EFI\SYSTEMD\SYSTEMD-BOOTX64.EFI
      dp: 04 01 2a 00 02 00 00 00 00 08 00 00 00 00 00 00 00 80 3e 00 00 00 00 00 54 18 5e 73 74 ab 36 43 94 1f 25 cc e1 6b 10 dd 02 02 / 04 04 46 00 5c 00 45 00 46 00 49 00 5c 00 53 00 59 00 53 00 54 00 45 00 4d 00 44 00 5c 00 53 00 59 00 53 00 54 00 45 00 4d 00 44 00 2d 00 42 00 4f 00 4f 00 54 00 58 00 36 00 34 00 2e 00 45 00 46 00 49 00 00 00 / 7f ff 04 00
Boot0001* UEFI OS       HD(2,GPT,735e1854-ab74-4336-941f-25cce16b10dd,0x800,0x3e8000)/\EFI\BOOT\BOOTX64.EFI0000424f
      dp: 04 01 2a 00 02 00 00 00 00 08 00 00 00 00 00 00 00 80 3e 00 00 00 00 00 54 18 5e 73 74 ab 36 43 94 1f 25 cc e1 6b 10 dd 02 02 / 04 04 30 00 5c 00 45 00 46 00 49 00 5c 00 42 00 4f 00 4f 00 54 00 5c 00 42 00 4f 00 4f 00 54 00 58 00 36 00 34 00 2e 00 45 00 46 00 49 00 00 00 / 7f ff 04 00
    data: 00 00 42 4f

Rebooting into the BIOS now to check the fast boot status.

Edit: Fast boot seems to be disabled.

The initial post is also showing an old kernel version.

On some systems this UEFI timeout is too short. You may want to try something longer, e.g. 5 s.

sudo efibootmgr --timeout 5
sudo bootctl update

And then reboot. See if that helps.

It’s a picture I took back when I initially tried to diagnose this issue. I’ve tried it over a couple of kernel versions with the same results.

I’ll give that a shot and report back. Thank you!

Something else that comes to mind is using uncompressed initramfs images. In /etc/dracut.conf.d/eos-defaults.conf, set:

compress="cat"

and then run:

sudo reinstall-kernels

Thank you for your response, but I just narrowed it down thanks to some unrelated issues.

It was indeed a bad RAM stick. I’ve removed the offending stick and everything is working as expected!

Thank you to everyone who chipped in to help me :slight_smile: