Boot stuck when swap on MD RAID

Hello, I am trying the Cassini Nova release in VirtualBox.

During installation I created a swap partition and the system booted fine. Then I created a RAID0 array on two other disks and created swap on it. I updated the resume= option in /etc/default/grub and the swap UUID in /etc/fstab.
I ran grub-mkconfig -o /boot/grub/grub.cfg and dracut --force, and now boot gets stuck after mounting the ZFS root, at "A start job is running for /dev/disk/by-uuid/7b8e9413-c312-44da-9b6e-2291be027343".

Some related info below:

[root@nbpg0603vm ~]# lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sda             8:0    0   15G  0 disk  
├─sda1          8:1    0 1000M  0 part  /boot
├─sda2          8:2    0 1000M  0 part  
├─sda3          8:3    0    8M  0 part  
├─sda4          8:4    0  512M  0 part  /boot/efi
└─sda5          8:5    0  6,8G  0 part  
sdb             8:16   0    5G  0 disk  
├─sdb1          8:17   0    8M  0 part  
├─sdb2          8:18   0  512M  0 part  
├─sdb3          8:19   0 1000M  0 part  
│ └─md126       9:126  0  1,9G  0 raid0 
├─sdb4          8:20   0 1000M  0 part  
│ └─md125       9:125  0  1,9G  0 raid0 [SWAP]
└─sdb5          8:21   0  2,5G  0 part  
  └─md127       9:127  0  5,1G  0 raid0 
    ├─md127p1 259:0    0  5,1G  0 part  
    └─md127p9 259:1    0    8M  0 part  
sdc             8:32   0    5G  0 disk  
├─sdc1          8:33   0    8M  0 part  
├─sdc2          8:34   0  512M  0 part  
├─sdc3          8:35   0 1000M  0 part  
│ └─md126       9:126  0  1,9G  0 raid0 
├─sdc4          8:36   0 1000M  0 part  
│ └─md125       9:125  0  1,9G  0 raid0 [SWAP]
└─sdc5          8:37   0  2,5G  0 part  
  └─md127       9:127  0  5,1G  0 raid0 
    ├─md127p1 259:0    0  5,1G  0 part  
    └─md127p9 259:1    0    8M  0 part  
sr0            11:0    1  1,9G  0 rom   
[root@nbpg0603vm ~]# blkid
/dev/sdb4: UUID="0f80200e-653f-20b2-e47e-b227f55ceadf" UUID_SUB="766c2b76-2e2f-c52d-0280-0a3ffda00ffe" LABEL="nbpg0603vm:swap" TYPE="linux_raid_member" PARTUUID="1872a0d1-8545-496c-abc7-85ebbf85cb33"
/dev/sdb5: UUID="9f07b10d-30a2-2c24-d45e-d535452b0b05" UUID_SUB="970f5efd-0ff8-c2c1-46c5-2ddc8b1ed338" LABEL="nbpg0603vm:root" TYPE="linux_raid_member" PARTUUID="b20a90e0-e643-4991-9cbf-208cd1376f41"
/dev/sdb3: UUID="f3b4478f-7c98-1910-ce42-7a6884952a55" UUID_SUB="ccb0398a-ed96-5e7c-3dbf-a8fa4f838677" LABEL="nbpg0603vm:boot" TYPE="linux_raid_member" PARTUUID="e4b4706a-468d-4527-9c6d-c40e887859c3"
/dev/sr0: BLOCK_SIZE="2048" UUID="2023-05-28-11-02-36-00" LABEL="EOS_202305" TYPE="iso9660" PTUUID="b15bb315" PTTYPE="dos"
/dev/sdc5: UUID="9f07b10d-30a2-2c24-d45e-d535452b0b05" UUID_SUB="4f61d318-d7c8-5659-465a-f60c56708dd8" LABEL="nbpg0603vm:root" TYPE="linux_raid_member" PARTUUID="2ce72cf4-557f-495e-8b24-d23e0f5e65a6"
/dev/sdc3: UUID="f3b4478f-7c98-1910-ce42-7a6884952a55" UUID_SUB="c7a23842-e36e-c30b-8a6b-a903bebb6106" LABEL="nbpg0603vm:boot" TYPE="linux_raid_member" PARTUUID="afb4d604-169d-4f15-8227-f734fc5c6083"
/dev/sdc4: UUID="0f80200e-653f-20b2-e47e-b227f55ceadf" UUID_SUB="44f1b4ac-690d-cc80-ccff-e85ac80eb5ab" LABEL="nbpg0603vm:swap" TYPE="linux_raid_member" PARTUUID="5414f4a6-e0dd-4b61-8494-25e53841e6d2"
/dev/sda4: LABEL_FATBOOT="EFI" LABEL="EFI" UUID="ED76-E911" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="d34f522a-ab79-4dda-b46b-9f90f8403d82"
/dev/sda5: LABEL="zpendeavouros" UUID="8206987191204923832" UUID_SUB="6749455970033933757" BLOCK_SIZE="4096" TYPE="zfs_member" PARTUUID="2587dadd-7d23-4a6e-bc69-420a41ae28f4"
/dev/sda1: LABEL="BOOT" UUID="b8bb1b28-ab6b-4759-bd8c-d7c8d59b62c0" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="c3f54808-945a-45c9-bd6d-404084622e84"
/dev/md127p1: LABEL="zpendeavouros" UUID="8206987191204923832" UUID_SUB="12164798272366758411" BLOCK_SIZE="4096" TYPE="zfs_member" PARTLABEL="zfs-544b578bb8fa0859" PARTUUID="3f09c016-5b26-6543-868a-716d236bad48"
/dev/md125: LABEL="SWAP" UUID="7b8e9413-c312-44da-9b6e-2291be027343" TYPE="swap"
/dev/md127p9: PARTUUID="99f03426-6501-084b-9804-5dfa8764b72a"
/dev/sdb2: PARTUUID="ba28b77c-4082-46d3-abba-29e15ef30d09"
/dev/sdb1: PARTUUID="99e96e33-0f71-49c8-a424-f84193bf2789"
/dev/sdc2: PARTUUID="7fc5249e-0e25-41d0-81e0-c321ea73946d"
/dev/sdc1: PARTUUID="1e4d4a27-afb7-4610-94ca-efce64d1dc1a"
/dev/sda2: PARTUUID="2347de75-62b0-4a43-9f07-66b73710475a"
/dev/sda3: PARTUUID="9930867c-5770-4567-a0ec-47bae247dd00"
[root@nbpg0603vm ~]# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a device; this may
# be used with UUID= as a more robust way to name devices that works even if
# disks are added and removed. See fstab(5).
#
# <file system>             <mount point>  <type>  <options>  <dump>  <pass>
UUID=ED76-E911                            /boot/efi      vfat    defaults,noatime 0 2
UUID=b8bb1b28-ab6b-4759-bd8c-d7c8d59b62c0 /boot          ext4    defaults,noatime 0 2
UUID=7b8e9413-c312-44da-9b6e-2291be027343 swap           swap    defaults   0 0
[root@nbpg0603vm ~]# cat /etc/default/grub 
# GRUB boot loader configuration

GRUB_DEFAULT='0'
GRUB_TIMEOUT='5'
GRUB_DISTRIBUTOR='EndeavourOS'
GRUB_CMDLINE_LINUX_DEFAULT='nowatchdog nvme_load=YES resume=UUID=7b8e9413-c312-44da-9b6e-2291be027343 loglevel=3 root=ZFS=zpendeavouros/ROOT/eos/root'
GRUB_CMDLINE_LINUX=""

# Preload both GPT and MBR modules so that they are not missed
GRUB_PRELOAD_MODULES="part_gpt part_msdos"

# Uncomment to enable booting from LUKS encrypted devices
#GRUB_ENABLE_CRYPTODISK=y

# Set to 'countdown' or 'hidden' to change timeout behavior,
# press ESC key to display menu.
GRUB_TIMEOUT_STYLE=menu

# Uncomment to use basic console
GRUB_TERMINAL_INPUT=console

# Uncomment to disable graphical terminal
#GRUB_TERMINAL_OUTPUT=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `videoinfo'
GRUB_GFXMODE=auto

# Uncomment to allow the kernel use the same resolution used by grub
GRUB_GFXPAYLOAD_LINUX=keep

# Uncomment if you want GRUB to pass to the Linux kernel the old parameter
# format "root=/dev/xxx" instead of "root=/dev/disk/by-uuid/xxx"
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY='true'

# Uncomment and set to the desired menu colors.  Used by normal and wallpaper
# modes only.  Entries specified as foreground/background.
#GRUB_COLOR_NORMAL="light-blue/black"
#GRUB_COLOR_HIGHLIGHT="light-cyan/blue"

# Uncomment one of them for the gfx desired, a image background or a gfxtheme
GRUB_BACKGROUND='/usr/share/endeavouros/splash.png'
#GRUB_THEME="/path/to/gfxtheme"

# Uncomment to get a beep at GRUB start
#GRUB_INIT_TUNE="480 440 1"

# Uncomment to make GRUB remember the last selection. This requires
# setting 'GRUB_DEFAULT=saved' above.
#GRUB_SAVEDEFAULT=true

# Uncomment to disable submenus in boot menu
GRUB_DISABLE_SUBMENU='false'

# Probing for other operating systems is disabled for security reasons. Read
# documentation on GRUB_DISABLE_OS_PROBER, if still want to enable this
# functionality install os-prober and uncomment to detect and include other
# operating systems.
#GRUB_DISABLE_OS_PROBER=false
[root@nbpg0603vm ~]# 

WTF?

When I remove the resume= option it boots fine (the output above is after removing it).

Did you update mdadm.conf with the array details (mdadm -Es)?
It sounds like you used dracut to rebuild the initrd, but the initrd may need mdadm.conf inside it so that it can assemble the md device for the swap; otherwise the UUID would not exist yet.
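
For reference, a minimal sketch of how the ARRAY lines from mdadm -Es could be merged into mdadm.conf without creating duplicates before the initrd is rebuilt. The helper name append_arrays is made up for illustration, and the commands in the trailing comment are meant to be run as root. Some distributions keep the file at a different path, so treat /etc/mdadm.conf as an assumption taken from this thread.

```shell
#!/bin/sh
# append_arrays CONF (hypothetical helper): read lines on stdin and append
# to CONF only the "ARRAY ..." lines that are not already present verbatim.
append_arrays() {
    conf=$1
    touch "$conf"
    while IFS= read -r line; do
        case $line in
            ARRAY\ *)
                # -x: match the whole line, -F: treat it as a fixed string
                grep -qxF -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
                ;;
        esac
    done
}

# Intended use as root (not executed here):
#   mdadm -Es | append_arrays /etc/mdadm.conf
#   dracut --force
```

Running it twice should leave the file unchanged the second time, which makes it safe to repeat after adding arrays.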

Does a dracut module for mdadm need to be loaded, probably before the restore module? I am not very familiar with dracut, just enough to use rd.break=pre-mount, and I imagine a different code path would be needed for restoring a memory image.

IIRC, kernel autodetection of md devices needs partition type fd and md metadata 0.90? Or is 1.x metadata autodetected by the kernel as well?

I have a system with a RAID1 mdadm swap volume, but I don’t use save/resume, so I don’t need early access to the md device to restore a memory image.
I suppose this is of similar complexity to restoring from encrypted swap: the swap device has to be preprocessed before you can get access to the data.

rd.break does not work in this situation.

With rd.shell added, I can see that no array was assembled. The /etc/mdadm.conf file is empty in the running system (except for DEVICE partitions), but all md arrays are assembled anyway, so it shouldn’t be needed. I tried adding

ARRAY /dev/md/swap metadata=1.2 name=nbpg0603vm:swap UUID=0f802.....

anyway, with mdadmconf="yes" in /etc/dracut.conf.d/mdraid.conf, but the result is the same.
In the dracut output there is a line about installing the mdadm module (although mdadm is installed only in the fallback initrd image). The order in which modules are loaded is fixed by the dracut developers, so I doubt there is anything to configure. But AFAIK, the array assembly should be done by the kernel (module) itself, without needing the mdadm utility. I also added rd.md=1, rd.md.auto=1 and rd.md.uuid=..... to the kernel command line, but no success either.

OK, I tried the fallback initrd image with the rd.break kernel option. No arrays were assembled, but running mdadm --assemble --scan got the arrays working. So I tried the fallback image with rd.md=1, rd.md.conf=1 (with /etc/mdadm.conf filled in) and rd.auto=1, and it worked!
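
A small sketch of how those three options could be merged into an existing kernel command line string (for example the value of GRUB_CMDLINE_LINUX_DEFAULT) without duplicating ones that are already there. The function name add_kernel_params is hypothetical, and the starting command line is shortened for illustration. Whether all three options are actually required was not tested in this thread.

```shell
#!/bin/sh
# add_kernel_params CMDLINE PARAM... (hypothetical helper): print CMDLINE
# with each PARAM appended unless the identical word is already present.
add_kernel_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;   # already present, skip it
            *) cmdline="${cmdline:+$cmdline }$param" ;;
        esac
    done
    printf '%s\n' "$cmdline"
}

# Example: rd.md=1 is already set, so only the other two are appended.
add_kernel_params 'nowatchdog loglevel=3 rd.md=1' rd.md=1 rd.md.conf=1 rd.auto=1
```

The result would then be pasted into /etc/default/grub by hand, followed by grub-mkconfig -o /boot/grub/grub.cfg as in the first post.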

Try rebuilding your dracut images with sudo dracut-rebuild now. That will probably add support going forward so you won’t need to use the fallback.

After I ran dracut-rebuild it booted normally.

Having had difficulties with a ZPOOL on MD RAID0, I am now trying a similar setup, but on MD RAID1 and one step further: swap in LUKS on the RAID.
During installation I chose encrypted swap on partition /dev/sda4. There were no problems with swap during boot. Then I created a degraded array on the other disk for swap (/dev/sdb4), destroyed the LUKS container, added /dev/sda4 to the degraded array, created a new LUKS container on the RAID, and created swap in the LUKS container.
I updated the GRUB kernel command line (rd.luks.uuid=.... and resume=....), updated fstab and crypttab (replaced /dev/mapper/luks-<UUID> with /dev/mapper/cryptswap and the new device UUID), added the crypt and mdraid dracut modules, updated mdadm.conf, and ran grub-mkconfig and dracut-rebuild.

Now during boot it halts for a while at

[  OK  ] Finished Wait for udev To Complete Device Initialization...

After that there is

[ TIME ] Timed out waiting for device ##5-9785-4937-917e-11bbadacaa5d.
[DEPEND] Dependency failed for Cryp#5-9785-4937-917e-11bbadacaa5d.
[DEPEND] Dependency failed for Local Encrypted Volumes.

Booting then continues to the ZFS pool import target, and after a while it shows some warnings:

[  142.754800] dracut-initqueue[463]: Warning: /lib/dracut/hooks/initqueue/finished/90-crypt.sh: "[ -e /dev/disk/by-id/dm-uuid-CRYPT-LUKS?-*b9e8270597854937917e11bbadacaa5d*-* ] || exit 1"
[  142.755747] dracut-initqueue[463]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-id\x2fmd-uuid-0f4c4f4d:d9a494d7:1f7e1615:9334f130.sh: "[ -e "/dev/disk/by-id/md-uuid-0f4c4f4d:d9a494d7:1f7e1615:9334f130" ]"
...
...
(the same for other arrays)

followed by

Please enter passphrase for disk luks-b9e82705-9785-4937-917e-11bbadacaa5d: (press TAB for no echo)

I enter the password, boot continues to the ZFS password prompt, and then after the dracut pre-mount hook it halts on

A start job is running for /dev/mapper/cryptswap (.....

If I remove the rd.luks.uuid= and resume= options from the kernel command line, it waits for a while after "Reached target ZFS pool import target", then shows the same warning messages regarding the MD RAIDs, and then the boot continues to

Please enter passphrase for disk cryptswap on swap: (press TAB for no echo)

and then boot finishes fine. Swap is on, but the RAID is degraded (/dev/sda4 missing).
If I add it manually again, the RAID is fine.

It seems to me there is some kind of race condition, and it also tries to open the LUKS container multiple times. So what is the correct configuration for this setup?

Today I cooled down a bit, picked up pieces of broken keyboard and bought a new one, so…

Looking at the code of dracut’s crypt module, I tried renaming my LUKS device from cryptswap to luks-<UUID> (by updating crypttab, fstab and the GRUB kernel command line options), and now mounting swap works fine! The RAID warning messages are still there, though, so I think it might be a similar case, as I named my arrays swap, boot, root1, root2 instead of keeping them anonymous (/dev/md127, /dev/md126, …).

EDIT: To use named crypto devices one has to use rd.luks.name=<UUID>=cryptswap kernel option.
No luck with RAID so far…
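
To illustrate the EDIT above, here is a sketch that prints the kernel options for a named LUKS swap device. The function name luks_resume_params is made up, the UUID is the one from the passphrase prompt earlier in the thread, and pointing resume= at /dev/mapper/<name> is an assumption, since the post does not say what value was used.

```shell
#!/bin/sh
# luks_resume_params UUID NAME (hypothetical helper): print the
# rd.luks.name= option for a named LUKS device plus a matching resume=.
# Assumption: resume= refers to the mapped device /dev/mapper/NAME.
luks_resume_params() {
    uuid=$1
    name=$2
    printf 'rd.luks.name=%s=%s resume=/dev/mapper/%s\n' "$uuid" "$name" "$name"
}

luks_resume_params b9e82705-9785-4937-917e-11bbadacaa5d cryptswap
```

The same name would also have to appear in crypttab and fstab for the rest of the boot to find the device.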

If you are relying on names like md127 being stable, I’m not sure that holds. Certainly without LUKS, my mdNNN device names can switch around on each boot, so in /etc/fstab I’ve been using /dev/disk/by-id/md-name-host:array (with the appropriate host:array, of course). I don’t know how this impacts LUKS, if at all. mdadm -Es reports name=host:array for each device.
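
A rough sketch of how those stable paths could be derived from the mdadm -Es output, assuming each ARRAY line carries a name=host:array field as described. The function name md_by_id_paths is hypothetical, and the resulting paths should be checked against what actually exists under /dev/disk/by-id before they go into fstab.

```shell
#!/bin/sh
# md_by_id_paths (hypothetical helper): read "ARRAY ..." lines on stdin and
# print /dev/disk/by-id/md-name-<host:array> for every name= field found.
md_by_id_paths() {
    while read -r keyword rest; do
        [ "$keyword" = ARRAY ] || continue
        # $rest is intentionally unquoted so it splits into fields
        for field in $rest; do
            case $field in
                name=*) printf '/dev/disk/by-id/md-name-%s\n' "${field#name=}" ;;
            esac
        done
    done
}

# Intended use as root (not executed here):
#   mdadm -Es | md_by_id_paths
```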

In my previous setup I was using RAID without LUKS on top of it, and it worked. It seems to me there is some race condition between the mdraid and crypt dracut modules. I will try to write my own simple mdraid dracut module…
