Booting from raid1 fails

Hi all,

This is the first time I have tried to set up an Arch (Antergos-based) system residing entirely on RAID1 devices.

I have a complete image of a running system which I have customized heavily, and this is my “base”. I have set up many systems on standard filesystems using these steps:

  1. partitioning the disk into 3 partitions (ext4/ext4/swap), used as /, /home and swap
  2. copying everything from the image to /
  3. chrooting into the “new system”
  4. building a new initramfs using mkinitcpio -p linux
  5. generating the GRUB entries using grub-mkconfig -o /boot/grub/grub.cfg
  6. installing GRUB to the MBR using grub-install /dev/sda

This has worked many times: every system I have “cloned” this way runs.
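
For reference, steps 2 to 6 could be sketched roughly as below. The device names, the /mnt mount point, the image directory and the `run` and `clone_system` helpers are assumptions for illustration, not taken from the actual setup. By default the script only prints the commands instead of running them.

```sh
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# clone_system DISK IMAGE_DIR TARGET  (all three are hypothetical names)
clone_system() {
    disk=$1; image=$2; target=$3

    # Partitions are assumed to exist and be formatted already (step 1).
    run mount "${disk}1" "$target"
    run mkdir -p "$target/home"
    run mount "${disk}2" "$target/home"

    # Step 2: copy the image, preserving owners, modes and links.
    run cp -a "$image/." "$target/"

    # Step 3: make the kernel interfaces visible inside the chroot.
    run mount -t proc proc "$target/proc"
    run mount --rbind /sys "$target/sys"
    run mount --rbind /dev "$target/dev"

    # Steps 4 to 6, each run inside the new system.
    run chroot "$target" mkinitcpio -p linux
    run chroot "$target" grub-mkconfig -o /boot/grub/grub.cfg
    run chroot "$target" grub-install "$disk"
}
```

Calling `clone_system /dev/sda /srv/image /mnt` prints the command list, so it can be reviewed before anything touches a disk.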

Now I wanted to do the same with RAID1 for the first time. I have done this many times with Debian systems and it worked flawlessly, but Arch has different mechanisms… This is what I have done:

  1. partitioning two disks with 3 partitions each, as in step 1 above, but with partition type 0xfd
  2. assembling 3 RAID1 arrays from them
  3. formatting the arrays as in step 1 above

…and then continuing as above from step 2. Every step runs without any warning or error. But when I try to boot the new system, it gets stuck at “Loading initial ramdisk…” with no option other than pressing reset :frowning:
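
The RAID-specific steps 2 and 3 could look roughly like the sketch below. The disks /dev/sda and /dev/sdb, the md0/md1/md2 numbering and the `run` and `make_raid1` helpers are assumptions; the metadata version is left at the mdadm default, which the post does not specify. Commands are only printed unless DRY_RUN=0 is set.

```sh
#!/bin/sh
# Dry run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# make_raid1 DISK_A DISK_B  (hypothetical helper; disk names are assumptions)
make_raid1() {
    a=$1; b=$2
    n=0
    for part in 1 2 3; do
        # One mirror per partition pair: md0 = /, md1 = /home, md2 = swap.
        run mdadm --create "/dev/md$n" --level=1 --raid-devices=2 \
            "${a}${part}" "${b}${part}"
        n=$((n + 1))
    done

    # Same filesystems as on a single disk, but on the arrays.
    run mkfs.ext4 /dev/md0
    run mkfs.ext4 /dev/md1
    run mkswap /dev/md2
}
```

`make_raid1 /dev/sda /dev/sdb` lists the three `mdadm --create` calls followed by the formatting commands.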

I can chroot into the system and it works perfectly.
What I have tried:

  • added the mdadm_udev hook (after first running mdadm --examine --scan > /etc/mdadm.conf)
  • added the raid1 module
  • changed the GRUB root from UUID to device name (/dev/md0)
  • removed the microcode files from the GRUB command

I tried these in all combinations. Of course I rebuilt the initramfs and regenerated grub.cfg whenever a change required it. But the result is always the same: stuck at the line above.
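
A small check for the mdadm_udev attempt, as a sketch only: it assumes /etc/mkinitcpio.conf uses the `HOOKS=(...)` array form and that mdadm_udev should be listed before filesystems, which is the ordering used in the ArchWiki RAID article. The `hooks_order_ok` helper is hypothetical.

```sh
#!/bin/sh
# hooks_order_ok FILE
# Succeeds if the HOOKS line in FILE lists mdadm_udev before filesystems.
# Assumes the "HOOKS=(a b c)" form on a single line.
hooks_order_ok() {
    hooks=$(sed -n 's/^HOOKS=(\(.*\))/\1/p' "$1")
    seen_mdadm=0
    for h in $hooks; do
        case $h in
            mdadm_udev) seen_mdadm=1 ;;
            filesystems)
                # filesystems reached: fine only if mdadm_udev came first
                [ "$seen_mdadm" = 1 ] && return 0
                return 1 ;;
        esac
    done
    return 1
}
```

Run inside the chroot as `hooks_order_ok /etc/mkinitcpio.conf && echo ok`; after any change to the file the initramfs has to be rebuilt with mkinitcpio -p linux.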

I have googled a lot, but none of the answers helped. It looks like GRUB finds the kernel image and loads it (how can I debug this?), because the HDD LED flickers for 1-2 seconds after GRUB starts the boot process, but loading the initramfs fails (I do not know why). I know this issue is not specific to Antergos or Endeavour but comes from Arch. I must have forgotten one important step (or done it wrong)… Any help is appreciated…

Best regards
df8oe

Hi!
Since no one has answered your post yet: all the basic commands you typed seem right to me. I can’t say anything about the RAID configuration though, because I have never used it.

I recommend checking the ArchWiki: https://wiki.archlinux.org/index.php/RAID

And let’s see if someone else can shed some light on this.

Or maybe just install onto a single drive and use Raider to convert it to RAID 1 later?

I know that Raider hasn’t been updated in YEARS but I played around with it on a test Mint 19.1 system recently and it worked perfectly.

Thanks for your replies. I have never heard of “Raider” and will take a look at it. Sounds interesting…

I spent three hours investigating this myself. There were TWO independent issues.

First, I had to add systemd.debug-shell=1 to the kernel parameter line in GRUB to see that there was a kernel panic milliseconds after the kernel was loaded. It was caused by a nasty BIOS bug and can be worked around by adding “noapic” to the kernel boot parameters.

Second, with that fixed, the system loaded the initramfs but dropped to the emergency shell after a couple of seconds. And this is the strange part: I had always used the EndeavourOS DVD as the live system for chrooting into the new system (this is not Endeavour-specific; Antergos and Arch live systems behave the same). But the RAID devices shown there are not the ones I see in the emergency console! If I chroot from a Debian-based live system instead, the devices match. I have been working with Linux for 20 years, but I have never had such a strange issue… After using Debian as the base for chrooting, everything works now. I had not done anything wrong. The key was adding the debug parameter to grub.cfg…
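
To keep the two kernel parameters across grub-mkconfig runs, they could go into /etc/default/grub. This is a minimal sketch under assumptions: the post only says the parameters were added to the GRUB kernel line, the `add_kernel_params` helper is hypothetical, and it expects a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line and GNU sed.

```sh
#!/bin/sh
# add_kernel_params FILE PARAM...
# Appends each PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE unless it is
# already present. Parameters must not contain "/" or "&".
add_kernel_params() {
    file=$1
    shift
    for p in "$@"; do
        # Present already if surrounded by a quote or a space on both sides.
        if ! grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${p}[\" ]" "$file"; then
            sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${p}\"/" "$file"
        fi
    done
}
```

For example, `add_kernel_params /etc/default/grub noapic systemd.debug-shell=1`, followed by grub-mkconfig -o /boot/grub/grub.cfg so that the change ends up in grub.cfg.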

Best regards
df8oe

Maybe we need to add some package to the ISO to detect RAID partitions? I hope to check that soon.
For now we’re installing base and base-devel from Arch Linux, but some other package may be needed for a RAID install.

Hi,

This is VERY CURIOUS. It occurs with all the Arch-based systems I own: “plain” Arch, Antergos and Endeavour, so I do not think any packages are missing. All Arch distros show my arrays as md125 / md126 / md127, and all Debian distros show them as md0 / md1 / md2, which is the correct assignment. To my astonishment, Arch accepts them only as md0 / md1 / md2 at boot. Confusing. I do not have any explanation. I can state 100%: not an EndeavourOS problem.

There is a section called “Mounting from a Live CD” in the same article I linked before:

If your RAID 1 that is missing a disk array was wrongly auto-detected as RAID 1

Hope that helps, or try reading the raid1 install article again.

I just confirmed that the common RAID packages are installed properly in the EndeavourOS ISO.

I have been installing RAID systems in exactly this way for years and they all run flawlessly, like the newly installed Arch RAID system does now. But there is still a difference I cannot explain:

If I take a Debian live system and run blkid on the HDDs, I get the same result as when I boot the RAID system (or chroot into it). If I do the same from any Arch-based live system, I get different results. I can mount the arrays (which are assembled correctly anyway) only under these “wrong” device names. The UUIDs are displayed correctly, so if I had used UUIDs instead of device names, everything would have worked fine. Again: it is not a problem of EndeavourOS, do not worry. I think I need more time to investigate the cause. And there are two simple solutions: 1) use a Debian-based system for chrooting and for setting up the initramfs, the GRUB config and GRUB itself, or 2) use UUIDs instead of device names.
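
Solution 2 could be sketched like this for /etc/fstab. The `fstab_entry` helper and the mount options are assumptions for illustration; the UUID itself would come from blkid on whatever name the live system gave the array. A possible reason for the differing names, not confirmed in this thread, is that mdadm picks numbers like md127 for arrays it assembles without a matching entry in the local mdadm.conf.

```sh
#!/bin/sh
# fstab_entry UUID MOUNTPOINT FSTYPE
# Prints one fstab line that refers to the filesystem by UUID, so it does
# not matter whether the array shows up as md0 or md127.
fstab_entry() {
    pass=2                      # fsck order for non-root filesystems
    if [ "$2" = / ]; then
        pass=1                  # root filesystem is checked first
    fi
    if [ "$3" = swap ]; then
        pass=0                  # swap is never checked
    fi
    printf 'UUID=%s %s %s defaults 0 %s\n' "$1" "$2" "$3" "$pass"
}
```

Inside the chroot this could be used as `fstab_entry "$(blkid -s UUID -o value /dev/md127)" / ext4 >> /etc/fstab`, with /dev/md127 replaced by the name the array has in the running live system. For GRUB, grub-mkconfig writes root=UUID=… by default as long as GRUB_DISABLE_LINUX_UUID is not set to true in /etc/default/grub.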
