Booting From Any Disk in a RAID Array

From Zanecorpwiki


Just because you set up RAID (at any level above RAID0) doesn't mean you can survive the loss of any disk in the system. The main problem is boot-ability.

The problem is that (with software RAID) the boot loader gets started from the MBR (master boot record) of a disk. At this point in the boot process, the RAID array doesn't exist yet, so you can't boot from the array. What we want to do is set things up so that we can boot from any of the disks and then assemble the arrays from whatever's available after the initial boot. This way we can survive the loss of (at least) one disk in the array, still reboot, and run the (degraded) array.

You can modify the instructions as necessary, but they assume a layout similar to the one below and were tested on the openSUSE distro. These instructions do not require a RAID setup, but they are appropriate for one, especially software RAID.

  1. arrange the bootable disks[notes 1] so that they are the first disks in the series; this will allow the server to survive physical removal of drives or mechanical faults that mimic such
  2. during the partitioning of the drives, create a 500MB or so[notes 2] partition on the first disk
  3. format it and set the mount point to /boot[notes 3]
  4. create a partition of equal size on each of the other disks to be made bootable


The Boot Partitions

Grub knows how to start up and read regular partitions with (most) regular file systems. It then reads the kernel image from the /boot directory and uses that to boot the rest of the system, including starting the RAID arrays. Because Grub runs before any array has been assembled, it's problematic to have /boot on a RAID array.

You'll also want to create a first partition on each of the other disks (sdb/sdb1 and sdc/sdc1 in this example) which should be the same size as the sda1 partition. Leave these partitions unformatted and unmounted. We'll use them later on to propagate the boot data.

After installation, it's necessary to manually install Grub to the MBR of each device from the grub shell:

# grub
grub > device (hd0) /dev/sda # sets up the device mapping
grub > root (hd0,0) # configures the location of 'root', which is really '/boot'
grub > setup (hd0) # installs MBR with the preceding config
grub > device (hd1) /dev/sdb
grub > root (hd1,0)
grub > setup (hd1)
grub > device (hd2) /dev/sdc
grub > root (hd2,0)
grub > setup (hd2)

And so on for all boot disks. The first three commands (those for hd0) might not actually be necessary, as the SuSE installer will already have set up the MBR on sda, but that's the way I did it, so it's what's tested.[notes 4]
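
For more than a couple of disks, typing the same three commands per disk gets error-prone. The following is a minimal sketch, not part of the tested procedure: a hypothetical helper, grub_setup_commands, that only prints the command sequence shown above for whatever disks you pass it, using the same hd0, hd1, hd2 mapping. Piping the output into the grub shell assumes your GRUB (legacy) build accepts the --batch option.

```shell
#!/bin/sh
# grub_setup_commands is a hypothetical helper name, not a GRUB tool.
# It prints the GRUB (legacy) shell commands used above for each disk,
# mapping the first disk to (hd0), the second to (hd1), and so on.
# It only prints text; nothing is written to any disk.
grub_setup_commands() {
    n=0
    for disk in "$@"; do
        printf 'device (hd%d) %s\n' "$n" "$disk"   # device mapping
        printf 'root (hd%d,0)\n' "$n"              # first partition holds /boot
        printf 'setup (hd%d)\n' "$n"               # install to that disk's MBR
        n=$((n + 1))
    done
    printf 'quit\n'
}

# Review the generated commands first:
#   grub_setup_commands /dev/sda /dev/sdb /dev/sdc
# Then, as root, feed them to the grub shell (assumes --batch is supported):
#   grub_setup_commands /dev/sda /dev/sdb /dev/sdc | grub --batch
```

Printing the commands instead of running them directly keeps the helper harmless until you have checked that the device names match your layout.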

Update the Mount By

If, at this point, you were to remove the first disk and attempt a boot, you'd have a problem. It would start the boot process fine, but then attempt to mount /boot from the first device, which is, by default, identified by device ID. The simple work-around is to change the mount-by directive in /etc/fstab from device ID to the standard drive mapping. This causes the first drive found to be treated as /dev/sda, and /boot is then mounted from /dev/sda1. Since all the boot partitions are identical copies, you're good to go.
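
As a rough illustration of the simple work-around, the sketch below rewrites only the source field of the /boot entry in an fstab read from standard input. The function name boot_by_device_name and the by-id entry in the usage comment are made up for the example; the function writes to standard output, so the real /etc/fstab is untouched until you copy the result into place yourself.

```shell
#!/bin/sh
# boot_by_device_name is a hypothetical helper name.
# Reads an fstab on stdin and prints it with the /boot entry's first field
# replaced by the device name given as $1 (the text assumes /dev/sda1).
# Note: the rewritten line is re-joined with single spaces by awk.
boot_by_device_name() {
    awk -v dev="$1" '
        /^[ \t]*#/    { print; next }   # leave comment lines untouched
        $2 == "/boot" { $1 = dev }      # change only the source of /boot
                      { print }
    '
}

# Example (the by-id name is an invented placeholder):
#   before: /dev/disk/by-id/ata-EXAMPLE-part1 /boot ext3 defaults 1 2
#   after:  /dev/sda1 /boot ext3 defaults 1 2
# Usage:
#   boot_by_device_name /dev/sda1 < /etc/fstab > /tmp/fstab.new
#   # inspect /tmp/fstab.new, then copy it over /etc/fstab as root
```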

A potentially better, if slightly more complicated, solution involves keeping the 'mount by ID' scheme, but moving the fstab to /boot and then linking to it from /etc. Each boot partition will be a carbon copy, except that the ID for /boot will be modified to point to the drive we're booting from. This will avoid complaints from openSUSE when trying to upgrade (which really wants mounting done by device ID) and should be more robust. It will also allow other drives to appear before the boot drives without incident.

TODO: develop script for this and include/link in the section below.

Propagate the Boot Data

dd if=/dev/sda1 of=/dev/sdb1
dd if=/dev/sda1 of=/dev/sdc1
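
The two dd commands can be wrapped in a loop that also checks each copy. This is a sketch under assumptions rather than a tested part of the procedure: propagate_boot is a hypothetical helper name, the target partitions are assumed to be exactly the same size as the source (as described earlier), and /boot is assumed not to be written to while the copy runs.

```shell
#!/bin/sh
# propagate_boot is a hypothetical helper name.
# Copies the source given as $1 (e.g. /dev/sda1) onto every remaining
# argument with dd, then verifies each copy byte-for-byte with cmp.
# Assumes every target is the same size as the source.
propagate_boot() {
    src=$1
    shift
    for dst in "$@"; do
        # dd reports its statistics on stderr; discard them here
        dd if="$src" of="$dst" bs=1048576 2>/dev/null || return 1
        sync    # flush buffered writes before verifying
        if cmp -s "$src" "$dst"; then
            echo "ok: $dst"
        else
            echo "verify failed: $dst" >&2
            return 1
        fi
    done
}

# Usage (as root, destructive for the targets):
#   propagate_boot /dev/sda1 /dev/sdb1 /dev/sdc1
```

Remember to rerun the copy whenever the contents of /boot change, for example after a kernel update, or the other disks will boot stale data.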

Notes

  1. Of course not all disks need be used but depending on the RAID level and layout, you may need more than two to ensure that boot-ability isn't the weak link in your server setup.
  2. You can get by with a smaller boot partition, and 100MB may be perfectly fine, but if you end up with multiple kernel images, etc., then you'll appreciate the space later on. 500MB seems reasonable given current disk sizes vis-à-vis kernel sizes.
  3. Ext4 is my current preferred filesystem, though on some distros (as of August 2010) ext4 is not fully supported by the bootloader GRUB so ext3 may be the safer choice. Ext4 is fine for openSUSE 11.3, and I believe is supported since 11.1.
  4. The excellent reference at raid.wiki.kernel.org suggests that the grub setup should be done as (using the second drive as an example) 'device (hd0) /dev/sdb; root (hd0,0); setup (hd0)'. However, in practice this didn't work for me, and grub failed to find the boot partition on startup. Perhaps it worked with an older version or something, but I felt it worth mentioning in case my own experience was actually the anomaly.