Recovering a Linux software RAID1 system when the disk with the boot loader dies

In a previous post I covered installing Ubuntu on a software RAID1 system with UEFI (using a GPT partition table, as the disks were 4TB in size). One of my disks died and, inevitably, it was the one that I had installed the boot loader on, so my system was unbootable (although the data was safe on the other disk).

In order to recover things, I first had to boot the computer into Ubuntu from a USB memory stick. The first thing to ensure is that the system boots the USB stick in UEFI mode. This may involve some fiddling in the UEFI settings when you first boot the computer, but you also need to ensure that the USB stick itself has been created so that it can boot in UEFI mode. I originally created mine on my MacBook Air with UNetbootin, but this doesn't support UEFI, so I had to do it manually with the following commands:

# Convert the downloaded iso image to the required format

hdiutil convert ubuntu-14.04.1-desktop-amd64.iso -format UDRW -o ubuntu-14.04.1-desktop-amd64.img

# Insert USB and work out what it’s called by running

diskutil list

# Mine was /dev/disk1 so unmount it with:

diskutil unmountDisk /dev/disk1

# Copy the image to the disk (hdiutil adds the .dmg extension to the output file name)

sudo dd if=ubuntu-14.04.1-desktop-amd64.img.dmg of=/dev/disk1 bs=1m

I could then boot my machine into Ubuntu from the USB in UEFI mode. The next step was to install mdadm (the tool for managing Linux software RAID; it's not on the CD, so you need to get it from the Software Center) and then see what disks were available by using parted:

sudo parted -l

This showed me that the remaining disk was /dev/sdb and that it had 4 partitions, and in this case the two raid partitions are /dev/sdb3 and /dev/sdb4.

Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdb: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system     Name     Flags
 1      1049kB  106MB   105MB   fat32           primary  boot
 2      106MB   8000MB  7894MB  linux-swap(v1)  primary
 3      8000MB  108GB   100GB                   primary  msftdata
 4      108GB   3001GB  2893GB
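
Before changing anything, two read-only checks can save some head-scratching: whether the live session really booted in UEFI mode, and which partitions carry RAID metadata. The following is only a sketch. The helper names boot_mode and raid_members are made up for this example, it assumes lsblk (from util-linux) is present on the live system, and the optional argument to boot_mode exists purely so the function can be tried against a fake directory.

```shell
#!/bin/sh
# Report whether the running system was booted via UEFI or legacy BIOS.
# The kernel only exposes /sys/firmware/efi when it was started by UEFI firmware.
# $1 (optional, hypothetical test hook): alternative sysfs root, default /sys.
boot_mode() {
    if [ -d "${1:-/sys}/firmware/efi" ]; then
        echo UEFI
    else
        echo BIOS
    fi
}

# List partitions that carry Linux software RAID metadata.
# Any arguments (e.g. /dev/sdb) are passed straight to lsblk.
raid_members() {
    lsblk -rno NAME,FSTYPE "$@" |
        awk '$2 == "linux_raid_member" { print "/dev/" $1 }'
}

# Typical use on the live system (not run here):
#   boot_mode
#   raid_members /dev/sdb
```

If boot_mode reports BIOS, the later grub-install step is unlikely to be able to register a UEFI boot entry, so it is worth fixing the boot mode first.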

I then assembled the system partition (/dev/sdb3) into a RAID array called /dev/md0:

sudo mdadm --assemble /dev/md0 /dev/sdb3
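
With only one of the two members present, mdadm may assemble the array but leave it inactive. A minimal sketch of how that case could be handled, assuming the device names used above: the --run option asks mdadm to start the array even though a member is missing, and --detail then shows its state. assemble_degraded is a hypothetical helper name, not something mdadm provides.

```shell
#!/bin/sh
# Assemble an array from whatever members are left and start it degraded.
# $1: the md device to create (e.g. /dev/md0)
# remaining arguments: the surviving member partitions (e.g. /dev/sdb3)
assemble_degraded() {
    md="$1"
    shift
    # --run: start the array even if fewer members are present than expected.
    # --detail: print the resulting state so a "degraded" array is visible.
    sudo mdadm --assemble --run "$md" "$@" && sudo mdadm --detail "$md"
}

# Example (not run here):
#   assemble_degraded /dev/md0 /dev/sdb3
```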

I then followed the instructions for reinstalling grub here: http://ubuntuforums.org/showthread.php?t=1581099

# Make a directory to mount the filesystem on

sudo mkdir -p /mnt/temp

# Mount the system partition

sudo mount /dev/md0 /mnt/temp

# Mount the EFI boot partition, which for me is /dev/sdb1, on boot/efi inside the mounted system. The mount point lives on the system partition, so it can only be used once /dev/md0 is mounted; mkdir -p just makes sure it is there.
# (Not actually tried the below; I think this is what I'll need to do. The end result just needs to be that /dev/sdb1 ends up mounted on /boot/efi in the chrooted system.)

sudo mkdir -p /mnt/temp/boot/efi

sudo mount /dev/sdb1 /mnt/temp/boot/efi

# Mount the temporary filesystems onto the system:

for i in /dev /dev/pts /proc /sys; do sudo mount -B "$i" "/mnt/temp$i"; done
sudo cp /etc/resolv.conf /mnt/temp/etc/resolv.conf  # May be required to connect to the Internet.

# chroot to the new system
sudo chroot /mnt/temp

# Now install grub to the disk (this runs efibootmgr to actually install the boot loader)
sudo grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=arch_grub --recheck --debug

# Update grub to write the config files
sudo update-grub
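
Once update-grub has finished, the chroot can be left with exit and everything unmounted in the reverse order of mounting before rebooting. This is a sketch that assumes the /mnt/temp layout used above; unmount_chroot is a made-up helper name.

```shell
#!/bin/sh
# Run this from the live session AFTER typing `exit` in the chroot.
# $1 (optional): where the installed system was mounted, default /mnt/temp.
unmount_chroot() {
    root="${1:-/mnt/temp}"
    # Innermost mounts first: /dev/pts before /dev, and the root filesystem last.
    for m in /boot/efi /sys /proc /dev/pts /dev; do
        sudo umount "$root$m" || return 1
    done
    sudo umount "$root"
}

# Example (not run here):
#   unmount_chroot /mnt/temp
```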

And this seemed to work, as I then had a bootable system. After booting with the new disk attached, the next job was to add the new disk to the RAID array. The first step is to copy the partition table from the working disk (/dev/sdb) to the new disk (/dev/sda). As this is a GPT disk, you can't do this with parted (why?!?), so I used sgdisk instead:

# BE VERY CAREFUL WITH THIS, AS IF YOU GET THE ORDER WRONG YOU WILL NUKE THE PARTITION TABLE OF THE GOOD DISK!
# This copies the partition table FROM sdX TO sdY and then randomises the GUIDs on sdY so that both disks can be used in the same machine
sgdisk -R /dev/sdY /dev/sdX
sgdisk -G /dev/sdY
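
Because the argument order is so easy to get wrong, it may be worth wrapping the two commands so that the table about to be overwritten is saved first. The sketch below is one way to do that, not a tested recipe: copy_gpt and the default backup path are hypothetical, and --replicate and --randomize-guids are simply the long forms of -R and -G.

```shell
#!/bin/sh
# Copy the GPT from a good disk to a new disk, saving the target's old table first.
# $1: good disk to copy FROM (e.g. /dev/sdb)
# $2: new disk to copy TO (e.g. /dev/sda)
# $3 (optional): backup file, hypothetical default /tmp/gpt-backup.bin
copy_gpt() {
    src="$1"
    dst="$2"
    backup="${3:-/tmp/gpt-backup.bin}"
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: copy_gpt GOOD_DISK NEW_DISK [BACKUP_FILE]" >&2
        return 1
    fi
    # Save whatever table is on the target; if the arguments were swapped by
    # mistake, this file is what lets the good disk's table be restored.
    sudo sgdisk --backup="$backup" "$dst" ||
        echo "warning: could not back up $dst (blank disk?)" >&2
    sudo sgdisk --replicate="$dst" "$src" || return 1
    sudo sgdisk --randomize-guids "$dst"
}

# Example (not run here):
#   copy_gpt /dev/sdb /dev/sda
# A saved table can be written back with: sgdisk --load-backup=FILE /dev/sdZ
```

Keeping the backup file somewhere other than the disks involved (a USB stick, say) would make it more useful if something does go wrong.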

With the new disk partitioned, the final step is to add its two RAID partitions to the RAID arrays. The two arrays are called md1 and md2:

sudo mdadm --manage /dev/md1 --add /dev/sda3

sudo mdadm --manage /dev/md2 --add /dev/sda4

cat /proc/mdstat then shows that the array is being rebuilt.
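
To keep an eye on the rebuild, or to spot a degraded array later, /proc/mdstat can be scanned for status strings such as [_U], where an underscore marks a member that is missing or not yet in sync. A small sketch: degraded_arrays is a name invented for this example, and the optional file argument is only there so the function can be tried on a saved copy of mdstat.

```shell
#!/bin/sh
# Print the name of every md array whose member status contains an underscore,
# e.g. [_U] or [U_], meaning at least one member is missing or still rebuilding.
# $1 (optional): file to read instead of /proc/mdstat.
degraded_arrays() {
    awk '/^md/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name }' "${1:-/proc/mdstat}"
}

# Example (not run here):
#   degraded_arrays
```

An empty result means every array reports all members up. Running watch -n 5 cat /proc/mdstat gives a live view of the recovery percentage in the meantime.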

 
