mycroes

There's always time to play

Wednesday, February 11, 2009

Fallback kernels using GRUB (or safe rebooting of remote machines)

If you can't reach a certain computer (like a server that's colocated), but need to reboot it anyway, it's better to do it in a safe way. GRUB allows the use of the "fallback" command to specify a kernel to boot if it can't boot the kernel specified by default, but what if it can boot the default kernel but the kernel dies at some point? Well GRUB will be happy, it did boot a kernel.

The solution is in the "savedefault" option. When using "default saved" in the GRUB config file, GRUB will boot the kernel that was saved as default. Very easy to understand. But how can we use this to our advantage? Well let's assume we have a new kernel and an old kernel, named new and old, the config will probably look something like this:
default 0
fallback 1
timeout 5

title New
kernel /new root=/dev/sda1

title Old
kernel /old root=/dev/sda1

There are two things that need to be changed here. First, the kernel needs to know what to do when it panics. Because there's only one reasonable thing to do, it's implemented by means of a kernel command line argument. Append "panic=5" to the kernel command line in the GRUB config and the system will reboot five seconds after the kernel panics.

The second change is that we need to tell GRUB that it needs to boot the second kernel if the first fails. We can't easily tell GRUB to boot the second only if the first has failed, but we can reliably tell GRUB to make the second default whenever the first boots. The changes will result in the following config (menu.lst):
default saved
fallback 1
timeout 5

title New
kernel /new root=/dev/sda1 panic=5
savedefault 1

title Old
kernel /old root=/dev/sda1

This config will instruct GRUB to set Old as the default kernel whenever New gets booted, and the "default saved" line will tell GRUB to boot the kernel saved as default. This can be chained to have multiple fallback kernels, but that's all up to you.

There's one issue left: if you want to reboot the system and the new kernel did work out, GRUB will still (correctly) assume it has to boot the old kernel. However, there's a linux userspace utility that will allow you to change the default:
# grub-set-default 0

This will reset the default to 0. Of course if the system boots you can edit the menu.lst file by hand, setting the default to 0, but if you want to use it again you had better reset the default beforehand.
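The reset step can be scripted so that it only happens when the new kernel is really the one that is running. The following is a minimal sketch, not a tested recipe: the helper name next_default and the release string "2.6.28-new" are made up for illustration, and it assumes entry 0 is New and entry 1 is Old, as in the config above.

```shell
#!/bin/sh
# Sketch: pick the entry that should become the saved default after a reboot.
# Assumes the menu.lst layout from this post: entry 0 = New, entry 1 = Old.

# $1: release string expected from the new kernel (hypothetical value)
# $2: release string of the kernel that is running now (from `uname -r`)
# Prints 0 if the new kernel is running, 1 otherwise.
next_default() {
    if [ "$2" = "$1" ]; then
        echo 0
    else
        echo 1
    fi
}

# Usage, as root, once you trust the new kernel:
#   grub-set-default "$(next_default "2.6.28-new" "$(uname -r)")"
```

If the machine came back up on the old kernel, this leaves the default at 1, so the next reboot won't try the failing kernel again until you say so.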

Creating a multiboot USB flash drive using GRUB

So I am currently booting my system from a USB flash drive, and probably will be for quite some time. One thing that annoyed me about syslinux is that it's really nothing more than a very simple boot loader, but sometimes you want more.

I've always used GRUB. GRUB has support for splash images so it looks nice. I also think the editing features of GRUB are great. Mistyped the kernel name after updating? Edit it in the menu. I also really like GRUB's default feature, which allows for some nice tricks, but that's for another post.

All in all, I wanted to get booting using GRUB. I've always installed GRUB 'by hand'. I've noticed the existence of grub-install, but the first time I ever read about it, I read about some issue with it. It was probably not important, but since I knew GRUB's setup commands by heart I didn't need it anyway. Turns out that if you run linux, GRUB will just map your USB flash drive as another hard drive (or at least it did for me with two different drives).

To install GRUB, some files need to be present. A filesystem needs to be present too if you install the same way I do. Because I like my USB flash drives easily accessible on multiple platforms, I format them as FAT32. I must say I have only used them in linux so far, so if you're just like me you might as well format them as ext2. In typable commands:
# mkdosfs -F 32 -n [somename] /dev/sdxy
Or:
# mke2fs -L [somename] /dev/sdxy

Of course sdxy refers to the partition on the flash drive you want to use, in my case it's /dev/sde1. You don't have to use labels (-n and -L options respectively), but I prefer to label my filesystems.

I also prefer to have my filesystems clean, so I try to keep all this bootable stuff contained in a folder. Mount your flash drive, and create a folder boot:
# mount /dev/sdxy /mnt
# mkdir /mnt/boot

I will be using the folder boot to store my kernels, initramfs images and the GRUB folder with GRUB files. GRUB needs just three (3) files to get installed: stage1, stage2 and the appropriate stage1_5. The easiest way to get these is from an existing install; if that's not possible for you, ask google for another solution (e.g. "compile grub"). When using a FAT32 formatted flash drive you need to copy fat_stage1_5, when using ext2 or ext3 you should use e2fs_stage1_5.
# mkdir /mnt/boot/grub
# cp /boot/grub/stage{1,2} /mnt/boot/grub
# cp /boot/grub/fat_stage1_5 /mnt/boot/grub

GRUB now has all the files it needs, so let's install it:
# grub
grub> device (hd7) /dev/sdx

grub> root (hd7,z)
Filesystem type is fat, partition type 0xb

grub> setup (hd7)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/fat_stage1_5" exists... yes
Running "embed /boot/grub/fat_stage1_5 (hd7)"... 16 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd7) (hd7)1+16 p (hd7,0)/boot/grub/stage2
/boot/grub/menu.lst"... succeeded
Done.

grub> quit

This maps (hd7) to /dev/sdx. The root line needs to be changed to match z = y - 1: GRUB counts partitions from 0, so if you're using partition 1, enter a 0 for z, and increment both equally for other partitions. You should now be able to boot from the USB flash drive, but we're not done yet. I like to have a config too, and I included a nice menu background, so I copied splash.xpm.gz over to /mnt/boot/grub. As for the config, here's mine for a Gentoo 2008.0 minimal livecd squashfs image on my flash drive:
default 0
timeout 30
splashimage=/boot/grub/splash.xpm.gz

title Gentoo Minimal LiveCD 2008.0
kernel /boot/kernel-2008.0.img root=/dev/ram0 init=/linuxrc dokeymap looptype=squashfs loop=/minimal-2008.0.squashfs cdroot vga=791
initrd /boot/initrd-2008.0.igz

Name the file menu.lst and GRUB should be able to find it. If you now boot from the flash drive, you should see the menu. And as another bonus, booting entries seems to be way faster than it is with syslinux!
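The z = y - 1 rule for the root line is easy to get wrong when typing by hand, so here is a small sketch of it as a shell helper. The function name grub_root is my own invention, and the hd number is simply whatever you picked in the device line (7 in my case).

```shell
#!/bin/sh
# Sketch: turn a linux partition name into the argument for GRUB's "root" command.
# $1: partition device, e.g. /dev/sde1
# $2: the hd number you mapped with "device (hdN) /dev/sdx"
grub_root() {
    # Strip everything up to the last non-digit, leaving the partition number y.
    part_number="${1##*[!0-9]}"
    # A whole-disk name like /dev/sde has no partition number: refuse it.
    [ -n "$part_number" ] || return 1
    # GRUB counts partitions from 0, so z = y - 1.
    echo "(hd$2,$((part_number - 1)))"
}

# Usage:
#   grub_root /dev/sde1 7    # prints (hd7,0)
```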

Sunday, February 8, 2009

Migrating from single disk to 3-disk RAID 5 (on Gentoo)

When following this guide a disk crash before the final step might still result in data loss, so if your data *is* important, back it up!

As my 1TB harddrive was filling up I was considering the ways I could expand storage. I hate usb drives, just for the simple fact that they're slow. I also don't like external storage at all, unless it's network storage. However, network storage would be slow for me too unless I at least spend some money to buy a decent gigabit switch instead of my 10/100 switch.

The option I liked most turned out to be to add another internal drive. I was already using LVM, so extending the volume group to span another disk wouldn't be hard, but that had its serious downsides too. I'm not someone who does regular backups, I actually just don't do them at all unless something is really broken or I really feel I need to. The last time either of those happened was more than a year ago I think, so a lot has changed since then. Now there's a lot of stuff I just don't need to back up, for the simple fact that it's someone else's service. I only use IMAP mail accounts and that's about the most important thing I could lose...

Because I don't do regular backups, I hope my harddisks will stay alive. Starting with a little bit of data and a new harddisk I can survive a near-instant crash. However, if I extended my 1TB lvm volume group with another drive, the risk of losing all data would immediately become roughly twice as high, because a failure of either disk would take the volume group down. That's something I'm not willing to risk, because there are ways in which I simply don't have to risk it.

The solution I came up with was to buy 2 extra drives and build a RAID 5 array from them plus my current drive. A RAID 5 array requires at least 3 block devices, best to use 3 different disks for that, and will allow recovery of all data if at most one drive fails. It's possible to start with 3 drives and later expand to more while keeping all data on the disks. Another advantage is that data will be spread across disks, so reads and writes will also be spread across disks, which can make both faster (the processing overhead for RAID 5 is very low).
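For planning it helps to know how much space such an array actually gives you: in RAID 5 one disk's worth of capacity goes to parity, so you get (number of disks - 1) times the size of the smallest disk. A rough sketch of that arithmetic, with a made-up function name and illustrative disk sizes:

```shell
#!/bin/sh
# Sketch: usable capacity of a RAID 5 array.
# $1: number of disks (RAID 5 needs at least 3)
# $2: size of the smallest disk, in whatever unit you like
raid5_capacity() {
    [ "$1" -ge 3 ] || { echo "RAID 5 needs at least 3 devices" >&2; return 1; }
    # One disk's worth of space is used for parity.
    echo $(( ($1 - 1) * $2 ))
}

# Usage, for three disks of 1000 GB each (illustrative sizes):
#   raid5_capacity 3 1000    # prints 2000
```

The same formula shows why growing the array later is attractive: a fourth disk of the same size would add a full disk's worth of usable space.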

When I made up my mind I had to figure out how to transition from my single disk to the RAID array, without having to use yet another disk to temporarily store my current data. A while ago I found some pages that pointed out that it's possible to create a 'degraded' RAID array. A degraded array has no redundancy left, so a drive failure means losing the data on it, but as long as I still have all data on the old disk that won't be an issue. After copying all the data to the degraded array, the old drive can be added to the array to restore its redundancy.

So to summarize all of the above this is what I had to do:
- Add the drives to the system
- Create a degraded RAID 5 array on the 2 new drives
- Copy all data to the degraded RAID array
- Clean the old disk and add it to the array

Of course in reality I also had to compile a kernel with linux software raid support, because it was not there yet.

And here's what I typed with some additional explanation:
# paludis -i mdadm

Install mdadm, the linux software raid utility

# fdisk /dev/sda
# fdisk /dev/sdc

Partition the disks, in my case it just meant creating a full size partition with type fd (linux raid autodetect). Yes, my new disks are sda and sdc, I might have switched cables somewhere...

# mdadm --create -f /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdc1 missing

Create the degraded RAID 5 array with a missing drive.

# mdadm --examine --scan >> /etc/mdadm/mdadm.conf

Create a usable config file. Don't know when I would need it, but hey, it's there when I need it...

# pvcreate /dev/md0

Initialize a physical volume for lvm.

# vgcreate vg /dev/md0

Create the volume group 'vg' on the RAID array. Keep in mind that the name needs to be different from the volume group name you were already using, if you were already using one...

# lvcreate -L30G -nroot vg
# lvcreate -L1200G -nhome vg

Create logical volumes for the root and home partition (that's all I use).

# mkfs.ext4 -L root /dev/vg/root
# mkfs.ext4 -L home /dev/vg/home

Create the filesystems. Why not use ext4 immediately? It's especially nice with biggish files.

At this point it's time to copy over all files. Everyone has their own opinion about how this should be done; I just cp'ed them over from the live install, leaving out /dev, /proc and /sys and creating those directories afterwards. I then copied /dev/null and /dev/console (copying all of /dev isn't really feasible, and those are all you need in gentoo) and I was done copying.

Now there's some additional stuff to worry about. I'm using gentoo, so I have to worry about being able to start from my RAID array too. Because I wanted to use the full disks for my RAID array I'm going to have to boot from another device, like a usb flash drive. So I started by copying my kernel with RAID and lvm support to my usb drive, created the entry in my syslinux config and created an initrd with:
# genkernel --lvm --mdadm initrd

This created a nice initrd, with the helpful message that to use the lvm stuff I also had to pass dolvm on my kernel commandline. Now it was time to see if it all worked, and... fail... My RAID array was not detected, so all I could use was my old disk. After a while I found that for mdadm support you also need to pass 'domdadm' on the kernel commandline, and that solved all of my issues with booting. Now let's hope that gets documented somewhere too.

I was now able to boot using my new lvm logical volumes on my new RAID 5 array. I actually started copying my home volume when I was able to boot, because that would take quite some time. All in all there was just one thing left to do, and that was adding the old drive to the RAID array (after changing the partition layout of course):

# mdadm /dev/md0 --add /dev/sdb1

Add sdb1 to the raid array.

Immediately the raid array will be resynced. This will take quite a while, because the RAID array itself has no knowledge about the data on it, so the entire disks are being synced.

While this all happens and after this all happens you can
$ cat /proc/mdstat
to see the status of the RAID array.
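If you'd rather have a script tell you whether the array is still missing a member, the status field in mdstat can be checked mechanically. This is a sketch under the assumption that your mdstat looks like the usual md output, where each member shows up as "U" and a missing, failed or still-rebuilding member as "_" (for example [UU_]); the function name md_degraded is my own.

```shell
#!/bin/sh
# Sketch: succeed if any array in mdstat-formatted input has a missing member.
# Reads the mdstat text from stdin.
md_degraded() {
    # Matches status fields such as [UU_] or [_U], but not [UUU] or [3/2].
    grep -q '\[[U_]*_[U_]*]'
}

# Usage:
#   md_degraded < /proc/mdstat && echo "array is degraded or still resyncing"
```

Once the resync is finished the status should read [UUU] and the check stops firing.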

Now you should have a working RAID 5 array, just make sure you don't delete your data because a RAID array won't protect you from being stupid.

For a more detailed guide on just setting up the RAID array with a missing drive see Linux Software RAID 101 -- Part 3: creating an array with a missing drive, the guide I used to create my RAID array (you will notice all the mdadm commands are directly copied from that page).

Issues with syslinux

A small followup to my post about a syslinux bootable usb stick: I'm writing down the issues I have run into so far.
1. syslinux is not your average bootloader
2. syslinux on a fat32 usb drive doesn't like to get recognized as USB-FDD

Point 1 is easy to understand. Syslinux is made to be useful in the special cases, not to be useful in just about any case. Want to boot a different kernel? Edit the config file. I think grub is far more user-friendly than syslinux will ever be, so I'm definitely doing that grub bootable usb drive soon.

Point 2 is something different... I don't know if grub will work when the system tries to boot from the usb device as if it's a floppy drive. I don't know what's going wrong for syslinux either, but I would not be amazed if either the system or syslinux decides that it can't read beyond 2.88M (or 1.44M, but I guess it would at least support large floppy disks). If it's the system, then grub probably won't make any difference at all... So now you might ask, why would you want to boot with usb floppy drive emulation? Well, I don't want to, but it seems some machines have no other option depending on the kind of usb device you use...

Expect a grub multiboot usb drive howto soon.