mycroes

There's always time to play

Wednesday, February 11, 2009

Fallback kernels using GRUB (or safe rebooting of remote machines)

If you can't physically reach a computer (a colocated server, for example) but need to reboot it anyway, it's better to do it in a safe way. GRUB has a "fallback" command that specifies which entry to boot if it can't boot the default one. But what if GRUB can boot the default kernel and the kernel dies at some point afterwards? Well, GRUB will be happy: it did boot a kernel.

The solution is the "savedefault" command. With "default saved" in the GRUB config file, GRUB boots the entry that was last saved as default. Easy to understand, but how can we use this to our advantage? Let's assume we have a new kernel and an old kernel, named new and old. The config will probably look something like this:
default 0
fallback 1
timeout 5

title New
kernel /new root=/dev/sda1

title Old
kernel /old root=/dev/sda1

There are two things that need to be changed here. First, the kernel needs to know what to do when it panics. The only reasonable thing to do on a remote machine is reboot, and that behaviour is controlled by a kernel command-line argument. Append "panic=5" to the kernel command line in the GRUB config and the system will reboot five seconds after the kernel panics.
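Once the machine is up, it can be worth checking that the argument really reached the kernel. The following is only a minimal sketch: the helper name panic_arg is made up for this example, and it simply picks the panic= value out of a command-line string such as the contents of /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRUB or the kernel tools):
# print the value of the last panic= argument in the command line
# passed as $1, or an empty line if there is none.
panic_arg() {
    value=""
    for word in $1; do
        case "$word" in
            panic=*) value="${word#panic=}" ;;
        esac
    done
    echo "$value"
}

# On a running Linux system, inspect the command line the kernel booted with.
if [ -r /proc/cmdline ]; then
    timeout=$(panic_arg "$(cat /proc/cmdline)")
    echo "panic timeout from command line: ${timeout:-not set}"
fi
```

The timeout currently in effect can also be read from /proc/sys/kernel/panic; a value of 0 means the kernel will not reboot by itself after a panic.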

The second change is that we need to tell GRUB that it needs to boot the second kernel if the first fails. We can't easily tell GRUB to boot the second only if the first has failed, but we can reliably tell GRUB to make the second default whenever the first boots. The changes will result in the following config (menu.lst):
default saved
fallback 1
timeout 5

title New
kernel /new root=/dev/sda1 panic=5
savedefault 1

title Old
kernel /old root=/dev/sda1

This config will instruct GRUB to set Old as the default kernel whenever New gets booted, and the "default saved" line will tell GRUB to boot the kernel saved as default. This can be chained to have multiple fallback kernels, but that's all up to you.

There's one issue left: if the new kernel did work out and you want to reboot the system, GRUB will still (correctly) assume it has to boot the old kernel. However, there's a Linux userspace utility that lets you change the saved default:
# grub-set-default 0

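This will reset the saved default to 0, so the next reboot tries the new kernel again. Of course, once the system has booted fine you can also edit the menu.lst file by hand and set the default back to 0. If you keep "default saved" and want to use the trick again, though, you had better reset the saved default before every reboot.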
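If you'd rather not do this by hand, the reset could be scripted to run late in the boot process. This is just a sketch under a few assumptions: the EXPECTED_RELEASE variable and the should_reset_default function are names invented for this example, and "the new kernel is running" is taken to mean that uname -r reports the release you expected.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the running kernel release ($2)
# is the one we expected to boot ($1).
should_reset_default() {
    [ "$1" = "$2" ]
}

# Assumption: the admin sets EXPECTED_RELEASE to the release string of the
# new kernel, for instance in the environment of an init script.
EXPECTED_RELEASE="${EXPECTED_RELEASE:-}"

# Only touch the saved default when the new kernel is the one running.
if [ -n "$EXPECTED_RELEASE" ] && should_reset_default "$EXPECTED_RELEASE" "$(uname -r)"; then
    grub-set-default 0
fi
```

Whether you want this automated is up to you: it only proves that the new kernel got far enough to run the script, not that the machine is healthy.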
