THogan.com Adventures In Information Systems Engineering

14 Apr 09

Ubuntu Multipath Boot From SAN Experiment

Today I ran a test of building an Ubuntu 8.10 Server x86_64 system that boots from SAN and has multipath enabled for the boot LUNs.  We had run through this exercise on Red Hat Enterprise Linux 5 (RHEL5) earlier, and wanted to test the setup on Ubuntu.

I thought I would have a nice long article to write about this.  Something complete and detailed to fill the void of information I found when looking for instructions myself.  Now I think I understand why there was nothing to be found on this topic; there really is nothing to it.

The build platform was an HP DL380 with QLogic 24xx cards connected to an EMC CLARiiON array.  I'm not sure what model; I just ask our storage engineer to give me LUNs and LUNs I get, but I do know it is a CLARiiON.  I asked for 20 GB and four paths: two HBAs, each with access to two storage processors.
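To make the path count concrete, here is a minimal sketch of how two HBAs with access to two storage processors each work out to four paths, and how many are left if one HBA drops out.  The names hba0, hba1, SP-A and SP-B are placeholders I made up for illustration; they are not taken from the actual zoning.

```python
from itertools import product

# Hypothetical labels for illustration only; the real WWNs and
# zoning names are not given in this post.
HBAS = ["hba0", "hba1"]
STORAGE_PROCESSORS = ["SP-A", "SP-B"]


def enumerate_paths(hbas, storage_processors):
    """Return every (HBA, storage processor) pair, one per path."""
    return list(product(hbas, storage_processors))


def surviving_paths(paths, failed_hba):
    """Return the paths that do not go through the failed HBA."""
    return [(hba, sp) for hba, sp in paths if hba != failed_hba]


paths = enumerate_paths(HBAS, STORAGE_PROCESSORS)

if __name__ == "__main__":
    for hba, sp in paths:
        print(f"{hba} -> {sp}")
    print(len(paths), "paths total")
    print(len(surviving_paths(paths, "hba0")), "paths left if hba0 fails")
```

Losing one HBA, or pulling one cable, still leaves the LUN reachable through the other HBA, which is the whole point of asking for four paths.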

In other boot-from-SAN builds we usually have to expose only one path to the system during the install, because we have had problems with the installer / LVM getting confused when there are multiple active paths.  Just to check, I tried the Ubuntu install with all paths exposed.  The install was successful and no manual intervention was required to make it through the first boot.

After the install completed I did an 'apt-get dist-upgrade' to bring all packages and the kernel image up to date.  Then I did an 'apt-get install multipath-tools multipath-tools-boot' and rebooted.  Up it came with LVM referring to the multipath device for its physical volumes.  That was mundane.  There was one hiccup on the first boot after installing multipath: fsck couldn't check /dev/sda1 because it had already been claimed by multipath.  I just commented the entry out in /etc/fstab and all was well on the next reboot.  I usually leave /boot unmounted anyway as a paranoid safety measure, but I assume changing the /etc/fstab entry to refer to the multipath device would fix the fsck problem too.
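I made the fstab change by hand in an editor, but if you wanted to script it, something like the following sketch would do.  It assumes the offending entry is the one mounted at /boot, and the sample fstab contents (including the vg0-root volume name) are made up for illustration rather than copied from the test system.  It works on a string, so nothing is written to /etc/fstab unless you do that yourself.

```python
def comment_out_mount(fstab_text: str, mount_point: str) -> str:
    """Return fstab_text with every active entry for mount_point commented out."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # An active entry has at least two fields and is not already a comment.
        is_active = len(fields) >= 2 and not fields[0].startswith("#")
        if is_active and fields[1] == mount_point:
            line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/mapper/vg0-root /     ext3 errors=remount-ro 0 1
/dev/sda1            /boot ext3 defaults          0 2
"""

if __name__ == "__main__":
    print(comment_out_mount(SAMPLE_FSTAB, "/boot"), end="")
```

Matching on the mount point rather than the device name means it does not matter whether the entry is written as /dev/sda1 or as a UUID, and running it twice leaves an already commented line alone.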

We proceeded to copy files to the system while pulling and replacing fiber cables, and all worked as planned.  Just for fun, to end the day we dezoned all the LUNs on the storage processors while copying files to the system and watched it slowly fall apart.  The really amazing thing, though, is that we re-mapped the LUNs to the system and it came back to life without a reboot!  My scp that had stalled out started moving again and finished up, and all the multipath failure messages flushed to the system logs.  I have to admit that I didn't expect the system to survive losing all of its I/O for ten minutes =D

Filed under: Linux, Storage