Mirroring Disks

Poor folks' RAID

This is NOT the fake hardware RAID described in the Fake Raid HowTo (Ubuntu Community Documentation). That document deals with so-called fake hardware RAID controllers which, in contrast to real hardware RAID controllers, are little more than multi-channel IDE or SATA controllers with additional software and BIOS features to emulate a RAID configuration. The fake RAID we are talking about here mimics a "software RAID" : we use software to mirror disks and/or partitions.

What is a RAID ?

Functionally, a RAID is a set of disks that provides either "striping" or "redundancy", or a combination of both.

A striped set is a set of disks that acts as 1 disk, and presents itself as 1 disk to the operating system. Data is spread over all disks in the set. The result is better disk I/O performance (higher throughput), because the OS reads from and writes to multiple disks simultaneously. This is called a RAID level 0, because there is no redundancy.

A redundant set is also a set of disks that acts as 1 disk, and presents itself as 1 disk to the operating system. In a redundant set, 1 (sometimes more than 1) disk can fail, be removed and be replaced, all without loss of data, because the other disks contain enough additional information to replace and 'rebuild' the data on the failed disk. This is accomplished either by

Faking a mirrored set (Raid-1)

If you want to mirror a disk or a partition (as a fall-back for when your system disk dies, or as a backup for a data partition, ...) without investing in a hardware RAID controller and without using dedicated RAID software, you can accomplish redundancy by

This is sort of a manual, poor man's mirror : it's not automatic (especially the restore procedure), and it's not completely transparent (the OS is aware of multiple disks, the file system is aware that there are copies of the files ...). It is more or less transparent in that a Linux filesystem is unified and abstracts away the underlying storage volumes, because you use mount points to add storage to the directory hierarchy. We'll use that quality to create some form of fall-back / redundancy.

Mirroring a disk

Recipe :

  1. add a disk that will hold your mirror, partition it
  2. use dd for a one-time copy
  3. use rsync to keep the copy up to date (mount it as /mirror + set up cron)

For boot disks : dd will copy the boot sector as well.

Caveats : 'rsync /' will sync everything, including subdirectories that are mount points for other partitions. E.g. rsync / will include /home, even if /home is a separate partition. You may need multiple rsync statements to get things the way you want them. Also : /tmp, /proc, /dev ... don't need to be mirrored because their contents are volatile and become obsolete or are recreated during a reboot of the operating system.

If all you want to do is to mirror a disk that contains a data partition, it's sufficient to do dd if=/dev/sda of=/dev/sdb. As dd will copy all blocks of the sda disk to sdb, this will include the partition table etc., so sdb is a perfect clone of sda : you can swap the disks, or mount a partition of sdb just as you do with sda. You've made a 'backup to disk'. Note that dd overwrites everything on the target, so double-check the device names and make sure sdb is at least as large as sda.
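The clone step can be rehearsed safely on two ordinary files before pointing dd at real disks. This is only a minimal sketch: the image file names are made-up stand-ins for /dev/sda and /dev/sdb, and the block sizes are arbitrary choices.

```shell
#!/bin/sh
# Rehearse the dd clone on image files instead of real disks.
workdir=$(mktemp -d)
src="$workdir/sda.img"   # hypothetical stand-in for /dev/sda
dst="$workdir/sdb.img"   # hypothetical stand-in for /dev/sdb

# Create a 4 MiB "disk" filled with random data.
dd if=/dev/urandom of="$src" bs=1024 count=4096 2>/dev/null

# Clone it block by block. A block size larger than dd's 512-byte
# default usually speeds up the copy.
dd if="$src" of="$dst" bs=65536 2>/dev/null

# cmp exits 0 only when both files are byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    clone_status="identical"
else
    clone_status="different"
fi
echo "clone is $clone_status"
```

The same cmp check can in principle be run against two real block devices after a clone, provided neither disk was written to in the meantime.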

Mirroring a system disk / boot disk (or OS partition)

Likewise, you can clone the disk or partition that contains your root partition (/), boot files, boot records, partition table and so on, to create a mirror of your system disk.

	## add a disk to your computer
	fdisk -l
	

The output of fdisk will show 1 disk without a partition table : that's your newly added disk. Take note of the device name (eg /dev/sdb), you'll need it in the next steps.

	## clone disks sda to disk sdb
	dd if=/dev/sda of=/dev/sdb
	
	## check the result : /dev/sdb is now identical to /dev/sda
	fdisk -l
	
	## reboot so the new partitions get detected and /dev is populated correctly
	shutdown -r now
	

To keep both disks synchronized as you apply changes to the system files or data on /dev/sda, you can use rsync. rsync works on filesystem level, so you need to mount /dev/sdb1 (the partition !) to a suitable mount point. You could for instance create a directory /mirror for this purpose.

Note that you can and should exclude certain directories from rsync, because they're not real files or directories. The /mirror directory should be excluded as well because it's a mount point and the destination for your rsync operation. If you're using links or mount points for remote filesystems (smbmount, nfs, ...), these will be treated as part of / as well and will therefore be included in your rsync operation. If that's not what you want, see man rsync for command line options to deal with such situations.
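Before running rsync against the real root filesystem, the effect of anchored --exclude patterns can be tried on a scratch tree. A minimal sketch, assuming rsync is installed; the directory names under the temporary directory are made up for the demonstration.

```shell
#!/bin/sh
# Try out --exclude on a scratch tree; the real / and /mirror are not touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/root/etc" "$workdir/root/tmp" "$workdir/root/mirror" "$workdir/mirror"
echo "keep me"  > "$workdir/root/etc/hosts"
echo "volatile" > "$workdir/root/tmp/scratch"

if command -v rsync >/dev/null 2>&1; then
    # Patterns with a leading slash are anchored at the top of the transfer,
    # here the scratch "root" directory.
    rsync -a --delete-during \
          --exclude "/mirror" \
          --exclude "/tmp" \
          "$workdir/root/" "$workdir/mirror/"
    # Show what ended up on the scratch mirror.
    ls -R "$workdir/mirror"
else
    echo "rsync is not installed; skipping the demonstration"
fi
```

Only etc/hosts should arrive on the scratch mirror. For the mount point problem, rsync's -x (--one-file-system) option is one of the options the man page describes: it stops rsync from descending into other mounted filesystems.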

	## synchronising /dev/sda (/) and /dev/sdb1 (/mirror)
	## -t ext3 assumes the mirror partition holds an ext3 filesystem
	mount -t ext3 /dev/sdb1 /mirror && \
	rsync 	-av --delete-during \
			--exclude "/mirror" \
			--exclude "/mnt"	\
			--exclude "/media"	\
			--exclude "/dev"	\
			--exclude "/proc"	\
			--exclude "/tmp"	\
			--exclude "/sys"	\
			--exclude "/var/run"  \
			--exclude "/var/lock" \
			--exclude "/var/tmp"  \
			--exclude "volatile"  \
			--progress \
			/ /mirror
			
	umount /mirror
	

Create a cron job with the above mount - rsync - unmount procedure so your mirror stays up to date.
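One way to package that procedure for cron is a small wrapper script. The sketch below is an assumption, not part of the original setup: the script name, the exclude file /etc/mirror-excludes and the DRY_RUN switch are all hypothetical. With DRY_RUN=1 (the default here) it only prints the commands it would run, so it can be tried without root privileges.

```shell
#!/bin/sh
# mirror-sync.sh: hypothetical wrapper for the mount - rsync - unmount cycle.

MIRROR_DEV=${MIRROR_DEV:-/dev/sdb1}   # assumed mirror partition
MIRROR_MNT=${MIRROR_MNT:-/mirror}     # assumed mount point
DRY_RUN=${DRY_RUN:-1}                 # 1 = only print, 0 = really execute

# Run a command, or just print it when in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

sync_mirror() {
    # Give up early when the mirror cannot be mounted.
    run mount "$MIRROR_DEV" "$MIRROR_MNT" || return 1
    # /etc/mirror-excludes is a hypothetical file with one exclude pattern per line.
    run rsync -a --delete-during --exclude-from=/etc/mirror-excludes / "$MIRROR_MNT"
    status=$?
    # Always unmount, even when rsync failed, so the mirror is not left mounted.
    run umount "$MIRROR_MNT"
    return $status
}

sync_mirror
```

Assuming the script is saved as /usr/local/sbin/mirror-sync.sh (a made-up path), a line such as `0 3 * * * DRY_RUN=0 /usr/local/sbin/mirror-sync.sh` in root's crontab would run it every night at 03:00.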

Restore

Restore a data partition

this is simple :

  1. umount failed partition (eg /home)
  2. mount mirror partition as (new) home
  3. done. make it permanent by editing fstab.
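The fstab edit from step 3 can be rehearsed on a scratch copy first. This is a sketch under assumptions: the device names /dev/sda2 and /dev/sdb2 are made up for illustration, and the sample file stands in for /etc/fstab.

```shell
#!/bin/sh
# Rehearse the fstab edit on a scratch copy; nothing under /etc is touched.
# On the live system, steps 1 and 2 would be something like:
#   umount /home
#   mount /dev/sdb2 /home
workdir=$(mktemp -d)
fstab="$workdir/fstab"

# Sample fstab; device names are assumptions for illustration.
cat > "$fstab" <<'EOF'
/dev/sda1  /      ext3  defaults  0  1
/dev/sda2  /home  ext3  defaults  0  2
EOF

# Point the /home entry at the mirror partition (hypothetically /dev/sdb2).
# awk rewrites only the line whose second field is exactly /home.
awk '$2 == "/home" { $1 = "/dev/sdb2" } { print }' "$fstab" > "$fstab.new"
mv "$fstab.new" "$fstab"

cat "$fstab"
```

Writing to a new file and renaming it afterwards avoids ending up with a half-written fstab if something goes wrong midway.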

Restore a system partition, recover from a failed system disk

There's more than one way to use the mirror you've created to recover from a failed disk.

Simple : swap the disks

Since disk 2 is a copy of disk 1 and is now connected to the appropriate controller, it will become 'sda' to your operating system. The system will boot as if nothing ever changed.

One thing that could interfere with this procedure is the use of Universally Unique Identifiers (UUIDs) to identify partitions in /etc/fstab. As a test, I did NOT change them, and the above procedure worked (in VMware). That is to be expected : dd copies the filesystems bit for bit, UUIDs included, so the clone answers to the same UUIDs as the original. If this causes trouble anyway, you can probably work around it by replacing the UUIDs in /etc/fstab and the UUID= references in /boot/grub/menu.lst by device names (eg /dev/sda1) and running update-grub.
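To see whether a system relies on UUIDs at all, the fstab entries that use them can be listed. A minimal sketch: the helper name uuid_entries and the sample file (including its UUID) are made up for illustration.

```shell
#!/bin/sh
# List fstab entries that identify their filesystem by UUID.
# Usage: uuid_entries FILE
uuid_entries() {
    # Field 1 is the device spec, field 2 the mount point.
    # Comment lines start with "#" and therefore never match.
    awk '$1 ~ /^UUID=/ { print $2 " is mounted by " $1 }' "$1"
}

# Demonstration on a sample file; the UUID below is invented.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=0a1b2c3d-1111-2222-3333-444455556666 /     ext3 defaults 0 1
/dev/sda2                                 /home ext3 defaults 0 2
EOF

uuid_entries "$sample"
```

On a real system the call would be uuid_entries /etc/fstab, and blkid can be used to compare the UUIDs reported for the partitions on both disks.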

Unlike a real mirror, this can be used as a roll-back mechanism : say you made config changes that did not work. As long as the changes have not been copied to the mirror disk, you can switch to the mirror disk to roll back to the previous state. So you might want to disable the mirroring cron job when you're about to make tricky, experimental changes to your system or test something new.

Conclusion

It's a workaround, and not a real replacement for a real (hardware or software) RAID. For one, it's not instantaneous : when falling back to the mirror, anything changed since the last rsync run is lost. The magnitude of this risk depends on the rsync schedule, but a short interval may be too much of a performance hog and interfere with normal operations. Secondly, the restore procedure has to be executed manually, while in a real RAID this is automatic, transparent to the operating system, and without down time. Thirdly, this system only mimics RAID-1, no other RAID levels.

While it's no real replacement for a proper RAID, this 'fake soft RAID' can be useful as a cheap and simple way to provide data redundancy and a recovery mechanism, e.g. for a PC that you are using as a router or a firewall, or to provide network services such as routing, DNS, DHCP, etc. to your LAN. dd and rsync can also be used, in the way described above, as part of a backup-to-disk(-to-tape) scenario, or to provide a roll-back option in case a proper testing environment is not an option.

Koen Noens
May 2008