Useful HOWTOs:
http://unthought.net/Software-RAID.HOWTO/
(OLD) http://www.tldp.org/HOWTO/Software-RAID-HOWTO-4.html
Where to get raidtools:
> cd /root
> mkdir raid
> cd raid/
> wget http://people.redhat.com/mingo/raidtools/raidtools-1.00.3.tar.gz
For software RAID we need a kernel with RAID support (built into 2.4 and later; older kernels need the RAID patches) and the raidtools.
To test the kernel (from the FAQ):
If your system has RAID support, you should have a file called /proc/mdstat. Remember
it, that file is your friend. If you do not have that file, maybe your kernel does
not have RAID support. See what the file contains by doing a cat /proc/mdstat. It should
tell you that you have the right RAID personality (e.g. RAID mode) registered, and
that no RAID devices are currently active.
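A minimal sketch of that check as a shell function. The name has_raid_personality is made up for these notes; it only greps the "Personalities" line and takes an optional file argument so it can be tried against a saved copy of /proc/mdstat.

```shell
# has_raid_personality LEVEL [FILE]   (hypothetical helper, not part of raidtools)
# Succeeds if FILE (default /proc/mdstat) lists LEVEL, e.g. "raid1",
# on its "Personalities :" line. Returns 2 if FILE cannot be read,
# which usually means the kernel has no RAID support.
has_raid_personality() {
    level=$1
    file=${2:-/proc/mdstat}
    [ -r "$file" ] || return 2
    grep '^Personalities' "$file" | grep -q "\[$level\]"
}

# Example:
#   has_raid_personality raid1 && echo "RAID 1 personality registered"
```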
Software raid is configured through /etc/raidtab. Here's an example for raid 1:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/sdb6
raid-disk 0
device /dev/sdc5
raid-disk 1
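Since every mirror in these notes uses the same stanza with different devices, here is a small sketch that prints one. raidtab_mirror is a name invented for this example; the values (chunk-size 4, no spares, persistent superblock) are simply copied from the listing above.

```shell
# raidtab_mirror MD DEV0 DEV1   (hypothetical helper)
# Prints a two-disk RAID 1 stanza in the same layout as the example above.
raidtab_mirror() {
    cat <<EOF
raiddev $1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device $2
raid-disk 0
device $3
raid-disk 1
EOF
}

# Example (review the output before appending it to /etc/raidtab):
#   raidtab_mirror /dev/md0 /dev/sdb6 /dev/sdc5
```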
The mkraid command is used to initialize a new raid array:
mkraid /dev/md0
Initial State
---------------
Here's the initial setup:
> df
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda2 38456340 1462116 35040720 5% /
/dev/hda1 23302 5976 16123 28% /boot
none 515348 0 515348 0% /dev/shm
/dev/hdc1 39516436 32828 37476280 1% /mnt/drive2
> fdisk /dev/hda
Command (m for help): p
Disk /dev/hda: 255 heads, 63 sectors, 4998 cylinders
Units = cylinders of 16065 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 3 24066 83 Linux
/dev/hda2 4 4867 39070080 83 Linux
/dev/hda3 4868 4998 1052257+ 82 Linux swap
> fdisk /dev/hdc
Command (m for help): p
Disk /dev/hdc: 16 heads, 63 sectors, 79656 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 * 1 79656 40146592+ 83 Linux
Test Setup - Standard
---------------------
1. Remove old partitions on /dev/hdc:
> fdisk /dev/hdc
Command (m for help): p
Disk /dev/hdc: 16 heads, 63 sectors, 79656 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 * 1 79656 40146592+ 83 Linux
Command (m for help): d
Partition number (1-4): 1
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
2. Create 2 equal sized partitions for testing:
> fdisk /dev/hdc
Command (m for help): p
Disk /dev/hdc: 16 heads, 63 sectors, 79656 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-79656, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-79656, default 79656): +1024M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (2082-79656, default 2082):
Using default value 2082
Last cylinder or +size or +sizeM or +sizeK (2082-79656, default 79656): +1024M
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
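Where the cylinder numbers come from: a rough sketch, assuming fdisk treats +sizeM as mebibytes and rounds up to whole cylinders. That assumption is consistent with the numbers in these notes (partition 2 starts at cylinder 2082, so partition 1 ends at 2081). The function names are made up for this example.

```shell
# cyls_for_mib MIB HEADS SECTORS   (hypothetical helper)
# Whole cylinders needed to hold MIB mebibytes, rounding up.
cyls_for_mib() {
    cyl_bytes=$(( $2 * $3 * 512 ))
    echo $(( ($1 * 1048576 + cyl_bytes - 1) / cyl_bytes ))
}

# blocks_for_cyls CYLS HEADS SECTORS   (hypothetical helper)
# Size of CYLS cylinders in 1 KB blocks, as shown in the fdisk "Blocks" column.
blocks_for_cyls() {
    echo $(( $1 * $2 * $3 * 512 / 1024 ))
}

# Example for /dev/hdc (16 heads, 63 sectors):
#   cyls_for_mib 1024 16 63      # 2081
#   blocks_for_cyls 2081 16 63   # 1048824
```

The first partition on a disk shows slightly fewer blocks than this, presumably because it starts after the first track.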
3. Setup /etc/raidtab:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdc1
raid-disk 0
device /dev/hdc2
raid-disk 1
4. Prepare the partitions with mkraid:
> mkraid /dev/md0
handling MD device /dev/md0
analyzing super-block
disk 0: /dev/hdc1, 1048792kB, raid superblock at 1048704kB
disk 1: /dev/hdc2, 1048824kB, raid superblock at 1048704kB
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc2[1] hdc1[0]
1048704 blocks [2/2] [UU]
[========>............] resync = 44.2% (465664/1048704) finish=0.9min speed=10171K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc2[1] hdc1[0]
1048704 blocks [2/2] [UU]
[=========>...........] resync = 48.7% (512512/1048704) finish=0.8min speed=10056K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc2[1] hdc1[0]
1048704 blocks [2/2] [UU]
[===============>.....] resync = 79.7% (836608/1048704) finish=0.3min speed=10148K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc2[1] hdc1[0]
1048704 blocks [2/2] [UU]
unused devices: <none>
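Instead of repeating cat /proc/mdstat by hand, the wait can be scripted. A minimal sketch, assuming the in-progress line always contains "resync =" or "recovery =" as in the output in these notes; both function names are made up here.

```shell
# md_sync_busy [FILE]   (hypothetical helper)
# Succeeds while FILE (default /proc/mdstat) shows a resync or recovery line.
md_sync_busy() {
    grep -Eq '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

# wait_for_md_sync [FILE] [INTERVAL]   (hypothetical helper)
# Polls every INTERVAL seconds (default 10) until no sync is in progress.
wait_for_md_sync() {
    while md_sync_busy "${1:-/proc/mdstat}"; do
        sleep "${2:-10}"
    done
}

# Example:
#   wait_for_md_sync && echo "array is in sync"
```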
5. Check out the results:
> fdisk /dev/md0
Command (m for help): p
Disk /dev/md0: 2 heads, 4 sectors, 262176 cylinders
Units = cylinders of 8 * 512 bytes
Device Boot Start End Blocks Id System
/dev/md0p1 8 10036530 40146088+ 83 Linux
Partition 1 does not end on cylinder boundary:
phys=(1023, 15, 63) should be (1023, 1, 4)
Command (m for help):
6. Mount up the new partition:
> mkdir /mnt/raid
> mount /dev/md0 /mnt/raid
> df
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda2 38456340 1462376 35040460 5% /
/dev/hda1 23302 5976 16123 28% /boot
none 515348 0 515348 0% /dev/shm
/dev/md0 39516436 32828 37476280 1% /mnt/raid
7. Shut down RAID:
> umount /dev/md0
> raidstop --all /dev/md0
Test Setup - Degraded Mode
--------------------------
1. Create partitions:
> fdisk /dev/hdc
Command (m for help): p
Disk /dev/hdc: 16 heads, 63 sectors, 79656 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 1 2081 1048792+ 83 Linux
/dev/hdc2 2082 4162 1048824 83 Linux
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (4163-79656, default 4163):
Using default value 4163
Last cylinder or +size or +sizeM or +sizeK (4163-79656, default 79656): +1024M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 4
First cylinder (6244-79656, default 6244):
Using default value 6244
Last cylinder or +size or +sizeM or +sizeK (6244-79656, default 79656): +1024M
Command (m for help): p
Disk /dev/hdc: 16 heads, 63 sectors, 79656 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 1 2081 1048792+ 83 Linux
/dev/hdc2 2082 4162 1048824 83 Linux
/dev/hdc3 4163 6243 1048824 83 Linux
/dev/hdc4 6244 8324 1048824 83 Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
2. Format/mount the first new partition.
> mke2fs /dev/hdc3
mke2fs 1.27 (8-Mar-2002)
warning: 62 blocks unused.
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
131328 inodes, 262144 blocks
13110 blocks (5.00%) reserved for the super user
First data block=0
8 block groups
32768 blocks per group, 32768 fragments per group
16416 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 30 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
> mount /dev/hdc3 /mnt/drive2
3. Setup /etc/raidtab:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdc1
raid-disk 0
device /dev/hdc2
raid-disk 1
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdc4
raid-disk 0
device /dev/hdc3
failed-disk 1
4. Build the array:
> mkraid /dev/md1
handling MD device /dev/md1
analyzing super-block
disk 0: /dev/hdc4, 1048824kB, raid superblock at 1048704kB
disk 1: /dev/hdc3, failed
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 hdc4[0]
1048704 blocks [2/1] [U_]
unused devices: <none>
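The [2/1] [U_] line is what marks the array as degraded: one "U" per working member, "_" for a missing one. A small sketch that pulls that field out for a given array, assuming the status line directly follows the "mdN : active ..." line as in the output above. The function names are made up for this example.

```shell
# md_member_state MD [FILE]   (hypothetical helper)
# Prints the member-state field, e.g. "[U_]", for array MD (e.g. "md1").
md_member_state() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { want = 1; next }
        want { print $NF; exit }
    ' "${2:-/proc/mdstat}"
}

# md_is_degraded MD [FILE]   (hypothetical helper)
# Succeeds if any member slot of MD is shown as "_".
# An array that is not listed at all counts as not degraded.
md_is_degraded() {
    case $(md_member_state "$@") in
        *_*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example:
#   md_is_degraded md1 && echo "md1 is running degraded"
```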
5. Put a filesystem on the array:
> mke2fs /dev/md1
mke2fs 1.27 (8-Mar-2002)
warning: 32 blocks unused.
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
131328 inodes, 262144 blocks
13108 blocks (5.00%) reserved for the super user
First data block=0
8 block groups
32768 blocks per group, 32768 fragments per group
16416 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
6. Copy contents of failed drive to array:
> mount /dev/md1 /mnt/raid/
> cd /mnt/drive2
> find . -xdev | cpio -pm /mnt/raid
> umount /dev/hdc3
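A rough sanity check that can be run before the umount, sketched under the assumption that comparing entry counts is good enough for a test setup. It does not compare file contents, and the function names are made up for this example.

```shell
# count_entries DIR   (hypothetical helper)
# Number of files and directories under DIR, staying on one filesystem.
count_entries() {
    ( cd "$1" && find . -xdev | wc -l )
}

# same_entry_count SRC DST   (hypothetical helper)
# Succeeds if both trees contain the same number of entries.
same_entry_count() {
    [ "$(count_entries "$1")" -eq "$(count_entries "$2")" ]
}

# Example:
#   same_entry_count /mnt/drive2 /mnt/raid || echo "copy looks incomplete"
```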
7. Edit /etc/raidtab, making the failed drive active:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdc1
raid-disk 0
device /dev/hdc2
raid-disk 1
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdc4
raid-disk 0
device /dev/hdc3
raid-disk 1
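The only change from the earlier raidtab is failed-disk becoming raid-disk. A minimal sed sketch of that edit, written as a filter so the result can be reviewed before it replaces /etc/raidtab. The function name is made up here, and it rewrites every failed-disk line it sees, which is fine for this file because there is only one.

```shell
# activate_failed_disks < old-raidtab > new-raidtab   (hypothetical helper)
# Turns each "failed-disk N" line into "raid-disk N"; all other lines pass through.
activate_failed_disks() {
    sed 's/^\([[:space:]]*\)failed-disk\([[:space:]]\)/\1raid-disk\2/'
}

# Example:
#   activate_failed_disks < /etc/raidtab > /tmp/raidtab.new
#   diff /etc/raidtab /tmp/raidtab.new
```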
8. Hot add the failed drive to the array:
> raidhotadd /dev/md1 /dev/hdc3
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 hdc3[2] hdc4[0]
1048704 blocks [2/1] [U_]
[=>...................] recovery = 8.4% (90048/1048704) finish=1.5min speed=10005K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 hdc3[2] hdc4[0]
1048704 blocks [2/1] [U_]
[==========>..........] recovery = 51.8% (544640/1048704) finish=0.8min speed=10131K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 hdc3[2] hdc4[0]
1048704 blocks [2/1] [U_]
[===============>.....] recovery = 77.7% (816512/1048704) finish=0.4min speed=9480K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 hdc3[1] hdc4[0]
1048704 blocks [2/2] [UU]
unused devices: <none>
That's it!
Test Setup - Degraded Mode - Real Drives
----------------------------------------
1. Check initial config of primary drive.
> fdisk /dev/hda
Command (m for help): p
Disk /dev/hda: 255 heads, 63 sectors, 1870 cylinders
Units = cylinders of 16065 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 3 24066 83 Linux
/dev/hda2 4 41 305235 82 Linux swap
/dev/hda3 42 1870 14691442+ 83 Linux
2. Partition secondary drive.
> fdisk /dev/hdd
Command (m for help): p
Disk /dev/hdd: 16 heads, 63 sectors, 29805 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-29805, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-29805, default 29805): +25M
Command (m for help): p
Disk /dev/hdd: 16 heads, 63 sectors, 29805 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdd1 1 51 25672+ 83 Linux
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (52-29805, default 52):
Using default value 52
Last cylinder or +size or +sizeM or +sizeK (52-29805, default 29805): +300M
Command (m for help): p
Disk /dev/hdd: 16 heads, 63 sectors, 29805 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdd1 1 51 25672+ 83 Linux
/dev/hdd2 52 661 307440 83 Linux
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (662-29805, default 662):
Using default value 662
Last cylinder or +size or +sizeM or +sizeK (662-29805, default 29805):
Using default value 29805
Command (m for help): p
Disk /dev/hdd: 16 heads, 63 sectors, 29805 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdd1 1 51 25672+ 83 Linux
/dev/hdd2 52 661 307440 83 Linux
/dev/hdd3 662 29805 14688576 83 Linux
Command (m for help): a
Partition number (1-4): 1
Command (m for help): p
Disk /dev/hdd: 16 heads, 63 sectors, 29805 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdd1 * 1 51 25672+ 83 Linux
/dev/hdd2 52 661 307440 83 Linux
/dev/hdd3 662 29805 14688576 83 Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
3. Make a filesystem on the /boot partition of the secondary drive
and mount it.
> mke2fs /dev/hdd1
mke2fs 1.27 (8-Mar-2002)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
6432 inodes, 25672 blocks
1283 blocks (5.00%) reserved for the super user
First data block=1
4 block groups
8192 blocks per group, 8192 fragments per group
1608 inodes per group
Superblock backups stored on blocks:
8193, 24577
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 32 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
> mount /dev/hdd1 /mnt/drive2
4. Install grub on the /boot partition of the secondary drive.
> grub-install --root-directory=/mnt/drive2 /dev/hdd
Probing devices to guess BIOS drives. This may take a long time.
Installation finished. No error reported.
This is the contents of the device map /mnt/drive2/boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.
(fd0) /dev/fd0
(hd0) /dev/hda
(hd1) /dev/hdd
> cd /mnt/drive2
> ln -s ./boot/grub grub
5. Copy kernel files to /boot partition of the secondary drive.
> cp /boot/* /mnt/drive2/
> cp /boot/grub/grub.conf /mnt/drive2/grub
> cd /mnt/drive2/grub
> ln -s ./grub.conf ./menu.lst
> cd ..
> umount /dev/hdd1
6. Make the swap partition on the secondary drive.
> mkswap /dev/hdd2
Setting up swapspace version 1, size = 307436K
7. Setup the /etc/raidtab file in degraded mode:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdd3
raid-disk 0
device /dev/hda3
failed-disk 1
8. Build the array:
> mkraid /dev/md0
handling MD device /dev/md0
analyzing super-block
disk 0: /dev/hdd3, 14688576kB, raid superblock at 14688512kB
disk 1: /dev/hda3, failed
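The "raid superblock at" figure can be reproduced with a little arithmetic. This sketch assumes the old-style persistent superblock layout as I understand it: round the partition size down to a multiple of 64 KB, then step back another 64 KB. That matches every mkraid line in these notes, and the result is also the usable array size shown in /proc/mdstat. The function name is made up for this example.

```shell
# md_sb_offset_kb SIZE_KB   (hypothetical helper)
# Offset in KB of the RAID superblock for a partition of SIZE_KB kilobytes:
# size rounded down to a 64 KB boundary, minus one 64 KB block.
md_sb_offset_kb() {
    echo $(( $1 / 64 * 64 - 64 ))
}

# Example:
#   md_sb_offset_kb 14688576   # 14688512
#   md_sb_offset_kb 1048792    # 1048704
```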
9. Put ext3 filesystem on the array:
> mkfs -t ext3 /dev/md0
mke2fs 1.27 (8-Mar-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
1836928 inodes, 3672128 blocks
183606 blocks (5.00%) reserved for the super user
First data block=0
113 block groups
32768 blocks per group, 32768 fragments per group
16256 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 37 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
10. Copy contents of failed drive to array:
> mkdir /mnt/raid
> mount /dev/md0 /mnt/raid/
> cd /
> find . -xdev | cpio -pm /mnt/raid
11. Create the initrd images:
> mkinitrd --preload raid1 --with=raid1 initrd-2.4.18-3.img 2.4.18-3
> cp initrd-2.4.18-3.img /mnt/drive2/
> cp /boot/initrd-2.4.18-3.img /boot/initrd-2.4.18-3.img.bak
> cp initrd-2.4.18-3.img /boot
12. Set partition type on secondary drive:
> fdisk /dev/hdd
Command (m for help): t
Partition number (1-4): 3
Hex code (type L to list codes): fd
Changed system type of partition 3 to fd (Linux raid autodetect)
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
13. Edit /etc/fstab on the array (/mnt/raid/etc/fstab):
Change the root (/) line to use the absolute device name (/dev/md0) instead of a label reference.
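A sketch of that fstab edit as a filter, assuming the root entry is the line whose mount point field is exactly "/" (on this Red Hat install it would look like LABEL=/). The function name is made up here. Note that awk rebuilds the changed line with single spaces, so the column alignment of that one line is lost.

```shell
# fstab_root_to_device DEVICE < old-fstab > new-fstab   (hypothetical helper)
# Replaces the first field of the root (/) entry with DEVICE.
# Comment lines and all other entries pass through unchanged.
fstab_root_to_device() {
    awk -v dev="$1" '
        $1 !~ /^#/ && $2 == "/" { $1 = dev }
        { print }
    '
}

# Example:
#   fstab_root_to_device /dev/md0 < /mnt/raid/etc/fstab > /tmp/fstab.new
#   diff /mnt/raid/etc/fstab /tmp/fstab.new
```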
14. REBOOT
15. Edit /etc/raidtab and change failed drive to active.
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 4
persistent-superblock 1
device /dev/hdd3
raid-disk 0
device /dev/hda3
raid-disk 1
16. Change partition type of failed drive using fdisk.
> fdisk /dev/hda
The number of cylinders for this disk is set to 1870.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Disk /dev/hda: 255 heads, 63 sectors, 1870 cylinders
Units = cylinders of 16065 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 3 24066 83 Linux
/dev/hda2 4 41 305235 82 Linux swap
/dev/hda3 42 1870 14691442+ 83 Linux
Command (m for help): t
Partition number (1-4): 3
Hex code (type L to list codes): fd
Changed system type of partition 3 to fd (Linux raid autodetect)
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
17. REBOOT
18. Hot add the failed drive
> raidhotadd /dev/md0 /dev/hda3
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hda3[2] hdd3[0]
14688512 blocks [2/1] [U_]
[>....................] recovery = 0.3% (48704/14688512) finish=161.6min speed=1509K/sec
unused devices: <none>
> cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hda3[1] hdd3[0]
14688512 blocks [2/2] [UU]
unused devices: <none>
DONE !!!
Boot Considerations
-------------------
Current                               Future
-------                               ---------
/dev/hda1 -> grub boot partition  ->  /dev/hda1 or /dev/hdc1
/dev/hda2 -> root partition       ->  /dev/md0
/dev/hdc1 -> unused
/dev/hdc2 -> unused
/dev/md0 = /dev/hda2 + /dev/hdc2 (RAID 1 Mirror)
Issue 1: RAID is provided as modules but is needed before
the root filesystem is mounted.
mkinitrd --with=<module> <ramdisk name> <kernel>
mkinitrd --preload raid5 --with=raid5 raid-ramdisk 2.2.5-22
An initrd command must be setup in the grub bootloader for this
to work.
Issue 2: The BIOS is set to boot from /dev/hda1. How do we get
booted from /dev/hdc1 with the root set to /dev/md0?
Solution: The search order for bootable drives in the BIOS should be
set to /dev/hda and then /dev/hdc. This may be automatic.
Current GRUB Config
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda2
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Red Hat Linux (2.4.20-13.7)
root (hd0,0)
kernel /vmlinuz-2.4.20-13.7 ro root=/dev/hda2
initrd /initrd-2.4.20-13.7.img
New GRUB Config
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda2
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Red Hat Linux (2.4.20-13.7) Primary
root (hd0,0)
kernel /vmlinuz-2.4.20-13.7 ro root=/dev/md0
initrd /initrd-2.4.20-13.7.img
title Red Hat Linux (2.4.20-13.7) Backup
root (hd1,0)
kernel /vmlinuz-2.4.20-13.7 ro root=/dev/md0
initrd /initrd-2.4.20-13.7.img
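A quick way to confirm that every boot entry points at the array: a small sketch that lists the root= argument of each active kernel line. The function name is made up for this example, and the default path assumes the Red Hat location /boot/grub/grub.conf. Commented-out lines are skipped because their first field is "#".

```shell
# grub_kernel_roots [FILE]   (hypothetical helper)
# Prints the value of root= for every uncommented "kernel" line in FILE.
grub_kernel_roots() {
    awk '$1 == "kernel" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^root=/) print substr($i, 6)
    }' "${1:-/boot/grub/grub.conf}"
}

# Example: with the new config above, each entry should report /dev/md0.
#   grub_kernel_roots /boot/grub/grub.conf
```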
Setup Plan
----------
1. Configure the /etc/raidtab for a degraded raid 1 mirror using
/dev/hda2 and /dev/hdc2 with /dev/hdc2 as the active device and /dev/hda2
as a failed device.
2. Create new initrd with raid1 support. Reboot and verify that
new initrd is working.
3. Install modified grub on /dev/hdc1 and /dev/hda1.
4. Copy contents of hda2 to hdc2.
5. Reboot and verify that md0/hdc2 is working.
6. Hot add hda2 to the md0 array.
And that should do the trick!