Mdadm

Overview

Several physical disks (/dev/sdX) or partitions (/dev/sdX1) of equal size are joined into a single array.

Creating a RAID array

(Recommended) Create a partition on each disk. Note:
- Use optimal alignment, with "-a optimal" (this doesn't appear to have any obvious effect on behaviour though!)
- Use the "GPT" partition table format (to handle disks > 2TB)
- Name the partition "primary" (note that this is free text)
- Use 0% for partition start (this will normally mean that the partition start will be at the 1MB boundary, which gives optimal alignment)
- End 100MB before the end of the disk (this is to allow for slight variances in exact size of similar disks)
- Set the partition type to RAID (gdisk type code FD00, the GPT counterpart of MBR type 0xFD); this is optional, but may encourage some tools to avoid writing directly to the disk (and so avoid corrupting the array)

parted -a optimal -s /dev/sdX -- mklabel gpt mkpart primary 0% -100MB set 1 raid on

Create a RAID 5 array over 3 partitions:
- Note, the default metadata version is now 1.2 for create commands

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdX1 /dev/sdY1 /dev/sdZ1

Wait (potentially several days) for the array to be built
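Build progress is reported in /proc/mdstat. As a minimal sketch (the function name md_progress is made up for this example, and the layout of /proc/mdstat may differ slightly between kernel versions), the following shell function extracts just the progress figure from mdstat-style text:

```shell
# Print the progress of any running resync/recovery found in
# /proc/mdstat-style text on stdin, e.g. "recovery=8.5%".
# Prints nothing when no such operation is in progress.
md_progress() {
    grep -E -o '(resync|recovery|reshape|check) *= *[0-9.]+%' |
        sed -E 's/ *= */=/'
}
```

Typical use would be: md_progress < /proc/mdstat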

Once built, save the current raid setup to /etc, to allow for automounting on startup:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf
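Because ">>" appends unconditionally, running the command above a second time writes the same ARRAY line again. One way to avoid that is sketched below; append_new_lines is a hypothetical helper, not part of mdadm, and it only compares whole lines exactly:

```shell
# Append each line read from stdin to the file given as $1,
# skipping any line that is already present in that file verbatim.
append_new_lines() {
    conf=$1
    while IFS= read -r line; do
        # -x: match the whole line, -F: treat the line as a fixed string
        grep -q -x -F -- "$line" "$conf" 2>/dev/null ||
            printf '%s\n' "$line" >> "$conf"
    done
}
```

It would be used as: mdadm --detail --scan | append_new_lines /etc/mdadm/mdadm.conf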

Update the initial boot image for all current kernel versions to include the new mdadm.conf:

update-initramfs -u -k all

Start the array (only needed if it is not already running; mdadm --create starts a new array immediately):

mdadm --assemble /dev/md0 /dev/sdX1 /dev/sdY1 /dev/sdZ1

From this point, just treat the array (/dev/md0) as a normal physical disk.

Recovering from disk failure

Check the disk status in mdadm:

mdadm --detail /dev/md0
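For scripted checks, the overall array state can be read from the "State :" line of that output. A rough sketch, assuming the usual mdadm --detail layout; md_state and md_is_degraded are hypothetical helper names:

```shell
# Read `mdadm --detail` output on stdin and print the value of its
# "State :" line (for example "clean, degraded").
md_state() {
    sed -n -E 's/^[[:space:]]*State[[:space:]]*:[[:space:]]*//p' |
        sed -E 's/[[:space:]]+$//' |
        head -n 1
}

# Succeed (exit status 0) when the reported state includes "degraded".
md_is_degraded() {
    md_state | grep -q 'degraded'
}
```

For example: mdadm --detail /dev/md0 | md_is_degraded && echo "md0 is degraded"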

If the disk is already marked as failed, skip this step. Otherwise, mark it as failed manually:

mdadm /dev/md0 --fail /dev/sdX1

From this point, the array will continue to operate in "degraded" mode

Remove the failed disk:

mdadm /dev/md0 --remove /dev/sdX1

To more easily determine the disk for physical removal from the machine (once powered off), note down the serial number as reported by:

hdparm -i /dev/sdX | grep SerialNo
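To record only the serial number itself (for instance when labelling drive bays), the value can be cut out of that line. This is a small sketch that assumes hdparm prints the serial as "SerialNo=..." on its Model line; disk_serial is a name invented for the example:

```shell
# Read `hdparm -i` output on stdin and print the serial number only.
disk_serial() {
    sed -n -E 's/.*SerialNo=([^,[:space:]]+).*/\1/p' | head -n 1
}
```

For example: hdparm -i /dev/sdX | disk_serial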

Add a replacement disk (partitioned in the same way as the original array members):

mdadm /dev/md0 --add /dev/sdY1

Wait (potentially several days) for the array to be resynced

Recover from a dirty reboot of a degraded array

If the server shuts down uncleanly (e.g. due to a power cut) while the array is degraded, it will refuse to assemble the array automatically on startup (with a dmesg error of the form "cannot start dirty degraded array"). This is because the data may be in an inconsistent state. In this situation:

Check that the good disks have the same number of events. If the numbers differ slightly, that suggests that some of the data being written when the server shut down was not written fully and is probably corrupt (hopefully this will just mean a logfile with some bad characters, or similar).

mdadm --examine /dev/sdX1 /dev/sdY1 /dev/sdZ1 | grep Events
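The comparison can also be scripted. The following is a minimal sketch that assumes each member reports a line of the form "Events : N"; md_event_counts and md_events_match are hypothetical helper names:

```shell
# Read `mdadm --examine` output on stdin and print the distinct
# event counts it contains, one per line.
md_event_counts() {
    awk -F':' '/Events/ { gsub(/[ \t]/, "", $2); print $2 }' | sort -u
}

# Succeed (exit status 0) only when exactly one distinct event count
# is present, i.e. every examined member reports the same number.
md_events_match() {
    [ "$(md_event_counts | wc -l | tr -d ' ')" = "1" ]
}
```

For example: mdadm --examine /dev/sdX1 /dev/sdY1 /dev/sdZ1 | md_events_match && echo "event counts agree"

A failing result only says that the counts differ; how large a difference is tolerable remains a judgment call.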

Assuming the number of events is the same (or very similar), forcibly assemble the array.

mdadm --assemble --force /dev/md0 /dev/sdX1 /dev/sdY1 /dev/sdZ1

Useful Commands

cat /proc/mdstat: Display a summary of current raid status
mdadm --detail /dev/md0: Display raid information on array md0
mdadm --examine /dev/sdf: Display raid information on device/partition sdf
