====== Overview ======

RAID is short for Redundant Array of Inexpensive (or Independent) Disks - see Wikipedia for an overview of the different RAID levels.

**IMPORTANT**: a RAID is no replacement for backups! So: make sure to back up the data on the RAID regularly.

====== Setup ======

===== Raid Setup =====

The mdadm tool handles Linux software RAIDs. Install it with:

#!highlight bash
sudo apt-get install mdadm

Prepare the disks:

#!highlight bash
fdisk /dev/sd[abcd]

Create a primary partition and set its type to "Linux raid autodetect" (hex code: fd). Do this for every disk you want to combine into the raid.

Create a raid level 1 device node md0 from 2 hard disks:

#!highlight bash
mdadm --create --verbose /dev/md0 --level=1 --run --raid-devices=2 /dev/sda /dev/sdb

(If you created partitions in the previous step, use the partition devices, e.g. /dev/sda1 and /dev/sdb1, instead of the whole disks.)

Format the new device as ext3:

#!highlight bash
mkfs.ext3 /dev/md0

Write the raid configuration to mdadm's config file:

#!highlight bash
mdadm --detail --scan --verbose > /etc/mdadm/mdadm.conf

You should add a mail contact to the config so that it finally looks like

ARRAY /dev/md0 level=raid6 num-devices=4 UUID=595ee5d4:d8fe61ac:e35eacf0:6e4b8477
   devices=/dev/sda,/dev/sdb,/dev/sdc,/dev/sdd
MAILADDR mail@bla.org

Create a mountpoint and edit /etc/fstab so the new raid is mounted automatically:

/dev/md0  /mnt/raid  ext3  defaults  1  2

Make sure the raid is assembled and mounted at boot. Put into /etc/rc.local:

#!highlight bash
mdadm -As
mount /mnt/raid

mdadm -As assembles the arrays using the raid configuration in the /etc/mdadm/mdadm.conf we created before.

====== Troubleshooting ======

===== Device or Resource Busy =====

When trying to create a RAID array on Ubuntu Karmic (9.10) you might get an error saying "Device or resource busy". The culprit might be the dm-raid driver, which has taken control of the RAID devices. Removing it with

#!highlight bash
sudo apt-get remove dmraid libdmraid

generates a new initrd without the dm-raid driver. Just reboot afterwards and try mdadm --create again.

===== Problems when assembling =====

If you get error messages when assembling the raid with //mdadm -As//, check the config in **/etc/mdadm/mdadm.conf**. Try manually assembling the RAID using something like

#!highlight bash
mdadm --assemble /dev/md0 /dev/sda /dev/sdb

If this works, it is most likely that the UUID in mdadm.conf is wrong. To find the correct UUID, manually assemble the raid (see above), then use

#!highlight bash
sudo mdadm --detail /dev/md0

to display the details. Copy the UUID into mdadm.conf.

====== Restoring a RAID array ======

**IMPORTANT:** DO NOT USE mdadm --create on an existing array. Use //--assemble// (see below).

If you have an existing (mdadm) RAID array, you can tell mdadm to automatically find and use it:

#!highlight bash
sudo mdadm --assemble --scan   # scanning tries to guess which partitions are to be assembled

Or you may explicitly choose the partitions to use:

#!highlight bash
sudo mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1

====== Usage ======

===== Raid monitoring =====

Installing mdadm activates a monitoring daemon which is started at boot. To see if it's running do

#!highlight bash
ps ax | grep monitor

You should see something like

5785 ?  Ss  0:00 /sbin/mdadm --monitor --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog

If you added a mail address to mdadm.conf, the daemon will send warning mails in case of raid failures.

===== Access via smb =====

Install the Samba server:

#!highlight bash
sudo apt-get install samba

Edit /etc/samba/smb.conf to make the shares accessible:

[DATA]
path = /mnt/raid/bla/
browseable = yes
read only = no
guest ok = no
create mask = 0644
directory mask = 0755
force user = rorschach
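After editing smb.conf it can help to check the configuration for syntax errors and restart the Samba daemon so the new share becomes available. A minimal sketch; the service name and init script differ between releases, so treat the restart command as an assumption about your setup:

#!highlight bash
testparm -s /etc/samba/smb.conf   # parse the config and report errors without prompting
sudo /etc/init.d/samba restart    # on newer releases the service may be called smbd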
Create the users who should be allowed to access the shares and give them passwords:

#!highlight bash
sudo useradd -s /bin/true rorschach   # linux user who may not log in to the system
sudo smbpasswd -L -a rorschach        # add samba user
sudo smbpasswd -L -e rorschach        # enable samba user

====== Failures ======

===== RAID Health =====

#!highlight bash
mdadm --detail /dev/md0

shows for a healthy raid

/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 09:46:39 2008
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.15

    Number   Major   Minor   RaidDevice   State
       0       8        0        0        active sync   /dev/sda
       1       8       16        1        active sync   /dev/sdb
       2       8       32        2        active sync   /dev/sdc
       3       8       48        3        active sync   /dev/sdd

===== Simulated failure =====

#!highlight bash
mdadm --manage --set-faulty /dev/md0 /dev/sda

marks one disk as faulty. mdadm reports

mdadm: set /dev/sda faulty in /dev/md0

Check the syslog to see what happens:

#!highlight bash
tail -f /var/log/syslog

The event has been detected and a mail has been sent to the admin:

Apr 18 10:17:39 INES kernel: [77650.308834]  --- rd:4 wd:3
Apr 18 10:17:39 INES kernel: [77650.308836]  disk 1, o:1, dev:sdb
Apr 18 10:17:39 INES kernel: [77650.308839]  disk 2, o:1, dev:sdc
Apr 18 10:17:39 INES kernel: [77650.308841]  disk 3, o:1, dev:sdd
*Apr 18 10:17:39 INES mdadm: Fail event detected on md device /dev/md0, component device /dev/sda*
Apr 18 10:17:39 INES postfix/pickup[30816]: 86B902CA824F: uid=0 from=
Apr 18 10:17:39 INES postfix/cleanup[32040]: 86B902CA824F: message-id=<20080418081739.86B902CA824F@INES.arfcd.com>
Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: from=, size=861, nrcpt=1 (queue active)
*Apr 18 10:17:39 INES postfix/smtp[32042]: 86B902CA824F: to=, relay=s0ms2.arc.local[172.24.10.6]:25, delay=0.46, delays=0.22/0.04/0.1/0.1, dsn=2.6.0, status=sent*
Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: removed
Apr 18 10:18:39 INES mdadm: SpareActive event detected on md device /dev/md0, component device /dev/sda

Now the raid details look like this:

#!highlight bash
sudo mdadm --detail /dev/md0

/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 10:19:10 2008
          *State : clean, degraded*
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.20

    Number   Major   Minor   RaidDevice   State
       0       0        0        0        removed
       1       8       16        1        active sync   /dev/sdb
       2       8       32        2        active sync   /dev/sdc
       3       8       48        3        active sync   /dev/sdd

      *4       8        0        -        faulty spare   /dev/sda*

===== "Exchange" disks =====

Remove the old disk from the raid:

#!highlight bash
mdadm /dev/md0 -r /dev/sda

Add the new disk to the raid:

#!highlight bash
mdadm /dev/md0 -a /dev/sda

Now you should see a recovery:

#!highlight bash
sudo mdadm --detail /dev/md0

/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 10:25:41 2008
          *State : clean, degraded, recovering*
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

     Chunk Size : 256K

 *Rebuild Status : 1% complete*

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.62

    Number   Major   Minor   RaidDevice   State
      *4       8        0        0        spare rebuilding   /dev/sda*
       1       8       16        1        active sync   /dev/sdb
       2       8       32        2        active sync   /dev/sdc
       3       8       48        3        active sync   /dev/sdd

and

#!highlight bash
cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sda[4] sdd[3] sdc[2] sdb[1]
      781422592 blocks level 6, 256k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery =  3.9% (15251968/390711296) finish=105.1min speed=59486K/sec

unused devices: <none>

===== Real failure =====

To recover data from a RAID1, you can try to mount one of the member disks on its own:

#!highlight bash
sudo mount -t ext3 /dev/   # you NEED to specify the filesystem type manually!

====== Benchmarking ======

#!highlight bash
sudo tiobench --size 66000 --threads 1 --threads 8

tests read and write performance with 1 and 8 threads.
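As a quick sanity check alongside tiobench, hdparm can report the raw sequential read speed of the array device itself. This is only a rough sketch and assumes hdparm is installed; it measures device reads, not filesystem performance:

#!highlight bash
sudo hdparm -tT /dev/md0   # -T: cached reads, -t: buffered reads from the device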