====== Overview ======
RAID is short for Redundant Array of Inexpensive Disks - see Wikipedia for the individual RAID levels.

**IMPORTANT**: a RAID is no replacement for backups! Make sure to back up the data on the RAID regularly.

====== Setup ======
===== Raid Setup =====
The mdadm tool manages Linux software RAIDs.

<code bash>
 sudo apt-get install mdadm
</code>
Prepare the disks:

<code bash>
 sudo fdisk /dev/sda   # repeat for each disk (sdb, sdc, ...)
</code>
Create a primary partition and set its type to "Linux raid autodetect" (hex code fd). Do this for every disk you want to combine into the RAID.
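The fdisk session is interactive; a minimal keystroke sketch (assuming an empty disk) is: n (new partition), p (primary), partition number 1, accept the default start and end, then t (change type), fd (Linux raid autodetect), w (write and quit). Afterwards you can verify the result:

<code bash>
 sudo fdisk -l /dev/sda   # the new partition should show type "fd  Linux raid autodetect"
</code>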
Create a RAID level 1 device node md0 from two hard disks:

<code bash>
 mdadm --create --verbose /dev/md0 --level=1 --run --raid-devices=2 /dev/sda /dev/sdb
</code>
Format the new device as ext3:

<code bash>
 mkfs.ext3 /dev/md0
</code>
Write the RAID configuration to mdadm's config file:

<code bash>
 mdadm --detail --scan --verbose > /etc/mdadm/mdadm.conf
</code>
Add a mail contact to the config so that it finally looks like this:

<code>
ARRAY /dev/md0 level=raid6 num-devices=4 UUID=595ee5d4:d8fe61ac:e35eacf0:6e4b8477 devices=/dev/sda,/dev/sdb,/dev/sdc,/dev/sdd
MAILADDR mail@bla.org
</code>
Create a mount point and edit /etc/fstab so the new RAID can be mounted automatically:

<code>
/dev/md0      /mnt/raid      ext3    defaults    1 2
</code>
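Creating the mount point and testing the fstab entry could look like this (assuming the /mnt/raid path from above):

<code bash>
 sudo mkdir -p /mnt/raid
 sudo mount /mnt/raid   # uses the options from the /etc/fstab entry
</code>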
Make sure the RAID is mounted at boot by putting the following into /etc/rc.local:

<code bash>
 mdadm -As
 mount /mnt/raid
</code>
mdadm uses the RAID configuration provided in the /etc/mdadm/mdadm.conf we created before.
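
After a reboot you can check that the array and the mount came up (assuming /dev/md0 and /mnt/raid as above):

<code bash>
 cat /proc/mdstat        # lists the active arrays
 mount | grep /mnt/raid  # shows whether the RAID is mounted
</code>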

====== Troubleshooting ======
===== Device or Resource Busy =====
When trying to create a RAID array on Ubuntu Karmic (9.10) you might get an error saying "Device or resource busy".

The culprit might be the dm-raid driver having taken control of the RAID devices.

<code bash>
 sudo apt-get remove dmraid libdmraid<version>
</code>
Removing these packages generates a new initrd without the dm-raid driver.

Reboot afterwards and try mdadm --create again.

===== Problems when assembling =====
If you get error messages when assembling the RAID with //mdadm -As//, check the config in **/etc/mdadm/mdadm.conf**. Try manually assembling the RAID using something like

<code bash>
 sudo mdadm --assemble /dev/md0 /dev/sda /dev/sdb
</code>
If this works, it is most likely that the UUID in mdadm.conf is wrong. To find the correct UUID, manually assemble the RAID (see above), then use

<code bash>
 sudo mdadm --detail /dev/md0
</code>
to display the details. Copy the UUID into mdadm.conf.
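
A quick way to pull out just the UUID line from the detail output (assuming the array is /dev/md0):

<code bash>
 sudo mdadm --detail /dev/md0 | grep UUID
</code>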

====== Restoring a RAID array ======
**IMPORTANT:** DO NOT USE mdadm --create on an existing array. Use //--assemble// (see below).

If you have an existing (mdadm) RAID array, you can tell mdadm to automatically find and use it:

<code bash>
 sudo mdadm --assemble --scan   # scanning tries to guess which partitions are to be assembled
</code>
Or you may explicitly choose the partitions to use:

<code bash>
 sudo mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
</code>
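Afterwards a quick look at /proc/mdstat shows whether the array actually came up:

<code bash>
 cat /proc/mdstat
</code>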
====== Usage ======
===== Raid monitoring =====
Installing mdadm activates a monitoring daemon which is started at boot. To see whether it is running, do

<code bash>
 ps ax | grep monitor
</code>
You should see something like

<code>
 5785 ?        Ss     0:00 /sbin/mdadm --monitor --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog
</code>
If you add a mail address to mdadm.conf, the daemon will send warning mails in case of RAID failures.
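
To check that mail delivery works, you can ask the monitor to send a test message for each array and exit:

<code bash>
 sudo mdadm --monitor --scan --oneshot --test
</code>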

===== Access via smb =====
Install the Samba server

<code bash>
 sudo apt-get install samba
</code>

Edit /etc/samba/smb.conf to make the shares accessible.

<code>
[DATA]
path = /mnt/raid/bla/
browseable = yes
read only = no
guest ok = no
create mask = 0644
directory mask = 0755
force user = rorschach
</code>
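After editing, testparm can check the smb.conf for syntax errors:

<code bash>
 testparm
</code>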
Create the users who should be allowed to access the shares and give them passwords.

<code bash>
 sudo useradd -s /bin/true rorschach   # Linux user who may not log in to the system
 sudo smbpasswd -L -a rorschach        # add the Samba user
 sudo smbpasswd -L -e rorschach        # enable the Samba user
</code>
====== Failures ======
===== RAID Health =====
<code bash>
 mdadm --detail /dev/md0
</code>
shows, for a healthy RAID:

<code>
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 09:46:39 2008
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.15

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
</code>
===== Simulated failure =====
<code bash>
 mdadm --manage --set-faulty /dev/md0 /dev/sda
</code>
to set one disk as faulty. It says

<code>
 mdadm: set /dev/sda faulty in /dev/md0
</code>
Check the syslog to see what happens

<code bash>
 tail -f /var/log/syslog
</code>
The event has been detected and a mail has been sent to the admin.

<code>
Apr 18 10:17:39 INES kernel: [77650.308834]  --- rd:4 wd:3
Apr 18 10:17:39 INES kernel: [77650.308836]  disk 1, o:1, dev:sdb
Apr 18 10:17:39 INES kernel: [77650.308839]  disk 2, o:1, dev:sdc
Apr 18 10:17:39 INES kernel: [77650.308841]  disk 3, o:1, dev:sdd
*Apr 18 10:17:39 INES mdadm: Fail event detected on md device /dev/md0, component device /dev/sda*
Apr 18 10:17:39 INES postfix/pickup[30816]: 86B902CA824F: uid=0 from=
Apr 18 10:17:39 INES postfix/cleanup[32040]: 86B902CA824F: message-id=<20080418081739.86B902CA824F@INES.arfcd.com>
Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: from=, size=861, nrcpt=1 (queue active)
*Apr 18 10:17:39 INES postfix/smtp[32042]: 86B902CA824F: to=, relay=s0ms2.arc.local[172.24.10.6]:25, delay=0.46, delays=0.22/0.04/0.1/0.1, dsn=2.6.0, status=sent*
Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: removed
Apr 18 10:18:39 INES mdadm: SpareActive event detected on md device /dev/md0, component device /dev/sda
</code>
Now the RAID details look like this:

<code>
 sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 10:19:10 2008
         *State : clean, degraded*
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.20

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd

      *4       8        0        -      faulty spare   /dev/sda*
</code>
===== "Exchange" disks =====
Remove the old disk from the RAID:

<code bash>
 mdadm /dev/md0 -r /dev/sda
</code>
Add the new disk to the RAID:

<code bash>
 mdadm /dev/md0 -a /dev/sda
</code>
Now you should see a recovery

<code>
 sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 10:25:41 2008
         *State : clean, degraded, recovering*
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

     Chunk Size : 256K

*Rebuild Status : 1% complete*

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.62

    Number   Major   Minor   RaidDevice State
      *4       8        0        0      spare rebuilding   /dev/sda*
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
</code>
and

<code bash>
 cat /proc/mdstat
</code>
<code>
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sda[4] sdd[3] sdc[2] sdb[1]
      781422592 blocks level 6, 256k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery =  3.9% (15251968/390711296) finish=105.1min speed=59486K/sec

unused devices:
</code>
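To follow the rebuild progress live, you can repeat that command with watch:

<code bash>
 watch -n 5 cat /proc/mdstat
</code>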
===== Real failure =====
To recover data from a RAID1, you can try to mount one of the disks as a separate disk:

<code bash>
 sudo mount -t ext3 /dev/<the device> <mountpoint>   # you NEED to specify the filesystem type manually!
</code>
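A concrete sketch, assuming the surviving member is /dev/sdb1 and /mnt/recovery is used as the mount point (mounting read-only is safer while recovering data):

<code bash>
 sudo mkdir -p /mnt/recovery
 sudo mount -t ext3 -o ro /dev/sdb1 /mnt/recovery
</code>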
====== Benchmarking ======
<code bash>
 sudo tiobench --size 66000 --threads 1 --threads 8
</code>
to test read and write performance with 1 and 8 threads.
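
A quick and dirty sequential read check can also be done with dd (assuming the array device is /dev/md0; this reads 1 GiB and prints the throughput):

<code bash>
 sudo dd if=/dev/md0 of=/dev/null bs=1M count=1024
</code>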
  