====== Overview ======
RAID is short for Redundant Array of Inexpensive Disks - see Wikipedia for the individual RAID levels.

**IMPORTANT**: a RAID is no replacement for backups! So: make sure to back up the data on the RAID regularly.

====== Setup ======
===== Raid Setup =====
The mdadm tool handles Linux software RAIDs.

<code bash>
sudo apt-get install mdadm
</code>
Prepare the disks:

<code bash>
fdisk /dev/sd[abcd]
</code>
Create a primary partition and set its type to Linux raid autodetect (hex code: fd). Do this for all disks you want to combine into the raid.
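A minimal interactive fdisk session for one disk could look like this (keystrokes only; assuming a blank disk):

<code>
n    # new partition
p    # primary
1    # partition number 1
     # accept the defaults for first and last cylinder (use the whole disk)
t    # change the partition type
fd   # Linux raid autodetect
w    # write the partition table and quit
</code>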
+ | |||
+ | Create a raid level 1 device node md0 with 2 hard discs: | ||
+ | |||
+ | <code bash> | ||
+ | mdadm --create --verbose /dev/md0 --level=1 --run --raid-devices=2 /dev/sda /dev/sdb | ||
+ | </code> | ||
Format the new device as ext3:

<code bash>
mkfs.ext3 /dev/md0
</code>
Write the raid configuration to mdadm's config file:

<code bash>
mdadm --detail --scan --verbose > /etc/mdadm/mdadm.conf
</code>
You should add a mail contact to the config so that it finally looks like this (MAILADDR goes on its own line):

<code>
ARRAY /dev/md0 level=raid6 num-devices=4 UUID=595ee5d4:d8fe61ac:e35eacf0:6e4b8477 devices=/dev/sda,/dev/sdb,/dev/sdc,/dev/sdd
MAILADDR mail@bla.org
</code>
Create a mountpoint and edit /etc/fstab so the new raid can be mounted automatically:

<code>
/dev/md0 /mnt/raid ext3 defaults 1 2
</code>
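Create the mountpoint and check that the fstab entry works (assuming the /mnt/raid path from the line above):

<code bash>
sudo mkdir -p /mnt/raid   # create the mountpoint
sudo mount /mnt/raid      # mount via the new fstab entry
</code>
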
Make sure the raid is mounted at boot. Put into /etc/rc.local:

<code bash>
mdadm -As
mount /mnt/raid
</code>
mdadm uses the raid configuration provided in the /etc/mdadm/mdadm.conf we created before.
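
After the next reboot you can quickly verify that the array came up:

<code bash>
cat /proc/mdstat   # md0 should be listed as active
</code>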
+ | |||
+ | ====== Troubleshooting ====== | ||
+ | ===== Device or Resource Busy ===== | ||
+ | When trying to create a RAID array on Ubuntu Karmic (9.10) you might get an error saying "Device or resource busy". | ||
+ | |||
+ | The culprit might be the dm-raid driver having taken control of the RAID devices. | ||
+ | |||
Removing the packages

<code bash>
sudo apt-get remove dmraid libdmraid<version>
</code>
generates a new initrd without the dm-raid driver.

Just reboot afterwards, and try mdadm --create again.

===== Problems when assembling =====
If you get error messages when assembling the raid with //mdadm -As//, check the config in **/etc/mdadm/mdadm.conf**. Try manually assembling the RAID using something like

<code bash>
mdadm --assemble /dev/md0 /dev/sda /dev/sdb
</code>
If this works, then it is most likely that the UUID in mdadm.conf is wrong. To find the correct UUID, manually assemble the raid (see above), then use

<code bash>
sudo mdadm --detail /dev/md0
</code>
to display the details. Copy the UUID to mdadm.conf.
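Alternatively, mdadm can print ready-made ARRAY lines (including the UUID) directly from the superblocks on the disks:

<code bash>
sudo mdadm --examine --scan   # prints ARRAY lines with the UUIDs found on the component devices
</code>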
+ | |||
+ | ====== Restoring a RAID array ====== | ||
+ | **IMPORTANT:** DO NOT USE mdadm --create on an existing array. Use //--assemble// (see below). | ||
+ | |||
+ | If you have an existing (mdadm) RAID array, you can tell mdadm to automatically find and use it: | ||
+ | |||
+ | <code>#!highlight bash | ||
+ | sudo mdadm --assemble --scan # scanning tries to guess which partitions are to be assembled | ||
+ | </code> | ||
+ | Or you may explicitly choose the partitions to use: | ||
+ | |||
+ | <code>#!highlight bash | ||
+ | sudo mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 | ||
+ | </code> | ||
====== Usage ======
===== Raid monitoring =====
Installing mdadm activates a monitoring daemon which is started at boot. To see if it's running, do

<code bash>
ps ax | grep monitor
</code>
You should see something like

<code>
 5785 ?        Ss     0:00 /sbin/mdadm --monitor --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog
</code>
If you add a mail address to mdadm.conf, warning mails will be sent by the daemon in case of raid failures.

===== Access via smb =====
Install the Samba server:

<code bash>
sudo apt-get install samba
</code>

Edit /etc/samba/smb.conf to make the shares accessible:

<code>
[DATA]
path = /mnt/raid/bla/
browseable = yes
read only = no
guest ok = no
create mask = 0644
directory mask = 0755
force user = rorschach
</code>
Create the users who should be allowed to access the shares and give them passwords:

<code bash>
sudo useradd -s /bin/true rorschach   # linux user who may not log in to the system
sudo smbpasswd -L -a rorschach        # add samba user
sudo smbpasswd -L -e rorschach        # enable samba user
</code>
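To check that the share works end to end, list the shares as the new user (smbclient is an assumption here; on Ubuntu it comes in the smbclient package):

<code bash>
smbclient -L localhost -U rorschach   # should list the DATA share after entering the samba password
</code>
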
====== Failures ======
===== RAID Health =====
<code bash>
mdadm --detail /dev/md0
</code>
shows for a healthy raid

<code>
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 09:46:39 2008
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.15

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
</code>
===== Simulated failure =====
<code bash>
mdadm --manage --set-faulty /dev/md0 /dev/sda
</code>
to set one disc as faulty. It says

<code>
mdadm: set /dev/sda faulty in /dev/md0
</code>
Check the syslog to see what happens

<code bash>
tail -f /var/log/syslog
</code>
The event has been detected and a mail has been sent to the admin.

<code>
Apr 18 10:17:39 INES kernel: [77650.308834]  --- rd:4 wd:3
Apr 18 10:17:39 INES kernel: [77650.308836]  disk 1, o:1, dev:sdb
Apr 18 10:17:39 INES kernel: [77650.308839]  disk 2, o:1, dev:sdc
Apr 18 10:17:39 INES kernel: [77650.308841]  disk 3, o:1, dev:sdd
*Apr 18 10:17:39 INES mdadm: Fail event detected on md device /dev/md0, component device /dev/sda*
Apr 18 10:17:39 INES postfix/pickup[30816]: 86B902CA824F: uid=0 from=
Apr 18 10:17:39 INES postfix/cleanup[32040]: 86B902CA824F: message-id=<20080418081739.86B902CA824F@INES.arfcd.com>
Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: from=, size=861, nrcpt=1 (queue active)
*Apr 18 10:17:39 INES postfix/smtp[32042]: 86B902CA824F: to=, relay=s0ms2.arc.local[172.24.10.6]:25, delay=0.46, delays=0.22/0.04/0.1/0.1, dsn=2.6.0, status=sent*
Apr 18 10:17:39 INES postfix/qmgr[14269]: 86B902CA824F: removed
Apr 18 10:18:39 INES mdadm: SpareActive event detected on md device /dev/md0, component device /dev/sda
</code>
Now the raid details look like this

<code>
sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 10:19:10 2008
         *State : clean, degraded*
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.20

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
      *4       8        0        -      faulty spare   /dev/sda*
</code>
===== "Exchange" disks =====
Remove the old disk from the raid:

<code bash>
mdadm /dev/md0 -r /dev/sda
</code>
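If the disk you want to replace is not already marked faulty (as in the simulation above), fail it first; -f is the short form of --set-faulty:

<code bash>
mdadm /dev/md0 -f /dev/sda   # mark the disk faulty so it can be removed
</code>
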
Add the new disk to the raid:

<code bash>
mdadm /dev/md0 -a /dev/sda
</code>
Now you should see a recovery

<code>
sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Apr 17 11:21:06 2008
     Raid Level : raid6
     Array Size : 781422592 (745.22 GiB 800.18 GB)
  Used Dev Size : 390711296 (372.61 GiB 400.09 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Apr 18 10:25:41 2008
         *State : clean, degraded, recovering*
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

     Chunk Size : 256K

*Rebuild Status : 1% complete*

           UUID : 595ee5d4:d8fe61ac:e35eacf0:6e4b8477
         Events : 0.62

    Number   Major   Minor   RaidDevice State
      *4       8        0        0      spare rebuilding   /dev/sda*
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
</code>
and

<code bash>
cat /proc/mdstat
</code>
<code>
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sda[4] sdd[3] sdc[2] sdb[1]
      781422592 blocks level 6, 256k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery =  3.9% (15251968/390711296) finish=105.1min speed=59486K/sec

unused devices: <none>
</code>
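To follow the rebuild without retyping, you can watch the status:

<code bash>
watch -n 5 cat /proc/mdstat   # refreshes the rebuild progress every 5 seconds
</code>
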
===== Real failure =====
To recover data from a RAID1, you can try to mount one of the disks as a separate disk:

<code bash>
sudo mount -t ext3 /dev/<the device> <mountpoint>   # you NEED to specify the filesystem type manually!
</code>
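Alternatively, you can try to assemble the array in degraded mode from the surviving disk (device name is just an example):

<code bash>
sudo mdadm --assemble --run /dev/md0 /dev/sdb1   # --run starts the array even though it is degraded
sudo mount /dev/md0 <mountpoint>
</code>
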
====== Benchmarking ======
<code bash>
sudo tiobench --size 66000 --threads 1 --threads 8
</code>
to test read and write performance with 1 and 8 threads.
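
For a quick raw read check of the array, hdparm also works (a simpler alternative to tiobench; hdparm must be installed):

<code bash>
sudo hdparm -t /dev/md0   # buffered sequential read timing on the raid device
</code>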