raid
Differences
This shows you the differences between two versions of the page.
raid [2009/09/30 05:09] – 172.26.0.166 | raid [2009/11/16 14:23] – 172.26.0.166 | ||
---|---|---|---|
Line 5: | Line 5: | ||
==== Drive numbering ==== | ==== Drive numbering ==== | ||
- | If you're looking at the front of the HPC you'll see four rows of drives. | + | If you're looking at the front of the HPC you'll see four rows of drives. |
* Rows 0 - 2 are SATA, connected to the hardware 3ware RAID card | * Rows 0 - 2 are SATA, connected to the hardware 3ware RAID card | ||
  * Row 3 is IDE | * Row 3 is IDE | ||
Line 50: | Line 50: | ||
| | ||
unused devices: < | unused devices: < | ||
+ | |||
+ | ==== Repair RAID ==== | ||
+ | When a disk is failing you might see errors in the system logs from smartd like this: | ||
+ | < | ||
+ | In that case you need to replace the drive. | ||
+ | < | ||
+ | Personalities : [raid1] [raid0] | ||
+ | md3 : active raid1 hdd1[1] hda1[0] | ||
+ | 200704 blocks [2/2] [UU] | ||
+ | | ||
+ | md1 : active raid1 hdd3[1] hda3[0] | ||
+ | 26627648 blocks [2/2] [UU] | ||
+ | | ||
+ | md2 : active raid0 hdd5[1] hda5[0] | ||
+ | 36868608 blocks 256k chunks | ||
+ | | ||
+ | md4 : active raid1 hdd6[1] hda6[0] | ||
+ | 2168640 blocks [2/2] [UU] | ||
+ | | ||
+ | md0 : active raid1 hdd2[1] hda2[0] | ||
+ | 30716160 blocks [2/2] [UU] | ||
+ | | ||
+ | unused devices: < | ||
+ | |||
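As a rough aid for reading the status listing above, the sketch below prints the names of arrays whose member-status field (for example `[U_]`) shows a missing member. The function name `degraded_arrays` is hypothetical, and the sketch assumes the input follows the `/proc/mdstat` layout shown here.

```shell
# Hypothetical helper: list md arrays that have a missing member.
# Reads mdstat-formatted text on stdin (layout assumed to match the listing above).
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md1 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status field looks like [UU]; an underscore marks a missing member
        /\[[U_]+\]/ {
            if ($NF ~ /_/) print name
        }
    '
}
```

On a live system it could be run as `degraded_arrays < /proc/mdstat`; no output means every listed array reports all members present.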
+ | Because it is ''/ | ||
+ | < | ||
+ | # mdadm /dev/md1 --fail /dev/hda3 --remove /dev/hda3 | ||
+ | # mdadm /dev/md3 --fail /dev/hda1 --remove /dev/hda1 | ||
+ | # mdadm /dev/md4 --fail /dev/hda6 --remove / | ||
+ | |||
+ | Shut the server down and replace the faulty drive with a new one. After booting, the drive letters may have shifted, so verify which device is which before proceeding. | ||
+ | Clone the partition table from the good drive to the new one: | ||
+ | < | ||
+ | Verify the new partitions can be seen: | ||
+ | < | ||
+ | # partprobe -s | ||
+ | /dev/hda: msdos partitions 1 2 3 4 <5 6> | ||
+ | /dev/hdd: msdos partitions 1 2 3 4 <5 6> | ||
+ | /dev/sda: msdos partitions 1 | ||
+ | /dev/sdb: msdos partitions 1 | ||
+ | /dev/sdc: msdos partitions 1 | ||
+ | </ | ||
+ | You can now add the new partitions back to the arrays: | ||
+ | < | ||
+ | # mdadm /dev/md1 --add /dev/hdd3 | ||
+ | # mdadm /dev/md3 --add /dev/hdd1 | ||
+ | # mdadm /dev/md4 --add / | ||
+ | |||
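Once partitions have been added back, the kernel rebuilds the mirrors and reports progress in `/proc/mdstat`. The sketch below pulls out the percentage for each array that is currently rebuilding. The function name `rebuild_progress` is hypothetical, and the sketch assumes the usual `recovery = NN.N%` progress line.

```shell
# Hypothetical helper: print "<array> <percent>" for arrays that are rebuilding.
# Reads mdstat-formatted text on stdin; assumes a "recovery = 12.6% ..." progress line.
rebuild_progress() {
    awk '
        # Remember the array name from lines such as "md1 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # Progress lines mention recovery or resync and carry a field ending in %
        /recovery|resync/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) {
                    sub(/^=/, "", $i)   # tolerate "=100.0%" printed without a space
                    print name, $i
                }
            }
        }
    '
}
```

On a live system it could be run as `rebuild_progress < /proc/mdstat`, repeated until it prints nothing and the arrays show `[UU]` again.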
+ | To clear any previous RAID metadata from a disk (e.g. when reusing a disk from a decommissioned RAID array): | ||
+ | |||
+ | # mdadm --zero-superblock /dev/hdc1 | ||
+ | To add a disk to an array: | ||
+ | |||
+ | # mdadm --add /dev/md0 /dev/hdc1 | ||
+ | |||
=== To Do list: === | === To Do list: === | ||
Line 65: | Line 120: | ||
===== Hardware RAID ===== | ===== Hardware RAID ===== | ||
- | There is a utility, tw_cli, which can be used to control | + | The HPC has a 3ware 9500S SATA RAID card, driven by the 3w-9xxx kernel module. | ||
- | | 8 | 9 | 10 | 11 | | + | ==== Physical Disk Layout ==== |
- | | 4 | 5 | 6 | 7 | | + | |
- | | 0 | 1 | 2 | 3 | | + | We have one RAID controller, ' |
+ | |||
+ | | Port 8 | Port 9 | Port 10 | Port 11 | | ||
+ | | Port 4 | Port 5 | Port 6 | Port 7 | | ||
+ | | Port 0 | Port 1 | Port 2 | Port 3 | | ||
+ | |||
+ | ==== Repairing ' | ||
+ | |||
+ | There is a utility, tw_cli, which can be used to control/ | ||
Study the output of '' | Study the output of '' | ||
* Which controller is active? (c0, c1, etc) | * Which controller is active? (c0, c1, etc) | ||
* Which unit is degraded? (u0, u1, u2, etc) | * Which unit is degraded? (u0, u1, u2, etc) | ||
- | * Which | + | * Which port is inactive or missing? (p1, p5, etc) |
+ | |||
+ | <note warning> | ||
Remove the faulty port: | Remove the faulty port: |
raid.txt · Last modified: 2010/09/19 23:58 by aorth