===== RAID =====
We have two RAIDs on the HPC:
  * Linux kernel software RAID
  * 3ware hardware RAID
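
For a quick look at the health of each, you can check the software RAID through ''/proc/mdstat'' and the hardware RAID through ''tw_cli'' (both are covered in detail below).  A minimal sketch, assuming ''tw_cli'' is in the PATH and the controller is ''c1'' as described in the hardware RAID section:

<code># cat /proc/mdstat
# tw_cli /c1 show</code>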

==== Drive numbering ====

If you're looking at the front of the HPC you'll see four rows of drives.  From the bottom:
  * Rows 0 - 2 are SATA, connected to the hardware 3ware RAID card
  * Row 3 is IDE

===== Software RAID =====
The Linux kernel has the ''md'' (multiple devices) driver for software RAID devices.  There are currently two 80 GB IDE hard drives connected to the server, ''/dev/hda'' and ''/dev/hdd''.  These were set up as five RAID devices during the install of Rocks/CentOS.

Here is information on their configuration:

<code># mount | grep md
/dev/md0 on / type ext3 (rw)
/dev/md3 on /boot type ext3 (rw)
/dev/md2 on /scratch type ext3 (rw)
/dev/md1 on /export type ext3 (rw)
# df -h | grep md
/dev/md0               29G   11G   17G  39% /
/dev/md3              190M   60M  121M  34% /boot
/dev/md2               35G  177M   33G   1% /scratch
/dev/md1               25G  5.5G   18G  24% /export</code>

It should be noted that ''/dev/md4'' is being used as swap:
<code># swapon -s
Filename                                Type            Size    Used    Priority
/dev/md4                                partition       2168632 0       -1</code>

A snapshot of the software RAID's health:

<code># cat /proc/mdstat
Personalities : [raid1] [raid0]
md3 : active raid1 hdd1[1] hda1[0]
      200704 blocks [2/2] [UU]
      
md1 : active raid1 hdd3[1] hda3[0]
      26627648 blocks [2/2] [UU]
      
md2 : active raid0 hdd5[1] hda5[0]
      36868608 blocks 256k chunks
      
md4 : active raid1 hdd6[1] hda6[0]
      2168640 blocks [2/2] [UU]
      
md0 : active raid1 hdd2[1] hda2[0]
      30716160 blocks [2/2] [UU]
      
unused devices: <none></code>
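
For more detail on a single array (its state, member disks, and any failed devices), ''mdadm'' can query it directly.  A minimal sketch, using ''/dev/md0'' from the listing above:

<code># mdadm --detail /dev/md0</code>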
==== Repair RAID ====
When a disk is failing, you need to replace the drive.  First, look at the RAID configuration to see which partitions are in use by which arrays.  For example:
<code># cat /proc/mdstat
Personalities : [raid1] [raid0]
md3 : active raid1 hdd1[1] hda1[0]
      200704 blocks [2/2] [UU]
      
md1 : active raid1 hdd3[1] hda3[0]
      26627648 blocks [2/2] [UU]
      
md2 : active raid0 hdd5[1] hda5[0]
      36868608 blocks 256k chunks
      
md4 : active raid1 hdd6[1] hda6[0]
      2168640 blocks [2/2] [UU]
      
md0 : active raid1 hdd2[1] hda2[0]
      30716160 blocks [2/2] [UU]
      
unused devices: <none></code>

If ''/dev/hda'' is having problems, set all of its RAID1 partitions as failed and remove them:
<code># mdadm /dev/md0 --fail /dev/hda2 --remove /dev/hda2
# mdadm /dev/md1 --fail /dev/hda3 --remove /dev/hda3
# mdadm /dev/md3 --fail /dev/hda1 --remove /dev/hda1
# mdadm /dev/md4 --fail /dev/hda6 --remove /dev/hda6</code>
''/dev/md2'' is a RAID0 stripe mounted as ''/scratch'', so we have to unmount it and then stop it (you can't remove volumes from a stripe):
<code># umount /dev/md2
# mdadm --stop /dev/md2</code>
<note warning>You must shut down the server before you physically remove the drive!</note>
Shut the server down and replace the faulty drive with a new one.  After booting, the drive letters may have shifted around, so be sure to verify which drive is which before proceeding (in the example below the surviving good drive is ''/dev/hda'' and the new, blank drive is ''/dev/hdd'').
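
One way to verify which drive is which (a sketch; this assumes ''hdparm'' is installed, as it is on a stock CentOS system) is to compare each drive's reported model/serial against the partition listing:

<code># cat /proc/partitions
# hdparm -i /dev/hda | grep -i serial
# hdparm -i /dev/hdd | grep -i serial</code>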
Clone the partition table from the good drive to the new one:
<code># sfdisk -d /dev/hda | sfdisk --force /dev/hdd</code>
Verify the new partitions can be seen:
<code># partprobe -s
/dev/hda: msdos partitions 1 2 3 4 <5 6>
/dev/hdd: msdos partitions 1 2 3 4 <5 6>
/dev/sda: msdos partitions 1
/dev/sdb: msdos partitions 1
/dev/sdc: msdos partitions 1
</code>
Re-create the scratch partition (RAID0):
<code># mdadm --create --verbose /dev/md2 --level=0 --raid-devices=2 /dev/hda5 /dev/hdd5
# mkfs.ext3 /dev/md2
# mount /dev/md2 /scratch</code>
You can now add the new partitions back to the RAID1 arrays:
<code># mdadm /dev/md0 --add /dev/hdd2
# mdadm /dev/md1 --add /dev/hdd3
# mdadm /dev/md3 --add /dev/hdd1
# mdadm /dev/md4 --add /dev/hdd6</code>
After adding them, you can monitor the progress of the RAID rebuilds by looking in ''/proc/mdstat'':
<file>Personalities : [raid1] [raid0]
md3 : active raid1 hdd1[1] hda1[0]
      200704 blocks [2/2] [UU]
      
md1 : active raid1 hdd3[2] hda3[0]
      26627648 blocks [2/1] [U_]
      [===================>.]  recovery = 95.4% (25407552/26627648) finish=0.7min speed=28648K/sec
      
md2 : inactive hda5[0]
      18434304 blocks
       
md4 : active raid1 hdd6[2] hda6[0]
      2168640 blocks [2/1] [U_]
        resync=DELAYED
      
md0 : active raid1 hdd2[1] hda2[0]
      30716160 blocks [2/2] [UU]
      
unused devices: <none></file>
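
If you want to keep an eye on the rebuild without re-running the command by hand, something like this works (a minimal sketch):

<code># watch -n 5 cat /proc/mdstat</code>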
===== Hardware RAID =====

The HPC has a 3ware 9500S-12 SATA RAID card, which uses the ''3w-9xxx'' kernel module.  It has 12 channels.  All of the arrays on the hardware RAID are configured as RAID5.

==== Physical Disk Layout ====

We have one RAID controller, ''c1''.  Disks are plugged into ports ''p0'' - ''p11''.  The disks are then grouped into units (basically the rows), ''u0'' - ''u2''.

| Port 8 | Port 9 | Port 10 | Port 11 |
| Port 4 | Port 5 | Port 6 | Port 7 |
| Port 0 | Port 1 | Port 2 | Port 3 |

==== Repairing 'degraded' arrays ====

There is a utility, ''tw_cli'', which can be used to control and monitor the hardware RAID controller.

Study the output of ''show'' to see which controller to manage.  Then you can use ''/c1 show'' to show the status of that particular controller.  Things to look for:
  * Which controller is active? (c0, c1, etc)
  * Which unit is degraded? (u0, u1, u2, etc)
  * Which port is inactive or missing? (p1, p5, etc)
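
Once you know which unit and port are affected, you can also query them directly from inside ''tw_cli'' (a sketch, using the ''c1''/''u2''/''p5'' names from the steps below):

<code>//hpc-ilri> /c1/u2 show
//hpc-ilri> /c1/p5 show</code>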

<note warning>The controller supports hot swapping but you **must** remove a faulty drive through the ''tw_cli'' tool before you can swap drives.</note>

Remove the faulty port:
<code>maint remove c1 p5</code>
Insert a new drive and rescan:
<code>maint rescan</code>
Rebuild the degraded array:
<code>maint rebuild c1 u2 p5</code>

Check the status of the rebuild by monitoring ''/c1 show'', though I have a feeling this might disturb the rebuild process.  In any case, you can follow the progress in the output of ''dmesg'':

<file>3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=2.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0005): Background rebuild done:unit=2.</file>
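
To follow these messages as they appear (a sketch; assumes kernel messages go to ''/var/log/messages'' as on a stock CentOS install):

<code># tail -f /var/log/messages | grep 3w-9xxx</code>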

This sucks:

<file>3w-9xxx: scsi1: AEN: INFO (0x04:0x0029): Background verify started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x002B): Background verify done:unit=0.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit detected:unit=0, port=3</file>

<code>$ sudo tw_cli
Password: 
//hpc-ilri> /c1 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    DEGRADED                       64K     698.461   ON     OFF    
u1    RAID-5    OK                             64K     698.461   ON     OFF    
u2    RAID-5    OK                             64K     698.461   ON     OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     232.88 GB   488397168     WD-WMAEP2714804     
p1     OK               u0     232.88 GB   488397168     WD-WMAEP1570106     
p2     OK               u0     232.88 GB   488397168     WD-WMAEP2712887     
p3     DEGRADED         u0     232.88 GB   488397168     WD-WMAEP2714418     
p4     OK               u2     232.88 GB   488397168     WD-WCAT1C715001     
p5     OK               u2     232.88 GB   488397168     WD-WMAEP2713449     
p6     OK               u2     232.88 GB   488397168     WD-WMAEP2715070     
p7     OK               u2     232.88 GB   488397168     WD-WMAEP2712590     
p8     OK               u1     232.88 GB   488397168     WD-WMAEP2712574     
p9     OK               u1     232.88 GB   488397168     WD-WMAEP2734142     
p10    OK               u1     232.88 GB   488397168     WD-WMAEP2702155     
p11    OK               u1     232.88 GB   488397168     WD-WMAEP2712472  </code>
  
Looks like another drive failed, this time ''p3'' in unit ''u0''.