We have two RAIDs on the HPC:
  * Linux kernel software RAID
  * 3ware hardware RAID
==== Drive numbering ====

If you're looking at the front of the HPC you'll see four rows of drives.  From the bottom:
  * Rows 0 - 2 are SATA, connected to the hardware 3ware RAID card
  * Row 3 is IDE (see the check just below)
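
To confirm the split from the OS side, a minimal check (assuming the IDE drives show up as ''hda''/''hdd'' and the 3ware units as ''sda''-''sdc'', the device names used elsewhere on this page):
<code># cat /proc/partitions</code>
The ''hdX'' entries are the IDE drives; the ''sdX'' entries are the units exported by the 3ware card.
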
A snapshot of the software RAID's health:
  
<code># cat /proc/mdstat
Personalities : [raid1] [raid0]
md3 : active raid1 hdd1[1] hda1[0]
      200704 blocks [2/2] [UU]
      
md1 : active raid1 hdd3[1] hda3[0]
      26627648 blocks [2/2] [UU]
      
md2 : active raid0 hdd5[1] hda5[0]
      36868608 blocks 256k chunks
      
md4 : active raid1 hdd6[1] hda6[0]
      2168640 blocks [2/2] [UU]
      
md0 : active raid1 hdd2[1] hda2[0]
      30716160 blocks [2/2] [UU]
      
unused devices: <none></code>
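For more detail on any single array (a minimal sketch; ''/dev/md0'' is just one of the arrays listed above, and ''mdadm'' needs root):
<code># mdadm --detail /dev/md0</code>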
==== Repair RAID ====
When a disk is failing you need to replace it.  First, look at the RAID configuration to see which partitions are in use by which arrays.  For example:
<code># cat /proc/mdstat
Personalities : [raid1] [raid0]
md3 : active raid1 hdd1[1] hda1[0]
      200704 blocks [2/2] [UU]
      
md1 : active raid1 hdd3[1] hda3[0]
      26627648 blocks [2/2] [UU]
      
md2 : active raid0 hdd5[1] hda5[0]
      36868608 blocks 256k chunks
      
md4 : active raid1 hdd6[1] hda6[0]
      2168640 blocks [2/2] [UU]
      
md0 : active raid1 hdd2[1] hda2[0]
      30716160 blocks [2/2] [UU]
      
unused devices: <none></code>
  
If ''/dev/hda'' is having problems, set all its RAID1 partitions as failed and remove them:
<code># mdadm /dev/md0 --fail /dev/hda2 --remove /dev/hda2
# mdadm /dev/md1 --fail /dev/hda3 --remove /dev/hda3
# mdadm /dev/md3 --fail /dev/hda1 --remove /dev/hda1
# mdadm /dev/md4 --fail /dev/hda6 --remove /dev/hda6</code>
''/dev/md2'' is a RAID0 stripe mounted as ''/scratch'', so we have to umount it and then stop it (you can't remove volumes from a stripe):
<code># umount /dev/md2
# mdadm --stop /dev/md2</code>
<note warning>You must shut down the server before you physically remove the drive!</note>
Shut the server down and replace the faulty drive with a new one.  After booting, the drive letters may have shifted around, so be sure to verify which is which before proceeding.
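One way to verify which physical drive is which is by serial number (a hedged example; it assumes ''hdparm'' is installed and that the IDE drives are still ''hda'' and ''hdd''):
<code># hdparm -i /dev/hda | grep -i serial
# hdparm -i /dev/hdd | grep -i serial</code>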
Clone the partition table from the good drive to the new one:
<code># sfdisk -d /dev/hda | sfdisk --force /dev/hdd</code>
Verify the new partitions can be seen:
<code># partprobe -s
/dev/hda: msdos partitions 1 2 3 4 <5 6>
/dev/hdd: msdos partitions 1 2 3 4 <5 6>
/dev/sda: msdos partitions 1
/dev/sdb: msdos partitions 1
/dev/sdc: msdos partitions 1
</code>
Re-create the scratch partition (RAID0):
<code># mdadm --create --verbose /dev/md2 --level=0 --raid-devices=2 /dev/hda5 /dev/hdd5
# mkfs.ext3 /dev/md2
# mount /dev/md2 /scratch</code>
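Because ''/dev/md2'' was created from scratch it has a new UUID, so if the arrays are recorded in ''/etc/mdadm.conf'' it is worth comparing that file against the current state (a hedged sketch; whether this machine keeps an ''/etc/mdadm.conf'' at all is an assumption):
<code># mdadm --detail --scan</code>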
You can now add the new partitions back to the RAID1 arrays:
<code># mdadm /dev/md0 --add /dev/hdd2
# mdadm /dev/md1 --add /dev/hdd3
# mdadm /dev/md3 --add /dev/hdd1
# mdadm /dev/md4 --add /dev/hdd6</code>
After adding them you can monitor the progress of the RAID rebuilds by looking in ''/proc/mdstat'':
<file>Personalities : [raid1] [raid0]
md3 : active raid1 hdd1[1] hda1[0]
      200704 blocks [2/2] [UU]
      
md1 : active raid1 hdd3[2] hda3[0]
      26627648 blocks [2/1] [U_]
      [===================>.]  recovery = 95.4% (25407552/26627648) finish=0.7min speed=28648K/sec
      
md2 : inactive hda5[0]
      18434304 blocks
      
md4 : active raid1 hdd6[2] hda6[0]
      2168640 blocks [2/1] [U_]
        resync=DELAYED
      
md0 : active raid1 hdd2[1] hda2[0]
      30716160 blocks [2/2] [UU]
      
unused devices: <none></file>
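One convenient way to keep an eye on it without re-running ''cat'' by hand (assuming ''watch'' is installed, which it normally is):
<code># watch cat /proc/mdstat</code>
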
===== Hardware RAID =====
  
The hardware RAID is a 3ware 9500S-12 SATA RAID card using the ''3w-9xxx'' kernel module.  It has 12 channels, and all of the arrays on it are configured as RAID5.
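
To confirm the driver is loaded (a minimal check; it greps the module list for the ''3w'' prefix rather than assuming the exact name ''lsmod'' reports):
<code># lsmod | grep 3w</code>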
  
==== Physical Disk Layout ====
  
We have one RAID controller, ''c1''.  Disks are plugged into ports ''p0'' - ''p11''.  The disks are then grouped into units (basically the rows), ''u0'' - ''u2''.
  
| Port 8 | Port 9 | Port 10 | Port 11 |
| Port 4 | Port 5 | Port 6 | Port 7 |
| Port 0 | Port 1 | Port 2 | Port 3 |
  
==== Repairing 'degraded' arrays ====
  
There is a utility, ''tw_cli'', which can be used to control/monitor the hardware RAID controller.
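It can be run interactively, as in the session shown further down, or as a one-shot command (a hedged example; it assumes this version of the CLI accepts the command as an argument):
<code>$ sudo tw_cli /c1 show</code>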
  
Study the output of ''show'' to know which controller to manage.  Then you can use ''/c1 show'' to show the status of that particular controller.  Things to look for:
  * Which controller is active? (c0, c1, etc)
  * Which unit is degraded? (u0, u1, u2, etc)
  * Which port is inactive or missing? (p1, p5, etc)

<note warning>The controller supports hot swapping but you **must** remove a faulty drive through the ''tw_cli'' tool before you can swap drives.</note>
  
Remove the faulty port:
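A sketch of the removal, assuming the same legacy ''maint'' syntax as the rebuild command below, with ''c1'' and ''p5'' as placeholder controller and port (substitute the values identified with ''/c1 show''):
<code>maint remove c1 p5</code>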
Rebuild the degraded array:
<code>maint rebuild c1 u2 p5</code>

Check the status of the rebuild by monitoring ''/c1 show'', though I have a feeling this might disturb the rebuild process.  In any case, you can follow the output of ''dmesg'':

<file>3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=2.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0005): Background rebuild done:unit=2.</file>
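
One way to follow these messages as they arrive (a minimal sketch; it assumes kernel messages are also written to ''/var/log/messages'', which is typical but not confirmed here):
<code># tail -f /var/log/messages | grep 3w-9xxx</code>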

This sucks:

<file>3w-9xxx: scsi1: AEN: INFO (0x04:0x0029): Background verify started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x002B): Background verify done:unit=0.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit detected:unit=0, port=3</file>

<code>$ sudo tw_cli
Password: 
//hpc-ilri> /c1 show

Unit  UnitType  Status         %RCmpl  %V/I/ Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    DEGRADED                   64K     698.461   ON     OFF
u1    RAID-5    OK                         64K     698.461   ON     OFF
u2    RAID-5    OK                         64K     698.461   ON     OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     232.88 GB   488397168     WD-WMAEP2714804
p1     OK               u0     232.88 GB   488397168     WD-WMAEP1570106
p2     OK               u0     232.88 GB   488397168     WD-WMAEP2712887
p3     DEGRADED         u0     232.88 GB   488397168     WD-WMAEP2714418
p4     OK               u2     232.88 GB   488397168     WD-WCAT1C715001
p5     OK               u2     232.88 GB   488397168     WD-WMAEP2713449
p6     OK               u2     232.88 GB   488397168     WD-WMAEP2715070
p7     OK               u2     232.88 GB   488397168     WD-WMAEP2712590
p8     OK               u1     232.88 GB   488397168     WD-WMAEP2712574
p9     OK               u1     232.88 GB   488397168     WD-WMAEP2734142
p10    OK               u1     232.88 GB   488397168     WD-WMAEP2702155
p11    OK               u1     232.88 GB   488397168     WD-WMAEP2712472</code>

Looks like another drive failed.