backup:tape

Differences

This shows you the differences between two versions of the page.

backup:tape [2011/05/31 11:51] – [List SCSI devices] aorth
backup:tape [2011/11/09 12:55] (current) – removed aorth
Line 1: Line 1:
-====== Tape backup ====== 
  
-Tape backups are run manually __once per week__, __on Friday afternoon__.  We have four cassettes, each of which holds seven tapes.  A full backup currently needs around ten tapes, so each pair of cassettes is loaded with eleven tapes in case the backups grow.  We alternate the pairs each week so that one week of archived data is always available.
- 
-A full system backup includes: 
- 
-  * ''/''  (OS)
-  * ''/mnt/export''  (homes and biosoft applications) 
-  * ''/mnt/export2''  (segoli data is here) 
-  * ''/mnt/export3''  (videodata) 
- 
-===== Example backup process ===== 
-==== Insert tapes ==== 
-==== Run Storix Backup ==== 
-From an X11 window: 
-<code>$ sudo sbadmin</code> 
- 
-  - Utilities -> Perform Tape Library Operations -> Move Tapes in Library 
-  - Move tape 1 -> Drive 1 
-    - {{:backup:move_tapes.png?nolink}} 
-    - {{:backup:move_tapes_2.png?nolink|}} 
-  - Display -> Clients, Servers & Media 
-    - "Read Label From Media" 
-    - "Expire/Remove" 
-      - {{:backup:expire_label.png?direct&300|}} 
-  - Actions -> Run Backup Jobs 
-    - {{:backup:run_backup.png?direct&300|}} 
-    - "Run Now" 
- 
-This takes about 30-35 hours depending on the load of the server and whether or not the robot is working properly. 
- 
-===== Problems ===== 
-  * Tapes are sometimes hard to remove from the cassette, which can cause the robot to jam
-  * Setting the virtual device to "sequential" doesn't work as desired (the robot stops when a tape is full and waits for you to manually unload it and load the next tape), so we use a "random tape library" instead
- 
-===== Monitoring the backup ===== 
- 
-The Storix Backup tool shows the current status of the backup, but you cannot see it unless you are sitting at the machine.  A one-line shell loop can poll the tape library periodically and append the result to a file, which then serves as a log of the backup's progress.  Write the file somewhere web-readable, because the web server is accessible from outside ILRI:
-<code># for num in `seq 1 1000`; do echo "Seq ${num}: $(mtx status)" >> /var/www/html/coffee.txt; sleep 1800;  done</code> 
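The loop above can also be written as a small function that timestamps each entry and captures error output. This is only a sketch: the function name ''log_status'' is hypothetical, and the changer device ''/dev/sg0'' and the log path are taken from elsewhere on this page, so adjust them if they differ.

```shell
#!/bin/sh
# log_status CMD LOGFILE (hypothetical helper, not part of mtx or Storix):
# run CMD once and append its output to LOGFILE under a timestamp line.
log_status() {
    status_cmd=$1
    logfile=$2
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # $status_cmd is left unquoted on purpose so it splits into words.
        # stderr is captured too, so changer errors end up in the log.
        $status_cmd 2>&1
    } >> "$logfile"
}

# Example loop (commented out): poll every 30 minutes until interrupted.
# while true; do
#     log_status "mtx -f /dev/sg0 status" /var/www/html/coffee.txt
#     sleep 1800
# done
```

Unlike the fixed count of 1000 iterations, the commented loop runs until it is stopped by hand, which may or may not be what you want for a 30-35 hour backup.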
-===== Log of backups ===== 
- 
-^  Date  ^  Tape set  ^  Notes  ^ 
-| Oct 30, 2009  |  A  | Robot jammed on tape 7, backup did not complete  | 
-| Nov 6, 2009  |  B  | Completed successfully  | 
-| Nov 13, 2009  |  A  | Completed successfully  | 
-| Nov 20, 2009  |  B  | Backup completed successfully, Verify process failed at tape 4 |  
-| Nov 27, 2009  |  A  | Completed successfully  | 
-| Dec 4, 2009  |  B  | Backup completed successfully, Verify process failed at tape 6  | 
-| Dec 11, 2009  |  A  | Backup failed to start (appears to be a software problem, server might need a reboot) | 
-| Dec 21, 2009  |  A  | Completed successfully  | 
-| Jan 8, 2010  |  B  | Completed successfully  | 
-| Jan 15, 2010  |  A  | Backup completed successfully, Verify process failed  | 
-| Jan 22, 2010  |  B  | Backup completed successfully, Verify stuck at 100%...  | 
-| Jan 29, 2010  |  A  | Backup completed successfully, Verify stuck at 8%...  |
-| Feb 5, 2010  |  B  | Completed successfully | 
-| Feb 12, 2010  |  A  | Completed successfully |
-| Feb 19, 2010  |  B  | Completed successfully | 
-| March 12, 2010  |  A  | Completed successfully | 
-| March 19, 2010  |  B  | Completed successfully |
-| April 1, 2010 |  A  | Completed successfully | 
-| April 9, 2010  |  B  | Completed successfully |
-| April 16, 2010  |  A  | Completed successfully |
-| April 23, 2010  |  A  | Completed successfully | 
-| April 30, 2010  |  B  | Completed successfully | 
-| May 07, 2010  |  A  | Completed successfully | 
-| May 21, 2010  |  B  | Completed successfully |
-| June 4, 2010  |  A  | Completed successfully |
-| June 9, 2010  |  B  | Completed successfully |
-| June 18, 2010  |  A  | Completed successfully |
-| June 25, 2010  |  B  | Completed successfully | 
-| July 2, 2010  |  A  | Completed successfully | 
-| July 9, 2010  |  B  | Completed successfully | 
-| July 16, 2010  |  A  | Completed successfully | 
-| July 23, 2010  |  B  | Completed successfully | 
-| July 30, 2010  |  A  | Completed successfully | 
-| August 6, 2010  |  B  | Completed successfully | 
-| August 13, 2010  |  A  | ... | 
-| September 3, 2010  |  A  | Completed successfully, verify failed | 
-| September 10, 2010  |  B  | Completed successfully, verify failed | 
-| September 17, 2010  |  A  | HPC crashed during the previous night, backups couldn't run... will run them next week now that HPC is fixed | 
-| September 24, 2010  |  A  | Completed successfully | 
-| October 1, 2010  |  B  | Completed successfully | 
-| October 8, 2010  |  A  | Completed successfully | 
-| October 15, 2010  |  B  | Completed successfully | 
-| October 22, 2010  |  A  | Completed successfully | 
-| October 29, 2010  |  ...  | Alan in Switzerland, Etienne in China | 
-| November 5, 2010  |  B  | Apparently successful, but HPC crashed sometime during the weekend due to power fluctuations.  Verify failed. | 
-| November 12, 2010  |  A  | Completed successfully | 
-| November 19, 2010  |  B  | Had a problem (can't remember why, power?) | 
-| November 26, 2010  |  B  | Completed successfully | 
-| December 3, 2010  |  A  | Completed successfully | 
-| December 12, 2010  |  B  | Completed successfully | 
-| December 17, 2010  |  A  | Completed successfully | 
-| December 24, 2010  |  B  | Completed successfully | 
-| December 31, 2010  |  -  | gone for holidays | 
-| January 7, 2011  |  A  | Completed successfully | 
-| January 14, 2011  |  B  | Backup failed, tape library has error 205: X Axis Error. Reset the library and it appears to be ok. | 
-| January 21, 2011  |  B  | Completed successfully | 
-| January 28, 2011  |  A  | failed, crashed because of a job Anne was running | 
-| February 4, 2011  |  -  | gone for holidays | 
-| February 11, 2011  |  A  | no backup because of work to server room air conditioning | 
-| February 18, 2011  |  A  | Completed successfully | 
-| February 25, 2011  |  B  | Completed successfully | 
-| March 4, 2011  |  A  | failed | 
-| March 11, 2011  |  A  | Completed successfully | 
-| March 18, 2011  |  B  | Completed successfully | 
-| March 25, 2011  |  A  | Completed successfully | 
-| April 1, 2011  |  B  | Was running a restore for Anne so couldn't run backups | 
-| April 8, 2011  |  A  | Completed successfully | 
-| April 15, 2011  |  B  | Completed successfully | 
-| April 21, 2011  |  A  | Completed successfully | 
-| April 29, 2011  |  B  | Completed successfully | 
-| May 6, 2011  |  A  | failed... not sure why | 
-| May 13, 2011  |  A  | haven't started because I can't eject the tapes yet | 
-====== Storix Backup Administrator ====== 
-We are using an Exabyte Tape library for backups and the commercial Storix Backup Administrator software [[http://www.storix.com/]]. 
- 
-Version: 
-<code>$ cat /opt/storix/instconfig/version  
-6.3.4.4</code> 
- 
-Storix System Backup Administrator: ''/home/villierse/software/storix'' 
- 
-Graphical user interface: ''sbadmin''
- 
-The Exabyte device has one tape drive and a library of tapes.  It holds three cassettes of seven tapes each.  The robotic arm moves tapes between the cassettes and the drive, where they are written for backup or read for restore.
-===== Documentation ===== 
- 
-  * {{:sba.pdf}} 
-  * {{:sbalinuxinst.pdf}} 
-  * {{:exabyte-basicbackup.pdf}} 
-  * {{:exabyte221l_manual.pdf}} 
-  * {{:exabyte_monitor.pdf}} 
- 
-===== Notes ===== 
- 
-  * List attached SCSI devices: ''cat /proc/scsi/scsi''
-  * Tape drive: ''/dev/st0''
-  * Library (changer): ''/dev/sg0''
-  * Check drive status: ''mt -f /dev/st0 status'' (''BOT'' in the output means a tape is loaded and positioned at the beginning of the tape)
- 
-Rewind tape: ''mt -f /dev/nst0 rewind''; rewind and eject: ''mt -f /dev/nst0 rewoffl''
- 
-  * Make backup: ''tar cvf /dev/st0 directory''
-  * List files on tape: ''tar tvf /dev/st0''
-  * Rewind and eject tape: ''mt -f /dev/st0 rewoffl''
-  * Restore from tape (insert the tape first): ''tar xvf /dev/st0''
- 
-To write more than one backup to the same tape, use ''/dev/nst0'' instead of ''/dev/st0''.  The ''nst'' device does not rewind the tape when it is closed, so the next backup is written after the previous one instead of overwriting it.
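As a rough illustration of the non-rewinding device, the sketch below writes several directories as separate archives and then positions the tape at a chosen archive with ''mt fsf''. The helper names (''run'', ''write_archives'', ''list_archive'') and the ''DRY_RUN'' switch are hypothetical and exist only for this example; the device path is the one noted above.

```shell
#!/bin/sh
# Non-rewinding tape device; override by exporting TAPE_NR beforehand.
TAPE_NR=${TAPE_NR:-/dev/nst0}

# run CMD...: execute CMD, or only print it when DRY_RUN is set
# (hypothetical helper so the sequence can be inspected without a drive).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# write_archives DIR...: rewind once, then write one tar archive per DIR.
# The tape is not rewound between archives, so they land back to back.
write_archives() {
    run mt -f "$TAPE_NR" rewind
    for dir in "$@"; do
        run tar cvf "$TAPE_NR" "$dir"
    done
}

# list_archive N: list the contents of the N-th archive (1-based).
list_archive() {
    n=$1
    run mt -f "$TAPE_NR" rewind
    if [ "$n" -gt 1 ]; then
        # Skip forward over the N-1 archives that come before it.
        run mt -f "$TAPE_NR" fsf "$((n - 1))"
    fi
    run tar tvf "$TAPE_NR"
}
```

For example, ''DRY_RUN=1 write_archives /etc /home'' prints the three commands that would run, and ''list_archive 2'' would list the second archive on the tape.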
- 
-===== Troubleshooting =====
-The following commands can be useful for diagnosing problems with devices.
-==== mtx ==== 
-<code>mtx -f /dev/sg0 inquiry 
-Product Type: Medium Changer 
-Vendor ID: 'EXABYTE ' 
-Product ID: 'EXB-480         ' 
-Revision: '2.18' 
-Attached Changer: No</code> 
-==== tapeinfo ==== 
-<code>tapeinfo -f /dev/sg0 
-Product Type: Medium Changer 
-Vendor ID: 'EXABYTE ' 
-Product ID: 'EXB-480         ' 
-Revision: '2.18' 
-Attached Changer: No 
-SerialNumber: '67001141  ' 
-SCSI ID: 0 
-SCSI LUN: 0 
-Ready: yes</code> 
-==== loaderinfo ==== 
-<code>loaderinfo -f /dev/sg0 
-Product Type: Medium Changer 
-Vendor ID: 'EXABYTE ' 
-Product ID: 'EXB-480         ' 
-Revision: '2.18' 
-Attached Changer: No 
-Bar Code Reader: Yes 
-EAAP: Yes 
-Number of Medium Transport Elements: 1 
-Number of Storage Elements: 21 
-Number of Import/Export Element Elements: 1 
-Number of Data Transfer Elements: 1 
-Transport Geometry Descriptor Page: Yes 
-Invertable: No 
-Device Configuration Page: Yes 
-Can Transfer: Yes</code> 
- 
-==== List SCSI devices ==== 
-The ''/dev/sg*'' nodes apparently cover all SCSI devices, including the disks attached to the OS, which can be quite confusing.  ''/proc/scsi/scsi'' shows information about the attached SCSI devices:
-<code>cat /proc/scsi/scsi 
-Attached devices: 
-Host: scsi0 Channel: 00 Id: 00 Lun: 00 
-  Vendor: EXABYTE  Model: EXB-480          Rev: 2.18 
-  Type:   Medium Changer                   ANSI SCSI revision: 02 
-Host: scsi0 Channel: 00 Id: 01 Lun: 00 
-  Vendor: IBM      Model: ULTRIUM-TD1      Rev: 4561 
-  Type:   Sequential-Access                ANSI SCSI revision: 03 
-Host: scsi1 Channel: 00 Id: 00 Lun: 00 
-  Vendor: 3ware    Model: Logical Disk 00  Rev: 1.00 
-  Type:   Direct-Access                    ANSI SCSI revision: ffffffff 
-Host: scsi1 Channel: 00 Id: 01 Lun: 00 
-  Vendor: 3ware    Model: Logical Disk 01  Rev: 1.00 
-  Type:   Direct-Access                    ANSI SCSI revision: ffffffff 
-Host: scsi1 Channel: 00 Id: 02 Lun: 00 
-  Vendor: 3ware    Model: Logical Disk 02  Rev: 1.00 
-  Type:   Direct-Access                    ANSI SCSI revision: ffffffff</code> 
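To pick the changer and the tape drive out of that listing at a glance, the output can be condensed to one line per device. This is a minimal sketch that assumes the three-line-per-device layout shown above; ''scsi_summary'' is a hypothetical name, not an existing tool.

```shell
#!/bin/sh
# scsi_summary FILE: print "HOST id ID: TYPE (VENDOR MODEL)" for each
# device in a /proc/scsi/scsi style listing (hypothetical helper).
scsi_summary() {
    awk '
        /^Host:/ { host = $2; id = $6 }
        /Vendor:/ {
            vendor = $2
            # The model may contain spaces, so cut it out between the labels.
            model = $0
            sub(/.*Model:[ \t]*/, "", model)
            sub(/[ \t]*Rev:.*/, "", model)
        }
        /Type:/ {
            type = $0
            sub(/.*Type:[ \t]*/, "", type)
            sub(/[ \t]*ANSI.*/, "", type)
            printf "%s id %s: %s (%s %s)\n", host, id, type, vendor, model
        }
    ' "$1"
}

# Usage on a live system (commented out):
# scsi_summary /proc/scsi/scsi
```

In the listing above, the "Medium Changer" entry is the library robot and the "Sequential-Access" entry is the tape drive; the "Direct-Access" entries are the disks.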
-   
-==== Force tape location ==== 
-<code>TAPE=/dev/sg0 mtx status</code> 
-==== /dev/st0 not ready ==== 
-Try to reset the library and drives from the front panel. 
- 
-==== Tape library commands ==== 
- 
-  * ''mtx status'' 
-  * ''mtx unload <slotnum> <drivenum>''  (Unloads media from drive <drivenum> into slot  <slotnum>.) 
- 
-===== Bootable USB recovery ===== 
- 
-http://www.storix.com/how-to/202-how-to-configure-a-bootable-usb-drive-for-bare-metal-recovery-sbadmin-v6 
backup/tape.1306842695.txt.gz · Last modified: 2011/05/31 11:51 by aorth