
Tape backup

Tape backups are run manually once a week, on Friday afternoon. We have four cassettes, each of which can hold seven tapes. A full backup currently needs around ten tapes, so each pair of cassettes carries eleven tapes in case the size of the backups increases. Each week we alternate between the two pairs of cassettes, so that we always have the previous week's backup as an archive.

A full system backup includes:

  • / (OS)
  • /mnt/export (homes and biosoft applications)
  • /mnt/export2 (segoli data is here)
  • /mnt/export3 (videodata)

Example backup process

Insert tapes

Run Storix Backup

From an X11 window:

$ sudo sbadmin
  1. Utilities → Perform Tape Library Operations → Move Tapes in Library
  2. Move tape 1 → Drive 1
  3. Display → Clients, Servers & Media
    1. "Read Label From Media"
    2. "Expire/Remove"
  4. Actions → Run Backup Jobs
    1. "Run Now"
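
The "Move tape 1 → Drive 1" step can probably also be done from a shell with mtx, which this page already uses for troubleshooting. The following is only a sketch: it assumes the changer is /dev/sg0 (as listed under Notes) and that Storix's "Drive 1" corresponds to mtx drive 0, since mtx numbers drives from 0. The function name load_tape and the DRY_RUN and CHANGER_DEV variables are made up for this example. It does not replace the Storix label and expire steps.

```shell
#!/bin/sh
# Sketch: load a tape from a storage slot into the drive with mtx.
# Assumptions: changer device is /dev/sg0; the single drive is mtx drive 0.
CHANGER_DEV="${CHANGER_DEV:-/dev/sg0}"

# load_tape SLOT [DRIVE]: move the tape in SLOT into DRIVE (default 0).
# With DRY_RUN=1 the command is printed instead of executed.
load_tape() {
    slot="$1"
    drive="${2:-0}"
    # Reject empty or non-numeric slot numbers before touching the robot.
    case "$slot" in
        ''|*[!0-9]*) echo "usage: load_tape SLOT [DRIVE]" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mtx -f $CHANGER_DEV load $slot $drive"
    else
        mtx -f "$CHANGER_DEV" load "$slot" "$drive"
    fi
}
```

Running DRY_RUN=1 load_tape 1 first shows the command without moving anything, which is worth doing while the robot is prone to jamming.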

This takes about 30-35 hours depending on the load of the server and whether or not the robot is working properly.

Problems

  • Tapes are sometimes hard to remove from the cassette, which can cause the robot to jam
  • Even setting the virtual device to "sequential" doesn't work as desired (robot stops when a tape is full and waits for you to manually unload and load the next tape), so we use a "random tape library" instead

Monitoring the backup

The Storix Backup tool shows the current status of the backup, but only to someone sitting at the machine. To follow progress remotely, use a one-line shell loop that periodically checks the status of the tape library and appends it to a file, which essentially becomes a log of the progress. Write the output somewhere web-readable, as the web server is accessible from outside ILRI:

# for num in $(seq 1 1000); do echo "Seq ${num}: $(mtx -f /dev/sg0 status)" >> /var/www/html/coffee.txt; sleep 1800; done
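
The same loop can be written as a small script that also timestamps each entry and records when the status command itself fails. This is a sketch, not what was actually run: the function names log_status and monitor and the STATUS_CMD and LOGFILE variables are invented here, and /dev/sg0 as the changer device is taken from the Notes section.

```shell
#!/bin/sh
# Sketch: append a timestamped tape library status to a log file at intervals.
# Assumptions: changer is /dev/sg0; the log path matches the one-liner above.
STATUS_CMD="${STATUS_CMD:-mtx -f /dev/sg0 status}"
LOGFILE="${LOGFILE:-/var/www/html/coffee.txt}"

# log_status: write one timestamped status entry to $LOGFILE.
log_status() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Record a failure too, so gaps in the log are explained.
        $STATUS_CMD 2>&1 || echo "status command failed"
    } >> "$LOGFILE"
}

# monitor COUNT INTERVAL: log the status COUNT times, INTERVAL seconds apart.
monitor() {
    count="$1"
    interval="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        log_status
        # No need to sleep after the final entry.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

Calling monitor 1000 1800 as root reproduces the one-liner's schedule (1000 entries, 30 minutes apart).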

Log of backups

Date | Tape set | Notes
Oct 30, 2009 | A | Robot jammed on tape 7, backup did not complete
Nov 6, 2009 | B | Completed successfully
Nov 13, 2009 | A | Completed successfully
Nov 20, 2009 | B | Backup completed successfully, Verify process failed at tape 4
Nov 27, 2009 | A | Completed successfully
Dec 4, 2009 | B | Backup completed successfully, Verify process failed at tape 6
Dec 11, 2009 | A | Backup failed to start (appears to be a software problem, server might need a reboot)
Dec 21, 2009 | A | Completed successfully
Jan 8, 2010 | B | Completed successfully
Jan 15, 2010 | A | Backup completed successfully, Verify process failed
Jan 22, 2010 | B | Backup completed successfully, Verify stuck at 100%…
Jan 29, 2010 | A | Backup completed successfully, Verify stuck at 8%…
Feb 5, 2010 | B | Completed successfully
Feb 12, 2010 | A | Completed successfully
Feb 19, 2010 | B | Completed successfully
March 12, 2010 | A | Completed successfully
March 19, 2010 | B | Completed successfully
April 1, 2010 | A | Completed successfully
April 9, 2010 | B | Completed successfully
April 16, 2010 | A | Completed successfully
April 23, 2010 | A | Completed successfully
April 30, 2010 | B | Completed successfully
May 7, 2010 | A | Completed successfully
May 21, 2010 | B | Completed successfully
June 4, 2010 | A | Completed successfully
June 9, 2010 | B | Completed successfully
June 18, 2010 | A | Completed successfully
June 25, 2010 | B | Completed successfully
July 2, 2010 | A | Completed successfully
July 9, 2010 | B | Completed successfully
July 16, 2010 | A | Completed successfully
July 23, 2010 | B | Completed successfully
July 30, 2010 | A | Completed successfully
August 6, 2010 | B | Completed successfully
August 13, 2010 | A |
September 3, 2010 | A | Completed successfully, verify failed
September 10, 2010 | B | Completed successfully, verify failed
September 17, 2010 | A | HPC crashed during the previous night, backups couldn't run… will run them next week now that HPC is fixed
September 24, 2010 | A | Completed successfully
October 1, 2010 | B | Completed successfully
October 8, 2010 | A | Completed successfully
October 15, 2010 | B | Completed successfully
October 22, 2010 | A | Completed successfully
October 29, 2010 | - | Alan in Switzerland, Etienne in China
November 5, 2010 | B | Apparently successful, but HPC crashed sometime during the weekend due to power fluctuations. Verify failed.
November 12, 2010 | A | Completed successfully
November 19, 2010 | B | Had a problem (can't remember why, power?)
November 26, 2010 | B | Completed successfully
December 3, 2010 | A | Completed successfully
December 12, 2010 | B | Completed successfully
December 17, 2010 | A | Completed successfully
December 24, 2010 | B | Completed successfully
December 31, 2010 | - | Gone for holidays
January 7, 2011 | A | Completed successfully
January 14, 2011 | B | Backup failed, tape library has error 205: X Axis Error. Reset the library and it appears to be ok.
January 21, 2011 | B | Completed successfully
January 28, 2011 | A | Failed, crashed because of a job Anne was running
February 4, 2011 | - | Gone for holidays
February 11, 2011 | A | No backup because of work to server room air conditioning
February 18, 2011 | A | Completed successfully
February 25, 2011 | B | Completed successfully
March 4, 2011 | A | Failed
March 11, 2011 | A | Completed successfully
March 18, 2011 | B | Completed successfully
March 25, 2011 | A | Completed successfully
April 1, 2011 | B | Was running a restore for Anne so couldn't run backups
April 8, 2011 | A | Completed successfully
April 15, 2011 | B | Completed successfully
April 21, 2011 | A | Completed successfully
April 29, 2011 | B | Completed successfully
May 6, 2011 | A | Failed… not sure why
May 13, 2011 | A | Haven't started because I can't eject the tapes yet

Storix Backup Administrator

We are using an Exabyte tape library for backups, together with the commercial Storix Backup Administrator software (http://www.storix.com/).

Version:

$ cat /opt/storix/instconfig/version 
6.3.4.4

Storix System Backup Administrator: /home/villierse/software/storix

Graphical user interface: sbadmin

The Exabyte device has one tape "drive" and a library of tapes. It can hold three cassettes, and each cassette can hold 7 tapes. The robotic arm moves the tapes from the cassettes to the tape drive, where they are unwound and read for backup/restore.

Documentation

Notes

cat /proc/scsi/scsi (Display attached scsi devices)

Tape drive: /dev/st0 Library: /dev/sg0

Test: mt -f /dev/st0 status (the BOT keyword, "beginning of tape", in the output means a tape is in the drive)
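
The BOT check can be scripted by searching the status output for that keyword. A minimal sketch, assuming BOT appears as a separate word on the status-bits line of the mt output; the function name tape_at_bot is invented for this example.

```shell
#!/bin/sh
# Sketch: decide from `mt status` output whether a tape is loaded and rewound.
# Assumption: the status-bits line contains BOT when a tape is at its beginning.

# tape_at_bot: read `mt status` output on stdin; succeed if BOT is reported.
tape_at_bot() {
    # -w matches BOT only as a whole word, -q suppresses the output.
    grep -qw 'BOT'
}

# Example (needs the real drive):
#   mt -f /dev/st0 status | tape_at_bot && echo "tape loaded, at beginning"
```

Note that a tape which is loaded but not rewound will not report BOT, so a failed check does not by itself prove that the drive is empty.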

Rewind tape: mt -f /dev/nst0 rewind or mt -f /dev/nst0 rewoffl (rewoffl also ejects the tape)

Make backup: tar cvf /dev/st0 directory

List files on tape: tar tvf /dev/st0

Rewind and eject tape: mt -f /dev/st0 rewoffl

Restore tape (insert tape): tar xvf /dev/st0

To make more than one backup to the same tape, use /dev/nst0 instead of /dev/st0. The non-rewinding device does not rewind the tape when a backup finishes, so the next backup is written after it.
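
To illustrate the non-rewinding device, here is a sketch that writes several tar archives back to back and then lists one of them by skipping over the earlier ones with mt fsf. The function names run, backup_dirs and list_archive, and the NTAPE and DRY_RUN variables, are made up for this example; /dev/nst0 is assumed to be the non-rewinding node of the drive at /dev/st0.

```shell
#!/bin/sh
# Sketch: write several tar archives to one tape, then list one of them.
# Assumption: /dev/nst0 is the non-rewinding node for the drive at /dev/st0.
NTAPE="${NTAPE:-/dev/nst0}"

# run: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# backup_dirs DIR...: write one tar archive per directory, back to back.
backup_dirs() {
    run mt -f "$NTAPE" rewind || return 1
    for dir in "$@"; do
        run tar cvf "$NTAPE" "$dir" || return 1
    done
}

# list_archive N: list the Nth archive on the tape (first archive is 1).
list_archive() {
    n="$1"
    run mt -f "$NTAPE" rewind || return 1
    # Skip forward over N-1 file marks to reach the start of archive N.
    if [ "$n" -gt 1 ]; then
        run mt -f "$NTAPE" fsf "$((n - 1))" || return 1
    fi
    run tar tvf "$NTAPE"
}
```

With DRY_RUN=1 set, backup_dirs /etc /home only prints the mt and tar commands it would run, which is a safe way to check the sequence before writing to a real tape.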

Troubleshooting

The following commands can be useful in determining problems with devices.

mtx

mtx -f /dev/sg0 inquiry
Product Type: Medium Changer
Vendor ID: 'EXABYTE '
Product ID: 'EXB-480         '
Revision: '2.18'
Attached Changer: No

tapeinfo

tapeinfo -f /dev/sg0
Product Type: Medium Changer
Vendor ID: 'EXABYTE '
Product ID: 'EXB-480         '
Revision: '2.18'
Attached Changer: No
SerialNumber: '67001141  '
SCSI ID: 0
SCSI LUN: 0
Ready: yes

loaderinfo

loaderinfo -f /dev/sg0
Product Type: Medium Changer
Vendor ID: 'EXABYTE '
Product ID: 'EXB-480         '
Revision: '2.18'
Attached Changer: No
Bar Code Reader: Yes
EAAP: Yes
Number of Medium Transport Elements: 1
Number of Storage Elements: 21
Number of Import/Export Element Elements: 1
Number of Data Transfer Elements: 1
Transport Geometry Descriptor Page: Yes
Invertable: No
Device Configuration Page: Yes
Can Transfer: Yes

List SCSI devices

The /dev/sg* nodes are SCSI generic devices, one for each attached SCSI device (including the disks used by the OS), which can be quite confusing. /proc/scsi/scsi will show you information about the attached SCSI devices:

cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: EXABYTE  Model: EXB-480          Rev: 2.18
  Type:   Medium Changer                   ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: ULTRIUM-TD1      Rev: 4561
  Type:   Sequential-Access                ANSI SCSI revision: 03
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: 3ware    Model: Logical Disk 00  Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: ffffffff
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: 3ware    Model: Logical Disk 01  Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: ffffffff
Host: scsi1 Channel: 00 Id: 02 Lun: 00
  Vendor: 3ware    Model: Logical Disk 02  Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: ffffffff
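
Since the listing mixes the tape hardware with the disks, a short awk filter can condense it to one line per device. This is a sketch that assumes the Host/Vendor/Type layout shown above; the function name scsi_summary is invented for this example.

```shell
#!/bin/sh
# Sketch: summarize /proc/scsi/scsi as one line per device.
# Assumption: the file uses the Host/Vendor/Type layout shown above.

# scsi_summary [FILE]: print "host id | vendor model | type" per device.
scsi_summary() {
    awk '
        /^Host:/ { host = $2; id = $6 }
        /Vendor:/ {
            # Keep the text between "Vendor:" and "Rev:", drop the
            # "Model:" label and squeeze the column padding.
            line = $0
            sub(/^ *Vendor: */, "", line)
            sub(/ *Rev:.*/, "", line)
            sub(/ +Model: */, " ", line)
            gsub(/  +/, " ", line)
            vendor_model = line
        }
        /Type:/ {
            # Keep only the device type, without the ANSI revision.
            line = $0
            sub(/^ *Type: */, "", line)
            sub(/ *ANSI.*/, "", line)
            print host " id " id " | " vendor_model " | " line
        }
    ' "${1:-/proc/scsi/scsi}"
}
```

For the listing above, the first line printed would be "scsi0 id 00 | EXABYTE EXB-480 | Medium Changer", which makes it easier to see that the changer and the tape drive share a host while the 3ware disks sit on another.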

Force tape location

TAPE=/dev/sg0 mtx status

/dev/st0 not ready

Try to reset the library and drives from the front panel.

Tape library commands

  • mtx status
  • mtx unload <slotnum> <drivenum> (Unloads media from drive <drivenum> into slot <slotnum>.)
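
The status output can be filtered to see which slots hold tapes before loading or unloading. A sketch, assuming the usual mtx status layout with lines such as "Storage Element 2:Full" and "Data Transfer Element 0:Empty"; the function names full_slots and drive_is_empty are invented for this example.

```shell
#!/bin/sh
# Sketch: pick out occupied slots from `mtx status` output.
# Assumption: status lines look like "Storage Element 2:Full ..." and
# "Data Transfer Element 0:Empty", the usual mtx layout.

# full_slots: read `mtx status` output on stdin, print occupied slot numbers.
full_slots() {
    # Anchoring at the line start skips the "Data Transfer Element" line,
    # which can also mention a storage element. Adding 0 to the third
    # field turns "2:Full" into the number 2.
    awk '/^ *Storage Element [0-9]+[^:]*:Full/ { print $3 + 0 }'
}

# drive_is_empty: succeed if data transfer element 0 reports Empty.
drive_is_empty() {
    grep -q '^ *Data Transfer Element 0:Empty'
}

# Example (needs the real library):
#   mtx -f /dev/sg0 status | full_slots
```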

Bootable USB recovery

Tape Backups

As of August 2011, the "new" HPC has a Dell PowerVault TL2000 tape library with the following characteristics:

  • Serial-attached SCSI
  • Has an LTO-4 tape drive (IBM-ULT3580-HH4)
  • Can handle up to 24 tapes

See: http://www.dell.com/us/enterprise/p/powervault-tl2000/pd

backup/tape.1320127412.txt.gz · Last modified: 2011/11/01 06:03 by aorth