HPC installation, June, 2011

Notes documenting the installation of the new HPC server in June, 2011. The machine is a Dell PowerEdge R910.

Machine specifications:

  • Quad Eight-core Xeon X7560
  • 128 GB RAM
  • 16 500GB, 7200RPM Serial-attached SCSI drives
  • Dell PowerEdge RAID Controller (PERC) H700, Manual

Pre-installation notes

Hardware RAID

The hardware RAID is configured to present the following virtual/logical drives to the OS:

  • Disks 0,1,2,3 → RAID5
  • Disks 4,5,6,7,8,9,10,11 → RAID5
  • Disks 12,13,14,15 → RAID5

… where disks are physically laid out as follows:

Disk 0 Disk 4 Disk 8 Disk 12
Disk 1 Disk 5 Disk 9 Disk 13
Disk 2 Disk 6 Disk 10 Disk 14
Disk 3 Disk 7 Disk 11 Disk 15

The first disk group must not be too large because we need to boot from it. CentOS uses the legacy GRUB1 bootloader, which only supports the MSDOS partition table format, meaning that partitions cannot be larger than ~2 TB (http://en.wikipedia.org/wiki/Master_boot_record). Disk groups 2 and 3 can be larger, and therefore use the GUID Partition Table (GPT).
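A rough check of the sizes involved, assuming RAID5 gives the capacity of all but one disk and that the msdos limit is 2^32 sectors of 512 bytes. The function name is made up for illustration:

```shell
# Usable capacity of a RAID5 group: one disk's worth of space goes to parity.
raid5_usable_gb() {
    disks=$1; size_gb=$2
    echo $(( (disks - 1) * size_gb ))
}

# An msdos (MBR) table addresses at most 2^32 sectors of 512 bytes each.
mbr_limit_gb=$(( 4294967296 * 512 / 1000000000 ))

echo "MBR limit:   ${mbr_limit_gb} GB"
echo "Disks 0-3:   $(raid5_usable_gb 4 500) GB"
echo "Disks 4-11:  $(raid5_usable_gb 8 500) GB"
```

By this estimate the four-disk boot group stays under the msdos limit, while the eight-disk group is well over it and needs GPT.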

It seems the hardware RAID controller automatically sets partition tables to GPT when you create virtual disks. CentOS's disk partitioning utility will gladly manipulate those partitions, but the installer will then refuse to install (because of the bootloader issue). Switch to a virtual terminal (Ctrl-Alt-F2, for example) and use parted to set the partition table to msdos on the drive which is to be your boot drive.

Enter parted and ignore any errors it spits out in the beginning:

parted /dev/sda
mklabel
	yes
	msdos
q

After that you can switch back to the installer (Ctrl-Alt-F6) and partition away.
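Before switching back, it may be worth confirming the label actually changed. A small sketch, assuming parted's print output contains a "Partition Table:" line (the helper name is hypothetical):

```shell
# Read `parted ... print` output on stdin and print the partition table type.
label_of() {
    awk -F': *' '/^Partition Table:/ { print $2 }'
}

# Usage (as root, on the real boot drive):
#   parted -s /dev/sda print | label_of
```

If this prints msdos for the boot drive, the installer should accept it.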

Post-installation notes

<note>If SSH seems slow, disable GSSAPIAuthentication in /etc/ssh/sshd_config. It tells SSH to try Kerberos-based authentication, and we don't use that here!</note>

<note>Turn off iptables for now or else you might get frustrated troubleshooting problems that don't exist!!

service iptables stop

</note>
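The GSSAPIAuthentication change can be made with sed instead of an editor. This is only a sketch: the function name is made up, and it touches active (uncommented) lines only, on the assumption that a commented-out line should be left alone:

```shell
# Set GSSAPIAuthentication to "no" on any active (uncommented) line.
# A copy of the original file is kept next to it with a .bak suffix.
disable_gssapi() {
    sed -i.bak 's/^[[:space:]]*GSSAPIAuthentication[[:space:]].*/GSSAPIAuthentication no/' "$1"
}

# Usage (as root), then restart sshd to apply:
#   disable_gssapi /etc/ssh/sshd_config
#   service sshd restart
```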

Preparation

Rsync with ACL support

CentOS's rsync doesn't support extended access control lists, which we use for Segolip's data. We need to compile a new version which has this support:

sudo yum install libacl-devel
wget http://rsync.samba.org/ftp/rsync/src/rsync-3.0.8.tar.gz
tar zxf rsync-3.0.8.tar.gz
cd rsync-3.0.8
./configure --prefix=/export/apps/rsync/3.0.8
make
sudo make install

Now we can use this special version of rsync later by calling: /export/apps/rsync/3.0.8/bin/rsync

Rocks service pack

Install the latest Rocks service pack. See documentation here: http://www.rocksclusters.org/roll-documentation/service-pack/5.4.2/

Users, groups, etc

Migrate existing users, groups, and home folders from the old server.

On the old server

Back up a few things and prepare for user/group migration…

/etc

Create a tarball of the system configuration files:

tar cjf /tmp/etc_june_2011.tar.bz2 /etc

Rocks MySQL databases

Rocks uses its own instance of mysql, in addition to the "system" mysql. At one point in time we thought it was smart to move some databases there so we wouldn't have to have two copies of mysql running at a time. Doesn't sound so nice now, but eh… we need to check what is in there and move it to the new server's system mysql.

mkdir rocks_mysql_dumps_june_2011
export HISTFILE=/dev/null
for dbname in `/opt/rocks/bin/mysql -u root -p'password' -Be 'show databases' | tail -n+3 | grep -v "#"`; do echo Dumping $dbname; /opt/rocks/bin/mysqldump -u root -p'password' --opt $dbname | bzip2 -c > rocks_mysql_dumps_june_2011/$dbname.sql.bz2; done
exit
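To move these dumps into the new server's system mysql later, something like the following could generate the restore commands for review. It is a sketch only: the function name is hypothetical, and it assumes the target database has to be created first, since mysqldump without --databases does not write a CREATE DATABASE statement:

```shell
# Print (not run) the commands needed to restore every dump in a directory.
restore_cmds() {
    dump_dir=$1
    for f in "$dump_dir"/*.sql.bz2; do
        [ -e "$f" ] || continue              # no dumps found: print nothing
        db=$(basename "$f" .sql.bz2)         # database name comes from the file name
        echo "mysqladmin -u root -p create $db"
        echo "bzip2 -dc $f | mysql -u root -p $db"
    done
}

# Usage: review the output, delete the lines for databases you do not want
# (the "mysql" system database, for example), then run what is left:
#   restore_cmds rocks_mysql_dumps_june_2011
```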

System MySQL databases

The regular CentOS system MySQL databases… just in case! Make sure mysqld is running then:

mkdir system_mysql_dumps_june_2011
export HISTFILE=/dev/null
for dbname in `mysql -u root -p'password' -Be 'show databases' | tail -n+3`; do echo Dumping $dbname; mysqldump -u root -p'password' --opt $dbname | bzip2 -c > system_mysql_dumps_june_2011/$dbname.sql.bz2; done
exit

Storix backup tool

Commercial tape backup software. Etienne has the installer, but let's back up the installed directories too, just in case:

sudo su -
tar cjf /mnt/export/home/aorth/storix.tar.bz2 /storix
tar cjf /mnt/export/home/aorth/opt_storix.tar.bz2 /opt/storix

On the new server

<note>You should obviously be root for this bit!!! We shouldn't have any other users at this point anyways, but just make sure you're root!</note>

Copy the tarball from the old server:

scp aorth@192.168.5.3:/tmp/etc_june_2011.tar.bz2 .
tar jxf etc_june_2011.tar.bz2

Migrate users and groups in /etc...

  • Manually copy all but system users, from around segoli at UID 658, into /etc/passwd
    • Delete old/duplicate users like tomcat, oracle, nfsnobody, condor, ensembl, ilri, zabbix, cluster
  • Copy users' passwords (same as users above) into /etc/shadow
    • Delete old/duplicates like above
  • Copy all but system groups into /etc/group
    • Delete old/duplicates/system groups like ilri, cluster, nfsnobody, tomcat, condor, zabbix, user, pbguest, robetta, screen
    • Make sure to copy important group memberships like ssh, gcc, wheel, etc…
  • Copy groups' passwords into /etc/gshadow
    • Delete old/duplicates/system like above
  • Edit /etc/sudoers to allow the wheel group to use sudo
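The passwd part of this can be pre-filtered rather than picked out by eye. A minimal sketch, assuming the cut-off really is UID 658 (the notes only say "around") and using the delete list from above. The function name is made up:

```shell
# Print passwd entries with UID >= min_uid, minus the accounts to drop.
extract_users() {
    min_uid=$1; passwd_file=$2
    awk -F: -v min="$min_uid" '$3 >= min' "$passwd_file" |
        grep -Ev '^(tomcat|oracle|nfsnobody|condor|ensembl|ilri|zabbix|cluster):'
}

# Usage, from the directory where the tarball was unpacked:
#   extract_users 658 etc/passwd
```

Review the output before appending it to /etc/passwd. The matching /etc/shadow lines still have to be picked by user name, since shadow has no UID field.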

Edit passwd to reflect home directory location

  • Replace /home/ with /export/home/ in /etc/passwd (vim or sed)
    • Run rocks sync users to get Rocks to go parse /etc/passwd and write a new /etc/auto.home with the new automounts for home directories
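For the sed route, a sketch along these lines should do. The function name is hypothetical, and the pattern is anchored on the leading colon so that only the home directory field is touched:

```shell
# Rewrite home directories from /home/... to /export/home/... in a passwd file.
# The original is kept next to it with a .bak suffix.
rewrite_homes() {
    sed -i.bak 's#:/home/#:/export/home/#' "$1"
}

# Usage (as root):
#   rewrite_homes /etc/passwd
#   rocks sync users
```

Running it twice is harmless, because an entry that already points at /export/home/ no longer matches the pattern.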

Copy homes

  • rsync -avz --exclude "segoli" --exclude "afischer" -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/mnt/export/home/ /export/home/
    • (or… with --delete if you're absolutely sure!!! PROBABLY NOT!!?)
    • not sure if the condor user is supposed to have a home?
  • Anne Fischer and Segoli are special cases: their home folders were on export2, with a symlink in /mnt/export/home. Copy their stuff to the new home partition.
    • Copy Anne's stuff:
      • First time: rsync -avz -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/mnt/export2/home/afischer /export/home/
      • Again for good measure!
    • Copy Segolip's stuff (needs rsync with ACL support from above!):
      • Add acl to /export's mount options in /etc/fstab and then remount /export: sudo mount -o remount /export
      • Sync: /export/apps/rsync/3.0.8/bin/rsync -avzA -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/mnt/export2/home/segoli /export/home/

Copy /mnt/export2

Make sure to exclude some old/unused stuff…

rsync -avz --exclude "home" --exclude "pbroot" --exclude "mysql" --exclude "u01" --exclude "u04" -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/mnt/export2/ /mnt/export2/

Copy /mnt/export3

The system's mysql directories live here (even though they are all old and we should probably start fresh!!), so we have to stop mysql before we rsync:

sudo service mysqld stop

Then copy everything over, excluding some old/unused stuff.

rsync -avz --delete-excluded --exclude "formatdb.log" --exclude "temp" --exclude "u03" --exclude "btk_backup" --exclude "segoli_backups" -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/mnt/export3/ /mnt/export3/

Clean up old mysql databases which don't exist anymore (they are sym links to non-existent places):

cd /mnt/export3/mysql
find . -type l -exec rm {} \;
ln -sv /export /mnt/
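Note that the find command above removes every symlink in the directory, not just the broken ones. If only the dangling links should go, a sketch like this lists them first (the function name is made up):

```shell
# List symlinks under a directory whose targets no longer exist.
# `test -e` follows the link, so it fails only for dangling ones.
dangling_links() {
    find "$1" -type l ! -exec test -e {} \; -print
}

# Usage: review the list, then delete with the same expression:
#   dangling_links /mnt/export3/mysql
#   find /mnt/export3/mysql -type l ! -exec test -e {} \; -exec rm -v {} \;
```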

Configure yum

Rocks 5.4 is based on CentOS 5.5, so you can use the repositories directly from CentOS.

CentOS-Base.repo

Copy a CentOS-Base.repo from an existing CentOS installation and place it in /etc/yum.repos.d/, for example:

# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client.  You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the 
# remarked out baseurl= line instead.
#
#

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5

#released updates 
[updates]
name=CentOS-$releasever - Updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5

#packages used/produced in the build but not released
[addons]
name=CentOS-$releasever - Addons
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=addons
#baseurl=http://mirror.centos.org/centos/$releasever/addons/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5

#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras
#baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5

#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus
#baseurl=http://mirror.centos.org/centos/$releasever/centosplus/$basearch/
gpgcheck=1
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5

#contrib - packages by Centos Users
[contrib]
name=CentOS-$releasever - Contrib
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=contrib
#baseurl=http://mirror.centos.org/centos/$releasever/contrib/$basearch/
gpgcheck=1
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5

Excludes

We don't want to update packages that Rocks depends on, or else we might break it! See the Rocks wiki for a list of excludes, and place them in /etc/yum.conf in the [main] section:

exclude=dapl* ibutils* infiniband* kernel-ib* libibumad* libibmad* libibverbs* libmthca* libibverbs* libnes* librdmacm* ofed* qperf* libmlx4* compat-dapl* opensm* libcxgb3* libibcom* openib* openmpi* ibvexdmtools* libibpathverbs* libipathverbs* libsdp* srptools* ibsim*

Software Installation

From yum

Some random, one-off things off the top of my head:

yum install screen strace

Environment Modules

https://wiki.rocksclusters.org/wiki/index.php/Rolls_Working_Group

rocks add roll /home/aorth/src/modules-5.4-1.x86_64.disk1.iso
rocks enable roll modules
cd /export/rocks/install
rocks create distro
rocks run roll modules >> installroll.sh
sh installroll.sh
reboot

Module files

Copy module files from HPC to /export/apps/

rsync -avz --delete-excluded --exclude "module-info" --exclude "module-cvs" --exclude "null" --exclude "use.own" --exclude "modules" --exclude "dot" --exclude "common" -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/opt/modules/modulefiles/ /export/apps/modules/modulefiles/

Edit module files to reflect current application locations! For example, update the symlinks for the "latest" versions of apps:

sudo su -
cd /export/apps/blast/
ln -sv 2.2.25+ latest

R Statistics

http://www.r-project.org/

tar zxf R-2.13.0.tar.gz
cd R-2.13.0
./configure --prefix=/export/apps/R/2.13.0
make
sudo make install

Repeat the process for R 2.12.2 and 2.11.0. Users may have work that depends on those versions, and we don't want to cause incompatibilities for them! Also, make sure the latest version has a symlink for itself!
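The repeated builds can be scripted. This sketch only prints the commands for each version so they can be checked first; the function name is hypothetical, and it assumes the source tarballs are named R-VERSION.tar.gz and sit in the current directory:

```shell
# Print (not run) the build commands for one R version.
r_build_cmds() {
    ver=$1
    cat <<EOF
tar zxf R-$ver.tar.gz
cd R-$ver
./configure --prefix=/export/apps/R/$ver
make
sudo make install
cd ..
EOF
}

# Show the commands for the two older versions.
for ver in 2.12.2 2.11.0; do
    r_build_cmds "$ver"
done
```

To actually build, pipe the output of one call to sh, for example: r_build_cmds 2.12.2 | sh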

sudo su -
cd /export/apps/R
ln -sv 2.13.0 latest

Make a similar one for the module file…

NCBI BLAST+

Install

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/

sudo mkdir -p /export/apps/blast
tar zxf ncbi-blast-2.2.25+-x64-linux.tar.gz
sudo cp -r ncbi-blast-2.2.25+ /export/apps/blast/2.2.25+

Create a symlink for the "latest" version of blast:

sudo su -
cd /export/apps/blast
ln -sv 2.2.25+ latest

Setup BLAST+ databases

Download a few databases from NCBI. I think a good location on the new server would be /export/bio/ncbi/db (as that is the Rocks default anyways):

sudo su -
cd /export/bio/ncbi/db
for name in nr.*gz; do tar zxf $name; done
/export/apps/blast/2.2.25+/bin/update_blastdb.pl env_nr nr nt
for name in *tar.gz; do tar zxf $name; done
rm *.gz *.md5

Test BLAST+

Pass the BLASTDB variable to blastp to see if it can find the env_nr database we just unzipped.

BLASTDB=/export/bio/ncbi/db /export/apps/blast/2.2.25+/bin/blastp -db env_nr

Structure

http://pritch.bsd.uchicago.edu/software.html

tar zxf structure_kernel_source.tar.gz
cd structure_kernel_src
make
sudo mkdir -p /export/apps/structure/2.3.3/bin
sudo cp structure /export/apps/structure/2.3.3/bin/

MrBayes

http://mrbayes.csit.fsu.edu/

tar zxf mrbayes-3.1.2.tar.gz
cd mrbayes-3.1.2
make
sudo mkdir -p /export/apps/mrbayes/3.1.2/bin/
sudo cp mb /export/apps/mrbayes/3.1.2/bin/

BEAST

http://beast.bio.ed.ac.uk/

tar zxf BEASTv1.6.1.tgz
sudo mkdir -p /export/apps/BEAST
sudo cp -r BEASTv1.6.1 /export/apps/BEAST/1.6.1

Python

http://www.python.org/

tar jxf Python-2.7.1.tar.bz2
cd Python-2.7.1
./configure --prefix=/export/apps/python/2.7.1
make
sudo make install

Python - NumPy

http://numpy.scipy.org/

tar zxf numpy-1.6.0.tar.gz
cd numpy-1.6.0
/export/apps/python/2.7.1/bin/python setup.py build
sudo /export/apps/python/2.7.1/bin/python setup.py install

Python - BioPython

http://biopython.org/

tar zxf biopython-1.57.tar.gz
cd biopython-1.57
/export/apps/python/2.7.1/bin/python setup.py build
sudo /export/apps/python/2.7.1/bin/python setup.py install

Dendroscope

http://ab.inf.uni-tuebingen.de/software/dendroscope/

chmod +x Dendroscope_unix_2_7_4.sh
sudo ./Dendroscope_unix_2_7_4.sh

Tell the installer to install it to /export/apps/dendroscope/2.7.4

Samba

Install

sudo yum install samba

Configure

  1. Edit /etc/samba/smb.conf to migrate shares and set bind, listen, allowed hosts, etc:

    workgroup = ILRI
    server string = Samba Server Version %v

    netbios name = HPC

    interfaces = lo eth1
    hosts deny = ALL
    hosts allow = 127. 192.168.5.1/32

  2. Convert the old smbpasswd file to tdbsam (this uses the smbpasswd file we backed up, so run it from the directory where the etc tarball was unpacked):
    1. pdbedit -i smbpasswd:etc/samba/smbpasswd -e tdbsam:/etc/samba/passdb.tdb
  3. Start samba:
    1. sudo service smb start
  4. Set samba to start when the system boots:
    1. sudo chkconfig smb on

IPtables firewall rules

Add the following to /etc/sysconfig/iptables:

# ssh rules
# accept SSH from 192.168.5.1 (ILRI corporate network)
-A INPUT -m state --state NEW -p tcp --dport ssh --source 192.168.5.1 -j ACCEPT
# allow all SSH access (even external to ILRI)
-A INPUT -m state --state NEW -p tcp --dport ssh -j ACCEPT
# need a way to specify the range of IP (ie, NOT 192.168.5.1, which is the corporate network here)
#-I INPUT -p tcp --dport 22 -i eth0 -m state --state NEW -m recent --set
#-I INPUT -p tcp --dport 22 -i eth0 -m state --state NEW -m recent --update --seconds 60 --hitcount 4 -j DROP

# NFS for ILRI corporate network
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p udp --dport 111 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p tcp --dport 111 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p tcp --dport 2049 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p tcp --dport 32803 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p udp --dport 32769 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p tcp --dport 892 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p udp --dport 892 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p tcp --dport 875 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p udp --dport 875 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p tcp --dport 662 -j ACCEPT
-A INPUT -s 192.168.5.1/32 -m state --state NEW -p udp --dport 662 -j ACCEPT

# samba rules
-A INPUT -m state --state NEW -p tcp --dport 135 -j ACCEPT
-A INPUT -m state --state NEW -p udp --dport 137 -j ACCEPT
-A INPUT -m state --state NEW -p udp --dport 138 -j ACCEPT
-A INPUT -m state --state NEW -p tcp --dport 139 -j ACCEPT
-A INPUT -m state --state NEW -p tcp --dport 445 -j ACCEPT

# Zabbix monitoring
#-A INPUT -m state --state NEW -p tcp --dport 10050 -j ACCEPT

# SWAT rules
-A INPUT -m state --state NEW -p tcp --dport 901 -j ACCEPT
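The NFS block above is one rule repeated per protocol and port, so it can be generated rather than typed. A sketch, with a made-up function name and the port list copied from the rules above:

```shell
# Print one ACCEPT rule per NFS-related protocol/port for a given source.
nfs_rules() {
    src=$1
    for spec in udp:111 tcp:111 tcp:2049 tcp:32803 udp:32769 \
                tcp:892 udp:892 tcp:875 udp:875 tcp:662 udp:662; do
        proto=${spec%%:*}    # part before the colon
        port=${spec##*:}     # part after the colon
        echo "-A INPUT -s $src -m state --state NEW -p $proto --dport $port -j ACCEPT"
    done
}

nfs_rules 192.168.5.1/32
```

Apart from 111 and 2049, these ports only stay put if the NFS helper daemons are pinned to them (normally in /etc/sysconfig/nfs), which these notes don't show, so treat that part as an assumption.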

Viroblast

A cool web-based BLAST+ interface. Seems to be modeled on wwwblast, but ported to PHP and BLAST+. From University of Washington.

Install

sudo su -
cd /var/www/html
tar xvfp /home/aorth/viroblast-2.2.tar.gz
cd viroblast

Setup databases

First, make symlinks to the various installed databases (pre-formatted from NCBI):

ln -sv /export/bio/ncbi/db/nt.* /var/www/html/viroblast/db/nucleotide/
ln -sv /export/bio/ncbi/db/nr.* /var/www/html/viroblast/db/protein/

Then setup the viroblast.ini file to add the databases:

blastn: test_na_db => Nucleotide test database, nt => NCBI nt (June 2011)
blastp: test_aa_db => Protein test database, nr => NCBI nr (June 2011)

Update the blast+ version

Viroblast 2.2 ships with BLAST+ 2.2.24. Replace it with a symlink to the latest version we have installed:

rm -rf blast+
ln -sv /export/apps/blast/latest blast+

Apache configuration

Tell Apache to load viroblast.php instead of index.php. Create /etc/httpd/conf.d/viroblast.conf:

<Directory /var/www/html/viroblast>
        DirectoryIndex viroblast.php
</Directory>

# redirect requests to paracel's blast web interface to viroblast
<IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteRule ^/bwb(/.*)?$ http://hpctest.ilri.cgiar.org/viroblast [R=permanent,L]
</IfModule>

Restart Apache:

sudo apachectl graceful

Test with a small data set to make sure it works!

SSH configuration

Files to consider:

  • /etc/ssh/sshd_config
  • /etc/security/sshd_access.conf
  • Keys in /etc/ssh

Apache

/var/www/html

Move the stock Rocks index.html

sudo su -
cd /var/www/html
mv index.html index.html.rocks

Synchronize the other files and folders

rsync -avz --exclude "blast" --exclude "bwb" --exclude "ganglia" --exclude "gromacs" --exclude "install" --exclude "misc" --exclude "phpsysinfo" --exclude "robots.txt" --exclude "roll-documentation" --exclude "rss" --exclude "ppp-web.beforehellen" --exclude "t_coffee" --exclude "wiki" --exclude "wordpress" --exclude "tripwire" --exclude "RCS" -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/var/www/html/ /var/www/html/

/var/www/cgi-bin

Copy the contents of cgi-bin

rsync -avz -e "ssh -i /root/.ssh/hpc_id_rsa" 192.168.5.3:/var/www/cgi-bin/ /var/www/cgi-bin/

httpd.conf

Change at least the following in Apache's main config file, /etc/httpd/conf/httpd.conf:

ServerAdmin a.orth@cgiar.org
<Directory "/var/www/html">
    Options FollowSymLinks
    AllowOverride Options
</Directory>
<IfModule mod_userdir.c>
    #UserDir disable
    UserDir public_html
</IfModule>

Other configs

Assuming you have a backup of the old HPC's /etc in your folder, copy the following to the new server's Apache config directory:

cp etc/httpd/conf.d/{artemis,iprscan,mobyle,ppp-web}.conf /etc/httpd/conf.d/
cp etc/httpd/conf.d/ppp-web.passwd /etc/httpd/conf.d/

mod_perl

Certain CGI web applications will need mod_perl and some perl modules, so let's install them to preempt any problems!

sudo yum install mod_perl perl-XML-Parser perl-XML-Simple perl-DBI perl-DateManip perl-libwww-perl perl-Convert-ASN1 perl-SGMLSpm perl-Compress-Zlib perl-HTML-Tagset perl-HTML-Parser perl-BSD-Resource perl-DBD-MySQL perl-String-CRC32 perl-MailTools perl-IO-String perl-URI

Restart Apache

apachectl graceful

Disk Quotas

Implementation

  • Add "usrquota,grpquota" to /export's entry in /etc/fstab
  • Restart the computer
  • Create quota files on the file system and check current disk usage:
    • quotacheck -cug /export/
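The fstab edit can be previewed before touching the real file. A minimal sketch: the function name is hypothetical, it prints the result to stdout instead of editing in place, and the edited line loses its column alignment because awk rebuilds it with single spaces:

```shell
# Print an fstab with extra options appended to one mount point's options field.
add_mount_opts() {
    fstab=$1; mnt=$2; opts=$3
    awk -v mnt="$mnt" -v opts="$opts" '
        $1 !~ /^#/ && $2 == mnt { $4 = $4 "," opts }   # fourth field holds the mount options
        { print }
    ' "$fstab"
}

# Usage: compare against the current file, then copy into place as root:
#   add_mount_opts /etc/fstab /export usrquota,grpquota > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

The same helper would cover the acl option added to /export for the rsync step earlier.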

Hardware RAID controller tools

The Dell PERC H700 is a re-branded LSI MegaRAID:

lspci -v | grep LSI
01:00.0 RAID bus controller: LSI Logic / Symbios Logic LSI MegaSAS 9260 (rev 04)

Drivers can be found here: http://www.lsi.com/storage_home/products_home/internal_raid/megaraid_sas/6gb_s_value_line/sas9260-8i/index.html

Installation

sudo rpm -ivh Lib_Utils-1.00-08.noarch.rpm MegaCli-8.00.46-1.i386.rpm

Testing

List available controllers:

sudo /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL
upgrade_notes/hpc_june_2011.1308384331.txt.gz · Last modified: by aorth