====== Oyster River Protocol ======
===== Information =====
  * Version: 2.2.8
  * Added: October, 2019
  * Link: https://github.com/macmanes-lab/Oyster_River_Protocol
===== Usage =====
List the versions of orp that are available:
$ module avail orp
Load a particular version into your environment and run it:
$ module load orp/2.2.8
$ /export/apps/orp/2.2.8/oyster.mk
Read [[https://oyster-river-protocol.readthedocs.io/en/latest/|the documentation]] for more information about running Oyster River Protocol.
===== Installation =====
Notes from the sysadmin during installation. First, create a chroot on the local filesystem to perform the installation.
$ mkdir -p /var/tmp/chroot/orp
$ rpm --rebuilddb --root=/var/tmp/chroot/orp
$ wget https://hpc.ilri.cgiar.org/mirror/centos/7/os/x86_64/Packages/centos-release-7-7.1908.0.el7.centos.x86_64.rpm
$ sudo rpm --root=/var/tmp/chroot/orp -i centos-release-7-7.1908.0.el7.centos.x86_64.rpm
$ sudo yum --installroot=/var/tmp/chroot/orp install -y rpm-build yum wget vim git
$ sudo cp /etc/resolv.conf /var/tmp/chroot/orp/etc
$ sudo mount --bind /dev/ /var/tmp/chroot/orp/dev
$ sudo mount -t proc procfs /var/tmp/chroot/orp/proc
$ sudo mount -t sysfs sysfs /var/tmp/chroot/orp/sys
$ sudo chroot /var/tmp/chroot/orp
Then, after entering the chroot, loosely follow the instructions in the Dockerfile and the Makefile until the orp virtual environment is activated:
# wget https://github.com/macmanes-lab/Oyster_River_Protocol/archive/2.2.8.tar.gz
# tar xf 2.2.8.tar.gz
# mv Oyster_River_Protocol-2.2.8 /export/apps/orp/2.2.8
# cd /export/apps/orp/2.2.8
# wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# bash Miniconda3-latest-Linux-x86_64.sh -b -p software/anaconda/install
# source software/anaconda/install/bin/activate
# conda update -y -n base conda
# conda install -y pycryptosat
# conda config --set sat_solver pycryptosat
# conda env create -f py37_env.yml python=3.7
# conda activate orp
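Before continuing, it can help to confirm that the orp environment really is the active one. The sketch below relies on ''CONDA_DEFAULT_ENV'', which Conda sets to the name of the active environment; the function name ''require_conda_env'' is made up for this page and is not part of Conda or Oyster River Protocol.

```shell
# Hypothetical helper (not part of Conda or ORP): succeed only if the
# named Conda environment is the active one.
require_conda_env() {
    wanted="$1"
    # conda activate exports CONDA_DEFAULT_ENV with the environment name.
    if [ "${CONDA_DEFAULT_ENV:-}" != "$wanted" ]; then
        echo "expected Conda environment '$wanted', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
    return 0
}
```

Running ''require_conda_env orp || exit 1'' at the top of a build script stops it early if the activation step was skipped.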
Set up a few extra pieces of software and download some data sets (see the Makefile):
# git clone -b 2.0.1 https://github.com/bcgsc/transabyss.git software/transabyss
# mkdir software/diamond
# cd software/diamond && curl -LO ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz && gzip -d uniprot_sprot.fasta.gz
# diamond makedb --in uniprot_sprot.fasta -d swissprot
# cd /export/apps/orp/2.2.8
# mkdir busco_dbs
# cd busco_dbs
# wget http://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz && tar -zxf eukaryota_odb9.tar.gz
# cd /export/apps/orp/2.2.8
# cd software && tar -zxf orp-transrate.tar.gz
# cd /export/apps/orp/2.2.8
# Work around several Python issues:
## 1. In the Docker installation (Ubuntu 18.04) python is Python 2, so we need to call Python 2 explicitly here, because Python 3 is the default inside the Anaconda environment
## 2. Install and upgrade pip for Python 2, because the EPEL version of pip is old and selects the Python 3 version of scipy
## 3. Install GCC 6 (devtoolset-6) to fix an error caused by the old compiler: gcc: error: unrecognized command line option ‘-fno-plt’
## 4. Explicitly use Python 2 in the shebang of orthofuser.py
# yum install epel-release
# yum install python2-pip python2-devel
# yum install centos-release-scl
# yum install devtoolset-6
# pip2 install --upgrade pip
# scl enable devtoolset-6 bash
# pip2 install python-igraph scipy numpy
# sed -i '1 s/python/python2/' software/OrthoFinder/orthofinder/orthofuser.py
# exit # from devtoolset-6
# sed -i 's#/home/ubuntu/Oyster_River_Protocol/#/export/apps/orp/2.2.8/#' software/config.ini
# env > orp
# exit # from chroot
$ sudo mkdir -p /export/apps/orp/2.2.8
$ sudo chown aorth /export/apps/orp/2.2.8
$ rsync -av /var/tmp/chroot/orp/export/apps/orp/2.2.8/ /export/apps/orp/2.2.8
$ sudo chown -R root:root /export/apps/orp/2.2.8
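The two ''sed'' edits above are easy to get wrong, so here is a small sketch that replays them on throwaway files. The file names and their contents are invented stand-ins for ''orthofuser.py'' and ''config.ini'', and the in-place flag is used the way GNU sed (as shipped with CentOS) accepts it.

```shell
# Work in a scratch directory so nothing real is touched.
tmpdir=$(mktemp -d)

# Invented stand-ins for orthofuser.py and config.ini.
printf '%s\n' '#!/usr/bin/env python' 'print("python stays on line 2")' > "$tmpdir/script.py"
printf '%s\n' 'lineage_path = /home/ubuntu/Oyster_River_Protocol/busco_dbs/eukaryota_odb9' > "$tmpdir/config.ini"

# The address "1" restricts the substitution to the first line (the shebang).
sed -i '1 s/python/python2/' "$tmpdir/script.py"

# Using "#" as the delimiter avoids escaping every "/" in the paths.
sed -i 's#/home/ubuntu/Oyster_River_Protocol/#/export/apps/orp/2.2.8/#' "$tmpdir/config.ini"

shebang=$(head -n 1 "$tmpdir/script.py")
second=$(sed -n '2p' "$tmpdir/script.py")
config=$(cat "$tmpdir/config.ini")
printf '%s\n' "$shebang" "$second" "$config"

rm -rf "$tmpdir"
```

Only the shebang gains the ''2'' suffix; the second line of the script is left alone, and the path prefix in the config line is swapped for the install location.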
Oyster River Protocol has dozens of dependencies and is essentially impossible to install without [[https://conda.io/miniconda.html|Miniconda]]. My strategy is to install Conda in a globally accessible location and then use it to install Oyster River Protocol. Furthermore, I first installed orp/2.2.8 in a chroot on the local file system and then rsync'd it over to the network applications directory, which is MUCH faster than installing directly onto the network file system.
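The notes do not record what happened to the chroot afterwards. Assuming the ''dev'', ''proc'' and ''sys'' mounts made during setup are still active, a cleanup could look roughly like the sketch below. The function name and the ''DRY_RUN'' switch are made up for this page, and by default the commands are only printed.

```shell
# Sketch: release the chroot's pseudo-filesystems in reverse mount order.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
chroot_root=/var/tmp/chroot/orp

release_chroot_mounts() {
    root="$1"
    # Mounted as dev, proc, sys, so unmount as sys, proc, dev.
    for fs in sys proc dev; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo umount $root/$fs"
        else
            sudo umount "$root/$fs"
        fi
    done
}

release_chroot_mounts "$chroot_root"
```

Once the mounts are gone, the directory under ''/var/tmp/chroot'' can be removed without touching the host's ''/dev''.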
To create the [[https://github.com/ilri/hpc-environment-modules/tree/master/orp|modulefile]] I compared the output of ''env'' before and after loading the orp environment with Conda.
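The comparison can be sketched as follows. To keep the example runnable anywhere, an exported dummy variable (''ORP_DEMO_HOME'', invented for this page) stands in for ''conda activate orp''; on the real system the activation command would go in its place.

```shell
# Snapshot the environment, sorted so the two files can be compared with comm.
before=$(mktemp)
after=$(mktemp)
env | LC_ALL=C sort > "$before"

# Stand-in for "conda activate orp"; ORP_DEMO_HOME is a made-up variable.
export ORP_DEMO_HOME=/export/apps/orp/2.2.8

env | LC_ALL=C sort > "$after"

# comm -13 keeps only lines unique to the second file, that is,
# variables that were added or whose value changed.
changed=$(LC_ALL=C comm -13 "$before" "$after")
printf '%s\n' "$changed"

rm -f "$before" "$after"
```

Variables that Conda modifies, such as ''PATH'', show up with their new value, which is what the modulefile has to reproduce.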