RepeatMasker is a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences. The output of the program is a detailed annotation of the repeats present in the query sequence, as well as a modified version of the query sequence in which all annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program.
List the versions of RepeatMasker that are available:
$ module avail repeatmasker
Load one version into your environment and run it:
$ module load repeatmasker/4.1.1
$ RepeatMasker
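As a rough illustration of a typical run, the sketch below masks a FASTA file and then reports what percentage of its bases are Ns. The input name `genome.fa`, the output directory `rm_out`, and the helper `masked_fraction` are hypothetical and not part of this installation; the `-species`, `-pa`, and `-dir` options are standard RepeatMasker options, but the values shown are only examples. Note that the count includes any Ns already present in the input (for example assembly gaps), so treat the figure as an upper bound on what was masked.

```shell
# Hypothetical input file; replace with your own FASTA.
query="genome.fa"

# Hypothetical helper: print the percentage of bases that are N in a FASTA file.
masked_fraction() {
    awk '!/^>/ { total += length($0); masked += gsub(/[Nn]/, "") }
         END   { if (total > 0) printf "%.2f\n", 100 * masked / total; else print "0.00" }' "$1"
}

# Run only if RepeatMasker is on the PATH (after "module load") and the input exists.
if command -v RepeatMasker >/dev/null 2>&1 && [ -f "$query" ]; then
    # Example options: human repeat library, 4 parallel jobs, results written to rm_out/.
    RepeatMasker -species human -pa 4 -dir rm_out "$query"
    # RepeatMasker names the masked sequence after the input file plus ".masked".
    masked_fraction "rm_out/$(basename "$query").masked"
fi
```

On a shared cluster, a run like this would normally be submitted through the job scheduler rather than started on the login node.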
Notes from the sysadmin during installation:
$ cd /tmp
$ wget http://www.repeatmasker.org/RepeatMasker/RepeatMasker-4.1.1.tar.gz
$ tar xf RepeatMasker-4.1.1.tar.gz
$ sudo mkdir -p /export/apps/repeatmasker/4.1.1
$ sudo chown aorth:aorth /export/apps/repeatmasker/4.1.1
$ python3 -m venv /export/apps/repeatmasker/4.1.1/venv
$ source /export/apps/repeatmasker/4.1.1/venv/bin/activate
$ pip install h5py
$ cp -r RepeatMasker/* /export/apps/repeatmasker/4.1.1
$ cd /export/apps/repeatmasker/4.1.1
# Here you have to manually enter the paths to trf, hmmer, and rmblast
$ perl ./configure
$ deactivate
$ sudo chown -R root:root /export/apps/repeatmasker/4.1.1
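Because `perl ./configure` prompts for the locations of trf, hmmer, and rmblast, it can help to look them up beforehand. The following is a minimal sketch that assumes the tools are already on the PATH (for example after loading their modules) under the binary names `trf`, `nhmmscan`, and `rmblastn`; those names and the helper `find_tool` are assumptions, and your installation may use different names or versioned file names.

```shell
# Hypothetical helper: print the full path of a tool, or NOT FOUND if it is not on the PATH.
find_tool() {
    path=$(command -v "$1" 2>/dev/null) || { echo "NOT FOUND"; return 0; }
    echo "$path"
}

# Assumed binary names for TRF, HMMER, and RMBlast; adjust to match your system.
for tool in trf nhmmscan rmblastn; do
    printf '%-10s %s\n' "$tool" "$(find_tool "$tool")"
done
```

Any tool reported as NOT FOUND needs to be installed or loaded first, or its path entered by hand at the configure prompt.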