RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program.
List the versions of RepeatMasker that are available:
$ module avail repeatmasker
Load one version into your environment and run it:
$ module load repeatmasker/4.1.2-p1
$ RepeatMasker
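Since the default masking replaces annotated repeats with Ns, a quick sanity check after a run is to measure how much of the masked output consists of N. The helper below is a minimal sketch, not part of RepeatMasker: the function name `masked_fraction` and the file name `query.fa.masked` are hypothetical, and it assumes a plain FASTA file. Note that it counts every N, including any that were already present in the input sequence, so treat the result as an upper bound on what was masked.

```shell
# masked_fraction FILE
# Print the percentage of bases that are N in a FASTA file (hypothetical helper).
masked_fraction() {
    awk '
        /^>/ { next }                   # skip FASTA header lines
        {
            total += length($0)         # count every base on this sequence line
            masked += gsub(/N/, "")     # gsub returns how many Ns it replaced
        }
        END {
            if (total == 0) { print "0.00"; exit }
            printf "%.2f\n", 100 * masked / total
        }
    ' "$1"
}

# Example usage (hypothetical file name; assumes a masked FASTA already exists):
# masked_fraction query.fa.masked
```

For a human genome the figure would be expected to land somewhere near the 56% mentioned above, though the exact value depends on the assembly and the repeat library in use.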
Notes from the sysadmin during installation:
$ cd /tmp
$ wget http://www.repeatmasker.org/RepeatMasker/RepeatMasker-4.1.2-p1.tar.gz
$ tar xf RepeatMasker-4.1.2-p1.tar.gz
$ sudo mkdir -p /export/apps/repeatmasker/4.1.2-p1
$ sudo chown aorth:aorth /export/apps/repeatmasker/4.1.2-p1
$ python3 -m venv /export/apps/repeatmasker/4.1.2-p1/venv
$ source /export/apps/repeatmasker/4.1.2-p1/venv/bin/activate
$ cp -r RepeatMasker/* /export/apps/repeatmasker/4.1.2-p1
$ cd /export/apps/repeatmasker/4.1.2-p1
# Here you have to manually enter the paths to trf 4.0.9, hmmer 3.2.1, and rmblast 2.11.0
$ perl ./configure
$ deactivate
$ sudo chown -R root:root /export/apps/repeatmasker/4.1.2-p1