Prokka
Prokka is a software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files.
Information
- Version: 1.14.6
- Added: February, 2017
- Updated: May, 2020
Usage
See which versions are available:
$ module avail prokka
Load one version into your environment and run it:
$ module load prokka/1.14.6
$ prokka
Note: Use the ''--cpus'' option to tell Prokka how many CPUs to use; otherwise it defaults to eight (8). This number should match the number of CPUs you requested in your SLURM batch allocation.
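One way to keep the two numbers in sync is to pass SLURM's own CPU count to Prokka instead of typing it twice. The following is a minimal sketch of a batch script, not a tested site recipe: it assumes SLURM sets ''SLURM_CPUS_PER_TASK'' (which it does when ''--cpus-per-task'' is requested), and the job name, ''contigs.fasta'', ''annotation'' and ''sample'' are placeholder names.

```shell
#!/bin/bash
#SBATCH --job-name=prokka-annotate   # placeholder job name
#SBATCH --cpus-per-task=4            # CPUs requested from SLURM

# SLURM exports SLURM_CPUS_PER_TASK when --cpus-per-task is set;
# fall back to 1 if the script is run outside a SLURM allocation.
cpus="${SLURM_CPUS_PER_TASK:-1}"

# Load the module only where the "module" command is available.
if command -v module >/dev/null 2>&1; then
    module load prokka/1.14.6
fi

# contigs.fasta, annotation and sample are placeholders for your own
# input file, output directory and output file prefix.
if command -v prokka >/dev/null 2>&1; then
    prokka --cpus "$cpus" --outdir annotation --prefix sample contigs.fasta
else
    echo "prokka not found on PATH; is the module loaded?" >&2
fi
```

Submitted with ''sbatch'', this would run Prokka with four CPUs; changing the ''--cpus-per-task'' line is then the only edit needed to scale the job up or down.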
Installation
Notes from the sysadmin during installation:
$ cd /tmp
$ wget https://github.com/tseemann/prokka/archive/v1.14.6.tar.gz
$ tar xf v1.14.6.tar.gz
$ cd prokka-1.14.6
$ mkdir perl5
$ cpanm -l perl5 Time::Piece XML::Simple Digest::MD5 Module::Build
$ export PERL5LIB=perl5/lib/perl5
$ grep -rh -oE "use Bio::.*$" bin/* binaries/* | sort -u | awk '{print $2}' | sed 's/;//'
$ cpanm -l perl5 Bio::AlignIO Bio::Root::Version Bio::SearchIO Bio::Seq Bio::SeqFeature::Generic Bio::SeqIO Bio::Tools::CodonTable Bio::Tools::GFF Bio::Tools::GuessSeqFormat --force
# we apparently also need Bio::SearchIO::hmmer3, but I only discovered that weeks later after trying to test prokka... hmmmm
$ cpanm -l perl5 Bio::SearchIO::hmmer3 --force
$ ./bin/prokka --setupdb
# Upgrade tbl2asn, see: https://github.com/tseemann/prokka/issues/511
$ wget https://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/tbl2asn/linux64.tbl2asn.gz
$ gunzip linux64.tbl2asn.gz
$ chmod +x linux64.tbl2asn
$ mv linux64.tbl2asn binaries/linux/tbl2asn
$ sudo mkdir -p /export/apps/prokka/1.14.6
$ sudo cp -r . /export/apps/prokka/1.14.6
Note 1: Prokka only says that it requires "BioPerl", but the BioPerl project discourages depending on the entire distribution. Here I have attempted to guess which individual modules are needed by checking the actual Perl "use" statements in Prokka's scripts.
Note 2: You can try without ''--force'' if you want, but many Perl modules "bail out" during installation because one out of a few hundred (or thousand) obscure or comprehensive tests fails.
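The module-guessing step from Note 1 can be tried on a small sample to see what the pipeline produces. The sketch below runs the same ''grep | sort | awk | sed'' chain on a throwaway file; the file name and its contents are made up for illustration and are not taken from Prokka.

```shell
# Create a throwaway Perl file to scan (sample content, not from prokka).
tmpdir=$(mktemp -d)
cat > "$tmpdir/example.pl" <<'EOF'
use strict;
use Bio::SeqIO;
use Bio::Seq;
use Time::Piece;
use Bio::SeqIO;
EOF

# Same pipeline as in the installation notes, pointed at the sample:
#   grep  keeps only the "use Bio::..." statements (-o prints just the match)
#   sort  -u removes duplicates
#   awk   keeps the second word, i.e. the module name with its semicolon
#   sed   strips the trailing semicolon
modules=$(grep -rh -oE 'use Bio::.*$' "$tmpdir" | sort -u | awk '{print $2}' | sed 's/;//')
echo "$modules"

# Clean up the sample directory.
rm -rf "$tmpdir"
```

The result is one module name per line, ready to hand to ''cpanm''. A scan like this only sees literal "use" lines, so any module that is loaded at run time would be missed; that may be why Bio::SearchIO::hmmer3 only surfaced later during testing.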