biological-databases

Differences

This shows you the differences between two versions of the page.

biological-databases [2019/03/13 19:33]
aorth
biological-databases [2020/04/08 11:14] (current)
aorth
Line 1: Line 1:
 ====== Biological Sequence Databases on the HPC ======
 +
 +~~NOTOC~~
  
 Some of the most common biological sequence databases are available on the HPC for you to use with tools like BLAST. Below you can find the list of them, their location on the system, and the last time they were updated.
Line 5: Line 7:
 We endeavor to keep this list updated as the One True List™.
  
-^Name        ^Version number   ^Last updated                ^Where it resides              ^How to use   ^
-|NCBI nr/nt nucleotide collection|N/A|24 Nov 2018|''/export/data/bio/ncbi/blast/db''|use ''BLASTDB=/export/data/bio/ncbi/blast/db'' and ''blastn ... -db nt'' in your Bash script|
-|NCBI nr/nt protein collection|N/A|16 Aug 2018|''/export/data/bio/ncbi/blast/db''|use ''BLASTDB=/export/data/bio/ncbi/blast/db'' and ''blastp ... -db nr'' in your Bash script|
-|UniProt's UniProtKB/Swiss-Prot (manually curated, most reliable)|N/A|?|''/export/data/bio/uniprot/blast/db''|use ''BLASTDB=/export/data/bio/uniprot/blast/db'' in your Bash script|
-|UniProt's UniProtKB/TrEMBL (automated curation)|N/A|?|''/export/data/bio/uniprot/blast/db''|use ''BLASTDB=/export/data/bio/uniprot/blast/db'' in your Bash script|
-|UniProt's UniRef100|N/A|?|''/export/data/bio/uniprot/blast/db''|use ''BLASTDB=/export/data/bio/uniprot/blast/db'' in your Bash script|
+^ Name ^ Comments ^ Updated¹ ^ Database Location ^
+| nt | NCBI nucleotide collection (v5²) | Mar 24, 2020 | ''/export/data/bio/ncbi/blast/db/v5'' |
+| nr | NCBI protein collection (v5²) | Mar 24, 2020 | ''/export/data/bio/ncbi/blast/db/v5'' |
+| UniProt's UniProtKB/Swiss-Prot | Manually curated, most reliable | July 1, 2019 | ''/export/data/bio/uniprot/blast/db'' |
+| UniProt's UniProtKB/TrEMBL | Automated curation | ? | ''/export/data/bio/uniprot/blast/db'' |
+| UniProt's UniRef100 | | ? | ''/export/data/bio/uniprot/blast/db'' |
 + 
 +==== Using These Databases ==== 
 +Tools like BLAST use the ''BLASTDB'' environment variable to find the location of the system's BLAST databases. ILRI's BLAST environment modules like ''blast/2.10.0+'' automatically set this variable when you load the module. 
 + 
 +If you are using different software you will need to set the variable manually, for example: 
 + 
 +<code> 
 +$ export BLASTDB=$BLASTDB:/export/data/bio/ncbi/blast/db/v5 
 +$ blastn -db nt -query file.seq -out blast.out 
 +</code> 
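
The manual ''export'' above appends to whatever ''BLASTDB'' already holds. As a minimal sketch, assuming ''BLASTDB'' is treated as a colon-separated list of directories (as that example implies), the helper below adds a directory only once and avoids a leading '':'' when the variable starts out empty. The function name ''blastdb_append'' is hypothetical and not part of BLAST or the ILRI modules.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BLAST): append a directory to BLASTDB,
# assuming BLASTDB is a colon-separated list of directories.
blastdb_append() {
    local dir="$1"
    case ":${BLASTDB:-}:" in
        *":${dir}:"*) ;;  # directory already listed, nothing to do
        *) BLASTDB="${BLASTDB:+${BLASTDB}:}${dir}" ;;  # add ":" only if non-empty
    esac
    export BLASTDB
}

# Add the version 5 NCBI databases, then show the resulting search path.
blastdb_append /export/data/bio/ncbi/blast/db/v5
echo "$BLASTDB"
```

Calling the helper a second time with the same directory leaves ''BLASTDB'' unchanged, so it is safe to use in scripts that may be sourced more than once.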
 + 
 + 
 +---- 
 + 
 +==== Notes ====
 +1. Use the following to determine the date of a BLAST database: 
 + 
 +<code>$ module load blast/2.10.0+ 
 +$ blastdbcmd -info -db nt | grep Date
 +</code> 
 + 
 +2. In 2019 [[https://ncbiinsights.ncbi.nlm.nih.gov/2019/05/24/have-you-tried-blast-2-9-0-and-version-5-blast-databases-dbv5/|NCBI introduced BLAST database format version 5]], which only works with BLAST tools from version 2.9.0 onward. NCBI is no longer updating the version 4 databases, but we have preserved them in a separate directory in case you are using tools that do not support version 5:
 + 
 +''/export/data/bio/ncbi/blast/db/v4''
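
The choice between the two directories follows from the 2.9.0 cutoff described in note 2. The sketch below illustrates one way to make that choice in a script. It assumes version strings of the form ''2.10.0+'' and GNU ''sort'' (for the ''-V'' option). The function name ''blastdb_dir_for_version'' is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the database directory matching a BLAST+ version.
# Assumes version strings like "2.9.0+" or "2.10.0+" and GNU sort (for -V).
blastdb_dir_for_version() {
    local version="${1%+}"   # drop the trailing "+" used by BLAST+ versions
    local base=/export/data/bio/ncbi/blast/db
    local oldest
    # sort -V orders version numbers naturally, so 2.9.0 sorts before 2.10.0
    oldest=$(printf '%s\n' "$version" 2.9.0 | sort -V | head -n 1)
    if [ "$oldest" = "2.9.0" ]; then
        echo "$base/v5"      # 2.9.0 or newer: version 5 databases
    else
        echo "$base/v4"      # older tools: preserved version 4 databases
    fi
}

blastdb_dir_for_version 2.10.0+
blastdb_dir_for_version 2.6.0+
```

The first call prints the ''v5'' directory and the second prints the ''v4'' directory. A plain string comparison would misorder ''2.10.0'' and ''2.9.0'', which is why the sketch relies on version-aware sorting.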
biological-databases.1552505595.txt.gz · Last modified: 2019/03/13 19:33 by aorth