BLASTing through the kingdom of life

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Copyright © Digital World Biology 2009


   
GenBank contains over 3 million sequences, totaling over 14 billion nucleotides. BLAST compares an unknown sequence against all the sequences in GenBank and finds those that match, which helps determine the likely identity of the unknown sequence.
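
As a rough illustration of what "finding sequences that match" means, the sketch below ranks a made-up two-entry database by the number of short words it shares with a query. It is a toy, not the BLAST algorithm itself, which goes on to extend and score such word matches statistically. The names kmers, rank_by_shared_words, and toy_database are hypothetical, and the word length of 11 is only an illustrative choice.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (words) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_shared_words(query, database, k=11):
    """Rank database entries by how many k-letter words they share with query.

    `database` is a hypothetical dict mapping a record name to a DNA string.
    """
    query_words = kmers(query, k)
    scores = {
        name: len(query_words & kmers(seq, k))
        for name, seq in database.items()
    }
    # Entries sharing the most words with the query come first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up toy data for illustration only, not real GenBank records.
toy_database = {
    "toy_record_A": "ATCACTGTAGTAGTAGCTGGAAAGAGAAATCTG",
    "toy_record_B": "GGGGCCCCGGGGCCCCGGGGCCCCGGGGCCCC",
}
unknown = "TAGTAGTAGCTGGAAAGAG"

ranking = rank_by_shared_words(unknown, toy_database, k=11)
for name, shared in ranking:
    print(name, shared)
```

The record that contains the query as a substring shares every query word and is ranked first; the unrelated record shares none.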

In this activity you will use BLAST to identify unknown sequences.

1. Open the BLAST tutorial in a separate web browser window. We recommend arranging the windows so that you can view the tutorial in one and use BLAST in the other.

2. Open the NCBI site in a third window. This is where you'll do the BLAST search.

3. Copy your sequence from the sequence data set (after reading the instructions), identify it using BLAST, and answer the questions on your worksheet. The worksheet includes example answers.

Use your mouse to highlight your sequence, then copy it in one of three ways:

  • Click the right mouse button and select Copy.
  • Open the Edit menu in your web browser and select Copy.
  • Use the keyboard command (Ctrl + C for Windows, Command + C for Mac).

Note: Be sure to copy the entire sequence, including the > symbol and the name.
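
The > symbol and name form the header line of a FASTA record, and the lines after it hold the sequence. A minimal sketch of how a program might split a copied record into those two parts is shown here; the function name parse_fasta_record is hypothetical, and the sample text is the first two lines of Sequence 1 from this data set.

```python
def parse_fasta_record(text):
    """Split one FASTA record into (name, sequence).

    Raises ValueError if the first non-blank line does not start with '>'.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("record must begin with a '>' header line")
    name = lines[0][1:].strip()
    # Join the wrapped sequence lines and drop any stray internal spaces.
    sequence = "".join("".join(line.split()) for line in lines[1:]).upper()
    return name, sequence


record = """>Sequence 1
TCGAAATAACGCGTGTTCTCAACGCGGTCGCGCAGATGCCTTTGCTCATC
AGATGCGACCGCAACCACGTCCGCCGCCTTGTTCGCCGTCCCCGTGCCTC
"""
name, sequence = parse_fasta_record(record)
print(name)
print(len(sequence), "nucleotides")
```

A record copied without its header line raises an error in this sketch, which mirrors the reminder above to include the > symbol and the name.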

Paste your sequence into the BLAST text box using one of three methods:

  • Click the right mouse button and select Paste.
  • Open the Edit menu in your web browser and select Paste.
  • Use the keyboard command (Ctrl + V for Windows, Command + V for Mac).
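
For readers curious how the same paste-and-submit step looks from a script, the sketch below only prepares a submission and sends nothing. It assumes the endpoint and the parameter names (CMD, PROGRAM, DATABASE, QUERY) of NCBI's BLAST URL API; treat these as assumptions and confirm them against the current NCBI documentation. The function name build_submit_request is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names for NCBI's BLAST URL API;
# confirm them against the current NCBI documentation before use.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submit_request(fasta_text, program="blastn", database="nt"):
    """Return (url, encoded_body) for a BLAST search submission.

    Nothing is sent over the network; this only prepares the request.
    """
    params = {
        "CMD": "Put",          # assumed command name for submitting a search
        "PROGRAM": program,    # nucleotide-vs-nucleotide search
        "DATABASE": database,  # assumed name of the nucleotide collection
        "QUERY": fasta_text,   # the full record, including the > header line
    }
    return BLAST_URL, urlencode(params)


url, body = build_submit_request(">Example\nATCACTGTAGTAGTAGCTGG")
print(url)
print(body)
```

The activity itself only needs the web form; the sketch simply shows that the pasted text, header line included, is what gets sent as the query.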

Be aware that the NCBI website changes regularly. Although the information in the tutorial is current, the NCBI pages may look slightly different from the ones shown there.

4. To get your sequence, either click its number in the list below or scroll down the page. This data set contains 16 sequences plus one example sequence.


List of sequences
1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16

>Example
ATCACTGTAGTAGTAGCTGGAAAGAGAAATCTGTGACTCCAATTAGCCA
GTTCCTGCAGACCTTGTGAGGACTAGAGGAAGAATGCTCCTGGCTGTTT
TGTACTGCCTGCTGTGGAGTTTCCAGACCTCCGCTGGCCATTTCCCTAG
AGCCTGTGTCTCCTCTAAGAACCTGATGGAGAAGGAATGCTGTCCACCG
TGGAGCGGGGACAGGAGTCCCTGTGGCCAGCTTTCAGGCAGAGGTTCC
TGTCAGAATATCCTTCTGTCCAATGCACCACTTGGGCCTCAATTTCCCTT
CACAGGGGTGGATGACCGGGAGTCGTGGCCTTCCGTCTTTTATAATAGG
ACCTGCCAGTGCTCTGGCAACTTCATGGGATTCAACTGTGGAAACTGCAA
GTTTGGCTTTTGGGGACCAAACTGCACAGAGAGACGACTCTTGGTGAGAA
GAAACATCTTCGATTTGAGTGCCCCAGAGAAGGACAAATTTTTTGCCTACC
TCACTTTAGCAAAGCATACCATCAGCTCAGACTATGTCATCCCCATAGGGA
CCATTGGCCAAATGAAAAATGGATCAACACCCATGTTTAACGACATCAATA
TTTATGACCTCTTTGTCTGGATGCATTATTATGTGTCAATGGATGCACTGC
TTGGGGGATCTGAAATCTGGAGAGACATTGATTTTGCCCATGAAGCACCA
GCTTTTCTGCCTTGGCATAGACTCTTCTTGTTGCGGTGGGAACAAGAAATC
CAGAAGCTGACAGGAGATGAAAACTTCACTATTCCATATTGGGACTGGCG
GGATGCAGAAAAGTGTGACATTTGCACAGATGAGTACATGGG

>Sequence 1
TCGAAATAACGCGTGTTCTCAACGCGGTCGCGCAGATGCCTTTGCTCATC
AGATGCGACCGCAACCACGTCCGCCGCCTTGTTCGCCGTCCCCGTGCCTC
AACCACCACCACGGTGTCGTCTTCCCCGAACGCGTCCCGGTCAGCCAGCC
TCCACGCGCCGCGCGCGCGGAGTGCCCATTCGGGCCGCAGCTGCGACGGT
GCCGCTCAGATTCTGTGTGGCAGGCGCGTGTTGGAGTCTAAA

>Sequence 2
GTTTATTAGTGATCATGGCTAAGTTTGCGTCCATCATCGCACTTCTTTTT
GCTGCTCTTGTTCTTTTTGCTGCTTTCGAAGCACCAACAATGGTGGAAGC
ACAGAAGTTGTGCGAAAGGCCAAGTGGGACATGGTCAGGAGTCTGTGGAA
ACAATAACGCATGCAAGAATCAGTGCATTAACCTTGAGAAAGCACGACAT
GGATCTTGCAACTATGTCTTCCCAGCTCACAAGTGTATCTGCTACTTTCC
TTGTTAATTTATCGCAAACTCTTTGGTGAATAGTTTTTATGTAATTTACA
CAAAATAAGTCAGTGTCACTATCCATGAGTGATTTTAAGACATGTACCAG
ATATGTTATGTTGGTTCGGTTATACAAATAAAGTTTTATTCACCA

>Sequence 3
CTCGAGACTAGTTCTCTCTCTCTCTCTCTCGTGCCGCATCTCACACCTGT
GGATGGACGGCAGCTGAACCGCGGGAAACTTTCGTTCTCACTCTACCTAG
ATGAACTTTAGTTTATATTAAACACGCGTCGACTCCCACACAAACCGTGC
TCGTTTTACATCTTTGTCTCCGCTTTTGAAAACGAGAAGTTGAATTCGCA
AGACGCAACTTTCCAGCCCCTCACTGAGCGGGCAGAGTCCGTGAAGCGAT
GGAGCCGTCCGTCATTCCCGGTGCTGACATACCCGACCTTTACTCCATTA
ACCCGTTTAATGTCACTTTTCCCGACGACGTTTTGAGTTTCGTTCCTGAT
GGGAGGAACTACACCGAACCTAACCCGGTAAAGAGCCGCGGAATCATCA
TCGCCATTTCCATCACCGCTC

>Sequence 4
GACATTACGGCGACCCAGTCTCCCCCGGTGTTGTCAGTGGGACTGGGCC
AGACCGCAACCATCACTTGTACGGCCAGTCAAAGCATCTACAGTAACCT
TGCTTGGTACCAGCAGAGAGAAGGACAGAAGCCCTCTCTCCTGATCTAT
GCTGCGACAACGCGATACGAAGGAGTCTCCGAGCGATTCAGCGGCAGTG
GATCAGGGACCAGTTTCACCCTGACAATCAGCAACGTTCAGAATGAGGA
TGTCGCTGACTATTACTGTCAGATCGCATATTCGATCTACTCCGGTTCC
GTTGTTTTCGGTGAAGGAACCAAGCTCAGACTGAGCCGT

>Sequence 5
GAATTCGCGGCCGCATGGGGGAGAAGCTGCCGGTTGTGTATAAACGCTT
CATCTGCTCGTTCCCGGATTGTAATGCCACGTATAACAAGAACCGGAAG
CTGCAGGCCCATCTGTGCAAGCACACGGGGGAGAGACCGTTTCCTTGCA
CATATGAAGGCTGTGAGAAAGGCTTTGTGACGCTGCATCACCTGAATCG
TCATGTGCTCTCCCACACCGGGGAGAAACCCTGCAAATGCGAAACGGAA
AATTGCAATTTGGCGTTCACCACAGCATCCAACATGAGGTTGCACTTCA
AAAGGGCTCATTCTTCTCCGGCGCAGGTCTACGTGTGTTATTTCGCAGA
CTGTGGCCAGCAGTTCAGGAAACATAACCAGCTAAAAATTCACCAGTAT
ATCCATACAAACCAGCAACCCTTCAAAT

>Sequence 6
GCCCAGCGTCTCTCGGAGGAAGCTAATTCTCAGGTTATCGCAGAGGAATC
TCTTGTAGCTCGTGCTGAGGCTACCGTTGTCCAAGCCGCCGCTCCAACCA
AATCCCTTGATCTGACAACATGGAAGTATGCTGATCTCAGAGACACTATC
AACACCTCAATCGATATTGCGCTCCTGTCAGCCTGCAAGGAGGAGTTCCA
TCGTCGTCTCAAGGTCTACCACGCCTGGAAGATGAAGAATAAGAAGGTTG
CCGCCGGCGACAAGGGCGGACCAGAGAGGGCTCCACAATCCATCTTTGAA
AGTGCCCAACAATACAACCAGCTGGCACCCCCTCCGAAAGCCACCAAGGC
TGCCCCAGCCAATCAGAACATCCAACGCTTCTTCAGGGTGCCTTTCTCCG
TGACTGGGTCCACCGCTCAGGGTCAGATGCCCGAGAGGGGTTGGTGGTAC
GCCCACTTTGACGGTCAGTGGATCGCCCGCCAGATGGAGGTACACCCCAC
CAAGGTCCCCGTTCTTCTGGTTGCAGGTAAAGATGATGAGAACATGTGTG
AGATGAGTTTGGAGGAGACTGGGTTGACACGACGTCCCAACGCCGAGATC
GTCGAGCGGGAGTTTGAGGAGCCCTGGAAGCGTAGCGGCGGTCAGCAGTA
CCACATGGCTGCAGTACGCAACAAGCAGGCTAGACCAACGTGGGCCACGC
AGAGCTTGAA

>Sequence 7
AACAATTCATTTTTCCTGCTTTCCTAGAAAATTCTATAAAAGCTTCAAAA
TGAATTACTTGGTGATGATTAGTTTGGCACTTCTCTTCGTGACAGGTGTA
GAGAGTGTAAAAGACGGTTATATTGTCGACGATGTAAACTGCACATACTT
TTGTGGTAGAAATGCATACTGCAACGAGGAATGTACCAAGTTGAAAGGTG
AGAGTGGTTATTGCCAATGGGCAAGTCCATATGGAAACGCCTGTTATTGC
TATAAATTGCCCGATCATGTACGTACTAAAGGACCAGGAAGATGCCATGG
CCGATAAATTATAAGATGGAATGTATCCTAAGTATCAATGTTAAATAAAT
ATAATCAAAAAATT

>Sequence 8
ACAGCAAGCGAACCGGAATTGCCAGCTGGGGCGCCCTCTGGTAAGGTTGG
GAAGCCCTGCAAAGTAAACTGGATGGCTTTCTTGCCGCCAAGGATCTGAT
GGCGCAGGGGATCAAGATCTGATCAAGAGACAGGATGAGGATCGTTTCGC
ATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGA
GAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATG
CCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAG
ACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCT
ATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTG
TCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAG
GATCTCCTGTCATCTCACCTTGCTCCTGCC

>Sequence 9
TCTTGGTGAGGATCCGTTGAGAACAACCCAACCGCCGCCCCATCGCCCTN
GTTAGANTNATGGCCGCGTCGGCGCTGCACCAGACCACCAGCTTCCTCNG
CACCGCCCCTCGCCGGGATGAGCTCGTCCGCCGCGTCGGCGACTCCGGTG
GCCGCATCACCATGCGCCGCACCGTCAAGAGCGCGCCCCAGAGCATCTGG
TATGGACCTGACCGTCCCAAGTNCCTGGGCCCGTTCTCGGAGCAGACGCC
ATCGTACCTGACCGGAGAGTTCCCGGGAGACTACGGGTGGGACACGGCGG
GGCTATCGGCCGACCCGGANACGTTCGCTATGAACAGGGAGCTGGANGTG
ATCCACTCNCGGTGGGCGATGCTGGGGGCGCTGGGCTGCGTCTTCCCGGA
GATCCTGTCCAANAACGGGG

>Sequence 10
AGAACTACCAAGATGCTGCTCAGGTCAGAGATTGTCGTCTGTCTGGCCTT
CTGGATCTTGCACTTGAGAAAGATTATGTTCGAACCAAGGTGGCTGACTA
TATGAACCATCTCATTGACATTGGCGTACAGGGTTCAGACTTGATGCTTC
TAAGCACATGTGGCCTGGAGACATAAAGGCAATTTTGGACAAACTGCATA
ATCTCAATACAAAATGGTTCTCCCAAGGAAGCAGACCTTTCATTTTCCAA
GAGGTGATTGATCTGGGTGGTGAGGCAGTGTCAAGTAATGAGTATTTTGG
AAATGGCCGTGTGACAGAATTCAAATATGGAGCAAAATTGGGCAAAGTTA
TGCGCAAGTGGGATGGAGAAAAGATGTCCTACTTAAAGAACTGGGGAGAA
GGTTGGGGTTTGATGCCTTC

>Sequence 11
TGCATCACAAGGTTAATGTGAAAACACAGCGAGAAGTCCATTTCCCAATG
GACCTCTTGCAAGCCTGTGGTGCATCTGCCCCTAGGCCAGTTGCCCGTGT
TTCACGTGCAACCGACCTAGACCGACGCTACAGGTGCGTCCTCAGTTTAC
CTGAGGAGCGTGCTCGCAGTGTTGGGTGTAAATGGTCGTCGACCCGAGCG
GCGTTACGACGTGGACTCGAGGAGCTTGGCTCCCGCGAGTTCCGCCGTCG
TCTCCGTTTGGCGGACGATTGCTGGCGCGCGATCTGCGCGGCCGTCTGCA
CGGGTCGGAAGTTTCCTTCCTTCTCGGTGACAGATCGGCCGGCAAGAGCT
CGCCTTGCAAAAGTCTACCGTATGGGTCGTCGACTGCTAGTAGGTGTGGT
CTGCCGAGGCGAATCGGTCG

>Sequence 12
ATAACCACACGCCTTTGGCGTGATTATCAGCTTTCAAGTTTCAGTTACTA
AAACTAATACTGACTATAAAACAGAAGCAAAAAAATTTTCGATTTTTATG
AAAACGGTCGCAAAGAAGTTAGCAAAAATATATAATTTCTTTTGAAATTG
TTCACTTGGCCAAGCTGCAGTTTCAATATTTTAATAAAGGGGGCAGTAAA
AAGTGAAAAAAAAGAAAAGTTTCTGGCTTGTTTCTTTTTTAGTTATAGTA
GCTAGTGTTTTCTTTATATCTTTTGGATTTAGCAATCATTCTAAACAAGT
TGCTCAAGCGGCTAGTGATACGACATCAACTGATCACTCAAGCAATGATA
CAGCTGATTCTGTTAGCGACGGTGTTATTTTGCATGCATGGTGCTGGTCG
TTCAACACGATTAAAAACAACTTGAAACAGATTCATGACGCCGGCTACAC
AGCGGTTCAAACTTCACCTGTTAATGAAGTTAAAGTTGGAAATAGCGGGT
CTAAGTCATTAAATAACTGGTATTGGCTATATCAGCCAAC

>Sequence 13
GTGCGCGTTAGACCATATAAAGAAAAACCAATACAAACTCCAGCAAAATC
TGTTGATATAAGATATACTGTACAGTTTACTCCTTTAAACCCTGATGATG
ATTTCAAGCCAGTTCTCAAAGATACTAAACTATTGAAAACATTAGCTATC
GGCGACACCATCACATCCCAAGAATTACTAGCTCAAGCACAAAGCATTTT
AATCGAAAGCCATCCAGATTATACGATTTATGAACGTGATTCCTCAATCG
TCACTCATGACAATGACATTTTCCGTACGATTTTACCAACGGATCAAGAG
TTTACTTACCATGTCAAAAATCGGGAACAAGCTTATAAGGCCAATTCTAA
AACAGATATTAAAGAAAAAACGAACAACACCGAC

>Sequence 14
CTAATAATCCTTGGAATACTCCTATATTTTGTATAAAGAAGAAATCAGGG
AAATGGAGAATGCTAATTGATTTTAGAGAACTTAATGCAAAAACAGAAAA
AGGAGCAGAAGTCCAATTAGGATTACCTCACCCATCTGGATTACAGAAGA
GAAAGAATGTAACAGTTTTAGATATAGGAGATGCTTATTTTACCATCCCT
TTAGATCCTGATTATCAGCCCTATACTGCATTTACTTTACCATCTAAGAA
TAATCAAAGTCCAGGAAAAAGGTATATTTGGAAATCTCTTCCACAGGGGT
GGGTCTTGAGTCCCTTAATATACCAGAGCACTCTAGATAATATTCTACAA
CCATTTAGAA

>Sequence 15
ATGTTTTCCGGTGGCGGCGGCCCGCTGTCCCCCGGAGGAAAGTCGGCGGC
CAGGGCGGCGTCCGGGTTTTTTGCGCCCGCCGGCCCTCGCGGAGCCGGCC
GGGGACCCCCGCCTTGCTTGAGGCAAAACTTTTACAACCCCTACCTCGCC
CCAGTCGGGACGCAACAGAAGCCGACCGGGCCAACCCAGCGCCATACGTA
CTATAGCGAATGCGATGAATTTCGATTCATCGCCCCGCGGGTGCTGGACG
AGGATGCCCCCCCGGAGAAGCGCGCCGGGGTGCACGACGGTCACCTCAAG
CGCGCCCCCAAGGTGTACTGCGGGGGGGACGAGCGCGACGTCCTCCGCGT
CGGGTCGGGCGGCTTCTGGCCGCGGCGCTCGCGCCTGTGGGGCGGCGTGG
ACCACGCCCCGGCGGGGTTCAACCCCACCGTCACCGTCTTTCACGTGTAC
GACATCCTGGAGAACGTGGAGCACGCGTAC

>Sequence 16
CTCGGGTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGGATCTGCCCGA
TAGAGGGGGATAACCACTGGAAACGGTGGCTAATACCGTATAACGTCGCR
AGACCAAAGAGGGGGACCTTCGGGCCTCTCACTATCGGATGAACCCAKAT
GGGATTAGCTAGTRSGCGGGGTMACGGGCCCACCTAGGCGACKATCCCTA
GCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGASACACGGYCCASA
CTCCTACGGGRGGCAGCAGKGGGGAATATTGCACARTGGGCGCAMGCCTG
ATGCASCCATGCCGYGTGTATGAAGARGGCCTTCGGGTTGTAAAGTWCTT
TCAGCGGGGAGGAAGGCKATGTGGTTAATAACCGCVTYGATTGACGTTAC
CCGCAGAAGAAGCACCGKCTAACTCCGTGCCAGCAGCCGCGGTWATACGG
AGGG
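
Some sequences in this data set contain letters other than A, C, G, and T: Sequence 9 includes N, and Sequence 16 includes letters such as R, K, S, M, Y, W, and V. These are standard IUPAC ambiguity codes, each standing for a position where the base could be one of several. The sketch below counts them in a sequence; the names IUPAC_CODES and ambiguity_report are hypothetical, and the table covers DNA letters only.

```python
from collections import Counter

# IUPAC nucleotide codes: each letter maps to the bases it can stand for.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def ambiguity_report(sequence):
    """Count the letters in `sequence` that are not plain A, C, G, or T.

    Raises ValueError for characters outside the IUPAC DNA alphabet.
    """
    cleaned = "".join(sequence.split()).upper()
    unknown = sorted(set(cleaned) - set(IUPAC_CODES))
    if unknown:
        raise ValueError(f"not IUPAC nucleotide codes: {unknown}")
    counts = Counter(cleaned)
    # Keep only the letters that stand for more than one possible base.
    return {
        letter: n
        for letter, n in sorted(counts.items())
        if len(IUPAC_CODES[letter]) > 1
    }


# One line taken from Sequence 16 in the data set above.
line = "TAGAGGGGGATAACCACTGGAAACGGTGGCTAATACCGTATAACGTCGCR"
for letter, n in ambiguity_report(line).items():
    print(letter, n, "could be any of", IUPAC_CODES[letter])
```

Leave these letters in place when you copy a sequence; they are part of the data, not typing errors.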