SeqClean


SeqClean is a tool for validation and trimming of DNA sequences from a flat file database (FASTA format). SeqClean was designed primarily for "cleaning" of EST databases, when specific vector and splice site data are not available, or when screening for various contaminating sequences is desired. The program works by processing the input sequence file and filtering its content according to a few criteria:

  • percentage of undetermined bases
  • polyA tail removal
  • overall low complexity analysis
  • short terminal matches with various sequences used during the sequencing process (vectors, adapters)
  • strong matches with other contaminants or unwanted sequences (mitochondrial, ribosomal, bacterial, other species than the target organism etc.)
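
To make the first two criteria concrete, here is a minimal Python sketch of a filter for undetermined bases and a polyA tail trimmer. It is an illustration only and is not SeqClean's actual code: the function names, the `min_run` value of 10 and the `max_undetermined` value of 0.03 are hypothetical choices made for this example, not documented SeqClean defaults.

```python
def undetermined_fraction(seq: str) -> float:
    """Return the fraction of bases that are not A, C, G or T (for example N)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    determined = sum(upper.count(base) for base in "ACGT")
    return 1.0 - determined / len(seq)


def trim_polya_tail(seq: str, min_run: int = 10) -> str:
    """Remove a trailing run of A's, but only if it is at least min_run long.

    min_run is a hypothetical threshold chosen for this sketch.
    """
    stripped = seq.rstrip("Aa")
    if len(seq) - len(stripped) >= min_run:
        return stripped
    return seq


def clean(seq: str, max_undetermined: float = 0.03):
    """Trim the polyA tail, then reject sequences with too many undetermined bases.

    Returns the trimmed sequence, or None if the sequence is rejected.
    max_undetermined is a hypothetical threshold chosen for this sketch.
    """
    trimmed = trim_polya_tail(seq)
    if undetermined_fraction(trimmed) > max_undetermined:
        return None
    return trimmed
```

For example, `clean("ACGTACGT" + "A" * 15)` returns the sequence without its tail, while `clean("NNNNACGT")` returns `None` because half of its bases are undetermined. The remaining criteria (low complexity, vector and contaminant matches) depend on sequence alignment against user-supplied databases and are not covered by this sketch.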
  

The user is expected to provide the contaminant databases; they are not included in this package.

seqclean/seqclean.1277307711.txt.gz · Last modified: 2010/06/23 15:41 by 172.26.15.75