SeqClean
SeqClean is a tool for validating and trimming DNA sequences stored in a flat-file database (FASTA format). It was designed primarily for "cleaning" EST databases when specific vector and splice site data are not available, or when screening for contaminating sequences is desired. The program processes the input sequence file and filters its content according to the following criteria:
- percentage of undetermined bases
- polyA tail removal
- overall low complexity analysis
- short terminal matches with various sequences used during the sequencing process (vectors, adapters)
- strong matches with other contaminants or unwanted sequences (mitochondrial, ribosomal, bacterial, species other than the target organism, etc.)
The user is expected to provide the contaminant databases; they are not included in this package.
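To make the first two criteria concrete, here is a minimal Python sketch of filtering FASTA records by percentage of undetermined bases and trimming polyA tails. It is not SeqClean's own code: the function names, the thresholds (`max_n_percent`, `min_tail`, `min_length`) and the order of the checks are assumptions chosen for illustration. The low-complexity and contaminant-matching criteria need an external search tool and are not shown.

```python
def percent_undetermined(seq):
    """Return the percentage of bases that are not A, C, G or T."""
    if not seq:
        return 0.0
    undetermined = sum(1 for base in seq.upper() if base not in "ACGT")
    return 100.0 * undetermined / len(seq)


def trim_polya_tail(seq, min_tail=10):
    """Remove a trailing run of A's, but only if it is at least min_tail long."""
    stripped = seq.rstrip("Aa")
    tail_len = len(seq) - len(stripped)
    return stripped if tail_len >= min_tail else seq


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def clean(records, max_n_percent=3.0, min_tail=10, min_length=100):
    """Yield trimmed records that pass the (hypothetical) thresholds."""
    for header, seq in records:
        seq = trim_polya_tail(seq, min_tail)
        if percent_undetermined(seq) > max_n_percent:
            continue  # too many undetermined bases
        if len(seq) < min_length:
            continue  # too short after trimming
        yield header, seq
```

For a file on disk, `clean(read_fasta(open("ests.fasta")))` would iterate over the records that survive both checks; the file name is only a placeholder.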
seqclean/seqclean.1277307711.txt.gz · Last modified: 2010/06/23 15:41 by 172.26.15.75