mkatari-bioinformatics-august-2013-gatknotes

Differences

This shows you the differences between two versions of the page.

mkatari-bioinformatics-august-2013-gatknotes [2014/06/11 13:36] mkatarimkatari-bioinformatics-august-2013-gatknotes [2014/07/02 15:19] mkatari
Line 38: Line 38:
 bowtie2 -x PTC_Human -U Cohen.fastq -S Cohen.sam bowtie2 -x PTC_Human -U Cohen.fastq -S Cohen.sam
 samtools view -bS Cohen.sam > Cohen.bam samtools view -bS Cohen.sam > Cohen.bam
-bowtie2 -x PTC_Human -U Sherman.fastq -S Sherman.sam 
-samtools view -bS Sherman.sam > Sherman.bam 
 </code> </code>
  
Line 49: Line 47:
    SORT_ORDER=coordinate    SORT_ORDER=coordinate
        
-java -jar /export/apps/picard-tools/1.112/SortSam.jar \ +
-   INPUT=Sherman.bam \ +
-   OUTPUT=Sherman.sorted.bam \ +
-   SORT_ORDER=coordinate+
 </code> </code>
  
Line 58: Line 53:
  
 <code> <code>
-java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \ 
-   INPUT=Sherman.sorted.bam \ 
-   OUTPUT=ShermanRG.bam \ 
-   RGLB=Sherman \ 
-   RGPL=IonTorrent \ 
-   RGPU=None \ 
-   RGSM=Sherman 
  
 java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \ java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \
Line 84: Line 72:
    ASSUME_SORTED=TRUE    ASSUME_SORTED=TRUE
  
-java -jar /export/apps/picard-tools/1.112/MarkDuplicates.jar \ 
-   INPUT=ShermanRG.bam \ 
-   OUTPUT=Sherman.dedup.bam \ 
-   METRICS_FILE=Sherman.dedup.metrics \ 
-   REMOVE_DUPLICATES=TRUE \ 
-   ASSUME_SORTED=TRUE 
 </code> </code>
  
Line 95: Line 77:
 <code> <code>
 samtools index Cohen.dedup.bam  samtools index Cohen.dedup.bam 
-samtools index Sherman.dedup.bam  
  
 #identifying indels #identifying indels
Line 104: Line 85:
    -o CohenforIndelRealigner.intervals    -o CohenforIndelRealigner.intervals
    
- #identifying indels 
-java -Xmx2g -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \ 
-   -T RealignerTargetCreator \ 
-   -R PTC_Human.fasta \ 
-   -I Sherman.dedup.bam \ 
-   -o ShermanforIndelRealigner.intervals 
  
    
Line 119: Line 94:
    -o Cohen.dedup.realign.bam    -o Cohen.dedup.realign.bam
  
- java -Xmx4g -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \ +</code>
-   -T IndelRealigner \ +
-   -R PTC_Human.fasta \ +
-   -I Sherman.dedup.bam \ +
-  -targetIntervals ShermanforIndelRealigner.intervals \ +
-   -o Sherman.dedup.realign.bam+
  
 +In some cases you may need to clean the sam/bam file(s) first. CleanSam in Picard tools soft-clips alignments that extend past the end of a reference sequence and sets the mapping quality of unmapped reads to 0. You can run it on every file to avoid this kind of error in a workflow, although it is not always necessary.
 +
 +<code>
 +java -jar /export/apps/picard-tools/1.112/CleanSam.jar \
 +   INPUT=Sherman.dedup.realign.bam \
 +   OUTPUT=Sherman.clean.dedup.realign.bam
 </code> </code>
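If you do decide to clean every file, the CleanSam step can be wrapped in a loop. The following is only a sketch under stated assumptions: the `*.dedup.realign.bam` naming pattern follows the file names used on this page, while the `DRY_RUN` switch and the `clean_name` / `clean_all` helper names are hypothetical and exist only for this illustration. By default it prints the commands instead of running them.

```shell
# Sketch: run Picard CleanSam on every realigned BAM in the current directory.
# PICARD_DIR is the install path used elsewhere on this page.
# DRY_RUN, clean_name and clean_all are hypothetical names for this example.
PICARD_DIR=/export/apps/picard-tools/1.112
DRY_RUN=${DRY_RUN:-1}

# Sherman.dedup.realign.bam -> Sherman.clean.dedup.realign.bam
clean_name() {
    printf '%s\n' "${1%.dedup.realign.bam}.clean.dedup.realign.bam"
}

clean_all() {
    for bam in *.dedup.realign.bam; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$bam" ] || continue
        # Skip files that were already cleaned.
        case "$bam" in
            *.clean.dedup.realign.bam) continue ;;
        esac
        out=$(clean_name "$bam")
        if [ "$DRY_RUN" = 1 ]; then
            # Only show what would be run.
            echo "java -jar $PICARD_DIR/CleanSam.jar INPUT=$bam OUTPUT=$out"
        else
            java -jar "$PICARD_DIR/CleanSam.jar" INPUT="$bam" OUTPUT="$out"
        fi
    done
}

clean_all
```

Set `DRY_RUN=0` before running the script to actually execute the commands once the printed list looks right.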
  
-Now we merge the bam files and then sort and index them+Now we merge the bam files and then sort and index them. If you cleaned a bam file, remember to use the cleaned version.
  
 <code> <code>
 java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar \ java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar \
-   INPUT=Sherman.dedup.realign.bam \+   INPUT=Sherman.clean.dedup.realign.bam \
    INPUT=Cohen.dedup.realign.bam \    INPUT=Cohen.dedup.realign.bam \
    OUTPUT=ShermanCohenMerged.bam      OUTPUT=ShermanCohenMerged.bam  
Line 142: Line 118:
  
  
-Finall !! run gatk+Finally !! run gatk
  
 <code> <code>
Line 153: Line 129:
    -glm SNP \    -glm SNP \
    -o PTC_human.gatk.vcf    -o PTC_human.gatk.vcf
 +
 +</code>
 +
 +If you want to load the vcf file into IGV, remember to index it first.
 +<code>
 +module load igvtools
 +igvtools index PTC_human.gatk.vcf
 +</code>
 +
 +If you would like to generate a table from the vcf file, use the following command:
 +<code>
 +java -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \
 +     -R PTC_Human.fasta \
 +     -T VariantsToTable \
 +     -V PTC_human.gatk.vcf \
 +     -F CHROM -F POS -F ID -F QUAL -F AC \
 +     -GF GT -GF GQ \
 +     -o PTC_human.gatk.vcf.table
 </code> </code>
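The resulting table is tab-separated with a header row, so it can be filtered with standard tools. As a rough sketch, the example below keeps only rows above a QUAL cutoff. The toy rows, the contig name, the cutoff of 30, and the `filter_by_qual` helper name are all made up for illustration; the toy table includes only the site-level fields (`-F`), not the per-sample genotype columns.

```shell
# Toy table with the site-level columns requested via -F above.
# The rows are made-up placeholders, not real results.
table=$(mktemp)
printf 'CHROM\tPOS\tID\tQUAL\tAC\n'    >  "$table"
printf 'contig1\t100\t.\t55.2\t1\n'    >> "$table"
printf 'contig1\t200\t.\t12.0\t2\n'    >> "$table"

# filter_by_qual FILE MIN (hypothetical helper):
# print the header, then every row whose QUAL is at least MIN.
# The QUAL column is located by name so the column order does not matter.
filter_by_qual() {
    awk -F'\t' -v min="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "QUAL") q = i
            if (!q) { print "no QUAL column found" > "/dev/stderr"; exit 1 }
            print
            next
        }
        $q + 0 >= min + 0
    ' "$1"
}

filter_by_qual "$table" 30
```

For the real output, replace `"$table"` with `PTC_human.gatk.vcf.table` and choose a cutoff that suits your data.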
mkatari-bioinformatics-august-2013-gatknotes.txt · Last modified: 2016/08/17 08:37 by mkatari