mkatari-bioinformatics-august-2013-gatknotes

Differences

This shows you the differences between two versions of the page.

mkatari-bioinformatics-august-2013-gatknotes [2014/06/11 12:52] mkatari
mkatari-bioinformatics-august-2013-gatknotes [2014/07/02 15:19] mkatari
Line 29: Line 29:
   * create a bam index for dedup bam files   * create a bam index for dedup bam files
   * realign samples   * realign samples
 +
 +Once they are all processed
   * merge all samples to one bam file   * merge all samples to one bam file
   * sort and index this merged bam file   * sort and index this merged bam file
Line 36: Line 38:
 bowtie2 -x PTC_Human -U Cohen.fastq -S Cohen.sam bowtie2 -x PTC_Human -U Cohen.fastq -S Cohen.sam
 samtools view -bS Cohen.sam > Cohen.bam samtools view -bS Cohen.sam > Cohen.bam
-bowtie2 -x PTC_Human -U Sherman.fastq -S Sherman.sam 
-samtools view -bS Sherman.sam > Sherman.bam 
 </code> </code>
  
Line 47: Line 47:
    SORT_ORDER=coordinate    SORT_ORDER=coordinate
        
-java -jar /export/apps/picard-tools/1.112/SortSam.jar \ +
-   INPUT=Sherman.bam \ +
-   OUTPUT=Sherman.sorted.bam \ +
-   SORT_ORDER=coordinate+
 </code> </code>
  
Line 56: Line 53:
  
 <code> <code>
-java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \ 
-   INPUT=Sherman.sorted.bam \ 
-   OUTPUT=ShermanRG.bam \ 
-   RGLB=Sherman \ 
-   RGPL=IonTorrent \ 
-   RGPU=None \ 
-   RGSM=Sherman 
  
 java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \ java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \
Line 82: Line 72:
    ASSUME_SORTED=TRUE    ASSUME_SORTED=TRUE
  
-java -jar /export/apps/picard-tools/1.112/MarkDuplicates.jar \ 
-   INPUT=ShermanRG.bam \ 
-   OUTPUT=Sherman.dedup.bam \ 
-   METRICS_FILE=Sherman.dedup.metrics \ 
-   REMOVE_DUPLICATES=TRUE \ 
-   ASSUME_SORTED=TRUE 
 </code> </code>
  
Line 93: Line 77:
 <code> <code>
 samtools index Cohen.dedup.bam  samtools index Cohen.dedup.bam 
-samtools index Sherman.dedup.bam  
  
 #identifying indels #identifying indels
Line 102: Line 85:
    -o CohenforIndelRealigner.intervals    -o CohenforIndelRealigner.intervals
    
- #identifying indels 
-java -Xmx2g -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \ 
-   -T RealignerTargetCreator \ 
-   -R PTC_Human.fasta \ 
-   -I Sherman.dedup.bam \ 
-   -o ShermanforIndelRealigner.intervals 
  
    
Line 117: Line 94:
    -o Cohen.dedup.realign.bam    -o Cohen.dedup.realign.bam
  
- java -Xmx4g -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \ +</code>
-   -T IndelRealigner \ +
-   -R PTC_Human.fasta \ +
-   -I Sherman.dedup.bam \ +
-  -targetIntervals ShermanforIndelRealigner.intervals \ +
-   -o Sherman.dedup.realign.bam+
  
 +In some cases the sam/bam file(s) may need cleaning (soft-clipping alignments that run past the end of the reference). To do this, use CleanSam in Picard tools. You can run it on every sample to avoid the error in a workflow, although it may not always be necessary.
 +
 +<code>
 +java -jar /export/apps/picard-tools/1.112/CleanSam.jar \
 +   INPUT=Sherman.dedup.realign.bam \
 +   OUTPUT=Sherman.clean.dedup.realign.bam
 </code> </code>
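If you decide to clean every sample rather than only the one that fails, the CleanSam call can be generated per sample. The following is a minimal sketch, not part of the original notes: the helper name clean_cmd is hypothetical, including Cohen in the sample list is an assumption (the notes only clean Sherman), and the commands are printed for review instead of being run.

```shell
# Dry-run sketch: print one CleanSam command per sample.
# PICARD_DIR is the Picard path used elsewhere in these notes.
PICARD_DIR=/export/apps/picard-tools/1.112

# clean_cmd is a hypothetical helper; $1 is the sample name (e.g. Sherman).
clean_cmd() {
    printf 'java -jar %s/CleanSam.jar INPUT=%s.dedup.realign.bam OUTPUT=%s.clean.dedup.realign.bam\n' \
        "$PICARD_DIR" "$1" "$1"
}

# The sample list is an assumption; edit it to match your own files.
for sample in Sherman Cohen; do
    clean_cmd "$sample"
done
```

Once the printed commands look right, they can be pasted into the shell or piped to sh.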
  
-Now we merge the bam files and then sort and index them+Now we merge the bam files and then sort and index them. If you cleaned the bam file, remember to use the cleaned ones.
  
 <code> <code>
 java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar \ java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar \
-   INPUT=Sherman.dedup.realign.bam \+   INPUT=Sherman.clean.dedup.realign.bam \
    INPUT=Cohen.dedup.realign.bam \    INPUT=Cohen.dedup.realign.bam \
    OUTPUT=ShermanCohenMerged.bam      OUTPUT=ShermanCohenMerged.bam  
Line 140: Line 118:
  
  
-Finall !! run gatk+Finally !! run gatk
  
 <code> <code>
Line 151: Line 129:
    -glm SNP \    -glm SNP \
    -o PTC_human.gatk.vcf    -o PTC_human.gatk.vcf
 +
 +</code>
 +
 +If you want to load the vcf file into IGV, remember to index it first.
 +<code>
 +module load igvtools
 +igvtools index PTC_human.gatk.vcf
 +</code>
 +
 +If you would like to generate a table from the vcf file, use the following command
 +<code>
 +java -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \
 +     -R PTC_Human.fasta \
 +     -T VariantsToTable \
 +     -V PTC_human.gatk.vcf \
 +     -F CHROM -F POS -F ID -F QUAL -F AC \
 +     -GF GT -GF GQ \
 +     -o PTC_human.gatk.vcf.table
 </code> </code>
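The resulting table is tab-delimited with a header row, so it can be inspected with standard shell tools. As a rough sketch (the helper name filter_by_qual and the threshold of 30 are illustrative assumptions, not part of the original notes), the function below keeps the header plus the rows whose QUAL value, the fourth column given the -F order above, meets a minimum.

```shell
# filter_by_qual is a hypothetical helper.
#   $1 = minimum QUAL to keep
#   $2 = table written by VariantsToTable
# Assumes QUAL is column 4, matching: -F CHROM -F POS -F ID -F QUAL -F AC
filter_by_qual() {
    awk -F '\t' -v min="$1" 'NR == 1 || $4 + 0 >= min' "$2"
}

# Example call (the threshold 30 is only an illustration):
#   filter_by_qual 30 PTC_human.gatk.vcf.table > PTC_human.gatk.q30.table
```

If the -F fields are listed in a different order, the column number in the awk expression needs to change accordingly.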
mkatari-bioinformatics-august-2013-gatknotes.txt · Last modified: 2016/08/17 08:37 by mkatari