mkatari-bioinformatics-august-2013-gatknotes

Differences

This shows you the differences between two versions of the page.

Old revision: mkatari-bioinformatics-august-2013-gatknotes [2014/06/11 12:49] – created mkatari
New revision: mkatari-bioinformatics-august-2013-gatknotes [2014/06/12 11:59] mkatari
Line 15: Line 15:
 bowtie2-build PTC_Human.fasta PTC_Human
 samtools faidx PTC_Human.fasta
-java -jar /export/apps/picard-tools/1.112/CreateSequenceDictionary.jar R=PTC_Human.fasta O=PTC_Human.dict
+java -jar /export/apps/picard-tools/1.112/CreateSequenceDictionary.jar \
+   R=PTC_Human.fasta \
+   O=PTC_Human.dict
 
 </code>
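Before aligning, it can help to confirm that the three indexing steps above produced their outputs. The sketch below is not part of the original notes: `missing_reference_files` is a hypothetical helper, and the file names it checks are assumptions (a small-genome bowtie2 index ending in `.bt2`, the `.fasta.fai` file written by `samtools faidx`, and the `.dict` file named in the Picard command).

```shell
# Hypothetical helper: print each expected reference index file that is missing.
# Assumes the prefix used in these notes (PTC_Human) and a small bowtie2 index.
missing_reference_files() {
    prefix=$1
    for f in "${prefix}.1.bt2" "${prefix}.fasta.fai" "${prefix}.dict"; do
        [ -e "$f" ] || echo "$f"
    done
}

# Prints nothing when all three files are present in the current directory.
missing_reference_files PTC_Human
```

An empty result suggests the reference is ready; any printed name points at the indexing step that should be rerun.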
Line 27: Line 29:
   * create a bam index for dedup bam files
   * realign samples
+
+Once all samples are processed:
   * merge all samples to one bam file
   * sort and index this merged bam file
Line 34: Line 38:
 bowtie2 -x PTC_Human -U Cohen.fastq -S Cohen.sam
 samtools view -bS Cohen.sam > Cohen.bam
-bowtie2 -x PTC_Human -U Sherman.fastq -S Sherman.sam
-samtools view -bS Sherman.sam > Sherman.bam
 </code>
 
 GATK prefers the Picard method of sorting
 <code>
-java -jar /export/apps/picard-tools/1.112/SortSam.jar INPUT=Cohen.bam OUTPUT=Cohen.sorted.bam SORT_ORDER=coordinate
-java -jar /export/apps/picard-tools/1.112/SortSam.jar INPUT=Sherman.bam OUTPUT=Sherman.sorted.bam SORT_ORDER=coordinate
+java -jar /export/apps/picard-tools/1.112/SortSam.jar \
+   INPUT=Cohen.bam \
+   OUTPUT=Cohen.sorted.bam \
+   SORT_ORDER=coordinate
+
 </code>
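The revised notes show the commands for one sample (Cohen) only. Assuming the other samples follow the same file-naming pattern, the per-sample commands can be generated rather than retyped. This is a dry-run sketch: `align_and_sort_commands` is a hypothetical helper that prints the commands instead of running them, and the sample list is an assumption taken from the names that appear in these notes.

```shell
# Assumed Picard location, copied from the commands in these notes.
PICARD=/export/apps/picard-tools/1.112
# Assumed sample list; edit to match the fastq files actually present.
SAMPLES="Cohen Sherman"

# Hypothetical helper: print (do not run) the align, convert and sort commands
# for one sample.
align_and_sort_commands() {
    sample=$1
    echo "bowtie2 -x PTC_Human -U ${sample}.fastq -S ${sample}.sam"
    echo "samtools view -bS ${sample}.sam > ${sample}.bam"
    echo "java -jar ${PICARD}/SortSam.jar INPUT=${sample}.bam OUTPUT=${sample}.sorted.bam SORT_ORDER=coordinate"
}

for s in $SAMPLES; do
    align_and_sort_commands "$s"
done
```

Once the printed commands look right, the output can be piped to `sh` to execute them in order.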
  
Line 47: Line 53:
 
 <code>
-java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar INPUT=Sherman.sorted.bam OUTPUT=ShermanRG.bam RGLB=Sherman RGPL=IonTorrent RGPU=None RGSM=Sherman
 
-java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar INPUT=Cohen.sorted.bam OUTPUT=CohenRG.bam RGLB=Cohen RGPL=IonTorrent RGPU=None RGSM=Cohen
+java -jar /export/apps/picard-tools/1.112/AddOrReplaceReadGroups.jar \
+   INPUT=Cohen.sorted.bam \
+   OUTPUT=CohenRG.bam \
+   RGLB=Cohen \
+   RGPL=IonTorrent \
+   RGPU=None \
+   RGSM=Cohen
 </code>
 
 This removes duplicate reads that map to exactly the same position, which helps get rid of artifacts.
 <code>
-java -jar /export/apps/picard-tools/1.112/MarkDuplicates.jar INPUT=CohenRG.bam OUTPUT=Cohen.dedup.bam METRICS_FILE=Cohen.dedup.metrics REMOVE_DUPLICATES=TRUE ASSUME_SORTED=TRUE
+java -jar /export/apps/picard-tools/1.112/MarkDuplicates.jar \
+   INPUT=CohenRG.bam \
+   OUTPUT=Cohen.dedup.bam \
+   METRICS_FILE=Cohen.dedup.metrics \
+   REMOVE_DUPLICATES=TRUE \
+   ASSUME_SORTED=TRUE
 
-java -jar /export/apps/picard-tools/1.112/MarkDuplicates.jar INPUT=ShermanRG.bam OUTPUT=Sherman.dedup.bam METRICS_FILE=Sherman.dedup.metrics REMOVE_DUPLICATES=TRUE ASSUME_SORTED=TRUE
 </code>
  
Line 62: Line 77:
 <code>
 samtools index Cohen.dedup.bam
-samtools index Sherman.dedup.bam
 
 #identifying indels
Line 71: Line 85:
    -o CohenforIndelRealigner.intervals
 
- #identifying indels
-java -Xmx2g -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \
-   -T RealignerTargetCreator \
-   -R PTC_Human.fasta \
-   -I Sherman.dedup.bam \
-   -o ShermanforIndelRealigner.intervals
 
 
Line 85: Line 93:
   -targetIntervals CohenforIndelRealigner.intervals \
    -o Cohen.dedup.realign.bam
-
- java -Xmx4g -jar /export/apps/GenomeAnalysisTK/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar \
-   -T IndelRealigner \
-   -R PTC_Human.fasta \
-   -I Sherman.dedup.bam \
-  -targetIntervals ShermanforIndelRealigner.intervals \
-   -o Sherman.dedup.realign.bam
 
 </code>
Line 98: Line 99:
 
 <code>
-java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar INPUT=Sherman.dedup.realign.bam INPUT=Cohen.dedup.realign.bam OUTPUT=ShermanCohenMerged.bam
+java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar \
+   INPUT=Sherman.dedup.realign.bam \
+   INPUT=Cohen.dedup.realign.bam \
+   OUTPUT=ShermanCohenMerged.bam
 samtools sort ShermanCohenMerged.bam ShermanCohenMerged.sorted
+
 samtools index ShermanCohenMerged.sorted.bam
 </code>
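The merge step takes one `INPUT=` argument per realigned sample, so the command grows with the number of samples. As a sketch only, the command line can be assembled from a list of sample names; `merge_command` is a hypothetical helper, and it assumes every sample was realigned to a file named `<sample>.dedup.realign.bam` as in the commands above. It prints the command rather than running it.

```shell
# Hypothetical helper: print a MergeSamFiles command with one INPUT= per sample.
# Usage: merge_command OUTPUT.bam SAMPLE [SAMPLE ...]
merge_command() {
    output=$1
    shift
    cmd="java -jar /export/apps/picard-tools/1.112/MergeSamFiles.jar"
    for sample in "$@"; do
        cmd="$cmd INPUT=${sample}.dedup.realign.bam"
    done
    echo "$cmd OUTPUT=${output}"
}

merge_command ShermanCohenMerged.bam Sherman Cohen
```

Note that the `samtools sort in.bam out_prefix` form used above is the older syntax; with samtools 1.x the equivalent is `samtools sort -o ShermanCohenMerged.sorted.bam ShermanCohenMerged.bam`.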
Line 115: Line 121:
    -glm SNP \
    -o PTC_human.gatk.vcf
+
+</code>
+
+If you want to load the VCF file into IGV, remember to index it first.
+
+If you would like to generate a table from the VCF file, use the following command:
+<code>
+java -jar GenomeAnalysisTK.jar \
+     -R PTC_Human.fasta \
+     -T VariantsToTable \
+     -V PTC_human.gatk.vcf \
+     -F CHROM -F POS -F ID -F QUAL -F AC \
+     -GF GT -GF GQ \
+     -o PTC_human.gatk.vcf.table
 </code>
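For a quick look at the calls without running GATK, the fixed VCF columns can also be pulled out with plain awk. This is not the VariantsToTable method and is only a rough approximation of it: `vcf_to_table` is a hypothetical helper, and it covers CHROM, POS, ID and QUAL only, since AC and the per-sample GT and GQ fields would need the INFO and FORMAT columns to be parsed.

```shell
# Hypothetical helper: read a VCF on stdin and print a tab-separated table of
# CHROM, POS, ID and QUAL (VCF columns 1, 2, 3 and 6). Header lines, which
# start with "#", are skipped.
vcf_to_table() {
    awk 'BEGIN { FS = "\t"; OFS = "\t"; print "CHROM", "POS", "ID", "QUAL" }
         /^#/  { next }
               { print $1, $2, $3, $6 }'
}

# Example use (assumes the VCF produced by the UnifiedGenotyper step exists):
#   vcf_to_table < PTC_human.gatk.vcf > PTC_human.gatk.simple.table
```

The output file name in the usage comment is illustrative and chosen so that it does not overwrite the table written by GATK.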
mkatari-bioinformatics-august-2013-gatknotes.txt · Last modified: 2016/08/17 08:37 by mkatari