tutorials:population-diversity:snp-chips

Differences

This shows you the differences between two versions of the page.

tutorials:population-diversity:snp-chips [2020/09/21 11:46] – [Data analysis workflow with Plink 1.9] bngina
tutorials:population-diversity:snp-chips [2020/09/22 10:21] (current) – [Data analysis workflow with Plink 1.9] bngina
Line 165: Line 165:
  
 <code>
 +
 +
 +
 +######### summary statistics ########
  
 #missingness
Line 179: Line 183:
   *//bin_caprin_60k.lmiss// - for the loci
  
-<code> 
  
-#The missing information found in the ''bin_caprin_60k.imiss'' for the individuals looks like below; 
  
 +#The missing information found in the ''bin_caprin_60k.imiss'' for the individuals looks like below;
 +<code>
 FID                               IID MISS_PHENO   N_MISS   N_GENO   F_MISS
             WG6694108-DNA_A01_110kin          Y     1325    53347  0.02484
Line 195: Line 199:
   10           WG6694108-DNA_A10_Zkin2          Y     1349    53347  0.02529
  
-##the information in each header is as follows;
+</code>
  
 +The information in each header is as follows;
 +<code>
 FID                Family ID
 IID                Individual ID
Line 203: Line 209:
 N_GENO             Number of non-obligatory missing genotypes, i.e. the total number of SNPs used
 F_MISS             Proportion of missing SNPs (N_MISS / N_GENO, a fraction between 0 and 1, not a percentage)
 +</code>
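As the column descriptions suggest, ''F_MISS'' can be recomputed as ''N_MISS / N_GENO'' (for example 1325 / 53347 ≈ 0.02484). The following is a minimal sketch, not part of the tutorial's workflow, of how individuals above a chosen missingness threshold could be listed with ''awk''. The function name, the toy file, the FID values and the second sample row are invented for illustration; the sketch assumes the ''.imiss'' layout shown above, with the last five fields in the order IID, MISS_PHENO, N_MISS, N_GENO, F_MISS.

```shell
# Hypothetical helper (not a PLINK command): recompute F_MISS as N_MISS / N_GENO
# for every individual and print those whose value exceeds a threshold.
# Fields are addressed from the end of the line so a blank FID does not shift them.
high_missing_individuals() {
    # $1 = .imiss file, $2 = threshold as a proportion (0.25 means 25%)
    awk -v thr="$2" 'NR > 1 && $(NF-1) > 0 {
        f = $(NF-2) / $(NF-1)                     # N_MISS / N_GENO
        if (f > thr + 0) printf "%s %.5f\n", $(NF-4), f
    }' "$1"
}

# Toy input: the first data row reuses the values shown above;
# the FID values and the second row are invented for illustration.
demo_imiss=$(mktemp)
cat > "$demo_imiss" <<'EOF'
FID IID MISS_PHENO N_MISS N_GENO F_MISS
1 WG6694108-DNA_A01_110kin Y 1325 53347 0.02484
2 invented_sample Y 20000 53347 0.3749
EOF

# List individuals with more than 25% missing genotypes
high_missing_individuals "$demo_imiss" 0.25
```

Running the helper with a lower threshold, such as 0.02, would list both toy individuals, which gives a quick feel for how many samples a given ''--mind'' value might exclude.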
  
-
-#The information found in the ''bin_caprin_60k.lmiss'' for the SNPs is as below;
+The information found in the ''bin_caprin_60k.lmiss'' for the SNPs is as below;
+<code>
  CHR                           SNP   N_MISS   N_GENO   F_MISS
              snp1-scaffold1-2170        4      648 0.006173
Line 218: Line 224:
     snp10004-scaffold1356-853276        3      648  0.00463
     snp10005-scaffold1356-907019        2      648 0.003086
-
-#The information in each column is as follows;
+</code>
+
  
 +The information in each column is as follows;
 +<code>
 SNP                SNP identifier
 CHR                Chromosome number
Line 226: Line 233:
 N_GENO             Number of non-obligatory missing genotypes, i.e. the total number of genotypes in the population
 F_MISS             Proportion of samples missing for this SNP (N_MISS / N_GENO, a fraction between 0 and 1, not a percentage)
 +</code>
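The per-SNP ''F_MISS'' is the complement of the call rate, so a SNP with ''F_MISS'' of 0.006173 has a call rate of about 99.38%. The following is a rough sketch, assuming the ''.lmiss'' layout shown above with the last four fields in the order SNP, N_MISS, N_GENO, F_MISS. It prints each SNP's call rate and whether it would pass a given maximum missingness. The function name, the toy file, the CHR values and the second SNP row are invented for illustration.

```shell
# Hypothetical helper (not a PLINK command): convert F_MISS to a call rate in
# percent and compare it against a maximum allowed missingness.
snp_call_rates() {
    # $1 = .lmiss file, $2 = maximum allowed F_MISS (0.05 corresponds to a 95% call rate)
    awk -v max_miss="$2" 'NR > 1 {
        status = ($NF + 0 > max_miss + 0) ? "FAIL" : "PASS"
        printf "%s %.2f %s\n", $(NF-3), (1 - $NF) * 100, status
    }' "$1"
}

# Toy input: the first data row reuses the values shown above;
# the CHR values and the second row are invented for illustration.
demo_lmiss=$(mktemp)
cat > "$demo_lmiss" <<'EOF'
CHR SNP N_MISS N_GENO F_MISS
0 snp1-scaffold1-2170 4 648 0.006173
0 invented_snp 65 648 0.1003
EOF

snp_call_rates "$demo_lmiss" 0.05
```

Counting the ''FAIL'' lines for a few candidate thresholds is one way to judge how strict a cut-off is before applying it to the real data.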
  
 +We can generate a filtered file by applying thresholds for the rate of missing data per individual (''--mind''), the rate of missing genotypes per SNP (''--geno'') and the minor allele frequency //(MAF)// (''--maf'').
  
-</code>
+The thresholds for these filters should be adjusted to suit each data set.
  
 +<code>
 +
 +#### filter data ###
 +
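 +# Comments are kept above the command: text after a trailing backslash breaks the line continuation.
 +# --geno 0.05 : remove SNPs with more than 5% missing genotypes (95% call rate)
 +# --maf 0.01  : remove SNPs with a minor allele frequency below 1%
 +# --mind 0.25 : remove individuals with more than 25% missing data
 +plink --file ${file} \
 + --geno 0.05 \
 + --maf 0.01 \
 + --mind 0.25 \
 + --out ${out}/bin_caprin_60k_fltrd \
 + --make-bed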
 +
 +</code>
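The ''--maf'' filter compares each SNP's minor allele frequency, that is the frequency of the rarer of its two alleles, against the threshold. The arithmetic can be sketched as follows. This is an illustration only: the function name and the allele counts are hypothetical, and PLINK computes these frequencies itself from the genotype data.

```shell
# Hypothetical helper: minor allele frequency from the counts of the two alleles
# observed at one SNP (counts are supplied by hand for illustration).
minor_allele_freq() {
    # $1 = count of the first allele, $2 = count of the second allele
    awk -v a="$1" -v b="$2" 'BEGIN {
        total = a + b
        if (total == 0) { print "NA"; exit }      # no called genotypes
        minor = (a + 0 < b + 0) ? a : b            # the rarer allele
        printf "%.4f\n", minor / total
    }'
}

# 648 fully genotyped animals carry 1296 alleles; 10 copies of the rarer allele
# give a MAF below 0.01, so such a SNP would be removed by --maf 0.01.
minor_allele_freq 10 1286
```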
 ===== Data analysis workflow with R and adegenet =====
  
tutorials/population-diversity/snp-chips.1600688761.txt.gz · Last modified: 2020/09/21 11:46 by bngina