tutorials:population-diversity:snp-chips

Differences

This shows you the differences between two versions of the page.

tutorials:population-diversity:snp-chips [2020/09/21 11:16] – [Data analysis workflow with Plink 1.9] bngina
tutorials:population-diversity:snp-chips [2020/09/22 10:21] (current) – [Data analysis workflow with Plink 1.9] bngina
Line 165: Line 165:
  
 <code>
 +
 +
 +
 +######### summary statistics ########
  
 #missingness
Line 179: Line 183:
   *//bin_caprin_60k.lmiss// - for the loci
  
-<code> 
  
-#The missing information found in the ''bin_caprin_60k.imiss'' looks like below 
  
 +The missing-data information in ''bin_caprin_60k.imiss'' for the individuals looks like this:
 +<code>
 FID                               IID MISS_PHENO   N_MISS   N_GENO   F_MISS
             WG6694108-DNA_A01_110kin          Y     1325    53347  0.02484
Line 195: Line 199:
   10           WG6694108-DNA_A10_Zkin2          Y     1349    53347  0.02529
  
-##the information in each header is as follows;
 +</code>
  
 +The columns are as follows:
 +<code>
 FID                Family ID
 IID                Individual ID
 MISS_PHENO         Missing phenotype? (Y/N)
 N_MISS             Number of missing SNPs
-N_GENO             Number of non-obligatory missing genotypes
 +N_GENO             Number of non-obligatory missing genotypes, i.e. total number of SNPs considered for this individual
-F_MISS             Proportion of missing SNPs
 +F_MISS             Proportion of missing SNPs (as a fraction, N_MISS / N_GENO; e.g. 0.02484 is about 2.5%)
 +</code>
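The F_MISS column can be cross-checked against the other two, since it is N_MISS divided by N_GENO (for example 1325 / 53347 ≈ 0.02484). The following is a minimal sketch of that check, assuming the whitespace-separated ''.imiss'' layout shown above. The function name `recompute_fmiss` is hypothetical and not part of PLINK. Columns are counted from the right because the FID column is empty in some rows.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PLINK): recompute F_MISS = N_MISS / N_GENO
# for each data row of a .imiss file read from standard input.
recompute_fmiss() {
  awk 'NR > 1 && NF >= 5 && $(NF-1) > 0 {
         # Counting from the right: IID, MISS_PHENO, N_MISS, N_GENO, F_MISS.
         printf "%s\t%.5f\n", $(NF-4), $(NF-2) / $(NF-1)
       }'
}

# Usage with two rows copied from the listing above.
recompute_fmiss <<'EOF'
 FID                               IID MISS_PHENO   N_MISS   N_GENO   F_MISS
             WG6694108-DNA_A01_110kin          Y     1325    53347  0.02484
   10           WG6694108-DNA_A10_Zkin2          Y     1349    53347  0.02529
EOF
```

On a real run the same function could be fed the file directly, e.g. `recompute_fmiss < bin_caprin_60k.imiss`, assuming that file sits in the current directory.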
  
 +The missing-data information in ''bin_caprin_60k.lmiss'' for the SNPs looks like this:
 +<code>
 + CHR                           SNP   N_MISS   N_GENO   F_MISS
 +             snp1-scaffold1-2170        4      648 0.006173
 +        snp1-scaffold708-1421224        8      648  0.01235
 +          snp10-scaffold1-352655        2      648 0.003086
 +     snp1000-scaffold1026-533890        0      648        0
 +    snp10000-scaffold1356-652219        4      648 0.006173
 +    snp10001-scaffold1356-703514        9      648  0.01389
 +    snp10002-scaffold1356-766996       10      648  0.01543
 +    snp10003-scaffold1356-808120        5      648 0.007716
 +    snp10004-scaffold1356-853276        3      648  0.00463
 +    snp10005-scaffold1356-907019        2      648 0.003086
 +</code>   
 +
 +The columns are as follows:
 +<code>
 +CHR                Chromosome number
 +SNP                SNP identifier
 +N_MISS             Number of individuals missing this SNP
 +N_GENO             Number of non-obligatory missing genotypes, i.e. total number of individuals considered for this SNP
 +F_MISS             Proportion of individuals missing this SNP (as a fraction, N_MISS / N_GENO)
 </code>
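Because F_MISS is a fraction, it can be compared directly against a missingness cut-off to see which SNPs exceed it. The sketch below is one way to do that with awk, assuming the ''.lmiss'' layout shown above. The function name `snps_above_threshold` and the 0.01 cut-off are illustrative only and not part of PLINK.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PLINK): print the SNP identifiers whose
# F_MISS (last column) is greater than the threshold given as a fraction.
snps_above_threshold() {
  local threshold="$1"
  awk -v t="$threshold" 'NR > 1 && NF >= 4 && ($NF + 0) > (t + 0) {
         # Counting from the right: SNP, N_MISS, N_GENO, F_MISS.
         print $(NF-3)
       }'
}

# Usage with rows copied from the listing above and an illustrative 1% cut-off.
snps_above_threshold 0.01 <<'EOF'
 CHR                           SNP   N_MISS   N_GENO   F_MISS
             snp1-scaffold1-2170        4      648 0.006173
        snp1-scaffold708-1421224        8      648  0.01235
     snp1000-scaffold1026-533890        0      648        0
    snp10002-scaffold1356-766996       10      648  0.01543
EOF
```

With these four rows the call prints the two SNPs whose missing rate is above 1%, `snp1-scaffold708-1421224` and `snp10002-scaffold1356-766996`.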
  
 +We can generate a filtered data set by applying thresholds for the rate of missing data per individual (''--mind''), the missing call rate per SNP (''--geno'') and the minor allele frequency //(MAF)// (''--maf'').
 +
 +The thresholds for these filters should be adjusted to suit each data set.
 +
 +<code>
 +
 +#### filter data ###
 +
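 +# --geno 0.05 : remove SNPs with more than 5% missing genotypes (95% call rate)
 +# --maf 0.01  : remove SNPs with a minor allele frequency below 1%
 +# --mind 0.25 : remove individuals with more than 25% missing genotypes
 +# (comments are kept off the command lines: text after a trailing "\" breaks the continuation)
 +plink --file ${file} \
 + --geno 0.05 \
 + --maf 0.01 \
 + --mind 0.25 \
 + --out ${out}/bin_caprin_60k_fltrd \
 + --make-bed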
 +
 +</code>
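After filtering, it can be useful to see how many SNPs and individuals remain. The sketch below assumes the binary fileset written by ''--make-bed'', where the ''.bim'' file has one line per variant and the ''.fam'' file has one line per individual. The function name `count_retained` is hypothetical and not part of PLINK.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PLINK): report how many variants and
# individuals are present in a binary fileset, given its prefix.
count_retained() {
  local prefix="$1"
  local n_snps n_samples
  # One line per variant in .bim, one line per individual in .fam.
  n_snps=$(awk 'END { print NR }' "${prefix}.bim")
  n_samples=$(awk 'END { print NR }' "${prefix}.fam")
  printf 'variants=%s samples=%s\n' "$n_snps" "$n_samples"
}

# Example call, assuming the filter step above has already been run:
# count_retained "${out}/bin_caprin_60k_fltrd"
```

Comparing these counts across a few threshold settings is one simple way to judge how strict a given combination of filters is for a particular data set.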
 ===== Data analysis workflow with R and adegenet =====
  
tutorials/population-diversity/snp-chips.1600686960.txt.gz · Last modified: 2020/09/21 11:16 by bngina