tutorials:population-diversity:snp-chips

Differences

This shows you the differences between two versions of the page.

tutorials:population-diversity:snp-chips [2020/09/21 11:07] – [Data analysis workflow with Plink 1.9] bngina
tutorials:population-diversity:snp-chips [2020/09/22 10:09] – [Data analysis workflow with Plink 1.9] bngina
Line 165: Line 165:
  
 <code> <code>
 +
 +
 +
 +######### summary statistics ########
  
 #missingness #missingness
Line 174: Line 178:
 </code> </code>
  
-This created two files.+This creates two files.
  
   *//bin_caprin_60k.imiss// - for the individuals   *//bin_caprin_60k.imiss// - for the individuals
Line 181: Line 185:
  
  
 +The missing-data summary for the individuals, found in ''bin_caprin_60k.imiss'', looks like this:
 +<code>
 +FID                               IID MISS_PHENO   N_MISS   N_GENO   F_MISS
 +            WG6694108-DNA_A01_110kin          Y     1325    53347  0.02484
 +           WG6694108-DNA_A02_105kin1          Y     1346    53347  0.02523
 +             WG6694108-DNA_A03_55kin          Y     1313    53347  0.02461
 +             WG6694108-DNA_A04_50kin          Y     1360    53347  0.02549
 +            WG6694108-DNA_A05_104kin          Y     1350    53347  0.02531
 +             WG6694108-DNA_A06_82kin          Y     1412    53347  0.02647
 +            WG6694108-DNA_A07_75kin1          Y     1387    53347    0.026
 +           WG6694108-DNA_A08_110kin1          Y     1312    53347  0.02459
 +             WG6694108-DNA_A09_77kin          Y     1356    53347  0.02542
 +  10           WG6694108-DNA_A10_Zkin2          Y     1349    53347  0.02529
 +
 +</code>
 +
 +The columns contain the following information:
 +<code>
 +FID                Family ID
 +IID                Individual ID
 +MISS_PHENO         Missing phenotype? (Y/N)
 +N_MISS             Number of missing SNPs
 +N_GENO             Number of potentially valid genotype calls, i.e. total number of SNPs considered for this individual
 +F_MISS             Proportion of missing SNPs, N_MISS / N_GENO (a fraction between 0 and 1, not a percentage; 0.02484 = 2.484%)
 +</code>
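
As a quick sanity check on the column definitions, ''F_MISS'' can be recomputed as ''N_MISS / N_GENO''. The sketch below is only an illustration: the helper name ''recompute_f_miss'' and the temporary demo file are hypothetical, and the two data rows are copied from the listing above.

```shell
# Hypothetical helper: recompute F_MISS = N_MISS / N_GENO for each data row.
# Fields are counted from the right (F_MISS is the last column), so this works
# whether or not the FID column is filled in.
recompute_f_miss() {
  awk 'NR > 1 { printf "%s %.5f\n", $(NF-4), $(NF-2) / $(NF-1) }' "$1"
}

# Two rows copied from the listing above, written to a temporary demo file.
demo_imiss=$(mktemp)
printf '%s\n' \
  'IID MISS_PHENO N_MISS N_GENO F_MISS' \
  'WG6694108-DNA_A01_110kin Y 1325 53347 0.02484' \
  'WG6694108-DNA_A06_82kin Y 1412 53347 0.02647' > "$demo_imiss"

recompute_f_miss "$demo_imiss"
# WG6694108-DNA_A01_110kin 0.02484
# WG6694108-DNA_A06_82kin 0.02647
```

The recomputed values match the ''F_MISS'' column, which confirms that it is a fraction rather than a percentage.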
 +
 +The missing-data summary for the SNPs, found in ''bin_caprin_60k.lmiss'', looks like this:
 +<code>
 + CHR                           SNP   N_MISS   N_GENO   F_MISS
 +             snp1-scaffold1-2170        4      648 0.006173
 +        snp1-scaffold708-1421224        8      648  0.01235
 +          snp10-scaffold1-352655        2      648 0.003086
 +     snp1000-scaffold1026-533890        0      648        0
 +    snp10000-scaffold1356-652219        4      648 0.006173
 +    snp10001-scaffold1356-703514        9      648  0.01389
 +    snp10002-scaffold1356-766996       10      648  0.01543
 +    snp10003-scaffold1356-808120        5      648 0.007716
 +    snp10004-scaffold1356-853276        3      648  0.00463
 +    snp10005-scaffold1356-907019        2      648 0.003086
 +</code>   
 +
 +The columns contain the following information:
 +<code>
 +CHR                Chromosome number
 +SNP                SNP identifier
 +N_MISS             Number of individuals missing this SNP
 +N_GENO             Number of potentially valid genotype calls, i.e. total number of individuals considered for this SNP
 +F_MISS             Proportion of individuals missing this SNP, N_MISS / N_GENO (a fraction between 0 and 1, not a percentage)
 +</code>
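
Before choosing a cutoff, it can help to see which SNPs exceed a given missing rate. The sketch below assumes a hypothetical helper ''snps_above_missing'' and an illustrative cutoff of 0.01 (not a value from this tutorial); the rows are copied from the ''.lmiss'' listing above.

```shell
# Hypothetical helper: print SNPs whose F_MISS (last column) exceeds a cutoff.
# $1 = .lmiss file, $2 = cutoff as a fraction. The SNP name is taken as the
# fourth field from the right, so an empty CHR column does not matter.
snps_above_missing() {
  awk -v max="$2" 'NR > 1 && $NF > max { print $(NF-3), $NF }' "$1"
}

# Three rows copied from the listing above, written to a temporary demo file.
demo_lmiss=$(mktemp)
printf '%s\n' \
  'CHR SNP N_MISS N_GENO F_MISS' \
  'snp1-scaffold1-2170 4 648 0.006173' \
  'snp1-scaffold708-1421224 8 648 0.01235' \
  'snp10002-scaffold1356-766996 10 648 0.01543' > "$demo_lmiss"

# 0.01 is an illustrative cutoff only.
snps_above_missing "$demo_lmiss" 0.01
# snp1-scaffold708-1421224 0.01235
# snp10002-scaffold1356-766996 0.01543
```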
  
 +We can generate a filtered file by setting a maximum rate of missing data per individual (''--mind''), a maximum missing rate per SNP (''--geno''), and a minimum minor allele frequency //(MAF)// (''--maf'').
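
A minimal sketch of such a filtering command, using PLINK's ''--mind'', ''--geno'' and ''--maf'' flags, is shown below. Several things here are assumptions rather than part of the tutorial: the three threshold values are placeholders, ''bin_caprin_60k'' is assumed to be the prefix of a binary fileset, the output prefix ''bin_caprin_60k_filtered'' and the helper ''filter_cmd'' are hypothetical, and any species-specific flags used in the earlier PLINK commands would need to be added as well.

```shell
# Illustrative thresholds only; they are NOT taken from the tutorial.
MIND=0.1    # drop individuals with more than 10% missing genotypes
GENO=0.05   # drop SNPs missing in more than 5% of individuals
MAF=0.05    # drop SNPs with a minor allele frequency below 5%

# Hypothetical helper: print the PLINK command for an input and output prefix.
filter_cmd() {
  printf 'plink --bfile %s --mind %s --geno %s --maf %s --make-bed --out %s\n' \
    "$1" "$MIND" "$GENO" "$MAF" "$2"
}

# Dry run: show the command so it can be reviewed before it is executed.
filter_cmd bin_caprin_60k bin_caprin_60k_filtered

# To actually run it (requires PLINK 1.9 on the PATH):
# sh -c "$(filter_cmd bin_caprin_60k bin_caprin_60k_filtered)"
```

Printing the command first makes it easy to check the thresholds against the ''.imiss'' and ''.lmiss'' summaries before any data are removed.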
 ===== Data analysis workflow with R and adegenet ===== ===== Data analysis workflow with R and adegenet =====
  
tutorials/population-diversity/snp-chips.txt · Last modified: 2020/09/22 10:21 by bngina