Clean SNP file
Read the file, explicitly telling R that the delimiter is a tab and that the file has a header row. Remember to save the file as a tab-delimited text file first.
read.table("Draft sent to Manny.txt", sep="\t", header=T, row.names=1)->draft
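As a quick check of what this call does, here is a minimal sketch on a made-up file. The file contents, the `toy.file` and `toy` names, and the column names are all assumptions for illustration, not the workshop data.

```r
# Hypothetical stand-in for the real file: 3 genotypes, a region column, 2 SNPs.
toy.file <- tempfile(fileext = ".txt")
writeLines(c("id\tregion\tsnp1\tsnp2",
             "g1\tnorth\tA\tG",
             "g2\tsouth\tNA\tG",
             "g3\tnorth\tA\tNA"), toy.file)

# Same arguments as the call above; the first column becomes the row names.
toy <- read.table(toy.file, sep = "\t", header = TRUE, row.names = 1)

dim(toy)         # rows and columns that were read
rownames(toy)    # taken from the first column of the file
sum(is.na(toy))  # fields containing the string NA are read as missing
```

Checking `dim()` right after reading is a cheap way to confirm that the delimiter and header were interpreted as intended.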
To count NAs use is.na(), which returns TRUE for every missing value. The number of TRUE values in each column can then be counted with sum.
apply(is.na(draft), 2, sum)
Identify the columns (SNPs) that have <= 7% missing data
draft[ , which(apply(is.na(draft), 2, sum) <= 0.07*nrow(draft)) ] -> draft.goodsnps
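The column filter can be tried on a small made-up table to see which columns survive the 7% cutoff. This is only a sketch: `toy`, `na.count`, `keep`, and `toy.goodsnps` are hypothetical names, and the data are invented.

```r
# Hypothetical toy data: 10 genotypes (rows), 3 SNP columns.
toy <- data.frame(snp1 = rep("A", 10),
                  snp2 = c(NA, rep("G", 9)),  # 1 of 10 missing = 10%
                  snp3 = rep("C", 10),
                  stringsAsFactors = FALSE)

# Missing values per column (margin 2 = columns).
na.count <- apply(is.na(toy), 2, sum)
na.count

# Keep columns whose NA count is at most 7% of the number of rows.
keep <- which(na.count <= 0.07 * nrow(toy))
toy.goodsnps <- toy[, keep, drop = FALSE]
names(toy.goodsnps)
```

Here snp2 is dropped because 1 missing value out of 10 exceeds 0.07 * 10 = 0.7. `colSums(is.na(toy))` gives the same counts as the `apply()` call, and `drop = FALSE` keeps the result a data frame even if only one column passes.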
Do the same for the genotypes (rows)
draft.goodsnps[ which(apply(is.na(draft.goodsnps), 1, sum) <= 0.07*ncol(draft.goodsnps)), ] -> draft.goodsnps.goodgen
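The row filter works the same way. The sketch below uses `rowMeans()` on the logical matrix to get the fraction of missing values per genotype directly, which is equivalent to comparing the count against 0.07 * ncol. The data and the names `toy`, `na.frac`, and `toy.goodgen` are assumptions for illustration.

```r
# Hypothetical toy data: 3 genotypes (rows), 3 SNP columns.
toy <- data.frame(snp1 = c("A", "A", NA),
                  snp2 = c("G", "G", NA),
                  snp3 = c("C", "C", "C"),
                  row.names = c("g1", "g2", "g3"),
                  stringsAsFactors = FALSE)

# Fraction of missing values per row.
na.frac <- rowMeans(is.na(toy))
na.frac

# Keep genotypes with at most 7% missing data.
toy.goodgen <- toy[na.frac <= 0.07, , drop = FALSE]
rownames(toy.goodgen)
```

Genotype g3 is removed because two of its three SNPs are missing.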
Remove the regions column and keep only the SNPs.
snponly=draft.goodsnps.goodgen[,2:1259]
row.names(snponly)=row.names(draft.goodsnps.goodgen)
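The index range 2:1259 only fits a table with exactly that many columns. A sketch of an alternative that drops the column by name is shown below; the column name "regions" and the `toy` and `snponly.toy` objects are assumptions, so substitute whatever the column is actually called in your file.

```r
# Hypothetical toy data with a regions column followed by SNP columns.
toy <- data.frame(regions = c("r1", "r2"),
                  snp1 = c("A", "G"),
                  snp2 = c("C", "C"),
                  row.names = c("g1", "g2"),
                  stringsAsFactors = FALSE)

# Keep every column except the one named "regions" (assumed name).
snponly.toy <- toy[, setdiff(names(toy), "regions"), drop = FALSE]

names(snponly.toy)     # only the SNP columns remain
rownames(snponly.toy)  # row names carry over from the original table
```

If the regions column is always first, `toy[, 2:ncol(toy)]` avoids the hard-coded upper bound in the same way.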
Check frequency of the different alleles
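One possible way to do this is with `table()`, shown here on invented data. The page does not say how the alleles are coded, so the single-letter coding and the names `toy`, `allele.counts`, and `overall` are assumptions.

```r
# Hypothetical toy data: 4 genotypes, 2 SNP columns, one missing call.
toy <- data.frame(snp1 = c("A", "A", "G", NA),
                  snp2 = c("C", "C", "C", "G"),
                  stringsAsFactors = FALSE)

# Allele counts for each SNP separately; table() ignores NA by default.
allele.counts <- lapply(toy, table)
allele.counts$snp1

# Allele counts and proportions over the whole table.
overall <- table(unlist(toy))
overall
prop.table(overall)
```

On the real data the same calls would be applied to snponly instead of the toy table.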
mkatari-bioinformatics-august-2013-cleansnp.txt · Last modified: 2013/08/19 14:33 by mkatari