



Loading Phytozome Annotations

Read Phytozome annotation file.

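This step is tricky because the file contains quote characters and hash marks. By default, read.table treats quotes as string delimiters and a hash mark as the start of a comment, so rows can be merged or truncated. Setting quote="" and comment.char="" turns off both behaviors and lets the file load without a problem.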

anno<-read.table("Mesculenta_147_annotation_info.txt", 
           header=FALSE, 
           sep="\t", 
           quote="",
           comment.char="")
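The effect of these two settings can be checked on a small inline example. This is only a sketch: the two rows below are made-up stand-ins, not lines from the Phytozome file, and demo_lines and demo are illustrative names.

```r
# Two made-up tab-separated rows; the first contains an apostrophe and a hash mark
demo_lines <- c("1\tgeneA\t5' UTR binding #1 protein",
                "2\tgeneB\tkinase")

# Same settings as above: no quoting characters, no comment character
demo <- read.table(text = demo_lines,
                   header = FALSE,
                   sep = "\t",
                   quote = "",
                   comment.char = "",
                   stringsAsFactors = FALSE)

# Both rows are kept and the description field is read in full
print(dim(demo))
print(demo$V3)
```

Running dim(anno) after loading the real file and comparing the row count with the number of lines in the file is a quick way to confirm that nothing was dropped.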

Creating GO objects for the Fisher test

First, create a new data frame containing only the gene names and GO IDs (columns 2 and 10 of the annotation table). Then source the createGOdata.R script, making sure the script is in your working directory first. This creates two lists in your workspace: go2gene and gene2go.

justGeneGo<-anno[,c(2,10)]
source("createGOdata.R")
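The contents of createGOdata.R are not shown here, so the following is only a sketch of how two such lists could be built, assuming the GO IDs for each gene are stored as one comma-separated string. The data frame toy and the names toy_gene2go and toy_go2gene are hypothetical and chosen so they do not overwrite the real gene2go and go2gene.

```r
# Hypothetical stand-in for justGeneGo: gene name plus comma-separated GO IDs
toy <- data.frame(gene = c("g1", "g2", "g3"),
                  go = c("GO:0003677,GO:0006355", "", "GO:0006355"),
                  stringsAsFactors = FALSE)

# Keep only genes that have at least one GO ID
has_go <- toy$go != ""

# gene -> GO IDs: split each comma-separated string into a character vector
toy_gene2go <- strsplit(toy$go[has_go], ",")
names(toy_gene2go) <- toy$gene[has_go]

# GO ID -> genes: flatten to (gene, term) pairs, then group genes by term
genes <- rep(names(toy_gene2go), lengths(toy_gene2go))
terms <- unlist(toy_gene2go, use.names = FALSE)
toy_go2gene <- split(genes, terms)

print(toy_gene2go)
print(toy_go2gene)
```

Holding both directions of the mapping makes it cheap to look up either the terms for a gene or the genes for a term during the enrichment test.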

Run Fisher Exact Test

If you do not have a sample gene list to work with, simply create a random one.

gene.list<-sample(names(gene2go), 500)
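A random list changes on every run, which makes results hard to compare. A minimal sketch of making the draw repeatable with set.seed is shown below; toy_ids is a made-up vector of gene names and the seed value is arbitrary.

```r
# Made-up gene names standing in for names(gene2go)
toy_ids <- paste0("gene", 1:1000)

# Fixing the seed before sampling gives the same random list each time
set.seed(2013)
first_draw <- sample(toy_ids, 500)

set.seed(2013)
second_draw <- sample(toy_ids, 500)

# TRUE when both draws contain the same genes in the same order
print(identical(first_draw, second_draw))
```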

Start by sourcing the GoTermEnrichment.R script. The function itself is called doFisherTest and takes four arguments: gene.list, gene2go, go2gene, and goAnnot. The first is your list of genes. The next two default to the gene2go and go2gene lists in your workspace, but different lists can be provided. The last is an annotation table for the GO terms. Download the goterm_annot file and place it in your working directory.

To run the function:

source("GoTermEnrichment.R")
goterm_annot<-read.table("goterm_annot", sep="|", strip.white=TRUE, 
                          row.names=1, quote="")
goresults<-doFisherTest(gene.list, goAnnot=goterm_annot)
write.table(goresults, "goresults", quote=FALSE, sep="\t")
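The internals of doFisherTest are not shown here. As a rough illustration of the kind of test a GO enrichment performs for a single term, the sketch below applies base R's fisher.test to a 2x2 table of made-up counts. The counts, the variable names, and the choice of a one-sided test are all assumptions, not a description of what doFisherTest actually does.

```r
# Made-up counts for one GO term
list_with_term    <- 12    # genes in the list annotated with the term
list_without_term <- 488   # genes in the list without the term
rest_with_term    <- 40    # remaining genes annotated with the term
rest_without_term <- 9460  # remaining genes without the term

# 2x2 contingency table: rows = has term or not, columns = in list or not
tab <- matrix(c(list_with_term, list_without_term,
                rest_with_term, rest_without_term),
              nrow = 2,
              dimnames = list(term = c("yes", "no"),
                              set = c("list", "rest")))

# One-sided test: is the term over-represented in the gene list?
res <- fisher.test(tab, alternative = "greater")
print(res$p.value)
```

When many terms are tested this way, the resulting p-values are usually corrected for multiple testing, for example with p.adjust.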
mkatari-bioinformatics-august-2013-loadingphytozome.1399551053.txt.gz · Last modified: 2014/05/08 12:10 by mkatari