Differences

This shows you the differences between two versions of the page.

--- mkatari-bioinformatics-august-2013-clustering [2013/10/11 14:59] – mkatari
+++ mkatari-bioinformatics-august-2013-clustering [2014/12/11 14:41] – mkatari
@@ Line 64: / Line 64: @@
 <code>
 plot(sigGenes.hclust.k2.sil)
+</code>
+====== K-means ======
+The K-means method uses Euclidean distance to measure dissimilarity. Since in biology we are usually more interested in the shape of gene expression profiles than in the magnitude of expression levels, let's scale our data so that each gene (row) has a mean expression of 0 and its expression values are expressed as the number of standard deviations away from that mean.
+<code>
+sigGenesMean = rowMeans(sigGenes.normalized)
+sigGenesSD = apply(sigGenes.normalized, 1, sd)
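+# Sketch of a possible next step (an assumption; not shown in the original page):
+# center each gene (row) on its mean and divide by its standard deviation.
+# sigGenes.scaled is a hypothetical name. The vectors recycle down the columns
+# of the matrix, so row i is paired with sigGenesMean[i] and sigGenesSD[i].
+sigGenes.scaled = (sigGenes.normalized - sigGenesMean) / sigGenesSD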
 </code>
@@ Line 85: / Line 93: @@
 </code>
-Create heatmap. We can save it to a pdf file
+Create a heatmap. We can save it to a PDF file. Note that sigGenes.normalized is just a matrix, so any matrix of values can be provided here instead, for example hclust.k2.cluster2.normalized, which holds the expression values of the genes in cluster 2 (see above).
 <code>