Additional Analytical Methods
All non-flagged array elements for which the fluorescence intensity in each channel was greater than 1.4 times the local background were considered well-measured and were used to generate Fig. 1. Fluorescence ratios were log-transformed (base 2) and stored in a table (rows = individual cDNA clones, columns = single mRNA samples). Where a sample had been analyzed on multiple arrays, the multiple observations of an array element for that sample were averaged. Array elements that were not well-measured in at least 80% of the 96 mRNA samples were excluded. Data for the remaining genes were centered by subtracting (in log space) the median observed value, to remove any effect of the amount of RNA in the reference pool. Fig. 1 depicts the gene expression measurements derived from 4,026 elements on the microarray.
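The filtering and centering steps described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names, the NumPy array layout (rows = clones, columns = samples), and the use of NaN to mark values that are not well-measured are all assumptions, as is the reading that the median is taken per gene across samples.

```python
import numpy as np

def well_measured_log_ratios(red, green, red_bg, green_bg, flagged, min_fold=1.4):
    """Return log2(red/green), with NaN wherever a spot is not well-measured.

    A spot counts as well-measured if it is not flagged and its intensity in
    each channel exceeds min_fold times the local background of that channel.
    All inputs are hypothetical arrays of identical shape (clones x samples).
    """
    ok = (~flagged) & (red > min_fold * red_bg) & (green > min_fold * green_bg)
    ratios = np.full(red.shape, np.nan)
    ratios[ok] = np.log2(red[ok] / green[ok])
    return ratios

def filter_and_center(log_ratios, min_fraction=0.8):
    """Drop poorly measured rows, then median-center each remaining row.

    Assumption: the median is computed per gene (row) over its observed
    values; the text does not state the axis explicitly.
    """
    present = np.mean(~np.isnan(log_ratios), axis=1)  # fraction observed per row
    keep = present >= min_fraction
    kept = log_ratios[keep]
    centered = kept - np.nanmedian(kept, axis=1, keepdims=True)
    return centered, keep
```

Averaging replicate arrays for the same sample would happen between these two steps, for example with `np.nanmean` over the replicate columns.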
Hierarchical clustering was applied to both axes using the weighted pair-group method with centroid average (WPGMC) (Sneath and Sokal 1973) as implemented in the program Cluster (Michael Eisen; http://www.microarrays.org/software). The distance metrics used were Pearson correlation for clustering the arrays and, for the genes, the inner product of vectors normalized to magnitude 1 (a slight variant of Pearson correlation in which the mean is not subtracted). The results were analyzed with TreeView (Michael Eisen; http://www.microarrays.org/software). The datasets used for Figs. 3 and 4 are also available at this website, along with numerous supplementary analyses.
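The difference between the two similarity measures can be made concrete with a short sketch. This is an illustration under simplifying assumptions, not the Cluster implementation: the function names are hypothetical, and the input is assumed to contain no missing values and no constant or all-zero rows.

```python
import numpy as np

def pearson_similarity(x):
    """Pearson correlation between rows: center each row, scale to unit length."""
    centered = x - x.mean(axis=1, keepdims=True)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    return unit @ unit.T

def uncentered_similarity(x):
    """Inner product of rows scaled to magnitude 1 (no mean subtraction)."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T
```

With a table laid out as rows = clones and columns = samples, genes would be compared with `uncentered_similarity(table)` and arrays with `pearson_similarity(table.T)`. The two measures coincide only when every row already has zero mean; two profiles with the same shape but different offsets have a Pearson correlation of 1 and an uncentered similarity below 1.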
Sneath, P. H. A. and R. R. Sokal (1973). Numerical taxonomy: the principles and practice of numerical classification. San Francisco, W. H. Freeman.