Additional Analytical Methods
All non-flagged array elements for which the fluorescence intensity in each
channel was greater than 1.4 times the local background were considered
well-measured and used to generate Fig. 1. Fluorescence ratios were
log-transformed (base 2) and stored in a table (rows = individual cDNA clones,
columns = single mRNA samples). Where samples had been analyzed on multiple
arrays, multiple observations for an array element for a single sample were
averaged. Array elements that were not well-measured in at least 80% of the
96 mRNA samples were excluded. Data for the remaining genes were centered by
subtracting (in log space) the median observed value, to remove any effect of
the amount of RNA in the reference pool. Fig. 1 depicts the gene expression
measurements derived from 4,026 elements on the microarray.
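
The filtering and centering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the spot dictionary layout and the function names are hypothetical, the ratio is assumed to be taken on background-subtracted intensities, replicate observations are assumed to be averaged in log space, and the median is assumed to be computed per array element across samples. Only the 1.4-fold threshold and the 80% cutoff come from the text.

```python
import math
from statistics import mean, median

THRESHOLD = 1.4      # intensity must exceed 1.4 x local background (from the text)
MIN_FRACTION = 0.8   # well-measured in at least 80% of samples (from the text)


def log2_ratio(spot):
    """Return log2(sample / reference) for one spot, or None if unusable.

    `spot` is a hypothetical dict with keys:
    flagged, sample, sample_bg, ref, ref_bg (backgrounds assumed >= 0).
    """
    if spot["flagged"]:
        return None
    if spot["sample"] <= THRESHOLD * spot["sample_bg"]:
        return None
    if spot["ref"] <= THRESHOLD * spot["ref_bg"]:
        return None
    # Assumption: the ratio uses background-subtracted intensities.
    ratio = (spot["sample"] - spot["sample_bg"]) / (spot["ref"] - spot["ref_bg"])
    return math.log2(ratio)


def build_table(observations, samples):
    """Build the centered table {clone: [value or None, one per sample]}.

    `observations` maps clone -> sample -> list of spots (one per array).
    """
    table = {}
    for clone, by_sample in observations.items():
        row = []
        for sample in samples:
            values = [log2_ratio(s) for s in by_sample.get(sample, [])]
            values = [v for v in values if v is not None]
            # Average replicate arrays (assumption: averaged in log space).
            row.append(mean(values) if values else None)
        present = [v for v in row if v is not None]
        if len(present) < MIN_FRACTION * len(samples):
            continue  # element not well-measured in enough samples
        # Assumption: centering uses the per-clone median across samples.
        center = median(present)
        table[clone] = [None if v is None else v - center for v in row]
    return table
```

Missing observations are kept as `None` rather than imputed, so downstream steps must decide how to treat them.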
Hierarchical clustering was applied to both axes using the weighted pair-group
method with centroid average (WPGMC) (Sneath and Sokal 1973) as implemented in
the program Cluster (Michael Eisen; http://www.microarrays.org/software).
The distance metrics used were Pearson correlation for clustering the arrays
and the inner product of vectors normalized to magnitude 1 for the genes (an
uncentered variant of Pearson correlation, in which the vector means are not
subtracted). The results were analyzed with TreeView (Michael Eisen;
http://www.microarrays.org/software). The datasets used for Figs. 3 and 4 are
also available at this website, along with numerous supplementary analyses.
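
To make the difference between the two similarity measures concrete, the sketch below computes both for a pair of vectors. It is an illustration only and not the implementation in Cluster: the function names are hypothetical, vectors are assumed to be complete (no missing values), and converting a similarity to a distance as 1 minus the similarity is an assumption made here for the example.

```python
import math


def uncentered_correlation(x, y):
    """Inner product of x and y after scaling each to magnitude 1."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: the same inner product, after subtracting each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return uncentered_correlation(centered_x, centered_y)


def distance(similarity):
    """Assumption for this sketch: distance = 1 - similarity."""
    return 1.0 - similarity


# Two profiles with the same shape but a shared offset from zero.
x = [1.0, 2.0, 3.0]
y = [2.0, 3.0, 4.0]
r_pearson = pearson_correlation(x, y)        # offset is removed by centering
r_uncentered = uncentered_correlation(x, y)  # offset is retained
```

For vectors whose mean is already zero the two measures coincide; they differ only when a profile is offset from zero, as in the example.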
Sneath, P. H. A. and R. R. Sokal (1973). Numerical taxonomy: the principles
and practice of numerical classification. San Francisco, W. H. Freeman.