Figure 1. Cluster diagram depicting relative gene expression differences between cell lines. The red-green pseudocolor chart depicts gene expression data compared across the different cell lines. Red blocks depict genes that are relatively over-expressed among the measured samples, whereas green blocks depict genes that are relatively under-expressed. The data table has been organized by hierarchical clustering, which groups the genes on the basis of their similarity in expression patterns across a set of experimental samples (e.g. cell lines), and groups the experimental samples on the basis of their similarity in gene expression patterns across the set of chosen genes. The result of the analysis is a re-ordering of the data table such that genes with relatively similar patterns of expression across the sample set are adjacent to one another in the rows, and samples with similar patterns of expression in the set of chosen genes are adjacent to one another in the columns. The dendrogram above the color chart depicts the relative similarities of the cell lines to one another; terminal branches contain cell lines with relatively similar gene expression patterns, whereas cell lines separated by longer branches have relatively less similar gene expression patterns [9]. A) Complete cluster diagram that depicts all 1287 transcripts across 18 independent cell lines, comprising 24 different hybridizations. B) "Common epithelial" cell gene set that was expressed in both basal and luminal cells but was not expressed in the cells that have strong fibroblast-like characteristics. C) Breast basal epithelial cell gene set that was strongly expressed in all HMEC-derived cell lines. D) Stromal-like/fibroblast gene set that was expressed in some fibroblasts as well as in some breast cancer-derived cell lines that were ostensibly misclassified as carcinoma-derived.
E) Luminal epithelial gene set that was expressed in estrogen-receptor-positive cell lines as well as a few other lines. The color scale at the bottom left depicts the gene expression measured in each cell line relative to the average expression for each gene as determined in the 24 different cell line samples.
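The re-ordering described in this legend can be illustrated with a minimal Python sketch. It is not the analysis pipeline used for the figure: the legend does not state the distance metric or linkage rule, so the correlation distance and average linkage used here are assumptions, and the function names and the toy table are hypothetical. Each gene is first expressed relative to its own mean across samples, then rows (genes) and columns (samples) are each clustered and re-ordered so that similar profiles end up adjacent.

```python
from math import sqrt

def center_rows(table):
    """Express each gene (row) relative to its mean across all samples."""
    return [[v - sum(row) / len(row) for v in row] for row in table]

def correlation_distance(a, b):
    """1 - Pearson correlation (assumed metric; not stated in the legend)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    norm = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0:
        return 1.0  # a flat profile is treated as uncorrelated
    return 1.0 - sum(x * y for x, y in zip(da, db)) / norm

def leaf_order(profiles):
    """Average-linkage agglomerative clustering (assumed linkage rule).

    Returns the profile indices in the order of the dendrogram leaves.
    """
    n = len(profiles)
    dist = {(i, j): correlation_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def average_link(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        _, a, b = min((average_link(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]

def cluster_table(table):
    """Re-order genes (rows) and samples (columns) by similarity."""
    centered = center_rows(table)
    row_order = leaf_order(centered)
    col_order = leaf_order([list(col) for col in zip(*centered)])
    reordered = [[centered[r][c] for c in col_order] for r in row_order]
    return reordered, row_order, col_order

# Toy data (hypothetical): 4 genes x 4 samples, with two interleaved groups.
toy = [
    [5, 1, 5, 1],
    [1, 5, 1, 5],
    [6, 2, 6, 2],
    [2, 6, 2, 7],
]
reordered, row_order, col_order = cluster_table(toy)
print("gene order:", row_order)
print("sample order:", col_order)
```

In the re-ordered toy table, positive values would correspond to the red blocks (above the gene's average) and negative values to the green blocks (below the gene's average).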