Variability of Gene Expression in Normal Human Blood Paper Home  

Materials and Methods

Patient information, blood samples, and RNA preparation.

Blood samples from apparently healthy human donors were obtained after informed consent and were treated anonymously throughout the analysis. Volunteer blood donors from the United States averaged 36.5 (±14.8) years of age, and males and females were equally represented. Samples were also obtained from seven apparently healthy volunteer donors in Nepal (25-30 years of age; 4 females and 3 males). Complete blood counts (CBC) were determined at the Stanford University Hospital Clinical Laboratory by automated procedures. Measured parameters included total white count, differential counts for neutrophils, lymphocytes, monocytes, eosinophils, and basophils, red blood cell count, platelet count, hemoglobin, hematocrit, and erythrocyte indices (mean corpuscular volume, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, and red cell distribution width). Time of blood draw was recorded, as were the self-reported health status and medication use of each subject, using a standardized questionnaire. PBMCs from 8 ml of blood were isolated using the Vacutainer Cell Preparation Tubes with Sodium Citrate (Becton Dickinson, Franklin Lakes, NJ) and stored in Trizol reagent (Life Technologies, Rockville, MD) at –80°C, and total RNA was then extracted. Total RNA from 2.5 ml of whole blood was isolated with the PaxGene Blood RNA System (PreAnalytiX, Hombrechtikon, Switzerland) within 24 hours. Whole blood and PBMC total RNA was linearly amplified as previously described (1), with minor modifications (protocol). The quality and quantity of both total and amplified RNA were assessed by electrophoresis and UV spectrophotometry.

The file containing the sample measurements is available from here


cDNA microarray manufacture and hybridization.

The cDNA microarrays were manufactured and hybridized as previously described (2,3) (see also http://brownlab.stanford.edu). Fluorescent (Cy5-labeled) cDNA probes were prepared from the amplified RNA samples as described (2) and hybridized to cDNA microarrays containing 37,632 array elements, representing approximately 18,000 unique human genes, of which 10,250 are named (4). A common reference RNA (Cy3-labeled) was mixed with the Cy5-labeled experimental sample before hybridization to provide a common internal reference standard for comparing relative gene expression levels across arrays (3). The reference for the PBMC experiments was a pool of RNA from a panel of 11 human cell lines (5); the reference for the whole blood experiments was the Stratagene Universal Human Reference RNA (Stratagene, La Jolla, CA). After fluorescence images of the hybridized arrays were obtained using a GenePix 4000B scanner (Axon Instruments, Union City, CA), spots and areas with obvious defects were excluded from further analyses.

Data selection and analysis.

Signal strength criteria were established for each dataset to exclude genes whose variation was associated only with very low signal intensity. For the whole blood dataset, we analyzed only spots with an intensity/background ratio of at least 2.5 in the reference channel and at least 4.0 in the sample channel. For the PBMC dataset, the requirement was an intensity/background ratio of at least 3.0 in either channel. The criterion for inclusion of data in the dataset comparing variation in gene expression in whole blood and PBMCs was an intensity/background ratio of at least 2.5 in either channel. In addition, for each dataset, only genes that were well-measured (as defined by the intensity/background criteria above) in at least 80% of samples were included in further analyses. Data from individual arrays were normalized by setting the mean log expression ratio of all elements on the array to zero and adjusting raw measurements accordingly. For hierarchical clustering, the measurements for each gene (as log ratios) were centered by subtracting the gene's mean across all samples, in order to emphasize relative expression within the experimental dataset.
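The selection and normalization steps above can be illustrated with a minimal Python sketch. It is not the analysis code used for this study. The function names, the use of `None` for spots that fail the signal criteria, and the exclusion of such spots when computing means are all assumptions made for illustration.

```python
from statistics import mean


def well_measured(sample_ratio, ref_ratio, min_sample, min_ref, require_both):
    """Return True if a spot passes the intensity/background thresholds.

    require_both=True models "reference AND sample" criteria (whole blood);
    require_both=False models "either channel" criteria (PBMC).
    """
    sample_ok = sample_ratio >= min_sample
    ref_ok = ref_ratio >= min_ref
    return (sample_ok and ref_ok) if require_both else (sample_ok or ref_ok)


def normalize_array(log_ratios):
    """Shift one array so the mean of its measured log ratios is zero.

    Assumption: spots marked None are ignored when computing the mean.
    """
    measured = [v for v in log_ratios if v is not None]
    offset = mean(measured)
    return [None if v is None else v - offset for v in log_ratios]


def filter_and_center(matrix, min_fraction=0.8):
    """Keep genes measured in at least min_fraction of samples, then
    subtract each kept gene's mean across samples.

    matrix maps a gene identifier to its list of log ratios, one per sample.
    """
    kept = {}
    for gene, row in matrix.items():
        measured = [v for v in row if v is not None]
        if len(measured) / len(row) >= min_fraction:
            gene_mean = mean(measured)
            kept[gene] = [None if v is None else v - gene_mean for v in row]
    return kept


# Whole blood criteria: at least 2.5 in reference and at least 4.0 in sample.
print(well_measured(4.5, 3.0, min_sample=4.0, min_ref=2.5, require_both=True))
# PBMC criteria: at least 3.0 in either channel.
print(well_measured(3.5, 1.0, min_sample=3.0, min_ref=3.0, require_both=False))
```

In this sketch, a gene with measurements in only 3 of 5 samples falls below the 80% requirement and is dropped, while a gene measured in all 5 samples is kept and centered.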

Variance was calculated as [equation not reproduced in this copy].

For the PBMC dataset, an “intrinsic score” was calculated for every element on the array that met the requirements for inclusion in the dataset (~25,000 elements). The intrinsic score is the ratio of the mean squared pairwise difference in a gene’s transcript levels between individuals to the mean squared pairwise difference in that gene’s transcript levels between multiple samples from the same individual.
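As an illustration of the intrinsic score definition, the following Python sketch computes the ratio for a single gene. It is one reading of the definition, not the code used for this study. The function name and the input layout are hypothetical, and it assumes that "between individuals" means all pairs of samples drawn from two different donors.

```python
from itertools import combinations
from statistics import mean


def intrinsic_score(samples_by_donor):
    """Ratio of between-donor to within-donor mean squared pairwise difference.

    samples_by_donor maps a donor identifier to the list of log-ratio
    measurements of one gene in that donor's samples. At least one donor
    needs two or more samples, otherwise the denominator is undefined and
    statistics.mean raises StatisticsError.
    """
    donors = list(samples_by_donor)

    # Squared differences between samples from the same donor.
    within = [
        (a - b) ** 2
        for donor in donors
        for a, b in combinations(samples_by_donor[donor], 2)
    ]

    # Squared differences between samples from two different donors.
    between = [
        (a - b) ** 2
        for d1, d2 in combinations(donors, 2)
        for a in samples_by_donor[d1]
        for b in samples_by_donor[d2]
    ]

    return mean(between) / mean(within)


# Two donors, two samples each: tight within donors, far apart between them.
score = intrinsic_score({"donor_A": [0.0, 0.5], "donor_B": [2.0, 2.5]})
print(score)  # 16.5
```

A score well above 1 indicates that a gene varies more between individuals than between repeated samples from the same individual.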

We used Significance Analysis of Microarrays (SAM) (6) to identify genes whose expression differed significantly between healthy male and female volunteer donors. We applied a false discovery rate threshold of 6.5% and a delta of 0.52.

  1. E. Wang, L. D. Miller, G. A. Ohnmacht, E. T. Liu, F. M. Marincola, Nat Biotechnol 18, 457-9 (2000).
  2. M. B. Eisen, P. O. Brown, Methods Enzymol 303, 179-205 (1999).
  3. A. A. Alizadeh et al., Nature 403, 503-11 (2000).
  4. A. Alizadeh et al., Cold Spring Harb Symp Quant Biol 64, 71-8 (1999).
  5. C. M. Perou et al., Nature 406, 747-52 (2000).
  6. V. G. Tusher, R. Tibshirani, G. Chu, Proc Natl Acad Sci U S A 98, 5116-21 (2001).
