Protocols from the manuscript

Cell Culture, Synchronization and RNA preparation

HeLa S3 cells were plated (2 x 10^6 cells) in 150 mm tissue culture dishes in
Dulbecco's Modified Eagle's Medium with 10% Fetal Bovine Serum and 100U
penicillin-streptomycin (Invitrogen Life Technologies, Carlsbad CA). Cells
were arrested in S phase using a double thymidine block or in mitosis with a
thymidine-nocodazole block essentially as described previously (Whitfield et
al., 2000) and in the supplemental material. Poly(A) RNA was prepared from
cells collected at intervals (typically 1-2 hrs) by lysing cells directly on
the plate using the Fast Track 2.0 mRNA isolation kit (Invitrogen Life
Technologies, Carlsbad CA). Synchrony was monitored by flow cytometry analysis
of propidium iodide stained cells (Stanford Shared FACS facility). Mitotic
cells were collected every 10 minutes using an automated cell shaker (Eliassen
et al., 1998), stored on ice and plated at two-hour intervals in fresh
pre-warmed media (at least 10^6 cells for each time point). Since only a small
number of cells can be obtained by mitotic shake-off, total RNA was prepared
using the ULTRASPEC RNA isolation system (BIOTECX, Houston TX).
The number of cells in S phase at each point was determined using a
5-Bromo-2'-deoxyuridine (BrdU) Labeling and Detection Kit I (Roche Molecular
Biochemicals, Indianapolis IN). Reference RNA was prepared from asynchronously
growing HeLa cells using Trizol (Invitrogen Life Technologies, Carlsbad CA) and
poly(A) RNA isolated by affinity chromatography on oligo-dT cellulose (Amersham
Pharmacia Biotech, Piscataway NJ).
Poly(A) RNA was used as a reference in all experiments except the
mitotic shake-off, where total RNA was labeled.

cDNA synthesis and microarray hybridization

RNA from synchronous cells was reverse-transcribed into Cy5-dUTP (Amersham
Pharmacia Biotech, Piscataway NJ) labeled cDNA and reference RNA was
reverse-transcribed into Cy3-dUTP (Amersham Pharmacia Biotech, Piscataway NJ)
labeled cDNA using standard methods (Eisen and Brown, 1999); details are
available at http://genome-www.stanford.edu/Human-CellCycle/Hela/. Total RNA
samples from cells collected in the mitotic shake-off experiment and total
reference RNA were first amplified using a modified Eberwine protocol (Wang et
al., 2000), and labeled cDNA was then prepared from the amplified RNA. Spotted
cDNA microarrays, containing 22,692 elements representing approximately 16,332
different human genes or containing 43,198 elements representing approximately
29,621 genes (estimated by Unigene Clusters), were manufactured in the
Stanford Microarray Facility (http://www.microarray.org). Equal amounts of
Cy5- and Cy3-labeled cDNA were hybridized to spotted cDNA microarrays and
scanned using a GenePix 4000A Scanner (Axon Instruments, Union City CA).
Detailed protocols are available at
http://brownlab.stanford.edu/protocols.html.

Data processing

Data were extracted by superimposing a grid over each array using GenePix 3.0
software (Axon Instruments, Union City CA). Spots of poor quality, determined
by visual inspection, were removed from further analysis. Data collected for
each array were stored in the Stanford Microarray Database (SMD) and are
available from SMD at http://genome-www.stanford.edu/microarray/ (Sherlock et
al., 2001). Only features with signal intensity at least 20% above background
in both the Cy5 and Cy3 channels, and for which adequate quality data were
obtained for at least 80% of the samples in a given time course, were analyzed
further. Data points that did not meet these criteria are blank in the primary
data tables. Log2(Cy5/Cy3) was retrieved for each data point and used for all
analysis, where (Cy5/Cy3) is the normalized ratio of the background corrected
intensities, as defined in SMD (Sherlock et al., 2001). Because of systematic
differences between experiments (e.g. array batch, labeling methods and
synchronization methods), each time course was centered independently by using
singular value decomposition (SVD) to filter out the first, most significant
eigengene (Alter et al., 2000), which was a dominant, constant vector. Since
SVD requires a full data matrix, missing data points were estimated using a
k-nearest neighbors algorithm (Troyanskaya et al., 2001) with k=12.
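
As a rough illustration of this centering step, the sketch below fills missing values from the k nearest rows and then zeroes the first singular component of the matrix. It assumes a genes x arrays NumPy array with NaN marking missing points. The function names and the unweighted neighbor average are illustrative simplifications, not the published KNNimpute or SVD code.

```python
import numpy as np

def knn_impute(data, k=12):
    """Fill NaNs in each row with the mean of its k nearest rows (simplified)."""
    filled = data.copy()
    for i in np.where(np.isnan(data).any(axis=1))[0]:
        row = data[i]
        missing = np.isnan(row)
        candidates = []
        for j in range(data.shape[0]):
            other = data[j]
            # A neighbor must have values wherever this row is missing.
            if j == i or np.isnan(other[missing]).any():
                continue
            shared = ~missing & ~np.isnan(other)
            if not shared.any():
                continue
            # Root-mean-square distance over the columns both rows share.
            dist = np.sqrt(np.mean((row[shared] - other[shared]) ** 2))
            candidates.append((dist, j))
        candidates.sort()
        neighbors = [j for _, j in candidates[:k]]
        if neighbors:
            filled[i, missing] = data[neighbors][:, missing].mean(axis=0)
    return filled

def remove_first_eigengene(data):
    """Rebuild the matrix with its largest singular component set to zero."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    s = s.copy()
    s[0] = 0.0
    return (u * s) @ vt
```

Calling `remove_first_eigengene(knn_impute(log_ratios))` on one time course would give a centered matrix of the same shape; each time course is treated separately, as in the text.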
These imputed values were used throughout the analysis but were
restored to “unknown” status in the figures and left blank in the
primary data tables (http://genome-www.stanford.edu/Human-CellCycle/Hela/).

Identification of periodically expressed transcripts

A Fourier transform (eq. 1-3) was applied to the data for each clone in an
experiment (Spellman et al., 1998) and the resulting vector (C, eq. 3) of the
sine (A) and cosine (B) coefficients was stored, where T is the cell cycle
period, t is the time after release, f is the phase offset and ratio(t) is the
normalized Cy5/Cy3 expression ratio at time t. The value of f was initially
set to zero. The values obtained for C were determined over a range of 40
values of T equally spaced from 1 hr below to 1 hr above the estimated cell
cycle period, and the resulting values were averaged and stored.
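
Written out under the definitions above, the transform and the magnitude D used in the following steps can be sketched as below. The exact form is an assumption based on the surrounding text and on Spellman et al. (1998), in particular whether ratio(t) enters as the log2 ratio and how the sums are normalized.

```latex
\begin{align}
A &= \sum_{t} \sin\!\left(\frac{2\pi t}{T} + f\right)\,\mathrm{ratio}(t) \tag{1}\\
B &= \sum_{t} \cos\!\left(\frac{2\pi t}{T} + f\right)\,\mathrm{ratio}(t) \tag{2}\\
\mathbf{C} &= (A,\; B) \tag{3}\\
D &= \lVert \mathbf{C} \rVert = \sqrt{A^{2} + B^{2}} \tag{4}
\end{align}
```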
The optimal cell cycle period was determined by finding the value of T at
which the largest number of genes passed an arbitrary magnitude cutoff (D,
eq. 4). Fourier transforms were applied to the data series for each gene
(eq. 1-3), with equally spaced values of T from 0 to 40 hrs in 15-minute
increments.
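
A minimal sketch of such a period scan follows, assuming a genes x time-points array of log ratios and a vector of sampling times in hours. The function names are hypothetical, the magnitude is taken as sqrt(A^2 + B^2), and the scan starts at the first 15-minute step because the angle is undefined at T = 0.

```python
import numpy as np

def fourier_magnitude(log_ratios, times, period, offset=0.0):
    """Magnitude of the (sine, cosine) coefficient vector for every gene."""
    angle = 2.0 * np.pi * times / period + offset
    a = log_ratios @ np.sin(angle)  # sine coefficient per gene
    b = log_ratios @ np.cos(angle)  # cosine coefficient per gene
    return np.hypot(a, b)

def scan_periods(log_ratios, times, cutoffs=(3, 5, 7), step=0.25, max_period=40.0):
    """Count genes whose magnitude exceeds each cutoff at every trial period."""
    periods = np.arange(step, max_period + step / 2, step)
    counts = {
        c: np.array([
            int((fourier_magnitude(log_ratios, times, p) > c).sum())
            for p in periods
        ])
        for c in cutoffs
    }
    return periods, counts
```

Plotting `counts[c]` against `periods` for each cutoff shows where the number of genes above threshold peaks; when several periods tie, the choice between them needs another criterion such as the flow cytometry data.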
The number of genes whose magnitude (D, eq. 4) exceeded the arbitrary
cut-off of 3, 5, or 7 was plotted and a period (T) was chosen that maximized
the number of genes exceeding these thresholds. In most cases, the determined
value of T was consistent with the data obtained by flow cytometry. Because
the experiments do not all start at exactly the same point in the cell cycle,
an offset (f, eq. 1-2) was calculated for each dataset relative to the first
double thymidine arrest. The magnitudes from the Fourier transform (D, eq. 4)
for the 1000 highest scoring clones were summed using different values of f,
equally spaced between 0 and 2π.
The offset that gave the highest average combined magnitude (D, eq. 4)
between the two datasets for these 1000 genes was then used. The Fourier
transform was then repeated on the remaining datasets with the following
values of T and f: Thy-Thy 2 (T = 15.5 hrs, f = 0.5 rad), Thy-Thy 3 (T = 15.4
hrs, f = 0.0 rad), Thy-Noc (T = 18.5 hrs, f = 3.2 rad), and mitotic selection
(T = 24.5 hrs, f = 3.5 rad). The vectors C (eq. 3) for all 5 datasets were
then summed and the genes ranked according to the magnitude (D, eq. 4) of
their combined vectors. Note that the Thy-Noc and mitotic shake-off
experiments, which arrest cells in mitosis, have offsets of approximately half
a cycle (π radians) from Thy-Thy 1, which arrests cells at G1/S.

Since the gene expression profiles of many cell cycle genes do not precisely
match sine and cosine curves, the expression profile of each gene was
correlated to an idealized vector obtained from known genes expressed in each
cell cycle phase (G1/S, S, G2, G2/M) as defined in Figure 2A. Using a standard
Pearson correlation, each gene received a peak correlation score defined as
the highest absolute value correlation between one of the four idealized
vectors and its expression profile (Spellman et al., 1998). The absolute value
of the peak correlation was used to scale the magnitude of the vector (C,
eq. 3), generating a “periodicity score” for each gene (Table 1).

To estimate the minimum periodicity score for a cell cycle regulated gene, the
above analysis was repeated on randomized data. The data were randomized
either within rows only, or within both rows and columns, for each of the five
datasets, starting with the imputed, SVD centered data. The Fourier transform
and correlations were applied using the previously calculated values of T and
f; the resulting vectors (C, eq. 3) were combined for each dataset. The
magnitude (D, eq. 4) of the Fourier transform was scaled by each
“gene’s” peak correlation to one of the four ideal expression profiles.
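
The randomization procedure can be sketched as follows. This is a minimal illustration under stated assumptions: `score_fn` is a hypothetical stand-in for the full periodicity score, "within both rows and columns" is read here as shuffling all values across the whole matrix, and the rank-wise averaging follows the description of combining repeated randomizations given in this section.

```python
import numpy as np

def shuffle_within_rows(data, rng):
    """Permute the time points of each gene independently."""
    order = np.argsort(rng.random(data.shape), axis=1)
    return np.take_along_axis(data, order, axis=1)

def shuffle_rows_and_columns(data, rng):
    """Permute every value across the whole matrix (assumed interpretation)."""
    return rng.permutation(data.ravel()).reshape(data.shape)

def rank_averaged_null(data, score_fn, shuffle, n_repeats=10, seed=0):
    """Average the sorted null scores rank by rank over repeated shuffles."""
    rng = np.random.default_rng(seed)
    runs = [np.sort(score_fn(shuffle(data, rng)))[::-1] for _ in range(n_repeats)]
    return np.mean(runs, axis=0)

def count_false_positives(null_scores, threshold):
    """Number of randomized "genes" scoring at or above the threshold."""
    return int(np.sum(null_scores >= threshold))
```

Dividing `count_false_positives(null, cutoff)` by the number of real clones passing the same cutoff gives a percentage comparable to the estimates quoted in this section.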
This analysis, including the data randomization, was repeated ten times
and the scores combined by averaging the highest score from each
randomization, followed by the second highest, third highest, etc. The false
positive rate at a given periodicity score is estimated from the number of
“genes” that obtain at least that score in the randomized data. We chose a
minimum periodicity score of 3.29, which gave us 1333 clones at an initial
false positive estimate of 1% when the data were randomized in rows. Repeating
this analysis 10 times gave an estimated 10 false positives (0.75%;
periodicity scores of 5.18 - 3.33) when the data were randomized only within
rows and two false positives (0.15%; periodicity scores of 3.72 - 3.30) when
the data were randomized in both rows and columns. The false positive estimate
calculated above is likely to underestimate the true false positive rate
because it does not take into account genes that received high Fourier scores
because they exhibited a sinusoidal pattern in only part of a time course. To
filter out genes that did not show periodic expression, autocorrelations for
each of the 1333 clones were calculated (eq. 5).
The autocorrelation A is the sum over all times t of the product of the ratio
at time t and the ratio at time t + T, where T is the cell cycle period
determined by Fourier analysis. If the data for a gene repeat with a period T,
the autocorrelation will be high.
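
A minimal sketch of this autocorrelation, assuming equally spaced time points and a period rounded to a whole number of sampling intervals; the function name and arguments are illustrative only.

```python
import numpy as np

def autocorrelation(ratios, interval, period):
    """Sum over t of ratio(t) * ratio(t + period) for equally spaced samples.

    ratios   : 1-D profile or genes x time-points array of (log) ratios
    interval : sampling interval in hours
    period   : lag in hours, rounded to a whole number of intervals
    """
    ratios = np.asarray(ratios, dtype=float)
    lag = int(round(period / interval))
    if lag <= 0 or lag >= ratios.shape[-1]:
        raise ValueError("period must span between 1 and n-1 sampling intervals")
    # Pair each point with the point one period later and sum the products.
    return np.sum(ratios[..., :-lag] * ratios[..., lag:], axis=-1)
```

A profile that repeats every T hours pairs peaks with peaks and gives a positive sum, while a profile that is high in one cycle and low in the next gives a negative one.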
Autocorrelation scores were calculated for the Thy-Thy 3 and Thy-Noc
experiments because they represent multiple cell cycles and because points
were taken at equally spaced intervals throughout the time course. In
experiment 3 (Thy-Thy 3), autocorrelations were calculated for periods (T,
eq. 5) of 15, 16 and 17 hours. In experiment 4 (Thy-Noc), autocorrelations
were calculated for periods of 16, 18 and 20 hours. The score for each gene in
a given time course was taken to be the maximum of the three autocorrelations.
The final autocorrelation score assigned to each gene was the sum of the
scores calculated for each of the two time courses. Autocorrelations were used
as a filter to remove genes that showed transient expression despite receiving
a high periodicity score. A total of 199 genes with a negative autocorrelation
(which indicates that the measured ratios do not repeat every cell cycle) were
eliminated from the initial set of putative cell cycle regulated genes.
Autocorrelations were also calculated for data randomized in rows; few genes
received negative autocorrelations in the randomized data, indicating that the
negative scores are unlikely to occur by chance. The distribution of
autocorrelation scores is shown in Supplemental Figure
16. Our final list contains 1134 clones that correspond to 874 UNIGENE
clusters (UNIGENE build 143, released 11-09-2001; 21 clones were not found in
UNIGENE and 66 map to more than one UNIGENE cluster). The data for all 1134
clones as well as the primary data are available at
http://genome-www.stanford.edu/Human-CellCycle/Hela/.

Supplemental Protocols not in the manuscript

This supplement provides experimental details of cell synchronization and
array hybridization. Detailed protocols are also available from
http://brownlab.stanford.edu/protocols.html.

Cell Synchronization

HeLa S3 cells were plated at a density of 2 x 10^6 cells in 150 mm tissue
culture dishes in Dulbecco's Modified Eagle's Medium (Invitrogen Life
Technologies, Carlsbad CA) with 10% Fetal Bovine Serum (Invitrogen Life
Technologies, Carlsbad CA) and 100U penicillin-streptomycin (Invitrogen Life
Technologies, Carlsbad CA). Cells were arrested in S phase by a double
thymidine block as previously described (Whitfield et al., 2000). Twenty-four
hours after plating, cells were blocked with 2 mM thymidine for 17-18 hrs,
released from the arrest for 9 hrs and arrested a second time with thymidine.
After 18 hrs, the cells were released and followed for 30 – 46 hrs depending
on the experiment (see Figures 1-2). To obtain populations of cells in
mitosis, cells were arrested in 2 mM thymidine for 17-18 hrs, released for 4
hrs and blocked in 100 ng/ml nocodazole for 12 hrs. Floating mitotic cells
were collected, washed twice in 1X PBS and replated at a density of 4 x 10^6
cells in each 150 mm dish. The cultures were typically at 10 – 25% confluence
at the synchronous release and reached confluence by the end of the time
course. A detailed protocol for synchronization of HeLa cells by
double thymidine or thymidine-nocodazole blocks is available at
http://genome-www.stanford.edu/Human-CellCycle/Hela/mandm.shtml.

cDNA synthesis and microarray hybridization

RNA from synchronous cells was reverse-transcribed into Cy5-dUTP (Amersham
Pharmacia Biotech, Piscataway NJ) labeled cDNA and reference RNA was
reverse-transcribed into Cy3-dUTP (Amersham Pharmacia Biotech, Piscataway NJ)
labeled cDNA using Superscript II reverse transcriptase (Invitrogen Life
Technologies, Carlsbad CA). Since only a small number of cells could be
obtained from mitotic shake-off, total RNA was first amplified using a
modified Eberwine protocol prior to cDNA synthesis (Wang et al., 2000); total
reference RNA was amplified in parallel. cDNA synthesis was primed using
oligo-dT and random hexamer (for the first and second double thymidine time
courses; Thy-Thy1 and Thy-Thy2), oligo-dT alone (for the third double
thymidine time course and the thymidine-nocodazole time course; Thy-Thy3 and
Thy-Noc), or random hexamer alone (for the mitotic shake-off). Equal amounts
of Cy5- and Cy3-labeled cDNA were mixed and applied to spotted cDNA
microarrays in 3X SSC/0.1% SDS in a final volume of 28 µL (24K arrays) or 38
µL (48K arrays) under 22x40 or 22x60 mm coverslips (Corning or Fisher Brand).
Hybridizations were performed overnight (typically 12 – 18 hrs) in custom
hybridization chambers (Monterey Industries, Richmond CA) at 65°C. After
hybridization, arrays were washed for 2 minutes in each of four different wash
solutions (2X SSC/0.1% SDS, 2X SSC, 1X SSC and 0.1X SSC) and excess wash
solution was removed by centrifugation at 500 x g for 5 minutes. All arrays
were scanned using a GenePix 4000A Scanner (Axon Instruments, Union City CA).
Detailed protocols for microarray manufacture, RNA isolation, cDNA synthesis
and microarray hybridization are available at
http://brownlab.stanford.edu/protocols.html.