We have conducted a study to compare the variability in measured gene expression levels associated with three types of microarray platforms. Total RNA samples were obtained from liver tissue of four male mice, two each from inbred strains A/J and C57BL/6J. The same four samples were assayed on Affymetrix Mouse Genome Expression Set 430 GeneChips (MOE430A and MOE430B), spotted cDNA microarrays, and spotted oligonucleotide microarrays using eight arrays of each type. Variances associated with measurement error were observed to be comparable across all microarray platforms. The MOE430A GeneChips and cDNA arrays had higher precision across technical replicates than the MOE430B GeneChips and oligonucleotide arrays. The Affymetrix platform showed the greatest range in the magnitude of expression levels followed by the oligonucleotide arrays. We observed good concordance in both estimated expression level and statistical significance of common genes between the Affymetrix MOE430A GeneChip and the oligonucleotide arrays. Despite their apparently high precision, cDNA arrays showed poor concordance with other platforms.
High-throughput gene expression technologies emerged in the mid-1990s. Since that time both commercial and academic groups have developed a number of different microarray platforms. Two-color array platforms employ clones or oligonucleotides that are spotted onto glass slides, and two differentially labeled cDNA samples are hybridized together on each array [1]. Although the array-to-array variability of “homemade” spotted arrays can be quite large, the pairing of samples effectively removes this source of variation from treatment comparisons. On one-color platforms such as the Affymetrix GeneChip [2], a single sample is hybridized to each array. The precision of the measurement is achieved by minimizing array-to-array variability. Thus success depends on tightly controlled array production and hybridization methods. The amount of variation within a technology and the amount of agreement across the different platforms are important issues for researchers and core laboratories. A number of studies have been conducted to compare different platforms, but there is no clear consensus [3–7]. While some claim a significant divergence across platforms [3,4,7], others assert that the level of concordance is acceptable [5,6].
The probe contents of the different microarray platforms are varied. Initially, cDNA libraries were predominant, but because of concerns about annotation, clone identity, and probe performance [8], long oligonucleotide (50–70 mer) platforms have become increasingly popular. Commercially available oligonucleotide libraries have gained acceptance in recent years because the annotation and identity of oligonucleotides in these libraries are reliable and the hybridization characteristics of oligonucleotides are generally good.
The Jackson Laboratory’s Gene Expression Core offers assays on three different platforms: spotted cDNA microarrays, spotted oligonucleotide microarrays, and Affymetrix GeneChips. To help our clients make informed choices among these platforms, we performed the small head-to-head comparison described here.
Microarray platforms can be benchmarked with respect to their accuracy and precision [9,10]. Precision can be assessed by observing how similar repeated measurements are to one another. Accuracy, on the other hand, requires a priori knowledge of the gene expression in the biological system being studied. In lieu of such a standard, we chose to compare the results of common genes across the three platforms and use concordance as a surrogate measure of accuracy.
Total RNA samples were isolated from liver tissue obtained from two individual mice from each of the inbred mouse strains A/J and C57BL/6J [11]. Male mice were obtained from production surplus at the Jackson Laboratory, fed to the age of 6 weeks on NIH standard diet (4% fat), and fasted for 4 h prior to tissue collection. The liver samples were stored in RNAlater (Ambion, Austin, TX) immediately following dissection and later homogenized in TRIzol (Invitrogen, Carlsbad, CA). Total RNA was isolated according to the manufacturer’s protocols. RNA quality was assessed using a 2100 Bioanalyzer instrument and an RNA 6000 Nano LabChip assay (Agilent Technologies, Palo Alto, CA).
cDNA was synthesized from 20 μg of each total RNA sample using the SuperScript Indirect cDNA Labeling System (Invitrogen). For samples to be hybridized to cDNA arrays, random hexamer and anchored oligo(dT)20 primers were added to the first-strand cDNA synthesis reaction. Only oligo(dT)20 was used with samples hybridized to oligonucleotide microarrays. In each case, the resulting amino-modified cDNA was labeled according to the manufacturer’s protocols using Cy3 or Cy5 fluorescent dyes (Amersham Biosciences, Piscataway, NJ). Independent cDNA syntheses were performed to generate labeled products for each technical replicate (Fig. 1).
The cDNA microarrays were produced from the 15K NIA [12] and 3K endocrine/pancreas [13] mouse gene clone sets. The individual plasmids were purified and used as templates to amplify cDNA using primers adjacent to cloning sites. Purified polymerase chain reaction products were spotted in duplicate on GAPS II amino-silane coated slides (Corning Life Sciences, Acton, MA).
Oligonucleotide arrays were spotted from the 22K Mouse Release 2.0 Oligo Library (Compugen and Sigma-Genosys). The oligonucleotide probes are 5′-C6-amino modified, 65 base pairs in length, and were arrayed at 50 mM concentration onto UltraGAPS coated slides (Corning Life Sciences). This probe set was not spotted in duplicate due to space limitations on the arrays.
For each array type, probe materials were printed in a 50% (v/v) solution of DMSO using a Virtek ChipWriter Pro contact-printing arrayer (Bio-Rad, Hercules, CA) fitted with Stealth Micro spotting pins (TeleChem International, Sunnyvale, CA). Array production was carried out at 45–50% relative humidity and 20–22°C.
Samples were hybridized to microarrays in Corning hybridization chambers in DIG Easy Hyb solution (Roche Diagnostics, Basel, Switzerland) or Pronto Hyb buffer (Corning Life Sciences) for cDNA and oligonucleotide microarrays, respectively. Murine COT-1 DNA (Invitrogen) was used with all arrays. Hybridization was performed at 42°C for 16–20 h in a volume of approximately 50 μL after which slides were washed and dried prior to scanning.
Hybridizations were performed in pairs using dye reversal to minimize biases arising from differences in the dyes. All comparisons were made directly between A/J and C57BL/6J samples. We used four technical replicates (two dye-reversed pairs) for each individual mouse (Figs. 1A and 1B).
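The rationale for dye reversal can be illustrated with a small numerical sketch. It assumes, for illustration only, that the dye bias acts as an additive constant on the log ratio; the function name and the numbers are hypothetical and are not taken from the study.

```python
def dye_swap_log_ratio(forward, reverse):
    """Combine the log ratios from one dye-reversed pair of arrays.

    forward: log ratio with strain 1 in Cy5 and strain 2 in Cy3,
             modeled here as (true difference) + (dye bias).
    reverse: log ratio with the dyes swapped,
             modeled here as -(true difference) + (dye bias).
    Half the difference of the two cancels the additive dye bias.
    """
    return (forward - reverse) / 2.0


# Hypothetical values: a true log2 strain difference and a dye bias.
true_diff = 1.0
dye_bias = 0.3

forward = true_diff + dye_bias    # first array of the pair
reverse = -true_diff + dye_bias   # dye-reversed array

estimate = dye_swap_log_ratio(forward, reverse)
print(estimate)
```

Under this additive assumption the estimate recovers the strain difference regardless of the size of the dye bias; a gene-specific or intensity-dependent dye effect would need the fuller model described later.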
Microarray images were obtained by scanning each slide in a GenePix 4000B (Axon Instruments, Union City, CA) scanner. Settings were 33% and 100% laser power for cDNA and oligonucleotide arrays, respectively. Image quantitation was performed using the associated software, GenePix 4.0, and median intensity values (without background subtraction) for each spot were used in subsequent data analysis.
Two micrograms of total RNA from each sample was labeled according to protocols recommended by manufacturers. Briefly, after reverse transcription with an oligo(dT)24-T7 primer (Genset, South La Jolla, CA), double-stranded cDNA was synthesized with the Superscript double-stranded cDNA synthesis custom kit (Invitrogen). In an in vitro transcription reaction with T7 RNA polymerase (MessageAmp aRNA kit, Ambion), the cDNA was linearly amplified and labeled with biotinylated nucleotides (Enzo Diagnostics, Farmingdale, NY). Ten micrograms of labeled and fragmented cRNA was then hybridized onto MOE430A and MOE430B GeneChip arrays (Affymetrix) for 16 h at 45°C. Post-hybridization staining and washing were performed according to the manufacturer’s protocols (see Affymetrix Expression Analysis Technical Manual, http://www.affymetrix.com). Finally, the arrays were scanned with a Hewlett Packard argon-ion laser confocal slide scanner. The images were quantified using MAS 5.1 software (MicroArray Suite, Affymetrix) (Fig. 1C).
The data were imported into the R software environment (http://www.R-project.org) and analyzed using the R/maanova package [14]. The spotted array data showed spatial variations and intensity-dependent biases, which were corrected with joint lowess transformation [15,16]. Data for each channel were mean-centered prior to gene level analysis. The R/affy package from Bioconductor (http://www.bioconductor.org) was used to preprocess the Affymetrix data. RMA background correction v2, quantile normalization and median polish were applied to process individual probe values into normalized summary values for each probe set [17]. Data for each GeneChip were mean-centered prior to gene level analysis.
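Two of the simpler preprocessing steps named above, quantile normalization and mean centering, can be sketched as follows. This is a minimal Python illustration of the general techniques, not the R/affy or R/maanova implementation; ties are handled naively, and the function names and toy intensities are hypothetical.

```python
def quantile_normalize(arrays):
    """Force every array to share the same empirical distribution.

    arrays: list of equal-length lists of log intensities, one per array.
    Each value is replaced by the mean, across arrays, of the values
    that hold the same rank within their own array.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda idx: a[idx])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def mean_center(values):
    """Subtract the array (or channel) mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Toy log intensities for two arrays and three probes (invented numbers).
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
centered = [mean_center(a) for a in normalized]
print(normalized)
print(centered)
```

After quantile normalization the two arrays contain the same set of values in different orders, so mean centering shifts both by the same amount.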
Probes with common Mouse Genome Informatics (MGI) gene identifiers were identified across the Affymetrix GeneChips, oligo arrays, and cDNA arrays. Two different sets of common genes were identified, one for the MOE430A GeneChips (set A) and another for the MOE430B GeneChips (set B). These comparisons yielded 3736 and 856 unique, known genes in common among the platforms for sets A and B, respectively. For MOE430A, MOE430B, and cDNA platforms, the MGI identifiers were obtained from TIGR Resourcerer v8.0 (http://pga.tigr.org/tigr-scripts/magic/r1.pl). MGI identifiers for the oligo arrays were retrieved using the GenBank Accession IDs from the Mouse Genome Database [http://www.informatics.jax.org (December 2003)].
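The matching step amounts to intersecting the sets of gene identifiers annotated on each platform. The sketch below shows the idea under simplifying assumptions: each platform is represented as a mapping from probe name to gene identifier, unannotated probes map to None, and all probe names and identifiers are invented placeholders rather than real MGI records.

```python
def common_genes(*platform_maps):
    """Return gene identifiers present on every platform.

    Each argument maps a platform-specific probe ID to a gene identifier
    (or None when the probe has no known gene annotation). Using sets
    collapses multiple probes for one gene into a single unique gene.
    """
    gene_sets = [
        {gene for gene in mapping.values() if gene is not None}
        for mapping in platform_maps
    ]
    return set.intersection(*gene_sets)


# Placeholder probe names and identifiers, invented for illustration.
affy = {"probeset_1": "GENE_A", "probeset_2": "GENE_B", "probeset_3": "GENE_B"}
oligo = {"oligo_1": "GENE_B", "oligo_2": "GENE_C", "oligo_3": None}
cdna = {"clone_1": "GENE_B", "clone_2": "GENE_A"}

shared = common_genes(affy, oligo, cdna)
print(sorted(shared))
```

Running the same function once with the MOE430A annotations and once with the MOE430B annotations would give the analogues of set A and set B.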
A mixed model was fitted to perform a variance component analysis [18,19]. For the two-color arrays, overall variation can be decomposed into array (A), mouse (M), and measurement (ε) variance components. We express the gene-specific model as
$$y_{ij} = \mu + S_{k(ij)} + M_{m(ij)} + D_j + A_i + \epsilon_{ij}$$
where $y_{ij}$ is the logarithmic intensity on array $i$ in dye channel $j$. The data for each gene are decomposed into an overall mean, $\mu$; an effect of strain differences, $S_{k(ij)}$; a mouse effect, $M_{m(ij)}$; a dye effect, $D_j$; an array effect, $A_i$; and measurement errors, $\epsilon_{ij}$. Array and dye are indexed by $i$ and $j$. The strain and mouse indices $k$, $m$ are determined by the values of $i$ and $j$. The mouse term is nested within strain. In some analyses, the mouse effect has been left out, as noted below. Terms for array, mouse, and error were treated as random effects, and variance components were estimated separately for each gene using methods described in Littell et al. [20].
For the Affymetrix platform, the total variation can be decomposed into mouse (M) and measurement (ε) components. The gene-specific mixed model is
$$y_{i} = \mu + S_{k(i)} + M_{m(i)} + \epsilon_{i}$$
where $y_{i}$ is the logarithmic intensity obtained from the RMA analysis. Index $i$ refers to the array, and the mouse and strain indices $k$, $m$ are determined by $i$. With one-color platforms, the array and the error variance components are confounded and the term $\epsilon_{i}$ includes both sources of variation. Again, mouse effects are nested within strain. Variance components for the mouse and error terms were estimated separately for each gene [20].
In all subsequent analyses we dropped the mouse term from the ANOVA models and treated all terms except error as fixed effects. This was necessary because of the small sample size in the experiment. The result is that all samples are treated as independent replicates in the 4 vs 4 sample comparison of strains A/J and C57BL/6J. For each gene, estimates of the relative expression differences between the two strains (log ratios) were calculated using the fixed model. The Fs-statistics and unadjusted p-values were also calculated using the fixed ANOVA model [14,19,21]. False discovery rates were calculated from the unadjusted p-values using q-value software (http://faculty.washington.edu/~jstorey/qvalue/qvalue.R) [22].
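The study computed false discovery rates with the q-value software cited above. As a simpler, related illustration, the sketch below computes Benjamini-Hochberg adjusted p-values, which coincide with q-values when the proportion of true null hypotheses is taken to be 1. It is not the procedure used in the study, and the function name and p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    For the p-value of rank r (1 = smallest) among m tests, the raw
    adjustment is p * m / r; a running minimum taken from the largest
    p-value downward keeps the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented unadjusted p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.5]
adjusted = bh_adjust(pvals)
print(adjusted)
```

A gene whose adjusted value is at most 0.05 belongs to a list in which roughly 5% or fewer of the calls are expected to be false positives, under the usual assumptions of the method.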
Platforms were compared in a pairwise manner. We assessed concordance of the Fs-statistics using a 2 × 2 cross-classification of significant vs nonsignificant genes (Table 1). We assessed concordance of the log ratios [23] using a similar cross-classification of the direction of change among only the significant genes. In both cases, significance was determined using the unadjusted p-value of the Fs-statistic. The critical p-value was varied over a range of levels. A log odds ratio near zero indicates no association. Positive log odds ratios indicate concordant results and negative log odds ratios indicate discordance.
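The concordance measure can be made concrete with a short sketch that builds the 2 × 2 table from two lists of p-values and returns the log odds ratio. The 0.5 added to each cell is an assumption made here to avoid division by zero in sparse tables; the text does not state how empty cells were handled. The function names and p-values are hypothetical.

```python
import math


def significant(pvalues, critical_p):
    """Flag genes whose unadjusted p-value is below the critical value."""
    return [p < critical_p for p in pvalues]


def log_odds_ratio(calls_a, calls_b, correction=0.5):
    """Log odds ratio of the 2 x 2 cross-classification of two platforms.

    calls_a, calls_b: booleans for the same genes in the same order.
    correction: constant added to every cell (an assumption of this
    sketch, not a detail taken from the study).
    """
    pairs = list(zip(calls_a, calls_b))
    both = sum(1 for a, b in pairs if a and b)
    only_a = sum(1 for a, b in pairs if a and not b)
    only_b = sum(1 for a, b in pairs if b and not a)
    neither = sum(1 for a, b in pairs if not a and not b)
    c = correction
    return math.log(((both + c) * (neither + c)) / ((only_a + c) * (only_b + c)))


# Invented p-values for eight genes measured on two platforms.
p_platform_1 = [0.001, 0.002, 0.004, 0.20, 0.30, 0.40, 0.50, 0.60]
p_platform_2 = [0.003, 0.001, 0.30, 0.25, 0.70, 0.80, 0.90, 0.002]

calls_1 = significant(p_platform_1, 0.01)
calls_2 = significant(p_platform_2, 0.01)
lor = log_odds_ratio(calls_1, calls_2)
print(lor)
```

The same function applies to the direction-of-change comparison if the booleans encode "log ratio is positive" for the genes that are significant on both platforms.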
Principal components analysis was performed on the log ratios in the R environment [24].
Mixed model ANOVA was used to decompose the total variability of the array measurements into individual variance components as shown in Figure 2. The distribution of biological variability exhibited a characteristic pattern across all platforms. Most genes have essentially zero biological variance but a small proportion of genes (less than 10%) exhibit large variability from mouse to mouse within a strain. The array variance component is the largest source of variation for the two-color arrays. However, this component cancels out of the strain comparisons due to the pairing of samples on arrays, and thus has little impact on our ability to detect differential expression. For the MOE430A and MOE430B GeneChips, array variability is undoubtedly contributing to the error variance.
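The cancellation of the array component can be seen by differencing the two channels of a single two-color array. The short derivation below assumes the additive model described in the Methods, with an overall mean, strain, mouse, dye, array, and error terms, and writes the two dye channels of array i as j = 1 and j = 2.

```latex
\begin{aligned}
y_{i1} - y_{i2}
  &= \bigl(\mu + S_{k(i1)} + M_{m(i1)} + D_1 + A_i + \epsilon_{i1}\bigr)
   - \bigl(\mu + S_{k(i2)} + M_{m(i2)} + D_2 + A_i + \epsilon_{i2}\bigr) \\
  &= \bigl(S_{k(i1)} - S_{k(i2)}\bigr) + \bigl(M_{m(i1)} - M_{m(i2)}\bigr)
   + \bigl(D_1 - D_2\bigr) + \bigl(\epsilon_{i1} - \epsilon_{i2}\bigr)
\end{aligned}
```

The array effect appears in both channels and drops out of the difference, however large its variance. The remaining dye difference changes sign relative to the strain difference when the dyes are reversed, so it is balanced out by the dye-reversed pairs. On a one-color platform each sample sits on its own array, so no such cancellation occurs and the array effect remains part of the error term.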
The magnitudes of the estimated log ratios show the largest dynamic range for the Affymetrix GeneChips, followed by the oligo arrays, and lastly the cDNA arrays (Figs. 3A and 3C). The different platforms show comparable distributions in terms of the t-statistics (Figs. 3B and 3D).
We varied the critical p-values to observe the pattern of agreement between platforms at different levels of statistical significance. When the critical value is not extreme (p > 0.05), the agreement of the log ratios among platforms is poor, i.e., the log odds ratio is close to zero. As the stringency is increased, the rate of agreement increases.
Oligo arrays and MOE430A showed the highest level of agreement for both statistical significance and the direction of expression change (Fig. 4). At very stringent levels of significance, the gene expression changes of all the platforms agree well, but the best overall agreement was found for the MOE430A versus oligo comparisons. In other words, if a gene is found to be significant on either the MOE430A or the oligo arrays, it is likely to be significant on both and the direction of change is likely to be the same.
The scatter plot of log ratios (Fig. 5) shows good correlation between MOE430A GeneChips and oligo arrays, while for the other two comparisons, many genes are discordant. The first two principal components of the log ratios explain 87% of the variability observed. The factor loadings of the principal components indicate high concordance between MOE430A and oligo platforms and discordance with the cDNA arrays.
The MOE430A platform consistently detected the largest number of significant genes (at a fixed, unadjusted p-value) and had the lowest false discovery rate. At the critical p-value of 0.01, 3% of the significant genes are expected to be false-positives, less than half the rate for oligo arrays and MOE430B GeneChips. False discovery rates for cDNA arrays are the next lowest; at the critical p-value of 0.01, 6% of the significant genes are expected to be false-positives compared with approximately 10% for oligo arrays and MOE430B GeneChips. The same trend is observed for other critical p-values (Table 2, Fig. 6).
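The link between the number of significant genes and the false discovery rate follows from the usual form of the estimate in the q-value approach, given here as a sketch rather than as the exact computation performed by the software. The symbols are introduced for illustration: m is the number of genes tested, R(α) is the number of genes with p-value at or below the critical value α, and the estimated proportion of genes with no true strain difference is written with a hat.

```latex
\widehat{\mathrm{FDR}}(\alpha) = \frac{\hat{\pi}_0 \, m \, \alpha}{R(\alpha)}
```

The numerator is the expected number of false positives at the chosen critical value. For a fixed number of genes tested and a fixed critical p-value, a platform that calls more genes significant has a larger denominator and therefore a smaller estimated false discovery rate.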
The MOE430A GeneChips and cDNA arrays demonstrated higher precision across technical replicates as compared with MOE430B GeneChips and oligo arrays. The MOE430A and MOE430B GeneChips showed the greatest range in the magnitude of expression levels observed followed by the oligo arrays. Despite their apparent high precision, cDNA arrays showed the smallest dynamic range and displayed poor concordance with the other platforms surveyed. The oligo arrays and MOE430A GeneChips showed the highest concordance in both statistical significance and direction of change (sign of the log ratio) for genes shared in common among the platforms.
The high precision of the cDNA arrays found in this study could be attributed to any of several factors such as replicate printing of spots or the relatively high quality of the printed slides (as compared with the oligo arrays). We note that the oligo arrays used in this experiment were from an initial production printing and of a development quality. The design of the experiment at the cDNA synthesis stage seems to provide the most likely explanation for the paradoxical finding that spotted cDNA arrays had the highest precision but the lowest agreement with other platforms. However, these arrays may be measuring something other than expression of the gene indicated in the annotation either due to errors in the annotation of the cDNA clones or to nonspecific hybridization with the longer probes. Technical replication for the cDNA portion of the study was performed only at the array level, while the oligo array processing was replicated in its entirety, incorporating the variability inherent in each step of the method (Fig. 1). MOE430A and MOE430B GeneChip experiments were also technically replicated at the cDNA synthesis step.
Several caveats must be raised before drawing conclusions. First, the number of arrays used was quite small. While eight arrays per platform in an experiment comparing two treatments might be reasonable, it may be too small for accurate variance components analysis. Ideally, statistical significance of expression changes would be based on the biological variability. However, because of the small sample size we used technical error variance from the fixed ANOVA model in the statistical tests. Second, in principle, accuracy cannot be measured directly in this experiment because of the absence of prior knowledge about the true levels of gene expression in each mouse liver sample. We used concordance among multiple platforms as a surrogate measure of accuracy, and it is possible that two platforms share common biases.
Despite these concerns, the high level of concordance of common genes between the MOE430A GeneChips and oligo arrays suggests that both technologies provide reliable estimates of relative gene expression.
This research was supported by U24 DK58750 NIDDK Biotechnology Centers Grant and NCI-CA34196 Cancer Center Support Grant.