Microarray expression analysis has revolutionized many facets of biology and will continue to be applied widely. However, significant questions remain with regard to the generation, analysis, and, in particular, interpretation of microarray data. Although the validation of microarray expression results obtained for specific genes using independent techniques is still considered a desirable component of any microarray experiment, the genes selected for validation a priori are usually identified from the microarray data. The selection is based on the implicit assumption that there is a good correlation between the microarray data and actual mRNA levels in the cells under investigation. One fundamental issue that has not been adequately addressed is how well microarray expression scores reflect actual mRNA levels in the sample being examined.
To facilitate data comparison between research groups it is important that the microarray community moves to adopt consistent validation methodologies. This is especially important if microarray technology is to play a role in the clinical setting [13]. However, the choice of validation methodology remains a contentious issue [14]. To date, qRT-PCR is the method of validation that has been used in the majority of published microarray studies, presumably because it is a rapid, sensitive, high throughput procedure that requires minimal amounts of test material compared to techniques such as Northern blotting or ribonuclease protection assays. As is the case for many studies, including ours, qRT-PCR is often the only feasible approach when rare or unique tissues are investigated. For these reasons, it would appear likely that qRT-PCR will continue to be used extensively for the validation of microarray expression data [15]. To our knowledge, this study is the most extensive and practical examination of mammalian cells that focuses on the degree of correlation between expression level measurements obtained by oligonucleotide microarray analysis and qRT-PCR.
We observed strong correlations (p < 0.05) for the majority (>87%) of the 31 transcript-concordant genes that we examined in this study. In addition, although the MAS 5.0 software and RMA use different algorithms for the normalization of microarray data [8], we found that the degree of correlation between microarray and qRT-PCR results was very similar irrespective of the normalization procedure employed.
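As an illustration of the per-gene comparison described above, the following minimal sketch computes a correlation coefficient between log2 microarray scores and log2 qRT-PCR scores for a single gene across samples. It assumes Pearson's r purely for illustration; the statistic actually used in the study is not restated in this section. The function name `pearson_r` and all sample values are hypothetical and are not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the centred values.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 scores for one gene across six samples (made-up values,
# NOT measurements from the study).
microarray_log2 = [6.1, 6.8, 7.4, 8.0, 8.9, 9.5]
qrtpcr_log2 = [-1.9, -1.1, -0.6, 0.2, 1.0, 1.4]

r = pearson_r(microarray_log2, qrtpcr_log2)
print(f"r = {r:.3f}")
```

Running the same calculation once on RMA-normalized scores and once on MAS 5.0 scores for each gene would give the kind of side-by-side comparison discussed in this paragraph. Note that the two assays are on different scales, so only the correlation, not the absolute values, is compared.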
Our data clearly demonstrate that similar microarray scores for different genes do not necessarily mean that similar qRT-PCR scores will be obtained. For example, ATBF1, OSF2, and SNIP1 yielded similar average log2 RMA scores (~6.6) but the average log2 qRT-PCR scores for the same genes were substantially different (0.27, -1.26, and 1.08, respectively). Similarly, KIT and ABCC4 exhibited identical average log2 MAS 5.0 scores (~7.5), while the corresponding average log2 qRT-PCR scores were -2.73 and 0.09, respectively. The finding that genes with similar microarray expression scores were unlikely to have similar qRT-PCR results presumably reflects the different hybridization kinetics of the probe sets for each gene. This observation has a major implication: on the basis of the qRT-PCR data that we obtained, it is generally not feasible to predict the true expression level of one gene based on the microarray expression score of another. In addition, we observed significant correlations for many genes with microarray expression scores, at least by RMA, of less than 100 (log2 100 ≈ 6.64), which is at the lower end of the range of microarray scores we obtained in this study (range 6–23000). This finding indicates that the exclusion of genes with low microarray expression scores (e.g. <100) from further analysis, as was adopted by some research groups in early microarray studies, may not be justified.
Determining fold-changes in gene expression levels between subsets of interest is often a critical aim of microarray studies. We found a significant and strong correlation between fold-changes determined by microarray analysis and by qRT-PCR, using both RMA (r = 0.89, p < 0.05) and MAS 5.0 (r = 0.92, p < 0.05). These data indicate that the direction of change of gene expression levels (i.e. either up or down regulation) between subsets of interest is accurately predicted by comparison of average microarray expression scores. Again, the fold-change correlations we observed were very similar irrespective of the normalization procedure we employed. Consistent with the results of Yuen et al. (2001) [16], fold change results determined by qRT-PCR were significantly greater than fold change assessed for the same genes by microarray analysis.
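The fold-change comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: it assumes that scores are already on a log2 scale, that the fold-change between two subsets is taken as the difference of their mean log2 scores, and that agreement is judged by the sign of that difference. The function names and all input values are hypothetical.

```python
def log2_fold_change(subset_a, subset_b):
    """Mean log2 score of subset A minus mean log2 score of subset B.

    A positive result means higher average expression in subset A.
    """
    mean_a = sum(subset_a) / len(subset_a)
    mean_b = sum(subset_b) / len(subset_b)
    return mean_a - mean_b


def linear_fold_change(log2_fc):
    """Convert a log2 fold-change to a linear expression ratio."""
    return 2.0 ** log2_fc


def same_direction(log2_fc_1, log2_fc_2):
    """True if both fold-changes are non-zero and have the same sign."""
    if log2_fc_1 == 0 or log2_fc_2 == 0:
        return False
    return (log2_fc_1 > 0) == (log2_fc_2 > 0)


# Hypothetical log2 scores for one gene in two subsets (made-up values,
# NOT measurements from the study).
array_fc = log2_fold_change([8.0, 9.0], [6.0, 7.0])  # microarray
pcr_fc = log2_fold_change([2.5, 3.5], [-0.5, 0.5])   # qRT-PCR

print(linear_fold_change(array_fc), linear_fold_change(pcr_fc))
print(same_direction(array_fc, pcr_fc))
```

In this made-up example both assays agree on the direction of change while the qRT-PCR fold-change is the larger of the two, which is the pattern described in the paragraph above.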
A recent study addressing gene expression profiles in Arabidopsis reported a good correlation between oligonucleotide microarray and SYBR green qRT-PCR data when ratios of gene expression in shoot tissue versus root tissue were compared for highly expressed genes. However, the correlations between shoot versus root ratios were generally poor for genes expressed at low levels [17]. We observed a similar trend towards poorer correlation for genes that exhibited fold-change differences of <1.5 between subsets of interest based on microarray expression scores compared to those with fold-change differences of >1.5. It is likely that this trend relates to the fact that small variations in mRNA levels (<2-fold) can be accurately detected by qRT-PCR, while the smaller dynamic range of microarrays means that the same changes may not be accurately reflected by microarray expression scores, especially for genes expressed at low levels (<1.5 pM or approximately 3.5 copies/cell) [18]. This latter point is a likely explanation for the poor correlation observed for one gene, DMBT1, which is expressed at very low levels according to our qRT-PCR data. Etienne et al., 2004 [20] observed a lower overall correlation between microarray and semi-quantitative RT-PCR data compared to our study. These authors hypothesized that, in addition to genes with low expression levels, those with very high expression levels or a greater percentage of absent calls may show lower levels of correlation between Affymetrix expression scores and semi-quantitative RT-PCR data. We considered these issues in relation to the other poorly correlating genes in our study and found that none were expressed at levels that approach the fluorescence ceiling for the Affymetrix scanner (~50000). In addition, the absolute number or percentage of absent calls did not correlate significantly (p > 0.05) with the level of correlation between qRT-PCR results and microarray data (data not shown). It is possible that the differences between our results and those of Etienne and co-workers are related to the particular semi-quantitative RT-PCR methodology employed by these researchers, which may not be as sensitive as qRT-PCR and, as the authors point out, may not detect certain low level transcripts.
In addition to DMBT1, mentioned above, we identified 13 other poorly correlating genes from the 48 genes we assessed. Careful analysis of the alternative transcript data available through the LocusLink database (http://www.ncbi.nih.nlm/LocusLink) indicated that for 10 of these 13 genes, different subsets of alternative transcripts may be recognized by microarray probe sets and qRT-PCR probes. Hence, this may be the explanation for the poor correlations observed for these genes. Possible explanations for the poor correlations that were observed for the three remaining genes (p53, UMPCMPK, and TERF2), all of which were transcript-concordant, include the existence of alternative cross-hybridising transcripts differentially recognized by the oligonucleotide probe sets and qRT-PCR probes, gene-specific variation related to the different hybridization kinetics associated with the two technologies, and misleading results associated with errors in GenBank sequence data and/or probe set annotations [21]. Additional experimental data will be required to address these possibilities. It is important to note that in our hands the reproducibility of both the qRT-PCR and oligonucleotide microarray methods is very high [22]. Hence, it is unlikely that poor correlations observed in our study are associated with issues of experimental precision.
Interestingly, the microarray and qRT-PCR expression data correlated well for five genes for which the microarray probe sets were deemed unlikely to recognize the same transcripts as the qRT-PCR probes. These data suggest that despite the possibility of differential transcript recognition, identical transcripts were being detected by both assays in the particular tissues involved.