Variation in cDNA microarray analysis of gene expression using unamplified poly(A)+ RNA
In order to assess the reproducibility of microarray hybridization using standard methods, poly(A)+ RNA was isolated from both primary breast carcinoma BC2 and Universal Human Reference total RNA (Stratagene®). The BC2 poly(A)+ RNA labelled with Cy5 and the reference poly(A)+ RNA labelled with Cy3 were hybridized on five 42,000 clone cDNA microarrays. 16,333 clones had a signal greater than 50% above background on all five arrays. Three hybridizations were done on the same day using arrays from the same print batch and the average correlation coefficient between any two hybridizations was 0.97 ± 0.01, demonstrating a high reproducibility between parallel hybridizations done on the same day. Another two hybridizations were performed using poly(A)+ RNA isolated from BC2 total RNA three months later and a different print batch of microarrays. The correlation coefficient between these two arrays was 0.95, similar to the average correlation coefficient of the first three arrays. However, when the unamplified poly(A)+ RNA arrays performed weeks apart from different print batches were compared, the average correlation coefficient dropped to 0.89 ± 0.01, indicating that experimental variations due to differences in microarray printing, poly(A)+ RNA isolation, and RNA labelling and hybridization contribute a small but detectable change in results.
In order to minimize experimental variations, we created a virtual poly(A)+ RNA expression array that idealizes the gene expression of sample BC2 by using the average expression level of each clone over multiple hybridizations. By "expression level" we mean the normalized log (base2) ratio of signal intensities of Cy5 (experimental sample) to Cy3 (reference) fluorescence. The idealized expression profile from the poly(A)+ RNA virtual array is used as our unamplified "gold standard" for data analyses involving BC2. The correlation coefficient between each individual poly(A)+ RNA array and the gold standard ranges from 0.95–0.97, similar to that observed between hybridizations performed on the same day. The gold standard virtual array therefore represents well-measured gene expression in the primary tumor and minimizes individual experimental variations.
Template-switching does not affect the fidelity of amplification
As previously mentioned, a protocol based on T7 based linear amplification published by Wang, Miller and coworkers [24
] incorporated a TS mechanism [29
] in the synthesis of second strand cDNA at the 5' end in order to generate full-length ds cDNA. This was speculated to be of advantage in the hybridization of unmapped sequences spotted on arrays and to enable higher temperature cDNA synthesis that would enhance sequence specificity. However, no experimental evidence was provided to support the idea that the TS mechanism increases the fidelity of amplification.
To determine whether the addition of TS primer improves the fidelity of amplification, we compared gene expression profiles of aRNA amplified in the presence or absence of TS primer with expression profiles of unamplified poly(A)+
RNA. Total RNA isolated from primary breast carcinoma BC91 and reference total RNA were amplified with or without TS primer using the Wang-Miller protocol [24
], except that aRNA was purified using an RNeasy®
). A virtual "gold standard" poly(A)+
RNA array was created for BC91 using the average expression level of four hybridization replicates of unamplified poly(A)+
RNA. A "virtual correlation coefficient" for a given amplification protocol was obtained by comparing the virtual amplified array (averaged expression level for each clone from multiple amplified samples) to the virtual gold standard unamplified array for BC91. To determine the correlation between individual amplified samples and the gold standard, an "average correlation coefficient" for a given amplification protocol was also calculated (the sum of correlation coefficients of individual amplified samples with the gold standard divided by the number of amplified samples tested for each condition).
As shown in Table , using aRNA amplified from total RNA as reference, the expression profiles obtained in the absence of TS primer correlated with the gold standard slightly better than in the presence of TS primer, although the difference was not statistically significant. When poly(A)+ RNA rather than total RNA was amplified as reference, the correlation with the gold standard was slightly, but not statistically significantly, better with TS primer (Table ). The correlation coefficient using aRNA amplified from total RNA as reference is higher than using aRNA amplified from poly(A)+ RNA as reference regardless of whether TS primer is used. This suggests that when total RNA from a tumor sample is amplified for microarray analysis, the reference RNA should also be amplified from total RNA.
Table 1 Correlation coefficients of amplified and unamplified expression levels of 14,044 genes selected according to the described criteria. Amplifications with or without TS primer and with two different ds cDNA cleanup protocols were performed on BC91 total (more ...)
The yield of aRNA amplified from BC91 total RNA using different protocols is shown in Table . Assuming 1% of the total RNA is poly(A)+ RNA, a 253 and 370-fold amplification was observed in the presence and absence of TS primer, respectively. The yield of aRNA amplified from total RNA which are generated from cultured cell lines was 2- to 3-fold higher than when the primary tumor total RNA was amplified (data not shown).
Efficacy of amplification using 3 μg total RNA from BC91 and different ds cDNA cleanup methods, with or without TS primer.
These experiments demonstrate that the TS mechanism does not increase the fidelity of amplification, and therefore can be eliminated from the protocol. The reasons for the limited effect of the TS mechanism on the correlation coefficients probably are: 1) the second strand cDNA synthesis primed by the TS primer probably represents a small fraction, while the majority of the synthesis is self-primed or primed by small pieces of RNA generated by RNase H; and 2) adding a few base pairs to the ds cDNA prior to in vitro transcription does not change the aRNA significantly enough to affect the array hybridization.
DNA ligase activity is not required for amplification
Amplification protocols that do not include DNA ligase in second strand cDNA synthesis generate the same length aRNA (ranging from 0.2 kb to 6 kb, data not shown) as generated from a widely used T7 based amplification protocol developed by Affymetrix®
] which uses E. coli
DNA ligase. The correlation coefficient between amplified and unamplified sample and the yield of aRNA amplified without DNA ligase are high enough to suggest that DNA ligase activity is not necessary for RNA amplification in microarray analysis. For confirmation, we omitted DNA ligase from the protocol developed by Affymetrix®
and compared the expression profiles of the resulting aRNA to aRNA obtained using the standard Affymetrix®
protocol that includes DNA ligase. The correlation coefficient between amplified and unamplified samples is slightly higher in the absence of ligase (Table ), supporting our previous conclusion that ligase is not required for total RNA amplification in cDNA microarray analysis. However, the yield of aRNA is higher when ligase is used, suggesting that ligase may play a role in improving the efficiency of amplification.
Effect of DNA ligase on the fidelity of amplification.a,b
Column cleanup of ds cDNA does not improve the fidelity of amplification, but decreases the yield of aRNA
In the Wang-Miller protocol [24
], ds cDNA is purified using a Bio-6 column (Bio-Rad). A drawback to this method is that the cDNA is eluted with a large volume and needs to be concentrated into a much smaller volume by lyophilization prior to in vitro transcription. This is a time-consuming step, especially when large numbers of samples are processed. To eliminate the lyophilization step from the protocol, we used an alternative column–the Sephadex™ G-50 column–to filter out free nucleotides from the ds cDNA after completion of the second strand synthesis reaction. The ds cDNA is then precipitated following phenol-chloroform extraction and re-suspended in proper volume for in vitro transcription. The correlation coefficient between amplified and unamplified samples and the yield of aRNA using this less time-consuming modification are similar to that using the Wang-Miller protocol (Tables and ).
We further explored the question of what effects the ds cDNA column cleanup step itself had on amplification. We amplified total RNA from tumor BC2, either with or without the cleanup step of Sephadex™ G-50. Seven amplifications were done on different dates with the Sephadex™ G-50 column and five amplifications were done without this cleanup step. Both the virtual and the average correlation coefficients using the column are slightly lower than without it (Table ), suggesting that the column cleanup does not improve the fidelity of amplification. Moreover, the yield of aRNA is significantly higher without the column purification of ds cDNA, suggesting some loss of ds cDNA on the column. Since the column had a negative effect on amplification by decreasing the yield of aRNA without improving the fidelity of amplification, we eliminated this step from our protocol.
Effect of column cleanup on the fidelity and yield of amplification.a,b
Effect of in vitro transcription time on the fidelity of RNA amplification
To determine the effect of in vitro transcription time on amplification, duplicate reactions were performed at 37°C for 2, 3, 4, 5 and 6 hours. Two additional 5-hour incubation reactions were stored at 4°C overnight to determine the effect of low temperature incubation on amplification. The virtual correlation coefficient is slightly higher for the 5-hour incubation at 37°C (Figure ). However, in vitro transcription for 5 hours at 37°C plus overnight incubation at 4°C gives the highest yield of aRNA (Figure ). Since the yield of aRNA at any time point is sufficient for multiple hybridizations, we decided to use 5-hour incubation at 37°C for all subsequent amplifications.
Figure 1 Effects of in vitro transcription time on the fidelity of T7 based amplification and the yield of aRNA amplified from BC2 total RNA. Average correlation coefficients between amplified samples vs. unamplified poly(A)+ RNA at each time point are shown in (more ...)
Evaluation of the fidelity of T7 based linear amplification protocols
To systematically evaluate the fidelity of T7 based amplification, we compared the correlation coefficient obtained from four different protocols. The correlation coefficients of individual samples amplified using different protocols with the gold standard range from 0.74–0.86 (Figure ). The scatter plots comparing the gene expression of the virtual amplified samples for each protocol with the unamplified gold standard are shown in Figure , with the virtual correlation coefficients ranging from 0.83–0.86. The differences in correlations obtained using different protocols are not statistically different by Student's t-test, demonstrating that differences in gene expression for samples amplified using different protocols are minor.
Figure 2 Box graph of correlation coefficients of the gene expression levels for 11,123 clones, comparing individual amplified samples to the gold standard of BC2 (idealizing unamplified poly(A)+ RNA). Each closed circle represents the correlation coefficient (more ...)
Figure 3 Scatterplot matrix using average expression ratios of multiple replicate amplifications for each protocol and the gold standard. The X-axis and Y-axis show virtual gene expression level [normalized log(base2) fluorescence intensity ratio of sample to (more ...)
Our results also suggest that the level of bias introduced into gene expression profiling by amplification is relatively low; expression profiles obtained using aRNA provide a close approximation of the true expression profile of the original sample. To assess the biases of amplification quantitatively, we calculated the number and percentage of genes with expression level change by 4- or 2-fold after amplification. The biases of amplification by different protocols are similar. Specifically, less than 0.2% of 11,123 genes (12–15 genes) changed their expression level by 4-fold or greater and less than 6% (306–594 genes) changed expression by 2-fold or greater after amplification. With the Jeffrey lab protocol, less than 4% of genes showed changes in expression level by 2-fold or greater. Of the genes that changed, 7 genes and 139 genes changed their expression in all four protocols in the same direction greater than 4-fold and 2-fold, respectively. Also, the virtual correlation coefficients between different protocols are high (average 0.95) (Figure ), suggesting that slight differences in protocols based on T7 linear amplification mechanism do not affect the correlation of amplified samples to unamplified samples. These results suggest the conclusion that aRNA provides a close approximation of the true expression profile of the original sample.
We present here a components of variance model for explaining the different sources of variation in the amplification protocols (see Statistical Appendix, additional file 1
). The expression measurement for a gene for a specific array/protocol/sample can be broken down as
X = Z + e, where
X is the measured expression value
Z is the "true" expression that does not change under replication, and
e is the measurement error that does change under replication.
While we cannot estimate Z and e directly, we can estimate their variances from the data. For the four different amplified protocols and the unamplified arrays, the relevant variances are estimated in Table . The variance of the true expression (Var Z) ranges from 0.623–0.661 for the amplified protocols and is 0.726 for the unamplified arrays. The variance of the measurement error (Var e) ranges from 0.055–0.102 for the amplified protocols and is 0.059 for the unamplified arrays. The estimates of Var Z were obtained by averaging the pairwise covariances of the replicates within each protocol, and the estimates of Var e by using the within-protocol variance. (While the variance of a collection of numbers measures how much they vary about their mean, the covariance of two sets of numbers measures how much they vary with respect to each other. See Statistical Appendix for more details). We notice that Var Z for unamplified poly(A)+ RNA is larger than all of the others, indicating a dampening effect on gene expression by amplification. Measurement error variance is lowest for the Jeffrey lab protocol.
Variance of true gene expression (Var Z) and measurement error (Var e) for each of the different amplified protocols and the unamplified arrays.
The covariances between the different Zs for the different methods (estimated from the virtual arrays) are shown in Table . The off-diagonal elements of Table are the covariances; the diagonal elements are the variances of the virtual arrays. We notice that covariances among the amplified protocols, which range from 0.62 to 0.66, are higher than their covariances with the unamplified arrays, which range between 0.57 and 0.61. Furthermore, the variances of the amplified protocols (0.63 to 0.68) are lower than that of the unamplifed (0.74).
Covariances between the "true" gene expression for the different amplification protocols, estimated from the virtual arrays. The diagonal of the table contains the variances for the techniques.
This suggests the following further breakdown:
Z = Zc + Zs, where
Zc is a common expression component, with variance about 0.6, and
Zs is a specific expression component, with variance about 0.04 for the amplified arrays and 0.14 for the unamplified arrays.
Therefore, the amplified expression values for genes on an array are largely the same as on the unamplified array. The component of variation in which they differ appears to be common for all amplified protocols, and shows a much higher variance in the unamplified arrays. The effect of amplification can be summarized by saying that it has a dampening effect on the true expression of some genes (decreased variance in gene expression – see Figure ).
Figure 4 Scatterplot of the t-statistics (the numerical score underlying a t-test) comparing the differences in gene expression between amplified and unamplified RNA for BC2 (X-axis) and BC91 (Y-axis). The tests were based on 7 amplified total RNA and 5 unamplified (more ...)
A recent report compared amplified expression profiles of different primary breast tumors using Affymetrix®
]. Unger et al.
found that correlation coefficients between gene expressions in different tumors revealed by aRNA ranged between 0.71–0.89. However, gene expression was not measured using unamplified poly(A)+
RNA from these different tumors, raising the question of whether the observed fairly high correlation between diverse tumors was due to amplification bias. To answer this question, we compared the correlation coefficients between gene expression profiles of different tumors determined either with unamplified poly(A)+
RNA or aRNA amplified using our protocol (Table ). The correlation coefficients between BC2 and BC91 measured using poly(A)+
RNA or aRNA are the same, 0.55. Moreover, the correlation between differences in gene expression between amplified and unamplified samples for different tumors is weak (Figure ), suggesting that genes that differ through amplification depend on the sample rather than on systematic changes from amplification. Our results demonstrate that different primary breast tumors are not closely related to each other in gene expression profiles, and amplification does not affect the correlation of gene expression between different tumors. Amplification is therefore suitable for comparison of gene expression profiles among large sample sets.
Correlation between expression levels of different tumors (BC2 and BC91) determined with both poly(A)+ RNA and aRNA for each tumor.a,b
Evaluation of reproducibility of T7 based linear amplification
Another important aspect of RNA amplification is the degree of reproducibility. To evaluate this, we calculated the correlation coefficients between individual amplified samples. The correlation coefficients between individual hybridizations done on the same day for poly(A)+ RNA averaged 0.97 and for aRNA amplified on the same day averaged from 0.91–0.98 (Table ). The correlation coefficients between individual hybridizations done on different days averaged 0.89 for poly(A)+ RNA and 0.85–0.90 for aRNA amplified from total RNA on different days. The reproducibility of our protocol is slightly better than that using other protocols. Notably, when samples are amplified on the same day, the correlations are significantly higher than when samples are amplified on different days. In addition, samples amplified with protocols omitting ligase activity give higher reproducibility regardless of whether they are amplified on the same day or not.
Evaluation of the reproducibility of T7 based amplification.a,b
The effect of the amount of input total RNA on amplification
To determine the effect of the amount of input total RNA on single round amplification, we amplified different amounts of total RNA, 3 μg, 1 μg, 300 ng, 100 ng, 30 ng and 10 ng, using different amounts of T7 primer according to the quantity of input total RNA (Table ). When the input total RNA is lower than 300 ng, the yield of aRNA for is lower than the standard quantity required for one hybridization (3 μg). At amounts greater or equal to 300 ng, the correlation coefficients between amplified and unamplified samples and among amplified samples remain about the same. The fold of amplification increases with smaller quantities of template RNA, but the absolute yield of aRNA decreases. Therefore, within the range of 0.3–3 μg total RNA, decreasing the input RNA does not affect the fidelity and reproducibility of amplification.
The effect of the amount of template BC2 total RNA on the fidelity, reproducibility and yield of amplification.a