|Home | About | Journals | Submit | Contact Us | Français|
The quality of data from microarray analysis is highly dependent on the quality of RNA. Because of the lability of RNA, steps involved in tissue sampling, RNA purification, and RNA storage are known to potentially lead to the degradation of RNAs; therefore, assessment of RNA quality and integrity is essential. Existing methods for estimating the quality of RNA hybridized to a GeneChip either suffer from subjectivity or are inefficient in performance. To overcome these drawbacks, we propose a linear regression method for assessing RNA quality for a hybridized Genechip. In particular, our approach used the probe intensities from the .cel files that the Affymetrix software associates with each microarray. The effectiveness and the improvements of the proposed method over the existing methods are illustrated by the application of the method to the previously published 19 human Affymetrix microarray data sets for which external verification of RNA quality is available.
Hybridization-based DNA microarray technologies have evolved rapidly to become a key high-throughput technology for the simultaneous measurement of the relative expression levels of thousands of individual genes.1 Over the past several years, microarray technology has been used to explore transcriptional profiles and to obtain molecular expression signatures of the state of activity of diseased cells and patient samples.2 In the field of cancer, microarray analyses have provided information on pathology, progression or resistance to treatment.3,4 However, the reliability of microarray technology to detect transcriptional differences representative of original samples is affected by several factors such as array platform, RNA extraction methods, probe labeling, hybridization conditions, and image analysis.5 In particular, the quality of data from microarray analysis is highly dependent on the quality of the RNA extracted from the tissues, which in turn is dependent on the quality of the tissue samples. Because of the lability of RNA, various steps involved in tissue sampling, RNA purification, and RNA storage are known to potentially lead to degradation of the mRNAs by cleavage of RNases. The standard Affymetrix protocol uses as starting material 5–40 μg of total RNA, which is first reverse transcribed using a T7-Oligo (dT) promoter primer into cDNA molecules from which biotinylated cRNA is synthesized for hybridization onto GeneChip probe arrays. The T7-Oligo (dT) promoter primer contains an oligo (dT) 24 sequence at its 3′ end for specific binding to the poly(A) tail of mRNA and the core sequence of the T7 RNA polymerase promoter at its 5′ end. Consequently, the cDNA yield from the sequences near the 5′ end of partially degraded mRNA is significantly less than that from the sequences near the poly(A) tail.6,7 Hence, tolerance of this platform to degraded RNA samples may lie in the nature of the Affymetrix GeneChip design, which is 3′-biased. To overcome this obstacle, oligonucleotide probes are usually designed to be within the most 3′ 600 bp of a transcript. Additional probe sets in the 5′ region of the transcript have also been selected for certain housekeeping genes, including glyceraldehyde 3-phosphate dehydrogenase (GAPDH ), interferon-stimulated gene factor (ISGF), and β-actin. Signal intensity ratio of the 3′ probe set over the 5′ probe set is often referred to as the 3′/5′ ratio. This ratio gives an indication of the integrity of the starting RNA, efficiency of first strand cDNA synthesis, and/or in vitro transcription of cRNA.8 It has been suggested in the literature that the 3′/5′ ratio of a given microarray should be examined for assessing the quality of the mRNA hybridized to the array.9,10
A method, referred to as RNA digestion plots, for assessing mRNA integrity is available in the Bioconductor affy package in the R programming environment.11 RNA degradation plots, which consider all probes across an array, show expression as a function of the 5′-3′ position of probes. The plot shows the average intensity of each probe plotted against its interrogation position, and the slope of its trend indicates potential RNA degradation of cRNA hybridized to the array. However, the RNA digestion plots can be ineffective for assessing RNA integrity, as all mRNA samples could exhibit the same linear trend regardless of the extent of RNA degradation present.12 Therefore, an objective RNA quality assessment method based on a statistical model is desirable. The GeneChip Operating Software (GCOS) estimates the 3′/5′ ratio using the perfect match (PM) and mismatch (MM) probe set expression measure.13,14 Affymetrix suggests that ratios greater than 3 may indicate degraded RNA or insufficient transcription.13 Several methods for summarizing probe-level expression data into probe set summaries have been reported in the literature.15–18 The 3′/5′ ratios calculated after summarizing probe-level data using the MAS 5.0 (Affymetrix), robust multiarray average (RMA),15 GC-RMA,19 and the PM-only MBEI algorithms18,20 may yield different conclusions regarding sample quality depending upon whether the GAPDH or β-actin 3′/5′ ratios are compared with the recommended threshold of 3.12 These irresolvable discrepancies between the 3′/5′ ratios for different transcripts within the same GeneChip present difficulty in drawing a final conclusion regarding sample quality. Furthermore, all transcripts but a few are represented by only one probe set on a GeneChip. Due to the lack of replicates of 3′ and 5′ probe sets on a GeneChip, it is impossible to provide an estimate of the variance for the 3′/5′ ratio when probe set expression summaries are used in the estimation of the 3′/5′ ratio. Further, using an arbitrarily selected 3′/5′ ratio threshold for determining RNA quality is analogous to the use of fold-change thresholds for identifying differentially expressed genes, and this has been shown to be inferior to statistical methods which take variability into account.21, 22 Establishing a viability threshold for determining RNA quality is therefore problematic, and this becomes especially challenging when different probe set expression summary methods are available for use. To overcome the drawbacks in the above methods, Archer et al.12 proposed a mixed-effect model to estimate the 3′/5 ′ ratio and obtain its confidence interval for an individual GeneChip using the interior pixel-level intensity. The model had a complex multilevel nested covariance structure including the random effects for probe set, probe, and pixel. There are several drawbacks to this approach that limit its use in practice: (1) The pixel-level intensity used in this approach is not a routine microarray data type associated with each microarray and requires special data processing to obtain; (2) The effect of the interrogation position of probes was modeled as 40 separate cluster effects, thereby ignoring the linear dependence of probe expression level on the 5′-3′ position of probes.
To avoid such drawbacks, we propose a linear regression approach to model mRNA degradation. We perform analysis on probe-level mRNA expression data from the .cel files arising from a typical microarray experiment. For each control gene, we also model the dependence of the log2 transformed probe expression level on probe interrogation position as a linear function of log2 transformed probe position. Our approach to assessing the quality of hybridized RNA in Affymetrix GeneChip experiments differs from previous investigations in a fundamental way.
Here we use the same data sets as those used in Archer’s mixed model paper.12 For the proposed method of assessing RNA degradation, the 3′ and 5′ end probe sets for three human housekeeping genes were retained for analysis (Table 1) for an individual GeneChip; i.e., there are a total of six probe sets which are from three housekeeping genes and located at the two ends of each. For each of the six probe sets, there are 20 different probes whose proximities to the 5′ end (or 3′ end) of the gene are indicated by their unique interrogation position [the position of the 13th (“middle”) nucleotide of the probe sequence as it aligns on the consensus/exemplar sequence]. We used the probe intensities from the .cel files that the Affymetrix software associates with each microarray. No processing of the data was performed on the values obtained from these files (other than taking the logarithm). We conducted the analysis using just the PM probes. Probe sequence information (i.e., probe sequence composition and location in the transcript) was obtained from the Affymetrix Web site (www.affymetrix.com).
Previously published data23 describe five pairs of HG-U133A GeneChips that were hybridized as follows. First, total RNA was isolated from multiple 10-μm frozen sections from five snap-frozen ovarian tumor samples using Trizol reagent. For each tumor sample, similar-sized aliquots of total RNA were processed with and without a subsequent cleanup process using RNeasy reagents. The RNeasy cleanup should lead to good-quality RNA (GeneChips Ovarian1 G, Ovarian2 G, Ovarian3 G, Ovarian4 G, and Ovarian5 G), whereas omission of the cleanup step should lead to poor-quality RNA (GeneChips Ovarian1 I, Ovarian2 I, Ovarian3 I, Ovarian4 I, and Ovarian5 I). The samples were obtained and processed according to a Virginia Commonwealth University Institutional Review Board approved protocol. Details confirming the RNA quality, such as absorbency ratios, 28S/18S ratios, and length of cDNA and cRNA synthesis products, are reported elsewhere.23
Previously published data from the University of Tübingen describe nine Affymetrix GeneChips (4 HG-U133A; 5 HG-Focus) that were collected to assess the impact of RNA degradation on microarray gene expression in renal cell carcinoma samples.24 The goal of this study was to determine the effects of a two-round IVT protocol on 20 ng of partially degraded RNA on gene expression values in comparison to expression values obtained when high-quality RNA is hybridized. The RNA extraction and chemical degradation procedures are described elsewhere.24 The nine samples from this study have been labeled as follows: (1) The first letter indicates cell type, with “N” indicating normal cells and “T” indicating tumor cells; (2) the second letter reflects the level of degradation, ranging from “A,” indicating freshly isolated RNA (i.e., no degradation) to “D,” indicating the highest amount of degradation; (3) the third letter reflects the GeneChip used, with “U” indicating HG-U133A and “F” indicating HG-Focus. The extent of degradation was confirmed for all samples by electropherograms.23
The method described applies to a single array. In this work we consider gene, probe set location, i.e., 3′ or 5 ′ end of a transcript, and probe interrogation position as the three covariates that influence probe intensity. In addition, to further account for the impact of the sequence on the measured intensity, the log2 of the percent GC content [log2(GC%)] of a probe was included as a covariate, which has been shown to be a more effective method for adjusting for the probe-affinity effect than that proposed by Wu et al.19 according to Archer et al.12 Instead of log2(GC%), we also considered log2 percent C content [log2(C%)] of a probe as a covariate for adjusting for the probe-affinity effect. The three genes are assumed independent. Let yijk represent the log2 signal intensity for the ith housekeeping gene (i =1, 2, 3), the jth end of the transcript with j =1 and j =2 indicating the 3 ′ end and the 5 ′ end, respectively, and the kth probe (k =1, 2, …, 20). Let x1ijk=1 if j =1, and 0 otherwise. Let x2ijk represent the log2 (probe interrogation position), x3ijk represent log2(GC%) (or log2(C%)) for the kth probe at the jth end of gene i, respectively, and let εijk represent the i.i.d. random error for the kth probe at the jth end of gene i with εijk ~ N (0, σ2). Here we propose the following statistical model to estimate the 3′/5′ ratio for an individual GeneChip:
Note here, β1 represents the fixed effect associated with the 3 ′ end of the transcript, θi represents the fixed effect associated with ith gene with , β2i is the linear regression coefficient associated with the log2(probe interrogation position) for gene i, and β3 is the linear regression coefficient associated with the log2(GC%) [or log2(C%)]. The least square method will be used to estimate the model parameters and their associated variance. Since the log2 transformation of the signal intensities was applied to meet model assumptions (Gaussian), the parameter 2β1, 3′/5′ ratio and its 95% confidence interval were reported using the original scale by retransforming the estimated value of the parameter to the original scale via antilogs. This transformation was used to compare our results to Archer’s,12 but in practice can simply examine the 95% confidence interval for β1 to assess RNA quality.
All analysis was carried out in R and SAS. The R code is available at www.biostat.umn.edu/~cavanr. Tables 2 and and33 report the point estimates and the 95% confidence intervals of the 3′/5′ ratios for the 10 ovarian GenChips and the 9 renal cell carcinoma GeneChips, respectively. For the ovarian samples, the 95% confidence intervals for all five GeneChips hybridized with RNA processed without the RNeasy cleanup step did not include 1, suggesting poor RNA quality, as appropriate. For the ovarian samples for which the RNeasy cleanup procedure was used, the 95% confidence intervals appropriately included 1 for all five samples, suggesting adequate sample quality, as expected. For the normal and tumor renal cell samples processed when no degradation was present, the 95% confidence intervals included 1 for all five samples, as appropriate (Table 3). For the degraded samples, the 95% confidence intervals exclude 1 for all but normal renal cell with high amount of degradation of HG-U133A Genechip (NDU) when log2GC% was used for adjusting probe affinity, suggesting poor RNA quality, again as expected (Table 3). Also note that the point estimates of the 3′/5 ′ ratios for all nine poor-quality GeneChips are much higher than the point estimates of their corresponding GeneChips with good quality (Tables 2 and and3),3), as expected.
Notice that the 95% confidence intervals for the 3′/5′ ratios from the current model are remarkably smaller than those from Archer’s mixed model for all 19 GeneChips but one (Ovarian inhibited 2), thereby suggesting that the current model generally provides more precise estimates for the 3′/5′ ratios than Archer’s mixed model (Figure 1). Furthermore, the point estimates of the 3′/5′ ratios from the current model are distinctly different depending on whether degradation is known to be truly present or absent in the sample. For example, the 3′/5′ ratios are smaller than 2 for all five Ovarian samples with the RNeasy cleanup step and all five renal cell samples with good mRNA quality, while the 3′/5′ ratios are greater than 2 for all five Ovarian samples without the RNeasy cleanup step and for all four degraded renal cell samples. However, in Archer’s model, no such clear-cut separation was observed among the 3′/5′ ratios with respect to sample quality. For example, Archer’s results showed that the 3′/5′ ratio for a decayed biological sample with poor mRNA quality, GeneChip NBF, was 5.41, and it was 4.51 for the same biological sample with clean up (GeneChip NAF). This observation leads to the conclusion that the current model provides more accurate estimates than Archer’s mixed model.
To determine if measurements from the same probe sets are correlated and how they are correlated, we considered six different types of covariance structure: (1) independent; (2) compound symmetry; (3) first order autoregressive; (4) Gaussian spatial, i.e., the correlation between two observations at a distance of r probe position apart on the log2 scale is exp (−(r/ϕ)2), where ϕ is a parameter that is estimated via restricted maximum likelihood (REML); (5) spherical spatial, i.e., the correlation between two observations a distance r apart is 1 − 1.5(r/ϕ)+0.5(r/ϕ)3; and (6) exponential spatial, i.e., the correlation between two observations a distance r apart, is exp(−r/ϕ). We used the Bayesian information criterion (BIC) values to select the best covariance structure. Among 19 GeneChips, 17 had the independence model give the best BIC value.
It is known that the degraded mRNAs hybridized to a GeneChip will yield lower expression levels than high-quality mRNAs; however, it has not been shown in the literature to what extent the bias and variability are related to mRNA quality and what their impact is on downstream microarray data analysis. To determine this impact, we investigated the effect of mRNA-quality-related bias on the detection of differential expression, and sought to determine if commonly used normalization procedures could correct this bias. We used the probe set summaries without normalization, which were generated using the RMA method implemented in the Bioconductor affy package in the R programming environment.11 We conducted paired t-tests on those unnormalized probe set summaries, i.e., gene expression data for 5 pairs of ovarian tumor samples23 for each of the 22,283 probe sets on HG133UA GeneChip. For 15,502 out of 22,283 probe sets, the gene expression level was significantly higher in the ovarian samples processed using the RNeasy cleanup step (good) than in the ovarian samples processed without the RNeasy cleanup step (inhibited) at a significance level of 0.05, while only 5 out of 22,283 probe sets had significantly lower expression levels in good ovarian samples than in inhibited ovarian samples. This indicated that false discovery rates (FDR) might exceed 50% even in a sample size of 10.
To determine if the mRNA-quality-related bias will be corrected by commonly used normalization procedures, we used the same five pairs of ovarian tumor samples, but we normalized the gene expression data by (1) the RMA method implemented Bioconductor affy package in the R programming environment,11 and (2) centering the expression data by subtracting the overall mean expression of an array. For RMA-normalized expression data, we found that for 1576 probe sets, the expression level in good-quality ovarian samples was significantly higher than in inhibited ovarian samples at a significance level of 0.05. In addition, 3152 probe sets had significantly lower expression levels in good ovarian samples than in inhibited ovarian samples. Almost identical results were observed in the centered-normalized expression data. This observation suggests that the normalization procedures significantly reduced the negative bias due to poor mRNA quality; however, they did not remove the bias completely. In fact, the normalization procedures induced additional positive bias, which was not present in the unnormalized data. With a p-value cutoff of 0.05, FDR remained as high as over 20% for the normalized data. Furthermore, at least half of the 4728 (3152 + 1576) probe sets had expression levels greater than the overall mean expression level of an array for all 10 GeneChips. Hence, the large FDR in the normalized data will not be removed by restricting the data set to those probe sets with higher expression levels.
To investigate the variability of expression levels among samples with the respect to mRNA quality, we also did an F-test using the same five pairs of ovarian tumor GeneChips23 for each of 22,283 probe sets on the GeneChips. In the unnormalized data, no probe set had significantly larger variance in the inhibited ovarian samples than in the good-quality ovarian samples, while 831 probe sets had significantly larger variance in the good-quality ovarian samples than in the inhibited ovarian samples at a significance level of 0.05. Similar results were obtained using the normalized data sets. This indicated that there is no evidence that poor-quality mRNAs lead to larger variability of gene expression levels among samples than good-quality mRNAs. So inverse weighting by the variance of the probe using weighted least square will not allow one to easily analyze sets of arrays that differ in terms of RNA quality.
The integrity of mRNA transcripts is a principle component of the quality of transcriptional profiling data. Given the sensitivity of RNA to degradation, assessment of RNA quality and integrity is an indispensable prerequisite for all downstream applications in gene expression analysis. The ratio of the 3′ to 5′ message abundance provides a measure of the quality of the RNA hybridized to a given array. Due to their inability to provide uncertainty measures in the absence of replicate probe sets of control genes on Affymetrix GeneChips, probe set expression summaries used to assess sample quality by comparing the 3′/5′ ratios to a predetermined threshold often lead to different conclusions regarding sample quality when several control genes are examined. The RNA digestion plots can be subjective and inconclusive in their interpretation of RNA quality. The mixed model proposed by Archer et al.12 overcame those drawbacks and provided a framework in which the 3′/5′ ratio can be estimated and a confidence interval and hypothesis test can be conducted. However, the pixel-level intensity used in this approach is not a routine microarray data type and requires special data processing in order to be obtained.
In this article, a linear regression model for assessing RNA quality for a hybridized array, which overcomes the drawbacks in the existing methods, is proposed. The proposed model is less complex and easier to implement than the existing methods. In particular, we used the probe intensities from the .cel files that the Affymetrix software associates with each microarray. The effectiveness of the proposed method over the existing methods is illustrated by the application of the method to 19 human Affymetrix microarray data sets for which external verification of RNA quality is available. It has been shown in the literature that an unexpected aspect of the asymmetry of G versus C affinities exists, which goes against the zeroth order energetic consideration that G-C and C-G bonds would contribute equally to the binding.25 Our results also indicated that percent C content alone was as equally effective as percent GC content for adjusting the probe affinity, while percent G content alone was not as sufficient, thereby indicating that G and C did not contribute equally in probe affinity. The degraded RNAs hybridized to a GeneChip yield lower expression levels than high-quality RNAs in general; however, the variability of gene expression levels among samples does not appear to be linked to RNA quality. The RNA-quality-related bias cannot be completely removed by the commonly used normalization procedures, thereby having a detrimental impact on downstream microarray analysis. Assessment of array quality is an essential step in the analysis of data from microarray experiments. Once detected, less reliable arrays are typically excluded from further analysis to avoid misleading results. Here we present a better statistical tool than the existing methods for RNA quality assessment.
The authors would like to thank Dr. Kellie Archer, Virginia Commonwealth University, for kindly providing all .cel files from her and her coauthors’ previously published study.