We have described microarray quality in general and provided the mathematical formalism that permits us to quantify the quality of a microarray hybridization. Using this formalism, we have demonstrated how to assess quality based on the three most common microarray applications and used these applications to describe the strengths and weaknesses of the most common quality metrics used to assess Affymetrix GeneChip microarrays. Specifically, we found that the methods proposed by Bolstad et al. are often able to detect poor quality arrays while the methods proposed by Affymetrix are not. However, the methods of Bolstad et al. are inherently multi-array, so we proposed a single-array modification of the NUSE metric, called the GNUSE. We showed that the GNUSE metric differs substantially from the NUSE metric only when the experiment is composed primarily of poor quality arrays.
We then used the GNUSE quality metric to assess the quality of publicly available microarray data. We found that roughly 10% of publicly available Affymetrix HGU133a and HGU133plus2 arrays are of poor quality. We also found that these poor quality arrays are not evenly distributed among labs or studies: some labs are more likely to provide poor quality arrays than others, and some studies are composed of mostly poor quality arrays.
While the most likely cause of high GNUSE values is poor array quality, it is conceivable that a study using a non-standard hybridization protocol or investigating a particularly unusual tissue type might appear to have poor quality. An example of the latter situation is the hybridization of non-human RNA to human microarrays. A potential example of the former situation may be the data used to create the BioGPS webtools [11]. The 158 arrays used in the creation of these webtools (GSE1133) showed consistently high GNUSE values: 63.9% of the arrays had a median GNUSE above 1.25 and 96.8% of the arrays had a median GNUSE greater than 1. It is difficult to determine whether these arrays are of nearly uniformly poor quality or simply differ from typical arrays in some manner. Nevertheless, combining these arrays with arrays from any other experiment would certainly not be advisable.
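As an illustration of how summary percentages of this kind can be computed, the following minimal Python sketch takes probeset-level GNUSE values for a set of arrays, reduces each array to its median GNUSE, and reports the fraction of arrays exceeding a threshold. It is not the frma implementation: the function names, the dictionary layout, and the toy values are assumptions made for illustration only, and in practice the GNUSE values would be obtained from the frma package.

```python
from statistics import median

# Threshold on the median GNUSE used in the text to flag poor quality arrays.
POOR_QUALITY_THRESHOLD = 1.25


def median_gnuse_per_array(gnuse_by_array):
    """Map each array name to the median of its probeset-level GNUSE values."""
    return {name: median(values) for name, values in gnuse_by_array.items()}


def fraction_above(medians, threshold):
    """Fraction of arrays whose median GNUSE is strictly greater than threshold."""
    if not medians:
        raise ValueError("no arrays supplied")
    flagged = sum(1 for m in medians.values() if m > threshold)
    return flagged / len(medians)


# Made-up GNUSE values for three hypothetical arrays (not real data).
toy_gnuse = {
    "array_A": [0.95, 1.00, 1.02, 1.05, 0.98],
    "array_B": [1.10, 1.15, 1.20, 1.08, 1.12],
    "array_C": [1.30, 1.45, 1.28, 1.50, 1.35],
}

toy_medians = median_gnuse_per_array(toy_gnuse)
share_poor = fraction_above(toy_medians, POOR_QUALITY_THRESHOLD)
share_above_one = fraction_above(toy_medians, 1.0)
```

In this toy example only the third array exceeds the 1.25 threshold, while two of the three arrays have a median GNUSE greater than 1.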
The greatest strength of the GNUSE metric, the ability to assess the quality of a single array relative to overall microarray quality, is also its primary limitation: it requires a sizable number of arrays from different labs and different tissues to assess overall microarray quality. However, with the rapid increase in microarray experiments, this limitation is quickly diminishing, and the advantages of the GNUSE metric are growing. While there have been previous attempts at providing array quality metrics coupled with publicly available data sets [24, 25] and at assessing the effect of quality on differential expression [26], these attempts used metrics that could only assess the quality of an array relative to other arrays in the batch or the quality of a batch of arrays relative to other batches of arrays. The incorporation of the GNUSE metric in such efforts would allow one to truly assess the quality of publicly available data.
The results presented here are based on the two most widely used Affymetrix microarray platforms. As more data becomes available on newer platforms, we look forward to implementing fRMA and the GNUSE on those platforms. We currently have a preliminary implementation of fRMA on the Human Exon ST 1.0 array. Based on 874 publicly available arrays, roughly 4.5% of arrays have a median GNUSE greater than the quality threshold of 1.25 (Additional file 1, Figure S7). This may indicate that newer arrays are of better quality or that the quality threshold needs to be reassessed when measuring exon-level rather than gene-level expression.
While the results presented here focus primarily on Affymetrix GeneChip microarrays, many of the ideas can be generalized to other platforms and manufacturers. Specifically, we recommend defining quality in a quantitative manner that focuses on the bottom-line results from common genomic applications.
Furthermore, assessing the quality of one sample in the context of the wealth of public data is a powerful technique for developing quality metrics in high-throughput studies. We believe that the ideas and formalism described here can form the basis for future quality assessments of other microarray platforms and even other genomic technologies.
The GNUSE algorithm is available as part of the frma R package on Bioconductor [27].