Over the past several years, microarray technology has evolved into a critical component of any discovery-based program. Since 1999, the Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) has conducted biennial surveys designed to generate a profile of microarray service laboratories and, more importantly, an overview of technology development and implementation. Survey questions addressed instrumentation, protocols, staffing, funding, and work flow in a microarray facility. Presented herein are the results of the MARG 2005 survey; where possible, trends in the field are discussed and compared to data collected from previous surveys.
DNA microarray technology has emerged as a powerful and prominent tool for the molecular biologist to assess changes in gene expression at the level of the whole transcriptome.1–3 More recently, high-density oligonucleotide arrays have been used in large-scale analyses of genomic DNA to detect single-nucleotide polymorphisms, insertions/deletions, and chromosomal rearrangement used in linkage, association, and loss of heterozygosity studies.4–9 Investigators studying proteins, tissue, and small macromolecules have also adopted the use of the microarray-based technologies to explore biological perturbations on an industrial scale.10–14 As a result, a large and rapidly growing industry based on the production, use, and analysis of microarrays has developed into a life-sciences and clinical market.15–20
One-fluorophore- and two-fluorophore-based platforms have been the most widely used detection approaches in the microarray technology marketplace. Studies involving the use of single-color detection, based on the principle of one labeled sample per microarray, have predominantly been performed using the GeneChip technology developed by Affymetrix, Inc.1,21 However, there are other commercial and custom microarray providers that have also adopted the one-fluorophore approach. Slide-based microarray technologies, developed in the laboratories of Patrick O. Brown and Ronald W. Davis at Stanford University, were the basis of the two-fluorophore-based platform used today.2,22,23 Regardless of platform, however, each microarray experiment encompasses five steps: (1) extraction and preparation of nucleic acid, (2) fabrication of the array, (3) hybridization of the labeled sample to the array, (4) signal detection and visualization, and (5) data processing and analysis.3,24
The Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) has designed, conducted, and analyzed four surveys from 1999 to 2005. The goal of these surveys was to collect comprehensive information about microarray facilities, including how facilities have addressed the steps required to successfully complete microarray experiments with different types of microarray technology platforms.
Surveys were designed to collect information concerning instrumentation, protocols, staffing, funding, and throughput in a microarray facility. The 1999, 2001, and 2003 surveys consisted of three parts: a General Section, an Affymetrix Array Section, and a Custom (2-color) Array Section. The 2005 survey was expanded to contain six sections: General, Affymetrix Array, Custom (2-color) Array, Protein Array, Bioinformatics, and Future Directions.
Surveys were conducted on line. The 1999 and 2001 surveys were hosted by members of the MARG, the 2003 survey was hosted by the Web Design Group (Ithaca, NY), and the 2005 survey was hosted on SurveyMonkey.com. Surveys were announced on the ABRF and microarray-related electronic discussion groups and listservers. For all surveys, participation was anonymous, as the responses were collected by a third party who removed any information identifying the participant prior to making the data available to the MARG for analysis. The third party also screened for and removed multiple submissions provided by the same individual, and malicious responses. Participation was open to anyone, regardless of whether they were affiliated with the ABRF. Participants had the option of completing only sections of the survey that related to their microarray operation. Individual laboratories as well as laboratories that offer microarray technologies as a shared resource were invited to participate.
The MARG has conducted four biennial surveys since 1999, designed to acquire information on the operations of a microarray resource facility. Anyone was eligible to participate in the surveys. However, the majority of respondents were microarray facility personnel, because the surveys were announced in resource facility–related forums. The responses to each question were analyzed separately, because respondent information was removed before making the data available for analysis. Since an individual identifier did not accompany each response, it was not possible to combine the responses to more than one question or bin the data to be able to examine data provided by a subset of respondents. This limitation was a drawback that will be addressed in future MARG surveys.
The use of a self-selected survey design, rather than a random-sample design, was driven by several factors. First, despite the considerable increase in the number of publications reporting microarray data in PubMed (from fewer than 100 in 1999 to more than 6000 in 2004), this technology is still considered highly specialized and is available in only a small number of academic centers. Self-selection of the sample group was the most effective way to acquire responses from individuals specifically involved with a microarray facility. Second, because one of the goals of the current survey was to compare data with previous surveys, a similar approach to data collection was used. Finally, because the majority of the constituents of the ABRF are associated with a resource facility, the focus of the surveys was on the microarray facility and its personnel. Consequently, the results presented herein reflect the perspective of the microarray facility and not necessarily that of the user. Additional MARG survey information can be found at http://www.abrf.org/index.cfm/group.show/Microarray.30.htm.
The ABRF MARG has hosted surveys in the years 1999, 2001, 2003, and 2005 that were designed to collect information as to the “current” profile of a microarray facility. Questions regarding instrumentation, protocols, staffing, funding, and throughput in a microarray facility were asked. As illustrated in Figures 1A and 1B, the majority (about 70%) of the respondents were from the United States or Canada and at least 74% of the respondents to all the surveys work in an academic or government setting. Thus, the survey results presented herein largely reflect the perspective provided by respondents from the United States or Canada working in an academic or government setting. While the number of respondents increased in each successive survey, the demographics of the respondents with respect to institution type and geographic location were similar between surveys. As a result, comparisons have been made between the results of the 2005 survey and the previous MARG surveys.
In the 2001 MARG survey, investigators were asked their reasons for choosing a particular microarray platform. As shown in Figure 2, investigators chose to use the Affymetrix platform because of the specific technology: they felt that the platform was more convenient, samples could be processed more quickly, and systems trials had shown the platform’s utility as an effective tool in assessing transcript levels. On the other hand, investigators chose the custom array platform because they felt the system was less costly to operate and was more flexible in its ability to permit changed microarray content. Clearly, the reasons given for choosing a microarray platform were quite different and yet complementary depending on the specific needs of the investigator. Based on the relative number of respondents of each section by year (Figure 3), it appears that the number of facilities offering Affymetrix-based services is approaching the number of sites that offer slide-based custom array services.
As reported in the 2005 survey, the average number of personnel per facility was 3.8. This represents a nearly 1 FTE drop from the 4.7 reported in the 2003 survey and is similar to the 3.5 individuals reported in the 1999 survey. It should be noted that some respondents in the 2005 survey may not have included bioinformatics staff in their responses to this question, because another question in the survey specifically addressed the number of bioinformatics staff. The 2005 survey indicates that staff members have an average of 3.5 years of experience while the facility director/manager has an average of 4.6 years of experience.
Two-thirds of the respondents of the 2005 survey indicated they routinely use the Agilent Model 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA) to assess sample quality. In addition, 62% of the respondents that use the Bioanalyzer use it as their primary benchmark to determine whether a sample will be included in a microarray experiment.
Quantitative (real-time) polymerase chain reaction (qPCR) has become the clear method of choice to validate microarray results. In the 2003 and 2005 surveys, 80% and 88% of the respondents, respectively, validate with qPCR. Only 63% of the respondents of the 2001 survey validated microarray data using qPCR. In contrast, 76% of the 2001 survey respondents (compared with only 34% of the 2005 respondents) used Northern blot analysis to validate data.
There were 74 respondents to the Affymetrix array section of the latest survey. Sixty-six percent of the survey respondents indicated their Affymetrix microarray facilities have been operational for at least 3 years. The majority of respondents (80%) described their facility as a full-service facility providing cDNA synthesis through array scanning to their clients. As has been the case over the history of the four MARG microarray surveys, the Affymetrix GeneChip human and murine expression arrays are the predominately used arrays.
There appears to be an effort by the Affymetrix GeneChip microarrays user community to standardize operations between facilities. Over half the respondents indicated they were currently participating in such efforts, while 75% indicated an interest in national efforts to develop standardized protocols.
Regarding the current quality control measures being employed, 82% report using Agilent’s Bioanalyzer to evaluate both total RNA samples and cRNA samples. A majority of the respondents (89%) have found the resulting data to be a satisfactory predictor of the quality of total RNA and/or cRNA. Regarding the use of test arrays to evaluate samples prior to final hybridization to the appropriate expression array, a majority of respondents (66%) rarely or never process samples over test arrays. Collectively, this suggests that the use of test arrays has been eclipsed as facilities adopt the Agilent Bioanalyzer for RNA quality evaluation.
With respect to respondents’ satisfaction with the Affymetrix system (Figure 4), a majority of respondents to the 2005 survey, 94% and 90% respectively, report being at least moderately satisfied with the quality of both the arrays and the data derived from the arrays. Respondent satisfaction with array and data quality, along with fluidic station performance, improved the most since the 2001 survey.
In contrast, 53% of respondents reported being at least mildly dissatisfied with the current cost of arrays. In addition, 36% of respondents reported being at least mildly dissatisfied with the support provided regarding new software upgrades and with the software itself. A similar finding of user dissatisfaction with array cost and software was also observed in the 2001 survey (Figure 4). In general, however, most respondents to the 2005 survey (89%) report being at least moderately satisfied with the Affymetrix system and the support provided by the company.
A majority of all ABRF-MARG survey respondents (97 out of 177, or 55%) currently employ some form of spotted-array technology, while 45% of respondents employ the Affymetrix platform. Many facilities have both platforms, though their exact number is not readily derived from our survey data. About 80% of the spotted-array respondents own an arrayer to print their own arrays, and 90% own at least one scanner. When the spotted-array data are compared to the last survey of 2 years ago, we see an increase in the total number of respondents (from 71 to 97) but a decrease in the percentage of the total number of respondents now printing slides (from 89% in 2003 to 80% in 2005). This decrease probably reflects the increasing role of commercially spotted arrays in facility microarray data production.
About 72% of spotted-array users were able to generate usable data within the first year. Seventeen percent have been in operation less than 2 years, while 63% have been in operation 3 years or longer. In this survey, we asked specific questions regarding the quality control of starting material. Seventy-one percent of users have access to the Agilent Bioanalyzer to check RNA quality, while 58% have access to the NanoDrop Spectrophotometer (Nanodrop Technologies, Wilmington, DE) for quantification. Most facilities are printing both PCR products/cDNA and oligonucleotides, while the latter are becoming more popular over time, with some 82% of current respondents having used oligonucleotide chips.
Funding support for spotted-array facilities derives from various sources, with the average breakdown reported to be 39% from the institution/company, 31% from grants, 25% from user fees, and 5% from other sources. Spotted-array facilities show a wide diversity in business models, supporting from 1 up to 310 user groups, with a mean of about 15 and a median of 6.
Most spotted-array facilities (73%) print, hybridize, and scan fewer than 100 arrays/month (Figure 5), with oligonucleotide arrays being used at somewhat higher frequency. In comparison, 90% of Affymetrix respondents used fewer than 100 arrays/month. The spotted arrays used most often have from 10,000 to 25,000 features. Higher-density arrays are predominately printed with oligonucleotides, indicating an easier scaling to larger arrays with this technology, since the oligonucleotide printing products can be readily purchased.
Regarding the preparation of samples, 99% of spotted-array facilities are using fluorescent labeling. The most popular slide coatings are amino silane (37%), poly l-lysine (29%), epoxy (14%), and aldehyde (7%). Printing buffers are made in house 84% of the time for cDNA and 74% for oligonucleotides. Of these, one-third use 3X SSC, another third use 50% DMSO, with phosphate, Sarcosyl, and others sharing the remainder. Split pins dominate printing technology, with a 73% share. Twice as many respondents used indirect labeling as direct labeling.
As shown in Figure 6, a great variety of species are being studied with spotted microarrays, with 67% of respondents printing human and 56% printing mouse arrays. Many cross-species experiments are also performed. For data analysis, most users employ background subtraction. Fifty-eight percent of users employ a Global LOWESS, while 47% use a Local LOWESS normalization strategy. Eighty-two percent of the respondents normalize between arrays by using an average of array global intensity, while 25% of respondents normalize by using the intensity of exogenous synthetic controls. Some 47% are confident of detecting a fold change of 1.5 or lower in transcript levels using their system.
Most users appear to be at least moderately satisfied with the performance of their arrayers (77%) and scanners (82%). This represents a slight decrease in performance satisfaction from the 2003 survey. Users continue, however, to be less satisfied with the support they receive for their instruments than they are with performance (53% for arrayer support and 64% for scanner support). A consolidation in arrayer manufacturers has taken place since the 2003 survey, with arrayers from Genomic Solutions (Genomic Solutions, Inc., Ann Arbor, MI) and its subsidiaries now reported by 55% of respondents. Axon scanner (Axon Instruments, Inc., Foster City, CA) ownership was reported by 53% of respondents.
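The between-array normalization by average global intensity reported by most respondents, together with a fold-change threshold of 1.5, can be illustrated with a minimal sketch. The intensity values, function names, and the choice of the grand mean as the scaling target are assumptions made for illustration only; the survey did not collect respondents' actual procedures, and the LOWESS strategies mentioned above are not implemented here.

```python
import math
from statistics import mean


def global_scale(arrays, target=None):
    """Scale each array so that its mean intensity equals a common target.

    `arrays` is a list of lists of positive spot intensities, one list per
    array. If no target is given, the mean of the per-array means is used
    (an assumption; other targets are equally plausible).
    """
    means = [mean(a) for a in arrays]
    if target is None:
        target = mean(means)
    return [[x * target / m for x in a] for a, m in zip(arrays, means)]


def log2_ratios(sample, reference):
    """Per-spot log2(sample / reference) for two equally sized arrays."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]


def exceeds_fold_change(log_ratios, threshold=1.5):
    """Flag spots whose absolute fold change is at least `threshold`."""
    cutoff = math.log2(threshold)
    return [abs(m) >= cutoff for m in log_ratios]


# Hypothetical intensities for the same four spots on two arrays.
array_a = [100.0, 200.0, 400.0, 800.0]
array_b = [210.0, 390.0, 1600.0, 1500.0]

scaled_a, scaled_b = global_scale([array_a, array_b])
ratios = log2_ratios(scaled_b, scaled_a)
flags = exceeds_fold_change(ratios)
```

Because a single scaling factor is applied to each array, this approach assumes that most transcripts do not change between samples; intensity-dependent bias is what LOWESS-style methods are meant to address.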
While most users employ 2-color hybridizations, several report 4 or more colors used on a regular basis and one reported using a maximum of 100 colors per experiment. Using additional colors per chip helps reduce costs by reducing the total number of chips required.
Microarray technology employs hybridization-based procedures to monitor the expression of thousands of genes in parallel. Two aspects of this technology have emerged as challenging areas for microarray users: the performance of the various platforms (cDNA and oligonucleotide) and the analysis of the resulting data. Each of these areas has matured over the past 5 years, but at different rates. The development of bioinformatics tools to analyze high-dimensionality data has lagged behind the generation of reliable and reproducible nucleotide hybridizations. Analytic procedures to extract meaning from microarray datasets continue to evolve along a continuum: from first-generation tools (statistical and fold-change filtering), to second-generation tools (model-based), to third-generation tools (training-test set algorithms). As a result, microarray users struggle with selecting the appropriate analysis tool for the data set under study. Moreover, there is no standard or recommended approach.
Today, microarray literature reflects the level of interest in bioinformatics tools, as evidenced by the numerous software approaches available. Solutions to bioinformatics needs have been provided by commercial organizations and academic communities. Simply performing a search using the keyword phrase “microarray software” in PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi) generates 569 records. These software solutions range from statistical to biological pathway software as well as a number of visualization tools. Such a diversity of tools can be overwhelming for the user.
As noted in previous surveys (2001 and 2003; http://www.abrf.org/MARG), the bioinformatics component of studies that use microarrays continues to challenge researchers. Fifty-eight percent of 180 respondents in this year’s survey noted that bioinformatics represents the most challenging area in the facility operation (Figure 7).
There were a number of other significant findings obtained from the bioinformatics section. Thirty-seven percent of respondents never submit their data to a public database such as the Gene Expression Omnibus. More than 75% of respondents stated that individual investigators analyze their data sets; statisticians were second, with 48% of respondents reporting. Perhaps not surprisingly, there was significant overlap in the identification of who analyzed the data, suggesting that many investigators utilize multiple resources to address their analysis needs. Public freeware software packages were used as frequently as commercial software and were generally rated favorably. For instance, over the past several years, a number of analysis algorithms have emerged to translate microarray images into quantitative data. An evaluation of the most common methods (Figure 8) suggests that the public robust multi-array average (RMA) or GC-RMA algorithms rate better than commercial packages such as MAS5.25,26
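As commonly described, RMA combines background correction, quantile normalization, and median-polish summarization. The following minimal sketch illustrates only the quantile normalization step, using hypothetical intensities; it is not the Bioconductor implementation and makes no claim about how survey respondents ran their analyses. The function name and the simple tie handling are assumptions for illustration.

```python
def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    `arrays` is a list of equal-length lists, one per array. The reference
    distribution is the mean of the sorted intensities at each rank. Ties
    within an array are broken by position rather than averaged, which is
    a simplification.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Mean intensity at each rank across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical probe intensities for two arrays.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
norm = quantile_normalize(raw)
```

After normalization, each array contains the same set of values and only the ranking of probes within an array is preserved, which is why the step is usually applied before summarizing probes into expression estimates.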
Based on respondent interest from previous surveys, the MARG elected to include a separate section addressing protein microarray usage in the 2005 survey. Seven percent (13 out of 177) of the individuals that responded to the 2005 survey indicated that they currently utilize protein microarrays. This level of response was consistent with responses to the protein microarray questions in the prior surveys. Eight percent (n=47) in our 1999 survey and 7% (n=49) of the respondents in our 2001 survey indicated they used protein arrays. It is interesting to note that 66% (n=35) of those that responded to the future use of protein array questions in our 2003 survey indicated that they planned to utilize protein-array services in the future.
All of the facilities in the 2005 survey that offer protein-array services offer homemade protein arrays. Twenty-three percent (3 out of 13) of facilities that offer protein-array services offer commercial protein arrays in addition to homemade protein arrays. The three facilities that offer commercial protein arrays identified three different commercial sources for their protein arrays.
Given the increasing number of novel technologies and applications based on the microarray platform, it was decided that a survey of emerging technologies and applications was appropriate for the current survey. The future-direction component of the 2005 survey focused on three main areas: (1) novel array technologies, (2) emerging array applications, and (3) clinical implementation of array-based technologies. Graphical representations of the results of this survey section are shown in Figures 9–11.
The first part of the future direction and emerging technologies section asked about novel array applications. Respondents were asked to focus on new array-based applications such as solution-based arrays, comparative genomic hybridizations, CpG Island arrays, and splice variant or “tiling” array technologies. Most questions received a large response, with over 125 respondents stating they were using or were interested in using solution-based arrays, comparative genomic hybridization (CGH) arrays, and splice variant arrays. The one exception was the CpG island array section, in which only 4 respondents (interested primarily in oncology applications) were recorded (Figure 9).
Although there was a significant interest in solution-based arrays, only 10% of all respondents have implemented this technology in their laboratories, where about 70% of them are using this approach for screening a defined set of genes in a “validation” setting (Figure 9). All respondents currently using solution-based arrays are employing the Luminex platform.
In contrast to the limited number of laboratories with practical experience on solution-based arrays, over 75% of the laboratories that responded to this section have hands-on experience with CGH arrays. Those laboratories employing this technology primarily work on human DNA samples. Recently, mouse CGH arrays have become more popular; this is reflected in the 1:2 ratio with respect to users responding to this question. The laboratories that use these technologies have made CGH a significant portion of their array business by reporting that 60% of their core business is from CGH applications (Figure 9). This trend was not easily predicted by surveys in the recent past.
Another set of questions in this section addressed splice variant or tiling arrays (Figure 9). Over 50% of the laboratories responding to this section showed interest in this application, with the majority (>60%) looking to employ this technology exclusively in human studies. Less than 10% of respondents currently utilize splice variant or tiling arrays. And of those that do use these arrays, it appears that 3′-biased amplification is a significant barrier to employing this technology more broadly. An unbiased labeling approach for the whole transcriptome will be needed to employ (and interpret) this technology.
CpG island arrays only account for 1% of the business in the respondent laboratories, and this is composed exclusively of human studies (Figure 9). Only 10% of all laboratories that responded indicated interest in investigating this technology. A similar level of interest (7.3% of total survey respondents) was observed in the 2003 survey. It is unclear at this point what the growth potential of this application might be.
In addition to new array applications, several new array-based technologies have recently been made available by academic and industrial laboratories. The developers of this survey decided to focus on three-dimensional substrates for microarrays, high-throughput array-of-array technologies, and single-nucleotide polymorphism array–based approaches. Although greater than 50% of all respondents queried have an interest in 3D substrates for custom and commercial products, close to 70% have yet to try a product that utilizes a 3D surface. Respondents with practical experience utilized this technology preferentially for focused arrays (80%) instead of high-density applications. No one 3D surface at this point makes up a significant portion of this market.
Less than 40% of respondents have shown an interest in the array-of-array technology platforms. Those labs that implemented the technology are primarily interested in analyzing 1000 genes or less from 100 to 500 independent samples per experiment. Of the labs interested in deploying this technology, over 50% are ready to make the transition and are exploring commercial vendors. It is not clear either how extensive the user base will be for this technology or to what extent the currently installed platforms will be fully utilized.
The remaining question in this section addressed the interest and utility of single nucleotide polymorphism (SNP) profiling arrays (Figure 10). Respondents to this section are still looking for products that have more comprehensive coverage of the genome on a single array. Slightly over 50% of all microarray laboratories responding to this section are currently using SNP arrays in their facilities, with greater than 90% of them focused on human studies. At the same time, there is a growing interest in mouse-based applications. Although a variety of informatics tools are emerging to help investigators better understand and classify their SNP data, currently most laboratories (70%) are performing some sort of linkage analysis on their SNP array data sets.
The last part of the emerging technology section of the survey dealt with microarrays as tools for clinical diagnostics (Figure 11). Slightly over 50% of the respondents are interested in microarray analysis for diagnostic purposes, but less than half of these laboratories know about regulations associated with accreditation of clinical laboratories under the Clinical Laboratory Improvement Amendments (CLIA), and less than a third of respondents were aware of national initiatives to develop standards (e.g., the External RNA Controls Consortium: http://www.nist.gov, http://www.affymetrix.com/community/standards/ercc.affx). Hopefully, one outcome of this survey will be to make investigators aware of initiatives such as the External RNA Controls Consortium in order to help expedite national efforts to bring standardization to the gene-expression arena.27
Slightly over 60% of the respondents indicated they had the additional capacity in their microarray facilities to also provide array services for clinical applications. As noted, it is not clear that these laboratories have an understanding of the throughput needed, or a demand for a clinical diagnostic in the absence of any industry benchmarks. Sixty percent of respondents indicated that out-sourcing of clinical gene expression services was not an option. Specific respondent comments expressed a concern that clinical laboratories would not be able to prepare high-quality RNA and process it for use on a microarray.
The high level of participation in the future directions section of the MARG 2005 survey suggests that there is great interest within the microarray community in exploring and utilizing microarrays in non-expression studies. Future MARG surveys will be designed to better ascertain who is using these novel microarray applications and how they are being applied.
In summary, the MARG has conducted four biennial surveys since 1999, designed to acquire information on the operations of a microarray facility. Over this period of time, the Agilent 2100 Bioanalyzer has replaced the test array for quality control, and real-time PCR has replaced the Northern blot for validation of array results. The RMA and GC-RMA algorithms, used to generate Affymetrix GeneChip expression value estimates, were rated more favorably by respondents than the Affymetrix MAS5 algorithm. Synthetic oligonucleotides have replaced cDNA as the printing material of choice in slide-based custom microarrays. Protein microarrays continue to lag behind in development, with less than 10% of respondents using them. While two years ago microarrays were used almost exclusively for expression analysis, applications that interrogate DNA alterations, such as SNPs and copy number changes (CGH), have now gained a relevant market share. Data interpretation and bioinformatics remain the major hurdles in microarray technology. Although many respondents plan to utilize microarray technology for diagnostic purposes, the majority are not aware of accrediting processes or standardization attempts. Since there is continual change in microarray technology, periodic surveys are useful tools for providing technical and market information to microarray service providers and investigators.
This manuscript has been reviewed and approved for publication by the Environmental Protection Agency but does not necessarily reflect the views of the agency. Mention of trade names or commercial products does not constitute endorsement or recommendation for use. Special thanks to Dr. Chris Corton and Dr. Douglas Wolf for review of this manuscript.