DNA microarrays are plagued with inconsistent quantifications and false-positive results. Using established mechanisms of surface reactions, we argue that these problems are inherent to the current technology. In particular, the problem of multiplex non-equilibrium reactions cannot be resolved within the framework of the existing paradigm. We discuss the advantages and limitations of changing the paradigm to real-time data acquisition similar to real-time PCR methodology. Our analysis suggests that the fundamental problem of multiplex reactions is not resolved by the real-time approach itself. However, by introducing new detection chemistries and analysis approaches, it is possible to extract target-specific quantitative information from real-time microarray data. The possible scope of applications for real-time microarrays is discussed.
After approx. 20 years of development, DNA microarrays find themselves at a crossroads: on the one hand, there are some exciting genetic results obtained by applications of microarray technology, and on the other, the field is still plagued by false positives and questionable quantifications [1–3]. The strategic question is whether there are any realistic ways to significantly improve the quality of microarray data, or whether the technology will remain a preliminary screening tool soon to be replaced by next-generation sequencing. The central dilemma in current microarrays is how end-point fluorescence intensity measurements correlate with the concentrations of targets of interest in the interrogated sample [4,5]. The fundamental problem we have to deal with when conducting microarray experiments is the multiplex environment of the surface capture [6–8]. Indeed, multiple DNA (RNA) targets interact with each sensing zone (spot) on the surface, and each target can potentially partition among multiple spots. The multitude of statistical approaches applied to primary analysis of microarray data have to overcome several inherent problems of the technology, such as single sampling of random variables, signal normalization protocols and establishing reasonable thresholds for positive spots [9–11]. It is painfully obvious that the current microarray approach, based on end-point fluorescence intensity measurements, has limited capabilities in dealing with multiplex problems [12,13]. In particular, the ability to quantitatively discriminate SNPs (single-nucleotide polymorphisms), which are by definition highly homologous sequences, is the ultimate test of robustness (or lack thereof) of DNA microarray technology.
One way to achieve robust target discrimination is to complement surface hybridization with high-fidelity enzymatic reactions, which enhance the fidelity of target recognition by several orders of magnitude. When 3′-matched and 3′-mismatched duplexes are bound to the sensing surface, polymerase can efficiently extend the 3′-matched primer and stall on the 3′-mismatched primer. Indeed, the dynamic range of mismatch discrimination by polymerases is 10³–10⁶, depending on the type and position of the mismatch. Ligases have similar fidelity with respect to mismatch recognition.
Alternatively, SNPs may be resolved by utilizing deoxynucleotides labelled with different fluorophores, so that single-nucleotide extensions produce distinct fluorescent tags for each sequence variant.
An enzymatically enhanced approach was implemented, for example, in Illumina SNP microarrays (http://www.illumina.com). It is based on complementing surface capture with enzymatic reactions (polymerization or ligation) which have much higher fidelity than hybridization proper. If the 3′-end of a hybridization probe is mapped to a known SNP, then different alleles may be distinguished during primer extension reactions using labelled nucleotides. The Illumina SNP platform became a valuable genotyping tool; however, it provides only qualitative data and is costly and labour-intensive (although it is capable of high throughput owing to genome-wide parallelism). Moreover, this approach can interrogate only known SNPs, and it cannot be used as a discovery tool for novel SNPs.
Oxford Gene Technology adopted a similar approach for quantification of RNAs extracted from single cells: upon cell lysis, RNAs are captured by surface-bound primers and are visualized at single-molecule resolution during primer extension with fluorescently labelled dNTPs (http://www.ogt.co.uk). Of course, enzymatic enhancements come at a price: the protocols are much more complicated, and the cost of experiments is significantly higher compared with standard microarray experiments. The main methodological issue with enzymatically enhanced microarrays is related to alterations of enzymatic activity at the surface, mainly due to polymer brush effects within the capture zones. Also, since current enzymatic approaches rely on end-point measurements, quantitative analysis is still questionable.
An alternative way to approach microarray experiments is to acquire signals in real time, as reactions progress, similar to what is done in real-time PCR [20–22]. A real-time microarray approach has proved valuable in mechanistic studies of surface DNA capture: mass transport phenomena [23,24], binding kinetics, mechanisms of target discrimination [26–30], etc. The question still unanswered is whether there is a place for real-time microarrays in practical applications.
The purpose of this review is to discuss the pros and cons of real-time microarray approaches, and to identify which experiments can benefit from implementing these technologies.
Understanding the mechanisms of surface interactions is important to establish a reasonable level of expectation with respect to microarray technology: we have to evaluate both capabilities and limitations of fundamental processes in order to develop adequate applications.
There are two major mechanisms defining surface capture of target nucleic acids: mass transport from the bulk solution to the surface, and surface hybridization. Mass transport of the targets to the surface is achieved through diffusion and convection. Surface hybridization is a reversible second-order reaction occurring through a heterogeneous mechanism at the surface–solution interface. A general case of target capture is represented by the diffusion-kinetic model [31,32]. Depending on the reaction conditions, mainly concentrations and temperature, either of these mechanisms may become rate-limiting. The lower the concentrations of the targets, the more pronounced is the contribution of mass transport to the overall kinetics of surface capture, which in turn results in lower specificity of target recognition. Since diffusion changes approximately linearly with temperature, whereas hybridization reaction rates change approximately exponentially, elevated temperatures result in greater mass transport control of surface capture. Moreover, the higher the temperature of the reaction, the lower the thermodynamic dynamic range of target discrimination. However, if we shift to lower temperatures, there is a danger of slowing surface reactions to the point where they become practically irreversible, which leads to the loss of specificity. Another aspect of DNA (RNA) surface capture is related to polymer brush effects during the reaction, which may lead to steric complications with accessibility of the probes, probe–probe interactions and negative co-operativity of binding.
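In the reaction-limited regime, the surface capture described above reduces to a reversible pseudo-first-order (Langmuir-type) binding model. The sketch below, with hypothetical rate constants and mass transport assumed to be non-limiting, illustrates how probe occupancy relaxes exponentially towards its equilibrium value:

```python
import math

def surface_binding(c_target, k_on, k_off, t):
    """Fraction of occupied probes at time t for a reversible
    pseudo-first-order surface hybridization (reaction-limited
    regime; mass transport assumed non-limiting)."""
    k_obs = k_on * c_target + k_off        # observed relaxation rate (1/s)
    theta_eq = k_on * c_target / k_obs     # equilibrium probe occupancy
    return theta_eq * (1.0 - math.exp(-k_obs * t))

# hypothetical values: 1 nM target, k_on = 1e6 /M/s, k_off = 1e-4 /s
occupancy_1h = surface_binding(c_target=1e-9, k_on=1e6, k_off=1e-4, t=3600)
```

Note that lowering `k_off` (slower dissociation) raises the equilibrium occupancy but also lowers `k_obs` and hence lengthens the relaxation time, which is the kinetic trap behind practically irreversible low-temperature reactions.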
This picture is complicated further by multiplex interactions both at the sensing surfaces and in solution: in the course of hybridization experiments, fluorescently labelled targets from the sample hybridize to unlabelled surface probes. This means that the fluorescent signal acquired from each spot of the microarray is a composite of the signals from individual targets. Using thermodynamic analysis, we can predict a subset of significant targets for each spot; however, their relative contributions to the signal are also defined by their concentrations in the sample and their affinities for the spot. Moreover, thermodynamic predictability of the signals holds only when surface reactions reach equilibrium. However, it has been demonstrated, both experimentally and theoretically, that in the course of hybridization, with a typical duration of 18 h, equilibrium is not reached [35,36]. Moreover, different components reach equilibrium at different times (depending on their concentrations and affinities), and even then it is not a true equilibrium: the initial binding phase is followed by competitive displacement, which in turn extends the time taken to reach equilibrium [36,37].
If all of the above is true, and straightforward correlations between the fluorescence intensity of the spots and the concentrations of the corresponding targets cannot be substantiated, why are microarrays capable of producing any sensible data? The approach adopted by all end-point microarray platforms is to switch to relative measurements using two-colour experiments. If we mix a known reference sample labelled with one colour (red) with an unknown experimental sample labelled with a different colour (green), similar targets should be captured with similar efficiencies, and we can quantify unknown concentrations relative to the reference concentrations. Note that, by using this ratiometric approach, we do not have to satisfy the equilibrium requirement: the binding kinetics of the targets is proportional to their relative concentrations at any time point. However, in applying a two-colour approach, we implicitly make a very strong assumption, namely that only the targets of interest for each spot may differ in concentration, while the rest of the target subset (loosely called non-specific binding) is equal for both samples. It is easy to see that this requirement of equal background cannot be justified.
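The ratiometric logic above can be stated in a few lines of code; the spot intensities and reference concentration below are hypothetical, and the equal-efficiency and equal-background assumptions in the docstring are exactly the ones questioned in the text:

```python
def relative_concentration(signal_green, signal_red, c_reference):
    """Two-colour ratiometric estimate of an unknown concentration.
    Assumes (strongly) that both channels see equal capture efficiency
    and equal non-specific background; under those assumptions the
    green/red signal ratio tracks the concentration ratio at any time
    point, even before equilibrium is reached."""
    return c_reference * signal_green / signal_red

# hypothetical spot intensities with a 5 nM reference sample
c_unknown = relative_concentration(signal_green=2400.0,
                                   signal_red=1200.0,
                                   c_reference=5e-9)
```

If the backgrounds are not equal between the two channels, the ratio is biased by the differing composite signals, which is the failure mode discussed above.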
If the paradigm of ‘one probe–one target’ cannot be realized and binding events are characterized by low selectivity, how can we achieve improved specificity of target capture? One of the solutions is to compare signals from perfectly matched and mismatched probes, as implemented by Affymetrix (the so-called PM/MM pairs). The assumption was that binding of background species to both probes was similar, whereas specificities to the target of interest were distinctly different; so, by comparing signals from the PM/MM pair, we could extract information on the amount of the specific target. This approach was broadly discussed and criticized, because it turned out that the assumptions of the model did not hold [40,41].
There are two general approaches to improved specificity that are commonly used: improved probe design and stringent washing. Improved probe design is usually based on sequence text analysis and on nearest-neighbour thermodynamic calculations. A more advanced version of probe design is based on finding the unique thermodynamic signature of the probe–target hybrids using statistical thermodynamics calculations. It is aimed at minimizing cross-hybridization with homologous sequences (i.e. reducing multiplexing) and achieving higher thermodynamic stability with the targets of interest (i.e. higher signal intensity) compared with secondary targets. Although rational probe design is a necessary tool in any hybridization experiment, it cannot solve the problem of multiplexing in the general case. For example, by designing hairpin probes, we can theoretically achieve higher specificity of target recognition; in practice, however, this effect can be negated by differences in target concentrations in the sample.
Stringent washing is the most widespread method of minimizing the effects of cross-hybridization [44,45]. Mechanistically, discrimination among different targets is achieved mainly through differences in dissociation rates. So, if hybridization is followed by washing, i.e. by placing the microarray in a volume of an appropriate buffer, we should expect differential dissociation of targets from the surface. Ideally, at the end of the wash, the most stable hybrids retained by the surface correspond to the targets of interest. In practice, however, we usually follow generic protocols and perform the wash step without any control of the process. The problem is that, during the wash step, we remove all targets, and we hope that, in the process, the majority of secondary targets will dissociate, while a significant (although undefined) portion of the targets of interest remains surface-bound. By using a high-temperature wash, for example, or by introducing hybrid destabilizers (detergents, organic solvents), we accelerate dissociation; however, the dissociation rate constants converge, and we run into the possibility of washing out our targets of interest together with the secondary targets. This may result in false-negative data. Low-temperature washes proceed more slowly, and, without objective criteria for the duration of the wash, we run into the danger of obtaining false-positive results. As discussed below, real-time monitoring of the wash step may become an attractive alternative method of quantitative analysis.
Real-time microarray technologies are based on continuous (or quasi-continuous) monitoring of the target signals from the sensing surface. This requires discrimination of the surface-bound targets from the targets in the sample solution during the course of surface capture. Although different methodologies may be used to achieve selective acquisition of the surface signals, two popular approaches are based on acquisition of surface fluorescence or surface plasmon resonance shifts (or a combination of both) [46–55]. Of course, this in turn necessitates the development of new instrumental platforms and analysis algorithms [52,56].
So what can a real-time approach bring to the practical picture, and how can it improve microarray data? Are significant changes in experimental setup and analysis justified by a qualitative jump in microarray results? It is clear that one advantage of the real-time approach is that the data represent a multitude of measurements for each spot which can be model-fitted (a binding curve). Thus the statistical bottleneck of single sampling is resolved: instead of a single measurement per spot, we obtain a multitude of independent values, and the standard errors become tighter owing to curve fitting. Also, since we can efficiently evaluate concentrations of the targets of interest with a smaller number of different probes, we save significant space on the capture surface. These considerations may present significant advantages for assays where the dynamic range of the gene concentrations is narrow, for example in CNV (copy number variation) experiments. Additionally, since quantitative analysis is now based on kinetic approaches, there is no requirement to reach a stable end-point (equilibrium).
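As a minimal sketch of the curve-fitting advantage, the routine below recovers the plateau and observed rate from a synthetic single-component binding curve. The log-linearization shortcut and all parameter values are illustrative choices of ours, not a published analysis pipeline:

```python
import math

def fit_binding_curve(times, signals):
    """Estimate plateau A and observed rate k_obs for a binding curve
    s(t) = A*(1 - exp(-k_obs*t)). Crude sketch: take the last sample
    as the plateau and log-linearize the clearly pre-plateau points;
    a real pipeline would use non-linear least squares."""
    A = signals[-1]
    xs, ys = [], []
    for t, s in zip(times, signals):
        if t > 0 and s < 0.9 * A:          # keep points well below plateau
            xs.append(t)
            ys.append(math.log(1.0 - s / A))
    # least-squares slope through the origin for y = -k_obs * t
    k_obs = -sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return A, k_obs

# synthetic curve with known parameters: A = 1.0, k_obs = 0.01 /s
times = [60 * i for i in range(11)]        # 0..600 s, one sample per minute
signals = [1.0 * (1 - math.exp(-0.01 * t)) for t in times]
A_fit, k_fit = fit_binding_curve(times, signals)
```

Even this crude fit recovers the parameters to within a few per cent, illustrating how a whole curve per spot constrains the estimate far better than a single end-point intensity.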
However, this straightforward approach does not resolve the problem of multiplexing: the observed signal is still a composite of several components, so that quantification of specific targets remains uncertain.
Since direct monitoring of hybridization kinetics cannot help us to quantify targets of interest in a multiplex environment, it may be productive to apply a melting analysis approach to microarray data: indeed, it has been shown that discrimination among the targets is achieved mainly through pronounced differences in dissociation rates from the sensing surfaces. There have been several encouraging reports about melting analysis on microarrays, although some results seem to be controversial [56–58]. If we look at the melting approach from the mechanistic perspective, the question arises as to when in the course of the experiment the melt should start. Indeed, since the population of different bound species is subject to displacement dynamics during hybridization, the results of melting experiments will shift depending on the extent of completion of hybridization, and thus they are not directly correlated with the initial concentrations of the targets of interest. If melting analysis is performed after the system reaches equilibrium, multiple equilibria result in suppression of some transitions and temperature shifts in others (S. Blair, H. Engstrom and A. Chagovetz, unpublished work). We therefore suggest that equilibrium melting analysis is not a good analytical option for microarrays. However, if we apply an irreversible melting approach (after equilibrium is reached), i.e. melting combined with a flow of null buffer (to prevent rebinding) or, in an alternative embodiment, irreversible real-time washing at constant temperature followed by kinetic analysis of the dissociation curves, we can obtain a quantitative analysis of the sample composition.
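As a sketch of such kinetic analysis of dissociation curves, assume (our assumptions, for illustration only) irreversible first-order dissociation during the wash and rate constants for the specific and secondary species known from calibration; the composite wash signal then decomposes into component amplitudes by linear least squares:

```python
import math

def decompose_wash_curve(times, signals, k_specific, k_secondary):
    """Split a real-time wash (dissociation) curve into amplitudes of a
    specific and a secondary target, assuming irreversible first-order
    dissociation with known, calibrated rate constants:
        s(t) = A1*exp(-k1*t) + A2*exp(-k2*t)
    Solves the 2x2 normal equations of linear least squares."""
    f1 = [math.exp(-k_specific * t) for t in times]
    f2 = [math.exp(-k_secondary * t) for t in times]
    a11 = sum(x * x for x in f1)
    a12 = sum(x * y for x, y in zip(f1, f2))
    a22 = sum(y * y for y in f2)
    b1 = sum(x * s for x, s in zip(f1, signals))
    b2 = sum(y * s for y, s in zip(f2, signals))
    det = a11 * a22 - a12 * a12
    A1 = (b1 * a22 - b2 * a12) / det
    A2 = (a11 * b2 - a12 * b1) / det
    return A1, A2

# synthetic wash curve: 70% specific (slow) + 30% secondary (fast)
times = list(range(0, 600, 20))
signals = [0.7 * math.exp(-0.002 * t) + 0.3 * math.exp(-0.02 * t)
           for t in times]
A_specific, A_secondary = decompose_wash_curve(times, signals, 0.002, 0.02)
```

The decomposition works because the flow of null buffer prevents rebinding, so each component decays independently; with rebinding allowed, the coupled equilibria would defeat this simple model.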
Alternatively, we propose the use of displacement kinetics during hybridization to evaluate concentrations of the targets in a sample indirectly. This method is based on real-time monitoring of the hybridization and displacement of known fluorescently labelled oligonucleotides (competitors), which are mixed with the unknown sample and have lower affinity for the capture probes than the targets of interest. Although we cannot predict the kinetic behaviour of all of the secondary targets, we have demonstrated that, even in cases of multiple background species, the kinetic model can be reduced to a three-component case, which allows unambiguous quantification of the targets of interest. The advantage of this method is the elimination of the sample-labelling step; the disadvantage is that we need to spike the sample with competitors, which limits the number of targets under investigation. Currently, we are in the process of applying CDA (competitive displacement analysis) to quantitative SNP detection in mitochondrial DNA (heteroplasmy analysis).
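A minimal three-component simulation (Euler integration; all rate constants hypothetical) illustrates the displacement effect: the labelled competitor binds quickly but weakly, so its real-time signal rises and is then displaced at a rate set by the amount of unlabelled target:

```python
def competitor_signal(c_target, c_comp, t_end=3600.0, dt=1.0):
    """Euler-integrated sketch of the three-component model behind
    competitive displacement analysis: an unlabelled target and a
    labelled competitor compete for the same surface probes. The
    observed fluorescence is the competitor occupancy. All rate
    constants below are hypothetical, chosen so the competitor
    dissociates much faster than the target."""
    kon, koff_target, koff_comp = 1e6, 1e-5, 1e-3
    th_target = th_comp = 0.0
    trace = []
    for _ in range(int(t_end / dt)):
        free = 1.0 - th_target - th_comp   # fraction of unoccupied probes
        th_target += dt * (kon * c_target * free - koff_target * th_target)
        th_comp += dt * (kon * c_comp * free - koff_comp * th_comp)
        trace.append(th_comp)
    return trace

# same competitor spike (1 nM), two different unknown target amounts
low = competitor_signal(c_target=1e-10, c_comp=1e-9)
high = competitor_signal(c_target=1e-9, c_comp=1e-9)
```

In this toy model the competitor signal peaks and then decays, and its final level is lower when more target is present, which is what allows the target concentration to be inferred without labelling the sample itself.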
Real-time microarray technology has the potential to become a truly quantitative tool in genomics. Availability of instrumental platforms is a necessary building block towards broad implementation of real-time microarray approaches. However, we need to develop new detection chemistries and analysis algorithms in order to make real-time microarrays a viable alternative in genetic analysis. One major concern we need to recognize is that (as it stands now) applications are limited to small-to-medium size arrays (~1000–10000 spots) owing to the computational intensity of real-time analysis.
Our work is supported by the National Institutes of Health [grant number 1R44G084603-01].