Assay probe design
The most important factor for a successful high-throughput gene expression assay is the design of the primers and probes used to detect the gene(s) of interest through the qRT-PCR reaction. Fortunately, bioinformatics resources and commercial products released in the last 5 years have made this a much less daunting task. In particular, many of the same companies that sell the instrumentation and consumables for real-time PCR are happy to assist with the design of assays to maximize the use of their products. Bio-Rad, Applied Biosystems/Life Technologies/Invitrogen, and Roche all offer pre-designed packages of primers and/or probes. These sets are often referred to as assays and are sold as a complete package for real-time analysis. In addition, working with public genome databases, these same companies' websites will design an assay based on any known sequence or gene, optimizing primer sequence and amplicon length. Most assays are available in the two formats described in this protocol: primers alone for use with SYBR Green, or primer-probe sets for use as TaqMan assays. Additional formats, such as molecular beacons or FRET probes, are also available and may have an advantage for specific applications, such as when a loss of signal rather than a gain of signal is desired during amplification.
Another key part of assay design is selection of a proper control gene for normalizing against effects such as cytotoxicity or variation in cell plating. Ideally a gene unrelated to any pathway affected by the target gene should be used, such as the housekeeping genes TBP, GAPDH, or various rRNA sequences. In addition, it is best to use a control gene that amplifies in roughly the same number of cycles as the target gene, as this allows more robust normalization. In extreme cases where target genes are very highly expressed or underexpressed and the target and control signals are not both in the linear detection range of the assay, one of the assays can be adjusted by limiting one of the reagents such as a primer (known as primer-limiting) to artificially change the Cq value by reducing the efficiency of each round of PCR.
It is also ideal to measure the control gene and the target gene in the same reaction plate by using multiplexed probes with differentially emitting fluorophores. This removes the variability between the control and target genes that arises from pipetting error or unequal treatment (light or temperature exposure) when separate plates are needed for each gene, as is the case when using SYBR Green.
Prior to any small molecule high-throughput screen, extensive assay development is required, and is especially critical in an assay as sensitive as qRT-PCR. All steps of a protocol, such as cell culturing, seeding, washing, and reagent transfers should be validated to minimize variation due to technical limitations. Untreated control plates should be run to determine the variability of the measurements and the corresponding sensitivity of the assay. When properly designed, these controls can also serve to test other technical aspects such as consistency of plate positioning (avoiding rotations) and recording transfers among the multiple plates that are generated in the course of an experiment.
In addition to technical sensitivity as measured by standards such as the percent coefficient of variation, in phenotypic assays it is important to establish a threshold for a biologically relevant response. Ideally a small-molecule positive control can be used both to establish this biological response during assay development and to monitor the biological consistency during HTS. In the absence of a small-molecule control, a genetic control, such as an overexpression vector or siRNA, can be used to establish assay sensitivity. However, genetic controls suffer from the limitation that the effect seen is often orders of magnitude greater than the response that can be expected from a small molecule, making comparison to screening results difficult. Finally, in the absence of either of these controls, a known stress inducer or cytotoxic agent can be used as a technical control simply to track and confirm the execution of reagent transfers, even though this type of control has no relationship to the desired biology.
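The percent coefficient of variation mentioned above can be computed from untreated control wells along the following lines. This is a minimal sketch, not part of the protocol: the function names, the example ΔCq values, and the sign convention (ΔCq taken as target Cq minus control Cq) are assumptions, and the conversion to a linear scale assumes a perfect per-cycle doubling.

```python
import statistics


def relative_expression(delta_cq, amplification_factor=2.0):
    """Convert a ΔCq value to linear relative expression.

    Assumption: ΔCq = target Cq minus control Cq, so a larger ΔCq
    means less target transcript.
    """
    return amplification_factor ** (-delta_cq)


def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; %CV is undefined")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical ΔCq values from untreated control wells on one plate.
untreated_delta_cq = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0]
linear_values = [relative_expression(d) for d in untreated_delta_cq]
print(f"%CV of untreated wells: {percent_cv(linear_values):.1f}")
```

Computing the %CV on linear-scale values rather than on raw Cq values is a design choice here, since Cq is a logarithmic quantity and its %CV depends on the arbitrary absolute Cq level.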
Overall, the design and validation of the assay is key for successful small-molecule screening by real-time qRT-PCR. This type of assay is among the most expensive used for small-molecule screening, and while the biological relevance of the data can provide a good return on this investment, it is easy to waste a large amount of expensive reagents if experiments are not properly planned and tested prior to screening.
Data analysis and concentration-response curves
The ΔΔCq method described above is used to relate two differently treated samples, in this case compound and mock treatment, where each sample has a target gene compared to a control gene (Schmittgen and Livak, 2008). The data for the mock treatments are typically normally distributed due to experimental error, and single-point values can be compared to this distribution to determine the likelihood of a significant effect due to compound treatment.
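Comparing single-point values to the mock-treatment distribution can be sketched as a z-score calculation. The function names, the example values, and the cutoff of 3 standard deviations are illustrative assumptions, not values taken from the protocol.

```python
import statistics


def z_scores(mock_values, compound_values):
    """Express each compound well as standard deviations from the mock mean."""
    mock_mean = statistics.mean(mock_values)
    mock_sd = statistics.stdev(mock_values)
    return [(value - mock_mean) / mock_sd for value in compound_values]


def call_hits(mock_values, compound_values, threshold=3.0):
    """Flag wells whose |z| meets a (hypothetical) cutoff."""
    return [abs(z) >= threshold for z in z_scores(mock_values, compound_values)]


# Hypothetical ΔΔCq values: mock-treated wells scatter around zero.
mock_ddcq = [-0.2, 0.1, 0.0, 0.2, -0.1]
compound_ddcq = [0.1, 1.5]
print(call_hits(mock_ddcq, compound_ddcq))
```

The calculation is done directly on ΔΔCq values because the text describes the mock data as approximately normally distributed on that scale.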
The next step following hit calling is typically retesting of the compounds in the assay at a range of dilutions to construct a concentration-response curve. There are two additional considerations in concentration-response tests compared to the primary HTS. First, the goal is not to measure the significance of an outlier, but to observe the biological effect of the compounds. This requires converting the Cq-based measurement, which is on a logarithmic scale, to a linear scale that represents the actual relative change in copy number. This conversion is done by the calculation $\text{Fold change} = 2^{\Delta\Delta Cq}$. These values should then be plotted against the tested concentration values for curve fitting.
Table 2. Hypothetical concentration-response data converted from ΔΔCq (calculated from raw data as described in protocol steps 19 & 20) to fold change in expression and normalized to % of maximal effect. Fold repression is equal to $2^{\Delta\Delta Cq}$ ...
Figure 3. Plotting of concentration-response curve of hit compounds for reduced expression of a target gene. By converting to the relevant biological measurement, fold change, the apparent EC50 has shifted approximately 2-fold. Hypothetical data shown in ...
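The conversion from ΔΔCq to fold change and the normalization to percent of maximal effect can be sketched as follows. The function names are hypothetical, and the normalization shown (0% at a fold change of 1, 100% at the largest observed fold change) is an assumption, since the exact normalization used for the table is not spelled out here.

```python
def fold_change(ddcq, amplification_factor=2.0):
    """Convert ΔΔCq to linear fold change; 2.0 assumes perfect doubling."""
    return amplification_factor ** ddcq


def percent_of_max(fold_changes):
    """Scale fold changes so that no change (1.0) is 0% and the maximum is 100%.

    This baseline-subtracted scaling is an assumed convention.
    """
    max_fold = max(fold_changes)
    if max_fold == 1.0:
        raise ValueError("no effect observed; cannot normalize to maximum")
    return [100.0 * (fold - 1.0) / (max_fold - 1.0) for fold in fold_changes]


# Hypothetical ΔΔCq values across an increasing concentration series.
ddcq_series = [0.0, 0.5, 1.0, 2.0, 3.0]
folds = [fold_change(d) for d in ddcq_series]
for ddcq, fold, pct in zip(ddcq_series, folds, percent_of_max(folds)):
    print(f"ΔΔCq={ddcq:.1f}  fold={fold:.2f}  %max={pct:.1f}")
```

Because fold change grows exponentially with ΔΔCq, the midpoint of the response falls at a different concentration on the two scales, which is the shift in apparent EC50 noted in the Figure 3 caption.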
The conversion $\text{Fold change} = 2^{\Delta\Delta Cq}$ (or alternatively $2^{-\Delta\Delta Cq}$, depending on whether a fold repression or fold induction is being described) assumes a perfect PCR efficiency; that is, it assumes that each round of PCR exactly doubles the amount of DNA product during the exponential phase of amplification. In reality, the per-cycle amplification factor is usually near 1.9 but can be as low as 1.7 (Tuomi et al., 2010). If more precision is desired, a standard curve can be generated for the qPCR assay by making a serial dilution of a sample PCR template and measuring the actual fold amplification observed in each round of PCR; this measured value can then replace the base of 2 in the above calculation.
Figure 4. Construction of a standard curve to measure PCR efficiency. An arbitrary starting amount of template (in this case, sufficient to give a Cq of 15) is 2-fold serially diluted and the dilution series is measured by real-time qPCR. The slope of the resulting ...
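One way to turn a 2-fold dilution series into a per-cycle amplification factor is a least-squares fit of Cq against dilution step. This is a sketch under stated assumptions: the function name and example data are hypothetical, and the fit is a plain linear regression rather than any specific instrument software's method. If each dilution step raises Cq by `slope` cycles, then the amplification factor E satisfies E^slope = dilution factor.

```python
import math


def amplification_factor(cq_values, dilution_factor=2.0):
    """Estimate per-cycle amplification from a serial dilution series.

    cq_values[i] is the Cq measured after i successive dilutions.
    """
    n = len(cq_values)
    steps = list(range(n))
    mean_x = sum(steps) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in steps)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(steps, cq_values))
    slope = sxy / sxx  # cycles gained per dilution step
    return dilution_factor ** (1.0 / slope)


# Hypothetical series starting at Cq 15 with a true factor of 1.9 per cycle.
cycles_per_step = math.log(2.0) / math.log(1.9)
example_cq = [15.0 + i * cycles_per_step for i in range(6)]
print(f"Estimated amplification factor: {amplification_factor(example_cq):.3f}")
```

The estimated factor can then be used in place of 2 as the base when converting ΔΔCq to fold change.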
The second consideration when trying to fit concentration-response curves is deciding whether the maximal effect can be determined. In the hypothetical concentration-response graph, the response plateaus at high compound concentration, enabling the determination of an EC50 (the concentration giving 50% of maximal effect) for the curve. More typically, either due to complex biology or weakly active compounds, it is not possible to determine the upper asymptote for the fitted curve, and the maximal effect may be very different for various compounds. In these cases, it is more appropriate to determine a threshold of N-fold induction or repression that represents a statistically and biologically significant effect, and to define ECN as the concentration at which the compound achieves that threshold. This allows more robust comparison of compounds with a wide range of biological effects and enables better selection of compounds to carry forward for further development.
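An ECN value can be estimated without a fitted upper asymptote. The sketch below swaps in simple interpolation between the two tested concentrations that bracket the threshold, rather than a fitted curve; the function name and the example data are hypothetical.

```python
import math


def ec_threshold(concentrations, fold_changes, n_fold):
    """Return the interpolated concentration where the response first
    reaches n_fold, or None if the threshold is never reached.

    Interpolation is linear in log10(concentration), an assumed choice
    that matches the usual log-scaled concentration axis.
    """
    points = sorted(zip(concentrations, fold_changes))
    prev_conc, prev_fold = points[0]
    if prev_fold >= n_fold:
        return prev_conc
    for conc, fold in points[1:]:
        if fold >= n_fold:
            fraction = (n_fold - prev_fold) / (fold - prev_fold)
            log_conc = math.log10(prev_conc) + fraction * (
                math.log10(conc) - math.log10(prev_conc)
            )
            return 10 ** log_conc
        prev_conc, prev_fold = conc, fold
    return None


# Hypothetical data: concentrations in µM and fold repression at each one.
conc_um = [0.1, 1.0, 10.0, 100.0]
fold = [1.0, 1.5, 3.0, 4.0]
print(ec_threshold(conc_um, fold, n_fold=2.0))
```

A compound that never reaches the threshold returns `None`, which keeps weakly active compounds from being assigned a misleading potency.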
The greatest difficulty in small-molecule screening is high variability, which can be observed as high variation in the mock treatment wells. This reduces the sensitivity of the experiment and increases the rate of false positives. In qRT-PCR experiments, it has been shown that with careful technique, the RT and qPCR steps together represent less than 20% of the total variation between replicate analyses (Kitchen et al., 2010). The majority of variability comes from biological variability, including sample preparation. In the protocol presented here, this includes culture of the cell line to be used, plating and treatment of the cells, and washing and lysis prior to cDNA generation. It is important to keep cultured cells as consistent as possible across all compound tests, including minimizing passage number variation and exposure to variable environmental conditions. Plating of cells should be done in manageable batches to reduce variability due to plating order. Washing parameters should be set to minimally disturb the cells, and lysis should be optimized for the cells used to ensure complete reaction. While replicates may help determine the extent of variability seen in treatment wells, it is costly to run biological replicates (treating two separate plates of cells with the same compounds) as the cDNA generation accounts for the majority of the cost of the assay. Real-time qPCR replicates from the same cDNA well can confirm the robustness of the analysis step, but as mentioned above this is not expected to reveal much variability and is time consuming to perform for an entire HTS campaign.
Very high Cq values are indicative of poor mRNA isolation or cDNA generation and are problematic because very small amounts of contamination or variability can have a large effect on the results. If signals are too low (that is, Cq values are too high), confirm the efficiency of the PCR reaction and consider re-optimizing the assay design or even selecting a different target gene. Adjustment of cell number may have a small effect but is unlikely to rescue low signals, as a 2-fold increase in cell seeding shifts Cq by only ~1 cycle.
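The ~1 cycle figure follows from the exponential nature of PCR: multiplying the input template by some factor shifts Cq by the logarithm of that factor, taken in the base of the per-cycle amplification. A minimal sketch, with a hypothetical function name:

```python
import math


def cq_shift(input_fold_change, amplification_factor=2.0):
    """Expected change in Cq when the template amount is multiplied by
    input_fold_change. Negative values mean the signal appears earlier."""
    return -math.log(input_fold_change, amplification_factor)


# Doubling the cell number gains about one cycle; even a 10-fold
# increase gains only a little over three cycles at perfect efficiency.
for fold in (2.0, 4.0, 10.0):
    print(f"{fold:>4.0f}-fold more template: {cq_shift(fold):+.2f} cycles")
```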
Finally, it is essential to include controls for non-amplification to ensure that there is no contamination. If no-template controls are showing signals that are within several cycles of experimental wells, the data cannot be accepted as valid. Contamination can be controlled by the same methods used for standard PCR, including isolating and cleaning a workspace and never bringing post-amplification materials into the PCR setup area.
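A plate-level check on the no-template controls might look like the sketch below. The text says only "within several cycles", so the 5-cycle gap used here is a hypothetical cutoff, as are the function name and the choice to compare against the latest-amplifying experimental well.

```python
def flag_contaminated_ntcs(ntc_cqs, sample_cqs, min_gap_cycles=5.0):
    """Return one flag per no-template control (NTC) well.

    ntc_cqs uses None for wells with no amplification. An NTC is flagged
    when its Cq is less than min_gap_cycles later than the highest
    experimental Cq. The 5-cycle default is an assumed cutoff.
    """
    latest_sample_cq = max(sample_cqs)
    flags = []
    for cq in ntc_cqs:
        if cq is None:
            flags.append(False)  # no signal at all: clean control
        else:
            flags.append(cq - latest_sample_cq < min_gap_cycles)
    return flags


# Hypothetical plate: one clean NTC, one late signal, one suspicious well.
print(flag_contaminated_ntcs([None, 38.0, 27.0], [22.0, 24.0, 25.0]))
```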
It is expected that in a compound library of appropriate size or design, some amount of biological effect will be seen, if only due to cytotoxic compounds. Typically a validation library containing a sample of bioactive compounds is run early in the screen to confirm the presence of this effect. A typical assay plate shows a range of Cq values due to compound treatment, and similar small-molecule high-throughput screening results have been reported and analyzed previously (Arany et al., 2008). Overall, the active well rate for compounds showing the desired change in target gene expression is highly dependent on the biological system under investigation. It is entirely possible that a particular gene is intractable to specific modulation by small molecules, and that any putative hits will be removed as artifacts in subsequent follow-up assays. As with all discovery projects, researchers need to make an informed decision about the return on investment of additional screening or follow-up.
Assay development and validation require the largest investment of time, but this investment is essential to avoid wasting time and money in downstream failures. Several rounds of mock runs are usually required to get low-noise, consistent results through the entire process of cell culture, lysis, cDNA generation, and qPCR analysis. This will take several weeks of iteration.
Depending on the choice of protocol and number of compounds tested, the small-molecule HTS can be staged in various ways. Using a two-step protocol to isolate cDNA has the advantage of allowing the separate execution of each phase of the assay rather than multitasking, which adds to the total project time but can make execution more consistent. The only time-sensitive step in the protocol is the lysis treatment, which can ruin the nucleic acid sample if run too long. For large numbers of multiwell plates it is advisable to have at least two scientists working together to execute the protocol, as subsequent steps can back up and interfere with the key timing step. It is important to recognize other potential bottlenecks as well, including standard PCR blocks for the RT step, robotic pipetting throughput, and real-time instrument availability. Proper planning will enable optimal use of these resources.
Even though the real-time qPCR analysis step takes 1–1.5 hours per plate, it is now possible to screen tens or hundreds of thousands of compounds using real-time qRT-PCR thanks to miniaturization to 384- or 1536-well format. If an assay is robust and appropriate instrumentation is available, thousands of compounds can be screened each day during a campaign, enhancing the potential for discovery of specific small-molecule modulators of the target gene of interest.
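The daily throughput implied by these run times can be estimated with simple arithmetic. The sketch below assumes one well per compound, a single multiplexed plate per compound set, and an instrument that can be loaded around the clock; the function name, the number of control wells, and the hours of operation are all hypothetical.

```python
def compounds_per_day(wells_per_plate, hours_per_plate,
                      control_wells_per_plate=0, instruments=1,
                      hours_per_day=24.0):
    """Estimate compounds screened per day from qPCR run time alone.

    Only whole plates are counted, and other bottlenecks (cell plating,
    lysis, RT step) are ignored.
    """
    plates_per_instrument = int(hours_per_day // hours_per_plate)
    compound_wells = wells_per_plate - control_wells_per_plate
    return plates_per_instrument * instruments * compound_wells


# 384-well plates at 1.5 h per run, with 32 wells reserved for controls.
print(compounds_per_day(384, 1.5, control_wells_per_plate=32))
# 1536-well plates at 1 h per run on a single instrument.
print(compounds_per_day(1536, 1.0))
```

Even with shorter working days or more control wells, the estimate stays in the range of thousands of compounds per day per instrument, consistent with the figure given in the text.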