The quantitative polymerase chain reaction (qPCR) has become the method of choice for fast and accurate gene transcript measurements. As gene expression quantification is currently performed using different qPCR instruments, software, reagents, plates and seals, a robust method is required in order to compare data generated in different laboratories. In this study we assess the value of long oligonucleotides as universally applicable, quantifiable external standards in cross laboratory data comparison. This study demonstrates for the first time the power of this strategy to detect and correct inter-run variation and to enable exchange of data between different laboratories, even when not using the same PCR platform.
The basic principle of inter-run calibration (IRC) is the use of identical samples, called inter-run calibrators, in different runs to correct for the often underestimated technical inter-run variation. The qBase framework and the accompanying qBasePlus software perfected the IRC procedure by allowing more than one inter-run calibrator to be used and by performing the calibration after normalization of the gene expression levels. This results in more accurate IRC, fewer calculations (and hence smaller error bars due to less error propagation) and higher flexibility (allowing re-synthesis of cDNA from the same IRC RNA sample) (13). In this study, we relied on the same mathematical framework, using a five-point serial dilution series of external standards to correct for experimentally induced variation, not only from run to run, but also related to the use of different qPCR instruments, Cq value determination methods, mastermixes and plastics.
While external standards based on serial dilutions of, e.g., plasmids or cDNA are often used to calculate PCR efficiency, in this study we used them to ensure reproducibility and validation of the results across laboratories and experiments. The applied standards consist of synthetic oligonucleotide controls (one for each gene) that need to be run in parallel with the samples. The proposed strategy is universally applicable and offers a high level of flexibility, as anyone can design, order and use this kind of standard.
The principle of this strategy is that Cq or normalized relative quantity (NRQ) values are corrected with a gene- and run-specific IRC factor reflecting the mean Cq or NRQ value obtained from IRC samples (here, a series of standards) with known copy number that are run in parallel with the samples. It is therefore crucial to ensure that the IRC sample input is exactly the same for both runs. This can be achieved by using the same synthesized lot of external standard: usually more than 10^14 molecules are supplied, providing enough material to create standards for multiple thousands of IRC experiments. However, if standards from different synthesis rounds or suppliers are used, an accurate copy number measurement of the yield is needed. Indeed, standards synthesized by different companies or in successive rounds might differ in supplied concentration, compromising the results if used for IRC. To overcome this problem, a digital PCR pilot experiment could be performed to quantify the number of molecules in the supplied standards before using them in actual experiments (18). Alternatively, manufacturers could provide kits for a particular assay that include a standardized standard. Of note, when using the preferred way of IRC (i.e. on NRQ values instead of Cq values), which is ideally suited for gene expression studies, it is sufficient to have the same target ratios (instead of actual identical copy numbers) for the matching IRC sample pair measured on both runs; a simple concentration measurement of the standardized external oligonucleotides would be adequate in this case.
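The NRQ-level correction described above can be illustrated with a minimal sketch. It assumes that the run-specific calibration factor is the geometric mean of the NRQ values of the IRC samples measured in that run, and that calibrated values are obtained by dividing each sample NRQ by this factor. The function names and all numbers are hypothetical and serve only as an illustration; they are not data from this study.

```python
import math


def calibration_factor(irc_nrq):
    """Geometric mean of the NRQ values of the IRC samples in one run (assumed definition)."""
    if not irc_nrq or any(v <= 0 for v in irc_nrq):
        raise ValueError("IRC NRQ values must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(v) for v in irc_nrq) / len(irc_nrq))


def calibrate(sample_nrq, irc_nrq):
    """Divide each sample NRQ by the calibration factor of its own run."""
    cf = calibration_factor(irc_nrq)
    return [v / cf for v in sample_nrq]


# Hypothetical values for one gene: the same standards are measured in both
# runs, but run B reads two-fold higher than run A for purely technical reasons.
run_a_irc = [1.0, 2.0, 4.0]
run_b_irc = [2.0, 4.0, 8.0]
run_a_samples = [3.0]
run_b_samples = [6.0]  # same biological sample as in run A

cnrq_a = calibrate(run_a_samples, run_a_irc)
cnrq_b = calibrate(run_b_samples, run_b_irc)
print(cnrq_a, cnrq_b)  # the two calibrated values agree up to rounding
```

Because only the ratio between each sample and the calibrators of its own run enters the result, the sketch also shows why identical target ratios, rather than identical absolute copy numbers, are sufficient for NRQ-level calibration.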
To avoid an additional potential source of inter-run variation, all RNA samples should ideally be extracted using the same method and standard operating procedures. This was not the case in this study, where RNA samples came from different international laboratories. However, this type of variation may be effectively removed by the normalization step, as recently demonstrated in a large gene expression study on the same series of neuroblastoma samples, in which a prognostic multigene expression signature was successfully tested on a large cohort of samples irrespective of possible confounding factors related to different RNA extraction procedures (17).
As shown previously, the more inter-run calibrators are used, the more accurate and precise the results are (13). In this study we used a five-point serial dilution series of external standards. While we could confirm that using more dilution points for IRC results in better calibration, the difference is marginal here, presumably because carefully diluted synthetic oligonucleotides were used within the limits of accurate quantification. The use of complex cDNA samples (with variable and potentially unknown variation in gene expression levels) as inter-run calibrators [as done in Hellemans et al.] will most likely contribute to higher inter-run variation, necessitating more than one IRC sample. In general, the use of more than one IRC sample enables quality control by inspecting results when calibrating with one or the other. Furthermore, using five IRC points as in this study also enables accurate and precise estimation of the PCR efficiency in each run.
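As an illustration of how a five-point dilution series yields a per-run efficiency estimate, the sketch below fits the usual standard curve (Cq against log10 of the input copy number) by ordinary least squares and converts the slope into an efficiency as 10^(-1/slope) - 1. This is the common textbook approach and is assumed here; it is not necessarily the exact calculation used in this study. The function names, copy numbers and Cq values are hypothetical.

```python
import math


def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq = intercept + slope * log10(copies)."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency as a fraction: 1.0 means perfect doubling in every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical five-point, ten-fold dilution series of one standard.
copies = [1e6, 1e5, 1e4, 1e3, 1e2]
# Synthetic Cq values generated for an ideal reaction (slope of about -3.32);
# the intercept of 38.0 is an arbitrary illustrative choice.
ideal_slope = -1.0 / math.log10(2)
cq = [38.0 + ideal_slope * math.log10(c) for c in copies]

slope, intercept = fit_standard_curve(copies, cq)
print(round(slope, 3), round(amplification_efficiency(slope), 3))
```

With real measurements the points scatter around the fitted line, so the spread of the residuals gives an additional check on the quality of the dilution series in each run.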
Concordance in class prediction between the different platforms was nearly perfect after calibration, and significantly higher after than before NRQ-level IRC. While the results without IRC might seem satisfactory at first sight, it is important to consider the following. In this study, we observed similar shifts in Cq value between different genes when comparing two platforms. As this difference depends on various parameters and is in principle unpredictable, this information cannot be used a priori without proper control, that is, the use of an IRC sample to measure and correct for the run-to-run differences. A simple change in, e.g., baseline/threshold settings for Cq value determination, or the use of a new primer pair or PCR reagent batch, could completely abrogate the observed so-called systematic difference in Cq value. Another explanation for the unexpectedly good correlation in class prediction without IRC is the use of the same patient cohort on all platforms. On the one hand, this was required to demonstrate the occurrence of inter-run variation and its effective removal. On the other hand, it caused each platform to be calibrated to some extent by itself. For classification purposes, whereby multiple genes are incorporated in a score or classifier, this appears to work to some extent; for accurate and precise analysis of the expression levels of a single gene, a universal and robust IRC procedure is clearly needed, as outlined in this article.
The proposed strategy employs external standards and qPCR, both of which have been extensively evaluated and are widely used. Other strategies to standardize qPCR data, such as StaRT–PCR, are based on internal standards (6). Based on competitive PCR, StaRT–PCR is a patented technique for measuring multigene expression in samples and relies on end-point quantification. The advantage of the method is the incorporation of competitive templates into standardized mixtures of internal standards (SMIS), which allows comparison of generated data since the values are determined relative to the same standardized mixtures. Compared to our strategy, StaRT–PCR is characterized by a more limited dynamic range of linear quantification, is more labour intensive, and is commercially available only through Gene Express. Our strategy is directly accessible to anyone by simply ordering the oligonucleotide sequence of interest and thus offers high flexibility.
In conclusion, our study demonstrates that the use of external oligonucleotide standards is a powerful method for accurate cross laboratory data comparison. Among other things, it makes it possible to test a gene signature on a single patient sample in any lab in the world and to compare the results with a reference set established in another lab. The proposed strategy enables multicentre studies conducted at different sites, greatly advancing this field of application.