Persistently elevated post-treatment plasma EBV DNA is a robust predictor of relapse in NPC. However, assay standardization is necessary for use in biomarker-driven trials. We conducted a study to harmonize the method between four centers with expertise in EBV DNA quantitation.
Plasma samples from 40 NPC patients were distributed to four centers. DNA was extracted and EBV DNA copy number was determined by real-time quantitative PCR (BamHI-W primer/probe). Centers used the same protocol but generated their own calibrators. A harmonization study was then conducted using the same calibrators and PCR master mix and validated with ten pooled samples.
The initial intraclass correlations (ICC) for the first 40 samples between each center and the index center were 0.62 (95%CI: 0.39–0.78), 0.70 (0.50–0.83) and 0.59 (0.35–0.76). The largest sources of variability were the use of different PCR master mixes and calibrators. Standardization improved ICC to 0.83 (0.5–0.95), 0.95 (0.83–0.99) and 0.96 (0.86–0.99), respectively, for ten archival frozen samples. For fresh plasma with spiked-in EBV DNA, correlations were >0.99 between the centers. At five EBV DNA copies/reaction or above, the coefficient of variation (CV) was <10% for the cycle threshold (Ct) among all centers, suggesting this concentration can be reliably used as a cutoff for defining the presence of detectable EBV DNA.
Quantitative PCR assays, even when performed in experienced clinical labs, can yield large variability in plasma EBV DNA copy numbers without harmonization. The use of common calibrators and PCR master mix can help to reduce variability.
Treatment with concurrent cisplatin (CDDP)-based chemoradiotherapy (CCRT) followed by three cycles of adjuvant CDDP and 5-fluorouracil (5FU) is the current standard of care in the United States for locally advanced nasopharyngeal carcinoma (NPC)(1). Recent advances in radiation delivery and the use of concurrent chemotherapy have substantially increased the rate of local control, now ranging between 91–96%(2–6). However, the development of distant metastases remains problematic (~30% at 5 years) and ultimately results in death(4–6). Compliance with adjuvant chemotherapy after CCRT is problematic, as only half of patients completed all adjuvant chemotherapy due to severe toxicity(1, 7). The results of a recent randomized trial from China, which failed to show a survival advantage for adjuvant chemotherapy(8), called into question the benefit of adjuvant CDDP/5FU. This study was criticized for not using a non-inferiority design, and hence was potentially underpowered. In contrast, another study suggested that adjuvant 5FU chemotherapy decreases distant metastasis in NPC(9). A plausible contributor to these conflicting findings is the inability to properly classify patients with different risk profiles for enrollment in trials. The development of a biomarker to reliably identify the subset of patients at high risk for metastasis may help to identify patients who would benefit from adjuvant chemotherapy while sparing low-risk patients from unnecessary toxic treatment.
Blood is in direct contact with all organs and is an attractive sample type for noninvasive cancer surveillance. Since the first evidence showing that tumor-associated DNA can be detected in the blood(10), several studies have evaluated tumor DNA as a biomarker for cancer surveillance(11–16). The presence of viral DNA in virus-related tumors offers a distinct marker for detection in the blood. Epstein-Barr virus (EBV) DNA is often found in the plasma of NPC patients and has been shown to be a reliable marker for prognostication in this disease(14, 15, 17). Specifically, pre-treatment plasma EBV DNA correlates with cancer stage(17, 18) and clinical outcome(19) in endemic NPC. Post-treatment (radiation or CCRT) plasma EBV DNA status, classified as detectable or undetectable, has an even better correlation with prognosis and has been used to monitor recurrence after therapy(17, 18, 20, 21). While undetectable post-treatment plasma EBV DNA was associated with an excellent progression-free survival (PFS: 80–90%), a persistently detectable level was associated with an extremely poor PFS (10–15%) and may be a marker of subclinical residual disease(21). These observations have been consistently reproduced in large patient cohorts treated in different countries from both endemic and non-endemic areas(18, 20, 21). Published data to date indicate that the most robust and reliable biomarker for NPC prognostication is the post-treatment EBV DNA level.
Given the robustness of post-treatment plasma EBV DNA in prognostication, we propose to incorporate this biomarker in the next RTOG (Radiation Therapy Oncology Group) phase III NPC trial. We postulate that patients with undetectable EBV DNA after CCRT have a low risk for distant relapse; these patients will be randomized to receive adjuvant cisplatin/5FU (current standard) versus observation. In contrast, those who continue to have detectable EBV DNA levels are at a “high risk” for distant relapse and will be randomized to cisplatin/5FU (current standard) versus a more intensified regimen. Because this will be an international trial enrolling patients from different countries, and given the logistical difficulty and high cost associated with shipping plasma samples across continents as well as the need for rapid result generation for randomization, it is important to first prospectively validate the assay for EBV DNA and to harmonize it across the different clinical laboratories in participating countries. All four participating laboratories have had extensive experience in measuring plasma EBV DNA in NPC patients and are nationally accredited. Here we report the results of a prospective effort to harmonize this assay across these four international laboratories in preparation for the upcoming trial.
Although different primer/probe sets have been used to measure circulating EBV DNA in the plasma or serum, the most commonly used primer/probe set is the one targeting the BamHI-W region of the EBV genome(14). This fragment occurs 8–11 times in the EBV genome, allowing more sensitive detection than single-copy EBV genes such as EBNA1, LMP2 or POL1(18, 22). Since the purpose of the phase III trial is to identify patients with detectable EBV DNA at diagnosis for entry into the trial, and to distinguish patients with detectable levels from those with undetectable levels after chemoradiation for risk stratification and treatment assignment, it is critical that the most sensitive assay be used. More importantly, the largest and most robust published studies that established the prognostic significance of post-treatment circulating EBV DNA in NPC employed the BamHI-W primer/probe set(20, 21, 23). Therefore, we decided to employ this assay for EBV DNA measurement for the upcoming trial and focus our efforts on harmonizing the assay in the participating laboratories.
The four laboratories selected for this harmonization were: (1) the Stanford Clinical Virology Laboratory (STF, certified under the Clinical Laboratory Improvement Amendments [CLIA]), which will serve as the central laboratory for US sites, (2) the Chemical Pathology laboratory at The Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong (HK, accredited by the National Association of Testing Authorities [NATA] and the Royal College of Pathologists of Australia), which will serve as the central laboratory for Hong Kong sites, (3) the National Taiwan University Hospital Clinical Laboratory (NTU, accredited by the Taiwan Accreditation Foundation [TAF] and a participant in the College of American Pathologists [CAP] proficiency program) and (4) the Chang Gung Memorial Hospital-Linkou Clinical laboratory (CG, accredited by both TAF and CAP); NTU and CG will serve as the central laboratories for Taiwanese sites. These laboratories were selected because of their commitment to the planned international phase III RTOG trial mentioned above, their experience in measuring plasma EBV DNA, their accreditation status, and their ability to offer the test as a clinical assay.
For the pre-harmonization study, anonymized plasma samples from 40 newly diagnosed NPC patients were collected from CG with patient consent and distributed to the four laboratories. Patients of all stages (22 stage I-II and 18 stage III-IV) were included. For the harmonization process, 23 plasma samples from newly diagnosed NPC patients were collected from STF under an IRB-approved study (NCT00186433), pooled and distributed to the four laboratories. For the post-harmonization validation study, plasma samples from 40 anonymized NPC patients were combined to create ten pooled samples, aliquoted and distributed.
For each sample, 3.5 mL of blood was collected into EDTA-coated tubes and centrifuged for 10 minutes; the plasma was recovered and frozen at −80°C until shipped. DNA was isolated manually from 400 µL of plasma using the QIAamp Blood Mini Kit (QIAGEN Inc, Valencia, CA) and eluted with 50 µL of elution buffer or water. For the harmonization, two different operators from the same laboratory performed DNA extraction and quantitative polymerase chain reaction (qPCR) to determine inter-operator variability.
Calibrators were prepared by using DNA extracted from an EBV-positive cell line Namalwa, which is a diploid cell line containing two integrated viral genomes/cell as previously described(14). This DNA was also used for the spiking experiments.
DNA samples were quantified for EBV DNA using a real-time qPCR system targeting the BamHI-W fragment region of the EBV genome as described by Lo et al(14). Each run included patient samples, calibrators for constructing a standard curve, and appropriate positive, negative, and no-template controls. All PCRs were performed in triplicate. The following real-time PCR detector systems were used: STF - Rotor-Gene Q (Qiagen) and ABI7900; HK - ABI7300; NTU - ABI7900HT; CG - ABI7900 (Applied Biosystems, Foster City, CA). The plasma concentration of EBV DNA (copies/mL) was calculated as previously described(14).
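As an illustration of how a calibrator-based standard curve turns a Ct value into a plasma concentration, the following minimal Python sketch fits Ct against log10(copies) and then scales copies/reaction to copies/mL. It is not the laboratories' actual software. The function names, the calibrator values, the sample Ct and the template volume per reaction are all hypothetical; the scaling formula (copies per reaction, multiplied by the ratio of elution volume to template volume, divided by the plasma volume extracted) is the standard form for this kind of conversion and is assumed here rather than quoted from the cited method.

```python
import math

def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_per_reaction(ct, slope, intercept):
    """Invert the standard curve to estimate copies in one PCR reaction."""
    return 10 ** ((ct - intercept) / slope)

def copies_per_ml(copies_rxn, elution_ul, template_ul, plasma_ml):
    """Scale copies/reaction to copies per mL of extracted plasma."""
    return copies_rxn * (elution_ul / template_ul) / plasma_ml

# Hypothetical calibrator dilution series (not data from the study).
calibrator_copies = [10, 100, 1000, 10000]
calibrator_cts = [36.0, 32.7, 29.4, 26.1]
slope, intercept = fit_standard_curve(calibrator_copies, calibrator_cts)

# Hypothetical sample Ct and volumes; the template volume is an assumption.
sample_copies = copies_per_reaction(32.7, slope, intercept)
sample_conc = copies_per_ml(sample_copies, elution_ul=50.0,
                            template_ul=5.0, plasma_ml=0.4)
print(f"slope={slope:.2f} intercept={intercept:.2f} "
      f"concentration={sample_conc:.0f} copies/mL")
```

Because every reported concentration passes through the fitted slope and intercept, any difference between calibrator sets changes all sample values at once, which is one way to see why shared calibrators matter for inter-laboratory agreement.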
All assay results are summarized using means and standard deviations, with log transformation applied when the data deviate from normality. Inter-rater reliability is estimated by the intraclass correlation coefficient (ICC), a measure of the reproducibility of assay results among different laboratories. Based on Shoukri et al(24), with a one-sided 5% type I error rate and 80% power, we need 39 patients to detect a difference of 0.2 between two laboratories, assuming an ICC of 0.6 under the null hypothesis. The within- and between-subject variations and ICCs are estimated using a general linear model with measurement error(25). The interlaboratory variability in qPCR measurement and DNA extraction was also summarized using the general linear model. Spearman’s rank correlation coefficient was used to assess results for the “spike-in” experiment among the four laboratories. All analyses were performed using SAS 9.2.
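To make the ICC concrete, here is a minimal Python sketch of a one-way random-effects ICC computed from ANOVA mean squares. This is a simpler, swapped-in estimator chosen for illustration; it is not the general linear model with measurement error fitted in SAS for this study, and it will not necessarily reproduce the reported values. The function name and the paired log10 values are hypothetical.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC from ANOVA mean squares.

    measurements: one row per sample, each row holding the k values
    reported for that sample (for example, one value per laboratory).
    """
    n = len(measurements)
    k = len(measurements[0])
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    # Between-sample and within-sample sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(measurements, row_means)
                    for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical log10(copies/mL) pairs: (index laboratory, second laboratory).
paired_log_copies = [(2.0, 2.1), (3.0, 2.8), (4.0, 4.3), (1.0, 1.2)]
icc_example = icc_oneway(paired_log_copies)
print(f"ICC = {icc_example:.3f}")
```

The estimate approaches 1 when disagreement between laboratories is small relative to the spread between samples, and falls toward 0 or below when the laboratories disagree about as much as the samples differ.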
Due to the small volume of plasma collected per patient, we used 400 µL of plasma for DNA extraction instead of the 800 µL that is normally employed in the clinical protocol. The results for the initial run using 40 patient samples are shown in Table 1. There was a large inter-laboratory variability in the absolute number of EBV DNA copies/mL. The detection rate for EBV DNA, defined here as > 0 copies/mL, was 58% for NTU, 80% for STF and 93% for both CG and HK. Table 2 shows the ICC for each site compared to STF as the index site. All ICCs were below the desired value of 0.80. The observed variability between centers was not related to differences in qPCR instruments because in one lab (STF), we tested these 40 samples on the Rotor-Gene Q and ABI7900 and observed similar results for both instruments (data not shown).
To harmonize the assay, we identified non-standardized factors that could be modified; these included calibrators (for the standard curve), which were individually prepared in each laboratory, and the TaqMan master mix, which was purchased from two different vendors. Three laboratories employed the premade Roche master mix (Roche Applied Sciences, Indianapolis, IN) whereas one laboratory prepared the master mix in-house using components from ABI (Applied Biosystems). As shown in Table 3, the use of different master mixes, with other aspects of the procedure held constant (including the DNA extraction, operators, other reagents, calibrators and PCR instruments), resulted in a large difference in the measured EBV DNA copy number. Laboratories using the Roche master mix gave similar results, whereas the laboratory using PCR kits from ABI gave results five- to ten-fold higher. Therefore, we standardized on the Roche master mix for subsequent studies.
Next we assessed the variability in the qPCR step and DNA extraction by different operators within the same laboratory and by different calibrator sets. To test the interlaboratory variability in qPCR, we provided the four laboratories with the same amount of extracted EBV DNA from pooled NPC plasma samples at the concentration of approximately 4000 copies/mL. In one laboratory (HK), the results were available only for one calibrator set. As shown in the top half of Table 4, qPCR variability was much larger for the different calibrator sets than for the different operators. Therefore, we standardized the calibrators using those prepared by HK in all laboratories for the remaining harmonization process. To ensure that calibrator shipping did not result in degradation, we shipped a calibrator set from Hong Kong to the US then back to Hong Kong and tested it against the same batch that was not shipped. Shipping did not result in a significant decrease in calibrator performance (Supplementary Table 1).
We also evaluated the DNA extraction variability between the four different laboratories, using the same DNA extraction kit. In two laboratories (STF and NTU), we also assessed DNA extraction variability by different operators. For this analysis, aliquots of a pooled plasma sample with EBV DNA concentration of approximately 4000 copies/mL were distributed to the four laboratories. As shown in the lower half of Table 4, there was minimal inter-operator variability. There was likewise good agreement between the four laboratories and the difference was within one PCR threshold cycle (data not shown).
We then validated the harmonization with all laboratories using the HK calibrators, Roche master mix and standardized procedures. Because fresh plasma samples were not available, we pooled 40 archival NPC plasma samples that had previously been frozen and thawed at least once to generate ten pooled samples with different concentrations. The samples were shipped from Hong Kong to the US and then to Taiwan. As shown in Table 2, the ICC improved from 0.62 to 0.83 for NTU vs. STF; 0.70 to 0.72 for CG vs. STF and 0.59 to 0.96 for HK vs. STF. Interestingly, there was significant protein aggregation noted in the plasma samples received by the CG site, presumably due to a prolonged stay at room temperature related to delayed shipment delivery. Measurements of EBV DNA from the protein aggregate and the supernatant from the same plasma samples yielded significantly different results, with the aggregate showing 2.3–2.5 times higher copy number than the supernatant. Therefore, CG used the leftover samples received by the other Taiwanese site, NTU, which did not have much aggregation. Results from this repeated assay showed an ICC of 0.95 for CG vs. STF (Table 2). Since all laboratories will be testing fresh plasma samples without protein aggregation for the trial, another way to validate the harmonization is to “spike-in” known concentrations of EBV DNA into fresh, negative, non-NPC plasma samples. Table 5 shows the result of the “spike-in” experiment, which was highly consistent between the four laboratories. The correlations were >0.99 (p<0.0001) between Stanford and the other three laboratories.
Because an important aspect of the planned phase III trial is to distinguish patients with detectable and undetectable post CCRT plasma EBV DNA for risk stratification, it is important to show that the harmonized assay is sensitive enough to measure a very low level of EBV DNA in the plasma. As there are 8 to 11 units of the BamHI-W fragment in each EBV genome, this assay is more sensitive than assays detecting targets with a single copy. To investigate the detection limit of this assay for all involved laboratories, we analyzed 10–20 replicates of diluted DNA from the Namalwa cell line at a concentration of 0.5, 1.25, 2.5, 5, 25 and 100 copies/reaction. Although the assay showed positive signals in several replicates at concentrations below 5 copies/reaction, the CV for the number of PCR threshold cycle (Ct) was greater than the 10% that is normally accepted for a clinical test. At 5 copies/reaction, the CV was consistently less than 10% for all 4 sites. However, even at these low CVs, the standard deviation (SD) for Ct can be up to 1.1 cycles. If we use a fixed Ct threshold (mean or median value), up to ~50% of the samples having that concentration would be falsely excluded. Therefore, we decided to use the mean Ct value + 2 standard deviations (SDs) at the concentration of 5 copies/reaction as a cutoff for defining a detectable level in the subsequent clinical trial. Theoretically, this would include 95% of the samples having an actual concentration of 5 copies/reaction, which translates to 60 copies/mL of patient plasma.
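The detection-limit reasoning above can be sketched in a few lines of Python: compute the CV of replicate Ct values at each dilution level and find the lowest concentration at and above which the CV stays under 10%. This is an illustrative sketch only. The function names and the Ct replicates are hypothetical, and the sketch assumes that only replicates that produced a Ct are passed in (replicates with no amplification would need separate handling).

```python
import statistics

def ct_cv_percent(cts):
    """Coefficient of variation of replicate Ct values, in percent."""
    return 100.0 * statistics.stdev(cts) / statistics.mean(cts)

def lowest_reliable_level(replicates_by_level, max_cv=10.0):
    """Lowest concentration at and above which every level has CV < max_cv."""
    reliable = None
    # Walk from the highest concentration down and stop at the first failure.
    for level in sorted(replicates_by_level, reverse=True):
        if ct_cv_percent(replicates_by_level[level]) >= max_cv:
            break
        reliable = level
    return reliable

# Hypothetical Ct replicates keyed by copies/reaction (not study data).
dilution_series = {
    0.5: [34.0, 42.0, 38.0],
    5: [36.0, 37.0, 38.0],
    25: [34.5, 35.0, 35.5],
    100: [32.4, 32.5, 32.6],
}
for level in sorted(dilution_series):
    print(level, round(ct_cv_percent(dilution_series[level]), 2))
print("lowest reliable level:", lowest_reliable_level(dilution_series))
```

Walking down from the highest concentration means a single noisy low-level dilution cannot be skipped over, which mirrors the requirement that the chosen concentration be consistently measurable.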
To conduct a biomarker driven study, it is crucial that the assay for the biomarker is performed in a central laboratory with CLIA or equivalent certification, which applies to all participating laboratories here. However, in situations where it is not feasible to use one central laboratory due to the size of the study and the logistical/cost issues of shipping fresh samples across continents, it is important that the assay be standardized across the participating laboratories. Although plasma EBV DNA is a well-known prognostic marker for NPC and is currently being offered as a clinical test at several institutions, little is known about the inter-laboratory variability of the assay.
Here we showed that the inter-laboratory variability between clinical laboratories with significant experience in this assay could be quite large without harmonization. Major contributors to the inter-laboratory variability were the PCR master mix and calibrators, more so than inter-operator variability. Surprisingly, different PCR master mixes yielded more than 5-fold divergence in EBV DNA measurements, despite using the same calibrators, suggesting a difference in amplification efficiencies between plasma DNA and calibrators. Similarly, different calibrator sets, though prepared from the same cell line and protocol, resulted in larger variability than inter-operator variability. Hence, harmonization using the same calibrators and PCR master mixes should result in less inter-laboratory inconsistency. For the planned clinical trial, calibrators will be prepared in HK and shipped to all sites. Fresh calibrators will also be calibrated against old ones to maintain consistency over time. Shipping calibrators on dry ice did not result in degradation or affect calibrator performance.
The first WHO International Standard for EBV for Nucleic Acid Amplification Techniques NIBSC code: 09/260 became available during the course of this study. Given the current efforts to harmonize this assay, we anticipate that additional harmonization using WHO material will further improve correlation and allow the results of future biomarker trials to be more generalizable. Therefore, we plan to use both the WHO Standards and the Namalwa DNA as calibrators during the trial to prospectively compare the performance of both standards in a large NPC patient group.
Although harmonization resulted in ICC improvement for all sites, it was less marked for the Taiwanese sites compared to that between STF and HK. Our detailed analyses indicated that prolonged exposure to room temperature of previously frozen plasma samples resulted in marked protein aggregation, which surprisingly influenced the readings. Higher levels were noted from the protein aggregate than the supernatant from the same plasma samples. In contrast, samples without aggregation and the use of “spike-in” samples, which closely resembled fresh plasma, showed minimal variability between the laboratories, confirming the success of the harmonization.
Because of (1) the sensitive nature of qPCR, (2) the amplification factor used to convert copies/reaction to copies/mL, which can magnify inter-operator and inter-laboratory variability, and (3) Ct being the direct assay readout, we believe that the Ct value is more useful than copies/mL in defining the minimum detection limit of this assay for risk stratification. Although two laboratories (STF and HK) were able to detect EBV DNA at a concentration as low as 0.5 copies/reaction (i.e. 6 copies/mL), the detectability of EBV DNA at this level was relatively unpredictable for the other two laboratories (CG and NTU). Therefore, if we simply use any detectable level of EBV DNA as the criterion for risk stratification, significant inter-center variation would be expected. On the other hand, if we use a fixed quantitative cutoff value, the variation in quantitative measurement may also result in significant inter-center differences. For example, if we measure 100 plasma aliquots, each with a putative EBV DNA concentration of 500 copies/mL, and use the median measured concentration as a cutoff, then 50 aliquots would have a measured concentration above and 50 would have a measured concentration below the cutoff. The samples with a measured concentration below the cutoff value by random variation would be falsely rejected. To resolve this potential confounding issue, we propose to use a cutoff value derived from the mean Ct of a concentration that all four laboratories can consistently detect (≥ 5 copies/reaction) plus 2 standard deviations. Using this cutoff ensures that 95% of the samples having an actual concentration of 5 copies/reaction or 60 copies/mL would be correctly classified as detectable. To obtain the cutoff for each run, the laboratory will include 10–20 replicates of 5 copies/reaction in order to accurately determine the mean and SD for the Ct value at this concentration; the detectability point will be at 2 SDs above the mean Ct value.
Plasma samples having a Ct below this cutoff (corresponding to a higher EBV DNA copy number) will be regarded as having a detectable level of EBV DNA.
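The per-run cutoff rule described above can be sketched as follows: take the replicate Ct values measured at 5 copies/reaction, compute the mean plus 2 SDs, and call a sample detectable when its Ct falls below that value. This is a minimal sketch under stated assumptions, not the trial's actual procedure. The function names and all Ct values are hypothetical, the sample SD is used, and a sample with no amplification is represented here as `None`.

```python
import statistics

def ct_cutoff(reference_cts, n_sd=2.0):
    """Mean Ct of the reference replicates plus n_sd sample SDs."""
    return statistics.mean(reference_cts) + n_sd * statistics.stdev(reference_cts)

def is_detectable(sample_ct, cutoff):
    """True when the sample amplified and its Ct is below the cutoff.

    sample_ct is None when no amplification was observed (an assumed
    convention for this sketch).
    """
    return sample_ct is not None and sample_ct < cutoff

# Hypothetical Ct values for ten replicates at 5 copies/reaction in one run.
reference_cts = [36.2, 36.8, 37.0, 37.4, 36.6, 37.1, 36.9, 37.3, 36.5, 37.2]
cutoff = ct_cutoff(reference_cts)
print(f"cutoff Ct = {cutoff:.2f}")

# Hypothetical patient sample Ct values from the same run.
for ct in (35.1, 37.5, 38.4, None):
    print(ct, is_detectable(ct, cutoff))
```

Recomputing the cutoff within each run, rather than fixing one Ct value for all laboratories, lets the threshold track run-to-run and instrument-to-instrument shifts in Ct.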
In summary, we detected significant variability in plasma EBV DNA measurements between different clinical laboratories, which substantially improved with harmonization. This establishes a standardized assay that can be used internationally for the measurement of this biomarker for future prospective studies. It also provides a process for credentialing new laboratories, and ensures that the trial results will be applicable to the real world. The development of clinically actionable biomarkers is the key to personalized medicine and this harmonization is an important benchmark for all future biomarker-driven studies.
This is a study to harmonize the measurement of the plasma biomarker EBV DNA in four accredited international labs in order to launch a biomarker-driven international phase III trial in nasopharyngeal carcinoma (NPC). Although EBV DNA is a well-accepted robust prognostic marker in NPC and has been offered in several clinical laboratories as a means to track tumor burden and for post-treatment surveillance, little is known about the inter-laboratory variability of this quantitative assay. In this study, we showed that, without harmonization, the inter-laboratory variability is quite large for the same assay using identical procedures and primer/probe set. We demonstrated that harmonization, which involves standardization of buffers and calibrators, is feasible and significantly reduces such variability. Through this harmonization process, we established a standardized assay that can be used internationally for the measurement of this biomarker for future prospective studies and developed a process for credentialing new laboratories.
U10 CA21661 from the National Cancer Institute (NCI), the Li Ka Shing Foundation (YMDL, ATC, KCAC)
Conflict disclosure: None