Chronic myeloid leukaemia (CML) is the paradigm for targeted cancer therapy. RT-qPCR is the gold standard for monitoring response to tyrosine kinase inhibitor (TKI) therapy based on the reduction of BCR-ABL1 transcripts in blood or bone marrow. Some patients with CML and very low or undetectable levels of BCR-ABL1 transcripts can stop TKI-therapy without CML recurrence. However, about 60 percent of patients discontinuing TKI-therapy have rapid leukaemia recurrence. This has increased the need for more sensitive and specific techniques to measure residual CML cells. The clinical challenge is to determine when it is safe to stop TKI-therapy. In this review we describe and critically evaluate the current state of CML clinical management and the different technologies used to monitor measurable residual disease (MRD), focusing on comparing RT-qPCR with new methods entering clinical practice. We discuss advantages and disadvantages of the new methods.
Chronic myeloid leukaemia (CML) was recognized as a clinical entity in the early 19th century on the grounds of extensive splenomegaly and leukocytosis. In 1960, almost 100 years later, a consistent chromosome abnormality termed the Philadelphia (Ph) chromosome was described in the cells of patients with CML by Nowell and Hungerford. In 1973, Janet Rowley reported that the Ph chromosome resulted from a reciprocal translocation between chromosomes 9 and 22. In the 1980s the fusion of two genes, BCR and ABL1, was identified as causing CML. BCR-ABL1 results in constitutive activation of the ABL1 tyrosine kinase domain, which accounts for the disease phenotype. In the late 1990s the BCR-ABL1 protein was recognised as a potential druggable target, which led to the development of several ABL1 tyrosine kinase inhibitors (TKI). Their introduction into clinical use has changed the course of CML: a once-fatal disease is now a condition associated with a life expectancy similar to that of the normal age-matched population.
CML is a tri-phasic disease. It usually presents in a chronic phase (CP) marked by over-production of mature granulocytes (with <10% blasts in the blood and bone marrow). Untreated chronic phase CML invariably transforms into an acute phase resembling acute lymphoid or myeloid leukaemia with >20% blasts in the blood or bone marrow. Many patients have an intermediate phase termed accelerated phase, which is often poorly defined, with 10–20% blasts.
CML has a world-wide annual incidence of 1–2/100,000 population with a slight male predominance and accounts for 15% of adult leukaemia  in the Western hemisphere. Median age at onset is 60 years with a wide range .
Most cases of CML have t(9;22). Other chromosome rearrangements such as complex translocations or insertions occur in some cases , . t(9;22) is also detected in 25–30% of adults and 5–10% of children with acute lymphoblastic leukaemia (ALL) , , . Some of these patients may have had a clinically undetected chronic phase.
The molecular hallmark of CML is the exchange of genetic material between the long arms of chromosomes 9 and 22 [t(9;22)(9q34.1;22q11.2)]. This translocation joins the 5′ part of BCR (the gene covers a ~138.5 kbp region; 23 exons) on chromosome 22 and the 3′ part of ABL1 (the gene covers ~174 kbp; 11 exons) on chromosome 9, forming the BCR-ABL1 fusion oncogene.
The breakpoint in ABL1 is typically in the 150 kb intronic region between exons 1a and 1b. Rarely the breakpoint is upstream of exon 1b or downstream of exon 1a, but almost invariably upstream of exon 2.
Breakpoints in BCR are more variable but tend to occur within three main breakpoint cluster regions: the major (M-bcr), minor (m-bcr) and micro (μ-bcr) regions. Breakpoints in the M-bcr region are associated with two major transcripts designated e13a2 (b2a2) and e14a2 (b3a2). The exons within the M-bcr region previously numbered b1-5 were later renamed e12-e16 after the successful mapping of the entire BCR gene. Both transcripts are translated into a 210 kDa protein. Breakpoints in the m-bcr region result in the e1a2 transcript, which is translated into a 190 kDa protein. Breaks in the μ-bcr region are transcribed into the e19a2 transcript, which encodes a 230 kDa protein with greater kinase activity compared to other BCR-ABL1 encoded proteins. Other rarer transcripts also occur: e6a2 (resulting in p195), e8a2 (resulting in p200) and e18a2 (resulting in p225). Breakpoints in ABL1 occasionally occur upstream of exon 3, resulting in a BCR-a3 transcript.
The constitutively-activated tyrosine kinases encoded by BCR-ABL1 activate downstream pathways affecting cell adhesion, DNA-repair, survival and proliferation, all of which drive leukaemia development.
The reciprocal translocation product, ABL1-BCR, is present in 60–70% of patients with CML and a higher proportion of those with Ph-positive ALL. Its role in driving leukaemia development, if any, is unclear. ABL1-BCR encodes two proteins, p40ABL−BCR and p96ABL−BCR. ABL1-BCR does not appear to correlate with clinical response, contrary to initial suggestions. Finally, in 5–10% of patients the translocation results in the relocation of the 3′ BCR sequences to a third partner chromosome.
Imatinib (Gleevec/Glivec or STI-571) was the first TKI, identified by Novartis in high-throughput screens for TKIs. Its introduction revolutionised the outcome of patients with CML, and it was licensed as first-line therapy for all newly diagnosed CML patients in 2002. The identification of ABL1 tyrosine kinase domain (TKD) mutations resistant to imatinib led to the development of more potent TKIs with specific efficacy against certain mutants, including dasatinib, nilotinib, bosutinib and ponatinib. More details about the different TKI therapies are provided in Box 1.
The response to TKI therapy is defined by haematological, cytogenetic and molecular endpoints alongside the interval to reach them. The European LeukemiaNet (ELN), the World Health Organization (WHO) and the US National Comprehensive Cancer Network (NCCN) publish guidelines for managing CML patients using TKIs. The ELN guidelines group patients into three cohorts based on cytogenetic and/or molecular milestones: optimal response, warning and failure (Table 1). In these guidelines the 12 month response assessment is critical. Patients failing to achieve a complete cytogenetic response by this time are unlikely to achieve a complete cytogenetic or molecular response thereafter. Patients with an optimal response should remain on therapy with monitoring every three months. This assessment has recently been enhanced by evidence that molecular response at 3 months (<10% on the International Scale (IS), further discussed below) can predict future response and outcome. This has translated into using 3 months as a possible time for switching to more effective drugs to overcome poor outcome.
Once cytogenetic responses have been established, sensitive and accurate monitoring of BCR-ABL1 transcript levels by reverse transcription quantitative PCR (RT-qPCR) is used thereafter. Tests are typically done every 3 months until major molecular response (MMR) is achieved, and every 3–6 months thereafter . Patients in the ‘warning’ category of the ELN recommendations are often monitored more frequently. These patients also have testing for mutations in the BCR-ABL1 kinase domain .
Results of RT-qPCR testing are expressed on the International Scale (IS; discussed below) as the ratio of BCR-ABL1 transcripts to those of a control gene, expressed as a percentage and multiplied by each laboratory’s specific conversion factor. Ratios ≤10%, ≤1%, ≤0.1%, ≤0.01%, ≤0.0032%, and ≤0.001% correspond to ≥1, ≥2, ≥3, ≥4, ≥4.5, and ≥5 log reductions from a baseline level (usually at diagnosis). Abbreviations MR3, MR4, MR4.5 and MR5 correspond to transcript levels of ≤0.1%, ≤0.01%, ≤0.0032% and ≤0.001%. The ability to measure deep responses in samples with no detectable BCR-ABL1 transcripts depends on the number of control gene transcripts quantified: MR4, MR4.5 and MR5 can only be reported if the control transcripts number >10,000, >32,000 and >100,000 (ABL1) or >24,000, >77,000 and >240,000 (GUSB), respectively.
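The thresholds above can be expressed as a small lookup. The following is a minimal sketch only: the function and variable names are illustrative assumptions, and it deliberately ignores the additional sample-quality checks a laboratory would apply to samples with detectable disease.

```python
from typing import Optional

# Upper bounds on BCR-ABL1 (% IS) for each response class, as quoted in the text.
# Ordered from deepest to shallowest so the first match is the deepest class.
MR_LEVELS = [("MR5", 0.001), ("MR4.5", 0.0032), ("MR4", 0.01), ("MR3", 0.1)]

# Control-gene transcript numbers that must be exceeded to report a deep
# response when BCR-ABL1 is undetectable, as quoted in the text.
MIN_CONTROL_COPIES = {
    "ABL1": {"MR4": 10_000, "MR4.5": 32_000, "MR5": 100_000},
    "GUSB": {"MR4": 24_000, "MR4.5": 77_000, "MR5": 240_000},
}


def classify_detectable(percent_is: float) -> Optional[str]:
    """Deepest MR class whose threshold a measured % IS value meets, else None."""
    for name, upper_bound in MR_LEVELS:
        if percent_is <= upper_bound:
            return name
    return None


def classify_undetectable(control_gene: str, control_copies: int) -> Optional[str]:
    """Deepest MR class supported by the control-gene count alone, else None."""
    thresholds = MIN_CONTROL_COPIES[control_gene]
    for name in ("MR5", "MR4.5", "MR4"):
        if control_copies > thresholds[name]:
            return name
    return None
```

For example, a sample with undetectable BCR-ABL1 and 40,000 ABL1 copies would be reportable as MR4.5 but not MR5 under these thresholds.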
Imatinib is the most common first-line drug used to treat CML worldwide. Imatinib is highly effective in achieving haematological, cytogenetic and molecular responses. Dasatinib and nilotinib induce more rapid and deeper molecular responses than imatinib, but there are no convincing data that they improve CML-free survival.
There are several reasons why initial therapy with imatinib is common including once daily oral dosing, good safety profile and low cost because of generic versions. Nevertheless, many CML experts continue to favour second-generation drugs because of the quicker achievement of therapy targets such as complete cytogenetic response or major molecular response.
Patients failing imatinib because of a slow response, recurrence and/or development of tyrosine kinase domain mutations should receive alternative drugs including dasatinib, nilotinib, ponatinib and/or bosutinib. The safety record of these drugs is not as favorable as imatinib and some adverse effects can be serious or fatal so one should be cautious about switching without good reason.
CML patients with a sustained deep molecular response (MR) for over two years are considered by some to be candidates for stopping TKI-therapy. The French STop IMatinib (STIM) trial evaluated the consequences of stopping imatinib in patients with a 5-log reduction in BCR-ABL1 transcripts (MR5). This was rapidly followed by several small studies and several large ongoing studies including the European Stop Tyrosine Kinase Inhibitor Study (EURO-SKI) and the De-Escalation and Stopping Treatment of Imatinib, Nilotinib or sprYcel in Chronic Myeloid Leukaemia (DESTINY) in the UK. In most of these studies subjects need to have had MR4 or deeper for >1 year before stopping.
Most studies report that about 40 percent of subjects discontinuing imatinib remain in MMR for >1–2 years and even longer. However, about 60 percent lose their CMR in the first 6 months to 1 year. Most patients who have a molecular relapse respond when TKIs are reintroduced. Risk factors for molecular relapse include briefer duration of imatinib therapy before stopping, time to first MR and duration of MR before discontinuing therapy. Sensitive and specific molecular monitoring is needed to identify response to therapy, identify patients who might stop TKI-therapy and detect molecular relapse after stopping.
RT-qPCR was first used to detect minimal residual disease (MRD) after allotransplants for CML. Since then it has become a cornerstone of patient management. However, it was immediately evident that there were limitations which required assay standardisation and protocol uniformity to ensure reliable results and comparison of data from different laboratories.
Variability in RT-qPCR quantification resulted from differences in the efficiency of RNA extraction and cDNA synthesis by reverse transcription (RT), and in reagents, platforms, controls and standard curves calculated using fixed plasmid dilutions (often based on inaccurate quantification methods). These differences, together with loss of quantification precision at lower transcript concentrations, were major challenges in trying to standardise results.
RT-qPCR was used in the phase-3 IRIS study of imatinib vs. interferon and cytosine arabinoside in patients with newly-diagnosed CML. In an effort to standardise results of RT-qPCR testing within the participating labs (Adelaide, Seattle and London), a set of 30 samples was developed and tested in each centre. The calculated median value of the ratio between the target and reference gene was considered the baseline to assess log-reduction of BCR-ABL1 transcripts in follow-up samples. This approach had two advantages: (1) it allowed alignment of results between the labs and (2) it eliminated the need to know the baseline value in each subject to calculate molecular response. This study also suggested the first molecular milestone, major molecular response or MMR, defined as a ≥3-log reduction in the BCR-ABL1/BCR ratio compared with the median pre-treatment ratio or standardised baseline.
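The standardised-baseline calculation described above amounts to a median followed by a base-10 logarithm. The sketch below illustrates this arithmetic; the function names and the example ratios are hypothetical and are not taken from the IRIS data.

```python
import math
import statistics
from typing import Sequence


def standardised_baseline(pretreatment_ratios: Sequence[float]) -> float:
    """Median target/reference ratio of a shared set of pre-treatment samples."""
    return statistics.median(pretreatment_ratios)


def log_reduction(followup_ratio: float, baseline_ratio: float) -> float:
    """Log10 reduction of a follow-up ratio relative to the standardised baseline."""
    return math.log10(baseline_ratio / followup_ratio)


def is_major_molecular_response(followup_ratio: float, baseline_ratio: float) -> bool:
    """MMR as defined in the text: a reduction of at least 3 logs from baseline."""
    return log_reduction(followup_ratio, baseline_ratio) >= 3.0


# Hypothetical laboratory: baseline from three illustrative pre-treatment ratios.
baseline = standardised_baseline([60.0, 80.0, 120.0])
```

Because the baseline is shared across patients, a follow-up ratio can be classified without knowing that patient's own value at diagnosis.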
The Europe Against Cancer (EAC) program, which included 26 university laboratories from 10 EU countries, embarked on an effort to standardise the RT-qPCR assay and reporting of results. The resultant EAC assay became the standard for BCR-ABL1 quantification, in which ABL1 transcript levels were recommended as the internal control for normalising results. ABL1 was selected because of its more stable and uniform expression, its lack of pseudogenes and the significant correlation it showed between cytogenetic and RT-qPCR results. GUSB transcript levels were later confirmed as a suitable alternative through a similar effort conducted by members of the Association for Molecular Pathology and the College of American Pathologists (CAP). However, neither of these recommended standards is mandatory and no standard protocol has been adopted globally, making comparison of results from different centres complex and error-prone. These issues led to the concept of developing an International Scale (IS). A consensus of CML experts agreed to assign a value of 100% to the standardised baseline calculated from the 30 pre-treatment samples from the IRIS trial.
As more powerful TKIs were developed it became important to identify deeper molecular responses including complete molecular response (CMR), defined as the absence of detectable BCR-ABL1 transcripts. CMR can be measured as log-reduction or percent but its level depends on the quality of the sample, which is directly proportional to the number of control gene molecules measured (Table 1). Comparing values between laboratories requires providing each with a unique conversion factor established by exchanging samples with a designated reference laboratory (Mannheim, Germany or Adelaide, Australia) holding samples from the pool of 30 IRIS pre-treatment samples. Current efforts are focused on ensuring the highest possible sensitivity in testing, optimizing protocols to enable routine detection of the maximal numbers of control gene molecules in the same volume of cDNA used to detect BCR-ABL1, and following rules for pre-analytical and analytical quality control (QC) steps to avoid false-negative tests.
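To illustrate how a laboratory-specific conversion factor could be derived from exchanged samples and then applied, the sketch below estimates the factor as the antilog of the mean log10 difference between reference and local values. This estimator is an assumption made for illustration; the text does not specify the statistical procedure used by the reference laboratories, and all names and values are hypothetical.

```python
import math
from typing import Sequence


def estimate_conversion_factor(local_percent: Sequence[float],
                               reference_is_percent: Sequence[float]) -> float:
    """Geometric-mean ratio of reference (IS) to local values for paired samples."""
    if len(local_percent) != len(reference_is_percent) or not local_percent:
        raise ValueError("need the same non-zero number of local and reference values")
    log_diffs = [math.log10(ref / loc)
                 for loc, ref in zip(local_percent, reference_is_percent)]
    return 10 ** (sum(log_diffs) / len(log_diffs))


def to_international_scale(local_percent: float, conversion_factor: float) -> float:
    """Convert a local BCR-ABL1/control percentage to the International Scale."""
    return local_percent * conversion_factor


# Hypothetical exchange: the local assay reads twice the reference value throughout.
cf = estimate_conversion_factor([2.0, 0.2, 0.02], [1.0, 0.1, 0.01])
```

Working in log space weights a two-fold bias equally at high and low transcript levels, which suits data spanning several orders of magnitude.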
Using the IS is labor intensive, requiring confirmation of the local conversion factor on a regular basis and whenever there is a change in any RT-qPCR step. In addition, the IS only applies to samples with an MRD level ≤10% IS when ABL1 is used as the control gene. The denominator in the ratio used to calculate disease percentage changes as the level of BCR-ABL1 transcripts changes, underestimating the concentration of BCR-ABL1, especially when BCR-ABL1 levels are high at diagnosis or during initial TKI-therapy. This problem tends towards resolution as the number of BCR-ABL1 transcripts decreases and the ratio to ABL1 approaches one.
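The size of this underestimation can be illustrated with a toy calculation. The sketch assumes that the ABL1 control assay amplifies both normal ABL1 and the ABL1 portion of BCR-ABL1, so that the fusion transcript is counted in the denominator as well as the numerator; this mechanism is an assumption of the example rather than something stated explicitly in the text, and the function names are hypothetical.

```python
def true_ratio(bcr_abl1_copies: float, normal_abl1_copies: float) -> float:
    """BCR-ABL1 relative to normal ABL1 only."""
    return bcr_abl1_copies / normal_abl1_copies


def measured_ratio(bcr_abl1_copies: float, normal_abl1_copies: float) -> float:
    """BCR-ABL1 relative to a control signal that also includes BCR-ABL1 (assumed)."""
    return bcr_abl1_copies / (bcr_abl1_copies + normal_abl1_copies)


# Equal amounts of fusion and normal transcript: measured value is half the true one.
high_burden = (measured_ratio(100.0, 100.0), true_ratio(100.0, 100.0))

# Low disease burden: measured and true ratios nearly coincide.
low_burden = (measured_ratio(1.0, 1000.0), true_ratio(1.0, 1000.0))
```

Under this assumption the measured ratio can never exceed 100%, and the distortion becomes negligible once the fusion transcript is a small fraction of the control signal.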
The International Standardisation Group worked on developing two types of reference material, primary and secondary. The primary reference is made in limited quantities, tested, and validated for accreditation by WHO. The accredited primary material is then distributed to manufacturers to generate secondary reference material which can be produced in large quantities and made available to individual laboratories.
WHO identified cells as the best primary material for three reasons: (1) they are more comparable to patient samples, (2) they would control for all steps of RT-qPCR; and (3) stable forms can be made by cell lyophilisation . BCR-ABL1 negative (HL60) and BCR-ABL1 positive (K562) cells were mixed at four different ratios, concentrated and lyophilized . HL60 was chosen because it expresses the three most widely used control genes (ABL1, GUSB and BCR) at comparable levels to those in normal leukocytes . The panel of four dilutions was distributed to participating laboratories with established conversion factors. The mean values obtained from these labs were assigned to the reference dilution with values of 10%, 1%, 0.1% and 0.01% on the IS , . Dilutions ≤10% were used because as discussed above, non-linearity of BCR-ABL1 compared with an ABL1 control at high levels precludes using the IS when transcript levels are >10% , . Reference material was accredited by a WHO expert committee on biological standardisation in 2009 as the first WHO International Genetic Reference Panel for the quantification of BCR-ABL1 mRNA , .
Following their success as calibrators in the fields of infectious diseases and oncology , , armored RNA Quant (ARQ) (Asuragen) technology was used to develop a robust synthetic ARQ calibrator panel of 4 points (10%, 1%, 0.1%, 0.01%) calibrated to the mean IS percent ratios of the WHO primary standards. ARQ constructs contained 4 synthetic transcript sequences corresponding to the exonic sequences of the major BCR-ABL1 transcripts (e13-a2 and e14-a2), and to ABL1 exons 2–11 and BCR exons 14–22 . The ARQ panel was evaluated in a large international pilot study in 29 labs in 15 countries, followed by an accuracy validation in eight labs across seven countries . These studies reported compatibility of the ARQ panel with different RT-qPCR methods including seven different RNA extraction protocols and 13 different quantification platforms, highlighting the potential for ARQ as secondary calibrator.
Another reference standard is ERM®-AD623, developed according to ISO Guide 34:2009 standards. Fragments of e14-a2 BCR-ABL1 fusion transcripts, BCR and GUSB were amplified and cloned into the pUC18 plasmid to generate the plasmid pIRMM0099. Six different linearized plasmid solutions were produced with specific copy number concentrations assigned using digital PCR (dPCR; Table 2), and tested in 63 BCR-ABL1 testing labs, each with its own conversion factor. The ERM standard was certified by the European Commission and made available for distribution worldwide by the Institute for Reference Materials and Measurements in Belgium and several other distributors authorized by the Commission. Importantly, use of ERM-AD623 itself does not produce results on the IS but helps calibrate in-house prepared reference materials to improve the accuracy of results before conversion. Availability of such a standard is useful to improve comparison of results between testing labs. Nonetheless, the acquisition of a conversion factor is the most commonly used method in the absence of certified secondary reference materials or external quality control schemes.
External quality-assessment programs which are independent and broadly accessible are critical in improving the comparison of BCR-ABL1 testing results between labs. Such an effort has been initiated through a scheme launched by the United Kingdom National External Quality Assessment Service (NEQAS). The scheme distributes two lyophilized cell line samples, each containing specified BCR-ABL1, ABL1 and GUSB quantities. Participating labs process these samples according to their standard laboratory procedures, quantify the targets and report results to the Service, which generates a z-score for the log-reduction between the two samples (log reduction = log10(BCR-ABL1 sample 1/BCR-ABL1 sample 2)) in each lab to evaluate their performance. Z-scores involve calculating a robust average and SD from the values submitted by all laboratories, minimizing the influence of statistical outliers in compliance with ISO 13528/ISO 17043. Z-score = [lab value − robust average]/robust SD. Acceptable values are −2 to 2. Values between 2 and 3 or −2 and −3 are considered ‘actionable’ whereas values >3 or <−3 are considered ‘critical’.
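The z-score calculation can be sketched as follows. The text does not say which robust estimators the scheme uses, so this example substitutes the median for the robust average and the scaled median absolute deviation (1.4826 × MAD) for the robust SD; both choices, the function names and the handling of values exactly at ±2 and ±3 are assumptions.

```python
import statistics
from typing import List, Sequence


def robust_z_scores(lab_values: Sequence[float]) -> List[float]:
    """z = (lab value - robust average) / robust SD, using median and scaled MAD."""
    centre = statistics.median(lab_values)
    mad = statistics.median([abs(v - centre) for v in lab_values])
    robust_sd = 1.4826 * mad  # scaling makes MAD comparable to an SD for normal data
    if robust_sd == 0:
        raise ValueError("robust SD is zero; z-scores are undefined")
    return [(v - centre) / robust_sd for v in lab_values]


def classify_z(z: float) -> str:
    """Performance bands quoted in the text (boundary handling assumed)."""
    magnitude = abs(z)
    if magnitude <= 2:
        return "acceptable"
    if magnitude <= 3:
        return "actionable"
    return "critical"


# Hypothetical log reductions reported by six laboratories; the last is an outlier.
z_scores = robust_z_scores([2.0, 2.1, 1.9, 2.05, 1.95, 3.5])
```

Using the median and MAD means the outlying laboratory barely shifts the centre or spread against which every laboratory, including itself, is judged.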
Resistance to TKI-therapy can be classified as primary (never achieving pre-defined response criteria) or acquired (loss of an earlier response). Primary resistance is most common in patients with advanced CML. Resistance can be further defined as due to BCR-ABL1-dependent or -independent mechanisms.
Several mechanisms of resistance have been extensively studied, including TKI bioavailability, plasma protein binding rate, and expression levels of efflux (MDR1) and influx (hOCT-1) proteins. However, none of these is a robust marker of response and/or relapse. In contrast, mutations in the BCR-ABL1 TKD explain resistance to TKIs in 10–60% of cases. This wide range reflects heterogeneity of data reporting, such as results of testing in different phases of CML. In our experience TKD mutations are relatively rare in chronic phase.
Since the first mutation was detected in the BCR-ABL1 TKD , over 100 different mutations involving 57 different amino acids have been reported , . However, only some of these mutations occur frequently, have their in vitro half maximal inhibitory concentrations (IC50) identified and are associated with TKI failure , , ,  (Table 3).
The most common mutations cluster to one of four hot spots within the TKD: (1) the P-loop, where the adenosine triphosphate (ATP) binding site is located (aa 248–256); (2) the TKI-binding region (aa 315–317); (3) the C-loop catalytic domain (aa 350–363); and (4) the A-loop activation domain (aa 381–402). These amino acid changes can alter TKI efficacy in three ways: (1) by directly preventing TKI binding via hydrogen bonds to the TKD (e.g., Thr315, Met290, Glu286, Lys271, Asp381, Met318); (2) by inducing conformational changes via van der Waals bonds formed between the TKI and the TKD (e.g., Phe317, Val289, Met351, Ile313, Phe382, Val256, Tyr253, Leu370); or (3) by inducing changes to intra-TKD regulatory interactions (e.g., His396, Glu450, Glu459).
Dasatinib and nilotinib are more potent and have lower IC50 values for each BCR-ABL1 mutant form because of their chemical structures and binding modes . 15–30% of patients with imatinib-resistant CML treated with second-generation TKI as second-line develop new BCR-ABL1 mutations (Table 3) , , , .
Cells with the T315I mutation are resistant to all TKIs except ponatinib. In vitro studies indicate ponatinib is relatively ineffective against cells with compound mutations (multiple mutations in the same allele, such as T315I/F359V and T315I/E255V). Despite these in vitro data, clinically ponatinib has proved effective against single, compound and double (polyclonal) mutations.
Box 2 describes and summarises variants other than point mutations in BCR-ABL1 of proposed clinical significance.
The most recent ELN guidelines recommend mutation testing in four situations where the result might affect drug selection: (1) primary resistance or failure; (2) identification of acquired or secondary resistance; (3) if response is classified as “warning”; and (4) at progression to advanced phase.
There are several ways to test for BCR-ABL1 TKD mutations. Variables such as analytical sensitivity, specificity and precision; sequencing fragment length and ability to quantify; and speed and cost should all be considered. Capillary electrophoresis sequencing (also known as Sanger sequencing, direct sequencing, BigDye® Terminator sequencing or cycle sequencing), with a limit of detection of 15–20%, is the most widely used technique.
Pyro-sequencing, sometimes used instead of or in addition to capillary electrophoresis sequencing, is based on detecting light emitted following a series of enzymatic reactions triggered by the pyrophosphate released on nucleotide incorporation. The amount of light is proportional to the number of nucleotides incorporated and is translated into a digital output as a nucleic acid sequence. However, in order to use this technique, the precise mutation must already be known. Short amplicon length is another limiting factor as only one or two mutations can be studied simultaneously.
Recent technologies with higher sensitivity, including mass spectrometry, amplicon deep next generation sequencing (NGS) and nanofluidic-based methodologies such as digital PCR, have been used to detect low-level mutations. The advantage of NGS over other methods is that it combines characteristics of the other methods in one platform, including high sensitivity, quantitation capacity, the ability to detect known and novel mutations, indels or any variation that occurs within the TKD, and the ability to distinguish compound from polyclonal mutations.
NGS technology combines Sanger and pyrosequencing but at a much higher throughput using either ‘sequencing by synthesis’ or the ‘chain termination’ chemistry. The former incorporates advances in fluidics technology to release nucleotide flows to synthesize the nascent nucleic acid strand enabling direct translation of nucleic acid data to nucleic acid sequence. Comprehensive reviews are provided in references , , , , , , , ,  and a platform comparison is provided in Table 4.
Targeted NGS has two sub-types, (1) target capture via probe hybridisation or ligation and (2) amplicon-based enrichment (ultra-deep sequencing [UDS] or amplicon deep sequencing [ADS]), each with distinct applications in the context of molecular monitoring in CML.
Hybridisation-based targeted NGS uses synthetic oligonucleotides specifically designed to target BCR and ABL1, followed by sequencing. Fusion junctions are predicted using bioinformatic packages designed to identify structural variants including chromosome translocations, amplifications, inversions and deletions. These packages usually produce one or both of two types of reads: (1) split reads, which are single reads composed of material from two non-contiguous genomic regions, directly mapping a fusion junction to base pair resolution; and (2) discordant pairs of reads, in which the individual reads in a pair map to different chromosome locations, indicating the presence of a structural rearrangement within the insert between them. When a sample has a BCR-ABL1 expression level >10% IS, a mean read-depth of 50× is sufficient to map the fusion junction. The mapped genomic fusion junctions of each patient are unique and can be used as a patient-specific marker for MRD monitoring, but DNA or RNA from diagnosis should be available if this method is to be used.
Amplicon deep-sequencing utilises a highly multiplexed amplicon generation strategy where multiple regions of interest are amplified and sequenced at a depth 100–10,000 fold greater than Sanger sequencing. Dedicated informatics software pipelines assemble, align and map the sequenced reads to the reference sequence and perform variant detection. The quantitative feature is gained by sequencing targeted regions at depths of hundreds or thousands of reads, allowing the sensitive detection and quantification of rare events. Amplicon deep-sequencing has been used for sensitive quantification of TKD mutations. Overlapping primers designed to cover the TKD within the fusion BCR-ABL1 transcript were used to amplify the domain following a nested PCR approach to enrich for the fusion transcript. Amplicons from multiple samples were bar-coded and clonally amplified for sequencing. Sequencing the amplicons on a high throughput platform allows sufficient depth of coverage (≥2000 reads) per base to identify mutations at very high sensitivity (<1%).
The application of amplicon deep sequencing for monitoring the genomic fusion junctions as a molecular marker is also a plausible quantification approach. This approach is used to detect and track clonally-expanded T- or B-cell populations in acute lymphoblastic leukaemia (ALL) and other conditions , , . Whether amplicon deep-sequencing can compete with the sensitivity achieved by quantitative methods such as qPCR or dPCR is unknown.
The advantages of amplicon deep NGS over the current methods for mutation screening are the combination of the best features of the other techniques in one platform including high sensitivity, quantitation capacity, detection of known and new mutations in addition to indels and variations in the TKD, and the ability to distinguish compound from polyclonal mutations. However, the utility of amplicon deep NGS as a clinical tool in CML is untested.
Most NGS-related performance characteristic evaluations were performed using the Roche GS Junior or GS FLX platforms, which were available before the Ion Torrent and Illumina MiSeq platforms. There are limited performance characteristic evaluations on the latter platforms.
The Interlaboratory RObustness of Next-generation sequencing (IRON-II) study was designed to assess parameters such as robustness, precision, reproducibility and sensitivity, design, standardisation and QC of amplicon deep NGS across 10 labs in eight countries. The consensus conclusion was that amplicon-based deep sequencing is technically feasible, achieves high concordance between labs and allows a broad, in-depth molecular characterisation of samples with high sensitivity. Low-level variants were reported to be detectable at frequencies as low as 1–2%, compared with the 20% threshold for Sanger-based sequencing. In addition to confirming the utility of deep sequencing in clinical applications, the study highlighted the role of UDS in research to: (1) fully characterise the spectrum of minor mutated variants (5%–20%); (2) follow the dynamics of resistant mutations over time; and (3) reconstruct the clonal architecture of mutated populations when multiple mutations occur within the same amplicon. Early reports indicate that whereas disease development is clonal, drug resistance is polyclonal, involving multiple clones with different drug-resistant mutations.
There are two clinical scenarios in which detecting low-level mutations (<20%) seems to be useful: (1) guiding TKI-switching after the failure of imatinib or a second-generation TKI; and (2) delineating whether multiple mutations are compound or polyclonal, particularly those with T315I.
One consideration when investigating multiple mutations in the same sample is amplicon length. Current technology allows a maximum read length of 400 bp. Consequently, compound TKD mutations separated by a greater distance are frequently missed by a single amplicon. An exception is reads generated by the FLX+ platform, which allows a read length of 1 kb and would therefore cover the entire TKD (approximately 900 bp).
One of the main hurdles in deep-sequencing is distinguishing true low-level mutations from background noise or PCR and sequencing errors. PCR fidelity affects the error rate, and different NGS sequencing chemistries are associated with different error-types. The reported sequencing error type for the Ion Torrent platform is small insertions/deletions (indels) of one or two nucleotides, particularly at homo-polymers, with an error rate of about 1.7% (Box 3). The dominant error-type on Illumina platforms is base-specific miscalls (substitutions) (0.4%) rather than indels, and homo-polymers are less of an issue (Box 3). Higher error-rates for single nucleotide transitions (G > A or A > G or C > T or T > C) compared with transversions (G > C,T or A > C,T or T > G,A or C > G,A) are observed on all platforms, indicating PCR-related errors. Because TKD amplification is based on nested PCR, such errors could be increased. Therefore, careful definitions of the limits of detection (LoD) and quantification (LoQ) for the hotspots in the context of platforms, chemistries and error rates are important. Training the bioinformatic pipelines for accurate delineation between real and false signals is critical to the validation process.
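One way to reason about a limit of detection is to ask how often background error alone would produce the observed number of variant reads. The sketch below computes that probability under a simple binomial model in which every read has the same independent chance of showing the erroneous base. This model, the function name, and the use of a platform-wide error rate as the per-position rate for one specific change are simplifying assumptions; real pipelines use position- and context-specific error estimates.

```python
import math


def binomial_tail(min_variant_reads: int, depth: int, error_rate: float) -> float:
    """P(X >= min_variant_reads) for X ~ Binomial(depth, error_rate).

    Terms are evaluated in log space so large depths do not overflow.
    """
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    if min_variant_reads <= 0:
        return 1.0
    total = 0.0
    for i in range(min_variant_reads, depth + 1):
        log_pmf = (math.lgamma(depth + 1) - math.lgamma(i + 1)
                   - math.lgamma(depth - i + 1)
                   + i * math.log(error_rate)
                   + (depth - i) * math.log1p(-error_rate))
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Chance that error alone yields >= 20 variant reads at 2000x depth.
p_low_error = binomial_tail(20, 2000, 0.004)   # 0.4% per-read error (assumed)
p_high_error = binomial_tail(20, 2000, 0.017)  # 1.7% per-read error (assumed)
```

In this toy model 20 supporting reads at 2000× are very unlikely to arise from a 0.4% error rate but are expected routinely at 1.7%, which is why LoD must be defined per platform and error type.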
Another layer of complexity involves the potential for low-level contamination. TKD amplicon generation includes a nested step, and the library preparation workflow involves repeated pipetting, both of which introduce the possibility of low-level contamination. Consequently, positive and negative controls are required. Positive controls would preferably contain non-disease related mutations or even known single nucleotide polymorphisms (SNPs), whereas negative controls could be leukaemias other than CML and no template controls (NTC). SNPs are good for controlling pipetting-related contamination as they occur at allele frequencies of either 50% or 100%. Consequently, detection of a SNP at low frequency suggests contamination.
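The SNP-based contamination check described above can be sketched as a simple filter. The tolerance around 50% and 100%, the minimum frequency treated as detectable, the function name and the SNP identifiers are all hypothetical values chosen for illustration.

```python
from typing import Dict, List


def flag_possible_contamination(snp_vaf_percent: Dict[str, float],
                                tolerance: float = 10.0,
                                min_detectable: float = 0.5) -> List[str]:
    """Return SNPs detected at a frequency that is not close to 50% or 100%."""
    flagged = []
    for snp_id, vaf in snp_vaf_percent.items():
        if vaf < min_detectable:
            continue  # SNP effectively absent from this sample
        distance_to_expected = min(abs(vaf - 50.0), abs(vaf - 100.0))
        if distance_to_expected > tolerance:
            flagged.append(snp_id)
    return flagged


# Hypothetical sample: one SNP appears at 3%, which a germline SNP should not.
suspect = flag_possible_contamination(
    {"snpA": 49.0, "snpB": 100.0, "snpC": 3.0, "snpD": 0.0}
)
```

A flagged SNP does not prove contamination, but it identifies samples and runs that warrant review before low-level TKD variants are reported.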
The depth of sequencing is defined by the number of reads covering the region of interest. As TKD mutations are somatic and low frequencies are clinically relevant, optimising depth of coverage is important. Data from the IRON-II study suggest 2000 reads as the minimum required for identifying mutations at 1% frequency, with at least 20 reads carrying the mutant variant (10 in each direction). Five hundred reads are sufficient for a 5% cut-off, with 25 reads identifying the variant.
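The relation between depth, variant frequency and supporting reads can be checked with a short calculation. The sketch below assumes an idealised binomial sampling model with no sequencing error or strand bias, which is a simplification; the function names are hypothetical. It shows that at 2000 reads and 1% frequency the expected number of variant reads is 20, so the expected count sits exactly at the stated threshold and sampling variation matters.

```python
def expected_variant_reads(depth, vaf):
    """Expected number of reads carrying the variant at a given depth."""
    return depth * vaf


def prob_at_least(depth, vaf, min_reads):
    """P(X >= min_reads) for X ~ Binomial(depth, vaf).

    Terms are built iteratively from P(X = 0). This is adequate for depths
    in the low thousands; at very high depth P(X = 0) underflows and a
    log-space method would be needed instead.
    """
    if not 0.0 < vaf < 1.0:
        raise ValueError("vaf must be strictly between 0 and 1")
    pmf = (1.0 - vaf) ** depth  # P(X = 0)
    below = 0.0
    for i in range(min_reads):
        below += pmf  # accumulate P(X = i)
        pmf *= (depth - i) / (i + 1) * vaf / (1.0 - vaf)  # step to P(X = i + 1)
    return 1.0 - below
```

For example, `prob_at_least(2000, 0.01, 20)` gives the chance of meeting the 20-read threshold at the minimum depth, and repeating the call with a larger depth shows how quickly that chance rises.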
Analyses of DNA could potentially overcome the problems caused by using nested PCR. However, using DNA as a template introduces two other variables: (1) the need to increase sequencing depth to compensate for the diluting effects of variable expression of the normal ABL1 allele; and (2) the coverage uniformity of the primers in the panel design covering ABL1 exons implying the need for appropriate panel design and validation. Eliminating the nested step before sequencing the TKD on a cDNA template reduces the detection rate depending on the ratio between the chimeric and normal ABL1 transcripts.
Consistent nomenclature is important. The Human Genome Variation Society (HGVS) provides recommendations for a uniform and unequivocal description of sequence variants in DNA and protein sequences and recommends that labs reporting variants follow them (http://www.hgvs.org/mutnomen/recs.html).
NGS is capable of reproducibly detecting low-level mutations. As yet there is no evidence for any clinical value for the detected mutations. Consequently, there seems no benefit to replacing current testing methods with NGS despite its potential technical superiority. However, if cost were not an issue, there is no reason why labs should not use NGS for TKD mutation testing.
The third generation, single-molecule sequencing, is the latest advance with three main features: (1) PCR is not needed before sequencing, which shortens nucleic acid preparation time; (2) the signal is captured in real-time, i.e. it is monitored and recorded during the enzymatic reaction of incorporating nucleotides in the complementary strand; and (3) in theory there is no limit to the length of the sequencing read. The two marketed platforms are single-molecule real-time sequencing (SMRT®) by Pacific Biosciences (PacBio, Menlo Park, CA) and the MinION® nanopore sequencing by Oxford Nanopore Technologies. The basic principle of SMRT® sequencing is that bright fluorophores conjugated to the nucleotides are released upon incorporation by a single polymerase fixed to the surface inside each of millions of zero-mode waveguide (ZMW) nano-structures. The released fluorophores are excited by a laser and the emission signal is detected by confocal visualization. The signal in the Oxford nanopore is generated by a processive enzyme (exonuclease) attached to a biological nanopore which cleaves single nucleotides from a target DNA strand and passes them through the nanopore. As each nucleotide passes it creates a unique electric pulse which is recorded as sequence information in real-time. Single molecule sequencing eliminates the need for molecular amplification and promises speed, mobility and longer read lengths. This could substantially improve TKD sequencing by resolving the challenge of delineating compound mutations, which arises from the short read lengths inherent to NGS.
Although third generation sequencing is promising, there are few data on performance and utility in clinical testing. A recent study of the MinION platform reported an unacceptably high error rate of 38%, with mean and median read lengths of 2 kb and 1 kb, respectively. Whether this can be improved on is unknown.
dPCR is a precise analytical technique to quantify nucleic acids based on PCR amplification of a single template molecule with no need for a calibration curve. The Minimum Information for Publication of Quantitative Digital PCR Experiments (dMIQE) guideline has been developed to facilitate uniform terminology for dPCR and to identify the parameters needed for independent assessment of experimental data. The most common terms in dPCR are partitions, lambda (λ), Poisson distribution and the dynamic range of quantification, sometimes referred to as the “sweet spot”.
A partition is the fixed space within which the single molecule PCR occurs. It can be a small well or a water-in-oil emulsion droplet of nanoliter or picoliter volume. Lambda (λ) represents the mean number of target copies per partition. It is estimated by applying the Poisson distribution to the number of positive partitions (k) out of the total number of partitions (n) in a reaction. The number of copies per reaction can be estimated using λ, the reaction volume and n.
The Poisson distribution is a limiting case of the binomial distribution describing the probability of a rare event (target molecule) in a fixed partition size. Assumptions inherent to the Poisson distribution are: (1) a large population (partitions) of fixed size; (2) a rare event; (3) a binary outcome for the event (such as yes or no); and (4) a random distribution of the event. Using the Poisson distribution in dPCR corrects for the possibility that a positive partition might contain more than one target molecule.
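The Poisson correction can be illustrated with a short Python sketch. Under the assumptions listed above, the fraction of negative partitions is expected to equal exp(−λ), so λ = −ln(1 − k/n). The function names are hypothetical, and the copies-per-reaction estimate assumes the whole reaction volume is partitioned and read, which may not hold on platforms with dead volume.

```python
import math


def lambda_from_counts(positive, total):
    """Mean copies per partition from k positive partitions out of n.

    Uses lambda = -ln(1 - k/n). A fully saturated reaction (k == n)
    cannot be quantified and raises ValueError.
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log1p(-positive / total)


def copies_per_reaction(positive, total):
    """Estimated target copies across all analysed partitions (lambda * n)."""
    return lambda_from_counts(positive, total) * total
```

With 10,000 positive partitions out of 20,000, λ is ln 2 (about 0.69), so the estimate is roughly 13,900 copies rather than 10,000; the difference is the correction for partitions holding more than one molecule.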
The dynamic range of dPCR is defined by the number of partitions, the volume of sample interrogated and the concentration of the target in the sample. When the sample volume is not limiting, increasing the number of partitions increases sensitivity. Conversely, when the sample volume is limiting, the amount of sample interrogated determines the achievable sensitivity. Quantification of a rare target requires greater partition numbers, whereas samples with many targets require fewer partitions if the sample is adequately diluted. Applying the Poisson distribution enables the dynamic range to extend beyond the number of partitions analysed, but at the cost of reduced precision at high and low frequencies, with the most precise quantification when λ = 0.6–1.6.
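The loss of precision at both ends of the dynamic range can be made concrete with a small calculation. The sketch below uses a delta-method approximation in which the number of positive partitions is binomial, giving a standard error for λ of sqrt((exp(λ) − 1)/n). This approximation and the function name are assumptions made for illustration; the calculation ignores partition volume variation and upstream sampling error.

```python
import math


def relative_se_lambda(lam, n_partitions):
    """Approximate relative standard error of lambda for n partitions.

    Delta method on a binomial positive count:
    SE(lambda) = sqrt((exp(lambda) - 1) / n), divided by lambda.
    """
    if lam <= 0 or n_partitions <= 0:
        raise ValueError("lam and n_partitions must be positive")
    return math.sqrt(math.expm1(lam) / n_partitions) / lam
```

Evaluating the function for 20,000 partitions at λ values such as 0.01, 0.6, 1.6 and 6 shows the relative error falling towards λ of about 1.6 and rising again as the reaction approaches saturation, which is consistent with the range quoted above.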
It is important to specify a term for dPCR. We favour a terminology describing whether the template is DNA (dPCR) or reverse-transcribed RNA (RT-dPCR). There is no need to report partition type (chip vs. droplet) as this is inherent to the instrument used when the partition size is uniform. dPCR could also be used as a general term referring to the method in contrast to real-time based techniques.
Prefabricated reaction wells (in a chip or plate) or droplets (water-in-oil emulsions) are the two main methods for generating partitions for dPCR. Prefabricated platforms include the BioMark® HD (Fluidigm), QS®3D (ThermoFisher Scientific) and Constellation (FORMULATRIX), along with the Clarity® (JN Medsys) and Naica Crystal dPCR (Stilla Technologies). Emulsion based technologies include the QX200® droplet dPCR system (BioRad Laboratories) and the RainDrop® (RainDance Technologies).
The differences between these platforms are summarised in Table 5, including sample and partition volumes, hands-on time, costs of consumables and others. Interrogation of samples using a platform with greater reaction partitioning (large n) is more sensitive than with smaller reaction partitioning. Consequently, choosing the best instrument requires consideration of the desired sensitivity, precision, throughput and cost. Detailed comparisons of different dPCR platforms and between dPCR and qPCR are provided in Tables 5 and 6.
RT-qPCR is the gold-standard for molecular monitoring but has inherent limitations including low precision at the lower end of the calibration curve and substantial inter-laboratory variation in assay performance. The use of conversion factors and the introduction of international reference materials for calibration have reduced but not eliminated these limitations.
RT-dPCR has advantages for MRD monitoring in CML because it can simplify standardisation and improve sensitivity and precision of measurements. Another application for RT-dPCR is to assign values to reference materials such as the ERM®-AD623, which can be used either for the calibration of secondary ‘in-house’ controls or for directly quantifying BCR-ABL1 copy numbers. Furthermore, RT-dPCR allows precise quantification of BCR-ABL1 copy numbers for rare transcript types.
Several factors need to be considered in applying RT-dPCR to MRD in CML. The first is the clinical vs. analytical sensitivity. Analytical sensitivity is expressed as the LoD of an analyte, indicating the lowest concentration which can be accurately detected with 95% certainty. However, as we discussed above, the clinical sensitivity of an assay is defined as the ability of the test to detect a log-reduction of the ratio between BCR-ABL1 and ABL1 (or another reference gene) expressed on the international scale compared with baseline. Three factors to consider when evaluating clinical sensitivity are: (1) RT-dPCR remains susceptible to errors in upstream processing such as sampling, RNA extraction and efficiency of RT and cDNA synthesis; (2) reference gene quantification is needed to assess sample and pre-PCR processing qualities; and (3) use of a conversion factor remains a requirement for the expression of results on the IS which is, in turn, the basis for assigning molecular responses.
Assay standardisation is also important even though a calibration curve is not needed to quantify target molecules. Detection of a rare target by any PCR method requires an accurate description of the assay LoD. dPCR can be more sensitive than RT-qPCR but is also susceptible to poor assay design, pre-PCR processing and molecular dropout (target not detected despite being present in the reaction). Positive and negative controls are needed to assess false-positive and false-negative rates and to define quantification amplitude thresholds.
A third factor is the issue of multiplexing the high reference background and low target copy numbers in one reaction. To express the transcript levels as a ratio between the target and reference genes, copy numbers of both genes are measured in 3 μl of cDNA containing unknown copy numbers of BCR-ABL1 targets (0–10,000 copies) but relatively fixed numbers of ABL1 transcripts, indicating a good quality sample by RT-qPCR. Because duplex reactions have been shown to be more precise compared with uniplex reactions, there needs to be a balance between accurate quantification of the high concentration of the reference gene and the low concentration of the target gene without compromising the sensitivity of the assay. Achieving this aim requires a platform with many partitions. For example, to quantify 1 transcript amongst 100,000 reference transcripts on an RT-dPCR platform with 20,000 partitions, at least 3 reactions per sample are needed to reach a sensitivity comparable to a single RT-qPCR reaction, and at least 9 to match triplicate RT-qPCR reactions (100,000/1.6 = 62,500 partitions [~3 reactions]). In contrast, a partition number of 10 million allows one reaction per sample to achieve similar sensitivity to a triplicate RT-qPCR reaction without risking reaction saturation by the reference gene.
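The arithmetic in the example above can be reproduced with a few lines of Python. The sketch assumes, as the example does, that the reference gene should not exceed a mean of 1.6 copies per partition; the function names are hypothetical. The number of reactions is returned as a fraction so that the rounding choice stays visible: 62,500 partitions correspond to 3.125 reactions of 20,000 partitions, which the text approximates as about 3.

```python
def partitions_for_reference(reference_copies, max_lambda=1.6):
    """Partitions needed to keep the reference gene at or below max_lambda."""
    return reference_copies / max_lambda


def reactions_needed(reference_copies, partitions_per_reaction, max_lambda=1.6):
    """Fractional number of reactions providing the required partitions."""
    needed = partitions_for_reference(reference_copies, max_lambda)
    return needed / partitions_per_reaction


def mean_copies_per_partition(copies, partitions):
    """Lambda for a given copy number spread over a given partition count."""
    return copies / partitions
```

For a platform with 10 million partitions, `mean_copies_per_partition(100_000, 10_000_000)` gives a λ of 0.01 for the reference gene, far below saturation, which is why a single reaction suffices in that case.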
The accuracy of dPCR quantification is influenced by bias and variance. Systematic bias can result in under-estimation when quantifying high target gene concentrations. Under-estimation can result from poor assay design, inhibitors, non-random distribution of the target because of inhomogeneities, molecular dropout or non-uniform partition size. Consequently, positive internal controls are needed, especially when reporting negative assay results. The random distribution of nucleic acid molecules between partitions makes measurement precision predictable and precise compared with RT-qPCR. However, accuracy is difficult to assess because we lack methods which can verify dPCR assay results. This highlights the importance of using reference materials such as the ERM®-AD623 to assess the performance of the RT-dPCR platform.
Another issue is how to express target gene copy numbers quantified by RT-dPCR. In RT-qPCR, the copy numbers of the target and reference genes in 3 μl cDNA, quantified in a total reaction volume of 20 μl, are required before the ratio of target and reference genes is calculated and converted to the IS using the lab specific conversion factor. We recommend RT-dPCR results be expressed per volume of cDNA included per reaction. For example, if 2 μl cDNA was included in a total of 25 μl RT-dPCR reaction, the copy numbers should be referred to as x copies in 2 μl cDNA.
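The conversion from copy numbers to a result on the IS can be sketched as follows. The example assumes that target and reference copies were measured in the same volume of cDNA and that the laboratory conversion factor is applied by multiplication to the percentage ratio; the function names and any conversion factor passed in are hypothetical and do not describe a specific laboratory.

```python
import math


def ratio_percent(target_copies, reference_copies):
    """BCR-ABL1/reference ratio as a percentage (same cDNA volume for both)."""
    if reference_copies <= 0:
        raise ValueError("reference_copies must be positive")
    return 100.0 * target_copies / reference_copies


def to_international_scale(target_copies, reference_copies, conversion_factor):
    """Percentage ratio multiplied by a lab specific conversion factor."""
    return ratio_percent(target_copies, reference_copies) * conversion_factor


def log_reduction(is_percent):
    """Log10 reduction relative to the 100% IS baseline."""
    if is_percent <= 0:
        raise ValueError("is_percent must be positive")
    return -math.log10(is_percent / 100.0)
```

With 10 target copies and 10,000 reference copies in the same 2 μl of cDNA and a conversion factor of 1.0, the result is 0.1% on the IS, a 3-log reduction from baseline.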
Currently, most RT-dPCR platforms are limited by the volume of sample that can be analysed compared with RT-qPCR. Furthermore, a large reaction volume is required to reach absolute sensitivity using limiting dilutions. Sampling error is particularly important in cases with low target prevalence, where increasing the sensitivity of the test is unlikely to improve the target detection rate. In contrast, increased sampling (sample volume) would. Consequently, versatile, higher throughput instruments which facilitate large reaction volumes (> 50 μl) and larger partition numbers are needed. RT-dPCR has the potential to revolutionise the sensitivity of detecting BCR-ABL1 transcript levels. This may require revising the definitions of molecular responses if a stronger correlation with clinical outcomes than RT-qPCR is shown. The latter is uncertain.
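The effect of sampled volume on detection can be illustrated with a simple model. The sketch below assumes target molecules are randomly (Poisson) distributed in the cDNA and that any sampled molecule is detected, so it isolates sampling error from assay sensitivity; the function name and the example concentrations are hypothetical.

```python
import math


def prob_target_sampled(copies_per_ul, sampled_volume_ul):
    """Probability that at least one target copy is in the sampled volume.

    Poisson sampling: P = 1 - exp(-copies_per_ul * sampled_volume_ul).
    """
    if copies_per_ul < 0 or sampled_volume_ul < 0:
        raise ValueError("inputs must be non-negative")
    expected_copies = copies_per_ul * sampled_volume_ul
    return -math.expm1(-expected_copies)
```

At a hypothetical 0.1 copies per μl, sampling 2 μl captures a target molecule in fewer than one in five attempts, whereas sampling 50 μl captures one almost every time, regardless of how sensitive the downstream detection is.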
The use of fully-automated closed systems for qPCR such as the GeneXpert® cartridges by Cepheid is a practical alternative for low throughput labs or labs in countries where assay standardisation and performance of the RT-qPCR is challenging. The input is a blood sample; RNA extraction, reverse transcription and qPCR are automated inside a cartridge, and transcript levels are reported on the IS using an internal algorithm with no need for a standard curve. These systems are reproducible and have turn-around times of less than two hours. Newer, more sensitive cartridges able to report results deeper than MMR are being developed. dPCR coupled with an automated system similar to that provided by the Cepheid instruments is a potential future development which could improve cartridge sensitivity while still being compatible with small volume samples. Cost remains an important issue with cartridge systems.
Although RT-qPCR is widely used to monitor response to TKI-therapy, it may not be optimal when the issue is the discontinuation of TKI-therapy. Presently, TKI-therapy discontinuation is followed by molecular relapse in 50–60% of patients who apparently had deep molecular responses at the time of stopping therapy. Whether our inability to accurately identify those patients able to remain off treatment indefinitely is due to inadequacies of our molecular monitoring technology, to inherent differences in the biology of the leukaemia and/or the immune response in different patients, or to some combination of these is not known. Potential flaws in the methodology include sampling error, insufficient sensitivity of the RT-qPCR test, failure of leukaemia stem cells (LSC) or progenitor cells (LPC) to transcribe detectable levels of BCR-ABL1, or a combination of these. A more sensitive technique to detect residual leukaemia cells might help to identify patients most likely to benefit from discontinuing TKI-therapy. Several publications report RT-dPCR is better at detecting MRD compared with RT-qPCR, or provides equal sensitivity with improved precision at low BCR-ABL1 transcript concentrations.
Several studies report a DNA-based assay enhances the sensitivity of detection. Genomic DNA is more stable, easier to extract, reduces variability associated with reverse transcription and cDNA synthesis, allows target detection in the absence of transcription and requires no control gene normalisation. The possibility of amplifying benign clones found in healthy individuals is excluded. Probes are patient specific such that each has a unique molecular signature. At least a 1-log improvement in sensitivity compared with an RNA-based assay has been reported for a DNA-based qPCR approach. However, initial enthusiasm was tempered by the need for individual fusion sequence mapping at the genomic level, where breakpoints occur across a wide range of the ABL1 and BCR intronic regions. Breakpoint “hotspots” within repetitive elements and Alu regions made the mapping technically challenging. More recently, advances in targeted high-throughput sequencing and bioinformatic pipelines for structural variant detection have simplified the process of genomic fusion mapping (discussed earlier), bringing DNA-monitoring under the spotlight again.
Previous studies monitoring MRD on DNA using qPCR have reported the detection of BCR-ABL1 positive disease in a substantial proportion of patients with undetectable transcripts by RT-qPCR. However, one limitation of applying DNA-based hydrolysis probe assays on a real-time qPCR platform is the need for positive control material to generate a standard curve. Use of the patient’s presentation material for this purpose compromises accuracy and sensitivity because patients do not always present with 100% BCR-ABL1 positive disease in the blood. The use of a dPCR platform circumvents this constraint by assigning an absolute value to the target molecules, allowing precise quantification with no need for a subject-specific standard curve. We recently reported a 44% rate of detectable MRD using DNA-based dPCR compared with 19% and 11% using qPCR and RT-dPCR, respectively.
The advent of targeted high-throughput sequencing coupled with dPCR monitoring provides the greatest sensitivity to detect low levels of BCR-ABL1-positive transcripts cost-effectively and in a manner suitable for clinical diagnostics . If validated in clinical trials, this technique will allow a more personalised and flexible approach to recommendations for dose-reduction or stopping TKI-therapy in individuals.
In this review we describe and critically evaluate different technologies used to detect BCR-ABL1 transcripts in patients with CML. We focus on the comparison between RT-qPCR and new technologies. We discuss potential advantages of these new technologies for monitoring response to TKI-therapy. We conclude that dPCR can measure extremely low concentrations of target molecules with high precision and without a need for a standard, but validation of this approach is needed. We suggest dPCR and NGS analyses may transform our approach to molecular monitoring of cancers in the next 5–10 years, not only for CML but also for other leukaemias and solid cancers. In addition to analytical validation, the clinical relevance of a better ability to detect and accurately quantify low levels of residual transcripts from cancer cells needs evaluation in clinical trials.
Next Generation Sequencing (NGS) is also known as massively parallel sequencing or high throughput sequencing. It is a term used to describe a number of different modern sequencing technologies including Illumina (Solexa) sequencing, Roche 454 sequencing and Ion Torrent (proton or PGM or semiconductor). These technologies allow the sequencing of DNA and RNA quickly and cost effectively compared to previously used Sanger sequencing. By doing so, they have revolutionised the fields of genomics and molecular biology.
Library preparation refers to the process of sample preparation for next generation sequencing. In general, NGS library preparation comprises the following steps: cleavage of the input nucleic acid into small fragments of specific sizes, or amplicon generation via multiplexed PCR (for amplicon based sequencing); barcode ligation and indexing of the fragments or amplicons; and clean-up steps to purify the products. Each sample processed this way is referred to as a ‘library’.
Barcoding refers to the process of adding unique nucleotide sequences known as barcode sequences (1–16 or 1–32 or more) to the fragments or amplicons generated per individual sample during library preparation. Barcoding of samples allows the simultaneous sequencing of several samples in one sequencing reaction. The barcodes can be added by enzymatic ligation or incorporated in the primers during PCR amplification. Samples are delineated (or de-multiplexed) after sequencing using bioinformatics tools.
Indexing refers to the process of ligating platform specific adaptor sequences to the libraries to be sequenced. These adaptors allow the attachment of the fragments to the amplification machinery, which is a bead in platforms that rely on emulsion amplification (454 and Ion Torrent) or a glass surface in platforms that use bridge amplification (Illumina). In most protocols, the index sequences are linked to the barcodes. In this case, the two terms can be used interchangeably. Sample library indexing should not be confused with molecular barcoding and SNP fingerprinting, which allow the detection of PCR duplicate artifacts and unique sample tracking, respectively.
Emulsion PCR (emPCR) clonally amplifies NGS libraries attached to a bead inside water droplets in an oil solution. The key is to ensure the attachment of one fragment or amplicon per bead so that each fragment or amplicon is clonally amplified on the bead leading to a single read after sequencing. This is achieved by using optimized library dilution.
Bridge amplification clonally amplifies NGS libraries attached to a solid surface. The key is to avoid overloading the surface so that each fragment or amplicon is clonally amplified, generating a cluster that leads to a single read after sequencing. This is achieved by loading an optimized library dilution onto the surface.
Sequencing by synthesis (SBS) is the process of determining a nucleic acid sequence by reading a nascent fragment or amplicon one base at a time. Different sequencing platforms use different chemistries albeit following the same principle. A nucleic acid sequence is read by flooding the reaction with cycles of known dNTPs arranged in certain patterns. The cycles are repeated several times. Prior knowledge of the released dNTP, together with detection of a signal following successful incorporation of one dNTP at a time, forms the principle of sequencing by synthesis.
A ‘flow’ is the event of exposing the sequencing chamber or cluster to one particular dNTP (T, A, C or G) followed by a washing step. The flow order repeats with a particular pattern. Each sequencing ‘cycle’ contains a specific number of consecutive dNTP flows. For example, T-A-C-G = 1 cycle.
A read refers to the sequence information obtained per fragment or amplicon. In more specific terms, each clonally amplified bead or cluster produces a single read representing the sequence of the fragment or amplicon amplified on that bead or cluster. For example, if 10,000 beads or clusters were sequenced, then 10,000 sequenced reads will be produced.
Coverage refers to the average number of reads covering a base in one sequencing reaction. For example, amplicon deep sequencing of 2000 x means that each base in the targeted region of interest is sequenced at least 2000 times or 2000 reads cover the targeted region of interest.
Variant annotation is a crucial step in the analysis of NGS data. In this process, sequenced reads are aligned to a reference sequence for comparison. Differences between the sequenced reads and the reference are highlighted as annotated variants. This is performed using specialized bioinformatics packages.
dPCR is a highly precise analytical technique for absolute quantification of nucleic acids based on PCR amplification of a single template molecule without the need for a calibration curve.
A partition refers to the fixed space within which single molecule PCR takes place. This can be a small well or water-in-oil (emulsion) droplet of nanoliter or picoliter size.
Lambda is the mean target copy number present in a partition. It is estimated by applying the Poisson distribution to account for a positive partition initially containing more than one molecule. The number of copies per reaction can be estimated using λ, the total reaction volume and total partition number.
The Poisson distribution is a limiting case of the binomial distribution that describes the probability of a rare event (target molecule) in a fixed partition size. Assumptions are (1) large population (partitions) of fixed size, (2) a rare event, (3) a binary outcome for the event and (4) random distribution for the event. The application of Poisson in dPCR corrects for the fact that one partition could contain more than one target molecule.
The dynamic range of dPCR is defined by the number of partitions. This is also influenced by the volume and concentration of target in the sample. Due to the application of the Poisson distribution, the dynamic range exceeds the total number of partitions in a reaction; however, at the extreme ends of the range, the precision is greatly reduced. The most precise quantification is reached when λ = 0.6–1.6. Hence, the “sweet spot” of a platform is defined by the range of λ values that can be accurately quantified with acceptable precision.
The authors would like to acknowledge Dr. Gareth Gerrard from the Research Department of Pathology, UCL Cancer Institute for commenting on the NGS section, Dr. Alexandra Whale from LGC for commenting on the dPCR section of this review and Prof Robert Gale, a visiting professor at the Department of Medicine, Imperial College London, for critically reviewing the manuscript.
Handled by Jim Huggett