Following transplantation of transduced cells, quantitative assessment of the clonal contributions of individual cells and their progeny, based on identification of unique proviral integration sites, can provide important insights into both the biology and the safety of diverse gene transfer and gene therapy applications utilizing integrating vectors. For instance, detecting clonal expansion over time may predict genotoxicity, and comparison of clonal contributions to different hematopoietic lineages can confirm stem cell transduction or help clarify hematopoietic ontogeny (Dunbar, 2007
). Accurate identification of all vector integration sites is also required for screening potential “safe harbor” genetic modification of patient-specific induced pluripotent stem cells for regenerative medicine applications (Papapetrou et al.
A key goal of integration site mapping is informative and efficient identification of all genomic integrants using the smallest amount of starting DNA, as availability of clinical DNA samples may be limited and repeated sampling impossible (Bleier et al.
). LAM-PCR relies on adjacent restriction enzyme sites to access genomic insertions. Some integration sites lie too close to or too far from any given restriction enzyme site, yielding fragments that are too small to resolve or too long to amplify, which limits the analysis to a subset of clones in a mixture (Harkey et al.
; Gabriel et al.
). In a “multiarm” approach, a combination of the five most potent four-cutter restriction enzymes gives access to 88.7% of the analyzable genome; however, performing LAM-PCR with five different enzymes is more labor-intensive and is impractical given the large amounts of DNA required (Gabriel et al.
). Indeed, a multiarm approach that pools X enzymatic reactions would require X times the starting genomic DNA, depleting limited clinical material.
Without the restriction enzyme digestion step, and with an abridged priming step prior to nested PCR, our Re-free LAM-PCR requires less labor, time, and expense, in addition to being more DNA-efficient. A single Re-free LAM-PCR reaction with 100 ng of starting DNA can reach up to 90% efficiency in detecting clones in a 10-clone polyclonal test setting. In contrast, three pooled reactions, each with 100 ng of starting DNA, are required to attain up to 90% efficiency by regular LAM-PCR on the same clonal mixture. Several other approaches without restriction enzyme digestion have been shown to access the integrome, but with a higher reported quantity of starting genomic DNA, for example, up to 1 μg (Pule et al.
; Paruzynski et al.
). Re-free LAM-PCR allows for high genomic coverage and retrieval efficiency with a comparatively low amount of starting genomic DNA. Sensitivity assay data indicate that Re-free LAM-PCR can track clone-specific integration sites at transduction levels as low as 1%, comparable to LAM-PCR.
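The DNA cost of pooling reactions follows directly from the figures above and can be sketched as below. The function name `total_dna_ng` is hypothetical, and the five-arm figure assumes 100 ng per enzymatic reaction, which the text does not state for the multiarm approach.

```python
def total_dna_ng(reactions: int, ng_per_reaction: float = 100.0) -> float:
    """Total genomic DNA (ng) consumed when pooling independent reactions."""
    if reactions < 1:
        raise ValueError("at least one reaction is required")
    return reactions * ng_per_reaction


# One Re-free LAM-PCR reaction versus three pooled regular LAM-PCR reactions,
# both at 100 ng per reaction as reported in the text.
refree_ng = total_dna_ng(1)
lam_pooled_ng = total_dna_ng(3)

# Assumption: a five-enzyme multiarm run also uses 100 ng per arm.
five_arm_ng = total_dna_ng(5)

print(refree_ng, lam_pooled_ng, five_arm_ng)
```

Under these assumptions, reaching the same retrieval efficiency costs three times as much DNA with regular LAM-PCR as with Re-free LAM-PCR.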
In LAM-PCR, even after excision of the gel band corresponding to amplified internal vector, undesirable internal vector sequences amounted to about 16% of the shotgun-sequencing product. In contrast, no internal vector sequences were detected in the final Re-free LAM-PCR shotgun-sequencing product. These results suggest that the Re-free LAM-PCR blocking oligonucleotides, specific to the LTR-proximal vector sequence, select effectively against subsequent nested PCR amplification of internal sequences. As shotgun sequencing is replaced in modern molecular biology applications, high-throughput sequencing of Re-free LAM-PCR nested PCR products should benefit from the absence of internal sequences, making longitudinal in vivo follow-up more informative when only a limited amount of sample DNA is available.
To our surprise and disappointment, Re-free LAM-PCR did not provide accurate quantitative information on clonal contributions, suggesting that integration site detection bias is not solely the result of restriction enzyme-related factors in terms of distance from restriction enzyme sites or efficiency of digestion (Harkey et al.
). However, Re-free LAM-PCR was able to detect a clonal integration site (D41) that was not accessible to LAM-PCR on repeated runs, due to the lack of an LTR-proximal Tsp509I restriction site. Indeed, previous studies (Harkey et al.
) have determined that while the Tsp509I AA|TT restriction motif is the most widely distributed and efficient, it still leaves 10% of the genome inaccessible to LAM-PCR–based integration site retrieval. Since the D13 clonal integration site, located in an A/T-rich region and undetected by Re-free LAM-PCR in the mixture samples, is readily accessible via LAM-PCR, our results suggest that the two methods have distinct biases, each of which prevents the detection of some integration sites of potential interest. In the past, increased integration site detection was achieved by increasing the number of LAM-PCR repeats and using various restriction enzymes, a laborious and time-consuming process. As an alternative, we suggest performing one Re-free LAM-PCR run and one regular LAM-PCR run, each with 100 ng of starting genomic DNA, as a means of increasing recovery efficiency. If less DNA is available, Re-free LAM-PCR provides a labor-saving means of qualitatively mapping the majority of integrants, retrieving around 75% of total integration sites. Combined, Re-free LAM-PCR and LAM-PCR can provide complementary and presumably more complete genomic coverage in situations where more sample DNA is available.
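The dependence of LAM-PCR on an LTR-proximal restriction site, as with the D41 site lacking a nearby Tsp509I motif, can be illustrated with a minimal sketch. The function names and the `min_bp` and `max_bp` thresholds are hypothetical placeholders and are not values from this study; the motif is taken as the AATT recognition sequence.

```python
from typing import Optional


def distance_to_first_site(flank: str, motif: str = "AATT") -> Optional[int]:
    """Offset (bp) from the LTR junction to the first motif in the flanking
    genomic sequence, or None if the motif does not occur."""
    idx = flank.upper().find(motif)
    return None if idx < 0 else idx


def is_accessible(flank: str, min_bp: int = 20, max_bp: int = 500,
                  motif: str = "AATT") -> bool:
    """True if the first site lies within an amplifiable, resolvable range.

    min_bp and max_bp are illustrative assumptions only.
    """
    d = distance_to_first_site(flank, motif)
    return d is not None and min_bp <= d <= max_bp


# A flank with a site 30 bp from the junction, and one with no site at all.
print(is_accessible("G" * 30 + "AATT" + "C" * 10))  # site in range
print(is_accessible("GCGC" * 50))                   # no site present
```

A site with no motif in range would be missed by restriction enzyme-dependent LAM-PCR regardless of the number of repeats, which is the situation in which a restriction enzyme-free run is complementary.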
Furthermore, the efficiency and quantitative potential of Re-free LAM-PCR may improve with advances in polymerase technology that allow efficient priming and extension across a wider range of GC- and AT-rich templates and amplicons and that better tolerate common PCR inhibitors, possibly leading to fuller genomic access in restriction enzyme-free qualitative integration site detection. We noticed that clone D47 was also significantly under-represented in Re-free LAM-PCR analyses. However, the A/T content of the 250 bp surrounding the D47 integration site is a moderate 58.4%. This observation suggests that factors beyond A/T content, such as secondary structure motifs in the flanking DNA, could play a role in restricting access to the integrome.
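The A/T content of a window around an integration site can be computed as in the following sketch. It assumes a 250-bp window centered on the site (the text does not specify whether the 250 bp is the total or per side), and the helper names `window_around` and `at_fraction` are hypothetical.

```python
def window_around(sequence: str, site: int, size: int = 250) -> str:
    """Return up to `size` bases roughly centered on a 0-based site position.

    Assumption: the window is `size` bp in total, clipped at the sequence start.
    """
    start = max(0, site - size // 2)
    return sequence[start:start + size]


def at_fraction(seq: str) -> float:
    """Fraction of A/T among unambiguous bases (N and other codes are ignored)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


# Synthetic example: 146 A/T bases in a 250-bp window gives 58.4%.
example = "A" * 146 + "G" * 104
print(f"{at_fraction(example):.1%}")
```

Under the 250-bp total window assumption, the reported 58.4% corresponds to 146 A or T bases out of 250.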
Of the 10 single copy K562 clones, D13 and D40 were located on chromosome 7, while clones D33 and D39 were both located on chromosome 5. In a genome-wide analysis of lentiviral integration sites using next generation sequencing technology, chromosomes 7 and 5 were found to be over-represented as sites of lentivector integration in tetraploid K562 cells compared to control 293T cells (Ustek et al.
). In our study, the D13, D40, and D39 clones were under-represented, whereas D33 was readily detected by Re-free LAM-PCR. Because we used tetraploid, karyotypically abnormal K562 cells, it is conceivable that chromosomal integration preferences differ from those of normal primary cells and that this affects retrieval by Re-free LAM-PCR.
Genetic engineering of the hematopoietic stem cell (HSC) provides a clinically accessible paradigm that has broader implications for stem cell engineering as a whole. With widespread awareness of the risks of certain aspects of vector design, such as LTR-driven transgene expression, current second- and third-generation vectors appear to be safer. Nonetheless, since lentiviral vectors still have not been targeted to specific loci or genomic regions (Riviere et al.
), there is a key need for both quantitative (e.g., clonal dominance) and qualitative integration site information. The recent development of HSC barcoding with a unique proviral barcode per transplanted cell (Lu et al.
) constitutes a promising avenue for quantification of repopulating clones for real-time monitoring of clonality. Given that sequencing flanking genomic DNA at the LTR junction remains necessary to ascertain the medium- to long-term implications of a proviral integration (Glimm et al.
), vector design that includes a barcode with the transgene could be complemented by Re-free LAM-PCR–based sequencing of the flanking genomic DNA of an integration site in order to determine whether a clone's biology had been altered due to the proviral location in the genome.