Our results confirm that SIN lentiviral vector genomes can be mobilized at a readily detectable frequency by expression of viral proteins in cells in which the vector genome is integrated. Vector design significantly influenced mobilization frequency. The mobilized vector particles yielded an intact, unrearranged proviral genome upon reintegration into target cells. The mechanism of transcription of the integrated vector genome was evaluated using a promoter trap design with a vector encoding tat but lacking an upstream promoter in a cell line in which drug resistance depended on tat expression. The location of transcribed integrants in intergenic regions or in a reverse orientation within a gene suggested a transcriptional mechanism other than readthrough from an endogenous upstream promoter. Indeed, we found that in all cases, transcripts encoding tat arose from cryptic promoters either within or upstream of the integrated vector genome.
We demonstrated that vector particles containing a mobilized genome were capable of transferring an intact unrearranged proviral genome into naïve target cells. In all cases studied, the LTRs were fully intact, as determined by sequencing of PCR-amplified products, and Southern blot analysis demonstrated the genome to be intact. Clones of 293T cells containing a single copy of the proviral genome, whether derived from primary or mobilized vector particles, gave rise to vector particles de novo when transduced with helper plasmids. These data suggest that virtually every integrated proviral genome is transcribed, albeit often at low frequency, with considerable variation in the frequency of transcription depending on the integration position. However, we estimate that only approximately 1 in 3,000 integrated vector genomes containing the MSCV LTR was transcribed at a level sufficient to generate tat in amounts adequate to activate the wild-type HIV LTR. This estimate is derived from the ratio of the tat titer of 2,600 divided by the GFP titer of 7.2 × 10^6 (Fig. ). Thus, the activity of cryptic promoters must depend on local features of chromatin structure and on the constellation of regulatory elements, both nearby in the genome and within the vector, that facilitate transcription.
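As a check on the arithmetic behind this estimate, the ratio can be worked through explicitly. This is only a sketch: it takes the two titers quoted above at face value and assumes both are expressed in the same units, which the text implies but does not state.

```latex
\[
\frac{\text{tat titer}}{\text{GFP titer}}
  = \frac{2.6 \times 10^{3}}{7.2 \times 10^{6}}
  \approx 3.6 \times 10^{-4}
  \approx \frac{1}{2{,}800}
\]
```

Rounded to one significant figure, this corresponds to the figure of approximately 1 in 3,000 integrants given in the text.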
Based on the demonstration that relatively high-level transcription of tat-encoding proviral genomes occurs via cryptic promoters, we infer that most or all of the transcripts which result in vector mobilization also arise from cryptic promoters. Because the puromycin-resistant clones derived by virtue of tat expression also contain an integrated, wild-type LTR driving the Pur^r gene, it is not possible to evaluate mobilization of tat-encoding proviral vectors directly, since the genomes that include the wild-type LTR are likely to be far more efficiently mobilized. Furthermore, this genome also contains the GFP marker which, although expressed at a low level, would further confound efforts to evaluate mobilization of the tat-encoding genome by transfer of GFP expression. Any reverse transcript derived from a cryptic promoter that includes the R region of the 5′ LTR may yield an intact DNA proviral genome which is a substrate for reintegration, since first-strand transfer may occur when all, or a portion of the R region, has been transcribed (46). We have shown that the titer of mobilized vector particles from individual clones containing the GFP-encoding proviral genome ranges from 10 to 230 IU/ml, indicating that the relative likelihood of mobilization of any integrant is highly variable. SIN LTRs are expressed at about 15% of the level of a wild-type LTR in the absence of tat (20). This difference is likely to be greater when tat is expressed, since the wild-type but not the SIN LTR is activated by tat. Although a vector containing a wild-type LTR can be mobilized from cells by HIV infection (4), mobilization of a vector with a SIN LTR, although feasible as demonstrated in our experiments, is likely to be less efficient.
The existence of cryptic promoters in eukaryotic DNA is well described. For example, upstream transcription initiation sites for globin genes, which account for a small fraction of globin-encoding RNAs, have been defined within a few hundred base pairs of the major CAP sites (2). In addition, there are long RNA species that may span segments of the locus and that appear to be derived from specific promoter structures (7). Cryptic promoters within the 5′ untranslated region of cellular genes, e.g., the gene for the translation initiation factor eIF4G, may account for only a small portion of the eIF4G-encoding mRNAs but may be the dominant translated species because of the extensive secondary structure in the 5′ untranslated region of the most abundant mRNA species, which inhibits translation beginning at its CAP structure (12). Cryptic promoters apparently reflect coincident location of binding sites for one or more transcription factors which attract the RNA polymerase II (pol II) complex with a frequency sufficient to generate transcripts of downstream sequences (12). The ability of cryptic promoters to generate functional transcripts adds an element of caution to promoter trap experiments (3) and suggests that inferences regarding promoter inducibility should be validated by defining the 5′ ends of the induced transcripts.
DNA methylation and histone acetylation are two features of chromatin structure that are likely to influence the probability of cryptic promoter function (18). Vector genomes integrated into or near methylated CpG islands may be less likely to be expressed than those which are in regions of undermethylated DNA. The process of transcription may expose cryptic promoter sites and allow their activation. For example, a mutation in the transcription elongation factor Spt6 in yeast results in altered chromatin structure of transcribed genes, permitting aberrant cryptic promoter function in coding regions (21). Spt6 is thought to participate in the restoration of chromatin structure following gene transcription. In the globin locus, intergenic transcription is thought to remodel chromatin to permit transcription from the individual globin gene promoters (7). In addition, enhancer and locus control region elements near the integrated genome may influence cryptic promoter activity. In summary, those vector genomes which integrate within or near transcribed genes are more likely to be expressed through activation of cryptic promoters.
Our data indicate that vector genome mobilization remains a risk despite the use of SIN vectors. Specific modifications, e.g., addition of insulator elements to the LTRs, appear to reduce the probability of transcription. Splicing events affecting the vector genome transcript may influence the ability to detect the mobilized vector, as revealed by our studies of the EF1α promoter with or without the downstream intron, and should be considered when comparing two vector designs with respect to the potential for mobilization. The nature of the regulatory elements in the vector and the cellular environment may also influence cryptic promoter activation. For example, globin locus control regions may enhance vector transcription from cryptic promoter sites in erythroid cells but not in lymphoid cells. Experiments are in progress in our laboratory to test this hypothesis.
We found that HIV proteins efficiently package vector genomes based on SIV (data not shown), leading to the prediction, supported by our data, that SIV genomes could be mobilized by transfection of plasmids encoding HIV proteins. An alternative strategy explored by others (9), namely, the use of an artificial tRNA binding site to initiate reverse transcription of the vector genome when complemented by a modified tRNA during vector particle production, may diminish the probability of secondary transduction of host cells lacking the complementary tRNA by a mobilized vector.
Our results are consistent with the recently published studies by Logan et al. (26) in which integrated self-inactivating lentiviral vectors were shown to produce full-length genomic transcripts competent for encapsidation and integration. Their work focused on the identification of sequences in the SIN lentiviral vector which are responsible for transcriptional activation. Primers positioned within the encoded transgene and a second set that amplified the 5′ LTR confirmed that a significant proportion of the transcripts, perhaps the majority, extended to the R region of the LTR, but the actual transcriptional start sites were not mapped. The binding sites for two transcriptional activators, DBF1 and SP1, within the leader region of the proviral genome were identified as influencing the level of proviral gene transcription. Transcripts beginning at cryptic promoters, such as those we demonstrated, could indeed be packaged and give rise to an intact proviral genome upon transduction of target cells provided that the R region is included in the transcript. Undoubtedly, the transcription factor binding sites identified in the studies by Logan et al. (26) could influence the frequency of upstream transcription from cryptic promoters. Although we agree that most integrated genomes are transcribed at variable frequency, our work indicates that only rare integrants are transcribed with sufficient frequency to generate tat in quantities adequate to activate the wild-type HIV LTR promoter.
Our data indicate that careful attention to vector design and screening of various vectors in assays designed to detect vector transcription may reduce the risk of vector mobilization. In parallel with these studies, we are evaluating the effect of vector integration on expression of nearby genes using microarray and real-time PCR analysis. The predilection of oncoretroviral vectors to integrate near the promoter region has not been observed in primary hematopoietic stem cells with an SIV-based lentiviral vector system (17). Rather, SIV vector integrants are distributed throughout genes, potentially reducing the risk for promoter activation. Our current studies are focused on looking for evidence of gene activation by lentiviral globin gene vectors in erythroid cells (15). Such studies, when combined with the evaluation of vectors in tumor-prone mouse models (20), should give us a better appreciation of the risk of stem cell-targeted gene transfer and help determine whether this promising approach to the correction of gene defects is sufficiently safe to allow widespread clinical use.