After HIV-1 enters a human cell, its RNA genome is converted into double-stranded DNA during the multistep process of reverse transcription. First (minus) strand DNA synthesis is initiated near the 5′ end of the viral RNA, where only a short fragment of the genome is copied. To continue DNA synthesis, the virus employs a complicated mechanism that transfers the growing minus strand DNA to a remote position at the genomic 3′ end. This is called minus strand DNA transfer. The transfer enables regeneration of long terminal repeat sequences, which are crucial for viral genomic DNA integration into the host chromosome. Numerous factors have been identified that stimulate minus strand DNA transfer. In this review we focus on describing protein-RNA and RNA-RNA interactions, as well as RNA structural features, known to facilitate this step in reverse transcription.
The human immunodeficiency virus type 1 (HIV-1) belongs to the retrovirus family, members of which have two copies of a single-stranded RNA genome. The genome encodes only a small set of genes important for viral survival, relying primarily on host cell factors to complete the viral life cycle. As with all retroviruses, the replication of HIV-1 requires the RNA genome to be reverse transcribed into DNA soon after the virus enters the cell. The DNA copy of the virus is subsequently transported into the nucleus for incorporation into human DNA to become permanently linked with its host. The DNA form of HIV-1, or provirus, may stay latent for a long period of time, but eventually will be expressed to form new viral particles, which destroy the cell and go on to infect new cells.
Reverse transcription has been a main target in the treatment of HIV infections, as it is an essential step in viral replication. Extensive research is being conducted to understand the mechanisms of this multistep process. Drugs developed to interfere with viral reverse transcription have provided effective treatment, because they prevent the completion of proviral DNA synthesis and its integration into the host genome. In this review, we focus on an early step in HIV-1 reverse transcription, called minus strand DNA transfer. This is a complicated but obligatory event in the conversion of the RNA genome into DNA, and it is critical for generating the DNA provirus in a form suitable for integration into host cell DNA.
The genomic RNA of HIV-1 consists of several major genes that code for essential proteins and enzymes needed for viral replication and maturation. The protein coding region is flanked by two unique sequences, U5 at the 5′ end and U3 at the 3′ end, and by two identical repeat (R) elements, one at each end. These regions do not encode proteins, but contain regulatory sequences important for viral replication. During the process of reverse transcription, U5 and U3 are duplicated to create U3-R-U5 at both ends of the proviral DNA, segments known as long terminal repeats (LTRs). These are important regions used by the virus to mediate its own integration into the host chromosome. In addition, the upstream LTR acts as a promoter and enhancer, and the downstream LTR acts as the transcription termination and polyadenylation site.1 The binding of cellular and viral proteins to these regions regulates HIV gene expression.2,3
Duplication of the unique sequences U3 and U5 in the LTRs is a result of minus strand DNA transfer during reverse transcription (Fig. 1A). Conversion of RNA into DNA is performed by the viral enzyme reverse transcriptase (RT). First (minus) strand DNA synthesis is initiated from a host cellular tRNA3Lys that is captured within the viral particles when they are formed. The 18 nt at the 3′ end of the tRNA are bound to the complementary sequence of the primer binding site (PBS), located 182 nt from the 5′ end of the HIV-1 RNA genome, just downstream of the unique sequence U5. RT extends the tRNA primer, copying the U5 and R elements and synthesizing a 181 nt long cDNA known as minus strong stop DNA ((−)ssDNA). Synthesis of minus strand DNA can continue only after the (−)ssDNA is transferred to the 3′ end of the viral RNA genome, a process known as minus strand DNA transfer. Since both R elements in the viral genome are identical, the (−)ssDNA will interact with the 3′ end of the RNA genome through complementarity of their sequences. The consequence of minus strand transfer is that the U3 region can be copied and the first U3-R-U5 sequence of the 3′ LTR is created. Subsequently, this region is used as a template to synthesize second (plus) strand DNA, which is then transferred to the beginning of the viral genome (plus strand transfer) in order to create the 5′ LTR (see Fig. 1 in ref. 4).
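The order of events described above can be followed with a toy model. The sketch below treats the genome as a list of segment labels rather than real sequences. The function name `toy_reverse_transcription` and the single lumped `coding` segment are illustrative assumptions, and the plus strand steps are collapsed into one copy operation, so it only illustrates how the transfer yields U3-R-U5 at both ends.

```python
# Toy model of LTR formation; segment labels stand in for real sequences.
# The function name and the lumped "coding" segment are illustrative assumptions.

def toy_reverse_transcription(genome):
    """Return (ssdna, minus_strand, provirus) as lists of segment labels,
    all written in plus-sense (5' to 3') order for readability."""
    pbs = genome.index("PBS")

    # Step 1: RT extends the tRNA primer bound at the PBS toward the 5' end,
    # copying U5 and R. This product is the (-)ssDNA.
    ssdna = genome[:pbs]

    # Step 2: minus strand transfer. The R copy in the (-)ssDNA pairs with
    # the identical R element at the genomic 3' end.
    if ssdna[0] != "R" or genome[-1] != "R":
        raise ValueError("transfer requires R at both ends of the genome")

    # Synthesis then continues through U3 and the rest of the genome down
    # to the PBS, which gives the first U3-R-U5 unit.
    minus_strand = genome[pbs:-1] + ssdna

    # Step 3 (simplified): the U3-R-U5 unit is copied into plus strand DNA
    # and transferred to the other end, creating the second LTR.
    ltr = minus_strand[minus_strand.index("U3"):]
    provirus = ltr + minus_strand
    return ssdna, minus_strand, provirus


genome_rna = ["R", "U5", "PBS", "coding", "U3", "R"]
ssdna, minus_strand, provirus = toy_reverse_transcription(genome_rna)
print("-".join(ssdna))
print("-".join(minus_strand))
print("-".join(provirus))
```

In this model the last printed line begins and ends with U3-R-U5, whereas the input RNA carries only R-U5 at its 5′ end and U3-R at its 3′ end.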
Since replication at the ends of the genome is not continuous and requires interaction between distant related sequences of R at the 3′ end and in growing cDNA at the 5′ end, a number of factors are involved to ensure effective reverse transcription, which is necessary for viral fitness and long term survival. The first strand transfer is a rate-limiting step in reverse transcription and many different RNA-RNA and RNA-protein interactions have evolved to ensure that it is rapid and efficient.5,6
Conversion of genomic RNA into the DNA provirus requires two proteins encoded by the genome, RT and the nucleocapsid protein (NC). In HIV-1, the polymerization activity of RT is accompanied by an RNase H activity. RNase H is essential for degrading the RNA within the hybrid duplexes formed by the synthesized cDNA and the RNA that was used as a template during reverse transcription. RT is composed of two subunits, p66 and p51 (66 and 51 kDa). Both activities are located in subunit p66, whereas p51 acts as a structural polypeptide for proper conformation of p66.7,8 The p51 subunit is formed by proteolytic cleavage of the C-terminal domain of the p66 subunit.9 The RNase H active site in RT is located in the C-terminal domain and is separated from the polymerization site by 18 bp in DNA:RNA heteroduplex substrates.10,11 This configuration allows RT to make cleavages in RNA during cDNA synthesis, which is known as polymerization-dependent RNase H activity.12 However, the polymerization rate of RT is greater than the rate of RNA hydrolysis, thus polymerization-independent cuts, made during revisits of RT molecules to remaining RNA oligonucleotides, are necessary for complete removal of the genomic template.12,13 A single virion contains about 50 molecules of RT enzyme.14 The excess RT molecules, beyond what is apparently required for synthesis, may be important in carrying out RNase H cleavage in order to clear the minus strand DNA to allow efficient strand transfer, accurate priming of plus strand synthesis and unimpeded synthesis of the plus strand.
Secondary structures in the RNA genome, such as hairpins and G-quartets, can pause RT during DNA synthesis, promoting RNase H cleavages in the RNA template.15,16 Studies in vitro have demonstrated that the TAR hairpin at the 5′ end of the RNA genome causes a major pause in synthesis by RT.17 The associated RNase H activity of RT cleaves the RNA approximately 14–20 nucleotides downstream from the pause site, within the polyA hairpin, creating an initiation site for the invasion-driven mechanism of minus strand transfer (Fig. 1B). The RNase H cuts create a short gap where the homologous sequence of R from the 3′ end can invade and anneal to the cDNA and initiate displacement of adjacent segments of the 5′ end RNA. The strand exchange continues by a branch migration process until it reaches the 3′ terminus of the synthesized (−)ssDNA, completing minus strand DNA transfer.18,19
The polymerization activity of RT enzyme is greatly facilitated by NC protein. In fact, many steps of the retroviral life cycle, including the assembly of virus particles, genomic RNA dimerization and packaging, reverse transcription and integration into the host genome require NC involvement.20–24 NC is a small (55 amino acid) protein with nucleic acid binding activity, processed from a precursor polypeptide encoded by the gag gene. A single virion contains 2,000–3,000 molecules of NC, coating the RNA genome with binding sites of 7–8 nt.25,26 The key function of NC is chaperone activity, which is refolding of nucleic acids into the most thermodynamically stable conformations that have the maximal number of base pairs.27 The protein has two zinc fingers for interaction with ssRNA, ssDNA and dsDNA, and has the ability to destabilize nucleic acid helices and cause nucleic acid aggregation.28,29
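The copy number and binding-site size quoted above can be checked with a back-of-the-envelope calculation. The sketch below multiplies the two ranges and compares the result with the nucleotides in two genome copies. The genome length of 9,200 nt is an assumed round figure, not a value taken from this review, and the calculation ignores any NC bound to other nucleic acids in the particle.

```python
# Back-of-the-envelope estimate of how much packaged RNA the NC could cover.
GENOME_LENGTH_NT = 9_200        # ASSUMED round figure for one HIV-1 genomic RNA
GENOME_COPIES = 2               # retroviral particles carry two genome copies
NC_PER_VIRION = (2_000, 3_000)  # range quoted in the text
SITE_SIZE_NT = (7, 8)           # range quoted in the text


def nc_coverage_range():
    """Return (low, high) fraction of packaged nucleotides NC could cover."""
    total_nt = GENOME_LENGTH_NT * GENOME_COPIES
    low = NC_PER_VIRION[0] * SITE_SIZE_NT[0] / total_nt
    high = NC_PER_VIRION[1] * SITE_SIZE_NT[1] / total_nt
    return low, high


low, high = nc_coverage_range()
print(f"NC could cover roughly {low:.0%} to {high:.0%} of the packaged RNA")
```

Under these assumptions the upper bound exceeds 100%, which is in line with the description of NC coating the genome.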
Studies in vitro have shown that the presence of NC during reverse transcription increases the efficiency of the various steps and reactions. NC transiently eliminates secondary structures such as hairpins.30–33 This activity of NC greatly reduces RT pausing and increases the efficiency of DNA synthesis. During synthesis of minus strand DNA, the highly structured TAR hairpin at the 5′ end of the RNA template is destabilized with the help of NC. Although the pausing of RT at TAR is reduced, the efficiency of minus strand transfer is nevertheless higher in the presence of NC. Studies by Purohit and coworkers showed that while the pausing of RT diminishes in the presence of NC protein, some RNase H cleavages increase due to enhanced annealing of cDNA to the RNA template.34 Moreover, NC enhances annealing between nascent DNA and the invading 3′ R sequence and subsequently promotes strand exchange, which progresses continuously until the minus strand transfer process is completed.18,19,29,35,36
Analyses of minus strand transfer in vitro have demonstrated that NC protein also significantly inhibits a self-priming effect.37,38 The 3′ end of (−)ssDNA corresponds to the sequence of TAR, and so this region has the potential to fold back and form a similar hairpin, which can self-prime DNA synthesis and inhibit minus strand transfer. The primary basis of self-priming suppression in the presence of NC is promotion of an exchange of the very 5′-most RNA oligomer left from polymerization-dependent RNase H with the homologous RNA sequence of the genomic 3′ end, leading to minus strand DNA transfer.39
The reconstituted systems used to analyze minus strand DNA transfer in vitro have demonstrated that using different lengths of the RNA representing the 5′ end of HIV-1 (donor RNA) results in different transfer efficiencies. The cDNA synthesized in vitro on the RNA template spanning the region from the 5′ end up to the PBS (D199) transfers with low efficiency to the second RNA (acceptor RNA) representing the 3′ end of the virus. However, the efficiency of transfer increases dramatically when the donor RNA is extended at its 3′ end (D520) to include naturally occurring sequences present beyond the PBS.40 The transfer of cDNA synthesized from both donor RNAs uses the same acceptor invasion-driven mechanism, but that mechanism is more effective when a longer donor RNA is used. This implies that the folding properties of donor RNAs of different lengths influence the transfer reaction, indicating that local structure has an important effect on minus strand transfer.40
Analyses in vitro demonstrated that the 5′-untranslated region in HIV-1 can adopt two distinct structures (Fig. 2A).41 A long-distance base-pairing interaction (LDI) between the polyA and dimerization initiation site (DIS) can be formed into a thermodynamically stable structure. However, the sequence can refold into the branched multiple hairpin (BMH) conformation, which allows the polyA and DIS motifs to fold into hairpins. The TAR hairpin has the same structure in LDI and BMH. Interestingly, structure analyses of both RNA donors, D199 and D520, revealed that each adopts a different conformation. The longer RNA donor has a structure similar to LDI, whereas the shorter RNA donor with lower efficiency of minus strand transfer resembled the 5′-side of the BMH.40 Possibly the conversion of one structure to the other occurs during (−)ssDNA synthesis and transfer, and aids the process. Interestingly, the BMH structure was proposed to favor RNA genome dimerization and packaging into the virion.42–44 With the DIS formed as a hairpin in BMH conformation, the kissing loop base pairing between complementary sequences in the loop will lead to dimerization of viral genomes.45,46 These structural inter-conversions may coordinate sequential events in virus assembly, reverse transcription and protein translation.
Extensive studies in vitro revealed that the structure of the sequences involved directly in the minus strand transfer is also important. The sizes of the R elements in different retroviruses vary significantly from 16 bases in mouse mammary tumor virus (MMTV) to 247 bases in human T-cell leukemia virus type 2.47,48 However, a significant shortening of the R sequences affects the efficiency of minus strand transfer.49–51 Moreover, early strand transfers of partially synthesized cDNA are not frequently seen with HIV-1.52 Although the (−)ssDNA synthesized at the 5′ end of the viral genome has to hybridize with the complementary R region at the 3′ end, the length of complementarity is apparently not the only factor important for efficient minus strand transfer. The structure of the R elements is also crucial. A stable TAR hairpin at the 5′ end is important for efficient minus strand transfer, but disruption of this hairpin at the 3′ end of viral genome is beneficial for the transfer in vitro. However, stabilizing the polyA hairpin in the 5′ and 3′ R will inhibit the transfer reaction.53 Studies in vitro have demonstrated that pre-incubation of NC with the 3′ end RNA template rather than with just the 5′ end RNA, significantly enhances the efficiency of minus strand transfer. This implies that destabilization of hairpins at the 3′ end by NC is beneficial for the reaction.54 Overall, strong structure in the genomic 3′ end appears to decrease the efficiency of the (−)ssDNA transfer, but a highly structured 5′ end of the genome promotes the transfer.
The stimulatory effect of the folded genomic 5′ end on minus strand transfer is presumably important for the invasion-driven mechanism of transfer, as described above (Fig. 1B). TAR structure pauses RT during cDNA synthesis, promoting RNase H cleavages in the copied template creating an invasion site for the 3′ R RNA.17 Lack of structure in the 3′ RNA likely promotes its ability to invade and base pair with the cDNA.
Analyses in vitro and in vivo have demonstrated that the length of homology between the 97 nucleotide R elements in HIV-1 influences the efficiency of minus strand transfer.50,55 However, it was shown that transfer within a homology of just 20 nucleotides will not be affected in vitro, as long as the mutated 3′ R sequence still contains an unchanged polyA hairpin, needed for invasion of the (−)ssDNA.55 The interaction between the invasion site in the (−)ssDNA and the invading polyA hairpin of the 3′ R brings sequences involved in the completion of transfer at the (−)ssDNA terminus into proximity.
The 5′ and 3′ R elements are over 9,000 nucleotides apart in the linear form of the HIV-1 RNA genome. However, RNA molecules are capable of forming complex tertiary interactions between related sequences over very long distances. Circularization of the viral genome in a way that juxtaposes the R elements should facilitate the process of minus strand DNA transfer. The possibility of genome circularization during reverse transcription was already indicated in retroviruses 35 years ago.56,57 In subsequent years it was discovered that many single-stranded RNA viruses adopt a circular conformation, which is essential for different stages of their viral life cycles.58 Viral genome circularization is known to stimulate initiation of translation, as is the case for many cellular mRNAs.58 Juxtaposition of the viral ends also facilitates transcription, genome replication and viral packaging.59–62 In many positive strand viruses, such as rotaviruses, picornaviruses and pestiviruses, a 5′-3′ end contact is mediated through RNA binding protein interactions, which can involve both viral and cellular factors.60,63–66
Alternatively, viruses can circularize their genomes via long-distance RNA-RNA interactions. With application of atomic force microscopy it was shown that the dengue virus genome can form a circle in the absence of proteins.67 RNA-RNA genomic cyclizations have been documented in many plant viruses,59 flaviviruses,60,68–70 hepatitis C virus,71 FMDV (Foot-and-mouth disease virus),72 and also in HIV-1.73
Several different types of RNA-RNA interactions have been described that could mediate HIV-1 genome circularization. A palindromic sequence of 10 nucleotides in the stem-loop of the TAR hairpin in the R element was indicated as a possible place of interaction involved in genome dimerization, in addition to the primary site of dimerization at the DIS.74 Since TAR is present at both ends of the viral RNA, the palindromic sequences could interact and circularize the HIV-1 genome (Fig. 2B).75 In this way the sequences of the R elements involved in minus strand transfer would self-juxtapose by direct interaction. Stem-loops and loop-loop kissing structures formed by sequences at both ends of the RNA genome were demonstrated to participate in genome circularization in many plant viruses.59 However, analyses of mutations in TAR sequences that should have disrupted the putative binding showed that 3′-5′ end interactions in vitro were not significantly affected.75 Apparently, other factors in addition to TAR hairpins maintain genome circularization, whereas putative base pairing between TAR sequences at both ends of the genome might still help in juxtaposing R elements for minus strand transfer.
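A palindromic nucleic acid sequence is one that equals its own reverse complement, which is why two copies of the same motif can base pair with each other. The minimal sketch below checks this property using Watson-Crick pairs only (G-U wobble pairs are ignored). The 10 nt example sequence is made up for illustration and is not the actual TAR motif, and the function names are hypothetical.

```python
# Check whether an RNA segment is palindromic in the nucleic acid sense,
# i.e. identical to its own reverse complement (Watson-Crick pairs only).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def is_palindromic(rna):
    """True if two copies of the sequence can pair fully in antiparallel."""
    return rna == reverse_complement(rna)


example = "GGUACGUACC"  # made-up 10 nt sequence, NOT the real TAR motif
print(is_palindromic(example))       # two copies could pair with each other
print(is_palindromic("GGUACGUACA"))  # a single 3' change breaks the symmetry
```

The same `reverse_complement` helper could be used to test whether two different segments, such as a motif at each genome end, are complementary to one another.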
Another type of RNA-RNA interaction involves a region in gag and the 3′ U3/R (Fig. 2B). Here, a structure is formed with extensive base pairing among nucleotides within the gag ORF (positions 619–696, clone pNL43) and sequences flanking the TAR hairpin, 18 nt of U3 (positions 9,054–9,071) and the polyA hairpin of the 3′ R (positions 9,136–9,170).73
Interactions between cis-acting sequences of viral RNA genomes were initially documented for many flaviviruses.61,67–70 Here, genome circularization is maintained by short complementary cyclization sequences (CS) present in the capsid coding region and at the 3′ end of the RNA genome. For dengue virus, additional sequences described as UARs (upstream AUG regions) were also found in proximity to the CS, and UARs were shown to be significant for viral viability.67 A similar 3′-5′ RNA-RNA interaction was reported for yeast LTR Ty1 retrotransposon.76 LTR retrotransposons are thought to be ancestors of retroviruses.77,78 The interactions involve base pairing of complementary 14 nucleotide long sequences, called CYC5 in the gag ORF and CYC3 in U3.76 The reconstituted system in vitro demonstrated that contact between CYC5 and CYC3 is important for efficient minus strand transfer and also to enhance initiation of reverse transcription. Additionally, CYC5-CYC3 interactions are required for Ty1 transposition.76 In the case of HIV-1, with application of a reconstituted system, mutation analysis of a 3′ U3/R-gag interaction showed that this structure also enhances efficiency of minus strand DNA transfer, in experiments using a DNA primer to start reverse transcription.75
Genome circularization in retroviruses via tertiary interactions between 5′ and 3′ genome ends may also involve other nucleic acid molecules, such as the cDNA made de novo and tRNA3Lys used by HIV-1 as a primer to start reverse transcription. Specifically, it was proposed that growing cDNA extended from the tRNA3Lys primer could interact via kissing loop contact of the TAR hairpin segment in (−)ssDNA and TAR at 3′ RNA genome (Fig. 2B).53 Also, the invasion mechanism of minus strand transfer itself is a mechanism for genome circularization (Fig. 1B). Here, the interaction between the polyA hairpin at the 3′ end of the RNA and the invasion site in (−)ssDNA could hold both of the viral genome ends together. In both cases, the interaction only becomes possible after some (−)ssDNA synthesis.
The primer tRNA3Lys might also serve as a bridging factor interacting with the 3′ end of the viral genomic RNA (Fig. 2B). The interaction would occur between nine nucleotides in U3 (motif 9 nt, positions 9,033–9,041) adjacent to the 3′ R element and nucleotides of the anticodon stem in tRNA3Lys (positions 38–46).79 Computational analysis revealed that sequences flanking motif 9 nt also have complementarity to tRNA3Lys, and the whole region spanning extensive sequences in U3/R might represent an entire intron-containing tRNA gene incorporated during evolution of the HIV-1 genome.80 Analyses in vitro and in vivo have demonstrated that motif 9 nt and complementary sequences in U3 (segment 1, positions 9,008–9,019) stimulate minus strand transfer.79–81 Interactions between U3 sequences and tRNA3Lys appear most likely to be established after initiation of reverse transcription. The nucleotides of the anticodon stem in tRNA3Lys are first involved in an interaction with sequences upstream of the PBS in U5 in order to form the initiation complex for DNA synthesis.82,83 In the case of segment 1, nucleotides are complementary to the 3′ end of tRNA3Lys, where 18 nucleotides of the tRNA serve as a primer and base pair directly with the PBS. Here too, the interactions might take place only after the start of reverse transcription.
Whereas the binding between tRNA3Lys and motif 9 nt in U3 could help in maintaining the R elements in proximity, the significance of interactions with segment 1 might be different. For example, segment 1 could help in displacement of the tRNA from the PBS to allow copying of the 3′ end of the tRNA into DNA in preparation for second strand transfer.
Primer tRNA serving as a bridging factor in genome circularization was also indicated for LTR retrotransposons and endogenous LTR retroviruses. The tRNAiMet is used as a primer to start reverse transcription in Ty3 yeast retrotransposons. Whereas the 3′ end of this tRNA is bound at the PBS, the TψC arm and D arm are involved in interactions close to the 3′ end of the Ty3 RNA causing retro-element genome circularization.84 A similar situation was described for the endogenous retrovirus Gypsy in Drosophila melanogaster.85 Here, the primer tRNA2Lys interacts at the PBS and within the region upstream of the PPT and U3 located about 450 nucleotides from the 3′ end of the Gypsy genomic RNA. The second interaction involves the nucleotides of the TψC arm and Variable loop in tRNA2Lys.
In order to maintain the proximity of R elements for efficient minus strand transfer, the scenario of the 5′-3′ end interactions in HIV-1 genomic RNA could be very complex. Consider that RNA genome circularization might be already established by the time of packaging of the viral genome. The circular form of HIV-1 RNA could be maintained via interactions between gag and U3/R sequences at the genomic 3′ end. The 5′ end would form a BMH conformation with DIS and TAR hairpins exposed. Within the virion, the dimerization of two RNA genomic molecules via TAR hairpins might no longer be essential and could be replaced to allow base pairing between TAR hairpins at the 5′ and 3′ ends. Proximity of the genomic ends due to gag-U3/R contact will facilitate self-juxtaposing of R elements via stem-loop kissing base pairing of TAR hairpins. Soon after reverse transcription begins, the interactions between U5 and the anticodon stem in tRNA3Lys are unwound, enabling the tRNA to bind with motif 9 nt in U3. This interaction could help to maintain the proximity of R elements during minus strand transfer, when contacts of the 3′ R with gag sequence need to be removed. As the synthesis of minus strand DNA proceeds towards the 5′ end of the genome, TAR interactions will be disrupted. Furthermore, minus strand transfer occurs within the 3′ R element, and so the interactions with gag have to be disrupted too. The binding between motif 9 nt and tRNA3Lys would maintain circularization and bring segment 1 into proximity with the 3′ end of the tRNA primer, facilitating tRNA displacement from the PBS. After the transfer of (−)ssDNA the growing cDNA will disrupt all binding between U3 sequences and gag and the tRNA. If accurate, this interpretation explains why there must be a stepwise series of interactions.
The ability of HIV-1 to recombine frequently and its error-prone reverse transcription are survival features of the virus that allow it to evolve rapidly and avoid inactivation by the human immune system. Drugs currently used to treat HIV infections target the virus replication process. Although they can effectively inhibit the viral replication machinery, HIV-1 is so effective at evolving drug resistance that new treatments need to be developed. Targeting steps in viral replication that are unique to the virus offers opportunities for treating viral infections without disrupting human cellular function. Minus strand DNA transfer might be a good target for interfering with viral replication, as it is an early and mandatory process preparing viral genetic material for integration into the host. Moreover, efficient minus strand transfer is critical for maximum viral replication fitness.
Work from our laboratory described herein was supported by the US National Institutes of Health research grant GM049573.