Viral infections are initiated by the fusion of the viral and cellular membranes; this fusion reaction is caused by the interactions of the viral envelope glycoprotein with its receptor (CD4) and a co-receptor, usually either CCR5 or CXCR4 (for a review of the retroviral life cycle, and an overview of reverse transcription, see Coffin, Hughes and Varmus, Retroviruses, 1997 (1)). Binding the receptor and co-receptor causes changes in the structure of the envelope glycoprotein, which leads to membrane fusion. Membrane fusion places the viral core, which contains RT, into the cytoplasm of the cell. A poorly understood process called “uncoating” modifies the core in ways that promote reverse transcription.
The HIV-1 virion contains, in addition to the viral proteins, two copies of a single-stranded RNA genome. RT has two enzymatic activities: a DNA polymerase that can copy either a DNA or an RNA template, and an RNase H that cleaves RNA only if the RNA is part of an RNA/DNA duplex. The two enzymatic functions of RT, polymerase and RNase H, cooperate to convert the RNA into a double-stranded linear DNA. This conversion takes place in the cytoplasm of the infected cell; after DNA synthesis has been completed, the resulting linear double-stranded viral DNA is translocated to the nucleus, where it is inserted into the host genome by IN. This inserted DNA copy, called a provirus, is the source of both viral genomic and viral messenger RNAs, which are generated by the host DNA-dependent RNA polymerase. Although other viral proteins (notably the nucleic acid chaperone nucleocapsid, and perhaps IN) and probably some cellular factors help RT carry out the reactions that convert the viral RNA into DNA, RT contains all the necessary enzymatic activities for the conversion.
Like many other DNA polymerases, RT requires both a primer and a template. DNA synthesis is initiated from a host tRNA primer (in HIV-1 the primer is tRNAlys3). Near the 5’ end of the viral genome there is a segment 18 nucleotides long, called the primer binding site (PBS), that is complementary to the 18 nucleotides at the 3’ end of tRNAlys3. The tRNA primer is hybridized to the PBS (Figure 1). The viral RNA genome, which serves as the template, is plus-strand. First (minus) strand DNA synthesis is initiated from the tRNA primer, allowing RT to copy the 5’ end of the viral RNA genome. Synthesis of the minus-strand DNA generates an RNA/DNA hybrid that is a substrate for RNase H. RNase H degrades the RNA strand, leaving the nascent minus-strand DNA single stranded. The sequences at the 5’ and 3’ ends of the viral RNA genome are identical (R, or repeat; see Figure 1). This allows the minus-strand DNA to hybridize with the R sequence at the 3’ end of one of the two viral RNAs in the virion, a step that is called the first jump, or the first (minus-strand) transfer. After the nascent DNA hybridizes to R, minus-strand synthesis can continue along the viral RNA. As DNA synthesis proceeds, RNase H degrades the RNA strand.
Although most of the RNase H cleavages do not appear to be sequence specific, there is a purine-rich sequence (called the polypurine tract or PPT) near the 3’ end of the viral RNA (Figure 1). The polypurine tract is relatively resistant to RNase H cleavage, and it serves as the primer for second (plus) strand DNA synthesis. Plus-strand DNA synthesis proceeds until RT starts to copy the tRNA primer. The first 18 nucleotides can be copied; however, the next nucleotide in the tRNA is a modified A that cannot be copied by RT. Once the 3’ end of the tRNA has been reverse transcribed, an RNA/DNA hybrid that is a substrate for RNase H is created. In the reverse transcription of most retroviral genomes, the RNase H of RT removes the entire tRNA primer.
HIV-1 is the exception; its RNase H cleaves precisely one nucleotide from the tRNA/DNA junction, leaving a ribo-A on the 5’ end of the viral minus-strand DNA. This sets the stage for the second jump, also called the second or plus-strand transfer. The removal of the tRNA primer exposes a single-stranded portion of the plus-strand DNA that has the same sequence as the PBS. Exposure of the 3’ end of the plus-strand DNA allows the 3’ end of the minus-strand (once the PBS has been copied) to be transferred to the plus-strand (Figure 1). Once this second transfer happens, both the minus- and plus-strands are extended until the entire DNA is double stranded, creating a DNA that has the same sequences at both ends (these repeats are called the long terminal repeats or LTRs). As Figure 1 shows, the DNA is longer than the RNA genome(s) from which it derives, allowing the viral DNA, once integrated, to serve as the template from which new copies of the viral genome (and the viral messenger RNAs) can be copied by the host enzyme DNA-dependent RNA polymerase.
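The order of events described above can be followed with a small bookkeeping sketch. It is only an illustration under simplifying assumptions: the genome is reduced to named segments (R, U5, PBS, PPT and U3, as in the text), the label `genes` is a hypothetical placeholder for everything between the PBS and the PPT, and every intermediate is written in plus-sense order regardless of which strand it lies on. The function name `reverse_transcribe` is likewise an invention for this example.

```python
# Toy bookkeeping model of reverse transcription at the level of segments.
# "genes" is a hypothetical placeholder for the region between PBS and PPT.
RNA_GENOME = ["R", "U5", "PBS", "genes", "PPT", "U3", "R"]


def reverse_transcribe(rna):
    """Return the segment order of the finished double-stranded DNA.

    All intermediates are listed in plus-sense order for readability.
    """
    pbs = rna.index("PBS")

    # 1. Minus-strand synthesis starts from the tRNA on the PBS and runs
    #    to the 5' end of the RNA, copying U5 and R.
    minus_strong_stop = rna[:pbs]                        # R, U5

    # 2. First transfer: the copied R pairs with the R at the 3' end of an
    #    RNA genome; synthesis continues through U3, PPT, genes and PBS.
    minus_strand = rna[pbs:-1] + minus_strong_stop       # PBS genes PPT U3 R U5

    # 3. Plus-strand synthesis is primed at the PPT, copies U3, R and U5,
    #    then the first 18 nucleotides of the tRNA, which recreates a PBS.
    ppt = minus_strand.index("PPT")
    plus_strong_stop = minus_strand[ppt + 1:] + ["PBS"]  # U3 R U5 PBS

    # 4. Second transfer: the two PBS copies pair, and both strands are
    #    extended until the whole DNA is double stranded.
    return plus_strong_stop + minus_strand[1:]


dna = reverse_transcribe(RNA_GENOME)
ltr = dna[:3]  # the repeat found at both ends of the DNA
```

In this sketch `dna` begins and ends with the same three segments (U3, R, U5), which is the LTR, and it contains more segments than `RNA_GENOME`, mirroring the statement that the DNA is longer than the RNA from which it derives.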
Figure 1. Reverse transcription of the HIV-1 genome (with permission from Annual Review of Biochemistry). Each retroviral particle contains two copies of the RNA genome. Minus-strand DNA synthesis starts near the 5’ end of the plus-strand RNA genome using ...
Although there are complexities to the reverse transcription process that are beyond the scope of this review, a few additional points should be made here. As has already been mentioned, the end product of the reverse transcription process is the substrate for IN. As such, the ends of the linear viral DNA need to be relatively precise. As is shown in Figure 1, it is the RNase H cleavages that generate and remove the PPT primer that define the U3 end of the linear viral DNA, and it is the removal of the tRNA primer that defines the U5 end. Although RNase H has no mechanism that allows it to recognize specific sequences, it carries out these particular cleavage reactions with absolute specificity, and the ends of the linear viral DNA genome are defined to the exact nucleotide.
There is also the question of why virions contain two copies of the viral RNA genome instead of one. In theory, the reactions outlined in Figure 1 could be carried out with only one copy of the viral RNA genome. The obvious explanation is that, if there were only one copy of the viral RNA in a virion, a single break in the RNA would be fatal, preventing the synthesis of a complete DNA copy. However, if there is a second copy of the RNA, and minus-strand DNA synthesis is blocked by a break in the RNA template, synthesis can continue if minus-strand DNA synthesis is transferred to the second RNA genome. In fact, if two copies of the RNA genome are present, a complete DNA copy can be made even if both RNA genomes are extensively nicked, so long as there are no sites at which both RNA copies are nicked. If a virion is produced by a cell that contains only one integrated DNA genome, the two RNA copies will be identical (unless RNA polymerase makes an error), and the fact that, during minus-strand DNA synthesis, RT often shifts back and forth between the two RNA templates will have no significant consequences. However, if a host cell contains two different integrated viral DNA genomes, then virions can be produced that contain two related RNA genomes that have different sequences. This sets the stage for the generation of viral recombinants. If RT makes a double-stranded viral DNA by copying from two different RNA genomes, the resulting DNA will contain sequences that derive from both of the parental genomes. In HIV-1, recombination during reverse transcription is the rule, not the exception. This can have important consequences. For example, recombination can lead to the generation of viruses that are resistant to multiple drugs from parental viruses that are each resistant to only a single drug.
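The argument about nicked templates and recombination can be made concrete with a short sketch. It is a toy model under stated assumptions, not a description of the enzyme: each RNA genome is a string, a nick is a position at which that copy cannot be used as a template, and RT switches templates only when it meets a nick (in the virion, as noted above, RT switches often even without one). The function name `copy_with_switching`, the sequences, and the markers `X` and `Y` (standing for two different drug-resistance mutations) are all hypothetical.

```python
def copy_with_switching(templates, nicks, start=0):
    """Copy two co-packaged RNA genomes into one DNA, switching at nicks.

    templates: pair of equal-length strings (the two RNA genomes)
    nicks:     pair of sets of positions at which each genome is broken
    start:     index (0 or 1) of the template on which synthesis begins
    Returns (product, source), where source[i] records which template
    supplied position i, or None if both genomes are broken at one site.
    """
    current = start
    product, source = [], []
    # Minus-strand synthesis runs from the 3' end of the RNA toward the 5' end.
    for pos in reversed(range(len(templates[0]))):
        if pos in nicks[current]:
            other = 1 - current
            if pos in nicks[other]:
                return None          # no intact template at this site
            current = other          # transfer synthesis to the second genome
        product.append(templates[current][pos])
        source.append(current)
    product.reverse()
    source.reverse()
    return "".join(product), source


# Two parental genomes, each carrying one hypothetical resistance marker.
parent_a = "..X......."
parent_b = ".......Y.."

# Synthesis starts on parent_b, which is nicked at position 5.
recombinant = copy_with_switching((parent_a, parent_b), (set(), {5}), start=1)

# Both genomes broken at the same site: no complete copy can be made.
failed = copy_with_switching(("AAAA", "AAAA"), ({2}, {2}))
```

Here `recombinant` carries both markers although each parent had only one, and `failed` is `None` because no intact template exists at the shared nick. Restricting switches to nicks keeps the example short; a model closer to the text would also allow switches at intact positions.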