During proviral DNA synthesis, RT encounters duplex RNA, RNA/DNA hybrids, and duplex DNA of varying length and sequence composition, containing recessed 3’ or 5’ termini, 3’ or 5’ overhangs, nicks, gaps, and/or blunt ends. This section reviews the means by which the enzyme differentially recognizes, binds, and processes these nucleic acid variants in order to convert viral RNA into a pre-integrative DNA intermediate, with particular emphasis on RNase H-mediated processing of reverse transcription intermediates.
HIV-1 RT is a heterodimer of 66 and 51 kDa subunits (p66/p51). The larger of these houses an amino-terminal DNA polymerase domain composed of fingers, palm, and thumb subdomains, a central connection subdomain, and a carboxy-terminal RNase H domain, which collectively are supported by the smaller subunit. In co-crystal structures containing double-stranded DNA or an RNA/DNA hybrid, a bend of ~40° is imposed on the duplex ~6–8 bp from the primer terminus [8, 9, 12]. Although the DNA polymerase active site is located over the recessed 3’ terminus of the primer strand in these structures, precise positioning of catalytic and other functionally important residues varies with the composition of the co-crystal. The RNase H active center, in contrast, is located over the template strand 18 bp upstream, separated from the polymerase active center by ~70 Å. This mode of binding is also observed in chemical and enzymatic footprinting experiments on similar substrates [13, 14], and is essential for DNA synthesis on either an RNA or DNA template.
The same binding mode is required for “polymerization-dependent” RNase H activity, where the RNA template is partially hydrolyzed by RT during RNA-dependent DNA synthesis [15]. However, DNA synthesis and RNase H activities are not temporally coordinated under these conditions [16, 17]. Hydrolysis is irregular, occurring in a pattern dependent upon RNA sequence and structure, and correlating in part with pausing of the enzyme during DNA synthesis [16, 18]. At pause sites, or in the absence of DNA synthesis (which can be mimicked in vitro by excluding dNTPs from the reaction), cleavage of the RNA strand occurs 15–20 nt from the recessed 3’ primer terminus. These cuts are referred to as “3’-directed”, being largely dictated by polymerase domain binding at the primer terminus, and are consistent with the active site separation observed in X-ray crystal structures [12, 15, 17, 19–21].
As a consequence of partial cleavage that occurs during RNA-dependent DNA synthesis, RNA fragments of varying length remain hybridized to nascent DNA. These fragments, flanked by nicks and/or gaps of varying size, may be further processed by “polymerization-independent” RNase H activity via “5’-directed” [18, 19, 22] or “internal” cleavage [23–27]. Within these partially hybrid substrates, no recessed DNA 3’ (or 5’) termini are available for RT to recognize and/or bind. Instead, in the case of 5’-directed cleavage, the polymerase active site localizes over the DNA strand opposite the recessed RNA 5’ terminus, placing the RNase H active site for cleavage of the RNA strand 13–19 bp away. In contrast, hydrolytic events that occur at positions well removed from recessed termini of either strand are referred to as internal cleavages. In these cases, RNA sequence in the vicinity of potential cleavage sites is an important determinant of cleavage specificity and/or efficiency. Internal RNase H cleavage sequence preferences are summarized elsewhere [28]. Although 5’-directed and internal cleavage are less efficient than 3’-directed cleavage, all modes of hydrolysis are essential for removing the viral genome during and following (−) strand DNA synthesis.
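The distance rules quoted above for 3’-directed (15–20 nt) and 5’-directed (13–19 bp) cleavage can be restated as a small calculation. The following is a minimal sketch rather than a predictive model: the function name, the 0-based 5’→3’ RNA coordinate convention, and the treatment of a cleavage site as a single nucleotide index are all assumptions made for illustration, and the sequence- and structure-dependence of actual cleavage is ignored.

```python
# Illustrative sketch only. Offsets are the ranges quoted in the text; the
# coordinate convention (0-based RNA indices, numbered 5'->3') is an assumption.

THREE_PRIME_DIRECTED = (15, 20)  # nt from the recessed DNA primer 3' terminus
FIVE_PRIME_DIRECTED = (13, 19)   # bp from the recessed RNA 5' terminus


def cleavage_window(anchor, offsets, rna_length):
    """Return RNA indices inside the offset window that lie on the RNA.

    anchor     -- RNA index paired with the DNA primer 3' terminus
                  (3'-directed) or the index of the RNA 5' terminus
                  (5'-directed); a hypothetical convention.
    offsets    -- (min, max) distance from the anchor, inclusive.
    rna_length -- number of nucleotides in the RNA strand.
    """
    lo, hi = offsets
    return [i for i in range(anchor + lo, anchor + hi + 1) if 0 <= i < rna_length]


# 5'-directed: RNA fragment whose 5' terminus is index 0 of a 30-nt RNA.
print(cleavage_window(0, FIVE_PRIME_DIRECTED, 30))

# 3'-directed: primer 3' terminus paired with RNA index 10 of a 28-nt RNA;
# the part of the window lying beyond the RNA 3' end is dropped.
print(cleavage_window(10, THREE_PRIME_DIRECTED, 28))
```

In both modes the window extends toward the RNA 3’ end, because the hybrid region lies on that side of the anchoring terminus. An RNA shorter than the minimum offset yields an empty window in this sketch.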
Specialized hydrolytic events necessary for completion of proviral DNA synthesis require selection and removal of the (−) and (+) strand primers. In a 3’-directed cleavage event, the tRNALys3 primer is incompletely removed from the (−) strand DNA template by RT-associated RNase H following synthesis of (+) strand strong-stop DNA, i.e., a single 5’-terminal ribonucleotide remains attached to (−) strand DNA following cleavage [29]. Processing of the (+) strand primer is considerably more complex, but no less precise [30]. Following (−) strand DNA synthesis, two purine-rich segments of the HIV RNA genome, designated the central and 3’ polypurine tracts (c- and 3’-PPTs, respectively), are hydrolyzed at their 3’ and 5’ termini to generate the primers for (+) strand DNA synthesis [4, 31, 32]. Not only are these cleavage events precise, occurring at the rG-rA junction at the PPT 3’-termini, but resistance to internal hydrolysis is a consistent feature of these retroviral elements [33–35]. The critical 3’-terminal cleavage event, which defines the initiation site for (+) strand synthesis, is consistent with the rules and preferences governing both 5’-directed and internal cleavage. However, there is increasing evidence that PPT-containing RNA/DNA hybrids possess unique structural features that not only direct 3’-terminal cleavage [12, 36–44] but also promote (+) strand initiation [35, 44, 45]. Initiation of (+) strand synthesis occurs after the 5’ ribonucleotides immediately 3’ to the PPT have been “cleared” by RT-associated RNase H, and requires reorientation of the enzyme on the hybrid duplex (see below). After incorporating 12 nt of DNA into the nascent (+) strand, RT again assumes an RNase H role to remove the PPT [46], thus ensuring the appropriate 5’ LTR terminal sequence for recognition by the viral integration machinery. The enzyme dynamics required to catalyze these events are discussed in a subsequent section.
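As a bookkeeping aid, the position of an rG-rA junction at the 3’ end of a purine-rich stretch can be located in a sequence string. The sketch below is an assumption-laden illustration: the function name, the length threshold, and the toy sequence are invented for the example, and a pattern search of this kind says nothing about the structural features of PPT-containing hybrids that the text identifies as directing cleavage.

```python
import re


def ppt_3prime_junctions(rna, min_purines=8):
    """Locate the 3'-most G-A junction of each long purine stretch.

    For each stretch of consecutive purines (A or G), report the index of
    the A in its 3'-most G-A junction, provided at least `min_purines`
    purines (ending in that G) precede the A. The definition and the
    default threshold are illustrative assumptions, not rules from the text.
    """
    # Greedy purine run, then a G that is immediately followed by an A.
    pattern = re.compile(r"[AG]{%d,}G(?=A)" % (min_purines - 1))
    return [m.end() for m in pattern.finditer(rna.upper())]


toy_rna = "CUAAGAAGAGGGGACU"  # made-up sequence, not an HIV-1 PPT
site = ppt_3prime_junctions(toy_rna)[0]
# Show the sequence split at the junction (purine stretch | downstream RNA).
print(toy_rna[:site], "|", toy_rna[site:])
```

The returned index marks the first nucleotide downstream of the junction, so slicing the string at that index separates the putative primer from the RNA that would be cleared before (+) strand initiation.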
In exploring the interplay between the DNA polymerase and RNase H activities of RT and their various substrates, one issue that has recently received attention is whether the DNA polymerase and RNase H active sites of RT can simultaneously engage a hybrid substrate when the enzyme is bound in DNA synthesis/3’-directed RNase H cleavage mode. Unfortunately, existing X-ray crystal structures are inconclusive in this respect. In all liganded structures to date, the DNA polymerase active center is engaged at the primer 3’ terminus while the RNase H active site is not appropriately positioned over the scissile phosphate in the template strand [8, 9, 12], suggesting that the two activities are mutually exclusive. However, all but one of these co-crystals contain double-stranded DNA, which is not a substrate for RNase H. Furthermore, in the exception [12], RT is positioned for internal cleavage of a PPT/DNA hybrid, which would not normally be expected to occur, and the substrate itself is quite unusual relative to other RNA/DNA hybrids. A model generated by superposition of the latter structure with that of a human RNase H-RNA/DNA complex also suggests that the two activities are mutually exclusive [47], but suffers from the same limitations as its RT-PPT/DNA counterpart [12]. More recent biochemical studies have shown that locking the 3’ terminus of the primer at the polymerase active site, either in a ternary complex or with the inhibitor foscarnet, produces efficient RNase H cleavage, suggesting that HIV-1 RT can simultaneously engage its RNA/DNA substrate with both active sites [48, 49].