We report the molecular identification, cloning and initial biological characterization of 12 full-length HIV-1 subtype A, D and A/D recombinant transmitted/founder (T/F) genomes. T/F genomes contained intact canonical open reading frames and all T/F viruses were replication competent in primary human T-cells, although subtype D virus replication was more efficient (p<0.05). All 12 viruses utilized CCR5 but not CXCR4 as a co-receptor for entry and exhibited a neutralization profile typical of tier 2 primary virus strains, with significant differences observed between subtype A and D viruses with respect to sensitivity to monoclonal antibodies VRC01, PG9 and PG16 and polyclonal subtype C anti-HIV IgG (p<0.05 for each). The present report doubles the number of T/F HIV-1 clones available for pathogenesis and vaccine research and extends their representation to include subtypes A, B, C and D.
A globally effective human immunodeficiency virus type 1 (HIV-1) vaccine must prevent transmission of widely diverse virus strains including the less commonly studied subtypes A and D. Subtypes A and D account for approximately 12% and 2%, respectively, of all HIV-1 infections worldwide (Hemelaar et al., 2011), and together with their recombinant forms are endemic in East and Central Africa where they co-circulate in the same regional and ethnic populations (Lihana et al., 2006; Ssemwanga et al., 2012). Interestingly, there have been reports suggesting that distinct clinical outcomes may be associated with the two different virus subtypes. For example, compared to subtype A, subtype D infections have been associated with faster disease progression (Baeten et al., 2007; Kaleebu et al., 2002; Kiwanuka et al., 2008; Vasan et al., 2006) and a higher risk for developing HIV-1 associated neurocognitive disorders (Sacktor et al., 2009). Infection with subtype D has also been associated with preferential transmission of CXCR4 (X4) tropic and CXCR4/CCR5 (X4/R5) dual-tropic viruses (Church et al., 2008), an early co-receptor tropism switch from R5 to X4 (Kaleebu et al., 2007), a higher prevalence of X4 strains during chronic infection (Huang et al., 2007), and a faster rate of CD4+ T lymphocyte decline (Kiwanuka et al., 2010) compared with subtype A infection. Phenotypic properties of subtype D and A virus strains that might contribute to these clinical differences have not been identified. Thus, we reasoned that from both a vaccine and pathogenesis perspective, a precise molecular identification and biological characterization of molecularly cloned strains of transmitted/founder (T/F) subtype A, D and A/D viruses could be informative.
One limitation of previous studies aimed at characterizing genetic and biological properties of subtype A and D viruses has been the fact that subtype assignments, diversity measurements and biological analyses have frequently been conducted on subgenomic regions of these viruses rather than on full-length viral genes or genomes (Blish et al., 2007; Church et al., 2008; Haaland et al., 2009; Kaleebu et al., 2002; Provine et al., 2012; Redd et al., 2012). Such studies have been further confounded by the frequent occurrence of unique inter-subtype recombinants in vivo (Khoja et al., 2008; Lihana et al., 2006; Ssemwanga et al., 2012) as well as artifactual recombinants that may have been generated in vitro as a consequence of Taq polymerase DNA strand switching during the amplification of heterogeneous cDNA target sequences (Salazar-Gonzalez et al., 2008). There is the additional concern that viral sequences cloned from chronically infected patients might not reflect properties of viruses that result in virus transmission (Ochsenbauer et al., 2012; Parrish et al., 2012; Wilen et al., 2011) or could contain sporadic (and unidentifiable) mutations introduced by the most recent HIV-1 reverse transcriptase (RT) step, an RNA Pol II transcription error, or an MuLV RT error during the cDNA synthesis step prior to PCR amplification (Keele et al., 2008). To address these concerns, we sought to generate a panel of full-length molecularly cloned subtype A, D and A/D transmitted/founder genomes using single genome amplification (SGA) - direct amplicon sequencing (Keele et al., 2008; Lee et al., 2009; Salazar-Gonzalez et al., 2009), a strategy adapted from previously described single genome sequencing (SGS) methods for analyzing intact HIV-1 pro, pol and env genes (Palmer et al., 2005; Simmonds et al., 1990). 
Our laboratory’s innovation was to use SGS in the context of acute HIV-1 and simian immunodeficiency virus (SIV) infection, together with a mathematical model of random virus evolution, to infer the exact nucleotide sequences of T/F virus genomes (Keele et al., 2008; Keele et al., 2009; Lee et al., 2009; Salazar-Gonzalez et al., 2009).
We define a T/F genome as a viral sequence that is transmitted and gives rise to productive clinical infection (Keele et al., 2008; Salazar-Gonzalez et al., 2009). We note that one or more viruses might breach the cervicovaginal or rectal mucosa or otherwise be introduced into a naïve individual and fail to replicate or be extinguished due to early innate immune responses or early stochastic events in the transmission process (Pearson et al., 2011). Such viruses are of no consequence to the present study since they do not lead to productive clinical infection. We also note that a ‘transmitted’ virus genome and a ‘founder’ virus genome from which subsequent genomes evolve may or may not be identical. For example, in the case of HIV-1 or SIV, the transmitted viral RNA genome must first undergo reverse transcription in order to productively infect the first cell in the naïve host. It is thus possible that the DNA provirus in this cell differs from the infecting viral RNA genome by either point mutation(s) or recombination event(s). The former are expected to occur with a frequency of about 2.26 × 10^−5, or about 0.2 mutations per 10 kb genome per infection event (Keele et al., 2008; Lee et al., 2009; Mansky and Temin, 1995). The latter are also common and potentially significant when the diploid viral RNA genome is heterozygous (Keele et al., 2008; Keele et al., 2009; Lee et al., 2009). RT mediated vRNA strand transfers (recombination) generally occur multiple times with each reverse transcription event but are not apparent unless the virus genome is heterozygous or polymerase slippage occurs, resulting in sequence deletion or duplication (Levy et al., 2004). To account for these different possibilities, we use the term ‘transmitted/founder’ genome to describe the viral genome that is transmitted and leads to productive clinical infection.
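As a rough check of the arithmetic above, the per-genome expectation is simply the per-site rate multiplied by genome length. The sketch below assumes a round genome length of 10,000 nt and treats point mutations as independent (Poisson) events; the function names are illustrative and are not taken from the original analysis.

```python
import math

def expected_mutations(rate_per_site: float, genome_length: int) -> float:
    """Expected number of point mutations introduced in one infection event."""
    return rate_per_site * genome_length

def prob_no_mutation(rate_per_site: float, genome_length: int) -> float:
    """Poisson probability that the genome is copied with zero point mutations."""
    return math.exp(-expected_mutations(rate_per_site, genome_length))

RATE = 2.26e-5      # per-site frequency cited in the text
GENOME_NT = 10_000  # assumption: round 10 kb genome

mean = expected_mutations(RATE, GENOME_NT)
p0 = prob_no_mutation(RATE, GENOME_NT)
print(f"expected mutations per genome: {mean:.3f}")
print(f"probability of an unmutated copy: {p0:.2f}")
```

Under these assumptions the expectation is about 0.2 mutations per genome, so roughly four in five reverse transcription events would be expected to leave the founder sequence unchanged.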
We also make the important distinction between T/F virus genomes derived by single genome sequencing methods and that are identical to actual founding virus genomes at or near the moment of transmission, and “early” or “near-transmitted/founder” sequences derived by non-SGS techniques that are susceptible to Taq polymerase-mediated in vitro recombination, founder effects, and ambiguities arising from population sequencing, cloning-sequencing strategies, and in some studies, short-term virus culture effects (Aasa-Chapman et al., 2006; Blish et al., 2007; Coetzer et al., 2008; Derdeyn et al., 2004; Isaacman-Beck et al., 2009; Nedellec et al., 2009; Quakkelaar et al., 2007; Sagar et al., 2009; Sagar et al., 2003; Seaman et al., 2010).
To pursue the identification and cloning of full-length subtype A, D and A/D T/F virus genomes, investigators representing the Center for HIV/AIDS Vaccine Immunology (CHAVI), the International AIDS Vaccine Initiative (IAVI) and the Uganda Virus Research Institute/Medical Research Council/Wellcome Trust collaborated in a molecular survey of incident HIV-1 infections in Uganda, Kenya and Rwanda with the goal of identifying acutely infected subjects at early Fiebig stages (Fiebig et al., 2003). HIV-1 negative high risk commercial sex workers, men who have sex with men and HIV-1 serodiscordant cohabiting couples were serially tested for antibody seroconversion. Testing was monthly or quarterly, and those individuals who became infected were offered enrollment into IAVI protocol C, which called for frequent blood draws for HIV-1 research. Because identification of T/F viral sequences depends on a phylogenetic analysis of very early viral sequences that undergo essentially random diversification (i.e., prior to the onset of adaptive immune responses that select for escape mutants) (Goonetilleke et al., 2009; Keele et al., 2008; Salazar-Gonzalez et al., 2009), we tested plasma specimens corresponding to the first ELISA HIV antibody positive time point by western immunoblot so as to determine the clinicopathological stage of acute HIV-1 infection as described by Fiebig and colleagues (Fiebig et al., 2003). For subjects with later stages of acute infection (Fiebig stages V and VI), we tested the preceding antibody negative sample for HIV-1 vRNA, hoping to identify subjects in the earliest vRNA+/Ab− stages (Fiebig stage I–II).
The primary objectives of the present study were: (i) to determine if in the setting of observational field studies in Central and East Africa, we could identify patients and clinical samples amenable for a precise and unambiguous molecular identification of full-length T/F viral genomes; (ii) to molecularly clone and validate the replication competence of such T/F genomes and to assess their basic genetic and phenotypic characteristics; and (iii) to contribute subtype A and D clones and sequences to HIV/AIDS reagent and sequence repositories to encourage vaccine and pathogenesis research on these relatively neglected HIV-1 subtypes.
Between October 2004 and February 2009, 165 subjects participating in IAVI vaccine preparedness studies, 14 subjects in the Uganda Virus Research Institute/Medical Research Council/Wellcome Trust HIV high risk cohort, and two subjects from an HIV discordant couple study seroconverted to HIV-1 antibody positivity and were evaluated for inclusion in the current study. On the basis of specimen availability, serological data indicating very recent infection, virus load and specimen integrity, we selected 12 subjects for in-depth analysis. Six subjects were at Fiebig stage I–II (vRNA+/Ab−) and six had Fiebig stage III–IV infection. Eight subjects were female and four were male. Two subjects were from Rwanda and ten were from Uganda. Other demographic and clinical information is listed in Table 1.
SGA-direct amplicon sequencing was used to derive 160 5′ and 201 3′ overlapping half genome sequences from plasma viral RNA of 10 subjects. For two additional subjects, R880F and R463F, sequencing of twenty-two 9-kb single genome-derived amplicons was performed. Maximum likelihood trees of 5′ and 3′ half sequences from all 12 subjects showed that viral sequences from each subject formed monophyletic lineages with no interspersion of sequences (Figure 1A and B). Sequences from five subjects clustered with reference subtype A viruses in both 5′ and 3′ trees, and sequences from five other subjects clustered with reference subtype D viruses in both genomic regions. Recombination identification program (RIP) (http://www.hiv.lanl.gov/content/sequence/RIP/RIP.html) and REGA HIV-1 subtyping (http://dbpartners.stanford.edu/RegaSubtyping/) analyses of these sequences compared to A and D reference sequences showed them to be non-recombinants. The 5′ and 3′ half genome sequences of two subjects, 191947 and 191982, clustered inconsistently with subtype A and D reference sequences suggesting that these T/F viruses likely represented inter-subtype A/D recombinants. A formal recombination analysis of full-length genomes from these subjects revealed them to be unique complex A/D recombinants (see below).
In 11 of 12 subjects, maximum within-patient sequence diversities were low, ranging from 0.02% to 0.5% and 0.05% to 0.19% for the 5′ and 3′ half sequences, respectively (Table 2). Such low sequence diversity is generally indicative of single variant transmission based on model estimates and empirical observations of the rate of diversification of HIV-1 over the initial 100 days of infection (0.60%, 95% CI 0.54 – 0.68%) (Keele et al., 2008). However, transmission of two or more closely related viruses cannot be discriminated by this diversity cutoff value (Keele et al., 2008), and phylogenetic trees and Highlighter plots suggested that one of the 11 subjects (191084) with relatively homogeneous sequences was actually productively infected by a minimum of two closely related viruses. A twelfth subject (191727) had a high maximum within-patient diversity of 1.54%, and phylogenetic evidence indicated productive infection by at least three distinct T/F viral lineages (Fig. 1). We note that for the purposes of identifying examples of T/F full-length genomes, the numbers of sequences per subject that we analyzed (between 10 and 51 half-genomes) (Table 2) were sufficient to unambiguously infer consensus founder sequences. However, this number is not sufficient for estimating with precision the total number of T/F virus genomes per subject, and thus the numbers of T/F viruses that we identified (median = 1; mean = 1.25; range = 1–3) must be considered minimum estimates (Table 2).
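The maximum within-patient diversity discussed above can be illustrated as the largest pairwise difference among a subject's aligned sequences. The following is a minimal sketch under stated assumptions: it uses an uncorrected Hamming distance over ungapped positions, which may differ from the distance measure actually used in the study, and the toy sequences and function names are invented for illustration.

```python
from itertools import combinations
from typing import Sequence

def hamming_fraction(a: str, b: str) -> float:
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip alignment gaps (assumption: gaps are not scored)
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0

def max_pairwise_diversity(seqs: Sequence[str]) -> float:
    """Largest pairwise distance among all sequences from one subject."""
    return max((hamming_fraction(a, b) for a, b in combinations(seqs, 2)),
               default=0.0)

# Hypothetical 20-nt alignment: two variants each carry one private mutation.
toy = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGT",
    "ACGAACGTACGTACGTACGT",
    "ACGTACGTACGTACGTTCGT",
]
diversity = max_pairwise_diversity(toy)
print(f"maximum pairwise diversity: {diversity:.2%}")
```

In this toy alignment the two mutated variants differ from each other at two of 20 positions, which sets the maximum. With real half-genome alignments of several thousand nucleotides, the same calculation yields values on the scale of the percentages reported in Table 2.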
HIV-1 transmission at mucosal surfaces is characterized by a stringent population bottleneck that in most instances results in one or few variants out of many establishing productive clinical infection (Abrahams et al., 2009; Derdeyn et al., 2004; Haaland et al., 2009; Keele et al., 2008; Salazar-Gonzalez et al., 2009). The goal of the present study was to identify and clone one full-length T/F genome from each of the 12 subjects. Figures 2A – C show Highlighter plots of 5′ and 3′ half genome sequences from subjects 190049 and 181845 and near full-length sequences for subject R463F. For these three subjects, the proportion of sequences identical to the respective consensus sequence ranged from 30% (R463F – 5′ genome) to 67% (190049 – 3′ genome) and all variants differed from the respective consensus by just 1 to 3 randomly distributed mutations. Thus, for these three subjects, the consensus sequence is readily inferred to be the T/F sequence. Figure 2D shows a similar pattern of early virus diversification in subject R880FPL but with the added feature of a shared nucleotide polymorphism at position ~8900. Sequences from other subjects sometimes had shared polymorphisms in just two sequences (R880FPB; 191982; 191882) and sometimes more (191882; 191859) (see Fig. S1 in Supplementary material). Shared polymorphisms like these commonly arise due to early stochastic mutations that occur during the first one or few replication cycles following virus transmission (Lee et al., 2009), and it may not be possible to determine which variant was actually transmitted. Regardless, both sets of sequences are extremely closely related (generally differing by only one or few nucleotides per 10 kb), both are fit for replication, and both meet the operational definition of a T/F genome. For these subjects, we generally selected the overall consensus as the T/F virus genome, unless sequences were available from a sexual partner that indicated the likely transmitted allele.
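The consensus-based inference described above can be sketched as a majority-rule consensus over an alignment, followed by a count of how many sequences match it exactly and how many mutations separate each variant from it. This is an illustrative simplification and not the study's pipeline: the toy alignment and function names are hypothetical, and ties at a column are resolved arbitrarily in favor of the first base encountered.

```python
from collections import Counter
from typing import List, Sequence

def consensus(seqs: Sequence[str]) -> str:
    """Majority-rule consensus of equal-length aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    # For each alignment column, keep the most frequent character.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def mismatches_to_consensus(seqs: Sequence[str]) -> List[int]:
    """Number of positions at which each sequence differs from the consensus."""
    cons = consensus(seqs)
    return [sum(a != b for a, b in zip(s, cons)) for s in seqs]

# Hypothetical alignment: three identical sequences and two single-mutation variants.
toy = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGTACGTTC",
    "ACGTACGTAC",
]
inferred = consensus(toy)
diffs = mismatches_to_consensus(toy)
identical_fraction = diffs.count(0) / len(toy)
print(inferred, diffs, f"{identical_fraction:.0%} identical to consensus")
```

A shared polymorphism of the kind seen in subject R880FPL would appear here as two or more sequences carrying the same mismatch position, which a simple per-sequence count does not distinguish from private mutations; that case needs inspection of the alignment columns themselves.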
For some subjects, we had serial samples for analysis (191947; 191647). Sequential samples can reveal which sequences are best fit to replicate over time and oftentimes show sequence diversification suggesting immune selection (Goonetilleke et al., 2009; Salazar-Gonzalez et al., 2009). The latter changes can facilitate a distinction between a transmitted genome and mutant sequences arising from it.
Sequences from subject 9004SS were more complicated to analyze because there was a shared sequence polymorphism at position ~3650 in a substantial fraction of 5′ half genome sequences (Fig. 3). To determine which allele was transmitted and which was acquired, we examined sequences from an epidemiologically-linked sexual partner. We found that the minority sequence polymorphism present in three of eleven 5′ sequences in the initial 9004SS sequence set was the only detectable allele present in the sexual partner, thus allowing for an unambiguous determination of the T/F sequence as shown (Fig. 3). For subject 191084, there were several sets of shared mutations in the 3′ half-genome variant 1 sequences (boxed in Fig. 4A). These shared mutations are indicated in Fig. 4B where each set is observed to occur in one of two short stretches of 9 or fewer amino acids in Nef. Sequence toggles like these have been described previously and generally represent cytotoxic T-cell epitope escape mutations (Goonetilleke et al., 2009; Keele et al., 2008; Salazar-Gonzalez et al., 2009). By examining the timing and directionality of such mutations, it is possible to infer the likely T/F sequence and the evolving escape mutations derived from it. Finally, for subject SC191727 (Fig. 5), SGS allowed us to decipher at least three T/F lineages along with multiple unique inter-lineage recombinants, and from the predominant T/F lineage we could readily infer an unambiguous T/F sequence. However, this required that we amplify an additional ~3kb fragment spanning the pol and vpu genes to confirm the linkage of 5′ V1 and 3′ V1 consensus genomes (Fig. 5).
The 12 T/F genomes ranged in length from 9653 to 9854 nt (Table 2) and had intact open reading frames for the canonical HIV-1 genes gag, pol, pro, int, vif, vpr, vpu, tat, rev, env and nef. Five T/F genomes were entirely A subtype and five were entirely D subtype. Two genomes had evidence of recombination between A and D subtypes. We employed SimPlot and maximum likelihood phylogenetic analytical tools to characterize the recombination breakpoint locations to distinguish unique from circulating recombinant form viruses. Using the 2010 subtype reference sequences, both SimPlot and maximum likelihood phylogenetic analyses concurred in different subtype assignments for the sub-genomic segments of 191947 and 191982 T/F genomes (Fig. 6). Both genomes were found to be complex unique subtype A and D recombinant forms as shown. The two viruses each contained subtype A gp120 Env and Vif sequences, but otherwise contained essentially antithetical distributions of A and D sequences.
We also noted salient subtype specific genetic polymorphisms in the envelope gp41 and tat exon 2 among the T/F subtype A and D genomes. All five subtype D and one A/D recombinant (191982) T/F genomes had a conserved seven amino acid deletion in the α-helical amphipathic lentivirus lytic peptide (LLP) 2 domain (between gp41 amino acid positions 268 – 279; HXB2 numbering). One subtype A T/F genome, 191084, was distinct in having a seven amino acid truncation of the C-terminal LLP-2 domain at a position immediately adjacent and downstream of the subtype D deletion. Three of five subtype D (191727, 191647 and 191859) and one A/D recombinant (191982) genomes had a truncated tat exon 2, resulting in a loss of 16 amino acids, compared to none of five clade A genomes. V3 loop net charges of the subtype A, D and A/D T/F genomes at pH 7.5 were generally low, ranging from 0.8 to 3.8, typical of R5 tropic virus strains (Balasubramanian et al., 2012; De Wolf et al., 1994; Kaleebu et al., 2007).
T/F genomes were cloned into either pCR XL TOPO or pBR322 plasmid vectors at the restriction sites indicated (Table 2). The nucleotide sequences of the T/F clones were deposited in Genbank (accession numbers JX236668-JX236679) and at http://www.hiv.lanl.gov/content/sequence/HIV/USER_ALIGNMENTS/Baalwa/.
Although T/F HIV-1 genomes are expected to be replication competent since their sequences correspond to founder viruses that are responsible for productive clinical infection in humans, assaying virus replication is a key test of the T/F concept and the integrity of the T/F molecular virus clones. We thus tested the molecularly cloned viral genomic DNA corresponding to all 12 T/F viruses for replication competence in primary human CD4+ T-cells. T/F DNA was transfected and expressed in 293T cells and resulting virus was passaged cell-free onto activated primary human CD4+ T lymphocytes from three different normal donors (Figure 7A, B and C). All 12 viruses replicated relatively efficiently with kinetics generally comparable to the subtype B control virus SG3 and historical controls (Ochsenbauer et al., 2012; Salazar-Gonzalez et al., 2009) and with peak p24 antigen concentrations of 100 – 1000 ng/ml or higher. Interestingly, there was a statistically significant trend (p < 0.05) for subtype D viruses to replicate more rapidly and to higher titers compared with subtype A viruses (Figure 7D). This was a reproducible finding in different experiments, conducted on different days, at different multiplicities of infection (Fig. S2) and using CD4+ T-cells from different normal human donors.
We next tested T/F subtype A, D and A/D viruses for use of the chemokine coreceptors CCR5 and CXCR4, anticipating that there might be a predilection for T/F D subtype viruses to use CXCR4 based on previous literature reports of subtype D virus isolates (Huang et al., 2007; Kaleebu et al., 2007). Whereas AMD3100, a selective inhibitor of CXCR4 mediated entry, completely inhibited the entry of the control X4 tropic virus SG3, it had no effect on any of the 12 T/F viruses (Fig. 8A). Conversely, the CCR5 inhibitor TAK-779 inhibited the entry of the control R5 tropic virus YU2 and all T/F viruses. These findings suggested that CCR5 and not CXCR4 was the predominant co-receptor for virus entry by these T/F viruses. These findings were corroborated by analyses in NP2 cells that express CD4 and either CCR5 or CXCR4 but not both. Here again we found all 12 T/F viruses used CCR5 but not CXCR4 for cell entry (Fig. 8B).
Finally, we assayed the T/F viruses as IMCs or as Env-pseudotyped viruses for sensitivity to neutralization by pooled heterologous polyclonal HIV immunoglobulins from subtype B (HIVIG-B) or C (HIVIG-C) infected donors, by polyclonal IgG antibodies from each of three HIV-1 subtype C infected subjects previously shown to have broad and potent neutralizing antibodies, by the broadly neutralizing human monoclonal antibodies b12, VRC01, 2G12, 2F5, 4E10, PG9 and PG16, and by the fusion inhibitor T-1249 (Table 3). Interestingly, there was a trend for the subtype D virus Envs to be more resistant to HIVIG-C (P = 0.045) and HIVIG-B (P = 0.059) than were subtype A Envs. Similarly, four out of five subtype D Envs were not neutralized by any of the 3 heterologous IgG polyclonal antibodies (SA-C62, SA-C72 and SA-C74) at IgG concentrations as high as 2500 μg/ml, while all subtype A Envs were neutralized at concentrations (IC50) ranging from 226 to 1997 μg/ml (p = 0.139, 0.0007 and 0.052, respectively). The 12 T/F viruses were variably sensitive to the membrane proximal external region (MPER) specific mAbs 4E10 and 2F5. Interestingly, all five T/F subtype A viruses were sensitive to neutralization by VRC01 (mean IC50 = 1.75 μg/ml) and extremely sensitive to PG9 (mean IC50 = 0.05 μg/ml) and to PG16 (mean IC50 = 0.003 μg/ml). This was not the case for subtype D T/F viruses, most of which exhibited resistance to these mAbs with IC50 values exceeding 10 μg/ml (p = 0.049, 0.004 and 0.0001, respectively). Conversely, subtype A viruses were pan-resistant to b12 (IC50 > 10 μg/ml) whereas 3 of 5 subtype D viruses were sensitive (mean IC50 = 5.69 μg/ml; p = 0.058). The A/D recombinant viruses, which contain subtype A gp120 segments, exhibited neutralization patterns similar to subtype A T/F viruses. Subtype A, D and A/D viruses, like subtype B and C T/F viruses (Keele et al., 2008; Salazar-Gonzalez et al., 2009), were comparably sensitive to the gp41 reactive fusion inhibitor T-1249.
The concept of identifying full-length T/F virus genomes by SGS is a relatively new idea with numerous research applications in HIV/SIV research as well as for other viral systems. To our knowledge, there are only four reports in the scientific literature describing full-length HIV-1 T/F genomes and these are limited to subtypes B and C (Li et al., 2010; Ochsenbauer et al., 2012; Parrish et al., 2012; Salazar-Gonzalez et al., 2009). This approach has been widely adopted, since it makes possible a sensitive molecular probing and analysis of virus-host interactions operative at or near the moment of virus transmission, during the eclipse phase when virus replicates extensively but is not yet detectable in the plasma, and later when virus begins to diversify under host immune pressures (Bar et al., 2012; Goonetilleke et al., 2009). New insights obtained by this approach include: i) enumeration of T/F viral genomes responsible for productive clinical infection, which in heterosexuals is generally one, in homosexuals slightly higher, and in intravenous drug users higher still, sometimes involving 10 or more transmitted viruses (Bar et al., 2010; Keele et al., 2008; Li et al., 2010); ii) biological characterization of T/F subtype B and C viruses resulting from mucosal transmission, which revealed CCR5 but not α4β7 tropism (Parrish et al., 2012; Wilen et al., 2011) as well as occasional use of alternative (non-CCR5/CXCR4) co-receptors (Jiang et al., 2011); iii) antigenic characterization of T/F viruses revealing uniformly tier 2 or 3 level neutralization resistance typical of primary virus isolates (Keele et al., 2008; Li et al., 2010); iv) failure of subtype B or C T/F viruses to replicate efficiently in macrophages compared with primary CD4+ T lymphocytes (Li et al., 2010; Ochsenbauer et al., 2012; Salazar-Gonzalez et al., 2009); v) rapid evolution of CTL escape mutations (or reversions) away from the T/F proteome in the days and weeks following transmission followed weeks later by neutralizing antibody escape (Bar et al., 2012; Goonetilleke et al., 2009); and vi) immunologically-mediated sieving of viruses in phase III human vaccine trials (de Souza et al., 2012; Rolland et al., 2011). These findings highlight the enabling potential of T/F virus genome characterization and led us to pursue the analysis of T/F viruses for two less well studied HIV-1 subtypes A and D.
In our field survey of incident HIV-1 infections in Central and East Africa, we were able to identify discrete low diversity sequence lineages emanating from T/F viral genomes in each of 12 subjects in early stages of infection (Fiebig I–IV). Four of the earliest stage subjects (Fiebig I) had plasma viral loads between 7,000 and 53,000 per milliliter representing early ramp-up viremia. But even in Fiebig stage IV patients, we could identify T/F sequence lineages. Ten of 12 subjects had evidence of transmission of a minimum of one virus and two others had evidence of infection by at least two or three viruses. These are minimal estimates consistent with an extensive body of published data on hundreds of acutely infected subtype B and C subjects that suggest that the multiplicity of HIV-1 infection resulting from sexual transmission is generally one, although the range can reach as high as 10 or more (Abrahams et al., 2009; Haaland et al., 2009; Keele et al., 2008).
With the exception of shared stochastic mutations and one example of likely immune selection (Fig. 4; subject 191084), acute sequences emanating from discrete T/F genomes followed a pattern of essentially random variation with a near star-like phylogeny and a Poisson distribution of low frequency mutations indistinguishable from earlier findings with subtypes B and C (Li et al., 2010; Ochsenbauer et al., 2012; Salazar-Gonzalez et al., 2009). This observation supports the conclusion that the sequences identified in the present study as T/F genomes are genuine. Also consistent with this interpretation is the fact that all potential reading frames for all viruses were open, and each T/F virus was replication competent in activated human CD4+ T cells. Whether the modestly enhanced replicative properties of subtype D viruses compared with subtype A viruses contribute to the clinical observations of enhanced subtype D pathogenicity described previously will require much more detailed analyses of additional viruses and comparative studies of virus biology. One important direction for future work will be on macrophage versus lymphocyte tropism of subtype D and A viruses and the ability of these viruses to replicate in cells bearing low levels of CD4 or alternative coreceptors in order to explore potential predilections of subtype D viruses for neurovirulence. This work is beyond the scope of the present study but is a potentially fruitful area of research.
One of the interesting findings of this study was the subtype-associated genetic polymorphisms between subtype A and D T/F genomes. We found that all five clade D genomes and one A/D recombinant genome had a highly conserved seven amino acid deletion within the amphipathic alpha-helical LLP-2 domain of the envelope intra-cytoplasmic tail compared to the five clade A genomes. A comparison of T/F genomes with subtype A, B, C and D HIV-1 reference sequences from the HIV Sequence Compendium 2011 (Kuiken et al., 2012), revealed that all subtype B and D sequences, but not subtype A or C sequences, shared this highly conserved deletion. The LLP 1, 2 and 3 domains have been shown to modulate HIV-1 envelope expression, incorporation, viral infectivity and virally-mediated cell-cell fusion (Bhakta et al., 2011; Kalia et al., 2003; Lambele et al., 2007; Lu et al., 2008; Steckbeck et al., 2011; Wyss et al., 2005). Interestingly, truncations that shorten the LLP2 domain have been shown to enhance fusion efficiency (Wyss et al., 2005). We also found that three of five subtype D genomes and one A/D recombinant genome had a premature truncation of the second exon of tat resulting in the loss of sixteen amino acids. This tat polymorphism has been previously observed among some subtype B HIV-1 strains including HXB2, and there is conflicting evidence on the role of the second tat exon in viral replication (Guo et al., 2003; Mahlknecht et al., 2008; Neuveut et al., 2003). The other notable finding was the similarity in V3 loop positive charge between subtype A (mean = +3.0, SD = 1.1) and D (mean +2.6, SD = 0.8; p>0.5) T/F genomes, which differs from previous studies that suggested that subtype D strains were likely to possess higher V3 loop positive charge than subtype A (De Wolf et al., 1994; Kaleebu et al., 2007). 
However, the sample size in our study is too low to extrapolate our findings; instead, our study provides naturally-occurring, clinically relevant T/F viral reagents for further biological analysis.
The antigenic properties of subtype A and D viruses are of interest since a broadly effective vaccine will need to protect against infection by these genetically diverse virus strains. A principal observation was that compared to subtype A, subtype D Env pseudoviruses as a group tended to be more resistant to neutralization by heterologous pooled immunoglobulins (HIVIG-B and HIVIG-C) and by broadly neutralizing polyclonal IgG antibodies isolated from individual HIV-1 subtype C infected patients. This finding was intriguing and suggests that vaccine immunogens based on subtype B and C Envs might induce less effective immunity against subtype D viruses. The 12 T/F viruses were widely variable in their sensitivity to the MPER-specific mAbs 4E10 and 2F5, and this could not be simply explained by sequence variation in the canonical epitopes (Blish et al., 2007; Zwick et al., 2005). Interestingly, all five T/F subtype A viruses were extremely sensitive to PG9 and PG16, which was not the case for subtype D viruses, but again this could not be simply explained by sequence variation in these canonical epitopes (Walker et al., 2009). Conversely, subtype A viruses were pan-resistant to b12 whereas 3 of 5 subtype D viruses were sensitive. These neutralization profiles of subtype A, D, and A/D viruses may reflect the derivation of PG9 and PG16 mAbs from a subtype A donor and b12 from a subtype B donor, since subtypes B and D are more closely related phylogenetically than are subtypes B and A (Dwivedi and Sengupta, 2012).
Much remains to be explored concerning the transmission biology, immunopathogenesis, and prevention strategies for HIV-1 subtypes A and D. The molecular identification and cloning of full-length T/F HIV-1 genomes by SGS takes a step toward these goals by taking advantage of the natural virus population bottlenecking that occurs with virus transmission from one individual to the next, thus ensuring that the cloned genome represents a biologically and clinically relevant T/F virus unaltered by in vitro cultivation, selective amplification, or Taq polymerase, RNA Pol II or MuLV RT mutational errors. The 12 subtype A, D, and A/D T/F genomes reported here double the number of HIV-1 subtypes for which T/F genomes have been derived and double the number of T/F virus genomes available for analysis. This work can facilitate a systematic molecular analysis of genome-wide structure, function, antigenicity and immunogenicity of these clinically important viruses and virus subtypes. Beyond this, the identification and cloning of other highly variable single-stranded T/F RNA (or DNA) viruses including, for example, the Flaviviridae represent an important new research opportunity (Li et al., 2012).
Subjects were enrolled in IAVI-sponsored prospective vaccine preparedness cohort studies of HIV-1 antibody negative heterosexuals or men who have sex with men (Price et al., 2012; Price et al., 2011; Tang et al., 2011), in a Uganda Virus Research Institute/Medical Research Council/Wellcome Trust HIV-1 acquisition cohort study, and in a heterosexual sero-discordant couples cohort study in Rwanda (Derdeyn et al., 2004). Subjects were given HIV counseling, condom provision and regular HIV testing either monthly or quarterly. Those who seroconverted to HIV-1 were screened for stage of primary HIV-1 infection and those between Fiebig stage I to IV were included in the present study if sufficient cryopreserved plasma and/or peripheral blood mononuclear cells (PBMCs) were available for analysis and if viral loads and vRNA/cDNA integrity were sufficient for SGA analysis. The guideline for laboratory staging of primary HIV-1 infection (Fiebig staging) has been described elsewhere (Fiebig et al., 2003; Keele et al., 2008). All subjects gave informed consent and blood specimen collections were undertaken with institutional review board and other regulatory approvals.
Plasma viral RNA (vRNA) and proviral DNA were extracted from blood specimens using the QIAamp RNA or DNA Mini Kits (Qiagen, Valencia, CA) as previously described (Keele et al., 2008; Salazar-Gonzalez et al., 2009). Reverse transcription was performed in two steps using Superscript III (Invitrogen) according to the manufacturer’s instructions. First, 30μl of RNA were mixed with 0.5mM of each deoxynucleoside triphosphate (dNTP) and 0.25mM of reverse primer 1.R3B3R (5′-ACTACTTGAAGCACTCAAGGCAAGCTTTATTG-3′), reconstituted in a volume of 36μl and incubated at 65°C to denature RNA secondary structure. Second, 24μl of an RT reaction mix consisting of first strand RT buffer (1X), 5mM dithiothreitol, 2U/μl of RNase inhibitor (RNaseOUT) and 5U/μl Superscript III was added to the first mix, bringing the total reaction volume to 60μl, which was then incubated at 50°C for 80 min. The RT reaction was then terminated by incubation at 85°C for 5 min, followed by the addition of 3μl of RNase H and a final incubation at 37°C for 20 min to degrade the RNA strand of the RNA:cDNA hybrids. The reverse primer 1.R3B3R was used to synthesize cDNA templates for full-length and 3′ half genome amplification. To synthesize cDNA templates for amplification of the 5′ genomic half, the same RT protocol was followed but using B5R1 (5′-CTTGCCACACAATCATCACCTGCCAT-3′; nt 5,052–5,077) as the reverse primer. The synthesized cDNA was either used immediately or stored at −80°C.
cDNA was serially diluted with each dilution replicated in 12 wells and amplified by nested PCR. cDNA dilutions found to result in 30% or fewer PCR positive wells were adopted to obtain additional amplicons using 96-well reaction plates. Two approaches were used to derive single genome amplicons. In one approach, ~9 kb genomes extending from 5′ U5 to 3′ U3-R were amplified as previously described (Keele et al., 2008; Salazar-Gonzalez et al., 2009). Briefly, first round PCR was carried out in the presence of 1x Expand Long template buffer containing MgCl2 at a final concentration of 1.75 mM, 0.35 mM of each dNTP, 0.3 μM of each primer and 3.75 units/μl of Expand Long template enzyme mix (Roche) in a 50 μl reaction. First round primers were: forward-U5Cc (5′-CCTTGAGTGCTCTAAGTAGTGTGTGCCCGTCTGT-3′) and the reverse primer 1.R3B3R. The second round PCR primers were: forward-U5Cd (5′-AGTAGTGTGTGCCCGTCTGTTGTGTGACTC-3′) and reverse-2.3′3′plCb (5′-TAGAGCACTCAAGGCAAGCTTTATTGAGGCTTA-3′). PCR conditions were: 94°C for 2 min, followed by 10 cycles of 94°C for 15 s, 55°C for 30 s, and 68°C for 8 min, followed by 25 cycles of 94°C for 15 s, 55°C for 30 s, and 68°C for 8 min, with cumulative increments of 20 s at 68°C with each successive cycle and a final extension period of 10 min at 68°C. The first round PCR products were then utilized as templates for the second round PCR reactions under the same conditions. PCR products were examined on 1% agarose gels for amplicons >8 kb. The other approach consisted of amplifying overlapping 5′ (U5, gag-pol and vif) and 3′ (pol, vif, vpr, rev, vpu, tat, env, nef and U3-R) half genomes. Amplification reactions were carried out in the presence of 1x High Fidelity Platinum Taq PCR buffer, 0.2 μM of each primer, 2 mM MgSO4, 0.2 mM of each deoxynucleoside triphosphate and 0.025 units/μl of Platinum Taq High Fidelity polymerase in 20 μl reactions (Invitrogen, Carlsbad, CA).
The first round PCR conditions were 94°C for 2 min, followed by 35 cycles of 94°C for 20 s, 55°C for 30 s, and 68°C for 5.50 min, followed by a final extension of 68°C for 5 min. For the second round of PCR, the total number of cycles was increased to 37 and the annealing temperature changed from 55°C to 58°C. The 5′ (1.U5.B1F/B5R1 and 2.U5.B4F/B5R2) and 3′ (B3F1/1.R3B3R and B3F3/2.R3B6R) half genome primers used are described elsewhere (Salazar-Gonzalez et al., 2009). The amplicons were examined on precast 1% agarose 96-well E-gels. For both approaches, the first 30 nucleotides on the 5′ end of the ~9 kb/5′-half genomes and the last 30 nucleotides on the 3′ end of the ~9 kb/3′-half genomes corresponded to internal nested primers U5Cd/U5.B4F and 2.3′3′plCb/1.R3B6R, respectively, used during the second round of PCR amplification and were thus primer derived. A separate PCR reaction was therefore carried out to determine the precise viral nucleotides at these positions utilizing the same cDNA aliquots used to obtain 5′ half amplicons and primers encompassing a ~0.4-kb region from the 5′ R (5′-GGTCTCTCTGGTTAGACCAGAT-3′; nt 455–476) to gag (5′-TCCAGCTCCCTGCTTGCCCATACTA-3′; nt 914–890). This enabled the determination of the complete viral LTR sequence (U3-R-U5), and subsequently, the complete 5′ half (U3-R-U5, gag-pol and vif) and 3′ half (pol, vif, vpr, rev, vpu, tat, env, nef and U3-R-U5) genomes. Thus, the full-length HIV-1 genomes (U3-R-U5-HIV genes-U3-R-U5) could be ascertained.
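The criterion of 30% or fewer PCR positive wells can be motivated by Poisson statistics for template distribution at limiting dilution. The following is a minimal sketch, assuming that amplifiable cDNA templates are distributed independently among wells and that every well containing at least one template scores positive; the function name is illustrative.

```python
import math


def single_template_fraction(positive_fraction: float) -> float:
    """Expected fraction of PCR-positive wells that hold exactly one template.

    Assumes the number of templates per well is Poisson distributed, so that
    P(well positive) = 1 - exp(-lam), where lam is the mean templates per well.
    """
    if not 0.0 < positive_fraction < 1.0:
        raise ValueError("positive_fraction must be strictly between 0 and 1")
    lam = -math.log(1.0 - positive_fraction)   # mean templates per well
    p_exactly_one = lam * math.exp(-lam)       # Poisson probability of k = 1
    return p_exactly_one / positive_fraction   # condition on the well being positive


# At 30% positive wells, roughly 0.83 of positive wells are expected
# to derive from a single template under these assumptions.
fraction_at_30_percent = single_template_fraction(0.30)
```

Lower positive fractions raise the expected proportion of single-template wells further, at the cost of more reactions per amplicon obtained.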
DNA sequencing and sequence alignments were performed as previously described (Bar et al., 2010; Keele et al., 2008; Keele et al., 2009; Li et al., 2010; Salazar-Gonzalez et al., 2009). All sequences are available under GenBank accession nos. JX202785-JX203216, JX236668-JX236679 and JX877476-JX877521. Alignments are available at http://www.hiv.lanl.gov/content/sequence/HIV/USER_ALIGNMENTS/Baalwa/.
Sequences were aligned using ClustalW version 2.11 (Larkin et al., 2007). Phylogenetic trees were constructed by maximum-likelihood estimation using PhyML version 3.0 (Guindon et al., 2010) or by the neighbor joining method using ClustalW (Larkin et al., 2007). Maximum within patient sequence diversity was determined using Phylogenetic Analysis Using Parsimony (PAUP) version 4.0 (Rogers and Swofford, 1999). Subtype assignments were based on the REGA (www.dbpartners.stanford.edu/RegaSubtyping/) and RIP (www.hiv.lanl.gov) tools and maximum likelihood phylogenetic tree analyses. Recombination breakpoint locations were identified by SimPlot (Lole et al., 1999). Highlighter plots (www.hiv.lanl.gov) were used to display nucleotide substitutions in the viral quasispecies as compared to the T/F sequence. T/F sequences were inferred based on a model of early random virus evolution as previously described (Keele et al., 2008; Lee et al., 2009; Ochsenbauer et al., 2012; Salazar-Gonzalez et al., 2009).
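The comparison underlying a Highlighter-style display can be illustrated with a short sketch. It assumes pre-aligned sequences of equal length and uses a simple per-column majority consensus as a stand-in for the T/F sequence; this is a simplification and does not implement the model of early random virus evolution cited above or the Highlighter tool itself. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_seqs):
    """Most common character in each column of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_seqs))


def substitutions(master, seq):
    """List (1-based position, master base, query base) wherever seq differs from master."""
    if len(master) != len(seq):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, m, s) for i, (m, s) in enumerate(zip(master, seq)) if m != s]


# Toy alignment (hypothetical data): two sequences match the consensus,
# and two each carry a single substitution.
toy_alignment = ["ACGTACGT", "ACGTACGT", "ACGTTCGT", "ACGAACGT"]
inferred_master = majority_consensus(toy_alignment)
marks = [substitutions(inferred_master, s) for s in toy_alignment]
```

Each entry of `marks` corresponds to the set of marks that would be drawn on one sequence line of such a plot.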
To obtain IMCs corresponding to T/F viruses from subjects 191845, 9004SS, 190049, 191647, 191859, 191882, 191727, 191947 and 191982, each inferred T/F full-length genome was chemically synthesized as three overlapping fragments containing unique restriction sites to allow concatenation (Blue Heron Biotechnology). Additional restriction enzyme cleavage sites were added to the 5′ and 3′ ends of the outermost fragments to facilitate insertion into an appropriate cloning vector (pCR XL TOPO or pBR322). This was followed by a one-step ligation of the three viral DNA fragments into a cloning vector using T4 DNA ligase (New England Biolabs Inc.).
To construct IMCs of T/F viruses from subjects R880F, R463F and 191084, overlapping 5′ and 3′ fragments containing complete LTR elements at both genomic ends were amplified using proviral DNA as the template. The antisense primer for the 5′ fragment and the sense primer for the 3′ fragment were designed to lie downstream and upstream, respectively, of a unique restriction enzyme site to allow for generation of overlapping fragments that encompass this site. The sense primer for the 5′ fragment was designed to exactly anneal to the beginning of the U3-R-U5 5′ LTR and the antisense primer of the 3′ fragment to exactly begin at the end of the U3-R-U5 3′ LTR. Fragments were amplified using bulk PCR as previously described (Salazar-Gonzalez et al., 2009). Amplification products were analyzed for correct size by 1% agarose gel and the desired amplicons subjected to another PCR reaction in 1x buffer, 0.2 mM of each dNTP and Platinum Taq High Fidelity DNA polymerase (Invitrogen) at 94°C for 2 min followed by a single extension at 68°C for 5 min to add 3′ adenine overhangs. Products were then purified by gel extraction (Qiagen) with each genomic half being eluted independently and T/A ligated into the pCR-XL TOPO vector (Invitrogen). The ligation reactions were used to transform TOP-10 cells, which were then plated on LB agar media containing 50μg/ml kanamycin and cultured overnight at 30°C. For each genomic fragment, 20 to 100 single colonies were selected and grown overnight at 30°C with constant shaking at 225rpm. Plasmid DNA with 5′ and 3′ fragments identical to the T/F sequence were excised and linearized to allow for a one step 3-piece DNA ligation with the appropriate vector.
Infectious molecular clones were assessed for replication competence, co-receptor usage and susceptibility to neutralization by monoclonal and polyclonal antibodies and fusion inhibitors. Viral stock generation, titrations, cell infections and neutralization assays were carried out using 293T cells, TZM-bl cells and activated primary human CD4+ T-cells, as previously described (Decker et al., 2005; Keele et al., 2008; Salazar-Gonzalez et al., 2009). TZM-bl cells were obtained from the NIH AIDS Research and Reference Reagent Program (catalogue #8129). Coreceptor usage was determined using two different approaches. First, replication competent IMC derived viruses were used to infect TZM-bl cells in the presence of neither, either or both of the CCR5 and CXCR4 selective inhibitors, TAK-779 and AMD3100 (NIH AIDS Research and Reference Reagent Program catalogue #4983 and #8128). Second, use of CCR5 but not CXCR4 was confirmed by Env pseudovirus infection of NP2 cells expressing CD4 and neither or either CCR5 or CXCR4. Neutralizing antibody assays were conducted independently in two laboratories at the University of Alabama at Birmingham (J.B.) and Duke University (D.C.M.) using both IMCs and Env-pseudotyped HIV-1 virus stocks as previously described (Keele et al., 2008; Salazar-Gonzalez et al., 2009; Seaman et al., 2010; Wei et al., 2003). HIVIG-B is a pool of purified IgG from HIV-1 subtype B chronically infected subjects. HIVIG-C is a pool of purified IgG from six South African subjects chronically infected by HIV-1 subtype C. High-titer, broadly neutralizing serum IgG was also purified from three South African subtype C infected subjects (SA-C62, SA-C72 and SA-C74) not included in the HIVIG-C pool and these were tested individually.
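The logic of the inhibitor-based co-receptor assay can be sketched as a simple decision rule. This is an illustration only: the 50% residual-infection cutoff, the function name and the input values are assumptions and are not taken from the study, which does not state a numerical threshold in this section.

```python
def classify_coreceptor_use(no_inhibitor, tak779, amd3100, both, blocked_below=0.5):
    """Classify co-receptor use from infection signals under four conditions.

    Inputs are reporter signals (for example, luciferase units) measured with
    no inhibitor, with TAK-779 (CCR5 inhibitor), with AMD3100 (CXCR4 inhibitor),
    and with both. A condition is called "blocked" when its signal falls below
    blocked_below times the no-inhibitor signal (hypothetical cutoff).
    """
    if no_inhibitor <= 0:
        raise ValueError("no_inhibitor signal must be positive")
    ccr5_blocked = tak779 / no_inhibitor < blocked_below
    cxcr4_blocked = amd3100 / no_inhibitor < blocked_below
    both_blocked = both / no_inhibitor < blocked_below

    if ccr5_blocked and not cxcr4_blocked:
        return "CCR5"
    if cxcr4_blocked and not ccr5_blocked:
        return "CXCR4"
    if both_blocked and not ccr5_blocked and not cxcr4_blocked:
        return "dual (CCR5/CXCR4)"
    return "indeterminate"


# Hypothetical readings: infection is abolished by TAK-779 but not by AMD3100.
example_call = classify_coreceptor_use(100000, 3000, 98000, 2500)
```

A virus that infects in the presence of either inhibitor alone but not both is classified as dual-tropic under this rule, which mirrors the "neither, either or both" design of the assay.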
Highlighter (www.HIV.LANL.gov) plots of 5′ and 3′ half genomes from subjects 191982 (A), 191882 (B), 191859 (C), 191947 (D) and 191647 (E). Tick marks represent nucleotide substitutions as compared to the top-most T/F sequence in each plot. Month 3 is post-seroconversion.
Replication kinetics of clade A and D T/F viruses in the same donor CD4 T cells are shown for infections at two different multiplicities of infection (MOI), 0.5 (upper panel) and 0.1 (lower panel).
We thank Frederic Bibollet-Ruche, Katharine Bar and Jesus Salazar-Gonzalez for critical readings of early versions of the manuscript; Maria Salazar for technical assistance; Lynn Morris and Advanced Bioscience Laboratories for providing HIVIG-C IgG; the clinical and immunology cores of IAVI and the Center for HIV/AIDS Vaccine Immunology; IAVI-affiliated investigators, clinicians and study participants; and Jamie C. White and Patricia Crystal for manuscript preparation. This work was supported by grants from the NIH (AI067854 and AI100645), the Bill and Melinda Gates Foundation (37874), and the United States Agency for International Development (GPO-A-00-06-00006-00). The study and report are the responsibility of the study authors and do not necessarily reflect the views of USAID or the United States Government.
The authors have no competing interests to declare.