We aimed to examine the evolution of HIV-1 pol and env genomic sequences derived from serial patient plasma specimens. To that end, it is useful to study DNA fragments spanning pol through gp120 obtained from individual viral variants. We developed a limiting dilution approach to amplify individual HIV-1 variants from plasma, and validated its ability to isolate single viral variants for PCR amplification. We designed a 6.6 kb RT-PCR method that amplified the HIV-1 genome spanning pol through gp120 (see Materials and Methods).
We obtained HIV-1 strains from serial plasma specimens from four viremic subjects in the Women’s Interagency HIV Study (WIHS), a longitudinal investigation of HIV-1 infection of women (Anastos et al., 2000). At the start of the study, three of the four subjects (WC2, WC4, WC51) were untreated, and one (WC9) had received a two-drug antiretroviral regimen. At the outset none of the subjects had evidence of high level multidrug resistance, but over time they developed changes in HIV-1 drug resistance patterns following initiation or change of ART. The antiretroviral regimen varied over time within each patient and differed between patients.
Sequence analysis of each variant was first performed in the V3 region of gp120 in order to predict its tropism (Hung et al., 1999; Resch et al., 2001; Kemal et al., 2007; Cardozo et al., 2007). A total of 122 viral variants with predicted tropism were obtained from the four subjects. The sequences of pol (protease and reverse transcriptase) and gp120 were determined from each variant, and several variants were sequenced entirely from pol through gp120. Drug resistance-associated mutations were determined based on the protease and reverse transcriptase (RT) sequences (Johnson et al., 2007; http://hivdb.stanford.edu).
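As an illustration of how tropism can be predicted from V3, the sketch below applies the widely used "11/25" heuristic, in which a basic residue (R or K) at position 11 or 25 of the V3 loop predicts X4 use. This is a simplified stand-in and not necessarily the prediction scheme used in this study, which relied on the cited methods. The function name and the two example sequences are hypothetical, and the sketch assumes an ungapped 35-residue V3 loop.

```python
# Simplified 11/25 heuristic for V3-based tropism prediction.
# Assumption: input is an ungapped 35-residue V3 amino acid sequence.
BASIC_RESIDUES = {"R", "K"}


def predict_tropism_11_25(v3_aa: str) -> str:
    """Return 'X4' if V3 position 11 or 25 (1-based) is basic, else 'R5'."""
    if len(v3_aa) != 35:
        raise ValueError("this sketch assumes an ungapped 35-residue V3 loop")
    pos11, pos25 = v3_aa[10], v3_aa[24]
    if pos11 in BASIC_RESIDUES or pos25 in BASIC_RESIDUES:
        return "X4"
    return "R5"


# Illustrative sequences only (not taken from the study subjects).
r5_like = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
x4_like = r5_like[:10] + "R" + r5_like[11:]  # basic residue placed at position 11

print(predict_tropism_11_25(r5_like))
print(predict_tropism_11_25(x4_like))
```

Real V3 sequences often carry insertions or deletions, so positions would need to be assigned from an alignment before a rule of this kind is applied.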
The virologic and clinical characteristics of subjects WC4 and WC2 at each timepoint are shown in (for WC4) and (for WC2). Two other tables, for subjects WC9 and WC51, are shown in the Supplementary Material. The tables indicate at each timepoint the plasma HIV-1 RNA load, CD4+ T cell count, ART prescribed, the number of viral variants predicted to use R5 or X4, and the drug resistance patterns. For each variant studied, the sample identifier indicates the patient’s identifier, date, and variant number. The genotypic resistance-associated mutations and resistance interpretations are shown in (Johnson et al., 2007; http://hivdb.stanford.edu). A total of 25 timepoints were studied from the four subjects. The subjects were viremic at every timepoint, even though ART was prescribed at almost all timepoints. The plasma viral load ranged from 3.68 to 6.36 log10 HIV RNA copies/ml. The tables illustrate that at the start of the study all four subjects had a mixture of R5 and X4 strains of HIV-1, and there was no evidence of high level drug resistance-associated mutations in pol. Over time, however, multidrug resistance-associated mutations emerged in some variants from all subjects.
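The sample identifier convention can be made concrete with a short parsing sketch. It assumes identifiers of the form seen in this section, such as WC4P0397-8 (patient WC4, March 1997, variant 8). The regular expression, the class and function names, and the mapping of two-digit years to the 1900s are assumptions made for illustration.

```python
import re
from typing import NamedTuple


class VariantId(NamedTuple):
    patient: str
    month: int
    year: int
    variant: int


# Assumed layout: <patient>P<MM><YY>-<variant number>, e.g. WC4P0397-8.
_ID_PATTERN = re.compile(r"^(WC\d+)P(\d{2})(\d{2})-(\d+)$")


def parse_variant_id(identifier: str) -> VariantId:
    """Split a sample identifier into patient, month, year, and variant number."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    patient, mm, yy, variant = match.groups()
    # Assumption: all two-digit years in this data set fall in the 1900s.
    return VariantId(patient, int(mm), 1900 + int(yy), int(variant))


print(parse_variant_id("WC4P0397-8"))
```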
In total, 122 variants, 6.6 kb in size, were characterized for predicted tropism and genotypic resistance from serial specimens from the four patients. Table 2 presents the results from each patient, indicating the number of variants of each tropism, R5 or X4, and the percentage of variants of each tropism that had high level genotypic drug resistance to any antiretroviral drug that targets pol. There were 55 strains from the four subjects that were predicted to be X4-tropic, 71% of which had high level resistance mutations. There were 67 R5 variants from the four subjects, and 69% of them had high level resistance mutations. No significant association between resistance mutations in pol and viral tropism was found using Fisher’s exact test and the binomial test for proportions.
Table 2. Summary of tropism and resistance.
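The test of association between tropism and resistance can be sketched as a two-sided Fisher's exact test on a 2x2 table. The counts used here are back-calculated from the reported percentages (71% of 55 X4 variants is about 39; 69% of 67 R5 variants is about 46) and are therefore approximations, not the tabulated data. The function name is hypothetical, and the test is implemented directly from the hypergeometric distribution so that the example has no external dependencies.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Approximate counts reconstructed from the reported percentages (assumption).
x4_resistant, x4_not = 39, 55 - 39
r5_resistant, r5_not = 46, 67 - 46

p_value = fisher_exact_two_sided(x4_resistant, x4_not, r5_resistant, r5_not)
print(f"two-sided p = {p_value:.3f}")
```

With proportions this close (about 71% versus 69%), the resulting p-value is far above conventional significance thresholds, in line with the reported lack of association.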
The lack of association between resistance mutations and tropism prompted us to examine the sequences of pol and gp120 by using computational methods. and present phylogenetic trees of the protease and RT genes, and gp120 from patients WC4 and WC2 respectively. The entire 6.6kb pol-gp120 sequences obtained from patients WC4 and WC2 were also analyzed phylogenetically.
illustrate phylogenetic trees of the protease-RT and gp120 sequences respectively, from patient WC4. The branches in red indicate HIV-1 variants predicted to be R5-tropic based on the V3 sequence; branches in blue indicate X4-tropic variants. The dates of the serial sequences are indicated for each variant in its identifier. Each branch represents a variant that is listed in . The pol sequences seen in were intermingled with regard to tropism, displaying a pattern of variation independent of tropism. illustrates the phylogenetic tree of gp120 from patient WC4; it includes the V3 sequence, and shows clustering by tropism. The envelope sequences still clustered by tropism when the V3 sequences were removed to guard against bias from convergent evolution (data with V3 sequences deleted not shown).
illustrate the trees from protease-RT and gp120 respectively, from patient WC2, with each branch representing a variant listed in . As with WC4, the protease-RT sequences from WC2 showed a pattern of variation independent of tropism (), while the gp120 sequences clustered according to tropism (). The gp120 trees of all four patients generally demonstrated clustering of sequences by tropism, and all gp120 trees shown included V3 sequences. Tree topology of gp120 was similar for all subjects when the V3 loop was deleted (data not shown). The phylogenetic trees of pol and gp120 from patients WC9 and WC51 showed patterns similar to those of the other patients and are shown in the Supplementary Material. We tested statistically for compartmentalization (clustering) by using the Slatkin-Maddison test and obtained p < 0.001 for compartmentalization.
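The logic of the Slatkin-Maddison test can be sketched as follows: count the minimum number of state changes (here, switches between predicted R5 and X4) needed to explain the tip labels on a tree, then compare that count with counts obtained after randomly reassigning the labels to tips. The toy tree, tip names, number of permutations, and function names below are hypothetical. The tree is represented as nested tuples rather than parsed from a file, and the software used in the study is not specified here.

```python
import random


def fitch_changes(tree, labels):
    """Return (possible states, minimum changes) for a nested-tuple binary tree."""
    if isinstance(tree, str):
        return {labels[tree]}, 0
    left_set, left_n = fitch_changes(tree[0], labels)
    right_set, right_n = fitch_changes(tree[1], labels)
    common = left_set & right_set
    if common:
        return common, left_n + right_n
    # No shared state: one change is required at this node.
    return left_set | right_set, left_n + right_n + 1


def tips(tree):
    """List tip names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return tips(tree[0]) + tips(tree[1])


def slatkin_maddison(tree, labels, n_perm=1000, seed=0):
    """Permutation p-value for having as few or fewer changes than observed."""
    rng = random.Random(seed)
    names = tips(tree)
    observed = fitch_changes(tree, labels)[1]
    states = [labels[name] for name in names]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(states)
        if fitch_changes(tree, dict(zip(names, states)))[1] <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy tree in which the R5 and X4 tips form two separate clades.
toy_tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
toy_labels = {"a": "R5", "b": "R5", "c": "R5", "d": "R5",
              "e": "X4", "f": "X4", "g": "X4", "h": "X4"}

observed_changes, p = slatkin_maddison(toy_tree, toy_labels)
print(observed_changes, p)
```

A small p-value indicates that the tips cluster by label more strongly than expected by chance, which is the sense in which gp120 sequences were compartmentalized by tropism.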
The results of the phylogenetic analyses ( and ) and the lack of association of high level drug resistance mutations with predicted tropism () raised the question of whether recombination occurred between viral strains within individual patients. Identification of intrapatient HIV-1 recombinants has proved challenging due to the relatedness of the variants (Kemal et al., 2003; Fang et al., 2004; Philpott et al., 2005). Extensive sequence analysis of individual variants and advanced computational methods have now made it more feasible. To identify putative recombinants and the most likely parental sequences, three computational methods were employed:
- The Bayesian dual multiple change-point framework (Minin et al., 2005) allows for varying evolutionary rates, selective pressures, and phylogenetic trees along the sequence alignment, while providing estimates that include the site and number of recombination events.
- SimPlot (Lole et al., 1999) plots the similarity of a sequence under investigation to related sequences in a sliding window along the alignment.
- The GARD method uses a genetic algorithm to search a multiple-sequence alignment for putative recombination breakpoints (Pond et al., 2006).
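The sliding-window idea behind a similarity plot can be sketched in a few lines. This is a simplified illustration that uses raw percent identity on pre-aligned sequences; it is not the SimPlot program itself, which offers distance corrections and other options. The function name, window and step sizes, and toy sequences are hypothetical.

```python
def window_similarity(query: str, reference: str, window: int = 200, step: int = 20):
    """Return (window midpoint, fraction identical) pairs along an alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        compared = matches = 0
        for q, r in zip(query[start:start + window], reference[start:start + window]):
            if q == "-" or r == "-":
                continue  # skip alignment gaps
            compared += 1
            matches += q == r
        if compared:
            points.append((start + window // 2, matches / compared))
    return points


# Toy example: a recombinant built from the first half of one parent
# and the second half of the other.
parent_a = "ACGT" * 10
parent_b = "TGCA" * 10
recombinant = parent_a[:20] + parent_b[20:]

sim_to_a = window_similarity(recombinant, parent_a, window=10, step=10)
sim_to_b = window_similarity(recombinant, parent_b, window=10, step=10)
print(sim_to_a)
print(sim_to_b)
```

In a plot of these values, the point where the two similarity curves cross suggests the approximate location of a breakpoint.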
presents an analysis of a putative recombinant, the sequence of an HIV-1 strain obtained from the plasma of patient WC4 in March, 1997 (WC4P0397-8: X4-tropic). A table in the figure () shows the most likely parental sequences that gave rise to the recombinant; they were identified as coming from that patient in September, 1995 (WC4P0995-5: X4-tropic), and August, 1996 (WC4P0896-5: R5-tropic). Viral tropism and drug resistance-associated mutations are also shown in the table (). The recombinant and the parental sequences had very few or no drug resistance-associated mutations in the protease and RT genes. shows the Bayesian dual change-point analysis with recombination breakpoints in gp120. In addition, the sequence was identified as a recombinant by using GARD (data not shown). The recombination breakpoints that were identified by Bayesian analysis were also supported by informative site analysis (). Phylogenetically informative sites are plotted by color indicating which parental strains have the highest posterior support (). presents a SimPlot analysis of the same putative recombinant variant, WC4P0397-8. The SimPlot supports the view that recombination occurred in gp120, including the V3 region, from parental sequences that differed in tropism. The recombination resulted in a change in predicted tropism compared to the tropism of one of the two most likely parental sequences.
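A simple form of informative site analysis can be sketched as follows: at each aligned position where the two candidate parental sequences differ, record which parent the putative recombinant matches, then look for positions where that support switches from one parent to the other. This is a deliberately reduced version of the analysis described above, which assigned support from posterior probabilities. The function names and toy sequences are hypothetical.

```python
def informative_sites(recombinant: str, parent_a: str, parent_b: str):
    """Return (1-based position, 'A' or 'B') where the parents differ and one matches."""
    if not (len(recombinant) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for pos, (r, a, b) in enumerate(zip(recombinant, parent_a, parent_b), start=1):
        if "-" in (r, a, b) or a == b:
            continue  # gaps and sites shared by both parents are uninformative
        if r == a:
            sites.append((pos, "A"))
        elif r == b:
            sites.append((pos, "B"))
    return sites


def support_switches(sites):
    """Return position intervals between which parental support changes."""
    return [(p1, p2) for (p1, s1), (p2, s2) in zip(sites, sites[1:]) if s1 != s2]


# Toy alignment: the recombinant follows parent A, then parent B.
parent_a = "AAAAAAAAAA"
parent_b = "AGAAGAAGAG"
recombinant = "AAAAAAAGAG"

sites = informative_sites(recombinant, parent_a, parent_b)
print(sites)
print(support_switches(sites))
```

Each interval returned by `support_switches` brackets a candidate breakpoint; with closely related intrapatient sequences, few sites are informative, so such intervals can be wide.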
shows the results of replication capacity (RC) measurements performed by using the most likely parental and putative recombinant sequences shown in , as well as a positive control (HIV-1 pNL4-3). We developed an HIV-1 RC assay to measure the RC of different viral variants based on the pol gene amplified from the patient. The RC assay involved construction of 10 kb chimeric HIV-1 molecules followed by transfection into a cell line. The chimeric molecules were formed by using ligation-mediated recombination PCR. We used HIV-1 pNL4-3 as a backbone with insertion of the entire pol gene amplified from plasma-derived virus by RT-PCR (, Materials and Methods). The putative recombinant, WC4P0397-8, had no resistance-associated mutations in pol, while the two most likely parental strains had resistance-associated mutations in RT (). The recombinant had a much greater RC than the parental strains. The significantly greater RC of the recombinant compared to the parental sequences supports the interpretation that WC4P0397-8 was a genuine in vivo recombinant that had a survival advantage over the parental strains.
presents a SimPlot and sequence alignment to identify another recombinant variant, WC4P0896-8, and its most likely parental sequences from the same patient, WC4. is a table listing the putative recombinant and the likely parental sequences, their predicted tropism, and drug resistance mutations. Recombination was confirmed by using Bayesian analysis (data not shown). The SimPlot () and sequence alignment () show shared mutations, indicating that the virus with the M184V drug resistance mutation in pol and R5 tropism in the V3 region of env (WC4P0896-8: R5-tropic) was most likely derived from two viruses, one with the M184V mutation (WC4P0296-2: X4-tropic) and one with the R5 tropism (WC4P0896-4: R5-tropic). More shared mutations are seen in the sequence alignment, further illustrating recombination. The recombinant WC4P0896-8 shared 8 other base changes in pol with the parental sequence WC4P0296-2, in addition to the M184V codon (6 of the base changes are shown in the figure). In the V3 region of gp120, WC4P0896-8 shared many other base changes with the parental WC4P0896-4 besides the ones relevant for coreceptor use. These data illustrate recombination breakpoints between pol and env.
presents analyses from another patient, WC2, of the sequence of a putative recombinant (WC2P0896-4: X4-tropic) and its most likely parental sequences (WC2P0296-15: X4-tropic; and WC2P0896-2: R5-tropic). shows a table with the tropism and drug resistance mutations of the parental and recombinant variants. In a Bayesian analysis illustrates that WC2P0896-4 is a recombinant formed most likely from the identified parental sequences with breakpoints in pol and env. Recombination was supported by a color-coded informative site analysis at the bottom of . Recombination was also identified by using GARD (data not shown). The SimPlot shown in shows breakpoints in pol and gp120. The recombinant WC2P0896-4 appears to have acquired drug resistance-associated mutations in pol from one parental sequence (WC2P0896-2) and the tropism from the other parental sequence (WC2P0296-15). shows the RC of the most likely parental and recombinant sequences, with the recombinant having an RC approximately 7 times that of one parental sequence (WC2P0896-2) and 40% that of the other parental sequence (WC2P0296-15). Both parental sequences had drug resistance-associated mutations, but the parental sequence with the higher RC (WC2P0296-15) had two drug resistance mutations in the RT conferring resistance to only one antiretroviral drug, ddC. The other parental sequence (WC2P0896-2) and the recombinant, WC2P0896-4, had the same three resistance-associated mutations, and they were interpreted as being resistant to two drugs, ddC and 3TC. The parental strain (WC2P0896-2) had a very low RC, possibly due, at least in part, to the drug resistance mutations. The recombinant, which had the same resistance mutations, had a higher RC, possibly due to other sequences in pol that were not in the drug resistant parental sequence but were acquired from the other parental sequence by recombination.
The recombinant is likely to have a survival advantage over the parental sequence with the low RC; this finding is consistent with variant WC2P0896-4 being a genuine recombinant strain formed in vivo.
The data suggest that all four patients in this study had recombinant virus. Data regarding viral recombination in subjects WC9 and WC51 are presented in the Supplementary Material. These data include tables of virologic and clinical characteristics, phylogenetic trees, SimPlots, and sequence alignments, and show that these subjects, in addition to subjects WC2 and WC4, also had recombinant HIV-1. All of the HIV-1 variants described as putative recombinants were confirmed as recombinants by using the Bayesian dual multiple change-point framework. At least one recombinant was identified from each patient that encompassed env and conferred a change in predicted tropism compared to at least one of its most likely parental strains.