It has been suggested that immune-pressure-mediated positive selection operates to maintain the antigenic polymorphism on the third variable (V3) loop of the gp120 of human immunodeficiency virus type 1 (HIV-1). Here we present evidence, on the basis of sequencing 147 independently cloned env C2/V3 segments from a single family (father, mother, and their child), that the intensity of positive selection is related to the V3 lineage. Phylogenetic analysis and amino acid comparison of env C2/V3 and gag p17/24 regions indicated that a single HIV-1 subtype E source had infected the family. The analyses of unique env C2/V3 clones revealed that two V3 lineage groups had evolved in the parents. Group 1 was maintained with low variation in all three family members regardless of the clinical state or the length of infection, whereas group 2 was only present in symptomatic individuals and was more positively charged and diverse than group 1. Only virus isolates carrying the group 2 V3 sequences infected and induced syncytia in MT2 cells, a transformed CD4+-T-cell line. A statistically significant excess of nonsynonymous substitutions versus synonymous substitutions was demonstrated only for the group 2 V3 region. The data suggest that HIV-1 variants, possessing the more homogeneous group 1 V3 element and exhibiting the non-syncytium-inducing phenotype, persist in infected individuals independent of clinical status and appear to be more resistant to positive selection pressure.
Nonsynonymous mutations in protein-coding loci are often eliminated from the population because they can cause deleterious effects on protein function or reduce the fitness of the organisms. This functional constraint suppresses nonsynonymous substitution (Ka), and synonymous substitution (Ks) generally exceeds Ka on protein-coding genes (32), including the gag, pol, and env genes of human immunodeficiency virus type 1 (HIV-1) (21, 50). In a host-parasite relationship, however, new nonsynonymous mutations in a particular region can often be maintained in the population, where they confer a selective advantage on the organism (positive selection). The existence of the positive selection force that increases amino acid polymorphism was first noticed on the hypervariable regions of the major histocompatibility complex and immunoglobulin heavy chain (25, 26, 57) and soon afterwards for the surface antigen sites of many pathogens, including HIV-1 (4, 18, 22, 30, 38, 50, 52, 53, 64, 67).
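As a concrete illustration of the distinction between synonymous and nonsynonymous changes, the following Python sketch translates two codons with the standard genetic code and reports whether a substitution alters the encoded amino acid. The function name `classify_substitution` and the example codons are illustrative choices and are not taken from the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "nonsynonymous"


# Illustrative codons: CAA and CAG both encode Gln; CGA encodes Arg.
print(classify_substitution("CAA", "CAG"))
print(classify_substitution("CAA", "CGA"))
```

Counting changes of the second kind per nonsynonymous site gives Ka, and counting changes of the first kind per synonymous site gives Ks.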
The third variable (V3) loop of the HIV-1 envelope gp120 is highly immunogenic, and neutralizing antibodies from infected individuals can recognize this region (45, 61). Therefore, it is conceivable that new nonsynonymous mutations on the V3 loop-coding locus confer a selective advantage on HIV-1 via avoidance of antibody recognition. Consistently, Ka exceeds Ks in the V3 region (4, 22, 49, 50, 52, 53, 64, 67), and Ka in the V3 region is reported to correlate with the duration of the immunocompetent period (33). These data suggest that immune-pressure-mediated positive selection is operating on the V3 region to maintain antigenic polymorphism.
The V3 loop, on the other hand, is a critical determinant to specify coreceptor usage for HIV-1 entry (8, 9, 48a, 55). This indicates that V3 sequence variation is influenced by a functional constraint as well as by positive selection. In this regard, V3 loop amino acid sequences of non-syncytium-inducing (NSI)/macrophage-tropic variants consist of a less diverse and less positively charged population than that of syncytium-inducing (SI)/T-cell line-tropic variants (7, 17, 36, 37, 51, 65). The NSI variants generally predominate in the asymptomatic period of HIV-1 infection in vivo and use CCR5 as an entry coreceptor of infection (2, 3, 9–11, 14). In contrast, the SI variants often appear in association with disease progression and acquire the ability to infect cells expressing CXCR4 or other chemokine receptors (3, 10, 13, 16, 54).
In this study, we examined whether the intensity of positive selection is related to NSI and SI V3 genotype. To address this issue, we determined the nucleotide sequences of 147 independent V3 clones from uncultured peripheral blood mononuclear cells (PBMCs) and from virus isolates derived from an HIV-1 subtype E-infected Japanese family (48), divided them into two subgroups based on the presence of basic amino acid substitutions and on the extent of variation, and ascertained the NSI and SI phenotypes of virus isolates bearing the two V3 subpopulations in MT2 cell infection assays. Analyses of Ka and Ks values within each group at different sampling points or for different individuals suggest that the V3 subpopulation for the NSI phenotype is maintained with lower variation than that for the SI phenotype in infected individuals and that positive selection operates less extensively on the NSI V3 subpopulation. This is the first systematic comparison of gp120 V3 sequence evolution with the biological changes in HIV-1 isolates, obtained following horizontal and vertical transmission in a single family.
The family consisted of a male index patient (NH1), the female spouse of NH1 (NH2), and their child (NH3). NH1 had no history of blood transfusion, surgical operation, or homosexual activity. He had a history of sexual contacts with female prostitutes in Thailand in 1989 and 1990. He was positive for HIV-1 antibodies in September 1992. He had chronic fatigue and lost 10 kg of weight between 1991 and 1992. He had developed AIDS-related pulmonary complications (Centers for Disease Control and Prevention [CDC] category C3) at the time of blood collection in June 1993 for the present study. The CD4+ lymphocyte count was 40 × 10⁶/liter at the sampling time. NH1 died of AIDS-related pulmonary complications in March 1994.
NH2 had no documented risk factors for HIV-1 infection other than sexual contacts with NH1. She was found to be HIV-1 seropositive in November 1992. The CD4+ lymphocyte counts were 501 × 10⁶ and 412 × 10⁶/liter in October 1992 and May 1993, respectively. NH2 was asymptomatic (CDC category A2) at the initial blood collection in June 1993. She developed Pneumocystis carinii pneumonia in February 1996 (CDC category C3) and had been subjected to administration of zidovudine and dideoxyinosine since May 1996. The CD4+ lymphocyte count remained between 26 × 10⁶ and 80 × 10⁶/liter in 1996. Blood specimens were collected from NH2 in March 1996 (NH2-II) and January 1997 (NH2-III), and the CD4+ lymphocyte counts were 39 × 10⁶ and 26 × 10⁶/liter in February 1996 and December 1996, respectively.
NH3 was born to NH2 in June 1991. NH3 was positive for HIV-1 antibodies in December 1992. Data for his CD4+ lymphocyte count are limited to December 1992 (3,719 × 10⁶/liter). He had no AIDS-defining illness at the time of blood collection in June 1993.
Whole blood of the three family members was collected in June 1993. For NH2, blood specimens were also collected in March 1996 (NH2-II) and January 1997 (NH2-III). PBMCs were isolated by Ficoll-Hypaque (Pharmacia LKB, Uppsala, Sweden) density centrifugation, and an env V3 region and flanking regions (324 bp) were amplified from the PBMCs by nested PCR. The first amplification step was performed for 30 cycles with outer primers MK369 and MK616 (43). Five microliters of the first PCR product were used for the second amplification step with inner primers V3-A and V3-C (23). The PCR products were cloned directly into pCRII cloning vectors (Invitrogen, NV Leek, The Netherlands). Fifteen to 22 of the cloned DNAs for each infected individual at a sampling point were sequenced on both strands with either an ALFII automated DNA sequencer (Pharmacia LKB) or an ABI PRISM 310 automated sequencer (Perkin-Elmer, Norwalk, Conn.).
A region covering a part of the long terminal repeat U5, the whole p17 gag gene, and a part of the p24 gag gene (520 bp) was amplified by nested PCR from uncultured PBMCs in June 1993. The outer primers were JA152 and JA155, and the inner primers were JA153 and JA154 (31). After the PCR, the primers were removed from the PCR products with a Centricon-100 (Amicon). The purified PCR products were sequenced directly on both strands with an ABI PRISM 310 automated sequencer.
Virus isolation was achieved by coculture of freshly isolated PBMCs (5 × 10⁶) from the family members with an equal number of phytohemagglutinin (PHA)-stimulated PBMCs from HIV-1-seronegative individuals. At 2- to 3-day intervals, the cells were fed with RPMI 1640 with 10% fetal bovine serum and 20 U of recombinant interleukin-2/ml. At 1-week intervals, fresh PHA-stimulated PBMCs were added to the culture. Culture supernatants were taken for reverse transcriptase (RT) assay (63) every 2 to 3 days during a 40-day period after coculture.
Total RNA was extracted from 50 μl of the RT-positive culture supernatant (5), and one-fifth of the RNA was subjected to synthesis of complementary DNA with the primer C3E-130B, 5′-AGA AAA ATT CCC CTC TAC AAT TAA-3′, and avian myeloblastosis virus RT (Takara Shuzo Co., Kyoto, Japan) followed by PCR with primers C2E-110A, 5′-TTC AAT GGG ACA GGG CCA TGT-3′, and C3E-130B. The amplified DNAs corresponding to the HIV-1 env C2/V3 region (324 bp) were either sequenced directly or after cloning into pCRII, as described above.
A transformed CD4+-T-cell line, MT2 (2 × 10⁵ cells), was incubated in 0.2 to 0.3 ml of medium containing 2 × 10⁵ to 5 × 10⁵ cpm of ³²P-RT activity (63) for 12 h at 37°C, and the cells were grown in 1 to 2 ml of the corresponding medium in 24-well plates. Half of the volume of the culture medium was replaced by fresh medium every 2 to 3 days, and a portion was used for RT assay. The cultures were terminated 25 days after infection, during which time syncytium formation was monitored under a light microscope.
For the reference group and the outgroup of the phylogenetic tree, corresponding sequences of HIV-1 subtypes A through H and simian immunodeficiency virus (SIV)CPZGAB (CPZGAB) were collected from the Los Alamos National Laboratory database (39). The nucleotide sequences were aligned with sequences from the NH family with CLUSTAL W version 1.7 (60). The alignment was corrected manually to ensure that gaps did not alter the reading frame. A distance matrix of nucleotide substitutions was estimated from the alignment according to the method of Tajima and Nei (56), and phylogenetic trees were constructed from the matrix by the neighbor-joining method (47) with the NEIGHBOR program of the PHYLIP package version 3.5c (15a). Bootstrap analyses (15) were done for the tree with the DNABOOT and CONSENCE programs of the PHYLIP package, with 100 resamplings. The trees were rooted by CPZGAB or 18 available V3 sequences of HIV-1 subtype E from Thailand, the Central African Republic, and England (39) and were drawn with NJPLOT. The maximum-likelihood and the parsimonious trees were constructed from aligned sequences with the DNAML and DNAPARS programs (J. Felsenstein), respectively. For DNAML, the value for the transition/transversion ratio was set at 1.5 (24). Only the neighbor-joining trees are presented in this article, because trees generated by the other two algorithms gave the same topology.
Nucleotide substitution per site was estimated with PHYLIP for each pairwise sequence comparison on the basis of the Kimura two-parameter method (28). Nonsynonymous and synonymous substitutions (Ka and Ks) were estimated according to the method of Nei and Gojobori (40) with MEGA version 1.0 (29a). Means and variances of Ka and Ks for all pairwise comparisons of a sample population were calculated according to the method of Nei and Jin (41), in which patristic distances were evaluated by unweighted pair group method with arithmetic mean (UPGMA) trees from each value calculated by the NEIGHBOR program. Welch’s t test was used to evaluate the statistical significance of differences between Ka and Ks values (30).
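The Kimura two-parameter estimate mentioned above can be sketched in a few lines of Python. This is a standalone illustration of the published formula, d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the observed proportions of transitions and transversions; it is not the PHYLIP implementation used in the study, and the function name `k2p_distance` and the gap-skipping rule are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy sequences with one transition in ten sites (not data from the study).
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For closely related clones such as those compared here, the correction is small and the distance stays close to the raw proportion of differing sites.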
The DDBJ database accession numbers for the nucleotide sequences reported here are D78024 to D78070 and AB014775 to AB014874 for the env C2/V3 region and AB015938 to AB015942 for the gag p17/p24 region.
Epidemiological information suggested that HIV-1 infections in the NH family were caused by intrafamilial transmission of viruses of Thai origin (see Materials and Methods). To ascertain whether the HIV-1 quasispecies in the family were evolutionarily closely related, we determined the phylogenetic relationships of the env C2/V3 and p17 gag sequences (1, 24, 42) from the family.
Fifteen to 17 clonal nucleotide sequences were determined for the env C2/V3 region (324 bp) from uncultured PBMCs of each infected individual in June 1993. Figure 1A shows the neighbor-joining tree with the sequences from the family and the reference sequences of HIV-1 group M (subtypes A through G) (39). The tree shows that the env C2/V3 sequences from the family form a monophyletic group (bootstrap value, 81/100). This family cluster was most closely related to HIV-1 subtype E sequences from Thailand (TN2432 and 92TH022.4; bootstrap value, 100/100), whereas it was distant from subtype E from the Central African Republic (CAR4017 and CAR4071) and from the other HIV-1 subtypes. The monophyletic relationship of the family sequences was reproducible when 18 of the other reported HIV-1 subtype E sequences from Thailand, the Central African Republic, and England (39) were included in the tree or when the trees were constructed with other algorithms, including parsimony and maximum likelihood (data not shown).
Figure 1B shows the neighbor-joining tree with the p17/p24 gag nucleotide sequences (520 bp) from the family and the reported HIV-1 subtype A through H representatives (19, 39). As noted previously (19), the gag region of subtype E from Thailand and the Central African Republic formed a monophyletic group that was related to subtype A (A′; bootstrap value, 100/100). In that A′ cluster, gag sequences from the family had a monophyletic cluster (bootstrap value, 58/100) that was genetically most closely related to the sequences from Thailand (TN238, TN240, TN245, and TN2431) rather than to that of African origin (90CR402). The branch lengths in the NH family cluster were significantly positive, with a P value of less than 0.01, when the gag tree was generated with the maximum-likelihood algorithm with PHYLIP DNAML (data not shown).
The deduced amino acids from the nucleotide sequences were compared to 19 of the subtype E sequences from Thailand (19, 20, 34). Comparison of the env C2/V3 region (105 amino acids) showed that sequences from the family uniquely shared a tryptophan at position 61 and a valine at position 86 in the C2/V3 region (data not shown). In addition, the p17/p24 gag sequences from the family members shared common substitutions when compared with a sequence from Thailand at positions 26, 93, 128, and 143 in the p17/p24 gag region, and a common polymorphic site was identified within the family at position 120 of the p17/p24 gag region by direct sequencing (data not shown).
Epidemiological information suggested that NH1 was infected in Thailand during the early epidemic of subtype E infections in the country (62). Subtype E env C2/V3 sequences in this period were highly homogeneous among asymptomatic carriers (12, 27, 34, 43). This enabled us to generate the NSI consensus of the subtype E V3 loop from 21 NSI virus isolates from the early 1990s in Thailand (12, 27). To delineate NSI and SI structural characteristics of the V3 loops from the family members, deduced amino acid sequences were compared to the consensus.
The V3 loop sequences from NH1 with AIDS could be divided into two groups (Fig. 2A, NH1). One population (group 1) was characterized by the presence of a GPGQ motif at the tip of the V3 loop and by the lack of basic amino acid substitutions with respect to the consensus. Group 1 consisted of a minor population (4 of 17) in NH1, whereas it represented the only population detected in the asymptomatic carriers, NH2 (15 of 15) and NH3 (15 of 15) (Fig. 2A, NH2 and NH3). The group 1 sequences were similar to each other or to the NSI consensus. The three family members shared variants carrying an identical V3 loop sequence of group 1 (Fig. 2A).
The other group (group 2) was characterized by the presence of a GPGR motif at the tip of the V3 loop and by several substitutions of the subtype E NSI consensus with basic amino acids (Fig. 2A, NH1). The basic substitutions were often seen at positions 8 (T to K or R; 10 of 13), 11 (S to R; 7 of 13), and 18 (Q to R; 17 of 17). Group 2 lost a potential site for N-linked glycosylation in the 5′ half of the V3 loop. Four of 13 group 2 clones had a phenylalanine residue proximal to the lost N glycosylation site in the loop. These characteristics are often found in subtype E SI isolates from individuals with AIDS (66). Group 2 consisted of a major population (13 of 17) in NH1, whereas it was not found in NH2 or NH3. Group 2 was more diversified from the subtype E NSI reference sequence than was group 1.
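The two criteria used to separate the groups, the tip motif and the number of basic residues, can be expressed as a short Python sketch. The starting peptide below is a hypothetical NSI-like V3 loop chosen only so that positions 8, 11, and 18 carry T, S, and Q; it is not copied from Figure 2A. The charge rule (K and R count +1, D and E count −1, histidine ignored) is a common simplification and an assumption of this example, as are all function names.

```python
BASIC = set("KR")   # counted as +1 (histidine ignored in this sketch)
ACIDIC = set("DE")  # counted as -1


def net_charge(peptide):
    """Net charge under the simple K/R minus D/E counting rule."""
    return sum((aa in BASIC) - (aa in ACIDIC) for aa in peptide.upper())


def tip_motif(peptide):
    """Return 'GPGQ' or 'GPGR' if present in the peptide, else None."""
    for motif in ("GPGQ", "GPGR"):
        if motif in peptide.upper():
            return motif
    return None


def apply_substitutions(peptide, substitutions):
    """Apply {position: (expected_residue, new_residue)}, 1-based positions."""
    residues = list(peptide)
    for position, (expected, new) in substitutions.items():
        if residues[position - 1] != expected:
            raise ValueError(f"position {position} is not {expected}")
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical NSI-like loop (illustration only, not a sequence from the study).
nsi_like = "CTRPSNNTRTSITIGPGQVFYRTGDIIGDIRKAYC"
# Basic substitutions described in the text: T8K, S11R, Q18R.
si_like = apply_substitutions(
    nsi_like, {8: ("T", "K"), 11: ("S", "R"), 18: ("Q", "R")}
)

for name, seq in (("NSI-like", nsi_like), ("SI-like", si_like)):
    print(name, tip_motif(seq), net_charge(seq))
```

Under these assumptions the three substitutions raise the net charge by three and convert the tip from GPGQ to GPGR, which mirrors the qualitative difference between group 1 and group 2.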
The two V3 groups were identified in NH2 as well as NH1, after NH2 developed AIDS (Fig. 2A, NH2-II and -III). In March 1996 (NH2-II), group 1 with a GPGQ motif consisted of a major population (16 of 20). After 10 months (NH2-III), group 2 with a GPGR motif became major (18 of 22). Group 2 from NH2-II and NH2-III had basic amino acid substitutions at positions 11 (S to R; 42 of 42) and 18 (Q to R; 42 of 42).
The env C2/V3 nucleotide sequences from NH2-II and NH2-III formed a monophyletic group with those from NH1, NH2, and NH3 in June 1993, outside other subtype E sequences from Thailand on the neighbor-joining tree (data not shown). In addition, all sequences from NH2-II and NH2-III had a tryptophan residue at position 20 in the V3 loop (Fig. 2A, NH2-II and -III) that was uniquely shared in V3 sequences from the family in June 1993 (Fig. 2A, NH1, NH2, and NH3). All of the nucleotide sequences from uncultured PBMCs were subjected to the BLAST 2.0 program (National Center For Biotechnology Information) to search sequence similarity to the previously reported sequences in nucleotide databases. The sequences with the highest scores (ranging from 93 to 97% in the present data sets) were all from HIV-1 env C2/V3 sequences from Southeast Asia.
HIV-1 isolates were obtained from the family, and their V3 nucleotide sequences and abilities to induce syncytia on MT2 cells were determined. A virus isolate from NH1 (HIV-1NH1) in June 1993 had a group 2 V3 genotype with a GPGR motif and basic substitutions at positions 8, 11, and 18 (Fig. 2B, NH1, PBMC). A major population in HIV-1NH1 (Fig. 2B, NH1) was similar to a sequence in uncultured PBMCs of NH1 (Fig. 2A, NH1). No sequences with a GPGQ motif were identified in HIV-1NH1. HIV-1NH1 replicated in PHA-stimulated PBMCs and MT2 cells. RT activity appeared in the MT2 culture supernatant on day 9 after infection and peaked on day 12, and concomitantly syncytia were observed in the culture. V3 sequences in the supernatant of RT-positive MT2 culture also contained a GPGR motif and basic substitutions at positions 8, 11, and 18 (Fig. 2B, NH1, MT2). All of the 58 C2/V3 nucleotide sequences from the virus isolates clustered with the sequences from uncultured PBMCs under the same node in the neighbor-joining tree (data not shown).
Virus isolates from NH2 and NH3 (HIV-1NH2 and HIV-1NH3) in June 1993 had a group 1 V3 genotype with a GPGQ motif and a low net positive charge (Fig. 2B, NH2 and NH3). Major V3 sequence populations in uncultured PBMCs of NH2 and NH3 were still dominant in HIV-1NH2 and HIV-1NH3. The virus isolates grew in PHA-stimulated PBMCs, whereas MT2 was nonpermissive for these isolates: no RT activity in culture supernatant and no syncytium formation were detected during a 25-day cultivation after infection.
Table 1 summarizes the number of env C2/V3 nucleotide sequences obtained from uncultured PBMCs of the NH family in 1993, 1996, and 1997, along with the nucleotide diversity of the group 1 and group 2 quasispecies. Among a total of 89 C2/V3 clones sequenced, 87 clones had distinct nucleotide sequences over the entire 324-bp region and 86 clones had open reading frames. Among the 86 clones, 51 and 35 sequences encoded group 1 and group 2 V3 loops, respectively. The mean nucleotide substitution values per site (28) in the C2/V3 segments were slightly lower for group 1 than for group 2, i.e., they ranged from 0.006 ± 0.003 to 0.020 ± 0.010 for group 1 and from 0.014 ± 0.011 to 0.041 ± 0.017 for group 2. The total nucleotide diversity in each patient was higher in individuals with AIDS (NH1, NH2-II, and NH2-III) than in asymptomatic carriers (NH2 and NH3).
We estimated the inherent Taq polymerase error rate under the PCR conditions described here. A plasmid carrying the full genome of HIV-1 (pLAI) was diluted to 10² or 10³ copies per reaction mixture and subjected to nested PCR with the same sets of primers under the same thermal cycling conditions described in Materials and Methods. The PCR products were cloned into pCRII, and about 20 clones of the C2/V3 region (324 bp) were sequenced. Within 21 clones from a 10²-copy template (6,809 bases sequenced), five substitutions (A→C, A→T, and three A→G) in five different clones were identified, all of which were nonsynonymous substitutions located at different positions. Within 19 clones from a 10³-copy template (6,156 bases sequenced), four substitutions (T→C and three A→G) were noted, three of which were nonsynonymous substitutions and one of which was a synonymous substitution, mapped to different positions in the four different clones. The estimated Taq polymerase error rates under our experimental conditions were 6.5 × 10⁻⁴ to 7.3 × 10⁻⁴ mutation frequency/bp/60 cycles. The values were comparable to previous results obtained for the HIV-1 tat region (3.2 × 10⁻⁴ mutation frequency/bp/35 cycles) (35). Thus, the contribution of Taq polymerase to the frequency of base substitutions observed in the present study (Table 1) was less than 12%.
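The two error-rate estimates follow directly from the substitution counts and the numbers of bases sequenced given above. The short Python check below reproduces that arithmetic; the function name `error_rate` is an illustrative choice.

```python
def error_rate(substitutions, bases_sequenced):
    """Observed substitutions per base sequenced."""
    return substitutions / bases_sequenced


# Counts as reported in the text for the two template dilutions.
rate_low_copy = error_rate(5, 6809)   # 21 clones, low-copy template
rate_high_copy = error_rate(4, 6156)  # 19 clones, high-copy template

print(f"{rate_low_copy:.1e}")   # 7.3e-04
print(f"{rate_high_copy:.1e}")  # 6.5e-04
```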
A neighbor-joining tree was constructed with the V3 nucleotide sequences (105 bp) of the independent C2/V3 clones (Fig. 3). The tree was rooted by 18 of the HIV-1 subtype E V3 sequences from Thailand and the Central African Republic (39). The tree was divided into two major clusters showing distinct branch lengths. The clustering pattern was reproducible regardless of the algorithms used for constructing trees (data not shown). Sequences of one cluster had shorter branches than did those of the other, suggesting the presence of a sibling sequence population that was less diversified from a putative ancestral V3 sequence of the intrafamilial infection. Notably, the cluster with shorter branches consisted of sequences from all clinical stages (NH1, NH2, NH2-II, NH2-III, and NH3), and they all encoded the group 1 V3 loop with a GPGQ motif and a low positive net charge. In contrast, the cluster with longer branches consisted of only those sequences from patients with AIDS (NH1, NH2-II, and NH2-III), and they all encoded the group 2 V3 loop with a GPGR motif and a high positive net charge. When the neighbor-joining tree was constructed on the flanking sequences of the V3 loop, no relation was detected between the branch length and the group 1 or 2 lineages (data not shown).
Nonsynonymous and synonymous substitutions (Ka and Ks) (40) were calculated for all pairwise comparisons of the 86 V3 nucleotide sequences from uncultured PBMCs. Then, Ka and Ks values derived from two sequences from the same V3 lineage group of the same sampling point of the same individual were extracted and plotted on a graph with x and y axes for Ka and Ks values, respectively. Thus, pairwise comparisons of group 1 were done for NH1 (n = 6), NH2 (n = 66), and NH3 (n = 105) in June 1993 as well as for NH2-II in March 1996 (n = 120) and NH2-III in January 1997 (n = 6). Comparisons of group 2 were done for NH1 in June 1993 (n = 78), NH2-II in March 1996 (n = 6), and NH2-III in January 1997 (n = 153). In the plots for group 1, the Ks value exceeded the Ka value for most pairs regardless of the sampling time or the individual (Fig. 4A). In contrast, there were many pairs in which the Ka value exceeded the Ks value in the plots for group 2 (Fig. 4B).
The statistical significances of differences between means of Ka and Ks with variance (41) within each V3 lineage at each sampling point were evaluated by Welch’s t test (Table 2). In group 1, the means of Ka were lower than those of Ks regardless of the clinical stage of the infected individual (Ka/Ks of 0.091, 0.124, and 0.549 for group 1 from NH1, NH2, and NH3; t > 9.957; df > 22; P < 0.005). Although Ka exceeded Ks for NH2-II group 1, the difference was not statistically significant (t = 1.300; df = 22). The Ka/Ks ratio for NH2-III group 1 could not be calculated because all nucleotide sequences of the V3 region were identical. In contrast to the predominance of Ks for group 1, Ka exceeded Ks for group 2 (Ka/Ks ratios of 1.659 and 2.447 for group 2 from NH1 and NH2-III; P < 0.005). The Ka/Ks ratio of the group 2 sequences from NH2-II could not be calculated because all substitutions of the V3 region were nonsynonymous. Finally, the Ks values of group 1 and group 2 at the same sampling points were not significantly different, whereas the Ka values of group 2 consistently exceeded those of group 1.
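Each n quoted above is a number of unordered pairs, m(m − 1)/2, among m sequences in a subgroup. As a consistency check, the Python sketch below inverts that relation to recover m for every subgroup and sums the results, which should match the 51 group 1 and 35 group 2 sequences among the 86 clones with open reading frames. The helper name `sequences_from_pairs` is hypothetical.

```python
from math import comb, isqrt


def sequences_from_pairs(n_pairs):
    """Invert n = m(m - 1)/2 to recover m, the number of sequences."""
    discriminant = 1 + 8 * n_pairs
    root = isqrt(discriminant)
    if root * root != discriminant:
        raise ValueError(f"{n_pairs} is not a valid pair count")
    return (1 + root) // 2


# Pair counts as reported in the text.
group1_pairs = {"NH1": 6, "NH2": 66, "NH3": 105, "NH2-II": 120, "NH2-III": 6}
group2_pairs = {"NH1": 78, "NH2-II": 6, "NH2-III": 153}

group1_sizes = {k: sequences_from_pairs(n) for k, n in group1_pairs.items()}
group2_sizes = {k: sequences_from_pairs(n) for k, n in group2_pairs.items()}

print(group1_sizes, sum(group1_sizes.values()))
print(group2_sizes, sum(group2_sizes.values()))
```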
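For readers unfamiliar with Welch’s t test, the following Python sketch computes the statistic and the Welch-Satterthwaite degrees of freedom from two means, variances, and sample sizes. It is a simplified illustration: it treats the inputs as ordinary sample variances from independent observations, whereas the study derived the variances of mean Ka and Ks by the method of Nei and Jin because pairwise distances are not independent. The input values are hypothetical and do not come from Table 2.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and approximate degrees of freedom."""
    # Squared standard errors of the two means.
    se1_sq = var1 / n1
    se2_sq = var2 / n2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    # Welch-Satterthwaite approximation.
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df


# Hypothetical inputs chosen for easy checking.
t, df = welch_t(mean1=2.0, var1=1.0, n1=10, mean2=1.0, var2=1.0, n2=10)
print(round(t, 3), round(df, 1))
```

Unlike Student’s t test, this form does not assume equal variances in the two samples, which suits a comparison of Ka and Ks because the two quantities can differ widely in spread.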
Compared with the mechanisms that generate new mutants (error-prone viral RT, genetic recombination, and rapid turnover of virus in vivo), the selection process that maintains genetic variation is less well understood. The present work provides the first evidence that the intensity of positive selection may be related to the V3 lineage.
To understand V3 evolution in the Japanese family NH, we first examined the probable infection pathway of the virus. Two independent pieces of evidence support the possibility that HIV-1 isolates in the Japanese family had evolved from a common source. First, all of the env and gag nucleotide sequences from the three family members were of HIV-1 subtype E and had a monophyletic relationship (Fig. 1). Second, all of the HIV Env and Gag sequences from this family possessed amino acid residues unique to the family. Taken together with the epidemiological and clinical information, as well as the nucleotide diversity of the env C2/V3 region (Table 1), this evidence led us to conclude that a single source of HIV-1 subtype E of Thai origin had infected NH1, who had transmitted the viruses to NH2, who had transmitted the viruses to NH3 at some time between 1989 and 1992.
Among the V3 loop amino acid sequences from the family, two major subgroups, group 1 with a GPGQ motif and group 2 with a GPGR motif, were identified (Fig. 2). Structural similarities to the subtype E NSI consensus, predominance in asymptomatic mother and child, and the identity to the sequences of NSI virus isolates from the family all indicate that the group 1 V3 loops are those of NSI variants. Of note is the fact that the group 1 sequences were identified in NH1 with AIDS and were consistently found after NH2 had developed AIDS, suggesting that this quasispecies can persist in infected individuals independent of clinical status. Group 2, on the other hand, was only present in symptomatic individuals and shared many structural features with the reported sequences of the subtype E SI variants. Consistently, virus isolates carrying the group 2 V3 sequences infected and induced syncytia in MT2 cells. Thus, at least two V3 lineage groups had evolved in the two parents during disease progression, one of which was related to the NSI phenotype and the other to the SI phenotype.
Interestingly, all of the group 1 sequences showed relatively low variation irrespective of the clinical status of the infected individuals or the length of infection, whereas all of the group 2 sequences formed a cluster with long branches and were identified only in patients with AIDS (Fig. 3). The low variation for the group 1 sequences was unlikely to be due to template resampling during PCR and cloning processes, because (i) most group 1 sequences were unique over the entire 324-bp C2/V3 region sequenced (Table 1), (ii) Taq polymerase error contributed less than 12% to the observed substitution frequencies in the C2/V3 region, and (iii) the relatively high degree of sequence homogeneity was seen mainly in the V3 loop and not in the flanking regions (48). An alternative and more likely explanation is that the HIV-1 variants possessing group 1 V3 loops existed as a quasispecies with a relatively homogeneous V3 loop element. The conclusion is compatible with the general observation that V3 loop sequences of NSI isolates consist of a less diverse population than those of SI isolates (7, 17, 29, 36, 37, 51, 65).
The difference in variations in the group 1 and group 2 sequence populations could have been caused by two factors. First, mutation rates may be higher in the group 2 V3 region than in the group 1 V3 region. Second, there may be a stronger positive selection in the group 2 V3 sequence population than in that of group 1. Although the former possibility needs to be assessed further, we obtained evidence that the latter mechanism is involved in generating the genotype-dependent difference in V3 variation. Calculation and plotting of Ka and Ks values for each pairwise comparison within the same V3 lineage group revealed that group 1 and group 2 exhibited distinct distribution patterns of the Ks/Ka plots. In group 2, Ka exceeded Ks for most pairs (Fig. 4). Statistical tests of the difference between means of Ka and Ks ruled out the possibility that this observation was due to chance (Table 2). These results strongly argue against the plain model where the group 2 nucleotide diversity would be caused simply by the high error rate of RT or by the rapid turnover of the viruses in patients with little immune response. Although these possibilities alone may increase the nucleotide diversity (Ka + Ks), they should not cause the excess of Ka over Ks because of the functional constraint on the protein-coding locus (32). Thus, the present data indicate that there had been a driving force which increased the probability of fixing nonsynonymous substitutions during the evolution of the group 2 lineage.
In contrast to the group 2 population, Ks exceeded Ka for most pairs in group 1. The Ka/Ks ratios were consistently less than 1.0 in group 1, except at one sampling point, where the difference between Ka and Ks values was not statistically significant. In addition, the mean Ka value for group 1 was consistently lower than that for group 2 at the same sampling point of the same individual. These data indicate that positive selection did not play a major role in the nucleotide diversity of the group 1 V3 lineage. The conclusion is compatible with the observation that the V3 region of NSI gp120 appears to be hidden from the blocking antibodies, being a less effective target for sera of HIV-1-infected individuals (6). In addition, the less rapid and efficient progeny production of NSI than SI variants (58, 59) might have resulted in the smaller extent of antigen presentation or recognition by host immune surveillance. Thus, the differences in intensity of positive selection among the group 1 and 2 V3 lineages may be attained by differences in susceptibility to the host immune responses.
The above-mentioned differences may all be attributed primarily to the differential abilities of viruses to use entry coreceptors. The V3 region has been implicated in the interaction of HIV-1 envelope with chemokine receptors expressed on the target cells (8, 9, 48a, 55). The ability to interact with a particular molecule would become a strong functional constraint which decreases the amino acid variation of the responsible region. However, once the viruses obtain the competence, V3 diversity would be maintained by a balance between the functional constraint and the immune pressure driving positive selection. Experiments assessing the ability of each group 1 or group 2 V3 loop to specify CCR5 and/or CXCR4 usage are currently in progress by constructing a series of infectious V3 recombinants.
Group 2 V3 loop amino acid sequences from NH1, NH2-II, and NH2-III frequently shared unique amino acids, which were not seen in group 1 (Fig. 2A). Either convergent evolution or the direct transmission of SI variants from NH1 to NH2 can explain the observation. In this regard, it is likely that viruses transmitted from NH1 to NH2 were the NSI variants, because neither SI V3 genotypes nor SI viruses were identified when NH2 was asymptomatic (Fig. 2). Therefore, the similarities between groups 2 in the two individuals appear to imply that there is a convergent evolutionary pattern in the V3 loop in the context of subtype E gp120 during disease progression. The possibility is supported by the observation that many of these substitutions also recurred in AIDS patients in Thailand (66), regardless of the predominant role of NSI variants for person-to-person transmission (46, 68).
In conclusion, HIV-1 V3 sequence evolution appears to be bimodal in an infected individual; a V3 sequence lineage for NSI phenotype persists independent of clinical status and is maintained with a relatively low variation due to the dominant role of the functional constraint. In contrast, a V3 sequence lineage for SI phenotype emerges with disease progression and is maintained with a high variation, because of the dominant role of positive selection.
We thank Simeon Aidoo for help with sequencing and Keith Peden for providing pLAI.
This work was supported by grants from the Ministry of Health and Welfare of Japan, the Ministry of Education, Science and Culture of Japan, and the Science and Technology Agency of the Japanese Government.