

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
J Infect Dis. Author manuscript; available in PMC 2012 August 15.
Published in final edited form as:
PMCID: PMC3419593

HIV-1 Variation before Seroconversion in Men Who Have Sex with Men: Analysis of Acute/Early HIV Infection in the Multicenter AIDS Cohort Study


Understanding the characteristics of human immunodeficiency virus (HIV) necessary for infection in a new host is a critical goal for acquired immunodeficiency syndrome (AIDS) research. We studied the characteristics of HIV-1 envelope genes in 38 men in the Multicenter AIDS Cohort Study cohort before seroconversion. We found a range of diversity (0.2%–5.6% [median, 0.86%]), V1–V2 loop length (58–93 aa), and potential N-linked glycosylation sites (n = 2–9). However, at least 46% of the men had replicating virus that appeared to have been derived from a single viral variant. Nearly all variants were predicted to be CCR5 tropic. We found no correlation between these viral characteristics and the HIV outcomes of time to clinical AIDS or death and/or a CD4 cell count <200 cells/μL.

The viral characteristics required for HIV-1 transmission remain to be fully understood, yet their elucidation is critical to the development of prevention strategies. There have been mixed reports on the relative homogeneity or heterogeneity of virus populations that become established during primary infection [1–7]. The lack of consistent findings is likely due in part to different biological and methodological factors, such as differences in the mode of transmission, gender, HIV-1 subtypes, the precise timing of evaluation after infection, the gene regions examined, and the criteria by which viral populations were categorized as heterogeneous or homogeneous (e.g., see [1–8]). Overall, however, there is consensus that HIV-1 populations are relatively homogeneous in primary infection compared with chronic infection, with viral population diversity growing at a generally consistent pace during untreated asymptomatic infection [2].

To further our understanding of the factors involved in early infection, we assessed HIV-1 variant populations in participants in the Multicenter AIDS Cohort Study (MACS) during primary HIV infection (when plasma HIV-1 RNA was detectable but before seroconversion had occurred).


The MACS is an ongoing prospective study of HIV-1 infection. HIV serostatus was determined using a combination of HIV EIA and Western blot assays. Subjects were classified as being seronegative if they had a negative or equivocal EIA result and/or a negative or indeterminate confirmatory Western blot result. Plasma HIV-1 RNA loads were measured by reverse-transcription polymerase chain reaction (PCR; HIV-1 Amplicor; Roche), and T cell subsets were quantified by flow cytometry. For estimates of time to clinical events, we used the standard criterion of seroconversion date, estimated as the midpoint between the last seronegative and first seropositive visit.
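The midpoint rule for the seroconversion date can be illustrated with a short Python sketch. The function name and the choice to round down to a whole day are assumptions made here for illustration; the study does not specify how odd-length intervals were handled.

```python
from datetime import date, timedelta


def estimate_seroconversion_date(last_negative: date, first_positive: date) -> date:
    """Return the midpoint between the last seronegative and first seropositive visits.

    Hypothetical helper: rounding down to a whole day is an assumption.
    """
    if first_positive < last_negative:
        raise ValueError("first seropositive visit precedes last seronegative visit")
    interval_days = (first_positive - last_negative).days
    return last_negative + timedelta(days=interval_days // 2)


if __name__ == "__main__":
    # Visits 10 days apart give a midpoint 5 days after the seronegative visit.
    print(estimate_seroconversion_date(date(2000, 1, 1), date(2000, 1, 11)))
```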

Plasma samples were analyzed as described elsewhere [2]. Briefly, plasma HIV-1 RNA was isolated, and nested PCR of the HIV-1 envelope (env) V1–V5 region (using primer sets ED3-BH2 [first round] and ED5-DR8 [second round]) was performed using end-point dilution. To avoid template resampling [2, 8], we cloned and then picked single clones from dilutions with a low copy number or a number of clones corresponding to no more than 5% of the total from dilutions with a copy number >100 copies/mL. For PCR-negative samples, we attempted PCR again using the more sensitive ED31-BH2 (first round) and DR7-ED33 (second round) primer sets.

All sequences were determined using dye-terminator chemistries and were assessed for potential sample mix-up and contamination by established techniques. Sequences were deposited in GenBank and were assigned accession numbers AF138652-AF138657 and EU184091-EU184657. Each sequence was aligned with references from the HIV database using CLUSTAL W, followed by manual adjustment using MacClade (version 4). Regions in the alignment that could not be unambiguously aligned were removed. No hypermutated sequences were identified using Hypermut (version 2.0). Pairwise nucleotide distances were estimated using distance-based methods and the evolutionary models HKY85 (Hasegawa-Kishino-Yano, 85) or GTR + Γ + I (general time-reversible models with a gamma distribution and invariable sites) under maximum likelihood (ML) criteria and implemented in PAUP* (version 4.0b10). Neighbor-joining and ML trees were estimated using PAUP* or PhyML.

Viral diversity was measured by determining the ML pairwise genetic distances between all sequences obtained at a given time point in PAUP*. Viral divergence was measured by estimating, using ML criteria, a most recent common ancestor (ANC) sequence at the root node of each subject’s clade of sequences, using reference sequences (B.FR.83.HXB2 [K03455], B.US.83.RF [M17451], B.US.86.JRFL [U63632], B.US.90.WEAU160 [U21135]) from the HIV database as outgroups, as described elsewhere [4].
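As a rough illustration of what "diversity" means here, the sketch below averages pairwise distances over all sequences from one time point. It is a simplification: it uses the uncorrected p-distance (proportion of differing sites) rather than the ML distances under HKY85 or GTR + Γ + I that the study computed in PAUP*, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, ignoring gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_pairwise_diversity(sequences: Sequence[str]) -> float:
    """Mean p-distance over all pairs of sequences from one time point."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGT", "ACGA"]  # made-up sequences, not study data
    print(mean_pairwise_diversity(toy_alignment))
```

A model-based distance corrects for multiple substitutions at the same site, so it is at least as large as the p-distance; at the low diversities typical of early infection the two are close.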

Genotypic coreceptor analysis of the V3 loop was performed as described elsewhere. Potential N-linked glycosylation sites (PNLGS) were predicted using N-GLYCOSITE.
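A potential N-linked glycosylation site is conventionally the sequon N-X-S/T, where X is any amino acid except proline. The following Python sketch counts such sequons in a protein sequence; it is not the N-GLYCOSITE implementation, and the function name is hypothetical.

```python
import re

# Sequon N-X-[S/T] with X != P; the lookahead lets overlapping sequons be counted.
_SEQUON = re.compile(r"N(?=[^P][ST])")


def count_pnlgs(protein: str) -> int:
    """Count potential N-linked glycosylation sites in an amino acid sequence."""
    ungapped = protein.replace("-", "").upper()  # drop alignment gaps before scanning
    return sum(1 for _ in _SEQUON.finditer(ungapped))


if __name__ == "__main__":
    print(count_pnlgs("NGTNNS"))  # toy sequence, not study data
```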

Rates of disease progression were measured by time from seroconversion to a clinical AIDS-defining event (1993 Centers for Disease Control and Prevention definition), death, or CD4 cell count <200 cells/μL. Subjects who did not reach an AIDS end point were censored at time of initiation of highly active antiretroviral therapy or time of loss to follow-up. Statistical analysis was done using JMP software (version 5.1.2; SAS Institute).
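The censoring rule can be sketched as a small Python helper that turns per-subject dates into a follow-up time and an event flag, the usual input for survival analysis. The function and parameter names are hypothetical, and treating an end point reached after the censoring date as censored is an assumption made for this sketch.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_endpoint(
    seroconversion: date,
    endpoint: Optional[date],
    haart_start: Optional[date],
    last_follow_up: Optional[date],
) -> Tuple[int, bool]:
    """Return (days from seroconversion, event_observed).

    A subject is censored at the earlier of HAART initiation and loss to
    follow-up unless an end point (AIDS, death, or CD4 < 200 cells/uL)
    was reached on or before that date.
    """
    censor_candidates = [d for d in (haart_start, last_follow_up) if d is not None]
    censor_date = min(censor_candidates) if censor_candidates else None
    if endpoint is not None and (censor_date is None or endpoint <= censor_date):
        return (endpoint - seroconversion).days, True
    if censor_date is None:
        raise ValueError("subject has neither an end point nor a censoring date")
    return (censor_date - seroconversion).days, False


if __name__ == "__main__":
    # Made-up dates: no end point reached, HAART started before last follow-up.
    print(time_to_endpoint(date(1990, 1, 1), None, date(1990, 3, 1), date(1990, 6, 1)))
```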

This study was conducted with institutional review board approval from the University of Washington and the parent institutions of the MACS.


From 1984 through November 2004, a total of 6973 men were enrolled in the MACS, including 615 seroconverters, of whom 57 were identified as having a positive plasma HIV-1 RNA load at their last seronegative visit by systematic testing of the last seronegative visit of all seroconverters who had specimens available. Forty-five of the 57 subjects had RNA-positive and antibody-negative (RNA+Ab−) blood samples available for further analyses. We confirmed viral RNA positivity at the RNA+Ab− visit for 38 of the 45 subjects (table 1). In the 7 subjects for whom we could not confirm the presence of viral RNA (with a sensitivity of ~1–10 copies/PCR, or less than ~40–80 copies/mL of plasma; see Methods), the plasma viral loads determined previously by the Amplicor HIV-1 RNA assay (versions 1.0 and 1.5; cutoff of <400 copies/mL) were between 423 and 1029 copies/mL (whether these represent false-positive results or subsequent sample degradation occurring before our analysis could not be determined). These subjects were excluded from subsequent analyses. Fourteen (36.8%) of the 38 subjects had plasma viral loads >500,000 copies/mL at the RNA+Ab− visit, suggesting that samples were obtained from them near the time of peak viremia of primary infection.

Table 1
Characteristics of the Multicenter AIDS Cohort Study HIV RNA–positive and antibody-negative cohort.

Each of the 38 confirmed RNA+Ab− subjects was shown by viral phylogenetic analysis to be infected with HIV-1 subtype B. Each subject’s sequence population was monophyletic; hence, there was no evidence of dual infection (figure 1A). Nor was there any evidence of clustering of intersubject sequences; thus, there was no evidence of close epidemiologic linkages between subjects (figure 1A). Viral population heterogeneity of the env V1–V5 region was examined in the 37 subjects for whom env V1–V5 sequences were available (mean, 15.2 sequences/subject; range, 11–19). A wide distribution of intrasubject diversities was observed at both the amino acid (median, 0.86%; range, 0.2%–5.6%) and nucleotide levels (figure 1A and 1B and figure 2). There was no significant correlation between viral load at the seronegative visit and envelope V1–V5 region diversity (nucleotide or amino acid; P = .70 and P = .47, respectively).

Figure 1
A, Maximum-likelihood phylogenetic tree (gap stripped; implemented in PhyML) of the envelope V1–V5 region (567 independent clones from 37 subjects) at the HIV RNA-positive and antibody-negative visit. Median pairwise nucleotide diversity was 0.4% ...
Figure 2
Distributions of intrasubject mean amino acid diversity.

The number of unique variants replicating in, and potentially transmitted to, each subject was estimated by examining phylogenetically informative sites (nucleotide changes shared by 2 or more sequences). Six subjects (16%) had clonal populations (1 variant; i.e., no informative sites) and 17 (46%) had 0 or 1 informative site (1 or 2 unique variants), clearly suggesting outgrowth from a single unique variant. In contrast, 9 (24%) subjects had sequences with 7 or more informative sites (4–13 unique variants, counting an insertion or deletion of any length as an informative site), suggesting that multiple variants were likely to have been transmitted in these cases. In addition, 12 subjects (32%) had evidence of recombination between unique viral variants (figure 1C and 1D and figure 3).
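Counting phylogenetically informative sites can be sketched in Python as follows. The sketch uses the standard parsimony definition (a column is informative when at least two different states each occur in two or more sequences), which is assumed here to match "nucleotide changes shared by 2 or more sequences". Unlike the study, it ignores gaps rather than scoring each insertion or deletion as one informative site, and the function name is hypothetical.

```python
from collections import Counter
from typing import Sequence


def count_informative_sites(alignment: Sequence[str]) -> int:
    """Count alignment columns in which >= 2 states are each shared by >= 2 sequences."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base != "-")  # gaps ignored (simplification)
        shared_states = [base for base, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            informative += 1
    return informative


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ATGA", "ATGT"]  # made-up sequences, not study data
    print(count_informative_sites(toy_alignment))
```

A change seen in only one sequence (a "private" mutation) does not make a column informative, which is why such changes do not raise the variant count in this kind of analysis.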

Figure 3
Unique envelope gp120 viral variants in 37 subjects.

One (0.17%) of 587 V3 loop sequences from the 38 subjects had a genotype consistent with SI/X4 tropism (this sequence had a positive SI PSSM [syncytium-inducing position-specific scoring matrix] score but no canonical 11/25 mutation); thus, nearly all transmitted viruses were CCR5 tropic (figure 4).
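The "canonical 11/25 mutation" refers to the widely used 11/25 rule, under which a basic residue (arginine or lysine) at V3 position 11 or 25 suggests CXCR4 use. A minimal Python sketch of that rule follows. It covers only the 11/25 check, not the PSSM scoring also used in the study, and it assumes the input is an ungapped 35-residue V3 loop so that positions 11 and 25 can be read directly. The function name is hypothetical.

```python
BASIC_RESIDUES = {"R", "K"}
V3_LENGTH = 35  # assumed standard V3 loop length, Cys to Cys


def has_11_25_mutation(v3_loop: str) -> bool:
    """Return True if V3 position 11 or 25 (1-based) carries a basic residue."""
    v3 = v3_loop.upper()
    if len(v3) != V3_LENGTH:
        raise ValueError("expected an ungapped 35-residue V3 loop")
    return v3[10] in BASIC_RESIDUES or v3[24] in BASIC_RESIDUES


if __name__ == "__main__":
    # Illustrative subtype B-like V3 sequence (serine at 11, glutamate at 25).
    print(has_11_25_mutation("CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"))
```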

Figure 4
Envelope V3 loop amino acid variation.

We evaluated V1–V2 loop length variation and PNLGS in the subset of 487 sequences with open reading frames (ORFs) from 37 subjects (figure 5). The median length was 66 aa (range, 58–93 aa), with length variation detected in 7 subjects. Thirty-three subjects (89%) had variation in the number of PNLGS in this region. A mean of 5.6 PNLGS (range, 2–9) were found, which was strongly correlated with loop length (adjusted R² = 0.59; P < .0001).

Figure 5
Histogram of V1–V2 amino acid loop length vs. potential N-linked glycosylation sites (PNLGS).

Analysis of PNLGS in the 447 env V1–V5 sequences with ORFs also demonstrated a wide range of variation (data not shown). Only 4 (11%) of 37 subjects had the same numbers of PNLGS in all clones. However, there was no significant correlation between PNLGS and envelope V1–V5 nucleotide or amino acid diversity (P = .4 and P = .8, respectively).

We found no significant correlations between any of the aforementioned early viral genetic parameters (i.e., diversity, number of unique viral variants, divergence from the estimated ANC, V1–V2 loop length, PNLGS) and any HIV disease outcome measure or surrogate marker evaluated (i.e., set-point viral load and time from seroconversion to clinical AIDS, death, or CD4 cell count <200 cells/μL), self-reported risk factors for mode of transmission, or history of sexually transmitted infections in the 6 months preceding their visit (P > .05, for all comparisons). However, as expected, there was a significant correlation between set-point viral load at ~1 year after seroconversion and time from seroconversion to AIDS (adjusted R² = 0.3; P = .009).


In our study, approximately half (46%) of the subjects had HIV-1 gp120 gene populations shortly after transmission and before seroconversion that were substantially homogeneous (≤2 variants). Although such clonality may have emerged after transmission, this finding suggests that these subjects were infected with a single clonal population or unique viral variant. Because viral evolution occurs rapidly, it is not possible to determine how many of the remainder were infected with multiple variants, but the number is between 54% (the remainder of the subjects) and the conservative cutoff of 24% of individuals with multiple variants harboring at least 7 informative sites. Amino acid diversity ranged up to 5.6% over the envelope V1–V5 region, and variation in V1–V2 loop lengths (range, 58–93 aa) and PNLGS (range, 2–9) was also evident, as were putative recombinants. For this analysis, we omitted phylogenetically noninformative (“private”) mutations, because they were likely to have been introduced by viral replication in vivo early during infection or during PCR and were less likely to have been transmitted from the donor [8].

Understanding why early viral variants in certain subjects are heterogeneous or homogeneous may provide insight into host-virus interactions. Learn et al. [4] suggested that a marginally diverse infecting inoculum of HIV-1 envelope populations present very early during infection may become more homogeneous within a few months after infection, and Herbeck et al. [9] found evidence for evolution toward an ANC sequence early during infection, indicating that HIV recovers certain ancestral features when infecting a new host. In addition, Derdeyn et al. [10] found evidence that viruses with shorter envelope V1–V4 loop lengths and fewer PNLGS were transmitted to, or selectively grew in, the recipients.

The low level of diversity observed soon after infection may in part reflect the virus population diversity in the donor [12]. However, not all transmissions occur during acute/early infection in the donor, and some filtering and transmission bottlenecks clearly occur from donors with high viral diversity [10].

Our data are consistent with those of Ritola et al. [6] and Sagar et al. [13] showing that men can harbor complex viral populations early during infection, whereas Long et al. [3], who examined heterosexual transmission in Kenya in individuals infected with non-B HIV-1 subtypes, found lower levels of diversity in men compared with women. Differences in mode of transmission and potential differences in inoculum size at penile-vaginal versus anorectal mucosal surfaces may influence early viral replication dynamics and diversification.

In contrast to several studies that have suggested a direct correlation between viral diversity and rate of disease progression [5, 11, 14, 15], we did not observe any association between progression rate and any of the measures of early viral population heterogeneity. In contrast to the 4 aforementioned studies, which assessed surrogate markers for disease progression (i.e., viral load and rate of CD4 cell count decline), our study could assess for a correlation between early viral population diversity and actual time to clinical AIDS as well as the surrogate markers of set-point viral load and time to a CD4 cell count <200 cells/μL used in earlier studies. These differences and the smaller cohort size in 3 of these 4 studies (n = 12, n = 15, and n = 23) [5, 11, 14, 15] may help explain the discrepancy with our findings. The one large study (n = 156) [5] performed to date to address this question was conducted in Kenya, where multiple subtypes circulate (typically A, D, and C); primarily used the heteroduplex mobility assay as a qualitative measure of heterogeneity; and did not control for potentially faster progression linked to subtype D infection.

In conclusion, we have shown, in a cohort of men who have sex with men who were infected with HIV-1 subtype B, that variable levels of envelope gene and protein diversity are present during acute infection and before the establishment of substantial immune responses. Strategies to prevent HIV transmission or attenuate infection will likely have to take this potential viral diversity into account. (Supplemental data relevant to this study can be found at


This work was supported by the US Public Health Service (grants R01-AI058894 and R37-AI047734 to G.S.G., J.B.M., and J.I.M. and grant P01-AI57005 to J.I.M.) and the University of Washington Center for AIDS Research and STDs (grant P30-AI27757). The Multicenter AIDS Cohort Study is funded by the National Institute of Allergy and Infectious Diseases, with additional supplemental funding from the National Cancer Institute (grants UO1-AI-35042, 5-MO1-RR-00722 (GCRC), UO1-AI-35043, UO1-AI-37984, UO1-AI-35039, UO1-AI-35040, UO1-AI-37613, and UO1-AI-35041).

We thank the staff and clinicians of the Multicenter AIDS Cohort Study (MACS) and the study participants, without whom this study would not have been possible. Data used in this manuscript were collected by the MACS, which has centers (principal investigators) at The Johns Hopkins University Bloomberg School of Public Health (Joseph B. Margolick and Lisa Jacobson); the Howard Brown Health Center and Northwestern University Medical School (John Phair); the University of California, Los Angeles (Roger Detels and Beth Jamieson); and the University of Pittsburgh (Charles Rinaldo).


Potential conflicts of interest: none reported.

Presented in part: 14th Conference on Retroviruses and Opportunistic Infections, Los Angeles, 25–28 February 2007 (oral abstract 121).


1. Delwart EL, Sheppard HW, Walker BD, Goudsmit J, Mullins JI. Human immunodeficiency virus type 1 evolution in vivo tracked by DNA heteroduplex mobility assays. J Virol. 1994;68:6672–83. [PMC free article] [PubMed]
2. Shankarappa R, Margolick JB, Gange SJ, et al. Consistent viral evolutionary changes associated with the progression of human immunodeficiency virus type 1 infection. J Virol. 1999;73:10489–502. [PMC free article] [PubMed]
3. Long EM, Martin HL, Jr, Kreiss JK, et al. Gender differences in HIV-1 diversity at time of infection. Nat Med. 2000;6:71–5. [PubMed]
4. Learn GH, Muthui D, Brodie SJ, et al. Virus population homogenization following acute human immunodeficiency virus type 1 infection. J Virol. 2002;76:11953–9. [PMC free article] [PubMed]
5. Sagar M, Lavreys L, Baeten JM, et al. Infection with multiple human immunodeficiency virus type 1 variants is associated with faster disease progression. J Virol. 2003;77:12921–6. [PMC free article] [PubMed]
6. Ritola K, Pilcher CD, Fiscus SA, et al. Multiple V1/V2 env variants are frequently present during primary infection with human immunodeficiency virus type 1. J Virol. 2004;78:11208–18. [PMC free article] [PubMed]
7. Sagar M, Kirkegaard E, Lavreys L, Overbaugh J. Diversity in HIV-1 envelope V1–V3 sequences early in infection reflects sequence diversity throughout the HIV-1 genome but does not predict the extent of sequence diversity during chronic infection. AIDS Res Hum Retroviruses. 2006;22:430–7. [PubMed]
8. Liu S-L, Rodrigo AG, Shankarappa R, et al. HIV quasispecies and resampling. Science. 1996;273:415–6. [PubMed]
9. Herbeck JT, Nickle DC, Learn GH, et al. Human immunodeficiency virus type 1 env evolves toward ancestral states upon transmission to a new host. J Virol. 2006;80:1637–44. [PMC free article] [PubMed]
10. Derdeyn CA, Decker JM, Bibollet-Ruche F, et al. Envelope-constrained neutralization-sensitive HIV-1 after heterosexual transmission. Science. 2004;303:2019–22. [PubMed]
11. Markham RB, Wang WC, Weisstein AE, et al. Patterns of HIV-1 evolution in individuals with differing rates of CD4 T cell decline. Proc Natl Acad Sci USA. 1998;95:12568–73. [PubMed]
12. Gray RH, Wawer MJ, Brookmeyer R, et al. Probability of HIV-1 transmission per coital act in monogamous, heterosexual, HIV-1-discordant couples in Rakai, Uganda. Lancet. 2001;357:1149–53. [PubMed]
13. Sagar M, Kirkegaard E, Long EM, et al. Human immunodeficiency virus type 1 (HIV-1) diversity at time of infection is not restricted to certain risk groups or specific HIV-1 subtypes. J Virol. 2004;78:7279–83. [PMC free article] [PubMed]
14. Mani I, Gilbert P, Sankale JL, Eisen G, Mboup S, Kanki PJ. Intrapatient diversity and its correlation with viral setpoint in human immunodeficiency virus type 1 CRF02_A/G-IbNG infection. J Virol. 2002;76:10745–55. [PMC free article] [PubMed]
15. Chohan B, Lang D, Sagar M, et al. Selection for human immunodeficiency virus type 1 envelope glycosylation variants with shorter V1–V2 loop sequences occurs during transmission of certain genetic subtypes and may impact viral RNA levels. J Virol. 2005;79:6528–31. [PMC free article] [PubMed]