Background. Human immunodeficiency virus (HIV) superinfection has been documented in high-risk individuals; however, the rate of superinfection among HIV-infected individuals within a general population remains unknown.
Methods. A novel next-generation ultra-deep sequencing technique was utilized to determine the rate of HIV superinfection in a heterosexual population by examining two regions of the viral genome in longitudinal samples from recent HIV seroconverters (n = 149) in Rakai District, Uganda.
Results. The rate of superinfection was 1.44 per 100 person years (PYs) (95% confidence interval [CI], .4–2.5) and consisted of both inter- and intrasubtype superinfections. This was compared to primary HIV incidence in 20 220 initially HIV-negative individuals in the general population in Rakai (1.15 per 100 PYs; 95% CI, 1.1–1.2; P = .26). Propensity score matching (PS) was used to control for differences in sociodemographic and behavioral characteristics between the HIV-positive individuals at risk for superinfection and the HIV-negative population at baseline and follow-up. After PS matching, the estimated rate of primary incidence was 3.28 per 100 PYs (95% CI, 2.0–5.3; P = .07) controlling for baseline differences and 2.51 per 100 PYs (95% CI, 1.5–4.3; P = .24) controlling for follow-up differences.
Conclusions. This suggests that the rate of HIV superinfection in a general population is substantial, which could have a significant impact on future public health and HIV vaccine strategies.
Human immunodeficiency virus (HIV) superinfection occurs when an HIV-infected individual acquires a new viral strain that is phylogenetically distinct from all detectable viral strains at a previous time point. Inter- and intrasubtype HIV superinfections have been reported in high-risk individuals exposed through sexual or intravenous drug use [1–12]. HIV superinfection has often been found to be relatively frequent, particularly if multiple genomic sites are examined [6, 9, 13–15]. Other researchers have found no evidence of superinfection in either small- or large-scale studies; however, these studies utilized clonal analyses that were likely not sensitive enough to detect the levels of virus observed in some superinfection cases [16–18].
These discrepancies partly reflect differences in the techniques used to identify and verify superinfection. Initial studies of the frequency of superinfection utilized heteroduplex mobility or multiregion hybridization assays followed by selective clonal analysis [6, 7, 16]. Other studies preselected high-risk individuals and employed in-depth cloning techniques to identify possible superinfections [6, 9, 18]. The sensitivity of these assays as possible screening techniques is determined by the number of clones amplified and the number of regions examined [9, 15, 20]. Therefore, to obtain the sensitivity needed to identify a variant circulating within an individual at approximately 1% of the total circulating viral population, over 100 clones would need to be examined per sample, an approach that becomes prohibitively labor intensive for large-scale population studies of superinfection [15, 19, 20]. We therefore recently designed and tested a novel, highly sensitive, high-throughput, next-generation, ultra-deep sequencing technique and sequence analysis protocol to identify HIV superinfection in 2 regions of the viral genome: the p24 region of the viral capsid and the gp41 region of the viral envelope. Using this procedure, we examined HIV seroconverters from the Rakai Community Cohort Study (RCCS) in Rakai District, Uganda, to determine the incidence of HIV superinfection in this population.
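The dependence of clonal-assay sensitivity on the number of clones examined can be illustrated with a simple sampling calculation. The sketch below assumes that each clone is an independent random draw from the circulating viral population (a binomial model that ignores amplification bias); this model and the function names are illustrative assumptions, not part of the published protocol.

```python
import math


def detection_probability(variant_fraction: float, n_clones: int) -> float:
    """Chance that at least one of n_clones carries the minor variant,
    assuming independent draws from the viral population (an assumption)."""
    return 1.0 - (1.0 - variant_fraction) ** n_clones


def clones_needed(variant_fraction: float, target_probability: float) -> int:
    """Smallest number of clones that reaches the target detection probability
    under the same independent-sampling assumption."""
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - variant_fraction)
    )


if __name__ == "__main__":
    # A variant at 1% of the population, screened with 100 clones.
    print(round(detection_probability(0.01, 100), 3))
    # Clones required for a 95% chance of sampling that variant at least once.
    print(clones_needed(0.01, 0.95))
```

Under this simplified model, 100 clones give roughly a 63% chance of sampling a 1% variant at least once, and close to 300 clones are needed for a 95% chance, which is consistent with the point that clonal screening becomes impractical at population scale.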
The RCCS is a rural, community-based, open cohort of persons aged 15–49 years in Rakai District in southwestern Uganda, which has been described in detail previously. Since 1994, interviews and venous blood samples have been obtained annually from approximately 14 000 consenting adults living in 50 villages. All subjects provided written informed consent for sample storage and testing. The study was approved by the Science and Ethics Committee of the Uganda Virus Research Institute, the Uganda National Council for Research and Technology, Western Institutional Review Board, and the Committee on Human Research at Johns Hopkins Bloomberg School of Public Health.
Known seroconverters who had a positive HIV serological result within 2 years of a prior negative test between 1998 and 2004 and who had also provided at least 1 subsequent serological sample prior to 2009 were randomly selected for examination from the RCCS population (n = 203) [21, 22]. To identify HIV superinfection, viral RNA was isolated from serum obtained at the seroconverter's first HIV-positive time point (baseline) and the latest time point available prior to initiation of antiretroviral therapy (ART), loss to follow-up, or death. Subjects were excluded from analysis if neither genomic region in the baseline sample could be amplified (n = 14). Subjects were also excluded if the follow-up genomic region corresponding to the amplified baseline sample failed to amplify (n = 40). The remaining subjects’ (n = 149) viral RNA extracts were initially amplified by reverse-transcription polymerase chain reaction (RT-PCR) in duplicate; the resulting products were pooled and subsequently amplified in a nested PCR strategy using barcoded primers specific for use on the 454 pyrosequencer platform.
Briefly, amplicons of p24 (approximately 390 base pairs [bp]) and gp41 (approximately 324 bp) were amplified and sequenced as previously described. The Amplicon Library Preparation Method was performed as recommended by the manufacturer (Roche), and all PCR products were purified with the following minor alterations. In an effort to eliminate the capture of primers, the bead-to-target ratio was reduced by incubating 30 μL of AMPure Beads XP (Agencourt; Beckman Coulter Genomics) with 25 μL of PCR product diluted in 25 μL of water. Purified PCR products were quantified using PicoGreen (Invitrogen), and each template was diluted to a 1 × 10⁹ molecules/μL stock. The amplicon pools were made by combining 5 μL of each diluted barcoded template to make a final 1 × 10⁹ molecules/μL stock containing 14 barcoded amplicons.
Preparation of templated beads for next-generation sequencing (NGS) followed the emPCR Method Manual-Lib-L-MV (Roche). The 1 × 10⁹ molecules/μL library pools were diluted to 1 × 10⁵ molecules/μL for a target addition of 0.175 copies per bead to the DNA capture beads. Enriched DNA capture beads were sequenced on the 454 (Roche) per the manufacturer's instructions using a 4-region gasket.
Sequencing results were analyzed using the GS Amplicon Variant Analyzer version 2.5 (Roche). All sequence reads were compared, and similar sequences were combined into a single consensus sequence. Generated consensus sequences that were within 10 bases from both ends of the amplicon and consisted of a cluster of >10 individual, near-identical sequences were determined using the Roche Amplicon software and were classified as being consensus sequences of HIV variants. These consensus sequences were used for subsequent phylogenetic analysis. The lower limit of detection of intersubtype minor variants for this assay was previously shown to be 1%.
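The two acceptance criteria for a consensus sequence can be expressed as a small filter. The sketch below is illustrative only: the actual clustering and filtering were performed by the Roche Amplicon software, the `Candidate` record and its field names are hypothetical, and "within 10 bases from both ends" is interpreted here as a gap of at most 10 bases at each end of the amplicon.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one consensus sequence (not the Roche format)."""
    name: str
    start: int       # 0-based amplicon position where the consensus begins
    end: int         # amplicon position just past the last covered base
    read_count: int  # number of near-identical reads in the cluster


def passes_filter(c: Candidate, amplicon_length: int,
                  max_end_gap: int = 10, min_reads: int = 11) -> bool:
    """Keep a consensus that reaches within max_end_gap bases of both amplicon
    ends and is built from more than 10 reads (that is, at least 11)."""
    covers_start = c.start <= max_end_gap
    covers_end = (amplicon_length - c.end) <= max_end_gap
    return covers_start and covers_end and c.read_count >= min_reads


if __name__ == "__main__":
    gp41_length = 324  # approximate gp41 amplicon length given in the text
    candidates = [
        Candidate("full_length_cluster", 0, 324, 50),
        Candidate("too_few_reads", 3, 320, 10),
        Candidate("truncated_start", 40, 324, 200),
    ]
    for c in candidates:
        print(c.name, passes_filter(c, gp41_length))
```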
HIV superinfection was defined in an individual whose follow-up serum sample demonstrated >2 distinct consensus sequences forming a monophyletic cluster that was phylogenetically unlinked from all of the individual's consensus sequences in the baseline sample and was of adequate genetic distance from the baseline sequences to rule out natural evolutionary drift. In order to be considered a superinfection, the genetic distance of the new monophyletic cluster from the closest related viral sequences found at the earlier time point had to be ≥0.55% per year for the p24 region or ≥0.98% per year for the gp41 region. All newly identified consensus sequences were phylogenetically compared with the most prominent strains of the other barcoded samples within the NGS runs to search for microcontamination, misclassification, or sequencing errors. If instances of these errors were found, these consensus sequences were eliminated. Possible superinfection events were reamplified and verified in a second 454 sequencing run. The NGS consensus sequences for gp41 and p24 are available on the Los Alamos National Laboratory HIV-DB Next Generation Sequence Archive (http://www.hiv.lanl.gov/content/sequence/HIV/NextGenArchive/) and are also available upon request from the corresponding author. Total sequence reads and consensus sequences from the total population were each compared for both genomic regions at baseline and follow-up with the Kruskal–Wallis test.
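The genetic-distance criterion can be sketched as a per-year threshold check. The example assumes that the criterion is applied by dividing the percentage distance to the closest baseline sequence by the number of years between samples; that reading, and the function and variable names, are assumptions made for illustration. The phylogenetic requirements (a monophyletic, unlinked cluster of consensus sequences) are not modeled here.

```python
# Minimum divergence, in percent per year, taken from the text above.
DRIFT_THRESHOLDS = {"p24": 0.55, "gp41": 0.98}


def exceeds_drift_threshold(region: str, distance_percent: float,
                            years_between: float) -> bool:
    """Return True when the distance from the closest baseline sequence,
    expressed per year of follow-up, meets the region-specific threshold.
    Only the distance criterion is checked, not the phylogenetic ones."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return distance_percent / years_between >= DRIFT_THRESHOLDS[region]


if __name__ == "__main__":
    # Illustrative values only; these are not measurements from the study.
    print(exceeds_drift_threshold("gp41", 5.0, 2.84))
    print(exceeds_drift_threshold("gp41", 2.0, 2.84))
    print(exceeds_drift_threshold("p24", 2.0, 2.84))
```

With these made-up inputs, a 2.0% distance over 2.84 years (about 0.70% per year) would meet the p24 threshold but not the gp41 threshold.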
Serum HIV type 1 RNA concentrations (viral loads) were determined by the Amplicor v1.5 (Roche Diagnostics).
The HIV incidence rate was estimated for the entire RCCS population among participants with a negative sample between 1998 and 2008 and at least 1 follow-up sample prior to 2009 (n = 20 220), providing 100 550 person years (PYs) at risk. Human immunodeficiency virus infection was assumed to have occurred at the midway point between the last negative and first positive HIV test.
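The midpoint assumption translates directly into a person-time calculation. The sketch below is a minimal illustration: the choice of 365.25 days per year, the rule that participants who remain negative contribute time up to their last negative test, and the function names are assumptions, since the text does not spell out these details.

```python
from datetime import date
from typing import Optional


def midpoint(last_negative: date, first_positive: date) -> date:
    """Assumed infection date: halfway between the last negative and the
    first positive test (odd intervals round down to the whole day)."""
    return last_negative + (first_positive - last_negative) / 2


def person_years_at_risk(first_negative: date, last_negative: date,
                         first_positive: Optional[date] = None) -> float:
    """Time at risk from the first negative test to either the assumed
    infection date (seroconverters) or the last negative test (others)."""
    if first_positive is not None:
        end = midpoint(last_negative, first_positive)
    else:
        end = last_negative
    return (end - first_negative).days / 365.25


if __name__ == "__main__":
    # Illustrative dates only; these are not participant data.
    print(midpoint(date(2000, 1, 1), date(2001, 1, 1)))
    print(person_years_at_risk(date(1999, 1, 1), date(2000, 1, 1),
                               date(2001, 1, 1)))
```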
The rate of superinfection among the amplified HIV-positive samples was initially compared with the rate of primary HIV infection among the general population using a univariate Poisson log-linear model. Because the sociodemographic characteristics and risk behavior profiles of the individuals tested for superinfection were different from those of the general HIV-negative population at risk of primary HIV infection, propensity score matching was used to compare the rates of superinfection and primary infection in the general HIV-negative population matched for the sociodemographics and behaviors of the HIV-positive population at risk of superinfection [24, 25]. The baseline and follow-up sociodemographic and behavior variables were compared between the two populations using χ² tests, and any variable that differed significantly between the groups at P = .05 was entered in the logistic regression model to estimate the propensity score. The objective of constructing a propensity score is to use a 1-dimensional balance score (ie, the propensity score) to summarize all the measured variables that differ between the two populations. Additionally, because the follow-up time differed between the two populations, duration of follow-up was also entered in the propensity score model in order to balance the exposure times between the populations. One-nearest neighbor matching on the propensity score was applied to select the subjects in the general population who had similar profiles to the subjects in the HIV-positive population at risk of superinfection. Further analyses were conducted to check the balance in each sociodemographic and risk behavior variable between the populations. A Poisson log-linear model was then used to estimate the expected incidence in the general population given the characteristics and behaviors of the HIV-positive population at risk of superinfection. The propensity score matching was conducted using the MatchIt R package [24, 25].
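The one-nearest-neighbor step can be illustrated independently of the R package. The sketch below is not the MatchIt implementation used in the study: it assumes that propensity scores have already been estimated by the logistic regression model, and it uses greedy matching without replacement, processing the HIV-positive group from the highest score downward. Those design choices, the identifiers, and the scores are all illustrative assumptions.

```python
from typing import Dict, Hashable


def nearest_neighbor_match(
    index_group: Dict[Hashable, float],
    comparison_group: Dict[Hashable, float],
) -> Dict[Hashable, Hashable]:
    """Pair each index subject with the unused comparison subject whose
    propensity score is closest (greedy 1:1 matching without replacement).

    Both arguments map a subject identifier to a precomputed propensity score.
    """
    available = dict(comparison_group)
    matches: Dict[Hashable, Hashable] = {}
    # Match the hardest-to-match subjects (highest scores) first.
    ordered = sorted(index_group.items(), key=lambda item: item[1], reverse=True)
    for subject_id, score in ordered:
        if not available:
            break  # no comparison subjects left to match
        best = min(available, key=lambda cid: abs(available[cid] - score))
        matches[subject_id] = best
        del available[best]
    return matches


if __name__ == "__main__":
    # Made-up scores for illustration; not data from the study.
    hiv_positive = {"p1": 0.80, "p2": 0.30}
    hiv_negative = {"n1": 0.75, "n2": 0.35, "n3": 0.10}
    print(nearest_neighbor_match(hiv_positive, hiv_negative))
```

After matching, covariate balance between the matched groups would still need to be checked, as described above, before the Poisson model is fitted to the matched comparison subjects.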
Of the 149 individuals examined for superinfection, valid pyrosequencing results were obtained for the p24 and gp41 genomic regions at both time points for 109 individuals (73.2%). Thirty-one individuals had valid results at both time points for the gp41 region only (20.8%), and 9 had p24 results only (6.0%). There were no significant differences in the total reads between the baseline p24 (median, 10 998; interquartile range [IQR], 8262–14 549), baseline gp41 (median, 10 581; IQR, 8309–12 713), and follow-up p24 regions (median, 11 386; IQR, 9488–13 638). However, the follow-up gp41 total reads (median, 8923; IQR, 7186–11 617) were significantly lower than all 3 (P < .05). There were no significant differences between the consensus sequence totals for the baseline p24 (median, 74; IQR, 43–108), baseline gp41 (median, 70; IQR, 54–87), follow-up p24 (median, 79; IQR, 56–104), and follow-up gp41 regions (median, 72; IQR, 48–91) (P = .26).
At initial seroconversion, 91 individuals were infected with a single subtype D viral population (61.1%), 24 individuals were infected with a single subtype A viral population (16.1%), 1 individual was infected with a subtype C viral population (0.7%), and 33 individuals were infected with either recombinant viruses or multiple viral populations (22.1%). This is consistent with the HIV subtype distribution previously observed in Rakai. The median time interval between the baseline and follow-up time points was 2.84 years (IQR, 1.64–5.08; range, 0.99–7.47).
Seven cases of HIV superinfection were identified over 485.7 PYs of follow-up, for an HIV superinfection incidence of 1.44 per 100 PYs (95% confidence interval [CI], .37–2.51) (Figures 1 and 2; Supplementary Figure 1). All superinfection events were detected in the gp41 region. In 3 cases, there were no sequences available for the p24 region, and in the remaining cases there were no new sequences detected in the p24 region. In addition, all 7 superinfected individuals were initially infected with HIV subtype D at baseline (Table 1). Four of the superinfection events were intrasubtype events with initial subtype D–infected individuals superinfected with a novel subtype D strain(s) (Figure 1; Supplementary Figure 1A and 1B). The other 3 cases were intersubtype superinfection events with new subtype A viral populations being found in the follow-up sample (Figure 2; Supplementary Figure 1C). There was no consistent change in the viral loads before and after superinfection (Table 1).
The HIV incidence rate was estimated for the entire RCCS population among participants with a negative sample between 1998–2008 and at least one follow-up sample in the same time period (n = 20 220 with 100 550 PYs of follow-up) (Table 2). In this population, 1152 HIV seroconversion events were identified for an unadjusted HIV incidence of 1.15 per 100 PYs (95% CI, 1.08–1.21). The rate of HIV superinfection did not differ from this unadjusted primary incidence rate (P = .26).
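The reported rates and confidence intervals can be checked from the event counts and person-time. The sketch below uses a normal (Wald) approximation for a Poisson count, which reproduces the published intervals to two decimal places; that the authors used this particular interval is an inference from the numbers rather than something stated in the text, and the function name is illustrative.

```python
import math
from typing import Tuple


def incidence_per_100py(events: int, person_years: float,
                        z: float = 1.96) -> Tuple[float, float, float]:
    """Incidence per 100 person-years with a Wald interval, treating the
    event count as Poisson so that SE(rate) = sqrt(events) / person_years."""
    rate = events / person_years
    standard_error = math.sqrt(events) / person_years
    lower = rate - z * standard_error
    upper = rate + z * standard_error
    return 100 * rate, 100 * lower, 100 * upper


if __name__ == "__main__":
    # Superinfection: 7 events over 485.7 PYs.
    print([round(x, 2) for x in incidence_per_100py(7, 485.7)])
    # Primary infection: 1152 seroconversions over 100 550 PYs.
    print([round(x, 2) for x in incidence_per_100py(1152, 100550)])
```

The far wider interval for superinfection reflects the small number of events (7) rather than a difference in method.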
At baseline and follow-up, the sociodemographic and behavioral characteristics of the HIV-infected population at risk for superinfection were markedly and significantly different from the HIV-negative population at risk of primary incident infection (Table 2). At baseline, members of the HIV-infected population at risk of superinfection were older and more likely to be female, to have experienced marital dissolution, to be sexually active, to use condoms inconsistently, and to consume alcohol before sex, all of which are known risk factors for HIV acquisition in this population. At follow-up, members of the population tested for superinfection were older and more likely to be female, to have experienced marital dissolution, and to consume alcohol with sex, but they were not more sexually active than the general population. Propensity score matching was used to estimate HIV incidence in the general population adjusted for these differences in sociodemographic and behavioral characteristics. When the general population was matched to the HIV-infected population's characteristics at baseline, the estimated HIV incidence rate was 3.28 per 100 PYs (95% CI, 2.0–5.3), which was higher than the observed rate of superinfection, and the difference was of borderline statistical significance (P = .07). When the populations were propensity score matched by sociodemographic and behavioral characteristics at follow-up, the estimated primary incidence rate was 2.51 per 100 PYs (95% CI, 1.5–4.3), which was not significantly different from the observed rate of superinfection (P = .24).
This is the first large-scale study of the rate of HIV superinfection in a general heterosexual population using a validated, highly sensitive, next-generation, ultra-deep sequencing technique. The finding of inter- and intrasubtype superinfection events supports earlier work in a group of high-risk female sex-workers in Kenya [6, 9, 15]. Other studies have found few or no cases of superinfection, even in European homosexual male populations reporting high-risk exposures, but this may be due to the less-sensitive clonal analyses used [17, 18].
In addition, this is the first study to compare the rate of HIV superinfection to primary HIV incidence in a general heterosexual population in sub-Saharan Africa. The rate of HIV superinfection (1.44 per 100 PYs) did not differ significantly from the unadjusted primary HIV incidence rate (1.15 per 100 PYs). However, after matching for the baseline characteristics and behaviors, the adjusted primary incidence (3.28 per 100 PYs) was higher than the observed incidence of superinfection, and this was of borderline statistical significance. When the follow-up behavioral characteristics were used to derive the propensity score matching, the adjusted primary HIV incidence (2.51 per 100 PYs) was nonsignificantly higher than the rate of HIV superinfection. This difference in estimated rates based on baseline or follow-up propensity score matching may reflect changes in behavior among HIV-infected individuals after learning their status and decreased risky behavior due to HIV disease progression. In addition, our analysis is somewhat constrained by the relatively small number of superinfections detected, which limited the power of the study to detect a small difference in unadjusted or propensity score–matched incidence rates. Nonetheless, these data suggest that the rate of superinfection in a generalized heterosexual epidemic is substantial and could have a significant impact on the ongoing epidemic.
Almost all HIV transmission in the RCCS population occurs through vaginal heterosexual intercourse, and therefore the findings may not be generalizable to populations with other modes of transmission. Prior studies in men who have sex with men and in commercial sex-workers in Kenya suggest that rates of HIV superinfection may be increased in populations with higher primary HIV incidence [9, 14]. It has also been shown that cellular components of the natural immune response to HIV do not seem to be protective against superinfection [2, 29]. However, there is some evidence that neutralizing antibody responses may be associated with protection.
There is previous evidence that HIV superinfection can have detrimental clinical effects, even in individuals who were previously controlling their HIV infection [2, 11, 14, 31]. These data suggest that post-test counseling of HIV-infected individuals needs to emphasize the risk of HIV superinfection and the possible clinical implications of continued unsafe behaviors. These results also have significant implications for estimations of the age of the HIV epidemic and for phylogenetic modeling of viral evolution because many of these models assume that superinfection is not occurring. In addition, the finding that superinfection is common and occurs within and between HIV subtypes suggests that the immune response elicited by primary infection confers limited protection and raises concerns that vaccine strategies designed to replicate the natural anti-HIV immune response may have limited effectiveness. In summary, the finding that superinfection is a relatively frequent event has substantial implications for HIV prevention, clinical management, and future vaccine development.
Supplementary materials are available at The Journal of Infectious Diseases online (http://jid.oxfordjournals.org/). Supplementary materials consist of data provided by the author that are published to benefit the reader. The posted materials are not copyedited. The contents of all supplementary data are the sole responsibility of the authors. Questions or messages regarding errors should be addressed to the author.
Acknowledgments. The authors would like to thank all the participants of the Rakai cohort and the staff of the Rakai health science program. We would like to especially thank Susanna Lamers for her assistance in sequence submission.
Financial support. This study was supported in part by funding from the Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH); Office of AIDS Research, NIH; Division of AIDS, NIAID (R01 A134826 and R01 A134265); National Institute of Child Health & Human Development (R01 HD 050180 and 5P30HD06826); HPTN Network Lab grant (U01-AI-068613); the Gates Foundation (22006.03), Henry M. Jackson Foundation; and the Fogarty Foundation (grant 5D43TW00010). A. A. R. T. was supported by the NIH (1K23AI093152-01A1) and Doris Duke Charitable Foundation Clinician Scientist Development Award (2011036).
Potential conflicts of interest. All authors: No reported conflicts.
All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.