Human immunodeficiency virus type 1 (HIV-1) Tat is a mediator of viral transcription and is involved in the control of virus replication. However, associations between HIV-1 Tat diversity and functional effects during primary HIV-1 infection are still unclear. We estimated selection pressures in tat exon 1 using the mixed-effects model of evolution with 672 viral sequences generated from 20 patients infected with HIV-1 subtype C (HIV-1C) over 500 days postseroconversion. tat exon 1 residues 3, 4, 21, 24, 29, 39, and 68 were under positive selection, and we established that specific amino acid signature patterns were apparent in primary HIV-1C infection compared with chronic infection. We assessed the impact of these mutations on long terminal repeat (LTR) activity and found that Tat activity was negatively affected by the Ala21 substitution identified in 13/20 (65%) of patients, which reduced LTR activity by 88% (±1%) (P < 0.001). The greatest increase in Tat activity was seen with the Gln35/Lys39 double mutant that resulted in an additional 49% (±14%) production of LTR-driven luciferase (P = 0.012). There was a moderate positive correlation between Tat-mediated LTR activity and HIV-1 RNA in plasma (P = 0.026; r = 0.400) after 180 days postseroconversion that was reduced by 500 days postseroconversion (P = 0.043; r = 0.266). Although Tat activation of the LTR is not a strong predictor of these clinical variables, there are significant linear relationships between Tat transactivation and patients' plasma viral loads and CD4 counts, highlighting the complex interplay between Tat mutations in early HIV-1C infection.
Global incidence data from 2010 estimate that ~2.7 million people were newly infected with human immunodeficiency virus (HIV), contributing to a prevalence of ~34 million, with subtype C accounting for the majority of worldwide infections (1). Transmission rates are highest during primary infection (2, 3), and as a transactivator of HIV transcription, the viral protein Tat is a large contributor to the rate of viral replication. It is widely accepted that Tat is an important mediator of HIV type 1 (HIV-1) disease progression: Tat plays a central role in the regulation of HIV-1 gene expression by transactivating viral transcription (reviewed in reference 4), in addition to altering cellular gene expression (5–7). Tat interacts with RNA polymerase II (RNAPII) solely to mediate elongation of viral RNA; in the absence of Tat, full-length transcripts are rarely produced, as RNAPII prematurely dissociates from the template during early transcription, leading to the accumulation of abortive short viral mRNA.
Without external stimuli, latently infected CD4+ T cells and monocytes restrict HIV expression via cellular components that negatively regulate the proviral long terminal repeat (LTR), suppressing the production of complete viral transcripts (8–10). This blockade can be relieved by Tat, which is produced from rare full-length transcripts and binds to the trans-activation response element (TAR) (11, 12), an RNA secondary structure at the 5′ end of nascent transcripts. The early, inefficient transcription of viral mRNA is further enhanced by Tat recruiting positively regulating cellular transcription factors to the 5′ LTR, ultimately facilitating the accumulation of full-length viral mRNA. Transcription is initially promoted by the modification of chromatin and the acetylation of histones by the CREB-binding protein (CBP)/p300 complex (13), recruited to the HIV promoter by Tat (14–16). The affinity of the CBP/p300 complex for components of the basal transcription machinery, such as the TATA-binding protein and transcription factor IIB (17), is subsequently increased through conformational alteration by interacting directly with Tat. Corresponding acetylation of key Tat residues by the CBP/p300 complex enhances the subsequent engagement of Tat with the positive transcription elongation factor P-TEFb complex (12, 18, 19), composed of cyclin T1 and CDK9 subunits, permitting crucial phosphorylation of serine residues in the C-terminal domain (CTD) of RNAPII, resulting in complete viral gene expression (20). Additional phosphorylation events occur between Tat and cellular transcription factors, such as NF-κB and Sp-1 (21, 22), which play critical roles in the regulation of viral gene transcription. 
Subtype differences exist in the configuration of these and other transcription factor binding sites within the LTR, which have been reported to alter the responsiveness of HIV transcription to external stimuli: a single NF-κB binding site is present in the promoter of HIV-1 subtype E (HIV-1E), yet subtype B contains two with subtype C containing at least three, but as many as four, NF-κB consensus sequences (23). The variation in the number of transcription factor-responsive elements in the LTR has been shown to differentially regulate viral transcription (24) and replication in a dose-dependent manner upon exposure of HIV-infected cells to cytokines, such as tumor necrosis factor alpha (TNF-α) (25–27). Furthermore, the transactivation potential of Tat from different subtypes is not uniform: subtype E and C Tat are strong mediators of LTR transcription compared to subtype B (28), and this is thought to be due to a higher affinity for the TAR hairpin (28). The half-life of Tat from subtype E is almost twice as long as that from subtypes B and C, which may be a compensatory mechanism for the reduction in NF-κB binding (28). Intrasubtype differences also exist, with a recent signature pattern analysis identifying unique residues in Tat, with higher frequencies of Ala21, Asn24, Lys29, Lys40, and Gln60 from HIV-1 subtype C (HIV-1C) circulating in southern India compared to southern Africa (29, 30).
The restoration of complete viral gene expression by Tat reinforces its role as a major contributor to the establishment and enhancement of HIV replication during primary infection. During this time, plasma viral levels of HIV-1 are between 10⁵ and 10⁸ RNA copies/ml (31, 32), yet the levels decrease by 2 to 3 orders of magnitude within 4 to 6 months (33, 34). However, prolonged high viremia has been reported in a subset of patients during primary HIV-1 subtype C infection, through a yet undefined mechanism (35, 36). It is during this primary phase of infection that the reservoir of latently infected, resting CD4+ T cells is established and that Tat, particularly variants with impaired activity, may contribute to the formation of a pool of quiescently infected cells (37).
The critical nature of Tat in controlling virus replication through augmented gene expression raises the question of how much sequence diversity Tat can endure while maintaining function, particularly during early infection when the establishment of a productive infection is paramount. It has been reported for HIV subtype B and simian immunodeficiency virus (SIV) that tat is one of the earliest viral genes to be under host selection pressure (38, 39), leading to sequence diversity early in infection, yet Tat can tolerate up to 40% sequence variation and still conserve its function (40). Such rapid mutation of viral cytotoxic T lymphocyte (CTL) epitopes in Tat, and other viral proteins, constrains successful HIV vaccine development. Data on differences between genes of regulatory and virion-associated HIV proteins suggest that HIV may actually evolve to permit an immune response against viral proteins that are not as essential as Tat and Rev during early replication, such as Gag, as a decoy (41). Nevertheless, levels of anti-Tat antibodies negatively correlate with disease progression, suggesting that Tat plays a major role in determining AIDS progression (42, 43).
The conserved nature of Tat is compatible with its role as an important target for possible HIV treatments, as well as a logical vaccine candidate (44, 45). Current phase II trials in South Africa are evaluating a therapeutic Tat-based vaccine in anti-Tat antibody-negative individuals on antiretroviral therapy (ART) with high CD4+ T cell counts (Istituto Superiore di Sanità [ISS] T-003 trial; ClinicalTrials.gov registration no. NCT01513135) (46). However, the expansion of Tat vaccine trials to incorporate HIV prevention will not be possible without a thorough understanding of the genetic and associated immunogenic characteristics of Tat in transmitted virus and during primary infection. We hypothesized that Tat diversity would be evident in patients with HIV-1 subtype C primary infection and that subsequent disparities in the transactivation potential of these Tat variants would be associated with plasma viral load. To our knowledge, this is the first study to genetically characterize HIV-1 tat during primary HIV-1C infection using single-genome amplification. To address this gap in HIV research, we asked whether HIV-1 subtype C tat exon 1 was different when comparing primary infection to chronic infection.
Patients were enrolled in an HIV-1C primary infection cohort in Botswana (47, 48), and a subset of 20 subjects was selected based on the stage of HIV infection: eight acutely infected individuals (patient code A to H) and 12 randomly selected seroconverters (patient code OC to QU) (Table 1). Acutely infected individuals were identified before seroconversion by a positive HIV-1 reverse transcription-PCR (RT-PCR) test with negative HIV-1 serology (Fiebig stage II) (49). Seroconverters were identified in the early stage of HIV-1 infection (Fiebig stages IV to VI). The time of seroconversion (day 0) was estimated as the midpoint between the last seronegative test and the first seropositive test for the acutely infected subjects and by the midpoint of the corresponding Fiebig stage for the recently infected subjects (47). Table 1 provides baseline clinical parameters, demographic information, and sampling characteristics summarized for the patients enrolled in the acute and early phases of primary infection. Written informed consent was obtained from all study participants, and ethical approval for this research was obtained from the Human Research Development Committee of the Botswana Ministry of Health and the Office of Human Research Administration at the Harvard School of Public Health.
Viral RNA extraction from plasma samples was carried out using the QIAamp viral RNA minikit (Qiagen, Valencia, CA) according to the manufacturer's instructions followed by single-genome amplification as described previously (50). Briefly, reverse transcription of viral RNA was performed using SuperScript III (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. The single-genome amplification was based on the method of limiting dilutions (51) and was used with minor modifications. HIV-1C primary infection tat exon 1 sequences are publicly available in GenBank under accession numbers JQ895561 to JQ896230.
To assess the relationships between newly generated quasispecies from the 20 subjects in this study, JModelTest (52) was used to select the optimal evolutionary model. A maximum likelihood (ML) phylogenetic tree was constructed using PhyML (53). An SIV tat exon 1 sequence (Ref.CPZ.US.85.US_Marilyn.AF103) was used to root the tree. The tree was visualized in FigTree v1.1.2 (http://tree.bio.ed.ac.uk/software/figtree/) and MEGA5 (54).
Positive-selection analyses were performed under a likelihood framework by investigating for selection in a population-wise manner. Accordingly, signatures of selection were examined using the newly available mixed-effects model of evolution (MEME) available in HyPhy 2.11 (55–57). We used the Hasegawa-Kishino-Yano (HKY) nucleotide substitution model (58) crossed with the Muse and Gaut 1994 (MG94) codon model (59). The significance level was set at 0.1 for a site to be considered under selection. Those sites under positive selection were mapped to the Tat crystal structure (Protein Data Bank [PDB] accession no. 3MI9) and visualized using the Visual Molecular Dynamics program (VMD) (60). The sequence logo was generated with the online web logo tool at the University of California at Berkeley (UC Berkeley) (http://weblogo.berkeley.edu).
Single-patient subtype C sequences available in the Los Alamos National Laboratory (LANL) HIV database (http://www.hiv.lanl.gov/), included 694 subtype C tat exon 1 sequences of the 1,544 total. Chronic infection HIV-1 subtype C sequence criteria included the following: (i) stage of infection at the time of sampling that was >1,000 days postinfection/seroconversion; (ii) nonrecombinant HIV-1 subtype C; and (iii) full-length tat exon 1. This included 183 HIV-1 subtype C chronic infection tat exon 1 sequences retrieved from the LANL HIV Database (accessed 28 September 2012). The majority of the available sequences were from southern Africa (n = 179); four sequences were from India. We assessed phylogenetic clustering of these sequences in relation to our newly characterized sequences and found that all study sequences were scattered across the phylogenetic tree with reference sequences, and no distinct clades or clusters were detected.
Single-genome sequences generated from the earliest available time point postseroconversion were used to construct a primary infection consensus sequence for each patient. These 20 sequences, in addition to full-length chronic infection (>1,000 days postseroconversion) LANL tat exon 1 subtype C sequences (n = 183), were used in MargFreq (http://sray.med.som.jhmi.edu/SCRoftware/margfreq/) to generate amino acid frequencies at each codon. Differences in the primary infection virus were assessed statistically using the two-tailed Fisher's exact test with an α of 0.05.
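The frequency comparison at a single codon can be illustrated with a minimal sketch of a two-tailed Fisher's exact test. This is a plain-Python illustration, not the software used in the study; the function name and the example counts are assumptions (the counts are approximate values back-calculated from the percentages reported for Lys24, 3 of 20 primary infection sequences and about 81 of 183 chronic infection sequences).

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1 = a + b              # e.g. primary infection sequences
    col1 = a + c              # e.g. sequences carrying the residue of interest
    total = a + b + c + d
    denom = comb(total, row1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(total - col1, row1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # Sum all tables that are no more probable than the observed table.
    # The small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Illustrative counts (assumed): Lys24 present/absent in primary vs chronic infection.
p_lys24 = fisher_exact_two_tailed(3, 17, 81, 102)
print(f"P = {p_lys24:.3f}")
```

A residue would be flagged as differing between the two stages when the returned value falls below the stated α of 0.05.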
High-resolution HLA typing was performed for all study subjects, except subjects OK, ON, and OP, as described previously (61). All ambiguous positions were resolved by resequencing. Polymorphisms outside the targeted exons that could not resolve heterozygote combinations were refined using a statistical method that calculates an HLA haplotype given the probability of association with fully resolved patient HLA class I genotypes at other positions (62). Previously defined CTL/CD8+ epitopes to HIV-1C Tat were obtained from the LANL HIV Molecular Immunology Database. Putative HLA-restricted epitopes within subtype C Tat were identified by scanning the patient consensus sequences for HLA-restricted epitope anchor residue motifs using two independent bioinformatic tools: Motif Scan for HIV HLA anchor residue motifs (http://www.hiv.lanl.gov/content/immunology/motif_scan/motif_scan) and Epipred (63).
A tat exon 1 consensus sequence was generated for each patient from viral quasispecies sequenced during primary infection. tat exon 2, which is not required for transactivation of the HIV-1 LTR sequence, was derived from the HIV-1 subtype C consensus. tat sequences were cloned into the pGen2.1 vector, allowing for Tat expression via the cytomegalovirus (CMV) promoter. Introducing the relevant nucleotide substitutions into tat exon 1 of the subtype C consensus full-length Tat vector generated pGen2.1-ΔTat clones containing a single amino acid mutation. All cloning and site-directed mutagenesis were performed by Genscript (Piscataway, NJ, USA).
TZM-bl cells, obtained from the NIH AIDS Reagents Program, are HeLa cell derivatives that stably express β-galactosidase and firefly luciferase under the control of the HIV-1 LTR (64–68). The cells were cultured in Dulbecco modified Eagle medium (DMEM) supplemented with 10% fetal bovine serum, 100 U of penicillin/ml, 100 μg of streptomycin/ml, 2 mM L-glutamine, and 0.1 mM nonessential amino acids. No selective antibiotics were required for the maintenance of firefly luciferase expression. TZM-bl cells were transfected by electroporation with 1 μg of pGen2.1-Tat per 10⁶ cells using the Neon system (Invitrogen, Carlsbad, CA) with two pulses of 1,200 V for 20 ms. For a negative control, a pGen2.1 vector lacking both tat exons (pGen2.1-empty) was similarly transfected into TZM-bl cells. The cells were aliquoted at 2 × 10⁴ per well in 100 μl of supplemented DMEM without antibiotics in 96-well plates. To normalize for potential differences in transfection efficiency and variation in the number of cells plated, 4 μg of pRL-SV40 (Promega), a Renilla luciferase-expressing construct under the control of the simian virus 40 (SV40) promoter, which is minimally transactivated by Tat, was added to a master batch of TZM-bl cells before they were aliquoted for transfection with a pGen2.1-Tat construct.
After 48 h, the cells were lysed in Dual-Glo firefly luciferase buffer (Promega) and incubated at room temperature for 10 min, and firefly luciferase activity was determined using a PerkinElmer luminometer. Firefly luciferase activity was then simultaneously quenched by the addition of the Dual-Glo Renilla luciferase substrate, which was also incubated at room temperature for 10 min, and activity was similarly determined by the luminometer.
Firefly luciferase values were normalized to Renilla luciferase activity. The subtype C consensus Tat construct was assigned the value 1, and differences in LTR transactivation by patient Tat constructs were calculated relative to the consensus and are given as the median difference in expression ± interquartile range. A normal distribution of the luciferase data was assessed by the D'Agostino and Pearson omnibus normality test (69), and normalized luciferase data were log10 transformed for use in linear regression and parametric analyses. A two-tailed Student's t test was performed on the transformed, normalized luciferase data to compare the Tat activity of consensus HIV-1C with that of each of the Tat constructs from the patients. The Pearson product-moment correlation coefficient was used to determine the strength of association between Tat transactivation and patients' plasma RNA viral loads, CD4 counts, and proviral loads. P values were corrected for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (70). Adequate quasispecies sampling from each patient was determined by estimating the likelihood of missing infrequent viral variants present in patient plasma: with n single genomes sequenced, the probability (P) of missing a variant that comprises a fraction f of the virus population is P = (1 − f)^n, so the smallest fraction detectable at a given P is f = 1 − P^(1/n) (71).
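The sampling-depth calculation can be made concrete with a short sketch. It assumes that each sequenced genome is an independent draw from the plasma virus population, so that a variant at fraction f is missed in all n genomes with probability (1 − f)^n; the function names and the example values (20 genomes, 5% miss probability) are illustrative assumptions rather than figures from this study.

```python
def miss_probability(f: float, n: int) -> float:
    """Probability that a variant at population fraction f is absent from n genomes."""
    return (1.0 - f) ** n


def detection_limit(p_miss: float, n: int) -> float:
    """Variant fraction f that is missed with probability p_miss after n genomes."""
    return 1.0 - p_miss ** (1.0 / n)


# Illustrative values (assumed): 20 single genomes, 5% chance of missing the variant.
limit = detection_limit(0.05, 20)
print(f"detection limit: {limit:.3f}")
```

With these example inputs the limit is roughly 0.14, meaning that variants making up about 14% or more of the population would be sampled at least once with 95% probability.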
The phylogenetic relationships between HIV-1C tat exon 1 viral quasispecies from 20 subjects during primary infection are shown in Fig. 1. The maximum likelihood tree highlights distinct patient-specific clusters (except the subject pair OK/OI, as described previously), and a high level of homogeneity between quasispecies over 500 days postseroconversion.
We employed an evolutionary framework to determine whether any specific sites in tat exon 1 are under positive selection during primary infection. The mixed-effects model of evolution (MEME) detects positively selected sites despite the majority of lineages at a given site evolving under purifying selection, which may mask potential episodic selection acting on a small proportion of lineages. Our analysis found that residues 3, 4, 21, 24, 29, 39, and 68 of tat exon 1 were undergoing positive selection during primary infection (Table 2). We observed positive selection affecting a small subset of branches at sites 3 and 39 and, to some extent, at sites 24 and 29. In contrast, positive selection appears to be pervasive, in a rather large proportion of branches, at sites 4, 21, and 68 and with various levels at sites 24 and 29 (Table 2). Furthermore, our results support recently published data revealing an epistatic interaction between residues in Tat (72). We did not observe individual changes at either Leu35 or Gln39, but we determined that all sequenced patient quasispecies from patient F carried genes encoding both the L35Q and Q39L coevolving mutations. Although MEME identified only residue 39 as positively selected, given the conservative approach of this model, a possible explanation is that two-nucleotide changes were necessary for the Q39L transition (CAG to TTG) in patient F, while only one-nucleotide change was necessary for L35Q (CAG to CAA).
To appreciate the spatial interactions of positively selected residues 3, 4, 21, 24, 29, and 39 in the context of the Tat quaternary structure, we mapped these residues to the only high-resolution (2.1-Å) crystal structure of Tat available (Fig. 2A) (73). We were unable to map residue 68, because reliable crystal structures for that region are not defined. The majority of selected amino acid residues lie on the Tat activation domain (amino acids 1 to 48) that interacts with cellular proteins such as the positive transcription elongation factor P-TEFb; therefore, we might expect changes in these residues to have functional consequences. In particular, residues 3 and 4 reside on the critical acidic/proline-rich region, and residues 21, 24, 29, and 39 reside on the cysteine-rich/Zn2+ finger region of the activation domain, which mediates the coordination of Zn2+ ions. We also observed high diversity within positively selected sites, probably because those sites are under diversifying selection (Fig. 2B). Two of the more diverse codon positions (sites 24 and 29) exhibit a balanced proportion of branches under β+ and β−, which suggests that a substantial amount of both positive and negative selection is occurring on the branches, though over 50% are under negative selection (Table 2 and Fig. 2B).
Signature pattern differences were apparent in Tat from the primary infection cohort compared with chronic HIV-1C infection by assessment of amino acid frequencies at each codon position of tat exon 1. Glu2 was less frequent during primary infection (75%) compared to chronic infection (93%) (P = 0.020), which was complemented by a significantly greater prevalence of Asp2 in primary infection (25%) compared to in chronic infection (7%). Only two of seven sites that were found to be under positive selection during primary infection showed significant differences compared with chronic infection sequences: 15% Lys24 in primary infection versus 44% in chronic infection (P = 0.015) and 30% Ser24 in primary infection versus 3% in chronic infection (P < 0.001). Significant differences were also identified at Ala29, present in 10% of primary infection clinical isolates yet only 1% in those chronically infected with HIV-1C (P = 0.048).
Additionally, we found a statistically significant difference between primary infection and chronic infection tat exon 1 sequences in the charge of the amino acid at position 59. The effect of side chain hydrophobicity/hydrophilicity at this position in the chronic infection sequences is not well understood, but this change was observed in the auxiliary region of tat exon 1 (amino acids 58 to 71). During primary infection, tat exon 1 had significantly more neutral residues at position 59 than in chronic infection, with the hydrophilic Pro59 residue being the most prevalent. Only 6 (3%) sequences from chronic infection had a neutral amino acid at position 59 compared with 15% of our newly characterized tat exon 1 sequences (P = 0.046 by Fisher's exact test).
CTL responses have been mapped to only a small number of HIV-1 subtype C Tat epitopes. To systematically address the role of HLA in influencing intrapatient amino acid diversity during primary infection, we analyzed single-genome sequencing data to assess substitutions in Tat, compared to the subtype C consensus sequence (Fig. 3). In addition to the functional data available from LANL, we employed two predictive analyses, MotifScan and Epipred, to discern regions that might be under immune pressure based on the HLA genotypes of the patients. There were only two instances where the transmitted virus had the wild-type Tat residue and mutated over the first 500 days postseroconversion, both of which were potentially HLA restricted: in patient OC, the switch from His29 to Pro29 or Arg29 by 208 days postseroconversion was restricted by both HLA-A*03:01 and HLA-A*68:02; by 411 days postseroconversion, patient OS had switched to Asn24, completely replacing the wild-type Lys24 residue, which was restricted by HLA-B*35:01. Both of these substitutions occurred in a region that has been identified as immunodominant in subtype C Tat (74). The region comprising the protein transduction domain, 47-YGRKKRRQRR-56, and the nuclear localization signal, 48-GRKKR-52, was highly conserved during primary infection, with only two instances of mutation: R53G in patient OS, and S57R, flanking the conserved region, in patient QR. Both changes, present at the earliest time point and maintained for at least the first 500 days postseroconversion, were predicted to be under HLA restriction by Cw*04:01 and A*31:01, respectively.
A number of patients in the primary infection cohort contained amino acid residues that differed from the HIV-1 subtype C consensus at sites identified by MEME as being under positive selection. On the basis of these data, we introduced the corresponding single-amino-acid changes into tat exon 1 of the subtype C consensus to determine the functional effects of these positively selected sites. The majority of these single polymorphisms had a detrimental impact on LTR activation by Tat (Fig. 4A). The greatest impact on Tat was seen with the Ala21 substitution, identified in 13/20 (65%) patients, which reduced LTR activity by 88% (±1%) (P < 0.001), leaving less than a 2.5-fold increase above the background luciferase levels seen with the pGen2.1-empty vector control. Conversely, the greatest increase in Tat activity was seen with the Gln35/Lys39 double mutant that resulted in an additional 49% (±14%) production of LTR-driven luciferase (P = 0.012). It has previously been reported that Gln35 and Lys39 coevolve in vivo and that mutation of either residue results in a defective virus (72). Our data recapitulate this by showing that Gln35 reduces LTR activation by 87% (±3%) (P < 0.001) and Lys39 by 83% (±1%) (P < 0.001) compared to the wild-type Tat construct.
Compensatory mutations may maintain protein structure and function when deleterious changes occur at other residues, so we chose to determine the LTR transactivation capacity in the context of the full-length tat exon 1 from our cohort of patients during HIV-1C primary infection relative to the activity of the HIV-1 subtype C consensus Tat (Fig. 4B). Tat from patients OG, OP, and OS and Tat expressed by patient OC at 27 days postseroconversion demonstrated no significant change in luciferase expression compared to the HIV-1C consensus Tat. At 208 days postseroconversion, patient OC expressed three Tat quasispecies that differed at a single amino acid, encoding either His29, which was dominant at earlier time points and had no significant effect on LTR activity, or Arg29 or Pro29. The largest decrease in Tat activity across all patients, relative to the subtype C consensus, was seen with the OC patient Tat that contained the Pro29 mutation, resulting in a 79% (±4%) (P < 0.001) decrease in LTR activation. Conversely, the Arg29 substitution significantly upregulated LTR-luciferase expression by 55% (±14%) (P = 0.008). The only other significant increase in Tat transactivation was seen in patient D, resulting in a 73% (±9%) (P < 0.001) upregulation of expression.
The Tat sequence for patient C contained a three-nucleotide insertion that introduced an additional arginine residue into the transactivation domain, increasing LTR activity by 22% (±6%) (P = 0.034) compared to the patient C Tat without the insertion, yet this Tat remained significantly less active than the consensus C Tat. Of the two patients that demonstrated sequence diversity over time, patient OS expressed the wild-type Lys24 residue at 203 days postseroconversion, which demonstrated no difference in Tat transactivation activity compared to the consensus. Yet by 411 days postseroconversion, complete replacement with Asn24 (A-to-T change at the third nucleotide of codon 24) reduced Tat activity by 63% (±2%) (P = 0.021) compared to activity at 203 days postseroconversion. Similarly, patient OC expressed a Tat variant at 27 days postseroconversion that transactivated the LTR at a level similar to that of the HIV-1 consensus C. However, by 208 days postseroconversion, 11 of 35 sequences (31.4%) had Arg29 and 5 sequences (14.3%) had Pro29, with the remaining 19 sequences (54.3%) retaining the wild-type His29 residue. The Arg29 and Pro29 mutations had opposing effects on Tat activity: a switch to proline at position 29 reduced Tat activity by 78% (±4%) (P = 0.027), whereas an arginine residue increased Tat activity by 64% (±15%) (P = 0.164) compared to the His29 Tat variant that predominated in patient OC at 27 days postseroconversion. The nonsignificant P value for the large change in LTR activity by the Arg29 mutation is likely due to the high standard deviation seen in the luciferase assay data obtained with the Tat construct from patient OC at 27 days postseroconversion.
The relationships between the Tat-mediated LTR activity and patient plasma viral loads, CD4 counts, and proviral loads over the first 180 days postseroconversion were examined by the Pearson product-moment correlation coefficient and linear regression. There was a moderate positive correlation with plasma viral load (P = 0.026; r = 0.400) (Fig. 5A) and a weak negative correlation with CD4 count (P = 0.090; r = −0.315) (Fig. 5B), but not with proviral DNA load (P = 0.229; r = 0.255) (Fig. 5C). Three patients (patients ON, OP, and OS) did not have clinical data available before 180 days, so they were excluded from the analysis. Later set point viral loads, up to 500 days postseroconversion, were more weakly correlated with Tat-mediated LTR activity (P = 0.043; r = 0.266) (Fig. 5D) and showed a weaker negative correlation with CD4 count (P = 0.040; r = −0.274) (Fig. 5E). There remained no correlation with proviral load (P = 0.173; r = −0.209) (Fig. 5F). While the independent linear relationships between the Tat-mediated luciferase expression and patients' plasma viral loads and CD4 counts are significant, Tat activation of the LTR may not be a strong predictor of these clinical variables.
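The correlation analysis can be sketched in a few lines of plain Python. This is a minimal illustration of the Pearson product-moment coefficient applied to log10-transformed values; the function name, the choice to log-transform both variables, and all numeric values are assumptions made for illustration and are not data from this study.

```python
from math import log10, sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 values each")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only (not data from this study).
relative_ltr_activity = [0.21, 0.45, 0.80, 1.00, 1.55]     # consensus Tat = 1
viral_load_copies_per_ml = [8.0e3, 2.5e4, 4.0e4, 1.2e5, 3.0e5]

r = pearson_r(
    [log10(v) for v in relative_ltr_activity],
    [log10(v) for v in viral_load_copies_per_ml],
)
print(f"r = {r:.3f}")
```

A coefficient near 0.4, as reported for plasma viral load at 180 days, indicates that LTR activity explains only a modest share (r² of about 0.16) of the variation in viral load, which is consistent with the caution that Tat activity is not a strong predictor.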
To our knowledge, this is the first in-depth functional characterization of tat exon 1 during primary HIV-1 subtype C infection. We employed the recently developed MEME method for detecting selection acting on HIV-1 subtype C Tat sequences during primary infection. The MEME method has advantages over previous methods, as it is able to detect sites under diversifying selection on most lineages of the underlying phylogeny (pervasive selection), as well as those sites under selection only on a small proportion of lineages (episodic selection). Traditional selection detection methods usually mask sites that are subjected to selection in a small proportion of lineages, which hampers the power to observe early dynamic processes. Seven sites were found to be under positive selection with various levels of pervasiveness on the underlying phylogeny. For instance, sites 3 and 39 are involved in Tat regions that are essential for its catalytic and regulatory activities, respectively. The finding that these sites are being selected on a small proportion of lineages might indicate a change in Tat performance and thus have functional consequences. Similarly, we identified sites (sites 24 and 29) under positive selection in about 50% of lineages, and the proportions of Lys24, Ser24, and Ala29 differed in primary and chronic infection. This finding suggests different selective pressure in primary and chronic HIV-1C infection and is supported by functional analysis showing that these amino acid substitutions have significant functional consequences for Tat transactivation.
We found that site 39 is under episodic positive selection on a small proportion of branches (Prβ+ = 0.020) and is also engaged in bidirectional epistatic interactions with site 35 (posterior probability = 1), a phenomenon that has already been reported in vivo (72). This reinforces the results of our selection analysis, since a mutation at site 39 results in reduced LTR activation in our subsequent functional assays; naturally, most of the lineages show site 39 to be under negative or purifying selection, suggesting that this position is constrained by its epistatic interaction with site 35. Given that the majority of HIV-1 Tat data available are from population sequencing, to our knowledge, this is the first report demonstrating that the Glu35 and Lys39 substitutions are epistatically linked in the same RNA molecule.
On the basis of the MEME data, we performed functional analysis of individual Tat amino acid substitutions at sites identified as being under positive selection during primary infection. The majority of these changes had a significant negative impact on the potential of Tat to transactivate the LTR. Only two single alterations, P3I and H29A, significantly increased LTR transcription compared to the wild-type HIV-1 subtype C Tat, and these substitutions were present in the same patient, patient QR. However, expression of this patient's full-length tat exon 1 resulted in a significant 16% reduction in LTR activity, highlighting the complex interplay between mutations in Tat. This is exemplified by the epistatic substitutions at Lys35 and Glu39 that individually abrogate important interactions between Tat and P-TEFb (72) but, when coevolving, restore the transactivation activity to beyond wild-type levels. A number of Tat residues in exon 1 have been identified as crucial for their ability to recruit CBP/p300 and P-TEFb and to transactivate the LTR. Numerous sites have been reported to undergo posttranslational modification by cellular proteins. (i) The arginine-rich motif, involved in RNA binding, is modified by methylation and acetylation at a string of residues. (ii) Lys50 is acetylated by p300 after coupling with Tat (75). (iii) Methylation occurs at Lys51 by the lysine methyltransferase KMT7 (76, 77). (iv) Methylation occurs at Arg52 and Arg53 by the protein arginine N-methyltransferase PRMT6 (78). The arginine-rich motif lies within the transduction domain, which was completely conserved in all patients except for the R53G substitution observed in patient OS, which was potentially restricted by Cw*04:01 and maintained longitudinally. Tat from this patient containing the Arg53 substitution had no discernible effect on LTR transcription in functional assays.
Autoacetylation by Tat occurs specifically on Lys41 and Lys71 (79), two residues that are completely conserved in our primary HIV-1C infection data. The absence of Lys41 reduces virus replication by preventing the enhancement of histone acetyltransferase activity by p300 (79).
Phosphorylation occurs at Ser16 and Ser46 by cyclin-dependent kinase 2 (CDK2) (80) and at Ser62, Thr64, and Ser68 by protein kinase R (PKR) in subtype B (81), yet variability in the prevalence of these serine residues in primary HIV-1C infection suggests that subtype differences may exist. Amino acid substitutions at residues 16 and 46 are reported to significantly reduce virus production in vitro, and these serine residues were conserved in our cohort except for a S46Y substitution observed in patient E, a switch that was previously identified to be restricted by the same HLA found in patient E, B*15:03 (80). Of those Tat residues phosphorylated by PKR, only Ser62 is conserved in our primary infection cohort; Thr64 is present as Lys64 or Asp64, and Ser68 is present as Leu68 or Pro68. In subtype B, the importance of retaining these phosphorylation sites has been demonstrated: a 42% reduction in LTR activity was reported with either a T64A or an S68A substitution, with a Tat double mutant reducing transactivation by 59% (81). However, these deleterious mutations were in the context of an HIV-1B Tat; despite the high prevalence of substitutions at Thr64 and Ser68 in our primary infection cohort, subtype C Tat has been shown to be a more potent transactivator than its subtype B counterpart. It may be that compensatory mutations exist in HIV-1C, as our functional data demonstrated that substituting Ser68 with Pro68 significantly increased LTR transcription by >250%.
The established coevolution of sites Glu35 and Lys39 was observed in one of our patients, and functional analysis in the context of a subtype C Tat background reinforced the negative impact of mutating either residue individually, with rescue of the wild-type phenotype achieved by the introduction of the corresponding compensatory substitution. Glu35 and Lys39 independently contribute to distinct aspects of Tat-mediated transcription: binding to P-TEFb and promoting phosphorylation of RNAPII by P-TEFb, respectively (72). The single L39Q mutant will retain the ability to interact with P-TEFb yet will fail to phosphorylate the CTD of RNAPII, whereas the Q35L substitution abrogates any binding with P-TEFb (72). The importance of invariant cysteine residues in Tat, which are highly conserved between subtypes, has been extensively studied (82). Of the seven cysteines, only Cys31 has been shown to be dispensable for Tat transactivation, while the substitution of any other cysteine residue between sites 22 and 37 abolishes Tat function. DNA binding and transcriptional activity of the NF-κB p65 subunit are enhanced through direct binding to Tat via this cysteine-rich region. This is complemented by a corresponding association between the arginine-rich motif of Tat and the NF-κB inhibitor IκBα, which sequesters IκBα and allows the transcription factor to relocalize to the nucleus. All of the crucial cysteines are entirely conserved during primary HIV-1C infection; the variable Cys31 was present only in patient D, and all other patients expressed the characteristic subtype C Ser31 residue. The adjacent, invariable Cys30 forms a CC chemokine motif in other HIV-1 subtypes that retain the Cys31 (83), which is reported to be important for monocyte chemotaxis and subsequent interleukin 10 (IL-10) induction (84).
Abrogated binding of the CS motif in subtype C to chemokine (CC motif) receptor 2b (CCR2b) prevents downstream signaling that culminates in cytokine induction and the upregulation of chemokine (CXC motif) receptor 4 (CXCR4) on CD4+ T cells (83). Although patient D harbored the CC motif and displayed significantly higher levels of LTR induction in comparison to the wild-type subtype C consensus Tat, it has previously been shown that substituting the Ser31 residue for Cys31 does not alter transactivation potential in subtype C (83). The retention of serine at this site may have as-yet-unidentified functional importance and could potentially act as an alternative phosphorylation site.
Other subtype differences exist in sites that have a less established role in mediating LTR transactivation. Selection for Arg23 in HIV-1 subtype B Tat confers a replicative advantage, yet Arg23 was the predominant residue in primary subtype C infection and showed no significant difference in prevalence with chronic infection (P = 0.778 by Fisher's exact test). We observed low frequencies of Ala21, Asn24, Lys29, Lys40, and Gln60, suggesting our data are in agreement with a recent report identifying unique signature residues for HIV-1C in southern India compared to southern Africa. However, we also identified amino acid frequency variations at sites 24 and 29 during primary infection compared to chronic HIV-1C, with a significant increase in Ser24 and Ala29 in our cohort. Removal of the four tat exon 1 sequences from chronic infection viruses originating in India did not alter the significance of our results. Heterogeneity within HIV-1C tat exon 1, coupled with the variation seen at different stages of infection, suggests deviating evolutionary paths for two geographically distinct clusters of subtype C, reinforcing the need for comparative characterization of these subepidemics.
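The residue prevalence comparison above relies on Fisher's exact test. The following is a minimal sketch of a two-sided test on a 2x2 table, summing the hypergeometric probabilities of all tables that are no more likely than the observed one. The function name and the counts are hypothetical and are not taken from this study, and the text does not state which software or two-sided convention the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows could be infection stage (primary, chronic) and columns the
    presence or absence of a given residue; margins are held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts for illustration only (NOT data from this study):
# residue present / absent among primary and chronic infection sequences.
p = fisher_exact_two_sided(15, 5, 30, 12)
print(f"two-sided P = {p:.3f}")
```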
We recently found that tat exon 1 intrapatient diversity was associated with plasma viral loads at baseline (r² = 0.505; P = 0.044) and at later stages of infection, between 181 and 500 days postseroconversion (r² = 0.516; P = 0.039) (50). With the need for Tat activity to be determined in the context of potentially unknown compensatory substitutions, we chose to evaluate the association between patient tat exon 1 and plasma viral load, CD4 count, and proviral load during the first 500 days postseroconversion. Although the correlation between Tat-mediated LTR activity and viral load was not strong, our results show that Tat transactivation is a significant contributor to plasma viral load and CD4 count in primary infection, but not to proviral load. However, a limitation of our approach was that we could not match a patient's HIV-1 LTR sequence with the associated tat exon 1 and instead relied on an integrated HIV-1B LTR-driven luciferase reporter in TZM-bl cells. The original study (50) targeted diversity and diversification of accessory genes and was designed to produce an amplicon that extended from the start of the vif open reading frame to the end of vpu, which encompassed tat exon 1 but did not span tat exon 2. The spatial separation of tat exon 2 also precluded its inclusion in this analysis. Although we might have identified a stronger correlation between Tat transactivation of its autologous LTR and viral and proviral loads and CD4 count during primary infection, our use of a standardized LTR allowed us to evaluate the impact of subtype C Tat mutations without the additional variability in promoter transactivation arising from potential LTR substitutions.
An additional limitation of this study was the difference in the median number of sequences generated for those with acute HIV-1C infection compared to those in the later stages of primary infection (47 versus 23 sequences, respectively; P < 0.01). However, we estimated with 90% likelihood that a Tat quasispecies would have been missed only if it comprised <10% of the virus population when sequencing 23 single genomes or <5% of the population when sequencing 47 genomes. Conversely, we are confident that we were able to isolate quasispecies from an average of >90% of the circulating virus in the 20 patients with primary HIV-1C infection. It should be noted that a constraint exists due to the overlapping nature of nonstructural genes, making it difficult to assign specific mutational events to specific proteins. The overlap between the first exon of tat and vpu or rev was assessed for evidence of positive selection: although there were instances of an amino acid change observed in the vpu or rev reading frames, no evidence of positive selection was found in these genes using the same MEME analysis as described for tat. Furthermore, the second exon of tat, which inhibits splicing (85–87), has been shown to be involved in the cellular uptake of exogenous Tat protein (88) and may contribute to viral infectivity and to binding to cell surface integrins and may have a critical role in activating NF-κB (89–92).
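The sampling-depth estimate in this paragraph can be checked with a simple calculation. The sketch below assumes that each single genome is an independent draw from the virus population, so that a variant at frequency f is seen at least once among n sequences with probability 1 − (1 − f)^n. This binomial model and the function name are assumptions made for illustration; the text does not state how the estimate was derived.

```python
def detection_probability(frequency, n_sequences):
    """Probability of sampling a variant at least once among n independent sequences.

    Assumes independent draws from the population (simple binomial model).
    """
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if n_sequences < 0:
        raise ValueError("n_sequences must be non-negative")
    return 1.0 - (1.0 - frequency) ** n_sequences


# Sequencing depths and variant frequencies quoted in the text.
for freq, n in [(0.10, 23), (0.05, 47)]:
    prob = detection_probability(freq, n)
    print(f"variant at {freq:.0%} of the population, {n} genomes: {prob:.3f}")
```

Under this assumption both cases give a detection probability of about 0.91, which is consistent with the 90% figure quoted above.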
Other than associations with previously reported immunodominant regions in HIV-1 subtype C Tat (74, 93–98), the HLA data presented here were predicted by searching patients' Tat sequences for all known HLA anchor residue motifs and associating the findings with patient HLA genotypes. With five unique epitopes identified for subtype C tat exon 1 and associated with a particular HLA genotype, there is a clear disparity with the extent of data available for subtype B, which has to be addressed.
Subtype C Tat has been established as a more potent transactivator than its subtype B counterpart, yet it is functionally deficient in mediating chemotaxis and cytokine dysregulation in monocytes, a key aspect of HIV pathogenesis maintained by other subtypes (28, 83). Our results suggest that genetic and functional differences are also apparent between primary and chronic infection, with specific residues under evolutionary pressure. The significant, albeit modest, relationship between Tat transactivation and plasma viral load during the first 500 days postseroconversion emphasizes its role as a mediator of virus replication in early infection. Increased understanding of the evolution and dynamics of HIV-1C Tat is crucial for Tat-based vaccine advancement. The functional effects highlight a complex interplay between individual viral mutations in early HIV-1C infection.
We thank the participants of the Tshedimoso study in Botswana, the Botswana Ministry of Health, Gaborone City Council clinics, and the Gaborone voluntary counseling and testing (VCT) Tebelopele for collaboration. We are grateful to Gaseboloke Mothowaeng, Florence Modise, S'khatele Molefhabangwe, Sarah Masole, and the late Melissa Ketunuti for their dedication and outstanding work in the clinic and outreach. We thank HyPhy developers for providing guidance on the analysis through the online forum. We are grateful to Art Poon, Lauren Margolin, Lauren Buck, and Jeannie Baca for excellent technical assistance. We thank Tun-Hou Lee for review of the manuscript and Lendsey Melton for editorial assistance. The following cell reagent was obtained through the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH: TZM-bl cells from John C. Kappes, Xiaoyun Wu, and Tranzyme Inc.
The primary HIV-1 subtype C infection study in Botswana, the Tshedimoso study, was supported by NIH grant R01 AI057027. This work was supported in part by AAMC FIC/Ellison Overseas Fellowships in Global Health and Clinical Research (R.R.) and by NIH grant D43 TW000004 (R.R.). This work was also supported in part by University of Botswana ORD, R812 (T.K.S.). E.C.N. is supported by Comisión Nacional de Investigación Científica y Tecnológica (CONICYT), Gobierno de Chile-Becas Chile, and by BYU Graduate Mentoring (2011) and Research (2012) Awards.
Published ahead of print 13 March 2013