Human immunodeficiency virus type 1 (HIV-1) establishes a latent reservoir in resting memory CD4+ T cells. This latent reservoir is a major barrier to the eradication of HIV-1 in infected individuals and is not affected by highly active antiretroviral therapy (HAART). Reactivation of latent HIV-1 is a possible strategy for elimination of this reservoir. The mechanisms by which latency is maintained are unclear. In the analysis of the regulation of HIV-1 gene expression, it is important to consider the nature of HIV-1 integration sites. In this study, we analyzed the integration and transcription of latent HIV-1 in a primary CD4+ T cell model of latency. The majority of integration sites in latently infected cells were in introns of transcription units. Serial analysis of gene expression (SAGE) demonstrated that more than 90% of those host genes harboring a latent integrated provirus were transcriptionally active, mostly at high levels. For latently infected cells, we observed a modest preference for integration in the same transcriptional orientation as the host gene (63.8% versus 36.2%). In contrast, this orientation preference was not observed in acutely infected or persistently infected cells. These results suggest that transcriptional interference may be one of the important factors in the establishment and maintenance of HIV-1 latency. Our findings suggest that disrupting the negative control of HIV-1 transcription by upstream host promoters could facilitate the reactivation of latent HIV-1 in some resting CD4+ T cells.
Human immunodeficiency virus type 1 (HIV-1) establishes a stable latent reservoir in resting memory CD4+ T cells that persists in patients on highly active antiretroviral therapy (HAART) and that is able to produce replication-competent virus following cellular activation (7–9, 14, 61). The latent reservoir is a major barrier to virus eradication (15, 49, 50). In this stable reservoir, the provirus is transcriptionally silent (10, 22). This latent reservoir may contribute to the rapid rebound of viremia after cessation of HAART (26, 63). Deliberately inducing the reactivation of latent HIV-1 may be required to effectively target the reservoir and achieve eradication. Understanding the mechanisms that maintain HIV-1 latency is critical to this therapeutic strategy.
Mechanisms that maintain HIV-1 latency in vivo are incompletely understood. It is widely accepted that the lack of active forms of key cellular transcription factors (3, 13, 16, 31, 42, 58) and of the HIV-1 Tat protein and its cellular cofactors (12, 23, 25, 29, 47, 51) limits the initiation and elongation, respectively, of viral transcription in resting CD4+ T cells (32, 60). In addition, DNA methylation and repressive histone modifications, especially the formation of a single nucleosome located at the viral promoter, have been postulated to promote transcriptional silencing of integrated proviruses (2, 11, 21, 30, 52, 55, 59). Posttranscriptional mechanisms may also play a role (24, 33, 43).
In the analysis of the regulation of HIV-1 gene expression, it is important to consider the nature of HIV-1 integration sites. Pioneering studies by Bushman and colleagues demonstrated that HIV-1 integrates preferentially into active cellular genes in in vitro infections of transformed cell lines (35, 46). However, in an elegant cell line model of HIV-1 latency, integration sites were found in chromosomal regions disfavoring transcription, such as centromeric regions (27). This led to the notion that latency was determined primarily by the site of integration. However, the first analysis of HIV-1 integration sites in infected individuals found that HIV-1 proviruses integrated into transcriptionally active cellular genes in resting CD4+ T cells from patients on HAART (18). The same pattern has been observed in peripheral blood mononuclear cells (PBMC) from untreated individuals (36). In these studies, integrated proviruses were detected by specific PCR strategies, but the replication competence of the proviruses was not assessed. Given that only a small fraction of the integrated proviruses in resting CD4+ T cells from infected individuals appear to be capable of producing infectious virus following cellular activation (8), the possibility remained that integration sites had a different character in the subset of cells harboring replication-competent viral genomes. It has also been unclear whether the observed pattern reflects the likelihood of initial integration into particular regions or subsequent selection of cells carrying particular types of integration events. Nevertheless, the finding that HIV-1 genomes reside within cellular genes raised the possibility that an additional mechanism, transcriptional interference, could also play a role in latency.
Transcriptional interference is defined as direct suppression of one transcription unit by another in cis (48). Two adjacent promoters have been shown to transcriptionally interfere with each other through perturbation of the association of the transcription initiation complex, dislodgement of a sitting promoter-bound complex, or collision between transcription elongation complexes moving in opposite directions (1, 6, 38, 39, 44). Transcriptional interference has been demonstrated in an experimental system with two tandem HIV-1 long terminal repeats (LTRs) in HeLa cells (17). In addition, transcription of the 5′ LTR has been shown to interfere with transcription initiating at the 3′ LTR in Jurkat cells (34). In an elegant study in a cell line model, Lenasi et al. demonstrated that host gene transcription suppressed expression of a provirus integrated within the gene (34). Both positive and negative effects of host gene transcription on expression of integrated proviruses have been noted in studies on individual cell clones (20, 34). In a cell line model of HIV-1 latency, lower levels of HIV-1 gene expression have been observed when integration into highly expressed host genes occurs (35). Host gene transcription through the LTR upstream of the HIV-1 transcription start site has been detected both in resting CD4+ T cells from patients on HAART (18) and in primary CD4+ T cells infected in vitro (34). However, whether host gene transcription interferes with expression from an integrated provirus contained within that gene has not yet been demonstrated in any primary-cell system or in vivo study.
In an effort to more accurately model the in vivo regulation of HIV-1 gene expression, several groups have recently developed primary CD4+ T cell models of HIV-1 latency (4, 5, 37, 45, 62). The model developed by Yang et al. is of particular interest because it utilizes a prolonged culture period in cytokine-free media to allow infected T lymphoblasts to return to a fully quiescent state similar to that of resting memory CD4+ T cells in vivo. In the present study, we have used this primary-cell model to study the viral integration sites in latently infected resting CD4+ T cells that are capable of upregulating HIV-1 gene expression following cellular activation. Serial analysis of gene expression (SAGE) was carried out to determine transcription levels of host genes harboring integrated proviruses. By comparing viral integration sites and the levels of expression of the host gene, we have identified new features of the integration sites of latent HIV-1, which may provide a better understanding of the mechanism of HIV-1 latency.
Peripheral blood for the isolation of primary CD4+ T cells was obtained from healthy adult volunteers. This study was approved by the Johns Hopkins Institutional Review Board. Written informed consent was provided by all study participants. Latently infected, Bcl-2-transduced primary CD4+ cells were generated as previously described (62). Briefly, CD4+ T cells were isolated from healthy donors and then costimulated by plate-bound anti-CD3 and soluble anti-CD28 antibodies, 100 U/ml interleukin-2 (IL-2), and T cell growth factor-enriched medium. Three days later, the cells were transduced with a Bcl-2-encoding lentiviral vector. The cells were allowed to return to a resting state, and then viable cells were isolated using Ficoll-Hypaque density gradient centrifugation.
Bcl-2-transduced cells were again costimulated as described above for 3 days and then infected with the HIV-1 reporter virus NL4-3-Δ6-drEGFP at a multiplicity of infection (MOI) of <0.1. NL4-3-Δ6-drEGFP has inactivating mutations in the gag, vif, vpr, vpu, and nef genes and a destabilized form of enhanced green fluorescent protein (EGFP) (Clontech) inserted into the env open reading frame (ORF). The virus was generated by cotransfecting 293T cells with NL4-3-Δ6-drEGFP, an X4 env expression vector (pCXCR4), and the helper plasmid pC-Help as described previously (40).
Genomic DNA from the three cell populations representing acute, persistent, and latent infection was isolated using a QIAamp DNA microkit (Qiagen). HIV-1 integration sites were determined as previously described (18). Briefly, genomic DNA was digested with PstI, and the products were diluted and ligated overnight to generate circularized DNA. Inverse PCR was carried out to amplify the junction between the HIV-1 5′ LTR and upstream cellular DNA. Bands representing unique integration events were eluted and sequenced. Human genomic sequences were mapped using the UCSC Bioinformatics Human Genome database. An integration site was considered valid if a unique human genomic sequence was joined directly to the end of the 5′ LTR of HIV-1.
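Once a junction sequence has been mapped, assigning it to a transcription unit and to an orientation class is a simple interval lookup. The following is a minimal sketch of that bookkeeping, not the authors' pipeline. The gene table layout, the coordinate convention (0-based, half-open), and the function name `annotate_site` are assumptions made for illustration.

```python
def annotate_site(chrom, pos, provirus_strand, genes):
    """Return (gene_name, orientation) for every gene containing the site.

    genes is a list of (chrom, start, end, strand, name) tuples using
    0-based, half-open coordinates (an assumed layout, not a UCSC format).
    An empty list means the site is intergenic.
    """
    hits = []
    for g_chrom, start, end, g_strand, name in genes:
        if g_chrom == chrom and start <= pos < end:
            # "same": provirus and host gene are transcribed in the same direction.
            # "convergent": they are transcribed in opposite directions.
            orientation = "same" if provirus_strand == g_strand else "convergent"
            hits.append((name, orientation))
    return hits


# Hypothetical gene table, for illustration only.
example_genes = [
    ("chr1", 100, 200, "+", "GENE_A"),
    ("chr1", 500, 900, "-", "GENE_B"),
]
print(annotate_site("chr1", 150, "+", example_genes))
```

A site that falls in overlapping genes returns more than one hit; how such cases were resolved in the study is not stated in the text, so the sketch simply reports all of them.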
Uninfected, Bcl-2-transduced resting CD4+ T cells were isolated by negative selection using mouse monoclonal antibodies against CD25, CD69, and HLA-DR to deplete activated cells. Activated CD4+ T cells were collected 3 days after costimulation of uninfected Bcl-2-transduced CD4+ T cells with plate-bound anti-CD3 and soluble anti-CD28 antibodies. Total RNA was isolated from both activated and resting Bcl-2-transduced primary CD4+ T cells using an RNeasy minikit (Qiagen).
Seventeen-nucleotide (nt) tags from cellular transcripts were generated using a digital gene expression tag profiling preparation kit (Illumina, San Diego, CA). Briefly, mRNA was purified from 1 μg of total RNA using Oligo-dT magnetic beads (Invitrogen). After first- and second-strand cDNA synthesis, the double-stranded cDNA was digested with NlaIII and ligated to an adapter containing an MmeI restriction site. After MmeI digestion, the 17-nt tag of cellular transcript joined with adapter was released from Oligo-dT magnetic beads and ligated to a second adapter. The ligated product was then amplified by 18 cycles of PCR. The 85-bp DNA fragment was purified from a polyacrylamide gel and sequenced on a genome analyzer system (Illumina, San Diego, CA).
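The tag derivation described above can be mimicked in silico: because the cDNA is held on the beads by its poly(A) end, the fragment that remains after NlaIII digestion begins at the 3'-most CATG site, and the tag is the 17 nt immediately downstream of it. The sketch below assumes this simplified geometry and ignores the exact MmeI cut offset and adapter sequences. The function name and the example sequence are hypothetical.

```python
def extract_sage_tag(cdna, tag_length=17, anchor="CATG"):
    """Return the tag that follows the 3'-most anchor site, or None.

    cdna is the sense-strand sequence written 5' to 3'. None is returned
    when there is no NlaIII site or when fewer than tag_length bases
    follow the last site.
    """
    seq = cdna.upper()
    pos = seq.rfind(anchor)  # 3'-most NlaIII recognition site
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


# Hypothetical transcript with two NlaIII sites; only the last one yields the tag.
example = "AAACATGTTTTCATG" + "ACGTACGTACGTACGTA" + "GGG"
print(extract_sage_tag(example))
```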
All 17-nt tags were annotated using the “long best gene” database provided by http://cgap.nci.nih.gov/SAGE/. Potentially erroneous sequences were removed from the tag library if one of the following conditions applied: (i) genes were differentially expressed between two replicates, (ii) genes appeared only once, or (iii) genes were 1 bp different from more highly expressed genes unless mapped to the genome or transcriptome database (41). After data filtration, frequencies of identical tags in two replicates were combined. The combined frequency of unique tags was converted into relative abundance as follows: relative abundance = (tag frequency × 10^6)/total tag number.
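The normalization amounts to expressing each tag count per million total tags. A minimal sketch follows, assuming tag counts are held in plain dictionaries. The helper names are hypothetical, and only the replicate-combining and normalization steps are shown, not the three filtering criteria.

```python
from collections import Counter


def combine_replicates(rep1, rep2):
    """Sum the counts of identical tags across two replicate libraries."""
    return dict(Counter(rep1) + Counter(rep2))


def relative_abundance(tag_counts, scale=1_000_000):
    """Convert raw tag counts to tags per million total tags."""
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("tag library is empty")
    return {tag: count * scale / total for tag, count in tag_counts.items()}


# Toy counts, for illustration only (real tags are 17 nt long).
combined = combine_replicates({"TAG_A": 1, "TAG_B": 1}, {"TAG_B": 2})
print(relative_abundance(combined))
```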
To study the role of host gene transcription in HIV-1 latency, primary CD4+ T cells from healthy donors were transduced with a Bcl-2-encoding lentiviral vector to allow their long-term in vitro survival and then infected with a previously described, X4-pseudotyped HIV-1 reporter virus encoding GFP (NL4-3-Δ6-drEGFP). The strategy for generating acutely infected, persistently infected, and latently infected populations of primary CD4+ T cells is shown in Fig. 1A. Acutely infected cells were collected 3 days postinfection regardless of GFP expression. For the persistently infected and latently infected populations, cells were further cultured in medium without exogenous cytokines for 3 to 4 weeks. Cell sorting was carried out to separate GFP-positive and GFP-negative populations. The GFP-positive population was considered persistently infected. The GFP-negative population was considered to contain latently infected cells. Importantly, there was no spontaneous GFP reexpression when GFP-negative cells were maintained in medium without exogenous cytokines (data not shown). The GFP-negative cells were cultured for another 3 days and then restimulated with plate-bound anti-CD3 and soluble anti-CD28 antibodies. Sorting was then carried out to isolate those cells that were induced to express GFP after stimulation. These inducible cells were considered to represent latently infected CD4+ T cells. Three independent NL4-3-Δ6-drEGFP infections of Bcl-2-transduced primary CD4+ T cells from the same donor were carried out. Flow cytometric analysis of GFP expression at different stages of a representative infection is shown in Fig. 1B.
It is unclear when viruses enter into a latent state in our model. To address this question, GFP− cells were isolated at days 7, 14, and 21 after infection and then stimulated with anti-CD3 and anti-CD28 antibodies. As shown in Fig. 1C, latent viruses were detected as early as day 7. The frequency of latently infected cells did not increase during the following 2 weeks. These results indicate that latency is established rapidly in this system. Infected cells that do not enter a latent state during the first week continue to express viral genes, possibly due to an effective Tat-dependent positive-feedback loop (57).
Genomic DNA was isolated from the acutely infected, persistently infected, and latently infected populations, and integration sites were determined by inverse PCR. Sites were mapped using the UCSC Bioinformatics Human Genome database. As shown in Fig. 2A, HIV-1 was found preferentially integrated into transcription units in acutely infected cells, consistent with previous results (46). In acutely infected cells, 91% of the integration events were within transcriptional units. In persistently and latently infected cell populations, 87% and 94% of integrated proviruses, respectively, were found in transcriptional units. Integration into intergenic regions was uncommon (6%) in latently infected cells (Fig. 2A). Taken together, these results suggest that the integration patterns observed in resting CD4+ T cells from patients (18) and in latently infected cells in this primary-cell model are very similar. Previous in vivo observations did not determine whether resting CD4+ T cells harboring integrated viral genomes could actually upregulate virus gene expression following cellular activation. Our current findings thus extend previous work by showing that in latently infected primary CD4+ T cells that can upregulate HIV-1 gene expression following cellular activation, the viral genomes are integrated into genes that are actively transcribed in resting CD4+ T cells (Fig. 3B). Thus, this study provides further evidence against the notion that latency results from initial integration into chromosomal regions that are intrinsically repressive for transcription. In addition, because viral genomes preferentially reside in active cellular genes in both acutely and latently infected cells, it is possible that the bias toward active genes rather than intergenic regions is established during the initial integration in activated cells and is not altered by the subsequent events that lead to the establishment of latency in some infected cells.
We also examined the transcriptional orientation of integrated proviruses with respect to the host gene. No integration preference was detected in a previous study of integration events in resting CD4+ T cells from patients on HAART (18). However, in that study, it was not possible to ascertain whether the integrated proviruses were inducible. In the system described here, we examined integration sites in latently infected cells that can upregulate HIV-1 gene expression following stimulation. In acutely infected and persistently infected cells, differences between the numbers of integration sites in the same versus convergent orientation with respect to the host gene were not statistically significant. However, HIV-1 was found preferentially in the same transcriptional orientation as the host gene in latently infected cells (Fig. 2B). This bias for the same versus convergent orientation (63.8% versus 36.2%) was statistically significant (P < 0.0001). These results represent aggregated data from three independent preparations of latently infected cells. The same statistically significant orientation preference was observed when each of these independent preparations of latently infected cells was analyzed separately (Fig. 2C).
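The text does not state which statistical test produced the reported P value. One common way to ask whether an observed split between same and convergent orientations departs from 50:50 is an exact two-sided binomial test, sketched below under that assumption. The counts in the example are placeholders, not data from this study, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial P value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Suitable for n up to roughly a thousand; larger n
    would need log-space arithmetic.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # The small tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Placeholder counts: 64 same-orientation sites out of 100 (illustrative only).
print(binomial_two_sided_p(64, 100))
```

With a null proportion of 0.5, the test treats same and convergent integration as equally likely a priori, which matches the absence of an orientation bias reported for acutely and persistently infected cells.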
We compared the integration sites in persistently and latently infected cells and found 29 host genes which had at least one integration site detected in both populations. In each case, the integration sites were at different positions within the same gene. The detection of distinct integration events in the same genes in persistently infected and latently infected cell populations allowed us to confirm the orientation bias described above with an internally controlled data set. As shown in Fig. 2D, within the same set of targeted genes, latent proviruses were found integrated preferentially in the same orientation while persistently expressed proviruses preferred the convergent orientation (P = 0.008).
Taken together, these results demonstrate that there is a subtle but significant orientation preference in latently infected cells, indicating that transcriptional orientation in the same direction as the host gene is a favored condition contributing to proviral latency. These results provide direct evidence that host gene transcription affects latency in primary resting CD4+ T cells.
Previous studies in cell line systems have described a relationship between HIV-1 latency and the level of expression of the host gene as measured by microarray analysis (35). While microarray analysis allows comparison between the levels of expression of a given gene under different conditions, it provides only a crude measure of the relative expression levels of different genes. To further study the influence of host gene transcription on HIV-1 latency, serial analysis of gene expression (SAGE) was carried out to assess the level of expression of host genes in which integration sites were found in primary CD4+ T cells. In comparison to microarray analysis, SAGE has the advantage of providing a quantitative measure of differences in the levels of expression of different genes in the same cell (54, 56). Replicate samples of uninfected, Bcl-2-transduced resting and activated CD4+ T cells were prepared from the same healthy donor used for integration site analysis. RNA was isolated and processed for SAGE. After data filtration to remove erroneous sequences and matching with the “long best gene” database, a total of 20,643 and 22,279 unique tags were acquired from activated and resting cells, respectively. The frequency of a unique tag in one million tags, termed relative abundance, was used to represent the transcription level of the corresponding gene. For most transcriptionally active genes, the frequencies of SAGE tags detected were similar in resting and activated cells. These genes fall on the diagonal on a plot of log transcription levels in resting versus activated T cells (Fig. 3A). Since the total amount of RNA in activated Bcl-2-transduced CD4+ T cells was 5- to 10-fold greater than that in resting Bcl-2-transduced CD4+ T cells, these results indicate that gene transcription is globally and proportionally downregulated when cells transition from an activated to a resting state.
Genes known to be upregulated in resting CD4+ T cells (e.g., IL-7 receptor gene) or activated CD4+ T cells (e.g., thymidylate synthetase and chemokine ligand 2 genes) gave the expected patterns, while housekeeping genes (e.g., ribosomal protein and β-actin genes) showed no differential up- or downregulation.
The transcription levels of host genes harboring integrated proviruses in acutely, persistently, or latently infected cells, as well as in 192 randomly selected genes, were determined from the SAGE data. Integration sites in intergenic regions were excluded from the analysis. Integration sites in genes for which no SAGE tags were found were considered to be in transcriptionally inactive genes. In persistently and latently infected resting CD4+ T cells, fewer than 8% of genes containing integration sites were transcriptionally inactive whereas 54% of randomly selected genes were transcriptionally inactive (Fig. 3B). In addition, those genes into which HIV-1 had integrated had levels of transcription that were 10- to 100-fold higher than levels of transcription in randomly selected genes in resting CD4+ T cells (Fig. 3C). Only 6% of genes have transcription levels higher than 2 log in this analysis, and therefore integration into these rare, highly expressed genes is less common. Taken together, these results demonstrate that latent viral genomes are frequently found in genes which have moderate to high levels of expression in resting CD4+ T cells.
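The two summary quantities used here, the fraction of genes with no SAGE tags and the log-scale transcription level of the remaining genes, can be computed directly from a table of relative abundances. The sketch below assumes a dictionary keyed by gene name. The function names and example values are hypothetical and do not reproduce the study's data.

```python
from math import log10


def fraction_inactive(genes, abundance):
    """Fraction of genes with no detected SAGE tags.

    abundance maps gene name to tags per million; a gene that is absent
    from the mapping is treated as transcriptionally inactive.
    """
    if not genes:
        raise ValueError("gene list is empty")
    inactive = sum(1 for g in genes if abundance.get(g, 0) == 0)
    return inactive / len(genes)


def log_levels(genes, abundance):
    """log10 relative abundance for the genes that have detected tags."""
    return [log10(abundance[g]) for g in genes if abundance.get(g, 0) > 0]


# Toy values, for illustration only.
toy_abundance = {"GENE_A": 10.0, "GENE_B": 100.0}
toy_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
print(fraction_inactive(toy_genes, toy_abundance))
print(log_levels(toy_genes, toy_abundance))
```

Running the same two functions on the genes carrying integration sites and on a randomly selected gene set gives the kind of comparison summarized in Fig. 3B and 3C.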
HIV-1 replicates preferentially in activated CD4+ T cells, and viral latency can be established when activated cells revert to the resting memory state and shut down viral gene transcription (19, 62). Therefore, it was of interest to investigate the host gene transcription level in activated cells because the virus initially integrates into the genome of activated cells and because the level of transcription of viral genes is influenced by the local chromosomal environment (28). We measured the levels of host gene transcripts in activated, uninfected Bcl-2-transduced CD4+ T cells using SAGE and compared the results with the data from resting, uninfected Bcl-2-transduced cells. In both resting and activated CD4+ T cells, most of the randomly selected transcriptionally active genes are transcribed at relatively low levels (Fig. 3D). In contrast, those genes in which HIV-1 integration sites were detected in the acutely infected, persistently infected, and latently infected populations had high levels of transcription in both resting and activated states (Fig. 3D). The above-described analysis of the transcription level of host genes in which HIV-1 integration had occurred was based on SAGE of uninfected Bcl-2-transduced CD4+ T cells. To confirm that latent HIV-1 infection did not change the transcription level of these several hundred host genes in resting CD4+ T cells, we compared the levels of transcription of these genes in latently infected and uninfected populations of Bcl-2-transduced resting CD4+ T cells by microarray analysis and found no significant difference (data not shown). The results demonstrated that the SAGE data from uninfected CD4+ T cells properly represented the gene transcription in latently infected resting CD4+ T cells.
Taken together, these results suggest that latency in CD4+ T cells does not result from specific integration into a chromosomal environment that is repressive for transcription. The fact that viral genomes are found in actively transcribed host genes in acute, persistent, and latent infection is consistent with the idea that integration occurs in activated T cells at genes that are transcriptionally active and therefore accessible. As the cells revert to a resting state, the transcriptional environment becomes less favorable for HIV-1 gene expression even though most of the integration sites are located in genes that are actively expressed in resting CD4+ T cells.
As discussed above, latent HIV-1 proviruses were found preferentially in the same transcriptional orientation as the host gene. We further investigated whether the level of transcription correlated with orientation preference. No orientation preference was observed in acutely infected or persistently infected populations at any transcription level in either activated or resting CD4+ T cells (Fig. 4). In the latently infected population, the ratio between same orientation and convergent orientation was greater in genes with moderate- to high-level expression (Fig. 4). This preference was not seen in genes that had low-level expression, probably because the inefficient host gene transcription had no significant influence on HIV-1 transcription. Together, these results suggest that orientation preference of latent HIV-1 is correlated with host gene transcription level, providing further evidence that host gene expression affects latency.
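The stratified comparison can be sketched as a tally of same versus convergent sites within expression bins. The bin cutoffs below (10 and 100 tags per million) are illustrative assumptions, not the thresholds used for Fig. 4, and the function names and example data are hypothetical.

```python
def orientation_by_expression(sites, abundance, cutoffs=(10.0, 100.0)):
    """Count same and convergent sites per host gene expression bin.

    sites is an iterable of (gene, orientation) pairs, where orientation
    is "same" or "convergent". abundance maps gene name to tags per
    million; genes missing from the mapping fall in the "low" bin.
    """
    low_cut, high_cut = cutoffs
    table = {name: {"same": 0, "convergent": 0} for name in ("low", "moderate", "high")}
    for gene, orientation in sites:
        level = abundance.get(gene, 0.0)
        if level < low_cut:
            bin_name = "low"
        elif level < high_cut:
            bin_name = "moderate"
        else:
            bin_name = "high"
        table[bin_name][orientation] += 1
    return table


def same_to_convergent_ratio(counts):
    """Ratio of same to convergent sites, or None if no convergent sites."""
    if counts["convergent"] == 0:
        return None
    return counts["same"] / counts["convergent"]


# Toy data, for illustration only.
toy_sites = [("G1", "same"), ("G1", "convergent"), ("G2", "same"),
             ("G2", "same"), ("G2", "convergent"), ("G3", "same")]
toy_abundance = {"G1": 2.0, "G2": 50.0, "G3": 500.0}
for bin_name, counts in orientation_by_expression(toy_sites, toy_abundance).items():
    print(bin_name, counts, same_to_convergent_ratio(counts))
```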
A previous study using resting CD4+ T cells from patients on HAART showed that HIV-1 genomes were preferentially integrated into actively transcribed host genes (18). However, the majority of proviruses in resting CD4+ T cells cannot be induced to release replication-competent viruses following cellular activation (8). Thus, it is possible that the nature of integration sites of replication-competent latent HIV-1 is different from that observed for the whole population of integrated proviruses. In this study, we analyzed HIV-1 integration sites in primary CD4+ T cells capable of upregulating HIV-1 gene expression following cellular activation. We showed that more than 90% of the latent proviruses were integrated within actively transcribed host genes, most of which were transcribed at high levels in uninfected cells. These results suggest that in primary resting CD4+ T cells, HIV-1 can stably maintain a state of latency even within highly active genes and can be reactivated upon stimulation. Integration into a repressive chromosomal region is therefore not necessary for transcriptional silencing of integrated proviruses.
Integrated HIV-1 was found equally in same and convergent orientations relative to the direction of host gene transcription in both acutely infected (46) and persistently infected (18, 35, 53) CD4+ T cells. Surprisingly, we found that the same orientation was significantly preferred in latently infected cells. Since this orientation preference was not observed in acutely or persistently infected cells in our experiments, the possibility that the orientation preference was the result of the integration site detection assays can be excluded. The fact that an orientation preference was not seen in cell line models indicates that viral latency established in cell line models is at least partially different from that in primary cells. Our results suggest that same orientation provides a more repressive local chromosomal environment for viral transcription than convergent orientation. However, approximately one-third of latent proviruses were in the convergent orientation, indicating that this orientation preference is caused by a modest effect(s) which contributes only partially to the establishment of latency.
It is also important to point out that an orientation bias has not yet been observed in cells harboring replication-competent virus in patients. This reflects the technical difficulty in simultaneously assessing replication competence and integration site orientation in individual cells from patients. A further caveat with our in vitro primary-cell model is the possibility that overexpression of Bcl-2 may have some unanticipated effect on the mechanisms involved in latency. In addition, the model makes use of a virus with inactivating mutations in all viral ORFs except tat and rev. This was done to decrease viral cytopathic effects and increase the yield of latently infected cells. It is possible that viral gene products could modify the state of the host cell in a way that affects the establishment of latency.
Among all currently known mechanisms which maintain viral latency, only RNA interference and transcriptional interference can possibly explain orientation preferences. In the convergent orientation, viral transcripts and host gene read-through products could in principle form a double-strand RNA duplex, which most likely would have a negative influence on viral gene expression. This hypothetical effect of RNA interference would favor the establishment of latency for proviruses in the convergent orientation. Since the opposite preference was observed in our primary-cell model, we conclude that double-strand RNA, if produced, has a minor effect on viral gene expression. We cannot exclude the possibility that double-strand RNA may turn on cellular pathways and indirectly influence viral gene transcription.
Transcriptional interference between adjacent promoters is also reported to control viral transcription (17, 20, 34). When proviruses are integrated in the same orientation as the host gene, transcription complexes initiating at the promoter of the upstream host gene can read through the viral promoter and inhibit viral transcription by dislodging prebound transcription initiation/elongation complexes assembled on the LTR or preventing the transcription initiation complex from binding to the LTR. The poly(A) signal sequence in the R region of the 5′ LTR can cause termination of host gene transcription and accumulation of transcription elongation complexes on the LTR, which further inhibits viral transcription initiation, but cannot prevent host gene transcription complexes from reading through the viral promoter region. Therefore, when virus integrates in the same orientation as host gene transcription, the viral promoter is always under negative control by the host promoter, while host promoter activity is not directly influenced by the virus. In the convergent orientation, the host promoter and the viral promoter interfere with each other. Moreover, the U3 region in the 3′ LTR can also initiate transcription, which can partially prevent the host gene read-through and protect transcription from the 5′ LTR (Fig. 5). These considerations may explain why proviruses in latently infected cells are more commonly found in the same orientation as the host gene.
Overall, our experiments suggest that host gene transcription level and orientation are involved in the establishment of HIV-1 latency. Viral transcription is more efficiently inhibited when the virus lies in the same orientation as the direction of host gene transcription. Establishment of HIV-1 latency is believed to be a multifactorial process. The state of latency is maintained by a balance among many factors. Disrupting any single factor can break the balance and result in the reactivation of latent HIV-1. Inhibition of transcriptional interference from host genes provides a potential mechanism to disrupt latency.
This work was supported by NIH grant AI43222 and by the Howard Hughes Medical Institute.
Published ahead of print on 23 March 2011.