Human immunodeficiency virus (HIV) has a small genome and therefore relies heavily on the host cellular machinery to replicate. Identifying which host proteins and complexes come into physical contact with the viral proteins is crucial for a comprehensive understanding of how HIV rewires the host’s cellular machinery during the course of infection. Here we report the use of affinity tagging and purification mass spectrometry (refs 1–3) to determine systematically the physical interactions of all 18 HIV-1 proteins and polyproteins with host proteins in two different human cell lines (HEK293 and Jurkat). Using a quantitative scoring system that we call MiST, we identified with high confidence 497 HIV–human protein–protein interactions involving 435 individual human proteins, with ~40% of the interactions being identified in both cell types. We found that the host proteins hijacked by HIV, especially those found interacting in both cell types, are highly conserved across primates. We uncovered a number of host complexes targeted by viral proteins, including the finding that HIV protease cleaves eIF3d, a subunit of eukaryotic translation initiation factor 3. This host protein is one of eleven identified in this analysis that act to inhibit HIV replication. This data set facilitates a more comprehensive and detailed understanding of how the host machinery is manipulated during the course of HIV infection.
A map of the physical interactions between proteins within a particular system is necessary for studying the molecular mechanisms that underlie the system. The analysis of protein–protein interactions (PPIs) has been successfully accomplished in different organisms using a variety of technologies, including mass spectrometry approaches (refs 1, 3, 4) and those designed to detect pairwise physical interactions, including the yeast two-hybrid system (refs 5, 6) and protein-fragment complementation assays (ref. 7). Although two-hybrid methodologies have been used to systematically study host–pathogen interactions (refs 8, 9), so far no systematic affinity tagging/purification mass spectrometry (AP–MS) study has been carried out on any host–pathogen system. Here we have targeted HIV-1 for such an analysis, uncovering a wide variety of host proteins, complexes and pathways that are hijacked by the virus during the course of infection.
We aimed to identify host proteins associated with HIV-1 proteins systematically and quantitatively using an AP–MS approach (refs 2, 3). To this end, we cloned the genes corresponding to all 18 HIV-1 proteins and polyproteins, including the accessory factors (Vif, Vpu, Vpr and Nef), Tat, Rev, the polyproteins (Gag, Pol and Gp160) and the corresponding processed products (MA, CA, NC and p6; PR, RT and IN; and Gp120 and Gp41, respectively) (Supplementary Fig. 1 and Supplementary Table 1). Each clone was fused to a purification tag (consisting of 2×Strep and 3×Flag) and transiently transfected into HEK293 cells; each was also used to generate stably expressed, tetracycline-inducible, affinity-tagged versions of the proteins in Jurkat cells (Fig. 1a and Supplementary Fig. 2). Following multiple purifications of each factor from both cell lines, the material on the anti-FLAG or Strep-Tactin beads, as well as the eluted material, was analysed by mass spectrometry (Fig. 1a and Supplementary Table 2). Finally, an aliquot of each purified factor was subjected to SDS–polyacrylamide gel electrophoresis, stained (Supplementary Fig. 3) and subjected to analysis by mass spectrometry.
For each HIV factor, we identified co-purifying host proteins that were reproducible regardless of the protocol used (Supplementary Figs 4, 5 and 7 and Supplementary Data 1). Several scoring systems can quantify PPIs from AP–MS proteomic data sets, including PE (ref. 10), CompPASS (ref. 4) and SAINT (ref. 11). For this data set, we devised a scoring system particularly suited for identifying AP–MS-derived host–pathogen PPIs, which we call MiST (mass spectrometry interaction statistics). The MiST score is a weighted sum of three measures: protein abundance measured by peak intensities from the mass spectrum (abundance); invariability of abundance over replicated experiments (reproducibility); and uniqueness of an observed host–pathogen interaction across all viral purifications (specificity) (Fig. 1b and Supplementary Methods). These three metrics are combined by principal component analysis into a composite score (Fig. 1c and Supplementary Data 2). Benchmarked against a set of well-characterized HIV–human PPIs (Supplementary Table 3), the MiST scoring system performed better on our data set than CompPASS or SAINT (Supplementary Fig. 6) and comparably on other data sets (Supplementary Fig. 8). This benchmark allowed us to define a MiST cut-off of 0.75, corresponding to ~4% of all detected interactions. To estimate how many interactions would exceed this threshold by chance, we randomly shuffled our data set 1,000 times. A random MiST score of 0.75 or greater was assigned to an interaction ten times less frequently than we saw among the MiST scores for the real data, and the probability of an interaction assignment with a random MiST score greater than 0.75 was 2.5 × 10⁻⁴ (Fig. 1d).
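The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' MiST implementation: the weights, the entropy-based reproducibility measure, the shuffling scheme, the function names and the toy intensities are all assumptions made for illustration. Only the three measures, the weighted sum, the 0.75 cut-off and the use of 1,000 shuffles come from the text.

```python
import math
import random
from statistics import mean

# Hypothetical weights for illustration only; the text says the real
# combination is obtained by principal component analysis.
WEIGHTS = {"abundance": 0.3, "reproducibility": 0.3, "specificity": 0.4}
THRESHOLD = 0.75  # MiST cut-off quoted in the text


def abundance(replicates):
    """Mean intensity over replicates (assumed pre-normalized to 0..1)."""
    return mean(replicates)


def reproducibility(replicates):
    """Normalized entropy: 1.0 when replicates are equal and non-zero."""
    total = sum(replicates)
    if total == 0 or len(replicates) < 2:
        return 0.0
    probs = [x / total for x in replicates if x > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(replicates))


def specificity(data, bait, prey):
    """Share of the prey's summed abundance that is seen with this bait."""
    own = mean(data[(bait, prey)])
    total = sum(mean(v) for (_, p), v in data.items() if p == prey)
    return own / total if total else 0.0


def toy_score(data, bait, prey, weights=WEIGHTS):
    """Weighted sum of the three measures for one bait-prey pair."""
    reps = data[(bait, prey)]
    return (weights["abundance"] * abundance(reps)
            + weights["reproducibility"] * reproducibility(reps)
            + weights["specificity"] * specificity(data, bait, prey))


def shuffled_exceedance(data, n_shuffles=1000, threshold=THRESHOLD, seed=0):
    """Fraction of scores at or above the threshold after shuffling the
    replicate vectors among bait-prey pairs (an assumed shuffling scheme)."""
    rng = random.Random(seed)
    keys, values = list(data), list(data.values())
    hits = total = 0
    for _ in range(n_shuffles):
        rng.shuffle(values)
        shuffled = dict(zip(keys, values))
        for bait, prey in keys:
            hits += toy_score(shuffled, bait, prey) >= threshold
            total += 1
    return hits / total


# Made-up intensities for three bait-prey pairs, three replicates each.
toy_data = {
    ("PR", "prey_1"): [0.8, 0.8, 0.8],
    ("Tat", "prey_1"): [0.0, 0.0, 0.0],
    ("Tat", "prey_2"): [0.1, 0.0, 0.4],
}
score = toy_score(toy_data, "PR", "prey_1")
passes = score >= THRESHOLD
chance_rate = shuffled_exceedance(toy_data, n_shuffles=100)
```

In this toy case the PR pair is abundant, identical across replicates and seen with no other bait, so all three terms are high and the pair clears the cut-off. The shuffled rate plays the role of the chance estimate described in the text.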
At the MiST threshold of 0.75, the number of host proteins we found associated with each HIV protein ranged from 0 (CA and p6) to 63 (Gp160) (Fig. 1e). In total, we observed 497 different HIV–human PPIs (347 and 348 identified from HEK293 cells and Jurkat cells, respectively) (Supplementary Data 3). We detected 196 interactions (~40%) in both cell types; 150 and 151 were specific to the HEK293 cells and the Jurkat cells, respectively (Fig. 1e). Only some of these specificities could be explained by differential gene expression in the two cell lines (Supplementary Fig. 9). Using antibodies against 26 of the human proteins, and affinity-tagged versions of an additional 101, we could confirm 97 of the 127 AP–MS derived HIV–human PPIs using co-immunoprecipitation/western blot analysis (76% success rate) (Supplementary Figs 10 and 11), suggesting that we derived a high-quality physical interaction data set.
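As a quick check of the figures quoted above, the following sketch recomputes the total number of interactions, the fraction shared between the two cell lines and the validation rate. The variable names are illustrative; the counts are taken directly from the text.

```python
# Counts quoted in the text.
both, hek293_only, jurkat_only = 196, 150, 151
validated, tested = 97, 127

total_ppis = both + hek293_only + jurkat_only
shared_fraction = both / total_ppis
validation_rate = validated / tested

print(total_ppis)                  # 497
print(round(shared_fraction, 3))   # 0.394
print(round(validation_rate, 3))   # 0.764
```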
We next analysed the functional categories of host proteins associated with each HIV protein, and in doing so uncovered many expected connections. These included an enrichment of host factors involved in transcription physically linked to the HIV transcription factor Tat and an enrichment of host machinery implicated in the regulation of ubiquitination associating with Vpu, Vpr and Vif, HIV accessory factors that hijack ubiquitin ligases (ref. 12; Fig. 1f and Supplementary Data 4). When we considered domain types instead of whole proteins (Fig. 1g and Supplementary Table 4), we found that host proteins interacting with IN are enriched in 14-3-3 domains, which generally bind phosphorylated regions of proteins (ref. 13), and that proteins containing β-propellers have a higher propensity for binding to Vpr (for additional domain enrichment analysis, see Supplementary Fig. 12). These domain analyses could facilitate future structural modelling of HIV–human PPIs.
Next we compared our data to other HIV-related data sets, including previously published HIV–human PPIs and host factors implicated in HIV function from genome-wide RNA interference (RNAi) screens. For example, the VirusMint database (ref. 14) contains 587 HIV–human literature-curated PPIs (Supplementary Data 5), which are mostly derived from small-scale, targeted studies. Although the overlap between the 497 interactions identified in this work and those in VirusMint is statistically significant (P = 8 × 10⁻⁸), it corresponds to only 19 PPIs (Fig. 2a and Supplementary Table 5). However, a greater overlap exists, one that remains statistically significant, when interactions below the MiST threshold of 0.75 are considered using a sliding cut-off (for example, at a MiST score of 0.2 there exists an overlap of 67 PPIs (P = 1 × 10⁻³); Fig. 2c, red lines, and Supplementary Data 6). This overlap indicates that we have indeed identified many interactions that have been previously reported. However, it is likely that the higher scoring interactions identified here have a greater chance of being biologically relevant with respect to HIV function than do many of those in VirusMint.
Recently, four RNAi screens identified host factors that have an adverse effect on HIV-1 replication when knocked down (refs 15–18). In total, 1,071 human genes were identified in these four studies (Supplementary Data 7), 55 of which overlap with the 435 proteins (P = 2.7 × 10⁻¹⁰; Fig. 2b, Supplementary Fig. 12 and Supplementary Table 6). Again, this overlap increases (as does its statistical significance) if we consider proteins participating in HIV–human PPIs with MiST scores below 0.75 (Fig. 2c, blue lines, and Supplementary Data 8).
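Overlap significance of the kind reported for these comparisons is commonly assessed with a hypergeometric tail probability. The sketch below shows that calculation under stated assumptions: the text does not say which test was used or what background set was chosen, so the function name and the 20,000-gene universe are hypothetical, and the result should not be expected to reproduce the published P values.

```python
from math import comb


def overlap_p_value(universe, set_a, set_b, overlap):
    """P(X >= overlap) when set_b items are drawn without replacement
    from a universe that contains set_a marked items."""
    upper = min(set_a, set_b)
    favourable = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(universe, set_b)


# Small case that can be checked by hand:
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120 = 1/3
small = overlap_p_value(universe=10, set_a=4, set_b=3, overlap=2)

# Counts from the text (1,071 RNAi hits, 435 interactors, 55 shared);
# the universe of 20,000 genes is an assumption, not a value from the text.
p_rnai = overlap_p_value(universe=20000, set_a=1071, set_b=435, overlap=55)
```

The choice of universe strongly affects the result, which is why it is exposed as an explicit parameter rather than fixed inside the function.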
To identify the evolutionary forces operating on host proteins interacting with HIV-1, we performed a comparative genomics analysis of divergence patterns between human and rhesus macaque. The proteins identified in both HEK293 and Jurkat cell lines had stronger signatures of evolutionary constraint than those identified exclusively in one cell line or in VirusMint (Fig. 2d). Points in the lower-right quadrant of Fig. 2d show signatures of strong purifying selection, whereas the upper-right quadrant shows signatures more consistent with neutral evolution. This observation suggests that the PPIs identified in our study, especially the ones identified in both cell types, are more physiologically relevant to mammalian evolution than those reported in VirusMint.
We next plotted the 497 HIV–human interactions identified in this study in a network representation (Fig. 3) containing nodes corresponding to 16 HIV factors (yellow) and 435 human factors that were derived from the HEK293 cells (blue), Jurkat cells (red) or both. We also introduced 289 interactions between human proteins (black edges) derived from several databases (refs 19, 20; Supplementary Data 9). These human–human interactions helped to identify many host complexes, including several that have been previously characterized (see Supplementary Information for a detailed discussion of the HIV–human interaction data sets). Ultimately, all data will be accessible for searching and comparison to other HIV-related data sets using the web-based software GPS-PROT (ref. 21; http://www.gpsprot.org/).
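A network of this kind can be held in a simple adjacency structure. The sketch below is only an illustration of the representation described above: the function name is hypothetical, and the edges use placeholder human protein names and invented cell-line assignments rather than data from the study.

```python
from collections import defaultdict


def build_network(hiv_human_edges, human_human_edges):
    """Return (adjacency, origin): an undirected adjacency map and, for
    each human node, the cell line(s) in which it was detected."""
    adjacency = defaultdict(set)
    cell_lines = defaultdict(set)
    for hiv, human, cell_line in hiv_human_edges:
        adjacency[hiv].add(human)
        adjacency[human].add(hiv)
        cell_lines[human].add(cell_line)
    for a, b in human_human_edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    origin = {
        human: "both" if len(lines) == 2 else next(iter(lines))
        for human, lines in cell_lines.items()
    }
    return adjacency, origin


# Placeholder edges (not taken from the data set).
hiv_human = [
    ("PR", "human_A", "HEK293"),
    ("PR", "human_A", "Jurkat"),
    ("PR", "human_B", "Jurkat"),
    ("Vif", "human_C", "HEK293"),
]
human_human = [("human_A", "human_B")]

adjacency, origin = build_network(hiv_human, human_human)
```

Human proteins that share an HIV bait and are also linked by a human–human edge, such as the two PR partners here, are the kind of grouping that points to a targeted host complex.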
Notably, we found that Pol and PR, which we needed to make catalytically inactive (Supplementary Fig. 1), bound the translational initiation complex eIF3, a 13-subunit complex (eIF3a to eIF3m). We detected 12 of the subunits bound to Pol and/or PR; the exception was eIF3j, which is only loosely associated with the complex (ref. 22; Fig. 4a). Even though PR is the smallest of the pol-encoded proteins, we found it associated with the greatest number of host factors (Fig. 4a). To determine whether components of the translation complex are substrates for PR, FLAG-tagged versions of ten eIF3 subunits were individually co-transfected, each with a small amount of active HIV-1 PR, into HEK293 cells. The cell lysates were analysed by western blotting and only eIF3d was found to be cleaved (Fig. 4b). Purification of tagged versions of the amino and carboxy termini of cleaved eIF3d revealed that only the N terminus of 114 amino-acid residues associates with the eIF3 complex (Supplementary Table 7). The cleavage occurred with an efficiency similar to that of the processing of the natural PR substrate Gag (Fig. 4c), whereas two cellular proteins previously described to be cleaved by HIV PR, PABPC1 (ref. 23) and BCL2 (ref. 24), were cleaved only at higher PR concentrations or not at all, respectively. To confirm this result in vitro, we incubated purified human eIF3 with active PR, resulting in the removal of a 70-kDa band and the appearance of a ~60-kDa protein product (Fig. 4d). Analysis of the cleaved product by N-terminal sequencing revealed a cleavage of eIF3d between Met 114 and Leu 115, which corresponds to the consensus sequence for HIV-1 protease (ref. 25) and falls within the RNA-binding domain (RRM) of eIF3d (ref. 26; Fig. 4d).
Next we used four to six short interfering RNAs against different eIF3 subunits in HIV infectivity assays (Fig. 4e, f, Supplementary Fig. 14 and Supplementary Table 8). Using a fusion of HIV with vesicular stomatitis virus glycoprotein (VSV-G), which only allows for a single round of replication, knockdown of eIF3d, but not other eIF3 subunits, resulted in an increase in infectivity (Fig. 4e), suggesting that this factor acts in early stages of infection. In assays requiring multiple rounds of HIV infection, knockdown of eIF3d, eIF3e and eIF3f enhanced HIV NL4.3 infectivity by a factor of three to five, whereas inhibition of eIF3c, eIF3g and eIF3i had no promoting effect (Fig. 4f). Consistent with these results, a previous overexpression screen for factors that restrict HIV-1 replication identified eIF3f as the most potent inhibitory clone (ref. 27). Furthermore, using assays monitoring both early and late products we found that knockdown of eIF3d results in an increase in accumulation of reverse transcription product (Fig. 4g and Supplementary Fig. 15). This suggests that eIF3 does in fact have a role in the early stages of infection, perhaps by binding to the viral RNA through the RNA-binding domain in eIF3d, and thus inhibiting RT, an effect that is overcome by PR cleavage of eIF3d (Supplementary Fig. 16). These results suggest that our data set will be enriched not only for host proteins the virus requires for efficient replication (Fig. 2b, c), but also those that have an inhibitory role during infection. Indeed, we have found that an additional ten factors from our list of interactors, when knocked down by RNAi, produce an increase in HIV infection (Supplementary Figs 17–19, Supplementary Tables 12 and 13 and Supplementary Methods). Knockdown of two of these, DESP and HEAT1, also resulted in an increase in HIV integration (Supplementary Fig. 20 and Supplementary Table 14), consistent with their physical association with IN.
As well as performing the systematic AP–MS study reported here, we explored in further detail the biological significance of two newly identified HIV–human interactions: HIV protease targeting a component of eIF3 that is inhibitory to HIV replication; and CBF-β, a new component of the Vif–CUL5 ubiquitin ligase complex required for APOBEC3G stability and HIV infectivity (ref. 28). Further work will be required to determine whether, how and at what stage of infection the remaining host factors impinge on HIV function. Ultimately, our analysis of the host factors co-opted by different viruses using the same proteomic pipeline will allow for the identification of protein complexes routinely targeted by different pathogens, which may represent better therapeutic targets for future studies.
More details on experimental assays, plasmid constructs, sequences, cell lines, antibodies and computational analysis are provided in Supplementary Methods. Briefly, affinity tagging and purification were carried out as previously described (ref. 2) and the protein samples were analysed on a Thermo Scientific LTQ Orbitrap XL mass spectrometer. For the evolutionary analysis, genome-wide alignments to rhesus macaque were downloaded from the University of California, Santa Cruz genome browser (http://genome.ucsc.edu/) and evolutionary rates for each group of genes considered were measured using the synonymous and non-synonymous rates of evolution. For the in vitro protease assay, maltose binding protein (MBP)-tagged PR was expressed in BL21 (Gold) DE3 cells in the presence of 100 μM Saquinavir and purified on an MBP trap column. Purified eIF3 was obtained from J. Cate (University of California, Berkeley). For the infection assays, HeLa P4.R5 cells were transfected with short interfering RNAs and after 48 h infected with pNL4-3 or a pNL4-3-derived VSV-G-pseudotyped reporter virus. Infection levels were determined by luminescence read-out.
We thank A. Choi, Z. Rizvi and E. Kwon for cloning of human genes and J. Cate for purified eIF3. We also thank J. Gross, R. Andino, R. Harris, M. Daugherty and members of the Krogan lab for discussion. This research was funded by grants from QB3@UCSF and the National Institutes of Health (P50 GM082250 to N.J.K., A.D.F., C.S.C. and T.A.; P01 AI090935 to N.J.K., S.K.C., J.A.Y. and F.D.B.; P50 GM081879 to N.J.K. and A.B.; P50 GM082545 to W.I.S.; P41RR001614 to A.B.; U54 RR022220 to A.S.; P01 GM073732-05 to A.T.; CHRP-ID08-TBI-063 to S.K.C.; P41 RR001081 to J.H.M.) and from the Nomis Foundation (to J.A.Y.). N.J.K. is a Searle Scholar and a Keck Young Investigator.
Author Contributions S.J. generated the protein–protein interaction map; P.C. developed the MiST scoring system; N.G., M. Shales, E.A., M.F., J.H.M., J.R.J. and R.D.H. provided computational support; K.E.M., K.L., J.R.J., H.H., G.M.J., I.D., J.F. and D.A.M. provided experimental support; S.J., S.C.C., A.J.O. and A.T. characterized the PR–eIF3d interaction; S.J., G.M.J., C.M. and G.M. confirmed the interactions by immunoprecipitation/western blot; L.P., S.L.R., J.M. and M. Stephens used RNAi for functional verification; T.A., G.C., F.D.B., J.A.Y., S.K.C., W.I.S., T.K., R.D.H., C.S.C., A.B., A.S., A.D.F. and N.J.K. supervised the research; and S.J., P.C., A.S. and N.J.K. wrote the manuscript.
Supplementary Information is linked to the online version of the paper at www.nature.com/nature.
The authors declare no competing financial interests.