Creative Commons Attribution Non-Commercial License applies to Author Choice Articles
Post-translational hydroxylation has been considered an unusual modification on intracellular proteins. However, following the recognition that oxygen-sensitive prolyl and asparaginyl hydroxylation are central to the regulation of the transcription factor hypoxia-inducible factor (HIF), interest has centered on the possibility that these enzymes may have other substrates in the proteome. In support of this, certain ankyrin repeat domain (ARD)-containing proteins, including members of the IκB and Notch families, have been identified as alternative substrates of the HIF asparaginyl hydroxylase factor inhibiting HIF (FIH). Although these findings imply a potentially broad range of substrates for FIH, the precise extent of this range has been difficult to determine because of the difficulty of capturing transient enzyme-substrate interactions. Here we describe the use of pharmacological “substrate trapping” together with stable isotope labeling by amino acids in cell culture (SILAC) technology to stabilize and identify potential FIH-substrate interactions by mass spectrometry. To pursue these potential FIH substrates, we used conventional data-directed tandem MS together with alternating low/high collision energy tandem MS to assign and quantitate hydroxylation at target asparaginyl residues. Overall, the work has defined 13 new FIH-dependent hydroxylation sites with a degenerate consensus corresponding to that of the ankyrin repeat and a range of ARD-containing proteins as actual and potential substrates for FIH. Several ARD-containing proteins were multiply hydroxylated, and detailed studies of one, Tankyrase-2, revealed eight sites that were differentially sensitive to FIH-catalyzed hydroxylation. These findings indicate that asparaginyl hydroxylation is likely to be widespread among the ~300 ARD-containing species in the human proteome.
Post-translational hydroxylation is well established as a modification of collagen and other extracellular proteins but has been considered to be rare in intracellular proteins (1). Recently, however, hydroxylations of specific prolyl and asparaginyl residues have been defined as oxygen-regulated signals that determine the stability and activity of the HIF transcriptional complex. Both reactions are catalyzed by members of the 2-oxoglutarate (2OG)-dependent dioxygenase superfamily: HIF prolyl hydroxylation by PHD (prolyl hydroxylase domain) 1–3 and HIF asparaginyl hydroxylation by FIH (for a review, see Ref. 2).
Following the identification of the HIF hydroxylases, searches for alternative (non-HIF) substrates of these enzymes have identified certain IκB and Notch family members and ASB4 (ankyrin repeat and SOCS box protein 4) as substrates of FIH (3–6). These intracellular proteins all contain ARDs, and in each case the target asparagine residues lie within the ARD. The ARD is one of the most common amino acid motifs in nature; it is present in over 300 proteins in the human genome (SMART (simple modular architecture research tool) database (7)) and conserved in all kingdoms of life (for a review, see Ref. 8). ARDs are composed of a variable number of 33-residue repeats that individually fold into paired antiparallel α-helices linked by a β-hairpin type turn. The hydroxylated asparagine residue is positioned within the hairpin loop that links individual repeats.
These findings suggest that asparaginyl hydroxylation might be much more prevalent in intracellular proteins than has been appreciated previously, particularly among ARD-containing proteins. However, this has not been noted in proteomics surveys to date. Furthermore the protein association methods used so far to identify FIH-associated proteins, including yeast two-hybrid screens and affinity purification (AP)-MS technology, have only identified a limited number of ARD-containing proteins as molecules interacting with FIH (3, 4, 9, 10).
Although AP-MS can be a powerful method, potentially permitting the identification of protein-protein interactions in a physiological context, the preservation of transient protein associations such as those between enzymes and substrates presents a major challenge to this technology. It was thus possible that important FIH protein-substrate associations had been overlooked. We therefore sought to improve methods for identification of such interactions and for the determination of the extent of FIH-catalyzed hydroxylation in substrate proteins. In analyses of FIH with known HIF, IκB, and Notch receptor substrates we noted that the enzyme-substrate interaction could be stabilized by pretreatment of cells with dimethyloxalylglycine (DMOG; a cell-penetrant inhibitor of 2OG-dependent oxygenases that is metabolized to the 2OG analogue N-oxalylglycine) and defined conditions under which DMOG could be used as a “substrate trapping” agent.
Here we describe comparative proteomics screens of untreated cells and cells pre-exposed to DMOG, the use of SILAC to identify preferential DMOG-stabilized interactions with FIH, and the use of alternating low/high collision energy tandem MS to provide simultaneous assignment and quantification of specific sites of FIH-mediated hydroxylation in target proteins. In total, the work identified 12 ARD-containing proteins that associate with FIH in a DMOG-enhanced manner. Detailed MS-based characterization of three of these proteins, Rabankyrin-5, RNase L, and Tankyrase-2, confirmed that all are FIH substrates and revealed the presence of multiple hydroxylation sites that are differentially hydroxylated by FIH, including at least eight sites on Tankyrase-2. The findings indicate that asparaginyl hydroxylation is a common post-translational modification, at least among ARD-containing proteins, and identify these proteins as the largest class of protein hydroxylation targets known to date.
Human embryonic kidney (HEK) 293 cells stably expressing SPA-tagged FIH (Sequential Peptide Affinity tag; 3× FLAG epitope tag, tobacco etch virus protease site, and calmodulin binding peptide (11)) were used in proteomics screens for FIH-co-precipitating proteins. An expression construct for stable expression of SPA-tagged fusion proteins (pcDNA3/NSPA) was created by inserting an N-terminal SPA tag (custom synthesis; GenScript Corp.) into pcDNA3 (Invitrogen) via BamHI/EcoRI sites into which full-length FIH (or EGFP control) cDNA generated by PCR was subcloned. Sequence-verified constructs were transfected into 293 cells, and stable clones were identified by selection in G418 (1 mg/ml). Clones expressing the lowest levels of FLAG-transgene were used in this study. For the screen, SPA-FIH and control cells (SPA-EGFP) were cultured in the presence or absence of 1 mM DMOG for 16 h. Cells were harvested in IP+ buffer (3), and SPA-tagged complexes were immunopurified with FLAG affinity gel (EZview™, Sigma) prior to elution (500 mM NH4OH (pH 11), 0.5 mM EDTA), dilution in Laemmli buffer, and SDS-PAGE analysis. Co-precipitating species that demonstrated DMOG-inducible capture upon Coomassie Blue or silver staining were excised and digested with trypsin. Peptides were analyzed by tandem MS on a Q-TOF Premier™ instrument (Waters). This approach led to the identification of RIPK4 (four unique peptides; Mascot score, 70) and RNase L (four unique peptides; Mascot score, 152).
For the SILAC screen, U2OS cells expressing FLAG-FIH (tet-FIH) or empty vector (tet-EV) under the control of a doxycycline-inducible promoter were used (3). Two isotopically distinct populations of tet-FIH cells were created by serial passage in arginine- and lysine-deficient Dulbecco's modified Eagle's medium containing 10% dialyzed fetal bovine serum supplemented with either normal (“light”) isotopic abundance (0.68 μM) L-lysine and (0.54 μM) L-arginine or with heavy isotopic forms of L-lysine (U-13C6; Lys6) and L-arginine (U-13C6,15N4; Arg10) at identical concentrations (SILAC Protein ID and Quantitation Media kit; Invitrogen). FIH was induced in both populations after six cell doublings by addition of doxycycline (0.5 μg/ml; 18 h), whereas only the heavy population was exposed to DMOG (1 mM; 16 h). Cells were harvested in IP+ buffer, quantitated, and normalized for total protein content. Efficient incorporation of the heavy label was confirmed by digesting 20 μg of methanol/chloroform-precipitated cell lysate with trypsin and analysis by tandem MS; >99% of peptides assigned by Mascot carried a mass label (data not shown). FIH complexes were immunopurified from heavy and light lysates by FLAG affinity gel. The affinity gel was washed, pooled, and eluted before desalting and digestion with trypsin. Proteins binding in a DMOG-inducible manner were assigned on the basis of an increased ratio of heavy to light peptides as determined by Mascot or ProteinLynx Global Server (PLGS; Waters). A control (FLAG) immunoprecipitation (IP) was performed in parallel on tet-EV cells that were passaged in light medium and exposed to doxycycline (0.5 μg/ml; 18 h) and DMOG (1 mM; 16 h). This IP provided a list of contaminants that was subtracted from the FIH screen to define specific FIH interactors.
Material was subjected to nano-ultraperformance liquid chromatography tandem mass spectrometry analysis (nano-UPLC-MSE or -MS/MS) using a 75-μm-inner diameter × 25-cm C18 nanoAcquity™ UPLC™ column (1.7-μm particle size; Waters) and a 90-min gradient of 2–45% solvent B (solvent A: 99.9% H2O, 0.1% formic acid; solvent B: 99.9% acetonitrile, 0.1% formic acid) on a Waters nanoAcquity UPLC system (final flow rate, 250 nl/min; 7000 p.s.i.) coupled to a Q-TOF Premier tandem mass spectrometer (Waters). Data were acquired in high definition low/high collision energy MS (MSE) mode (low collision energy, 4 eV; high collision energy ramping from 15 to 40 eV, switching every 1.5 s). Alternatively, MS analysis was performed in data-directed analysis (DDA) mode (MS to MS/MS switching at precursor ion counts greater than 10 and MS/MS collision energy dependent on precursor ion mass and charge state). All raw MS data were processed with PLGS software (version 2.2.5) including deisotoping. For MSE data, MS/MS spectra were reconstructed by combining all masses with identical retention times. The mass accuracy of the raw data was corrected using Glu-fibrinopeptide (200 fmol/μl; 700 nl/min flow rate; 785.8426 Da [M + 2H]2+) that was infused into the mass spectrometer as a lock mass during sample analysis. MS, MSE, and MS/MS data were calibrated at intervals of 30 s. A UniProtKB/Swiss-Prot database (release 55; June 17, 2008; number of human sequence entries, 19,804) was used for database searches of each run with the following parameters: peptide tolerance, 15 ppm; fragment tolerance, 0.015 Da; trypsin missed cleavages, 1; variable modifications, carbamidomethylation and Met/Pro/Asn/Lys/Asp oxidation. Assignments of asparaginyl hydroxylations that were detected by PLGS were evaluated and verified upon manual inspection. In every case, peptides containing hydroxyasparagine were uniquely assigned to one protein. Each MS/MS spectrum was processed for deisotoping and deconvolution using MaxEnt3 (MassLynx 4.1), and all assignments are documented by an MS/MS spectrum included in this study.
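The search tolerances and the +16-Da hydroxylation shift described above can be illustrated with a short calculation. The following Python sketch is not part of the original analysis pipeline (which used PLGS and Mascot); the function names and the example peptide mass of 2000.0 Da are hypothetical, and only the 15 ppm peptide tolerance is taken from the text. It shows how a hydroxylated ion is expected to appear relative to its parental ion and how a ppm tolerance might be applied.

```python
# Standard monoisotopic constants.
PROTON = 1.007276    # mass of a proton, Da
OXYGEN = 15.994915   # mass added by one hydroxylation (+O), Da


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of an [M + zH]z+ ion for a given neutral monoisotopic mass."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float,
                     tol_ppm: float = 15.0) -> bool:
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Hypothetical doubly charged peptide with a neutral mass of 2000.0 Da.
parent = mz(2000.0, 2)
hydroxylated = mz(2000.0 + OXYGEN, 2)
shift = hydroxylated - parent  # half of 15.9949 for a 2+ ion
```

For a doubly charged ion the hydroxylated form is therefore expected roughly 8 m/z units above the parental form, which is the spacing one would look for when interrogating raw LC-MS data for co-eluting species.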
For the analysis of MSE-derived SILAC data, PLGS (version 2.2.5) was used to search against the UniProtKB/Swiss-Prot database (release 55) with the following parameters: peptide tolerance, 15 ppm; fragment tolerance, 0.015 Da; carbamidomethylation as a fixed modification; and [13C]Lys (+6 Da) and [13C,15N]Arg (+10 Da) as variable modifications. DDA-derived MS/MS spectra (peak lists) were searched against the UniProtKB/Swiss-Prot database using either PLGS as described or alternatively using Mascot version 2.2 (Matrix Science) with the following parameters: peptide tolerance, 0.2 Da; 13C = 1; fragment tolerance, 0.1 Da; missed cleavages, 2; instrument type, ESI-Q-TOF; variable modifications, carbamidomethylation, methionine/asparagine oxidation, and for SILAC data label Lys +6 Da and Arg +10 Da. All database searches were restricted to human species because of the complexity of the searches when combined with multiple modifications. The interpretation and presentation of MS/MS data were performed according to published guidelines (12). In addition, individual MS/MS spectra for peptides with a Mascot Mowse score lower than 40 (Expect <0.015) were inspected manually and included in the statistics only if a series of at least four continuous y or b ions was observed. For the analysis of MSE data using PLGS, ARD proteins were included when detected with a score above 20 and/or their probability to be present in the mixture was >50% as calculated by the software. Three other ARD proteins were included with a probability of less than 50% and a protein score of below 20 because the heavy versus light peptide ratios indicated that they were DMOG-inducible (Table I). Protein identification was also based on the assignment of at least two peptides with the exception of Notch2, which was shown previously to be an FIH substrate (Ref. 4; see supplemental Fig. S5 for MS/MS assignment).
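The acceptance rule for low-scoring Mascot assignments (score below 40, retained only with at least four continuous y or b ions) was applied by manual inspection in this study. Purely as an illustration of the rule, the Python sketch below encodes it as a filter; the function names and the representation of a spectrum as lists of matched ion indices are assumptions, not the authors' software.

```python
def longest_run(ion_indices):
    """Length of the longest run of consecutive integers in ion_indices."""
    observed = set(ion_indices)
    best = 0
    for i in observed:
        if i - 1 not in observed:      # i starts a new run
            length = 1
            while i + length in observed:
                length += 1
            best = max(best, length)
    return best


def accept_spectrum(score, b_ions, y_ions, score_cutoff=40, min_run=4):
    """Keep spectra at or above the score cutoff; keep lower-scoring ones
    only if the b or the y series has at least min_run continuous ions."""
    if score >= score_cutoff:
        return True
    return max(longest_run(b_ions), longest_run(y_ions)) >= min_run


# Hypothetical spectra: matched b- and y-ion indices for a peptide.
kept = accept_spectrum(25, b_ions=[2, 3], y_ions=[4, 5, 6, 7])
rejected = accept_spectrum(25, b_ions=[1, 3, 5], y_ions=[2, 4])
```

The check treats the b and y series separately, so that a run of four ions must come from a single series rather than being pieced together from both.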
To assess whether FIH interaction with a detected protein was inducible by DMOG, the ratios of peptides with incorporated stable amino acids (Lys6/Arg10 for samples that included DMOG treatment) versus unlabeled peptides (samples without DMOG treatment) were examined. In cases where peptide assignments were matching to more than one protein, the corresponding MS/MS spectra were assigned manually. As an internal control, heavy and light tryptic peptides derived from FIH were evaluated for equal mixing of both sample sets (Table I). The local “in-house” Mascot server used for this study is supported and maintained by the Computational Biology Research Group at the University of Oxford.
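The logic of this assessment (heavy versus light peptide ratios per protein, with subtraction of the control IP list) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure implemented in Mascot or PLGS: the input format, the function names, the protein names "ProteinA" and "Keratin", and the ratio threshold of 2.0 are all hypothetical. Only the FIH bait counts (63 heavy, 66 light) are taken from this study.

```python
from collections import Counter


def heavy_light_ratios(peptides):
    """peptides: iterable of (protein, label) pairs, label 'H' or 'L'.
    Returns {protein: (n_heavy, n_light, ratio)}; ratio is None when no
    light peptide was observed, because the ratio is then undefined."""
    counts = Counter(peptides)
    proteins = {protein for protein, _ in counts}
    result = {}
    for protein in proteins:
        heavy = counts[(protein, "H")]
        light = counts[(protein, "L")]
        ratio = heavy / light if light else None
        result[protein] = (heavy, light, ratio)
    return result


def dmog_inducible(ratios, contaminants, min_ratio=2.0):
    """Proteins enriched in the heavy (DMOG-treated) sample that are absent
    from the control IP list. min_ratio is an illustrative threshold."""
    hits = []
    for protein, (heavy, light, ratio) in ratios.items():
        if protein in contaminants:
            continue
        if heavy > 0 and (ratio is None or ratio >= min_ratio):
            hits.append(protein)
    return sorted(hits)


# FIH bait counts from the screen; the other entries are invented.
peptides = ([("FIH", "H")] * 63 + [("FIH", "L")] * 66
            + [("ProteinA", "H")] * 5 + [("ProteinA", "L")] * 1
            + [("Keratin", "H")] * 4)
ratios = heavy_light_ratios(peptides)
hits = dmog_inducible(ratios, contaminants={"Keratin"})
```

In this toy example the bait ratio of about 0.95 confirms equal mixing, the contaminant is removed by subtraction, and only the hypothetical "ProteinA" is reported as DMOG-inducible.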
Whole cell extracts were prepared in IP+ buffer with 400 μg of extract as input. FIH pulldowns used FLAG affinity gel, whereas endogenous IPs used 2 μg of anti-ARD antibody sourced from the following: anti-Tankyrase (Clone 19A449, Abcam), anti-RNase L (2E9, Abcam), and anti-Rabankyrin-5 (13) or species/isotype-matched control IgG (all supplied by Abcam). Immunoblotting was performed using the same panel of antibodies, including FLAG-horseradish peroxidase (Sigma) and anti-FIH antibody, which was raised in the host laboratory and described previously (14). Where necessary, secondary detection used Trueblot™ horseradish peroxidase-conjugated antibody (eBioscience).
Plasmids encoding full-length Tankyrase-2 (pFLAG/TNKS2 (15)), RNase L (pcDNA3/RNase L-GFP (16)), and Rabankyrin-5 (pEYFP/Rabankyrin-5 (13)) were expressed transiently in 293T cells using FuGENE 6™ transfection reagent (Roche Applied Science). FIH levels were modulated by co-transfection of pcDNA3/FIH in a 1:5 ratio with plasmid encoding the relevant ARD protein or by knockdown of endogenous FIH with prevalidated siRNA duplexes using Oligofectamine reagent (Invitrogen). Plasmids and siRNA sequences have been described previously (14). Cells were lysed in IP+ buffer (3), and ARD substrate was immunopurified using either anti-green fluorescent protein antibody (Clone 3E1, Cancer Research UK) coupled to protein A-agarose (Millipore) or FLAG affinity gel. Samples were eluted in ammonium hydroxide and either resolved by SDS-PAGE or desalted by methanol/chloroform precipitation prior to tryptic digestion as described previously (17).
DMOG is a cell-penetrating precursor of N-oxalylglycine, a 2OG analogue that competitively inhibits many 2OG-dependent oxygenases including FIH (Fig. 1A). To pursue the possibility that exposure of cells to this compound might stabilize FIH-substrate interactions sufficiently to permit the identification of novel substrates of FIH we first performed co-immunoprecipitation experiments using cell lines stably expressing SPA-tagged FIH or control SPA-EGFP. HEK293 cells were transfected with pcDNA3 encoding FIH with an N-terminal SPA tag that had been shown not to interfere with FIH enzymatic activity. Transfectants expressing modestly elevated levels of SPA-FIH, when compared with endogenous FIH, were selected for this study. Cells were exposed to DMOG following which SPA-FIH-associated proteins were purified from cell extracts and resolved by SDS-PAGE. These experiments demonstrated that exposure to DMOG (1 mM for a period of 16 h) was sufficient to enhance the capture of FIH-associated species as revealed by Coomassie Blue staining (Fig. 1B). The two FIH-associated species that were defined in this way were excised, digested with trypsin, and analyzed by LC-MS/MS. This revealed the species to be ankyrin repeat and FYVE domain-containing protein 1 (Rabankyrin-5) and receptor-interacting serine/threonine-protein kinase 4 (RIPK4), both ARD-containing proteins with putative target asparagine residues in one or more of the ankyrin repeats. Further gel-based displays of this type identified another FIH-associated species that was specifically observed in material from DMOG-treated cells as the ARD-containing protein 2–5A-dependent ribonuclease (RNase L).
These results suggested that Rabankyrin-5, RIPK4, and RNase L might be substrates for FIH and that the differential capture of FIH-associated proteins in complexes precipitated from DMOG-treated versus untreated cells might be developed as an efficient way of defining novel FIH substrates. Nevertheless the display of associated proteins by SDS-PAGE and subsequent MS analysis involves losses associated with in-gel digestion and extraction that may limit the sensitivity of this technique for captured species of low abundance.
To counter this limitation we developed a gel-free system for identifying differentially captured species using SILAC. This approach used stably transfected U2OS cells expressing N-terminal FLAG-tagged FIH under tetracycline control (tet-FIH cells) or a control U2OS cell line expressing an empty vector instead of FLAG-FIH (tet-EV cells) (see Fig. 1C for the SILAC work flow). Cells were grown in either normal (light) medium or medium supplemented with heavy isotopes of lysine and arginine (“heavy”) for 7 days and then treated with doxycycline to induce FLAG-FIH expression. Heavy isotope-labeled tet-FIH cells, but not unlabeled tet-FIH cells, were then exposed to DMOG. FLAG-FIH complexes were immunopurified from both lysates using FLAG affinity resin, pooled, digested with trypsin, and analyzed by LC-MS/MS. FIH-protein interactions that were enhanced by DMOG were identified from an increased ratio of heavy to light peptides defined using the Mascot search engine. To distinguish DMOG-inducible species that were unrelated to FIH complexes, the FLAG immunopurification was performed on extracts of DMOG-treated tet-EV cells, and the material was analyzed by LC-MS/MS. These protein lists were subtracted from equivalent lists generated from the SILAC experiment on tet-FIH cells to define specific FIH-interacting proteins binding in a DMOG-inducible manner (Table I). As an internal control for the quantitation, we confirmed the equivalent retrieval of the FIH bait protein for which the ratio of heavy to light peptides was ~1 (63 heavy:66 light).
Rabankyrin-5 and a further nine ARD-containing proteins were identified as binding to FIH in a DMOG-inducible manner. These were: ankyrin repeat and KH domain-1, Tankyrase-2, ankyrin repeat domain-containing protein-27, Notch2, ankyrin repeat domain-containing protein-52, ankyrin repeat and SAM domain-1, ankyrin repeat domain-containing protein-60, ankyrin repeat domain-containing protein 35, and IκB. Thus, including RNase L and RIPK4, in total the work identified 12 ARD-containing proteins as species interacting with FIH in a DMOG-inducible manner.
To validate the MS assignments we next performed immunoprecipitation-immunoblotting experiments, focusing on novel FIH-interacting proteins for which immunoprecipitating antibodies are available (Rabankyrin-5, RNase L, and Tankyrase-2). This enabled us first to assay for FLAG-FIH capture using the tet-FIH cells used in the SILAC proteomics screen (Fig. 2A) and second to assay for interaction between endogenous FIH and endogenous ARD proteins in untransfected U2OS cells (Fig. 2B). In each case the interaction was confirmed, and immunoprecipitation of the endogenous ARD-containing protein captured endogenous FIH in a DMOG-inducible manner. The interaction between FIH and Tankyrase-2 was particularly striking with endogenous Tankyrase-2 co-precipitating the largest amount of FIH and a significant percentage of the total cellular pool of Tankyrase-2 co-precipitating with FLAG-FIH in a DMOG-dependent manner.
We next sought to test whether these ARD-containing proteins might be hydroxylation substrates for FIH. A generic strategy was used in which the full-length proteins (Rabankyrin-5, Tankyrase-2, and RNase L) were expressed in 293T cells, immunopurified using an epitope tag, digested with trypsin, and subjected to MS/MS analysis on a Q-TOF tandem mass spectrometer to acquire data with high mass accuracy. To obtain optimal sequence information for the accurate assignment of asparaginyl hydroxylation, data were collected by conventional DDA methods. Using this approach, we demonstrated that all three proteins were hydroxylated in vivo. Hydroxylation was sufficiently abundant that three sites of hydroxylation could be unequivocally assigned in the ARD of Tankyrase-2 (Asn-586, Asn-706, and Asn-739), and one site each could be assigned in the ARD of RNase L (Asn-196) and Rabankyrin-5 (Asn-797) (see Fig. 3 for MS/MS assignments of Asn-586 in Tankyrase-2 (A), Asn-196 in RNase L (B), and Asn-797 in Rabankyrin-5 (C); for other assignments, see supplemental Fig. S1).
The ARD of Tankyrase-2 is extensive; it is composed of 19 full repeats (and two capping half-repeats) with a periodicity derived from an insertion of approximately 22 amino acids within every fourth ankyrin repeat (15). The extended, degenerate, fourth repeat subdivides the ARD into five clusters of four ankyrin repeats (ARs) each. Sequence analysis of the Tankyrase-2 ARD placed 13 asparagine residues (in both classical and degenerate ARs) in positions that were analogous to proven sites of FIH-dependent hydroxylation (e.g. p105 and IκBα (3)).
Given the incomplete coverage of Tankyrase-2 achieved by MS/MS analysis in the DDA mode, we hypothesized that the protein may contain additional sites of asparagine hydroxylation presumably of a lower abundance than the three identified thus far. We therefore sought to obtain more comprehensive MS data coverage using a Q-TOF mass spectrometer capable of acquiring data in an alternative MS/MS mode, which collects both precursor and fragment mass spectra simultaneously by alternating between high and low collision energy (referred to as MSE (18, 19)). This parallel acquisition mode increased peptide sequence coverage considerably and also enabled us both to assign and quantitate hydroxylation in a single chromatographic run.
The mass accuracy of the Q-TOF mass spectrometer (5 ppm or lower) coupled with a highly reproducible nano-UPLC system enabled us to assign the parental and hydroxylated ions at two sites, namely Asn-586 and Asn-739 (Fig. 4A for parental and hydroxylated Asn-739 peptides; Asn-586 data not shown). Consistent with the data collected in DDA mode, analysis of the LC-MS data derived from the MSE run demonstrated that hydroxylation of Asn-586 and Asn-739 was an abundant modification at 46 and 49%, respectively (data not shown and Fig. 4B). In this experiment, we were unable to detect the parental peptide containing the third site of hydroxylation (Asn-706). However, MSE acquisition enabled us to assign unhydroxylated peptides corresponding to several other potential sites (Asn-203, Asn-427, and Asn-518).
We have observed that peptides bearing methionine oxidations elute significantly earlier than the unoxidized parental peptide under our chromatographic conditions. By contrast, the effect of asparaginyl hydroxylation on retention time is minimal. The difference in retention time combined with MS/MS of the unoxidized ion enabled us to distinguish peptides bearing methionine oxidation from those bearing asparaginyl oxidation. To determine whether the peptides containing Asn-203, Asn-427, and Asn-518 co-eluted with putative hydroxylated ions (i.e. peptides carrying an additional mass of 16 Da) that were below the threshold of detection for MS/MS sequencing, we interrogated the raw LC-MS data. In support of a fourth site of asparaginyl hydroxylation, co-eluting peptides with masses corresponding to hydroxylated Asn-427 were observed in the LC-MS data. Based on the peak intensity, this corresponded to ~17% Asn-OH (supplemental Fig. S2). In contrast to Asn-427, there were no detectable +16-Da ions (i.e. <5% Asn-OH) co-eluting with the Asn-203- and Asn-518-containing peptides, indicating that these sites are not significantly hydroxylated by endogenous FIH (data not shown). Taken together, these data suggest that at least four sites in the ARD of Tankyrase-2 are hydroxylated to differing extents in vivo. Interestingly all of these sites are located within classical ARs and are characterized by the presence of non-polar, aliphatic amino acid residues (leucine or valine) at the −8 position.
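The percentages quoted for hydroxylation at individual sites are derived from peak intensities of co-eluting hydroxylated and parental ions. One simple way such an estimate might be computed is sketched below in Python. The formula assumes that both peptide forms ionize with equal efficiency, which is an assumption made here for illustration; the function names and the intensity values are hypothetical, and only the 5% detection limit is taken from the text.

```python
def percent_hydroxylated(intensity_oh: float, intensity_parent: float) -> float:
    """Hydroxylated fraction as a percentage of total (Asn-OH + Asn) signal,
    assuming equal ionization efficiency for both peptide forms."""
    total = intensity_oh + intensity_parent
    if total <= 0:
        raise ValueError("no signal for either peptide form")
    return 100.0 * intensity_oh / total


def classify_site(percent: float, detection_limit: float = 5.0) -> str:
    """Label a site relative to the detection limit for +16-Da ions."""
    return "hydroxylated" if percent >= detection_limit else "not detected"


# Hypothetical peak intensities (arbitrary units).
occupancy = percent_hydroxylated(17.0, 83.0)
label = classify_site(occupancy)
```

Because methionine oxidation also adds 16 Da, an estimate of this kind is meaningful only for ion pairs that co-elute, which is the criterion applied in the text to separate asparaginyl hydroxylation from methionine oxidation.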
Because endogenous Tankyrase-2 readily captured significant quantities of endogenous FIH we considered it possible that endogenous FIH becomes limiting when Tankyrase-2 is overexpressed (to levels that are amenable to MS analysis) in 293T cells. To address this, we co-expressed FIH with Tankyrase-2 in 293T cells. To maximize peptide coverage, immunopurified Tankyrase-2 was digested directly (in solution) with trypsin and subjected to nano-UPLC-MSE or DDA MS/MS analysis. Collectively from three independent experiments, we were able to detect 10 of the 13 peptides containing asparagine residues of interest at least once. Strikingly of the 10 peptides detected, we were able to assign asparaginyl hydroxylation in eight of them. Therefore, under conditions where FIH is not limiting, there are at least eight hydroxylation sites in the Tankyrase-2 molecule (supplemental Fig. S3, A–E, for the MS/MS assignments of Asn-203, Asn-271, Asn-427, Asn-518, and Asn-671, respectively). Of interest, three of the novel sites (Asn-203, Asn-518, and Asn-671) are located within a degenerate AR sequence that lacks the leucine residue at the −8 position, indicating that FIH can tolerate certain substitutions at this position. However, based on the quantitative data where we were unable to detect significant hydroxylation at Asn-203, Asn-518, or Asn-671 of Tankyrase-2 derived from transfected cells expressing endogenous FIH, it seems likely that the leucine residue is preferred.
To exclude the possibility that hydroxylation of asparaginyl residues at these sites occurs independently of FIH, we used siRNA to knock down FIH in 293T cells. In every site that was examined in Tankyrase-2, hydroxylation was suppressed below the limit of detection, indicating that asparaginyl hydroxylation of ARD-containing proteins is FIH-dependent as exemplified by LC-MS data derived from MSE of Asn-427 (Fig. 5).
Analysis of Tankyrase-2 revealed a large number of sites that are differentially susceptible to FIH-catalyzed hydroxylation. Because many other ARD-containing proteins including Rabankyrin-5 contain very extensive arrays of ARs, we analyzed Rabankyrin-5 using the same approach. The ARD of Rabankyrin-5 comprises 21 ARs, 11 of which contain asparagine residues in structurally conserved positions analogous to proven sites of FIH-mediated hydroxylation. MS/MS analyses following the tryptic digestion of immunopurified Rabankyrin-5 in solution achieved ~80% sequence coverage and detected nine of the 11 asparagine-containing peptides of interest. In material from cells co-transfected with plasmids expressing both Rabankyrin-5 and FIH, a further three sites (in addition to Asn-797; Fig. 3C) were shown to be hydroxylated, namely Asn-316, Asn-485, and Asn-752 (see supplemental Fig. S4, A–C, for respective MS/MS assignments).
Affinity purification allied to mass spectrometry is a powerful method that has been successfully applied to the characterization of discrete protein-protein interactions and large interaction networks (20). Despite the high sensitivity of MS, a limitation of affinity purification in protein discovery is in the identification of transient interactions exemplified by those between an enzyme and substrate. In the present study we describe a proteomics approach that has led to the identification of several novel substrates of the asparaginyl hydroxylase FIH. A key component in the success of our strategy was the use of DMOG as a substrate-trapping agent.
Combining the substrate trapping methodology with SILAC and in-solution digestion provided the most efficient method for identifying FIH substrates. From a single AP experiment we were able to identify 10 ARD substrates that were all binding in a DMOG-inducible manner. Although this approach was more successful than gel-based methods in terms of identifying substrates from a single experiment, a further refinement could be beneficial. Because we digested the immunopurified material in solution, the most abundant peptides were derived from the FIH “bait.” In fact over 160 FIH tryptic peptides were identified by MS/MS. It is conceivable that a large excess of bait peptides could have masked less abundant peptides derived from additional substrates. Prefractionation strategies may thus enhance the detection of substrate-derived peptides. The utility of combining SILAC with substrate trapping by DMOG is not restricted to FIH because the agent inhibits all 2OG oxygenases for which it has been tested (21), and in other experiments we have found that DMOG greatly stabilizes interactions between HIF-α subunits and the HIF prolyl hydroxylase enzymes. Thus the methodology we describe might be used to reveal substrates for other enzymes among more than 60 known and predicted 2OG-dependent oxygenases encoded in the human genome (22).
At present the mechanism by which DMOG promotes interaction between FIH and protein substrates is not entirely clear. It is possible that the effect is kinetic, i.e. inhibition of catalysis prolongs otherwise transient interactions between enzyme and substrate. Alternatively because the affinity of FIH for hydroxylated ARD proteins is much lower than that for unhydroxylated ARD proteins it is possible that increased interaction reflects the accumulation of unhydroxylated species that bind FIH more tightly (4). It is also interesting that our data and those of others reveal variation in enhancement of FIH-substrate interactions by DMOG (compare Tankyrase and RNase L; Fig. 2A and Ref. 10) raising the possibility that there are additional FIH substrates that do not behave in this way and might be detected in the absence of DMOG pretreatment.
Another experimental approach that facilitated the discovery of novel asparaginyl hydroxylation sites in FIH substrates was the use of a recently developed MS data acquisition method (MSE (18, 19)). MSE allows the collection of 5–10 times more precursor ions and fragmentation data when compared with data-directed acquisition modes because of a sequential low and high collision energy data acquisition cycle, resulting in significantly higher protein sequence coverage (Table I). Data collection in low collision energy mode can be used for the quantification of peak ion intensities, and data collected in high collision mode provide fragmentation information that can be used for protein identification, both of which can be obtained from a single chromatography run. Allied to the data-directed MS/MS analysis, which generates optimized MS/MS spectra for precise assignment of post-translational modifications, the two MS/MS modes have provided complementary approaches toward identifying and characterizing novel FIH substrates.
Prior to this study a limited repertoire of ARD-containing proteins had been assigned as in vivo substrates of FIH, namely p105 and IκBα (3), Notch1 (4, 6), and ASB4 (5). This work has identified Rabankyrin-5, RNase L, and Tankyrase-2 as novel ARD substrates and a further eight proteins as presumed substrates. Together with the definition of an FIH recognition consensus that conforms to that of the ankyrin repeat (23), these data strongly suggest that FIH-catalyzed intracellular asparaginyl hydroxylation is a common post-translational modification that likely extends to many of the ~300 ARD-containing proteins encoded by the human genome (8). FIH-dependent hydroxylation may also extend to other proteins. Notably we have identified several non-ARD-containing proteins as binding to FIH in a DMOG-inducible manner,² but as yet it is unclear whether these species bind FIH directly or as part of a ternary complex with ARD-containing proteins.
Alignment of the 13 newly assigned asparagine residues hydroxylated by FIH with established sites of hydroxylation reveals a largely degenerate FIH consensus motif with only the target asparagine showing absolute conservation (Fig. 6A). Notably the acidic residue (glutamate/aspartate) at the −2 position, which was conserved in previously assigned substrates, is not present in three of the four sites that were readily hydroxylated in Tankyrase-2 and is, therefore, clearly not an absolute requirement for FIH activity. However, significant conservation was observed at three positions, namely −8 (leucine), −3 (alanine), and −1 (valine) (see Fig. 6B for logo representation).
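The kind of per-position conservation summarized in a sequence logo can be computed along the following lines. This is a minimal sketch and not the procedure used to generate Fig. 6. The function name `position_conservation` is hypothetical, and the four aligned windows are invented toy strings, not sequences from this study; they are constructed only so that the target asparagine is invariant and the −8, −3, and −1 positions are partially conserved.

```python
from collections import Counter


def position_conservation(sites, target_index):
    """Summarize the most common residue at each aligned position.

    sites: equal-length sequence windows aligned on the target asparagine.
    target_index: index of the target asparagine within each window.
    Returns a dict: position relative to the target -> (residue, fraction).
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all windows must have the same length")
    summary = {}
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        residue, n = counts.most_common(1)[0]
        summary[i - target_index] = (residue, n / len(sites))
    return summary


# Toy windows spanning positions -8..0 (INVENTED, for illustration only).
toy_sites = [
    "LTPLHAEVN",
    "LGQTAADVN",
    "KDGSTALIN",
    "LSRKPSQVN",
]
cons = position_conservation(toy_sites, target_index=8)
```

With these toy windows, `cons[0]` reports asparagine in every window, whereas `cons[-8]`, `cons[-3]`, and `cons[-1]` each report a residue present in three of four windows. Reporting only the most common residue is a simplification; a logo additionally weights each position by its information content.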
FIH has been crystallized with Notch1 peptide substrates, providing an insight into how AR substrates bind FIH (4). Very few side-chain interactions were observed between Notch and FIH, compatible with the degenerate FIH consensus. However, a distinct interaction was observed with the leucine −8 residue of Notch1 that was buried in a hydrophobic pocket on the surface of FIH. The importance of this interaction is supported by the current data with leucine at −8 being the most conserved residue outside the target asparagine. Hydroxylation was observed at sites bearing non-conservative substitutions at the −8 position when FIH was overexpressed; for example Asn-518 of Tankyrase-2 contains a lysine residue at the −8 position (supplemental Fig. S3D). This was not the case when ARD-containing proteins were expressed in cells without exogenous FIH, indicating that although the leucine residue is not absolutely required for FIH-dependent catalysis it is likely to be important under physiological conditions.
At present, the precise role of FIH-dependent hydroxylation of ARD-containing proteins is unclear. FIH has been shown to hydroxylate the ARD within the intracellular domain of the Notch receptor (4, 6) and, in certain circumstances, to antagonize Notch signaling (6). Notch hydroxylation sites lie within protein domains that are involved in the formation of higher order complexes at paired DNA binding sites (24), and it has been proposed that hydroxylation might affect assembly of these complexes (6). FIH has also been shown to hydroxylate the ARD of ASB4; wild type ASB4, but not a hydroxylation site mutant, is able to regulate vascular differentiation (5). This has led to the proposal that oxygen-regulated vascular differentiation, promoted by ASB4, is regulated by ARD hydroxylation (5). Nevertheless the complexity of these pathways means that the role of FIH-dependent hydroxylation events has not yet been defined with complete clarity. Given the functional diversity among the FIH-target ARD proteins that our present study has identified, including characterized or predicted roles in endocytosis/macropinocytosis (Rabankyrin-5 (13)), antiviral immunity (RNase L (25)), and vesicle trafficking/telomere regulation (Tankyrase-2 (26)), we believe that a generic signaling role for ARD hydroxylation is unlikely. However, it is possible that the modification is used in signaling by specific ARD proteins. Interestingly hypoxia has been implicated in telomere regulation (27), and it is possible that hydroxylation of Tankyrase-2 could contribute to this phenomenon.
The best characterized signaling role for FIH-dependent hydroxylation is in regulation of the association between the C terminus of HIF-α subunits and p300/CREB-binding protein co-activators. Previous work in cells has defined cross-competition between HIF-α and Notch receptor ARDs for FIH-mediated Asn hydroxylation, which modulates the HIF transcriptional response (4, 6). Taken together with the current work, which indicates that cells contain numerous FIH-dependent hydroxylation sites, the data suggest that it is likely to be the hydroxylation status of the ARD pool rather than any one individual ARD protein that provides the effective competition.
In recent work, we have demonstrated that hydroxylation can enhance the stability of certain ARDs, including both natural³ and synthetic ARDs (28), which are designed around the consensus repeat. Although the biological function of these changes in thermodynamic stability is unclear, it is of interest that mutational studies of the ARD-containing protein IκBα have indicated that precisely tuned stability is important for the proper function of the ARD as a protein-protein interaction domain (29). It is therefore possible that FIH-dependent hydroxylation serves to fine-tune the stability of the ARD interaction domain.
Our studies demonstrate that FIH-mediated hydroxylation is saturated by enhanced ARD protein expression in transfected cells, leading to partial hydroxylation at several sites. Incomplete hydroxylation has also been demonstrated on the ARDs of endogenous proteins including IκBα and Notch receptors (3, 4). At present it is unclear whether incomplete hydroxylation of these proteins represents a steady-state level common to individual protein molecules or whether it represents progressive accumulation of hydroxylation at these sites during the lifetime of the protein species. Further work will be required to resolve these possibilities and to address the potential structural and/or signaling roles of FIH-mediated hydroxylation. The clearest and arguably the only biological function of FIH-mediated hydroxylation defined to date is in the regulation of HIF.
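Partial hydroxylation at a given site can be expressed as a fractional occupancy derived from the precursor ion intensities of the unmodified and hydroxylated forms of the same peptide. The sketch below is illustrative only: the function name `hydroxylation_occupancy` is hypothetical, the intensities are placeholder values, and the calculation assumes that both peptide forms ionize with equal efficiency, which may not hold in practice.

```python
def hydroxylation_occupancy(unmodified_intensity, hydroxylated_intensity):
    """Fraction of a site that is hydroxylated, from peak ion intensities.

    ASSUMES equal ionization efficiency for the unmodified and hydroxylated
    peptide forms; this is a simplification, not a validated correction.
    """
    total = unmodified_intensity + hydroxylated_intensity
    if total <= 0:
        raise ValueError("no signal detected for either peptide form")
    return hydroxylated_intensity / total


# Placeholder intensities, not measurements from this work.
occupancy = hydroxylation_occupancy(300.0, 100.0)
```

For the placeholder intensities the occupancy is 0.25, i.e. one quarter of the detected peptide carries the modification. Comparing such values between control and FIH-suppressed cells is one way to express the FIH dependence of a site.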
Nevertheless the current analyses together with recently published work on endogenous ARD proteins indicate that many or even most ARD proteins are likely to be hydroxylated by FIH in vivo (3, 4). Given the abundance of ARD proteins in the proteome, the findings raise a question of why asparaginyl hydroxylation has not been recognized previously in proteomics surveys that must have included ARD-containing species. It seems possible that allowance for artifactual protein oxidation in computerized database searching may have confounded some analyses. In the current work we utilized data-directed MS/MS and MSE to combine the rigorous assignment of sites of hydroxylation with quantitation, thus enabling the assay of sequence-specific Asn hydroxylation that is responsive to genetic suppression of FIH in cells. Interrogation of peptide sequences by MS should in the future consider the possibility of FIH-catalyzed hydroxylation on asparaginyl residues particularly at appropriately sited residues within ARD-containing proteins.
We are grateful to N.-W. Chi (University of California, San Diego, CA), C. Bisbal (Institute of Human Genetics, Montpellier, France), and M. Zerial (Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany) for provision of reagents and M. L. Coleman, C. J. Schofield (University of Oxford), and A. C. Gingras (Samuel Lunenfeld Research Institute, Toronto, Canada) for helpful discussions.
Published, MCP Papers in Press, October 20, 2008, DOI 10.1074/mcp.M800340-MCP200
¹ The abbreviations used are: HIF, hypoxia-inducible factor; 2OG, 2-oxoglutarate; AP, affinity purification; ARD, ankyrin repeat domain; DMOG, dimethyloxalylglycine; FIH, factor inhibiting HIF; MSE, low/high collision energy MS; UPLC, ultraperformance liquid chromatography; SILAC, stable isotope labeling by amino acids in cell culture; SPA, Sequential Peptide Affinity tag; EGFP, enhanced green fluorescent protein; EV, empty vector; tet, tetracycline; PLGS, ProteinLynx Global Server; IP, immunoprecipitation; DDA, data-directed analysis; siRNA, small interfering RNA; HEK, human embryonic kidney; AR, ankyrin repeat; KH, K homology; SAM, sterile alpha motif.
² M. E. Cockman, unpublished observations.
³ M. Yang and C. J. Schofield, unpublished observations.
* This work was funded by the Wellcome Trust, Cancer Research UK, and the Medical Research Council (UK).
The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.