Protein hydroxylation at proline and lysine residues is known to have important effects on cellular functions, such as the response to hypoxia. However, protein hydroxylation at tyrosine residues, which yields protein-bound 3,4-dihydroxyphenylalanine (PB-DOPA), has not been carefully examined. Here we report the first proteomics screening of the PB-DOPA protein substrates and their sites in E. coli and human mitochondria by nano-LC/MS/MS and protein sequence alignment using the PTMap algorithm. Our study identified 67 novel PB-DOPA sites in 43 E. coli proteins, and 9 novel PB-DOPA sites in 7 proteins from HeLa mitochondria. Bioinformatics analysis indicates that structured regions of proteins are favored over unstructured regions for the PB-DOPA modification. The PB-DOPA substrates in E. coli were predominantly enriched in proteins associated with carbohydrate metabolism. Our study showed that PB-DOPA may be involved in regulation of the specific activity of certain evolutionarily conserved proteins such as superoxide dismutase and glyceraldehyde 3-phosphate dehydrogenase, suggesting the conserved nature of the modification among distant biological species. The substrate proteins identified in this study offer a rich source for hunting their regulatory enzymes, and for further characterization of the possible contributions of this modification to cellular physiology and human diseases.
Protein hydroxylation has been found in multiple amino acid residues, including proline, lysine, aspartate, asparagine, tryptophan, and tyrosine 1. Members of the Fe(II)- and 2-oxoglutarate-dependent family of dioxygenases can catalyze these hydroxylation reactions 2. Emerging evidence suggests that this family of post-translational modifications (PTMs) plays structural roles and has important cellular functions. For example, prolyl and asparaginyl hydroxylation are central to the regulation of the transcription factor HIFα, the master regulator of the cellular hypoxic response pathway 3, 4. This protein is associated with diverse physiological and pathophysiological conditions, such as angiogenesis, myocardial ischemia, and neuronal ischemia 5. U2AF65 (U2 small nuclear ribonucleoprotein auxiliary factor 65-kilodalton subunit) can be lysine-5-hydroxylated by Jmjd6 (the Fe(II)- and 2-oxoglutarate-dependent dioxygenase Jumonji domain-6 protein), and this modification has a selective effect on RNA splicing 6. Hydroxylation of proline in the newly synthesized collagen polypeptide chains is essential for the formation of the triple-helical molecules 7. Given the critical roles of the hydroxylation reactions at proline, lysine, and asparagine residues in regulating protein and cellular functions, it is anticipated that hydroxylation at the tyrosine residue will also play a role in cellular physiology.
The tyrosine residue can be oxidized to the protein-bound 3,4-dihydroxyphenylalanine residue (PB-DOPA, or 3-hydroxytyrosine) by an enzyme-catalyzed reaction. PB-DOPA can be generated by oxidation of the tyrosine residue by tyrosinase or tyrosine oxidase, which are present in neuronal cells and melanocytes of skin, hair follicles, and pigmented epithelial cells of the retina 8. PB-DOPA can also be produced by the cresolase activity of bacterial tyrosinase, and by autooxidation in Streptomyces species 9, 10. Moreover, PB-DOPA could be generated from L-DOPA 11, 12, which is a precursor to the biological pigment melanin and also to the neurotransmitters dopamine, norepinephrine, and epinephrine 8, 13, 14. In addition, L-DOPA has been used clinically to treat Parkinson’s disease and dopamine-responsive dystonia by increasing the dopamine concentration in the brains of patients 15. Limited studies suggest that PB-DOPA is associated with cellular functions.
Determining the substrates of a PTM and their modification sites is typically the first step in the study of PTM biology. The history of investigations into protein lysine acetylation provides a good example of substrate identification and functional characterization of a PTM, which the study of protein tyrosine hydroxylation is likely to follow. Lysine acetylation (KAc) was initially identified in histones in 1964 16. The first non-histone substrate protein, p53, was identified in 1997 17. Initially, KAc was thought to be restricted to nuclei because of its initial identification and characterization within histones and transcription factors. However, this paradigm was challenged after the discovery of the first cytosolic KAc proteins 18, 19. Identification of diverse substrates in both cytosolic and mitochondrial fractions 20, and demonstration of the class III histone deacetylases in mitochondria 19, conclusively established the substrate and functional diversity of this PTM. A proteomics screen in mitochondria gave the surprising result that more than 20% of mitochondrial proteins are lysine acetylated in mammalian cells 20, indicating a role for this modification in mitochondria and metabolism 20.
Here, we presented the first systematic screening of PB-DOPA substrates and sites in E. coli and HeLa mitochondria by protein separation, nano-LC/MS/MS, and the PTMap for sequence alignment. Our studies identified 67 novel sites of PB-DOPA in 43 E. coli proteins and 9 novel sites of PB-DOPA in 7 HeLa mitochondrial proteins, present in 2.5% and 0.5% of proteins in E. coli and HeLa mitochondria, respectively. We verified the identification of PB-DOPA by the purified recombinant proteins and MS/MS of the corresponding synthetic peptides. Spectral counting comparison between the modified peptides and their corresponding unmodified counterparts showed that the PTM stoichiometry can be as high as 32.7%. In addition, the E. coli proteins bearing PB-DOPA are highly enriched in cellular metabolism, especially carbohydrate metabolism. This study therefore revealed a previously unexpected role of PB-DOPA in prokaryotic biochemical pathways, and provided evidence that PB-DOPA is an abundant and evolutionarily conserved modification.
Modified porcine trypsin was purchased from Promega Inc. (Madison, WI). LB-medium was purchased from MP Biomedicals LLC (Solon, Ohio) and Ni-NTA Agarose was a product of QIAGEN GmbH (Valencia, CA). Trichloroacetic acid (TCA) was purchased from Sigma-Aldrich, Inc. (St. Louis, MO). LC/MS grade water and acetonitrile (ACN) were purchased from Honeywell Burdick & Jackson (Muskegon, MI). All other chemicals were of the highest purity available or analytical grade. All the peptides used in this study were synthesized by Genemed Synthesis Inc. (San Antonio, TX).
E. coli K-12 DH10B (Invitrogen, Carlsbad, CA) was cultured in LB media to OD600 = 0.7. The cells were harvested and washed twice with cold PBS. The cell pellet was lysed in NETN buffer (50 mM NaH2PO4, 300 mM NaCl, 10 mM imidazole, 0.5% NP40, 20 mM beta-mercaptoethanol, pH 8.0, with the protease cocktail inhibitors) by sonication. The lysate was dialysed against 10 mM 2-(N-morpholino)ethanesulfonic acid (MES) buffer, pH 6.0 for 1.5 hrs at 4 °C and then centrifuged at 20,000 x g for 20 minutes. Supernatant was collected for HPLC separation.
Mitochondria were isolated from HeLa S3 cells (Biovest International Inc. Minneapolis, MN) using a discontinuous Percoll gradient as previously reported 21. The mitochondria were lysed by HIM buffer (200 mM mannitol, 70 mM sucrose, 1 mM EGTA, 10 mM HEPES, pH 7.5, with the protease cocktail inhibitors) containing 1% n-dodecyl-D-maltoside on ice with frequent vortexing for 30 mins. Samples were centrifuged at 50,000 x g for 30 mins. The supernatant was collected for HPLC fractionation.
A preparative HPLC system (Shimadzu Scientific Instruments, Kyoto, Japan) was used for protein fractionation. 10-20 mg of a protein lysate of interest was loaded onto a PolyCATWAX column (200 × 9.4 mm, 5 μm, 1000 Å, PolyLC Inc., Columbia, MD). The proteins were eluted using an HPLC gradient from 0% buffer B to 100% buffer B in buffer A over 56 min (mobile buffer A: 10 mM MES, pH 6.0; and mobile buffer B: 10 mM MES, 800 mM NaCl, pH 6.0) at a flow rate of 4 ml/min at 4 °C. HPLC eluate was collected every 2 min. The proteins in each fraction were precipitated by the conventional trichloroacetic acid (TCA)/acetone method and air dried.
The tryptic digest from each fraction was analyzed in an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific, Waltham, MA). The peptides were separated in a home-made capillary HPLC column (100 mm length × 75 μm internal diameter, 5 μm particle size, 100 Å pore diameter) with Jupiter C12 resin (Phenomenex, Torrance, CA) using a gradient from 2% to 30% solvent B in solvent A (mobile phase A: 0% ACN in 0.1% formic acid; and mobile phase B: 100% ACN in 0.1% formic acid) over 100 min. The eluted peptides were directly electrosprayed into the mass spectrometer using a nanospray source. The MS was operated in the data-dependent mode to automatically switch between Orbitrap-MS and LTQ-MS/MS (MS2) acquisition. Survey full scan MS spectra (from m/z 350-1800) were acquired in the Orbitrap with resolution R=30,000 at m/z=400. The 10 most intense ions were sequentially isolated for fragmentation in the linear ion trap using collisionally induced dissociation. The following parameters were specified: dynamic exclusion, 36 seconds; repeat count, 2; and exclusion window, +2 and −1 Da.
All MS/MS spectra were searched against the RefSeq protein sequence database for E. coli str. K-12 substr. DH10B (4127 sequences) using Mascot and PTMap software 22. The specific parameters for protein sequence database searching included tyrosine hydroxylation, methionine oxidation, and cysteine alkylation as variable modifications, trypsin as the digesting enzyme, up to three missed cleavages, and a mass error of 15 ppm for precursor ions and 0.6 Da for fragment ions. Charge states of +2 and +3 were considered for parent ions. Singly charged ions were not considered, because they represent lower quality data than ions with +2 and +3 charges. If more than one spectrum was assigned to a peptide, only the spectrum with the highest PTMap score was selected for manual analysis. The false discovery rate (FDR) was estimated using a reverse-decoy database strategy, by appending a reversed database to the forward database 23. All peptides bearing PB-DOPA identified with a PTMap peptide score > 1.0 (FDR < 5%) were manually examined with the rules described previously 24.
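The reverse-decoy estimate described above can be illustrated with a minimal sketch. The text states only that a reversed database was appended to the forward database; the specific ratio used here (decoy hits divided by target hits above the score cutoff) is one common convention and is an assumption, as are the function name and the input layout.

```python
def estimate_fdr(scores, is_decoy, threshold):
    """Estimate FDR for hits scoring above `threshold`.

    Assumption: FDR = (# decoy hits) / (# target hits) above the cutoff.
    The source does not state which target-decoy formula was used.
    """
    targets = 0
    decoys = 0
    for score, decoy in zip(scores, is_decoy):
        if score > threshold:
            if decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to report
    return decoys / targets


# Hypothetical scores, for illustration only (not data from this study).
example_scores = [2.0, 1.5, 1.2, 0.8, 1.1]
example_is_decoy = [False, False, False, False, True]
example_fdr = estimate_fdr(example_scores, example_is_decoy, threshold=1.0)
```

In practice the score cutoff would be scanned until the estimated FDR falls below the desired level (for example 5% for modified peptides).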
PTM stoichiometry (%) in this study is defined as the number of MS/MS spectra for the peptide bearing the PB-DOPA residue divided by the total number of MS/MS spectra for all peptides covering the corresponding Tyr site, namely the unmodified peptides with or without missed cleavages (cutoff PTMap score > 0.5, FDR < 1%) and the modified peptides (cutoff PTMap score > 1.0, FDR < 5%).
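The spectral-counting definition above can be written out as a short function. This is a minimal sketch: the function and argument names are hypothetical, and the counts in the example are illustrative rather than taken from the study.

```python
def ptm_stoichiometry(modified_spectra, unmodified_spectra):
    """Return PTM stoichiometry (%) from spectral counts.

    modified_spectra:   MS/MS spectra of the PB-DOPA peptide that pass
                        the modified-peptide score cutoff.
    unmodified_spectra: MS/MS spectra of unmodified peptides covering
                        the same Tyr site (with or without missed
                        cleavages) that pass their own cutoff.
    """
    total = modified_spectra + unmodified_spectra
    if total == 0:
        raise ValueError("no spectra cover this Tyr site")
    return 100.0 * modified_spectra / total


# Hypothetical counts: 5 modified and 15 unmodified spectra.
example_stoichiometry = ptm_stoichiometry(5, 15)
```

Because the denominator includes the modified spectra, the value is bounded between 0% and 100%.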
Secondary structures of E. coli proteins were predicted using the GOR algorithm 25. Biological process enrichment analysis was performed using the GOstats package in the statistical environment R. Probability values (p) of all enriched processes (p < 0.01) were first converted into L values, L = -Log10(p), and then Z scores were calculated using the following formula: Z = [L - Ave(L)] / Std(L), where Ave(L) and Std(L) refer to the average and the standard deviation of all L values. Clustering of the biological processes was performed by the one-way hierarchical clustering function in Genesis software 26. Evolutionary conservation analysis was performed using a BLAST search of each PB-DOPA protein against the RefSeq protein database of six model species: yeast (taxonomy ID: 4932), C. elegans (taxonomy ID: 6239), Drosophila (taxonomy ID: 7227), zebrafish (taxonomy ID: 7955), mouse (taxonomy ID: 10090), and human (taxonomy ID: 9606).
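The conversion of enrichment p values into Z scores can be sketched as follows. The text does not say whether Std(L) is the sample or the population standard deviation; the sample form (n - 1 denominator) is assumed here, and the function name is hypothetical.

```python
import math
import statistics


def enrichment_z_scores(p_values):
    """Convert enrichment p values to Z scores.

    L = -log10(p), then Z = (L - Ave(L)) / Std(L).
    Assumption: Std(L) is the sample standard deviation.
    """
    l_values = [-math.log10(p) for p in p_values]
    mean_l = statistics.mean(l_values)
    std_l = statistics.stdev(l_values)  # requires at least two values
    return [(l - mean_l) / std_l for l in l_values]


# Illustrative p values only (not from the study): L = 2, 3, 4.
example_z = enrichment_z_scores([1e-2, 1e-3, 1e-4])
```

Smaller p values map to larger L and therefore to higher Z scores, so the most strongly enriched processes stand out in the clustered heat map.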
The peptides bearing a PB-DOPA residue were synthesized by Genemed Synthesis Inc (San Antonio, TX). All synthetic peptides were verified by MS and MS/MS using the LTQ-XL (Thermo Fisher Scientific, Waltham, MA) coupled with a nano-HPLC system (Agilent 1200 series Nanoflow, Agilent Technologies, Santa Clara, CA). During HPLC/MS/MS analysis, loading of a synthetic peptide was adjusted close to that of its corresponding in vivo peptides. Extensive washing was carried out to avoid peptide carry-over. To verify coelution of the in vivo peptide and its synthetic counterpart, the in vivo peptide bearing a PB-DOPA residue and its corresponding synthetic peptide were monitored by extracted ion chromatogram of specific product ion. The nano-HPLC/mass spectrometric analysis was carried out using the procedure described in Materials and Methods.
The bacteria strain of interest containing a plasmid expressing an His6-tagged protein of interest, originally developed by National BioResource Project in Japan 27, was grown in LB media with chloramphenicol (30 μg/ml). The cells were treated with isopropyl β-D-1-thiogalactopyranoside (IPTG) (0.04 mM) to stimulate the protein expression for 3 hr, and harvested by centrifugation at 5,000 x g at 4 °C for 20 min. The pellet was washed with 5 ml of cold PBS, resuspended in a lysis buffer (10 mM imidazole with proteinase inhibitor cocktail) and sonicated four times at 4 °C for 10 s each time. Unbroken cells and cellular debris were removed by centrifugation at 20,000 x g at 4 °C for 20 min. Finally, the His6-tagged protein was purified under non-denaturing conditions by a standard procedure using Ni-NTA agarose (Qiagen, Valencia, CA). The purified proteins were resolved in SDS PAGE. The protein band of interest was excised, and in-gel digested for nano-HPLC/mass spectrometric analysis.
The purified His6-tagged proteins were separated by SDS-PAGE and visualized by staining with colloidal Coomassie blue. The protein bands of interest were excised, and in-gel digested using the clean protocol previously described 28.
To identify proteins bearing PB-DOPA residues, the E. coli and HeLa mitochondrial protein lysates were each separated into 55 fractions in a mixed-bed PolyCATWAX column (Figure 1A). The proteins in each fraction were precipitated using the TCA/acetone method, and tryptically digested 29. The resulting proteolytic peptides were analyzed by nano-HPLC/LTQ Orbitrap mass spectrometry. The MS/MS data were first analyzed by the Mascot algorithm to identify proteins. The MS/MS data and the identified proteins were then used as inputs, and analyzed by the PTMap algorithm to identify PB-DOPA peptides and sites (Figures 1B and 1C). The candidate peptides bearing the PB-DOPA residue were further examined manually as previously described, to ensure the exclusive localization of oxidation sites to tyrosine rather than to adjacent amino acid residues 24, 30.
The analysis identified 1,733 and 1,355 proteins from E. coli and HeLa mitochondrial protein lysates, respectively. Among these proteins, we identified 67 PB-DOPA sites in 43 E. coli proteins, and 9 sites in 7 HeLa mitochondrial proteins (Tables 1 and 2, Supplementary_table 1 and Supplementary_figures 1 and 2). These results suggest that about 2.5% and 0.5% of proteins were tyrosine hydroxylated in E. coli and HeLa mitochondria, respectively. Given that a peptide bearing a PB-DOPA typically has lower abundance, and will be more difficult to detect in HPLC/MS/MS analysis than its corresponding unmodified counterpart, the real abundance of PB-DOPA should be significantly higher than the percentage calculated above.
To confirm the in vivo PB-DOPA sites, we expressed and purified two His6-tagged proteins bearing the PB-DOPA residues in E. coli (Figure 2 and Table 3), superoxide dismutase Fe (GI 170081319) and pyruvate dehydrogenase, decarboxylase component E1 (GI 170079751). The isolated proteins were resolved in SDS-PAGE, followed by in-gel digestion and nano-HPLC/MS/MS analysis. All four peptides bearing PB-DOPA that had been identified from the two proteins in global screening, were confirmed in the corresponding recombinant proteins (Table 3 and Supplementary_figure 3).
To further confirm the tyrosine hydroxylation sites in E. coli, we also performed synthetic peptide verification (Figures 3 and 4, and Supplementary_figures 4-6). Two PB-DOPA peptides identified in E. coli lysate were synthesized, namely IAGDYOHIAK in sugar ABC transporter periplasmic-binding protein, and DALAPHISAETIEYOHHYGK in superoxide dismutase Fe. The results showed that the MS/MS spectra of the PB-DOPA peptides from E. coli lysate matched the MS/MS spectra of their synthetic counterparts (Figure 3A-C and Supplementary_figure 5A-C), and the in vivo PB-DOPA peptides co-eluted with the synthetic counterparts in HPLC (Figure 3D-F and Supplementary_figure 5D-F). We further verified 8 PB-DOPA peptides identified in HeLa mitochondria by MS/MS spectra matching using synthetic peptides (Supplementary_figure 2). Taken together, our results conclusively confirmed the identification of peptides bearing the PB-DOPA residue.
Superoxide dismutase (SOD) is a metal-binding enzyme that catalyzes critical reactions that destroy radicals in the cell. The enzyme is highly redox-sensitive, and involved in catalyzing the dismutation of the superoxide radical anion to prevent the formation of highly aggressive ROS 31-34. In the present study, we identified Y34 (HHQTYOHVTNLNNLIK) as one of the two PB-DOPA sites on the E. coli SOD. The modification site was confirmed by MS/MS analysis of the His6-tagged recombinant protein, and by MS/MS and co-elution experiments of its corresponding synthetic peptide (Figure 4 and Supplementary_figures 7B-C). Previous studies have shown that the Y34 in E. coli SOD is an active site of the enzyme (Supplementary_figure 7A) acting as a proton donor for catalysis 35, promoting heat stability and facilitating substrate binding in both Fe-SODs and Mn-SODs 36, 37. Consequently, it is highly likely that the hydroxylation at Y34 plays a role in regulating its enzymatic activity.
Interestingly, we also identified the PB-DOPA modification at the same residue in human SOD2 from HeLa mitochondria, which is the human ortholog of bacterial SOD (Supplementary_figure 7A). To verify the identification, we confirmed the MS/MS fragmentation pattern of the PB-DOPA peptide in SOD2 from mitochondria with the corresponding synthetic peptide (Supplementary_figures 7D-E). Previous studies have shown that mutagenesis of Y34, the conserved site in human SOD2, results in decreased enzyme activity 32, 33. Our data therefore revealed an important PB-DOPA site that is evolutionarily conserved, and critical for cellular stress response.
In addition to Y34 in SOD, we also identified an evolutionarily conserved PB-DOPA site on glyceraldehyde 3-phosphate dehydrogenase (GAPDH): Y256 of GAPDH in E. coli (AATYOHEQIK) (GI 170081435) (Supplementary_figure 8A) and Y253 of GAPDH in HeLa mitochondria (YOHDDIK) (IPI00219018). GAPDH is an essential enzyme in glycolysis that catalyzes the conversion of glyceraldehyde 3-phosphate to D-1,3-bisphosphoglycerate. Accordingly, our data revealed an evolutionarily conserved PB-DOPA site in a protein that is associated with carbohydrate metabolism processes. Identification of the evolutionarily conserved PB-DOPA sites suggests the possibility of a common mechanism that regulates the PTM pathway among a variety of biological species.
We further performed evolutionary conservation analysis for all of the PB-DOPA sites identified in E. coli lysate among six model species: S. cerevisiae, C. elegans, Drosophila, zebrafish, mouse and human. The analysis revealed that among 67 PB-DOPA sites in E. coli, 6 sites were conserved throughout all six species, and 6 sites were conserved in five species (Supplementary_table 2). Among these 12 sites, two sites are from the SOD family and two sites are from GAPDH, including the two PB-DOPA sites that were also identified in the HeLa mitochondria analysis.
Protein phosphorylation has been shown to occur preferentially in unstructured regions, while lysine acetylation is found in structured motifs 38, 39. We studied the secondary structure characteristics of PB-DOPA sites identified in the E. coli lysate, and compared them to the secondary structure characteristics of all Tyr sites in E. coli proteins. Our data showed that PB-DOPA is enriched in helix structures (p=0.0012) and significantly under-represented in turn structures (p=0.012) compared with all tyrosine residues (Figure 5A). Hence, the PB-DOPA modification, similar to lysine acetylation, is likely to be enriched in structured sequences and less abundant in unstructured regions.
We further performed sequence logo analysis for the PB-DOPA sites identified in the E. coli lysate (Figure 5B) 40, and compared the abundance of different amino acids on each flanking position of the PB-DOPA site with that of all Tyr residues in E. coli proteins from the database. We found that the positively charged amino acid Lys was enriched in several positions including +9, −5, −1 and +10, while negatively charged amino acid Glu was enriched in +2 and +3 positions.
To elucidate cellular pathways that may be associated with PB-DOPA, we performed enrichment analysis of the biological processes on the identified PB-DOPA proteins by comparing the Gene Ontology (GO) annotations of the PB-DOPA proteins and the E. coli proteome (Table 1 and Supplementary_table 1). Our data revealed that proteins bearing PB-DOPA were predominantly enriched in carbohydrate metabolism such as glycolysis and gluconeogenesis (Figure 5C). Among the 27 biological processes with an enrichment p value less than 0.01, 14 processes were involved in carbohydrate metabolism, and all the enriched processes were involved in energy metabolism in the cell.
Identification of the PTM substrate proteins and their PTM sites is typically the first step toward dissection of the PTM pathways, and understanding their biological functions. The protein substrates of tyrosine hydroxylation cannot easily be identified by conventional isotopic labeling. Additionally, there are still no modification-specific pan antibodies, such as those for phosphotyrosine and lysine acetylation, to detect and confirm its PTM peptides. Accordingly, biological studies of tyrosine hydroxylation have previously proceeded very slowly.
To address this problem, we carried out the first proteomics screening to identify PB-DOPA proteins. The study described here identified 67 novel sites of PB-DOPA in 43 proteins in E. coli, and 9 novel PB-DOPA sites in 7 proteins from HeLa mitochondria. Our approach involves unrestrictive sequence alignment using PTMap software developed in-house. Unlike conventional database searching software, PTMap requires exclusive modification-site localization for all unrestrictive PTM identifications. Given that hydroxylation or oxidation can occur on various amino acids (http://www.unimod.org), this feature of PTMap offers a unique advantage by removing ambiguous PTM identifications with oxidation on amino acids other than Tyr.
A total of 4,715 and 3,012 Tyr residues were identified from 12,184 non-redundant E. coli peptides and 8,151 HeLa mitochondrial peptides, respectively. These results allow us to estimate the relative abundance of PB-DOPA to be ~1.4% and ~0.3% of detected Tyr residues in E. coli and HeLa mitochondria, respectively. The result was a surprise to us, as mitochondria are the major cellular organelles that generate radicals and other reactive oxygen species (ROS). The much lower abundance of PB-DOPA in mitochondria than in E. coli could suggest either that ROS may not be the major cause of PB-DOPA in E. coli, or that the mechanisms to prevent oxidative damage in mammalian cells are much better developed than those in E. coli.
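The abundance estimates quoted in this paper follow directly from the reported counts. The short sketch below reproduces the arithmetic at the site level (PB-DOPA sites per detected Tyr residue) and at the protein level (PB-DOPA proteins per identified protein); the helper name and dictionary layout are illustrative only.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Site level: PB-DOPA sites / Tyr residues detected in the peptides.
site_level = {
    "E. coli": percent(67, 4715),
    "HeLa mitochondria": percent(9, 3012),
}

# Protein level: PB-DOPA proteins / proteins identified in the lysate.
protein_level = {
    "E. coli": percent(43, 1733),
    "HeLa mitochondria": percent(7, 1355),
}

for sample in site_level:
    print(f"{sample}: {site_level[sample]:.1f}% of Tyr sites, "
          f"{protein_level[sample]:.1f}% of proteins")
```

Rounded to one decimal place, these ratios give the ~1.4% and ~0.3% site-level values and the 2.5% and 0.5% protein-level values stated in the text.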
Our results suggest that PB-DOPA is likely involved in the regulation of SOD activity. The same PB-DOPA site was found in SOD from E. coli and in its ortholog SOD2 from HeLa mitochondria. Structure analysis showed that PB-DOPA preferred structured protein segments to unstructured regions. More importantly, proteins bearing PB-DOPA were found to be highly enriched in the pathways involved in cellular energy metabolism and especially carbohydrate metabolism. Our data therefore indicate the possibility of an intrinsic cellular mechanism that regulates the PB-DOPA pathway. Finally, to our knowledge, this is the first study demonstrating that PB-DOPA is present in prokaryotic cells and eukaryotic mitochondria.
Hydroxylation of proline and lysine and their regulatory enzymes such as prolyl hydroxylases are known to contribute to a variety of physiological and pathophysiological processes, such as tumorigenesis, erythropoiesis, and angiogenesis 41. Hydroxylation at the tyrosine residue changes its structure. It is anticipated that such a change is likely to alter the structure and function of a portion of its substrates, similar to proline hydroxylation (e.g., HIFα). Furthermore, the alteration of protein structure can make a protein more susceptible to degradation in biological pathways 11, 42.
Many questions remain in the field of tyrosine hydroxylation. Based on the information obtained in this study, PB-DOPA is an abundant protein post-translational modification. It is highly likely that multiple enzymes remain to be identified in the cells. Is the PB-DOPA status dynamically changed between cells in different organs, cell lines, developmental stages, and environmental conditions? What are the main factors that determine the levels of PB-DOPA? What are the major functions of PB-DOPA in cellular functions and pathological conditions? The PB-DOPA substrates identified in this study could drive experimental efforts toward functional characterization of the PB-DOPA proteins and the anatomy of the PB-DOPA pathways.
This work was supported by NIH grants to Y.Z. and P.S.
This information is available free of charge via the Internet at http://pubs.acs.org/.