Post-translationally modified proteins make up the majority of the proteome and establish, to a large part, the impressive level of functional diversity in higher, multi-cellular organisms. Most eukaryotic post-translational protein modifications (PTMs) denote reversible, covalent additions of small chemical entities such as phosphate-, acyl-, alkyl- and glycosyl-groups onto selected subsets of modifiable amino acids. In turn, these modifications induce highly specific changes in the chemical environments of individual protein residues, which are readily detected by high-resolution NMR spectroscopy. In the following, we provide a concise compendium of NMR characteristics of the main types of eukaryotic PTMs: serine, threonine, tyrosine and histidine phosphorylation, lysine acetylation, lysine and arginine methylation, and serine, threonine O-glycosylation. We further delineate the previously uncharacterized NMR properties of lysine propionylation, butyrylation, succinylation, malonylation and crotonylation, which, altogether, define an initial reference frame for comprehensive PTM studies by high-resolution NMR spectroscopy.
Cellular signaling processes heavily rely on reversible post-translational protein modifications (PTMs) in their capacity to rapidly reprogram individual protein functions. PTMs are established and removed in a highly dynamic manner and exist in many different forms and flavors (Walsh et al. 2005). Along with alternative splicing, they provide the proteome with an enormous capacity for biological diversity and regulate virtually every aspect of cellular life, including cell–cell communication, cell growth and differentiation, sensing of metabolic states, mediation of intracellular transport and initiation of programmed cell death. Errors in PTM establishments and readouts, whether due to hereditary changes or environmental cues, constitute causal agents of many human diseases that include a long list of cancers, heart and brain diseases, diabetes and several metabolic disorders. Thus, the study of PTMs and how they regulate different cellular signaling processes has profound medical implications, both in the preventive and curative sense. PTM detection by high-resolution NMR spectroscopy represents a biophysical extension to studying these signaling marks from an analytical perspective, but also from a mechanistic, functional and structural point of view. Most PTMs are brought about by reversible, covalent additions of small, chemical entities, such as phosphate groups, acyl chains, alkyl chains, or various sugars, to the side-chains of individual protein residues (Khoury et al. 2011). Others involve the addition of protein modules such as ubiquitin, SUMO, or NEDD to selected target sites. In this article, we describe the NMR characteristics of common types of eukaryotic PTMs that belong to the first class of protein modifications, namely phosphorylation, acylation, alkylation and glycosylation. These PTMs typically occur in ‘regulatory’ protein regions that are intrinsically disordered, including also protein loop regions (Iakoucheva et al. 2004; Xie et al. 2007; Radivojac et al. 2007; Gao and Xu 2012). This is because fast cellular signaling responses usually require modifying enzymes to rapidly access individual protein PTM sites, which is more easily achieved when modifiable amino acids are solvent exposed (i.e. not part of the hydrophobic protein core) and located in parts of the protein that are devoid of regular secondary, or tertiary structure. A substrate’s primary amino acid sequence encodes the specificity determinants for the modifying enzymes and for the protein modules that eventually recognize the different PTMs (Seet et al. 2006). Besides PTM-induced, functional modulations in protein–protein interactions (i.e. establishment of new interactions, breaking of existing interactions), PTMs can also mediate a range of structural responses that, in turn, differentially regulate functional, biological outcomes (Dyson and Wright 2005) (see below for selected examples).
Some protein residues lend themselves to different forms of modifications at single atom positions, such as lysines for example, which may undergo mono-, di- or trimethylation at the Nζ position (also referred to as the lysine ε-amino site), or reversible acetylation of the same site. Similarly, the hydroxyl groups of serines and threonines can be phosphorylated or glycosylated. Other amino acids undergo multiple modifications at different side chain positions. Arginines display ‘regio-specific’ modification patterns, which may be symmetric or asymmetric, as observed in dimethylation reactions for example. This variety of modification states within single amino acid side-chains further increases the scope for diversity and plasticity of protein functions. Combinations of different types of PTMs on the same protein also provide the basis for complex signaling mechanisms via ‘reversible combinatorial codes’ (Jenuwein and Allis 2001) and coupled PTM marks are often established in hierarchical fashions, whereby upstream ‘master switches’ lead to the activation of different downstream signaling cascades. Conversely, co-operative sets of PTMs are frequently laid down in close proximity and allow direct synergistic or antagonistic cross talk between adjacent modification marks (Latham and Dent 2007; Kruse and Gu 2009; Martin et al. 2011).
Protein modifications such as N-terminal acetylation, proline hydroxylation, proline cis–trans isomerization, cysteine disulfide bond formation, protein oxidation or nitrosylation, as well as proteolytic processing will not be discussed at this point, although these PTMs constitute equally abundant and biologically important signaling marks that are well amenable to investigations by NMR spectroscopy.
Before the advent of recombinant protein expression technologies, selective isotope labeling and multidimensional, hetero-nuclear NMR methods, NMR studies of covalent protein modifications such as phosphorylation or acetylation were restricted to direct, natural abundance readouts of phosphorus, or carbon NMR signals. Protein phosphorylation for example, was assayed by monitoring discrete changes in ATP/ADP 31P resonances in enzymatic kinase reactions, with respect to increasing phospho-protein signals (Mak et al. 1978; James 1985; Matheis and Whitaker 1984). Similarly, lysine acetylation was observed by directly reacting proteins with (1′-13C)-acetylsalicylic acid (Macdonald et al. 1999; Xu et al. 1999), while lysine methylation was chemically established via reactions with 13C formaldehyde (Ashfield et al. 2000; Macnaughtan et al. 2005; Abraham et al. 2009).
In this article, we restrict ourselves to PTM detection approaches by 2D hetero-nuclear correlation methods i.e. 1H–15N and 1H–13C NMR experiments and isotope-labeled protein samples. Experiments of that sort afford higher resolution insights into PTM reactions and provide residue-resolved, positional information about PTM target sites and about structural PTM consequences (see below). Because PTMs frequently occur in intrinsically disordered protein regions (IDRs), many of the NMR characteristics of protein PTMs described here are deduced from IDR examples. We have nevertheless included examples of PTMs in folded and partially folded protein substrates, whenever possible. We additionally discuss deviations in PTM NMR behaviors of folded proteins in the Conclusions Section of the manuscript. In addition, we would like to stress that PTM detection by NMR spectroscopy is subject to the same inherent limitations as all other high-resolution NMR applications. Increasing protein/PTM-substrate sizes inevitably lead to greater spectral complexities and unfavorable NMR relaxation behaviors. Residue-resolved PTM site mapping requires dual isotope labeling (13C/15N), triple-resonance NMR experiments (3D/4D) and dedicated NMR backbone assignment routines. Nevertheless, NMR detection of PTMs offers several advantages over ‘classical’ analytical methods, which are outlined in the following paragraphs. In addition, qualitative assessments of whether a protein of interest contains PTMs, and what types of PTMs, can be obtained without residue-specific resonance assignments provided that NMR spectra of unmodified reference states exist (discussed in the concluding remarks of the manuscript).
Covalent PTMs introduce local alterations in the chemical environments of individual protein residues that are readily detected as characteristic chemical shift changes of NMR-observable spin systems in 2D NMR correlation experiments. Because most of the abundant eukaryotic PTMs involve additions of small chemical entities that do not significantly alter the molecular weights of the respectively modified proteins, and are not subject to chemical exchange behavior, they do not compromise size-dependent NMR detection parameters. Knowledge about PTM NMR characteristics enables the correct identification of PTM type(s) and the mapping of the corresponding PTM site(s), provided that resonance-specific assignments are available. Protein phosphorylation for example, typically leads to large downfield chemical shift changes of serine/threonine backbone amide resonances (1H–15N), while protein acetylation results in smaller upfield chemical shift displacements of lysine backbone amides (see below).
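As a minimal sketch of how such chemical shift changes might be tabulated, the following Python snippet compares backbone amide peak positions of an unmodified and a modified sample. The residue labels, the peak positions, the function name and the 15N scaling factor of 0.2 are illustrative assumptions and are not taken from the studies cited here.

```python
import math

def amide_shift_changes(reference, modified, n_scale=0.2):
    """Return per-residue (d_1H, d_15N, combined) changes in p.p.m.

    reference / modified map residue labels to (1H, 15N) peak positions.
    n_scale down-weights 15N differences; 0.2 is an assumed value,
    not one prescribed by the text.
    """
    changes = {}
    for residue, (h_ref, n_ref) in reference.items():
        if residue not in modified:
            continue  # peak missing or unassigned in the modified sample
        h_mod, n_mod = modified[residue]
        d_h = h_mod - h_ref
        d_n = n_mod - n_ref
        combined = math.sqrt(d_h ** 2 + (n_scale * d_n) ** 2)
        changes[residue] = (d_h, d_n, combined)
    return changes

# Hypothetical peak lists (p.p.m.), invented for illustration only.
unmodified = {"Ser10": (8.30, 116.5), "Gly12": (8.40, 109.8), "Lys14": (8.25, 122.0)}
phosphorylated = {"Ser10": (8.85, 118.0), "Gly12": (8.42, 109.9), "Lys14": (8.25, 122.0)}

changes = amide_shift_changes(unmodified, phosphorylated)
largest = max(changes, key=lambda r: changes[r][2])  # residue with the largest change
```

In this invented data set, the residue with the largest combined change is the candidate modification site; in practice, PTM-induced changes of neighboring residues and structural rearrangements can complicate such a simple ranking.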
One important feature of PTM detection by NMR spectroscopy is the ability to delineate PTM distributions in proteins modified at multiple sites, provided that the different PTM marks are in close proximity. Site-specific mapping of adjacent protein PTMs is particularly challenging for most analytical methods, especially mass spectrometry (MS), which largely relies on proteolytic processing routines and peptide fragment-based PTM detection. As schematically illustrated in Fig. 1a, identical pairs of PTM/peptide fragments are generated from two different PTM distributions, which, conversely, cannot be distinguished by MS without elaborate identification processes of MS/MS fragmented peptides. In contrast, the corresponding NMR peak patterns unambiguously identify whether both PTMs are present on the same, or on different substrate molecules (Fig. 1b), provided that the presence of one PTM influences the chemical environment, and hence resonance frequency, of the respective other site. Although partial modifications of multiple PTM sites in the course of enzymatic modification reactions can complicate the resulting NMR spectra, their characteristics nevertheless resolve individual PTM distributions (Liokatis et al. 2012). As demonstrated for the doubly phosphorylated TEY fragment of the folded Erk kinase domain activation loop, PTM distributions originally derived from 2D NMR measurements were later confirmed by MS, however only after top-down and fragment-based MS approaches were combined (Prabakaran et al. 2011). Whenever multiple PTMs do not cluster in close proximity, PTM detection by NMR suffers from the same limitations in providing quantitative descriptions of PTM distributions as peptide-based MS approaches.
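The ambiguity of peptide-based readouts described above can be illustrated with a small, idealized Python sketch. It assumes a hypothetical substrate with two modification sites, A and B, and a protease that cleaves between them; the populations and function names are invented for illustration and do not model any specific MS or NMR workflow.

```python
from collections import Counter

def peptide_readout(population):
    """Peptide-level view after cleavage between sites A and B.

    population maps (a_modified, b_modified) states to molecule counts.
    Cleavage separates the two sites, so only per-site totals survive.
    """
    fragments = Counter()
    for (a_mod, b_mod), count in population.items():
        fragments[("A", a_mod)] += count
        fragments[("B", b_mod)] += count
    return fragments

def intact_readout(population):
    """Idealized intact-molecule view: each populated state stays distinguishable."""
    return {state: count for state, count in population.items() if count > 0}

# Two hypothetical mixtures of 100 molecules each.
same_molecule = {(True, True): 50, (False, False): 50}        # both marks together
different_molecules = {(True, False): 50, (False, True): 50}  # one mark per molecule

peptides_identical = peptide_readout(same_molecule) == peptide_readout(different_molecules)
intact_identical = intact_readout(same_molecule) == intact_readout(different_molecules)
```

Both mixtures yield the same set of peptide fragments, whereas the intact-molecule view keeps them apart. The latter corresponds to the NMR case only if the two sites influence each other's resonance frequencies, as stated in the text.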
The non-disruptive nature of high-resolution NMR spectroscopy additionally offers convenient means for time-resolved NMR measurements of reconstituted PTM reactions in vitro, but also of cellular modification events in complex environments such as cell extracts and whole live cells (Sakai et al. 2006; Lippens et al. 2008; Selenko et al. 2008). One such example is provided by the N-terminal ‘tail’ region of histone H3 that is post-translationally modified by endogenous enzymes in extracts of cultured human HeLa cells (Liokatis et al. 2010). 2D 1H–15N correlation experiments revealed phosphorylation of Ser10 and acetylation of Lys14 (Fig. 1c). This example illustrates another advantage of PTM detection by NMR spectroscopy: the unique ability to monitor chemically distinct modification events in parallel (i.e. phosphorylation and acetylation) and without further requirements for selective enrichment or purification procedures, as would be required for MS analyses. The quantitative nature of NMR spectroscopy is another feature that makes it particularly appealing for PTM studies. Because changes in NMR signal intensities of modified and unmodified substrate residues are detected side-by-side, substrate/product concentrations, and time-dependent changes thereof, are readily deduced from simple NMR signal integration routines (Fig. 1d). From these, kinetic reaction parameters can directly be extracted (Dose et al. 2011; Landrieu et al. 2011). Such measurements are particularly useful in providing additional mechanistic insights into stepwise PTM reactions that require completions of certain PTM events, before others can ensue (Selenko et al. 2008; Theillet et al. 2012) (Fig. 1d).
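A minimal sketch of such an integration-based analysis is given below. It assumes pseudo-first-order kinetics, equal detection efficiency for the unmodified and modified peaks, and synthetic peak integrals; the function names and all numbers are hypothetical and do not reproduce the analyses of the cited studies.

```python
import math

def fraction_modified(i_unmod, i_mod):
    """Fraction of substrate converted, from the two peak integrals."""
    return i_mod / (i_unmod + i_mod)

def apparent_rate_constant(times, fractions):
    """Least-squares slope through the origin of -ln(1 - f) versus t.

    Assumes pseudo-first-order kinetics, f(t) = 1 - exp(-k t).
    """
    num = sum(t * -math.log(1.0 - f) for t, f in zip(times, fractions))
    den = sum(t * t for t in times)
    return num / den

# Synthetic integrals (arbitrary units) generated for k = 0.05 per minute.
times_min = [10.0, 20.0, 40.0, 60.0]
unmod = [100.0 * math.exp(-0.05 * t) for t in times_min]
mod = [100.0 - u for u in unmod]

fractions = [fraction_modified(u, m) for u, m in zip(unmod, mod)]
k_app = apparent_rate_constant(times_min, fractions)
```

With noise-free synthetic data the fit recovers the rate constant used to generate the integrals. Real time series would additionally require error estimates and a check that the assumed kinetic model is appropriate.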
While time-resolved NMR recordings can monitor the incorporation of various PTMs in a quantitative and residue-resolved fashion, PTM removal reactions can be studied equally well (Dose et al. 2011; Landrieu et al. 2011). In contrast to other methods, no changes in experimental setups, assay conditions, or readout parameters are required. Direct observations of reversible PTMs are thus fully compatible with the non-invasive and non-destructive nature of NMR spectroscopy.
A final benefit of multi-dimensional NMR methods for PTM detection is the ability to delineate newly established structural features that result as direct consequences of the respective modification events. Although increases in spectral complexity often result from such structural rearrangements, as the observed chemical shift changes no longer report the modified protein residues alone, PTM-triggered conformational alterations are readily detected by additional resonance peak displacements of ‘PTM-remote’ protein sites. On their own, such ‘long-range’ chemical shift changes do not reveal the details of newly established structural features. They nevertheless enable immediate qualitative assessments of conformational alterations that result as direct consequences of the individual PTM reactions. Protein phosphorylation in particular has long been known to provide the physicochemical basis for well-defined structural features (Johnson and Lewis 2001) that include modulations in α-helix stability via N-cap formation, or C-terminal destabilization (Andrew et al. 2002). Indeed, several NMR studies of phosphorylation-induced conformational changes have been reported (Antz et al. 1999; Patchell et al. 2002; Kar et al. 2002; Bielska and Zondlo 2006; Perez et al. 2009; Tait et al. 2010; Nielsen and Schwalbe 2011; Sibille et al. 2011). A compelling, recent example is provided by phosphorylation of two tyrosine residues within the folded cytoplasmic integrin β3 domain, which result in pronounced structural rearrangements via phospho-tyrosine mediated hydrogen bonds and newly established electrostatic interactions (Deshmukh et al. 2011). Semi-synthetic approaches to introduce site-specific, homogeneous PTM states additionally offer new possibilities for studying long-range conformational response behaviors of modified proteins (Hejjaoui et al. 2012; Fauvet et al. 2012).
Modification of serine and threonine protein residues by reversible phosphorylation constitutes the most abundant PTM in eukaryotes (Cohen 2002b) (Fig. 2a). Phosphorylation is mediated by enzymes collectively referred to as protein kinases, whose own activities are often regulated via reversible phosphorylation (Cohen 2002a). All kinases exploit ATP as the universal phosphate donor. Removal of phosphates from modified protein residues is accomplished by sets of enzymes called phosphatases (Wurzenberger and Gerlich 2011). A number of protein domains specifically interact with phosphorylated serine and threonine residues and thereby enable the switch-like properties that these PTMs bring about (Seet et al. 2006). 14-3-3 domain containing proteins bind phosphorylated serines and threonines (Gardino and Yaffe 2011). Some members of the WW domain protein family interact with modified serines and threonines followed by a proline (Wintjens et al. 2001; Salah et al. 2012). They are thereby in competition with other domains for the same motifs, such as CKS modules (Landrieu et al. 2001). Fork-head associated (FHA) protein domains selectively recognize phospho-threonines (Mahajan et al. 2008).
As previously mentioned, regulatory protein regions that harbor post-translational modification sites, and phosphorylatable serine/threonine residues in particular, are mostly solvent exposed and intrinsically disordered (Iakoucheva et al. 2004). As a consequence, their NMR characteristics are more readily affected by generic solution conditions such as pH and temperature. 2D 1H–15N correlation experiments are particularly well suited to identify phosphorylated serines and threonines, as these residues experience prominent, modification-induced downfield backbone amide chemical shift changes (Δδ ~ 0.5/1.5 p.p.m.) (Figure 2b). These chemical shift changes are primarily caused by intra-residue hydrogen bonds between amide protons and the phosphate moieties, whenever PTM residues are in extended conformations (Du et al. 2005; Ramelot and Nicholson 2001). Phosphorylation of serines and threonines involved in preexisting hydrogen bond networks, as it is often encountered in ‘structured’ protein loop regions, results in modulations of these characteristics (see later). Furthermore, the strong pH dependency of phosphorylated serine/threonine backbone amide resonances can directly be exploited to confirm phosphorylation (Bienkiewicz and Lumb 1999; Ramelot and Nicholson 2001; Prabakaran et al. 2011). At pH ~ 5, below the pKa of phospho-serines and phospho-threonines, phosphorylation-induced 1H–15N chemical shift changes are less pronounced due to the different protonation states of the phosphate group. In addition, high salt concentrations also shield the negative charge of the phosphate group even at pH values well above the respective pKa’s and similarly reduce phospho-serine/threonine chemical shift changes.
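The pH dependence described above can be sketched with the Henderson-Hasselbalch relation. In the snippet below, the pKa of 6.0, the limiting chemical shifts and the assumption of fast exchange between the two protonation states of the phosphate group are placeholders chosen for illustration; the text only states that pH ~5 lies below the respective pKa.

```python
def dianionic_fraction(ph, pka=6.0):
    """Henderson-Hasselbalch fraction of phosphate groups in the dianionic state.

    pka=6.0 is an assumed placeholder value.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

def averaged_shift(ph, shift_monoanionic, shift_dianionic, pka=6.0):
    """Population-weighted amide shift, assuming fast exchange between both states."""
    f = dianionic_fraction(ph, pka)
    return shift_monoanionic + f * (shift_dianionic - shift_monoanionic)

# Hypothetical limiting 1H amide shifts (p.p.m.) of a phosphorylated serine.
low_ph = averaged_shift(5.0, 8.6, 9.0)
high_ph = averaged_shift(7.0, 8.6, 9.0)
```

Under these assumptions, the amide proton resonates further downfield at pH 7 than at pH 5, which mirrors the qualitative behavior exploited to confirm phosphorylation.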
Serine/threonine phosphorylation at multiple protein sites can lead to considerable increases in spectral complexity, especially when the individual modification sites are closely spaced and incomplete substrate turnover is encountered. In fact, the large average number of protein serine/threonine residues that are phosphorylated by endogenous kinases in physiological environments such as cell extracts, often result in NMR spectra of that sort. One such example is provided by direct NMR detection of multiple phosphorylation events within the N-terminal, disordered transactivation domain (N-TAD) of human p53, executed by cellular enzymes in nuclear extracts from cultured HeLa cells (Fig. 2b). Other examples include NMR spectra of the multi-site phosphorylated, disordered C-terminus of PTEN, the human nucleolar protein hNIFK (Byeon et al. 2005), the disordered Tau protein (Landrieu et al. 2006; Leroy et al. 2010; Sibille et al. 2011), the histone H3 tail peptide (Liokatis et al. 2012), or the folded Erk protein kinase domain (Prabakaran et al. 2011). It should be stressed however, that such increases in spectral complexities often provide additional information with regard to different PTM distributions (Amniai et al. 2011; Prabakaran et al. 2011; Liokatis et al. 2012).
Many serine/threonine kinases are proline directed, which means that individual substrate sites are flanked by C-terminal proline residues (Songyang et al. 1996; Lu et al. 2002). The prolyl peptide-bond between serine/threonine and proline residues can exhibit cis/trans isomerization (Brown et al. 1999; Weiwad et al. 2000; Zhou et al. 2000; Werner-Allen et al. 2011) and phosphorylation often affects the thermodynamic properties of these isomers (Schutkowski et al. 1998). Moreover, sets of peptidylprolyl isomerases (PPIases) that accelerate cis/trans interconversion have been identified and changes in cis/trans equilibria provide additional levels of PTM regulation (Liou et al. 2011). With respect to the NMR chemical shift time scale, prolyl cis/trans isomerization is slow and therefore two NMR resonance signals are observed for nuclear spins that are in the proximities of the isomerizing peptidyl-prolyl bonds. This often leads to additional increases in spectral complexity (Andreotti 2003). An example is provided by the NMR study of multi-site, phosphorylation-specific interactions of the disordered cyclin dependent kinase (CDK) inhibitor Sic1 with its receptor Cdc4 (Mittag et al. 2008) (Fig. 2b). In this case, NMR signals of cis-Phe71, -Phe82 and of cis phospho-Thr5, -Thr45 and -Ser80 further complicate the spectral appearance of phosphorylated Sic1 (marked with asterisks in Fig. 2b). NMR observation of protein dephosphorylation i.e. detection of the ‘reverse’ PTM reaction, is equally well accomplished as illustrated by NMR mapping of site-selective phosphate removal from Thr153 of human Tau by the PP2A phosphatase (Landrieu et al. 2011) (Fig. 2b).
Tyrosine phosphorylation has emerged as a fundamentally important mechanism of signal transduction in eukaryotic cells that governs processes such as cell proliferation, cell cycle progression, metabolic homeostasis, transcriptional activation, neuronal transmission, differentiation, development and aging (Hunter 2009). Perturbations in tyrosine phosphorylation underlie many human diseases, in particular cancer, which has made tyrosine kinases (TKs) prominent drug targets and prompted the development of TK inhibitors (Kolch and Pitt 2010). Auto-phosphorylation of membrane-bound receptor tyrosine kinases (RTKs) (Lemmon and Schlessinger 2010) upon growth factor stimulation for example, triggers many ‘downstream’, intracellular signaling events that involve serine/threonine- and soluble, nonreceptor tyrosine-kinases (NRTKs). Phospho-tyrosines are specifically recognized by members of the SH2- and PTB-domain protein families (Yaffe 2002), as well as a range of protein tyrosine phosphatases (Julien et al. 2011). Importantly, evolution of the SH2 domain family in different organisms correlates with the divergence of downstream signaling networks and appears to recapitulate the complexities of the respective organisms themselves (Liu et al. 2011).
In contrast to serine/threonine phosphorylation, tyrosine phosphorylation does not induce similarly large, downfield backbone-amide chemical shift changes of the modified protein residues (Bienkiewicz and Lumb 1999), which is likely due to the more distal position of the phosphorylatable tyrosine hydroxyl group (Fig. 2c). Tyrosine phosphorylation does, however, lead to large chemical shift changes of aromatic CH resonances (Δδ ~ 0.3/3 p.p.m., 1H–13C) (Fig. 2d), which function as unambiguous indicators for the presence of phospho-tyrosines. Due to the limited chemical shift dispersion of solvent exposed protein tyrosine residues, phospho-site mapping via 1H–13C side-chain resonances, is not easily accomplished. Instead, once the presence of phospho-tyrosines has been confirmed by 2D 1H–13C experiments, their exact positions are mapped via ‘continuous’ backbone amide chemical shift changes of amino acids that surround the respectively modified tyrosine residues and that often display larger chemical shift displacements than the phosphorylated tyrosines themselves. Examples for phospho-tyrosine NMR studies are provided by the mono- and di-modified, folded Integrin β3 domain (Deshmukh et al. 2011) (Fig. 2d), phosphorylation of the folded cell cycle inhibitors p27 (Grimmler et al. 2007) and p21 (Fig. 2d, Kriwacki laboratory, unpublished results) and the disordered activation loop of PYK2 (Fig. 2d, Selenko laboratory, unpublished results). Thus, NMR detection of tyrosine phosphorylation and mapping of tyrosine phosphorylation sites by combinations of 2D 1H–13C and 1H–15N correlation experiments is rather straightforward.
Although histidine phosphorylation was thought to primarily occur in prokaryotic organisms and in plants, it is likely to play an equally important role in mammalian cells (Besant and Attwood 2005). A number of histidine-specific mammalian protein kinases and phosphatases have recently been identified (Attwood et al. 2010) and their particular roles in tissue homeostasis, regeneration and cellular proliferation are currently being investigated. Histidines are phosphorylated at the δ1- (1-phospho-histidine) or ε2- (3-phospho-histidine) positions (Fig. 2e). Spontaneous histidine dephosphorylation occurs at low pH, and slow inter-conversion of δ1- into ε2-phospho-histidines takes place under mild basic conditions. This renders the detection of phosphorylated histidine residues a particularly challenging task for any method (Besant and Attwood 2010; Kee and Muir 2012).
Interestingly, 2D phospho-histidine investigations by hetero-nuclear NMR methods have been reported as early as 1994 (Rajagopal et al. 1994). In their study, the Klevit laboratory generated stably His15-phosphorylated, folded HPr by means of continuous enzymatic regeneration, which counteracted hydrolysis of the modified histidine residue. Phospho-His15 induced minor backbone amide chemical shift changes of the majority of HPr resonances, while the phosphorylated amino acid and neighboring Ala16 and Arg17 experienced large downfield chemical shift changes in the proton and nitrogen dimensions (Fig. 2f). A localized structural rearrangement that was governed by the dianionic phosphoryl-group of His15, which acted as a novel hydrogen bond acceptor for the backbone amide protons of Ala16 and Arg17 and stabilized an α-helical N-cap position was later shown to be the cause for this behavior (Jones et al. 1997). Similarly, pronounced backbone-amide chemical shift changes were also observed in other phospho-histidine NMR studies despite the absence of phosphorylation-induced structural changes (van Nuland et al. 1995; Garrett et al. 1998; Suh et al. 2008). More recently, HNP-type NMR experiments, based on phospho-histidine 1J(15N/31P) coupling constants have been reported for the ‘stereo-specific’ NMR assignment of phosphorylated histidine residues (Himmel et al. 2010).
Acetylation of lysine residues constitutes another abundant post-translational protein modification in the eukaryotic proteome (Norris et al. 2009). Differential acetylation of lysine residues in histone proteins establishes, in part, the epigenetic ‘histone code’, which ultimately determines the transcriptional states of entire genomes (Kouzarides 2007). Comprehensive annotation studies have additionally identified over 1,700 acetylated proteins in the human proteome, with functions in a great variety of cellular processes (Choudhary et al. 2009). Acetylation denotes the chemical conversion of the primary NζH3+-amino group of lysine side-chains into NH-amide/acetyl moieties (Fig. 3a). Cellular enzymes that catalyze such reactions are collectively referred to as histone acetyltransferases (HATs), all of which employ acetyl-CoA as the ubiquitous acetyl-group donor (Berndsen and Denu 2008). Deacetylation is accomplished by histone deacetylases (HDACs) (Haberland et al. 2009) and both types of enzymes constitute prominent drug targets (Yang and Seto 2007). Acetylated lysine residues are specifically recognized by single-, or multi-copy bromodomain (BRD) containing proteins (Sanchez and Zhou 2009).
In 2D 1H–15N NMR spectra, lysine acetylation in intrinsically disordered protein regions typically results in small backbone amide chemical shift changes (~Δδ 0.06/0.4 p.p.m.) of the respectively modified residues (Fig. 3b). In addition, every acetylation event produces a novel amide resonance signal, which corresponds to the newly established side-chain amide NHζ group. For most acetylated lysines this side-chain NHζ signal resonates at ~8.1/127.5 p.p.m. (1H–15N) and therefore constitutes the generic acetylation indicator, whereas NMR mapping of acetylation sites relies on chemical shift difference readouts of backbone amide resonances, which serve as specific acetylation site identifiers (Liokatis et al. 2010; Smet-Nocca et al. 2010). One example of a dual acetylation reaction in which the two indicator signals do not superimpose is the stepwise acetylation of Lys382 and Lys373 of the disordered C-terminal transactivation domain (C-TAD) of human p53 by CBP/p300 shown in Fig. 3b (Selenko laboratory, unpublished results).
Propionylated and butyrylated lysine residues (Fig. 3c) were first identified in histone proteins, the transcription factors p53 and CBP/p300 and in the propionyl-CoA synthetase (Chen et al. 2007; Garrity et al. 2007; Zhang et al. 2009; Cheng et al. 2009; Liu et al. 2009). Lysine acetyltransferases such as CBP/p300 and P/CAF were shown to also function as propionyl- and butyryl-transferases (Chen et al. 2007; Cheng et al. 2009; Liu et al. 2009; Leemhuis et al. 2008), while deacetylases SIRT1, SIRT2 and HDAC8 perform the respective de-propionylation and -butyrylation reactions (Riester et al. 2004; Smith and Denu 2007a, b; Cheng et al. 2009; Liu et al. 2009; Bheda et al. 2011). In vitro, and probably also in vivo, lysine propionylation and butyrylation are thought to occur via propionyl- and butyryl-CoA metabolites, which are naturally present at high abundance. Although propionylated and butyrylated lysine residues are recognized by acetyl-lysine binding bromodomains (Vollmuth and Geyer 2011), it is not known whether they signal particular biological activities, or whether they merely represent side-products of spontaneous modification reactions by acetyltransferases and propionyl- or butyryl-CoA (Lin et al. 2012).
The NMR characteristics of propionylated and butyrylated lysine residues in intrinsically disordered protein regions are similar to those of acetylated lysines (Fig. 3d). While their CH2 resonances are indistinguishable from those of acetylated lysines (~3.2/41.5 p.p.m., 1H–13C), their side-chain NHζ indicator signals are well dispersed and unambiguously identify the respective modification types. As shown for butyrylated Lys14 of histone H3, the NHζ signal (1H–15N) is detected at ~8.15/127.5 p.p.m., while the propionylated form of this lysine residue resonates at ~8.0/125.0 p.p.m. Backbone amide NMR signals of the respectively modified lysine residue experience similar chemical shift changes.
In prokaryotic and eukaryotic organisms, lysine residues are also subject to succinylation, malonylation and crotonylation (Zhang et al. 2011; Peng et al. 2011; Du et al. 2011). These PTMs involve significant chemical and physical changes in the nature of lysine side-chains (Fig. 3e). Malonylation and succinylation are likely to be established via the transfer of a malonyl-, or a succinyl-group from malonyl- or succinyl-CoA, respectively, which are important metabolic intermediates. Cellular enzyme(s) that mediate lysine malonylation and succinylation have not yet been identified, although succinylation is abundant in mammalian proteins, especially in metabolic enzymes (Lin et al. 2012). Moreover, the Sirt5 protein, a bona fide member of the HDAC protein family with no known activity as a lysine deacetylase, has been shown to function as a nicotinamide adenine dinucleotide (NAD)-dependent lysine de-malonylase and de-succinylase (Peng et al. 2011; Du et al. 2011). Lysine crotonylation has recently been identified as an important histone modification that decorates transcription start sites in active chromatin (Tan et al. 2011). Crotonylating and decrotonylating enzymes remain unknown, while crotonyl-CoA is speculated to constitute the source for the transferred crotonyl group.
CH2 signals of succinylated, malonylated, crotonylated, or acetylated lysine residues in intrinsically disordered protein regions display similar resonances (~3.2/41.5 p.p.m., 1H–13C). In contrast, lysine succinylation, malonylation and crotonylation NHζ indicator signals are clearly different from acetylated lysines (Fig. 3f). Exemplified by differentially modified histone H3 Lys9, they resonate at ~8.0/125.0 p.p.m., ~8.2/127.0 p.p.m. and ~8.0/124.5 p.p.m. (1H–15N) respectively. Prominent backbone amide chemical shift changes of the modified residues and adjacent amino acids are additionally detected (Fig. 3f). This last characteristic offers means to easily identify the respective modification site(s), in a manner similar to acetylated lysine residues. NMR detection of acetylation, propionylation, butyrylation, succinylation, malonylation and crotonylation indicator cross-peaks in larger proteins may be hampered by signal overlap. However, upon inspection of NMR chemical shift entries of a randomly chosen set of 20 proteins from the Biological Magnetic Resonance Bank (BMRB), we did not detect substantial degrees of signal overlap in this region of the corresponding 2D 1H–15N NMR spectra (Suppl. Figure 1). In our hands, indicator cross-peaks of the aforementioned PTMs usually display larger NMR signal intensities than the corresponding lysine amide backbone resonances. This is probably due to comparable water/amide-proton exchange properties of the different types of amide groups, paired with favorable dynamic behaviors of side-chain amides.
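The approximate NHζ indicator positions quoted in this article can be collected in a small lookup table, as in the following Python sketch. The tolerances and the function name are hypothetical choices, and the reference values are the rounded figures given in the text for individual histone H3 lysines; they should not be read as general reference shifts.

```python
# Approximate side-chain NHζ indicator positions (1H, 15N; p.p.m.) quoted in the text.
INDICATOR_SHIFTS = {
    "acetylation": (8.1, 127.5),
    "butyrylation": (8.15, 127.5),
    "propionylation": (8.0, 125.0),
    "succinylation": (8.0, 125.0),
    "malonylation": (8.2, 127.0),
    "crotonylation": (8.0, 124.5),
}

def candidate_modifications(h_ppm, n_ppm, h_tol=0.03, n_tol=0.3):
    """Return all modification types whose quoted indicator position
    lies within the (assumed) 1H and 15N tolerances of the query peak."""
    return sorted(
        name
        for name, (h_ref, n_ref) in INDICATOR_SHIFTS.items()
        if abs(h_ppm - h_ref) <= h_tol and abs(n_ppm - n_ref) <= n_tol
    )

malonyl_like = candidate_modifications(8.2, 127.0)
ambiguous = candidate_modifications(8.0, 125.0)
no_match = candidate_modifications(7.0, 110.0)
```

Because the rounded values quoted for propionylation and succinylation coincide, the lookup returns both candidates for a peak at ~8.0/125.0 p.p.m. In such cases, a peak list alone does not settle the assignment and additional information, such as the reaction conditions or the acyl-CoA donor used, would be needed.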
Protein lysine residues are subject to two of the most abundant but chemically distinct PTMs: lysine acetylation (as described above) and lysine methylation. Moreover, one and the same lysine residue may be acetylated, mono-, di-, or trimethylated (Fig. 4a), which produces an impressive range of complexity for switch-like signaling functions. Indeed, lysine methylation plays pertinent roles in biological processes that include chromatin-mediated signaling (Latham and Dent 2007; Barth and Imhof 2010; Bannister and Kouzarides 2011; Suganuma and Workman 2011) and transcriptional regulation (Egorova et al. 2010; Stark et al. 2011; Lehnertz et al. 2011; Campaner et al. 2011). Lysine methylation has also been linked to carcinogenesis and tumor malignancy (Stark et al. 2011; Fullgrabe et al. 2011; Varier and Timmers 2011), brain function and disorder (Gupta et al. 2010; Peter and Akbarian 2011; Graff et al. 2011), various metabolic pathways (Teperino et al. 2010) and cellular life span (Han and Brunet 2012). In addition, methylated lysines have been identified as major virulence factors and strong immunogens in the Mycobacterium tuberculosis heparin-binding hemagglutinin (HBHA) (Pethe et al. 2002). Methylation is generally accomplished via methyl-transfer of the methylsulfonium moiety of S-adenosyl-methionine (SAM) onto the distal Nζ moieties of lysine side-chains, catalyzed by SET-domain containing enzymes referred to as lysine methyltransferases, or KMTs (Del Rizzo and Trievel 2011). Demethylation is mediated by lysine demethylases, or KDMs. Two classes of KDMs are known: those that contain LSD domains and employ FAD as a cofactor and those that bear Jumonji domains and rely on α-ketoglutarate for demethylation (Heightman 2011). Because of their prevalent roles in disease-relevant biological processes, both KMTs and KDMs constitute prominent drug targets (Kelly et al. 2010; Spannhoff et al. 2009; Copeland et al. 2009).
Differentially methylated lysines are specifically recognized by plant homeo- (PHD) (Sanchez and Zhou 2011), chromo- (Yap and Zhou 2011) and MBT-domain containing proteins (Bonasio et al. 2010), which partially also define the ‘Royal Family’ of Tudor-like proteins (see also below) (Maurer-Stroh et al. 2003).
While lysine acetylation is manifested by the aforementioned chemical shift changes in 2D 1H–15N correlation spectra, lysine mono-, di-, and trimethylation in intrinsically disordered protein regions does not yield observable perturbations of backbone amide resonance signals (Theillet et al. 2012). Lysine mono- and dimethylation do, however, produce characteristic side-chain 1H–15Nζ indicator resonances (Fig. 4b), but their fast chemical exchange properties at physiological temperatures and pH make NMR detection impracticable. Instead, lysine methylation is best observed via 2D 1H–13C correlations, which display unique chemical shift changes of CH2 side-chain resonances for the different methylation states (Fig. 4b). Specifically, CH2 signals of unmodified lysines resonate at ~3.0/42.0 p.p.m. (1H–13C), while mono-, di- and trimethylated lysines experience large downfield chemical shift changes (Δδ of ~0.1/9.0 p.p.m. per added methyl group, 1H–13C) and thus display characteristic resonance frequencies at ~3.1/51.0 p.p.m., ~3.2/60.0 p.p.m. and ~3.4/68.5 p.p.m. (1H–13C), respectively. The added methyl groups of mono-, di-, or trimethylated lysines are detected at ~2.7/35.5 p.p.m., ~2.9/45.5 p.p.m. and ~3.1/55.5 p.p.m. (1H–13C) (Theillet et al. 2012). Because most modifiable lysine residues in folded and intrinsically disordered proteins are solvent exposed, they sample similar chemical environments and the above NMR characteristics are generally preserved. Hence, lysine CH2 resonances unambiguously determine whether a respective residue is methylated and, if so, in what form. Because NMR detection of lysine methylation via proton-carbon correlations does not involve exchangeable protons that could be subject to differential chemical exchange behaviors, time-resolved NMR measurements of methylation reactions can be performed in a broad range of in vitro conditions, or directly in complex environments such as cellular extracts (Theillet et al. 2012).
The advantage of observing nonexchangeable 1H–13C correlations is offset by the chemical shift degeneracy of lysine side-chain resonances, which makes NMR mapping of lysine methylation sites difficult. Residue-selective isotope labeling and dedicated 2D (HC(Mex)-TOCSY-Cα)NH pulse schemes that exploit selective methyl-lysine C excitations and correlations to well-resolved lysine backbone amide (1H–15N) resonances overcome these problems (Theillet et al. 2012). In contrast to methylated lysine CH2 signals, CH2 resonances of acetylated lysines are detected at ~3.15/42.0 p.p.m. (Fig. 4b) and can therefore be monitored simultaneously with methylated lysines (Theillet et al. 2012).
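The CH2 reference positions quoted above lend themselves to a simple lookup. The following Python sketch is illustrative only: the function name, the dictionary layout and the tolerance windows (`c_tol`, `h_tol`) are assumptions introduced here and are not part of the cited work. It assigns an observed 1H–13C cross-peak to the closest methylation state, provided the peak falls within the tolerance window of a reference position.

```python
# Approximate lysine side-chain CH2 positions (1H, 13C in p.p.m.) quoted in
# the text for intrinsically disordered protein regions.
LYS_CH2_REFERENCE = {
    "unmodified": (3.0, 42.0),
    "monomethyl": (3.1, 51.0),
    "dimethyl": (3.2, 60.0),
    "trimethyl": (3.4, 68.5),
}


def classify_lysine_ch2(h_ppm, c_ppm, c_tol=2.0, h_tol=0.2):
    """Return the closest methylation state, or None if nothing matches.

    c_tol and h_tol are hypothetical tolerance windows (p.p.m.); suitable
    values depend on the sample and on spectral resolution.
    """
    best_state = None
    best_score = None
    for state, (h_ref, c_ref) in LYS_CH2_REFERENCE.items():
        dh = abs(h_ppm - h_ref)
        dc = abs(c_ppm - c_ref)
        if dh > h_tol or dc > c_tol:
            continue  # outside the window around this reference position
        # Tolerance-normalized squared distance, so both dimensions count.
        score = (dh / h_tol) ** 2 + (dc / c_tol) ** 2
        if best_score is None or score < best_score:
            best_state, best_score = state, score
    return best_state
```

For example, a peak at 3.12/50.6 p.p.m. would be assigned to the monomethylated state, whereas a peak far from all four reference positions returns `None`. Because acetylated lysine CH2 signals (~3.15/42.0 p.p.m.) lie close to the unmodified position in the carbon dimension, they are deliberately left out of this table.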
Methylation of arginine residues is yet another abundant and biologically important PTM (Bedford and Richard 2005). Initially described in histone- and splicing-proteins, arginine methylation occurs in numerous other polypeptides that exert vital functions in signal transduction, transcription and translation (Bedford and Clarke 2009; Teyssier et al. 2010; Parry and Ward 2010; Erce et al. 2012). Arginine methylation involves the covalent addition of one or two methyl groups to either one or both distal guanidino Nη nitrogens of arginine side-chains (Fig. 4c). In contrast to lysine acetylation, but similar to lysine methylation, arginine methylation preserves the overall positive charge of the residue. Arginine dimethylation exhibits stereo-specific chemical properties and occurs in either a symmetric (SDMA) or asymmetric (ADMA) form. Methylation of arginine residues is mediated by sets of enzymes called peptidylarginine methyltransferases, or PRMTs (Wolf 2009). Asymmetrically dimethylating PRMTs are referred to as Class I enzymes. Symmetric dimethylation is established by Class II enzymes. For both classes of PRMTs, monomethylation typically occurs as an intermediate step en route to dimethylation. All PRMTs employ SAM as a cofactor and methyl-group donor. Demethylation is accomplished by peptidylarginine demethylases, or PRDMs (Smith and Denu 2009; Di Lorenzo and Bedford 2011). Dedicated methyl-arginine binding is primarily mediated by proteins of the Tudor domain family (Chen et al. 2011). Arginine methylation and aberrant PRMT and PRDM functions are implicated in a number of human diseases, including several forms of cancer, which has spurred interest in PRMTs and PRDMs as novel drug targets (Spannhoff et al. 2009; Lakowski et al. 2010; Luo 2012).
2D 1H–13C correlation spectra of non-methylated, mono- and symmetric di-methylated arginines in intrinsically disordered protein regions display minute differences in their CH2δ resonance signals, which superimpose at ~3.25/39.0 p.p.m. (1H–13C), clearly offset from lysine CH2 resonances (Fig. 4d). As shown for histone H2A Arg11 in comparison, asymmetric dimethylated arginines exhibit pronounced downfield chemical shift changes in the carbon dimension and resonate at ~3.25/40.0 p.p.m. (1H–13C) (Fig. 4d). Methyl-group correlation signals of mono- and symmetric di-methylated arginines superimpose at ~2.75/26.0 p.p.m. (1H–13C), well set apart from arginine CH2δ resonances. NMR signals of asymmetric dimethyl-groups are detected at a uniquely different resonance frequency at ~3.0/36.0 p.p.m. (1H–13C). The poor chemical shift dispersion of non-, mono- and symmetric dimethylated arginine CH2δ signals limits the usefulness of 1H–13C correlation experiments in identifying these particular PTM states. Instead, 2D 1H–15N correlation spectra of methylated arginine residues reveal a great spread of their NH resonance signals, depending on their individual modification states (Fig. 4d): NH cross-peaks of non-methylated arginines are detected at ~7.05/84.0 p.p.m. (1H–15N), while mono- (~6.85/81.0 p.p.m.), symmetric di- (~6.65/79.0 p.p.m.) and asymmetric dimethylated arginines (~6.7/83.0 p.p.m.) display uniquely different resonance frequencies. In addition, arginine 1H–15N NHη signals exhibit characteristic chemical shift values in their differentially methylated forms and because most modifiable arginines in PRMT substrates are solvent exposed and sample similar chemical environments, these NMR characteristics are generally preserved. However, NMR detection of solvent accessible protein arginine NH and NHη resonances is only feasible at a pH lower than 6.5, because of fast water/guanidinium proton chemical exchange (Liepinsh and Otting 1996). 
This precludes NMR measurements of enzymatic arginine methylation reactions under truly physiological conditions (i.e. at pH 7.0–7.5). Nevertheless, qualitative information about the presence of methylated arginine residues can be obtained at a pH below 6.5 and by low temperature NMR measurements as shown in Fig. 4, while residue-resolved NMR mapping of arginine methylation sites requires additional side-chain/backbone amide correlation experiments. As stated before, NMR detection of arginine methylation may become increasingly difficult in proteins of larger sizes. However, methylated arginines are usually located in disordered protein regions (Gao and Xu 2012), which, paired with enhanced side-chain dynamics, offers additional advantages for low temperature detection routines that usually suffer from unfavorable increases in NMR correlation times in folded proteins.
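As a rough illustration of how the 1H–15N indicator positions and the pH constraint described above could be combined, the Python sketch below picks the nearest reference state for an observed arginine side-chain NH cross-peak and refuses to do so when the sample pH is 6.5 or higher. The function name, the nitrogen scaling factor `n_scale` and the decision to raise an error are assumptions made for this example; they are not taken from the cited studies.

```python
import math

# Approximate arginine side-chain NH positions (1H, 15N in p.p.m.) from the text.
ARG_NH_REFERENCE = {
    "unmodified": (7.05, 84.0),
    "monomethyl": (6.85, 81.0),
    "symmetric dimethyl": (6.65, 79.0),
    "asymmetric dimethyl": (6.7, 83.0),
}

# At or above this pH, fast water/guanidinium proton exchange is expected to
# prevent detection of solvent-accessible arginine NH signals.
MAX_PH = 6.5


def nearest_arginine_state(h_ppm, n_ppm, ph, n_scale=0.1):
    """Return the reference state closest to an observed NH cross-peak.

    n_scale is a hypothetical factor that down-weights the 15N dimension so
    that both dimensions contribute on a comparable scale.
    """
    if ph >= MAX_PH:
        raise ValueError("arginine NH signals are not expected at pH >= 6.5")

    def distance(state):
        h_ref, n_ref = ARG_NH_REFERENCE[state]
        return math.hypot(h_ppm - h_ref, n_scale * (n_ppm - n_ref))

    return min(ARG_NH_REFERENCE, key=distance)
```

Unlike a windowed lookup, this nearest-neighbor variant always returns a state, so in practice it would be sensible to inspect the distance as well before trusting the assignment.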
Protein glycosylation refers to a large number of chemically distinct modifications that are overall classified based on the chemical nature of their protein–sugar linkages: N-glycosylation and O-glycosylation (Spiro 2002; Cummings 2009; Larkin and Imperiali 2011). Glycosyltransferase enzymes employ UDP-, GDP- or CMP-‘activated’ sugars as cofactors, from which they transfer the respective carbohydrate entities onto substrate proteins (Ohtsubo and Marth 2006). Glycosylation is abundant in viruses, prokaryotes, archaea and eukaryotes (Vigerust and Shepherd 2007; Eichler and Adams 2005; Bhat et al. 2011; Cummings 2009; Khoury et al. 2011; Hart and Copeland 2010), where it is involved in nutrient sensing, transcription, translation, signal transduction, organelle transport and cell–cell communication (Roth 2002; Hart et al. 2011; Marth and Grewal 2008). Conversely, these functions are often hijacked by pathogens for invasive mechanisms of cell entry (Varki 2008; Vigerust and Shepherd 2007). Glyco-mediated self/non-self recognition and pathological aberrations thereof are implicated in a number of human auto-immune diseases (Alavi and Axford 2008; Arnold et al. 2007) and speculated to be involved in diabetes and cancer (Slawson et al. 2010; Slawson and Hart 2011). O-GlcNAc glycans have additionally been shown to prevent aggregation-prone proteins from oligomerization and fibrillization (Yuzwa et al. 2012; Liang et al. 2006; Yu et al. 2008). Structural studies of glycosylated proteins generally require homogeneous glycans, which are difficult to produce, especially when extended structures are desired. Recent advances in genetic engineering of bacterial and yeast glycosylation pathways for in vivo glycoprotein production have greatly facilitated this task (Rich and Withers 2009). Genetically engineered bacteria can also be employed to produce homogeneous glycans, which can then be linked via chemical or enzymatic reactions to proteins of interest (Skrisovska et al. 2010).
These strategies allow alternative isotope labeling schemes for protein and glycan moieties that permit isotope-filtered/edited NMR experiments (Slynko et al. 2009). In addition, dedicated protocols for the production of specifically isotope-labeled and glycosylated antibodies using hybridoma cell lines have been reported (Yamaguchi and Kato 2010).
Glycans typically display high internal mobility (DeMarco and Woods 2008), which hampers their characterization by X-ray crystallography (Wormald et al. 2002; Meyer and Moller 2007). In many instances, glycan sugar moieties retain their high degree of internal mobility when they are covalently attached to the respective protein targets. This, in turn, renders them amenable to high-resolution NMR studies, as exemplified by recent work on the glycosylated, 55 kDa Fc fragment of immunoglobulin G (Barb et al. 2011). Thus, NMR constitutes the preferred tool for characterizing the structural and dynamic properties of sugar moieties in glyco-proteins (Fletcher et al. 1994; Wyss et al. 1995; Metzler et al. 1997; Erbel et al. 2000; Slynko et al. 2009; Barb and Prestegard 2011). On the protein side, NMR has also been used to investigate the conformational properties of glycosylated-, and neighboring protein residues. Specifically, preferred protein backbone conformations have been correlated with glycosidic peptide-sugar linkages of modified serine and threonine residues (Corzana et al. 2006b, 2007). However, due to the great variety in glycan residues and in glycosylation-induced changes of protein backbone conformations, it is not possible to pin down common glycan features that define the NMR characteristics of individual glycosylation events. Despite that, residue-resolved NMR measurements of glycan modification kinetics are easily accessible, because of the large chemical shift differences between free and polymerized carbohydrate entities (Barb et al. 2011).
N-linked glycans contain Glc3Man9GlcNAc2 as the basic building block, which is covalently added onto the Nδ position of asparagine side-chains within the Asn-X-Ser/Thr consensus sequence (X must not be a proline) (Stanley et al. 2009). Starting from this primary structure, additional carbohydrate moieties (fucose, GalNAc, sialic acid, galactose) are progressively added, or, in turn, removed to yield the final N-glycan products. Cellular N-glycan maturation occurs in multiple, spatially separated reaction steps in the endoplasmic reticulum (ER) and Golgi apparatus. Once established, N-glycosylation itself is rather long lived, whereas individual glycan structures may experience dynamic compositional changes during a protein’s lifetime (Stanley et al. 2009; Schwarz and Aebi 2011). With regard to glycosylation-induced changes in local protein backbone conformations, N-glycosylation has been reported to promote β-turn conformations (Meyer and Moller 2007), stabilize folded protein domains (Wyss et al. 1995) and increase the degree of order in human chorionic gonadotropin (Erbel et al. 2000). Immunoglobulin G N-glycans exhibit conformational exchange between two states, one giving rise to contacts between the glycan chain and the immunoglobulin, the other one preventing such contacts (Barb and Prestegard 2011; Barb et al. 2012). In contrast, a single, well-defined structure of the N-glycan chain was delineated for the adhesion domain of human CD2 and for a model glycoprotein from Campylobacter jejuni (Wyss et al. 1995; Slynko et al. 2009).
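The Asn-X-Ser/Thr consensus described above, with X being any residue except proline, can be located in a one-letter protein sequence with a short regular expression. The Python sketch below is an illustration only: the function name is introduced here, and the presence of the consensus sequence does not by itself show that a given asparagine is glycosylated.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr. The lookahead
# keeps matches zero-width beyond the Asn, so overlapping sites are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sequons(sequence):
    """Return 1-based positions of Asn residues within Asn-X-Ser/Thr motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]
```

Applied to the made-up test sequence `MNGSANPTNNTS`, the function returns `[2, 9, 10]`: the asparagine at position 6 is skipped because it is followed by proline.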
O-glycans are established via direct glycosylation of serine, threonine, or tyrosine side-chain hydroxyl groups and can be found in prokaryotes and eukaryotes (Brockhausen et al. 2009; Freeze and Haltiwanger 2009; Hart and Akimoto 2009). The structural and chemical diversity of O-glycans is much higher than for N-glycans, and many different sugars, such as N-acetylglucosamine (GlcNAc), N-acetylgalactosamine (GalNAc), galactose, glucose, sialic acid, fucose or xylose are commonly incorporated. O-glycosylation has been shown to induce various protein conformational changes. Serine, threonine O-glycosylation with α-D-GalNAc for example, was reported to decrease the α-helical content of the modified peptide hormone calcitonin (Tagashira et al. 2002), or even elicit β-like, extended structures in glycopeptides, which, in turn, were detected by large, residue-specific chemical shift changes (Coltart et al. 2002; Hashimoto et al. 2011). In contrast, it has been observed that O-glycosylation with β-D-glucose increases α-helicity (Corzana et al. 2006a), while O-glycosylation with β-D-GlcNAc was reported to induce turn-like structures in several glycopeptides (Simanek et al. 1998; Wu et al. 1999). These structures were further disrupted by alternative phosphorylation, as has been shown for the N-terminus of the murine estrogen receptor beta, for example (Chen et al. 2006). NMR was also used to decipher a glycosylation-dependent decrease in proline cis isomer content at positions C-terminal to modified serine, or threonine residues (Narimatsu et al. 2010). Finally, O-glycosylation by α-D-Gal or β-D-Gal impacts cis/trans isomerization of (2S,4S)-4-hydroxyproline (Owens et al. 2009), but not of (2S,4R)-4-hydroxyproline (Owens et al. 2007), whereas glycosylation of poly-hydroxyprolines induces stable poly-proline type II helices (Owens et al. 2010). 
O-linked β-N-acetylglucosamination is highly dynamic and as abundant as protein phosphorylation, or acetylation (Hart and Akimoto 2009; Khoury et al. 2011) (Fig. 5a). In many instances, individual serine and threonine residues in eukaryotic proteins compete in phosphorylation/glycosylation reactions (Hart et al. 2007; Hart et al. 2011). Because no consensus sequences have yet been identified for O-linked N-acetylglucosamine transferase (OGT) enzymes, even MS detection of protein O-GlcNAc sites has proven difficult. Sophisticated enrichment routines for O-GlcNAc peptides via combined enzymatic and chemical reactions, in combination with soft ionization modes that preserve the labile O-GlcNAc groups on serines and threonines have to be employed (Wang et al. 2010).
13Cβ chemical shift values of O-GlcNAcylated or O-GalNAcylated serines and threonines (~71.0 and ~78.0 p.p.m. respectively) give rise to well-separated resonance signals in 2D 1H–13C correlation experiments, which can serve as unique O-glycan indicators (Corzana et al. 2007; Smet-Nocca et al. 2011). Characteristic, anomeric O-Glycan 1H1-13C1 correlation signals at ~4.3–5.0/99.0–105.0 p.p.m. could further function as O-glycosylation indicators. Direct NMR identification of protein O-glycosylation sites by homonuclear 1H–1H correlation experiments is not straightforward, because no cross peaks between GlcNAc protons and the modified serine or threonine residues are detected (Smet-Nocca et al. 2011; Dehennaut et al. 2008). In the case of O-GlcNAc modified Tau for example, 2D 1H–15N correlation experiments enabled chemical shift difference mapping of the glycosylated protein region, but failed to identify the respective modification site(s), because of large chemical shift changes of two serine and one threonine residues, Ser400, Thr403 and Ser404, and of neighboring Val399, Gly401, Asp402 (Fig. 5b). NMR assignment of the modification site was achieved via a combination of 2D 1H–15N HSQC, 1HN-1H TOCSY and 1H–13C HSQC experiments. Based on the large Cα and Cβ chemical shift changes of O-glycosylated protein residues (Δδ ~ 2.0 p.p.m. and ~6.0 p.p.m., respectively), corresponding Hα and Hβ chemical shifts were extracted from 2D 1H–13C spectra and correlated to HN resonances via HN-Hα/β signals from 2D TOCSY NMR spectra (Fig. 5b). Thereby, O-GlcNAc modification of Tau Ser400 was confirmed (Dehennaut et al. 2008; Smet-Nocca et al. 2011).
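Chemical shift difference mapping of the kind mentioned above is often summarized as a single combined 1H–15N value per residue. The Python sketch below uses a weighted Euclidean distance, which is a widely used convention rather than the procedure of the cited studies; the weighting factor `n_weight`, the threshold and the function names are assumptions chosen for illustration.

```python
import math


def combined_shift_difference(reference, modified, n_weight=0.2):
    """Combined 1H-15N shift difference (p.p.m.) for one residue.

    reference and modified are (1H, 15N) tuples; n_weight is a hypothetical
    factor that scales the 15N difference to the 1H range.
    """
    (h_ref, n_ref), (h_mod, n_mod) = reference, modified
    return math.sqrt((h_mod - h_ref) ** 2 + (n_weight * (n_mod - n_ref)) ** 2)


def perturbed_residues(reference_peaks, modified_peaks, threshold=0.05):
    """Return sorted residue numbers whose combined difference exceeds threshold.

    Both arguments map residue number -> (1H, 15N); only residues assigned
    in both spectra are compared.
    """
    shared = reference_peaks.keys() & modified_peaks.keys()
    return sorted(
        residue
        for residue in shared
        if combined_shift_difference(
            reference_peaks[residue], modified_peaks[residue], 
        ) > threshold
    )
```

As the Tau example shows, such a map delineates the affected region but does not by itself identify the modified residue, because neighboring residues can shift as strongly as the modification site.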
The growing demand for quantitative methods to annotate cellular signaling states on systems levels has been met by the development of analytical tools that enable direct observations of cellular PTMs in unperturbed environments. Mapping of protein PTM sites, as well as in situ deductions of mechanistic properties of cellular PTM reactions (directly obtainable from such analyses), are critically required to further our understanding about how these processes are modulated under different health and disease conditions. With this article, we hope to have conveyed strong arguments in favor of high-resolution NMR spectroscopy as highly useful in providing such information.
What would be the requirements for an ideal analytical tool in eukaryotic PTM research? Above all, it ought to be able to generically and qualitatively report whether a protein of interest is post-translationally modified and, if so, to identify what kind of PTMs are present and at which residue positions. In most instances, it will be important to address such questions in cellular contexts and without defined information about the nature of the modifying enzymes and their respective activities. Here, direct NMR measurements of isotope-labeled proteins in different cell extracts can provide valuable first insights. While in-extract NMR approaches may only be feasible for reasonably-sized (<20 kDa) proteins, and for one isotope-labeled protein at a time, in most instances simple 2D NMR correlation (1H–15N and 1H–13C) experiments may prove sufficient to qualitatively identify which types of PTMs are present, even without the necessity for NMR resonance assignments. As we have outlined throughout the text, most of the predominant eukaryotic PTMs display characteristic NMR indicator properties that make their identification straightforward. Serine/threonine and histidine phosphorylation in intrinsically disordered protein regions, for example, results in large downfield chemical shift changes of backbone amide resonances, which are easily discernible in 2D 1H–15N correlation spectra. The effects of tyrosine phosphorylation are less pronounced in 2D 1H–15N spectra, but phosphotyrosines display unique indicator properties in the aromatic region of 2D 1H–13C NMR correlations. Similarly, different acylation events (i.e. acetylation, malonylation, succinylation, crotonylation, propionylation and butyrylation) produce unique HNζ indicator signals in 2D 1H–15N experiments, which can simultaneously be detected with most phosphorylation modifications.
Characteristic lysine CH2 resonances in 2D 1H–13C NMR correlations unambiguously function as indicators for mono-, di- and trimethylation. Transferring the protein mixture to a low pH environment enables NMR recordings of unique indicator resonances of different arginine methylation states by 2D 1H–15N experiments. At the same time, sets of 2D 1H–15N and 1H–13C correlation experiments provide qualitative indications for possible glycosylation events at serine/threonine positions. Thereby, combinations of ‘simple’ 2D correlation experiments (1H–15N and 1H–13C) can be used to identify the most common types of eukaryotic PTMs.
While we have focused our article on NMR characteristics of eukaryotic PTMs in disordered, regulatory protein regions, we wish to explicitly stress that PTMs within folded proteins, or protein domains, may exhibit deviations from the canonical NMR properties described above (Lippens and Selenko laboratories, unpublished observations). Especially for cases in which post-translationally modified amino acids are involved in hydrogen bond networks, PTM-induced NMR behaviors can differ substantially from disordered, solvent exposed PTM sites. In addition, whenever more global backbone amide chemical shift changes are observed, NMR mapping of individual PTM sites may require more dedicated in vitro experimental setups and additional 3D NMR experiments. These potential drawbacks are contrasted by the unique ability of NMR spectroscopy to provide time-resolved, quantitative information about individual modification levels (i.e. ratios of modified versus unmodified substrate sites and molecules), as well as about individual PTM distributions in the case of closely spaced modification sites. Time-resolved NMR spectroscopy provides high-resolution insights into hierarchical properties of processive PTM events at multiple protein sites, which ideally complements PTM data from proteome-wide MS studies. Combined with direct NMR readouts in complex environments such as cell extracts and intact cells, it offers the advantage to quickly and comparatively analyze PTMs under different in vitro and in vivo conditions. The scope of PTM-induced structural rearrangements, which is not easily accessible with classical in vitro methods in structural biology and particularly important for intrinsically disordered, regulatory protein regions, provides yet another area for unique NMR input. For these reasons, we believe that NMR spectroscopy will become an increasingly important tool for deciphering the full biological range of signaling-mediated, cellular processes.
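To make the notion of a modification level concrete, the Python sketch below computes the fraction of modified substrate from the intensities of the unmodified and modified cross-peaks of one site, and applies this to a series of time points. It is a minimal sketch under a stated assumption that is not examined in the text: both peaks are taken to report on their populations with equal sensitivity (comparable relaxation and exchange behavior). The function names and the input layout are introduced here for illustration.

```python
def modified_fraction(i_unmodified, i_modified):
    """Fraction of modified substrate from two cross-peak intensities.

    Assumes both peaks respond equally to their underlying populations.
    """
    total = i_unmodified + i_modified
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return i_modified / total


def modification_time_course(points):
    """Convert (time, I_unmodified, I_modified) tuples into (time, fraction)."""
    return [
        (time, modified_fraction(i_unmod, i_mod))
        for time, i_unmod, i_mod in points
    ]
```

Running such a calculation separately for each resolved site gives one curve per site, which is the kind of data from which the order of modification events at neighboring sites could be compared.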
Here, we have provided an initial NMR reference frame for the most abundant eukaryotic post-translational protein modifications. Future studies will likely reveal an even greater chemical repertoire of cellular PTMs, but given the inherent physical nature of high-resolution NMR spectroscopy and its unique ability to report changes in the chemical environments of individual atomic nuclei, it is well poised to face these challenges with ease.
We would like to thank Rachel Klevit, Olga Vinogradova, Tanja Mittag and Julie Forman-Kay for providing original NMR spectra for reproduction in this manuscript. F.X.T. acknowledges support from the Association pour la Recherche contre le Cancer (ARC). P.S. acknowledges funding by an Emmy Noether research grant (SE1794/1-1) from the Deutsche Forschungsgemeinschaft (DFG). R. W. K. acknowledges support from NIH core grant P30CA21765 (to St. Jude Children’s Research Hospital) and 5R01CA082491 (to R. W. K.), and the American Lebanese Syrian Associated Charities (ALSAC) of St. Jude Children’s Research Hospital. We further express our gratitude to Angela Gronenborn and Georges Mer for expert advice and stimulating discussions in the course of writing the paper.
Electronic supplementary material The online version of this article (doi:10.1007/s10858-012-9674-x) contains supplementary material, which is available to authorized users.
Francois-Xavier Theillet, Department of NMR-Supported Structural Biology, Leibniz Institute of Molecular Pharmacology (FMP Berlin), In-cell NMR Group, Robert-Roessle Strasse 10, 13125 Berlin, Germany.
Caroline Smet-Nocca, CNRS UMR 8576, Universite Lille Nord de France, 59655 Villeneuve d’Ascq, France.
Stamatios Liokatis, Department of NMR-Supported Structural Biology, Leibniz Institute of Molecular Pharmacology (FMP Berlin), In-cell NMR Group, Robert-Roessle Strasse 10, 13125 Berlin, Germany.
Rossukon Thongwichian, Department of NMR-Supported Structural Biology, Leibniz Institute of Molecular Pharmacology (FMP Berlin), In-cell NMR Group, Robert-Roessle Strasse 10, 13125 Berlin, Germany.
Jonas Kosten, Department of NMR-Supported Structural Biology, Leibniz Institute of Molecular Pharmacology (FMP Berlin), In-cell NMR Group, Robert-Roessle Strasse 10, 13125 Berlin, Germany.
Mi-Kyung Yoon, Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN, USA.
Richard W. Kriwacki, Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN, USA.
Isabelle Landrieu, CNRS UMR 8576, Universite Lille Nord de France, 59655 Villeneuve d’Ascq, France.
Guy Lippens, CNRS UMR 8576, Universite Lille Nord de France, 59655 Villeneuve d’Ascq, France.
Philipp Selenko, Department of NMR-Supported Structural Biology, Leibniz Institute of Molecular Pharmacology (FMP Berlin), In-cell NMR Group, Robert-Roessle Strasse 10, 13125 Berlin, Germany.