Understanding signaling and other complex biological processes requires elucidating the critical roles of intrinsically disordered proteins and regions (IDPs/IDRs), which represent ~30% of the proteome and enable unique regulatory mechanisms. In this review we describe the structural heterogeneity of disordered proteins that underpins these mechanisms and the latest progress in obtaining structural descriptions of ensembles of disordered proteins that are needed for linking structure and dynamics to function. We describe the diverse interactions of IDPs that can have unusual characteristics such as “ultrasensitivity” and “regulated folding and unfolding”. We also summarize the mounting data showing that large-scale assembly and protein phase separation occur within a variety of signaling complexes and cellular structures. In addition, we discuss efforts to therapeutically target disordered proteins with small molecules. Overall, we interpret the remodeling of disordered state ensembles due to binding and post-translational modifications within an expanded framework for allostery that provides significant insights into how disordered proteins transmit biological information.
Eukaryotic cells are complex and sensitive machines through which, in response to external stimuli, information flows down signaling pathways to activate various sensor and effector mechanisms. All of these mechanisms are controlled by tight regulation of gene products from transcription to protein degradation and require protein flexibility. Proteins are inherently dynamic and sample several different conformations; however, the degree of flexibility and the timescale of protein motions vary significantly. Fluctuations around the lowest energy state (fast-timescale dynamics) resulting in a large ensemble of structurally similar conformations are widely accepted as crucial elements in molecular recognition1. Protein dynamics occurring on a slower timescale, involving large-scale motions between a relatively small number of conformational states, can also be found, such as in proteins with flexible linkers or hinges connecting domains2. An extreme of protein dynamics is represented by intrinsically disordered proteins (IDPs), which sample highly heterogeneous conformations that interconvert rapidly3–4. In the past decade it has become clear that understanding complex cellular processes and their malfunctioning in disease requires recognizing the role of intrinsically disordered regions (IDRs) and IDPs. Although there has been an explosion in our general knowledge of the molecular mechanisms of signal transmission from cell surface receptors to nuclear transcription factors, we have only just begun to explore the diversity of roles that IDRs play in signaling (Figure 1). This is partly due to challenges in describing the many millions of conformers IDRs and IDPs can sample and in linking functional consequences to these different conformational states.
Another reason is that IDRs/IDPs often have unusual binding modes in their interactions with other proteins or nucleic acids that are hard to characterize by conventional methods and defy conventional assumptions regarding protein interactions. Thus, a significant shift in perspective is necessary in order to understand structure-function relationships involving IDRs and IDPs in protein:protein or protein:DNA/RNA interactions.
In this review we will guide the readers through current challenges of conformational ensemble calculations of IDPs and what can be learned from the structural descriptions of these proteins. We highlight some non-conventional binding characteristics of IDRs/IDPs that are critical for signal propagation and regulation of signal processes. IDPs/IDRs are frequently involved in dynamic interactions in signaling networks where fine-tuning of the cascade of interactions is exerted through multisite dependence5–6. These multisite interactions can serve to integrate different signals and often lead to ultrasensitivity or cooperativity. For example, post-translational modifications such as phosphorylation at multiple sites can lead to an “ultrasensitive” on/off switch as described for the Cdc4:pSic1 interaction7–9 or can induce a folding transition in a disordered protein such as 4E-BP2 to act as a regulatory switch10. On the other hand, order-to-disorder transitions such as partial unfolding of BCL-xL by PUMA can also serve as a regulatory mechanism, in this case triggering apoptosis11. In the context of apoptotic regulation, the recent report of pro-apoptotic activation of the ‘effector’ protein BAX by cytosolic p53 through cis-trans isomerization of a proline residue within the disordered N-terminus of p53 further highlights a signaling event that is mediated by a relay of discrete conformational transitions between interacting proteins12. Interestingly, it has become evident in the last few years that protein phase separation occurs within a variety of signaling complexes and subcellular structures, although understanding of the underlying mechanisms is far from complete. The appreciation of the role of disordered regions in mediating allosteric coupling is relatively new, but is not surprising given that allostery relies on conformational and energetic equilibria13–14.
At the end of this review we describe progress in targeting disordered protein regions by small molecules, including those with therapeutic potential.
Flexibility is necessary for most protein-protein interactions. Globular proteins possess some degree of flexibility that facilitates responses to binding. They often contain loop or domain linker regions that provide adaptability in their interactions with partners or substrates. More fundamentally, globular proteins sample conformations other than the most populated one to varying extents, and these higher energy 'excited states' have been proposed to play important functional roles in molecular recognition15–16, enzyme catalysis17–18 and protein folding19–20. However, the structural ensembles that globular proteins sample are limited compared to those sampled by IDPs or IDRs, which can adopt a continuum of highly heterogeneous interconverting conformations. Accordingly, the energy landscapes of globular proteins and IDPs/IDRs exhibit different features. The energy landscapes of globular proteins are funnel shaped, with the bottom of the funnel representing the folded state, but the landscapes are not smooth and there are some local minima that can trap the proteins in higher energy conformations21–23. The energy landscapes of IDPs/IDRs are, in contrast, rather flat, with no single global energy minimum but many local minima that are not separated by large energy gaps, facilitating sampling of different structural states24. The landscapes are not completely featureless, however, as disordered proteins transiently sample secondary structure and tertiary contacts with a range of preference and populations25–28.
Despite the increasing number of structural characterizations of disordered proteins, our knowledge of their structural diversity lags far behind what we know about the structures of folded proteins. Thus, it is not surprising that detailed structural characterization of disordered proteins in complex with their binding partners is even more limited, usually because of the inability to crystallize complexes and the challenges associated with detailed characterization by nuclear magnetic resonance (NMR) spectroscopy. However, in some cases, when disordered segments become ordered upon binding (i.e. a coupled folding and binding)29–31, these complexes can be studied by traditional structure determination methods. Though the Protein Data Bank (PDB) contains many fewer examples of complexes involving a disordered protein than those involving only folded proteins, currently, thousands of the structures of complexes deposited in the PDB contain proteins that are disordered in their free states32. Although these ordered structures can be extremely valuable for understanding structure-function relationship in IDRs, they do not provide information on the full structural continuum that IDPs can access in their unbound states or within the many cases of dynamic or “fuzzy” complexes in which the IDP retains significant disorder33–34.
Some IDPs are random coil-like or statistical coil-like flexible polypeptide chains, with fully extended structures that may arise due to the charge content or net charge per residue35–38. It was shown for a series of protamine sequences that there is a globule-to-coil transition as the net charge per residue increases, and the increase in the radius of gyration (Rg) can be rationalized as a consequence of electrostatic repulsion and the favorable solvation of the arginine side chains39. Similarly, the hydrodynamic radius (Rh) for a number of IDPs was shown to correlate with net charge and proline content, with hydrophobic residues having little effect on IDP compaction40. Many IDPs contain transient secondary structure and/or tertiary contacts, which is not surprising, as water is, in general, a poor solvent for a polypeptide chain. IDPs enriched in negatively charged residues41 and positively charged residues39 can also show preference for collapsed states on the basis of their net charge per residue42–43, and their conformations are further influenced by the linear sequence distribution of oppositely charged residues43. The net charge per residue is calculated as |f+ − f−|, where f+ and f− refer to the fractions of positively and negatively charged residues in the sequence, respectively. If |f+ − f−| < 0.2 the proteins are most likely to be “globule formers” that exhibit strong self-interactions and are compact; if |f+ − f−| > 0.2 they are less compact, with the largest values leading to non-globular expanded coils.
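As a rough illustration of the net-charge criterion described above, the following Python sketch computes f+, f− and |f+ − f−| for a sequence and applies the 0.2 cutoff quoted in the text. Several choices here are assumptions made only for illustration: Lys and Arg are counted as positive, Asp and Glu as negative, His is treated as neutral, a value of exactly 0.2 is assigned to the less compact class, and the function names and example sequences are hypothetical.

```python
POSITIVE = set("KR")  # assumption: His is treated as neutral
NEGATIVE = set("DE")


def charge_fractions(sequence):
    """Return (f_plus, f_minus) for a one-letter amino acid sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    f_plus = sum(res in POSITIVE for res in seq) / n
    f_minus = sum(res in NEGATIVE for res in seq) / n
    return f_plus, f_minus


def net_charge_per_residue(sequence):
    """Return |f+ - f-| as defined in the text."""
    f_plus, f_minus = charge_fractions(sequence)
    return abs(f_plus - f_minus)


def classify(sequence, cutoff=0.2):
    """Apply the cutoff from the text; exactly `cutoff` counts as less compact."""
    if net_charge_per_residue(sequence) < cutoff:
        return "globule former"
    return "less compact"


# Hypothetical toy sequences, not taken from any cited study.
arginine_rich = "RRRRGRRRRG"   # f+ = 0.8, f- = 0.0
polyampholyte = "KEKEKEKEKE"   # f+ = 0.5, f- = 0.5
print(classify(arginine_rich), classify(polyampholyte))
```

Note that the balanced polyampholyte has a net charge per residue of zero even though half of its residues are charged, which is why the net charge alone does not capture the patterning of oppositely charged residues along the sequence.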
In many cases transient secondary structure and tertiary contacts in IDPs are evident, based on data from NMR experiments. NMR spectroscopy is a powerful tool to investigate the structural and dynamic properties of IDPs. Chemical shifts are known to be reliable reporters of backbone conformation, and the deviation from the random coil values is often used to quantify the fractional population of α- or β-structure as a function of residue position, for example by the programs SSP (secondary structural propensity)44 and δ2D45. Relaxation data are sensitive to dynamic properties, with heteronuclear NOE values reporting on local motions. For moderate magnetic field strengths, highly disordered proteins have fairly small positive to significantly negative heteronuclear NOE values, while proteins with more significant sampling of secondary structure and tertiary contacts have higher positive values. (Note that with higher field strengths used for enhanced resolution in studies of IDPs, even more disordered proteins have positive values.46) NMR pulsed field gradient experiments can also be used to estimate Rh values to characterize compaction, and a variety of other NMR data report on tertiary structure and solvent accessibility (see section 2.2)47.
The cyclin-dependent kinase inhibitor, Sic17, and the CFTR regulatory (R) region48 both have sharp resonances and limited proton dispersion in their 1H-15N HSQC correlation spectra, which are indicative of largely disordered proteins. However, SSP analysis shows the presence of some segments with 20% (in Sic1) or 30% (in R region) fractional population of helical conformations. Some regions of eIF4E binding protein 2 (4E-BP2) show even larger populations of helical conformations by SSP analysis (up to 37%) and the heteronuclear NOE values are positive and high (up to 0.7), suggesting that 4E-BP2 has restricted motion due to significant fluctuating secondary and tertiary structure28. In contrast, heteronuclear NOE values for Sic1 are predominantly negative7, indicating more conformational sampling and fewer contacts. Some IDPs contain almost fully formed secondary structural elements, with the rest of the protein still flexible. In the protein phosphatase 1 regulator, I-2, a ~70% populated α-helix was observed25, similar to a region of c-myb in which the population of α-helix was estimated to be ~70%49. Broad NMR resonances for IDPs, for short stretches of the protein sequence48 or even a majority of the protein50, have been interpreted as transient accessing of more ordered states in the intermediate timescale exchange regime, indicative of sampling of lower energy conformations that lead to slower rates of interconversion.
Some IDPs/IDRs also exhibit quite compact features, such as the monomeric polyglutamines51 or the yeast prion protein52–54. Secondary structure formation and hydrogen bonding play important roles in the collapsed states of unfolded proteins55–56, although they are unlikely to be the only contributors. The observation that electrostatic attractions between opposite charges on protein surfaces may stabilize globular structures57 suggests a significant contribution of electrostatic effects to globule stability. Indeed, as expected, strong polyampholytes where f+ and f− are large and approximately equal show significantly collapsed states39.
The unique sequence and compositional preferences afford IDPs the advantage of realizing a continuum of conformational states and transitions that serve as the biophysical basis for their diverse functions. Hence there is a clear need to both characterize the structural ensembles that describe these interconverting conformations, as well as their complexes with partners, and to understand the functional relevance of these structural states.
Description of the conformational properties of disordered proteins has been a subject of debate for a long time, with significant insights from the field of polymer chemistry58. Key questions in this debate have been whether disordered proteins i) can be described by a random coil model in which the distributions of the dihedral angles for a given residue are independent of the dihedral angles of all other residues and ii) obey a power law relating their polymer chain length and the ensemble average radius of gyration (Rg)59–60. To answer these questions, some studies used chemically denatured states, which can be probed experimentally more easily, and the molecular dimensions of these states appear to be consistent with those expected for random coils in some cases61–62. On the other hand, there are several studies that demonstrate the presence of specific local, native-like conformations under different conditions63–64. Discrete molecular dynamics (DMD) simulation studies (that use discretized energy potentials and fast event-sorting techniques to speed up MD simulation) of thermally denatured states attempted to reconcile these seemingly contradictory properties of denatured proteins. The results suggested that denatured proteins follow random-coil size scaling but also preserve residual conformations biased toward native-like structures65. Similarly, Millet and coworkers also suggested that the overall dimensions of the denatured proteins are likely to be insensitive to local conformational elements66. However, it is more difficult to reconcile a random coil model with the growing experimental evidence, especially that provided by paramagnetic relaxation enhancement (PRE) experiments, suggesting long-range tertiary contacts in disordered proteins67–69.
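The power-law relationship in question ii) is commonly written as Rg = R0·N^ν, where N is the number of residues. A minimal sketch of how the exponent ν could be estimated from a set of (N, Rg) pairs by a log-log linear fit is given below. The data are synthetic, generated with R0 = 2.0 Å and ν = 0.6 purely for illustration (these numbers are assumptions, not values from the studies cited here), and the function name is hypothetical.

```python
import math


def fit_power_law(lengths, rg_values):
    """Least-squares fit of ln(Rg) = ln(R0) + nu * ln(N); returns (R0, nu)."""
    if len(lengths) != len(rg_values) or len(lengths) < 2:
        raise ValueError("need at least two (N, Rg) pairs of equal length")
    xs = [math.log(n) for n in lengths]
    ys = [math.log(rg) for rg in rg_values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    nu = sxy / sxx                      # slope of the log-log line
    r0 = math.exp(y_mean - nu * x_mean)  # prefactor from the intercept
    return r0, nu


# Synthetic, noise-free data (assumed R0 = 2.0 Angstrom, nu = 0.6).
lengths = [50, 100, 200, 400]
rg_values = [2.0 * n ** 0.6 for n in lengths]

r0, nu = fit_power_law(lengths, rg_values)
print(f"R0 = {r0:.3f} A, nu = {nu:.3f}")
```

In polymer theory an exponent near 0.6 corresponds to an excluded-volume chain in a good solvent and an exponent near 1/3 to a compact globule, so the fitted ν gives a coarse readout of where a set of proteins falls between these limits. As the text notes, agreement with random-coil scaling does not by itself rule out residual local structure.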
The shallow energy landscape of disordered proteins makes it challenging to fully appreciate the potential difference in structural properties of IDPs/IDRs in isolation in dilute buffer compared to those that may be found in the cell, similarly to comparisons between denatured states of folded proteins and their physiologically relevant unfolded states70. The use of molecular dynamics (MD) simulations to test various conditions (crowding, salt, ligands) is hampered by the known inaccuracies of force fields, particularly for IDRs/IDPs. However, given that many biologically relevant disordered protein interactions are recapitulated in dilute buffer, ensemble calculations of structural properties based on experimental data determined under these conditions should provide important insights. There are a number of methods that have been utilized to calculate the ensembles of highly heterogeneous conformations within disordered proteins.
A simple Monte Carlo simulation was used early on to generate ensembles of unfolded conformations that were restricted only by steric repulsion either between adjacent residues or between all possible atom pairs71. The simulation resulted in conformers whose dimensions are in good agreement with the experimental data and can be accurately predicted by a random coil model. Other methods described below share the fundamental principle of calculating an ensemble of structures whose properties are collectively consistent with a range of experimental measurements. One approach incorporates experimental restraints derived from NMR measurements into the energy function used in restrained molecular dynamics (MD) or Monte Carlo simulations to direct conformational sampling so that the final ensemble is in good agreement with the experimental data72–73. In this way the conformational space accessible to rather complex proteins can be characterized. The experimental data most often used in restrained MD approaches are the average distances derived from paramagnetic relaxation enhancement (PRE) experiments68–69,74–77. In these cases, the ensemble-averaged distances for each pair of residues in the simulated, parallel replicas are calculated and compared to the PRE-derived experimental restraints. As the simulation proceeds, each copy may diverge and explore different regions of conformational space as long as the ensemble average does not violate the restraints. Paramagnetic effects are particularly powerful in the case of disordered and partially disordered proteins, since the interactions are sufficiently strong to allow the identification of fluctuating, weakly populated tertiary contacts. However, interpretation of the ensemble-averaged distances obtained from PRE experiments is not straightforward due to complications from separating the timescales of conformational exchange within disordered states, overall correlation time and relaxation times78.
These complications, together with the r⁻⁶ distance dependence of the PRE, which can overemphasize contributions from low-populated states with close contacts, may lead to inaccurate structural restraints; PRE-derived distances are therefore generally used in a more qualitative manner.
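The effect of the inverse sixth power distance dependence can be seen in a small numerical sketch. The code below compares the population-weighted linear mean distance with the effective distance obtained from the weighted average of the inverse sixth power of the distance, for a hypothetical two-state ensemble. It assumes simple fast-exchange averaging and ignores the timescale complications mentioned above; the distances, populations and function names are illustrative assumptions only.

```python
def linear_mean_distance(distances, weights):
    """Population-weighted arithmetic mean of the distances."""
    total = sum(weights)
    return sum(w * r for r, w in zip(distances, weights)) / total


def r6_averaged_distance(distances, weights):
    """Effective distance <r^-6>^(-1/6), as probed by an r^-6 weighted observable."""
    if any(r <= 0 for r in distances):
        raise ValueError("distances must be positive")
    total = sum(weights)
    mean_inv_r6 = sum(w * r ** -6 for r, w in zip(distances, weights)) / total
    return mean_inv_r6 ** (-1.0 / 6.0)


# Hypothetical ensemble: 95% extended (30 Angstrom), 5% close contact (5 Angstrom).
distances = [30.0, 5.0]
populations = [0.95, 0.05]

r_lin = linear_mean_distance(distances, populations)   # 28.75 Angstrom
r_eff = r6_averaged_distance(distances, populations)   # roughly 8.2 Angstrom
print(f"linear mean = {r_lin:.2f} A, r^-6 average = {r_eff:.2f} A")
```

In this toy case a contact populated only 5% of the time pulls the apparent distance from about 29 Å down to about 8 Å, which illustrates why a single PRE-derived distance is a poor description of a heterogeneous ensemble.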
Other data used include residual dipolar couplings (RDC)79 and intensities of the cross peaks in 15N-1H heteronuclear single quantum correlation (HSQC) NMR spectra73, which were demonstrated to be useful in restrained simulations. Equilibrium amide hydrogen-exchange measurement is also a powerful tool for investigating protein dynamics, and the experimental protection (P) factors can be utilized to bias the conformational sampling to determine structures that are required for the measured exchange80. Alternatively, the protection factors can be implemented as experimental constraints in DMD simulations81. Disordered proteins, however, are generally not significantly protected from amide proton exchange with solvent, minimizing the impact of these data. Similarly, the order parameter S², which describes the amount of local mobility (S² = 1 for no local motion and S² = 0 for completely unrestricted local motion of the NH vectors), can also be used as a restraint in MD simulations82–83, although interpretation of S² for disordered states is challenging due to the lack of separation of the timescales of internal motion and overall tumbling. The Sample and Select (SAS) method, for example, employs MD simulations (or other sampling methods) to determine conformational ensembles consistent with NMR-derived order parameters83. However, in contrast to the restrained-simulation strategy mentioned above74, in the SAS method the conformational sampling is completely decoupled from the conformer selection. Therefore, different sampling methods can be incorporated and tested to yield ensembles that are more consistent with the experimental data.
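To make the two limits of the order parameter concrete, the sketch below evaluates S² for a set of NH bond vectors using the standard equilibrium expression S² = (3/2) Σij ⟨ui uj⟩² − 1/2, where u is the unit bond vector and i, j run over x, y, z. It assumes that all vectors are expressed in a common molecular frame, which is exactly the assumption that becomes problematic for disordered states, and it is not presented as the procedure used in the cited restrained-MD or SAS studies. The function name and test vectors are hypothetical.

```python
import math


def order_parameter(vectors):
    """S2 = 1.5 * sum_ij <u_i u_j>^2 - 0.5 for a set of bond vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    units = []
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        if norm == 0.0:
            raise ValueError("zero-length vector")
        units.append((x / norm, y / norm, z / norm))
    n = len(units)
    total = 0.0
    for i in range(3):
        for j in range(3):
            # Ensemble average of the product of Cartesian components i and j.
            avg = sum(u[i] * u[j] for u in units) / n
            total += avg * avg
    return 1.5 * total - 0.5


# Rigid limit: every conformer has the same NH orientation.
rigid = [(0.0, 0.0, 1.0)] * 10
# Unrestricted limit, approximated by orientations spread evenly over the axes.
isotropic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

print(order_parameter(rigid), order_parameter(isotropic))
```

The rigid set gives S² = 1 and the evenly spread set gives S² = 0 (to within rounding), matching the two limits quoted in the text.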
Another approach, ENSEMBLE84–85, uses a Monte Carlo algorithm to select from a pre-generated conformer pool an ensemble of predetermined size that best fits the experimental restraints. An advantage of the ENSEMBLE approach is that it easily incorporates many different types of data including NMR-derived chemical shifts, PREs, RDCs, 1H-1H NOEs (nuclear Overhauser effects due to close proton-proton distances), R2 relaxation rates (correlated to numbers of atomic contacts), O2 paramagnetic shifts and amide hydrogen exchange rates (both for solvent accessibility); hydrodynamic radii (Rh) from NMR, dynamic light scattering or size exclusion chromatography data; and small-angle X-ray scattering (SAXS) data containing information about the distribution of heavy atom distances within the ensemble. The conformers can be generated by the program TraDES (trajectory directed ensemble sampling)86, by MD simulations or other strategies, and values for each of the data types are calculated for each conformer, using ShiftX87 for chemical shifts, LocalAlign88 for RDCs, HYDROPRO89 for Rh, CRYSOL90 for SAXS and internal algorithms or user-defined programs for other data. Particular mixes of conformers are chosen to optimize the fit between the calculated ensemble-averaged properties and the experimental data. The method has developed over the past years with improved conformational sampling, using conformers having both random structures and structures that are biased to contain secondary structural elements, and incorporating more types of experimental data91–92. If sufficient data are incorporated, this approach is able to accurately describe secondary structure, molecular size distribution and tertiary contacts of disordered proteins, with tertiary structure highly dependent on the number of distance restraints used93.
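The general idea of selecting a fixed-size ensemble from a pre-generated pool can be sketched in a few lines. The example below is a toy Monte Carlo selection against a single scalar observable (a hypothetical experimental Rh) and is not the ENSEMBLE implementation, which fits many data types simultaneously. The pool values, target, ensemble size, temperature and function names are all assumptions made for illustration.

```python
import math
import random


def ensemble_error(pool_values, members, target):
    """Squared deviation of the ensemble-averaged observable from the target."""
    mean = sum(pool_values[i] for i in members) / len(members)
    return (mean - target) ** 2


def select_ensemble(pool_values, size, target, steps=5000, temperature=0.01, seed=0):
    """Metropolis swap search for `size` conformers whose average matches `target`."""
    if not 0 < size < len(pool_values):
        raise ValueError("size must be between 1 and len(pool_values) - 1")
    rng = random.Random(seed)
    members = rng.sample(range(len(pool_values)), size)
    chosen = set(members)
    outside = [i for i in range(len(pool_values)) if i not in chosen]
    error = ensemble_error(pool_values, members, target)
    best_members, best_error = list(members), error
    for _ in range(steps):
        a = rng.randrange(size)
        b = rng.randrange(len(outside))
        # Trial move: swap one selected conformer with one unselected conformer.
        members[a], outside[b] = outside[b], members[a]
        new_error = ensemble_error(pool_values, members, target)
        accept = new_error <= error or rng.random() < math.exp((error - new_error) / temperature)
        if accept:
            error = new_error
            if error < best_error:
                best_members, best_error = list(members), error
        else:
            members[a], outside[b] = outside[b], members[a]  # undo the swap
    return best_members, best_error


# Synthetic pool: per-conformer Rh values (Angstrom) drawn at random.
pool_rng = random.Random(1)
pool_rh = [pool_rng.uniform(15.0, 45.0) for _ in range(500)]
target_rh = 25.0  # hypothetical experimental value

best, best_err = select_ensemble(pool_rh, size=20, target=target_rh)
best_mean = sum(pool_rh[i] for i in best) / len(best)
print(f"selected {len(best)} conformers, mean Rh = {best_mean:.2f} A")
```

Because many different subsets of the pool reproduce the same average, a search of this kind returns one of many equally good solutions. Adding more, and more diverse, restraints narrows the set of acceptable ensembles.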
The ENSEMBLE method was successfully applied to several intrinsically disordered proteins to describe the conformational heterogeneity of their disordered states25–26,91,94. The structural characterizations of the drkN SH3 domain unfolded state, a disordered state used in methodological development, revealed an overall compact ensemble with both native-like and non-native contacts91. The comprehensive structural analysis of three intrinsically disordered protein phosphatase 1 (PP1) regulators, I-2, spinophilin and DARPP-32, using ENSEMBLE revealed the structural diversity of these functionally related IDPs in their unbound states, which all contain preformed, transient structures that are likely important for the interaction with PP125.
ENSEMBLE was also used to calculate structural models of the IDPs Sic1 and I-2 in complex with their partners25–26 and of the protein MYPT1 containing both folded and disordered regions94. Both Sic1 and phosphorylated Sic1 (pSic1) ensembles differ from random coil ensembles and contain significant transient structures with a slight enhancement of charged residue contacts in non-phosphorylated Sic126. The hydrodynamic properties of the individual conformers in the ensembles of both phosphorylation states vary widely. The significant population of compact conformers may facilitate binding to the partner protein, the Cdc4 substrate-binding subunit of a ubiquitin ligase, due to the contribution that unbound phosphates provide to the binding affinity via long-range electrostatic interactions95 (see section 5.2 for a more detailed description of the complex). The ensemble model of the dynamic complex of Cdc4 and pSic1 provides valuable insights into the spatial arrangement of a ubiquitin ligase and potential effects on ubiquitination26. The ensemble model of the dynamic complex of I-2 with its partner, protein phosphatase 1, shed light on the importance of dynamic regions of I-2 that are not observed in the X-ray structure25. I-2 remains largely disordered even in the bound state, but the most remarkable features of the complex are the heterogeneous conformations of a long loop region in I-2 containing a phosphorylation site essential for its biological function. The multiple, heterogeneous conformations of this loop region enable accessibility to kinases to inactivate PP1 and most likely facilitate interaction of the I-2:PP1 complex with other proteins. MYPT1 is also a PP1 regulator similar to I-2 but, in contrast with I-2, it is not completely disordered. MYPT1 has a folded ankyrin repeat domain and an N-terminal disordered region; however, residues in the disordered region are more conformationally restricted than in I-294.
The ensemble calculation for MYPT1 revealed a partially populated (~25%) α-helix in the N-terminal disordered region that becomes 100% populated in the MYPT1-PP1 holoenzyme structure. The preformed structural element likely contributes to the specificity of its interaction with PP1 by decreasing the entropic penalty of the binding. The calculated ensembles in all of these cases provide valuable insights into the function of the IDPs and the structural properties of their dynamic complexes.
Approaches using a statistical coil model that is based on a coil library subset of the PDB have been shown to fit experimental measurements for some disordered proteins quite well96–97. The algorithm Flexible-Meccano creates a large pool of statistical coil conformers by randomly sampling amino-acid specific backbone dihedral angle properties98–99. Then, for each conformer in the pool, NMR parameters such as chemical shifts, RDCs and PREs are calculated and these parameters can be compared with experimental measurements. The algorithm ASTEROIDS (a selection tool for ensemble representations of intrinsically disordered states) selects representative ensembles of a given IDP in agreement with experimental NMR data from the pool of statistical coil conformers100–101. The method was used to generate an ensemble of urea-denatured ubiquitin101–102 and ensembles of the intrinsically disordered α-synuclein and tau proteins100,103–104 in which networks of transient long-range interactions modulating aggregation were identified. Recently, this method was used to characterize the structural ensemble of the MAPK kinase 7 (MKK7) to obtain detailed information on the conformational sampling of its disordered regulatory domain105. The disordered regulatory domain of MKK7 contains three putative docking sites (D1-3) for the c-Jun N-terminal kinase (JNK), which adopt different conformations, helical, random coil and extended (PPII), in the unbound state. The different intrinsic conformational propensities of the docking sites suggest that there is no need for preformed structural elements sampling the bound state conformations for the JNK binding105.
Recently, several advanced computational tools were developed to characterize IDP ensembles using SAXS data. Approaches such as ENSEMBLE and ASTEROIDS can incorporate SAXS data along with other, primarily NMR, data93,106. The most commonly used method developed for structural characterization of flexible proteins with only SAXS data is the Ensemble Optimization Method (EOM), which selects a set of conformations from a previously generated conformer pool that fits the experimental profile using distinct optimization methods107–108. The final ensemble generated by EOM is relatively small, containing only 10–50 conformers. The method was successfully used to describe the structural characteristics of the N-terminal region of vesicular stomatitis virus phosphoprotein109 and the high-mobility-group protein HMGB1110. Because of the low resolution of SAXS, EOM provides only distributions of size and shape properties of disordered conformers that are consistent with the experimental data. It was also shown, however, that these distributions are not unique in many cases and that using polymer physics models can enable a better description of the information content present within the SAXS data for disordered proteins111.
The Bayesian Weighting (BW) algorithm has also been applied to construct IDP ensembles112–113. In this approach, an extensive replica exchange MD simulation is used to generate an initial pool of conformers. Since it is not feasible to assign weights to a large number of conformers in the BW method, the starting pool is reduced to typically 300–600 structures in a pruning step. This step selects low-energy structures assumed to be representative of the structural diversity in the original set. Then, for each structure in this new set, Bayesian weights are assigned on the basis of experimental NMR data, such as chemical shifts and RDCs. This approach also allows calculation of the accuracy of a given ensemble by obtaining a probability distribution for the population weight of each conformation in the ensemble112. Several IDPs such as tau112, Aβ40/42114 and α-synuclein115–117 have been characterized using this Bayesian approach. The analysis of the most probable conformers generated by the BW algorithm for the K18 isoform of tau protein revealed that mutations may alter the aggregation propensity of tau by affecting a network of long-range interactions112. The presence of several long-range interactions revealed by the BW strategy is in good agreement with previous findings suggesting that the dimensions of tau are smaller than the theoretical value for a random coil118. The amyloid beta (Aβ) peptides Aβ40 and Aβ42 share a high degree of sequence similarity, but Aβ42 has a higher propensity for forming aggregates119. Comparison of BW ensembles for Aβ40 and Aβ42 revealed that the probability of sampling soluble β-rich structures that may represent prefibrillar intermediates is much greater for Aβ42 than for Aβ40114 and underlines the importance of comparative analyses of disordered protein ensembles that may reveal the effects of mutations on function. The BW algorithm was also used to study a full-length IDP, the 140-residue α-synuclein, using a peptide fragment-based approach115.
The study provides a comprehensive analysis of the secondary and tertiary structures in α-synuclein and their importance in lipid-association and aggregation.
XPLOR-NIH, a program used to determine folded protein structures120, can also calculate conformational ensembles of short IDRs. This strategy was used recently to determine structural ensembles of phosphorylated tau peptides using NOESY-derived distance constraints, RDCs and chemical shifts as restraints, and revealed phosphorylation-induced structural changes in tau and their implications in microtubule assembly121.
Despite the increasing number of approaches, constructing an accurate model for a disordered protein is still a challenging task. This is due, in part, to the averaging of structural data within the dynamic disordered state and to the fact that the set of ensembles that agree with the experimental observations is highly degenerate, i.e. there are multiple structurally distinct ensembles that can reproduce the experimental data within the error of the current methods. To overcome the problem of degeneracy, one can aim to find the simplest ensemble that reproduces a given set of experimental measurements92, generate several ensembles and then analyze them for similarity122 or apply a statistical algorithm112. Increasing the types and amount of data93 as well as improving the accuracy and precision of the computational tools used to calculate observables from conformers123 are also needed to address the challenges. Nevertheless, these ensemble-based methods represent increasingly focused efforts to describe the diverse conformational space that IDPs/IDRs sample and to enable insights into the relationships between structure, dynamics and function, including binding, for disordered proteins. A database (pE-DB) was created to deposit available structural ensembles to help better understand the functional consequences of structural diversity of IDPs, as well as to aid in the development of approaches to calculate these ensembles124.
Post-translational modifications (PTMs) are extremely important regulatory mechanisms in eukaryotic cells, especially for proteins involved in molecular recognition. There are over 200 known protein covalent modifications125, with phosphorylation, acetylation, glycosylation, methylation and ubiquitination being the most common and most studied. Sites of protein modification are preferentially found in regions of disorder126–127; although there are some PTM sites identified in ordered regions, they often undergo an order-to-disorder transition concurrent with modification127. PTMs that are involved in signaling interactions and regulation processes are most often found within IDRs. Modifications of IDPs usually occur on multiple sites, adding an additional layer to the complexity of signaling regulation. It has been recently suggested that the many different PTMs in the eukaryotic arsenal increase the number of interaction motifs within IDPs to approximately a million in the human proteome128. Multiple modifications can occur sequentially or combinatorially, generating ultrasensitive (threshold) or rheostat responses95,129–132, which result in tight regulation of signaling processes.
At least one-third of all eukaryotic proteins are estimated to undergo reversible phosphorylation133. The extent to which protein phosphorylation participates in signaling is truly remarkable. Almost every known signaling pathway eventually impinges on a protein kinase, protein phosphatase134 or both. The investigation of more than 1500 experimentally determined phosphorylation sites in eukaryotic proteins led to the discovery that intrinsic disorder in and around the phosphorylation site is a common feature135–136. The percentage of serine phosphorylation sites predicted by the disorder-enhanced-phosphorylation predictor (DISPHOS) for regulatory and cancer-associated proteins is remarkably high (57.4% and 40.6%, respectively) compared to proteins involved in biosynthesis and metabolism (8.7% and 5.9%, respectively)135, which supports the hypothesis that regulatory and signaling proteins undergo more frequent phosphorylation/dephosphorylation than proteins with catalytic functions.
The structural data available for protein kinases with peptide substrates or inhibitors revealed one reason that intrinsic disorder around the phosphorylation site is critical. The bound substrate or inhibitor peptides have essentially no intra-chain backbone hydrogen bonding while having extensive hydrogen-bonding with the kinase partners137–141. This hydrogen bond formation requires that the substrates have available backbone hydrogen bonding potential, thus they must be disordered prior to association with kinase. It is possible that a similar argument could be made for phosphatases, and certainly all modifying enzymes require accessibility of the peptide chain for efficient modification, providing another critical explanation for the significant enrichment of modification sites in disordered regions.
Recently, many studies on histone modifications such as methylation and acetylation have emerged142–144. In fact, lysine acetylation has been found to be a widespread PTM, contributing to regulation of almost all nuclear functions and to the control of many cytoplasmic functions as well145. Although protein methylation was discovered almost 50 years ago146 and it is clearly involved in the regulation of transcription, replication and other nuclear processes, the implications of methylation in different cellular processes are not fully understood. It has been reported that acetylation and methylation in histone tails occur on lysine residues in IDRs147 and systematic investigations of different PTMs suggest that methylation and acetylation, similar to phosphorylation, are located within intrinsically disordered regions127,148–150. Acetylation and methylation both affect the size and hydrophobicity of the modified residue, with acetylation but not methylation changing the charge. Hydrophobicity may be particularly important because hydrophobic amino acids are much less abundant in disordered regions. Different structural studies provide evidence that these modifications contribute significantly to disorder-to-order transitions that, in turn, can lead to high specificity, low affinity interactions with partners151–152. Acetylation and methylation create unique amino acids with unusual properties, which can influence their ability to participate in specific protein-protein interactions.
Glycosylation is a site-specific enzymatic process involving several specific enzymes. The two major types of protein glycosylation in eukaryotes are N-linked glycosylation on asparagines and O-linked glycosylation on hydroxylysines, hydroxyprolines, serines or threonines153. While a general role of glycosylation is enhanced solubility and prevention of aggregation154, there are numerous different monosaccharides in glycopeptide linkages, and these different modifications result in different subcellular localizations and consequently different cellular functions. For example, the functions of two well-characterized O-glycosylations, the O-GalNAc and O-GlcNAc, are quite different. O-GalNAc glycosylation occurs on either extracellular or plasma membrane proteins and affects extracellular processes such as cell adhesion, immunological recognition and secretion155. In contrast, O-GlcNAc glycosylation is a reversible modification of cytoplasmic and nuclear proteins and can play a regulatory role in competition with phosphorylation in some proteins135,156–157. Three different bioinformatic analyses showed that glycosylation is strongly correlated with intrinsic disorder; in the case of O-glycosylation this preference is irrespective of the type of the glycosylation127,158–159, consistent with the view that, similar to kinases, enzymes adding O-linked glycosylations recognize accessible regions of proteins.
The largely disordered protein α-synuclein, linked genetically and neuropathologically to Parkinson's disease (PD), is O-glycosylated in the brain160. Though the biological and pathological roles of this modification in PD are not fully understood, glycosylation on the C-terminus of α-synuclein was suggested to affect the aggregation of the protein161. Interestingly, tau protein, implicated in Alzheimer’s disease, is normally not glycosylated, but it is modified with oligosaccharides under pathological conditions when tau is hyperphosphorylated162. It thus appears that glycosylation of tau is an early abnormality that can facilitate the subsequent abnormal phosphorylation of this protein162–164. While studies of post-translationally modified proteins are limited, their essential role in modulating structure, binding, subsequent modification and self-association is clear, pointing to a need for increased focus in this area.
Allostery is a protein regulatory process in which the effect of a ligand binding at one site is transmitted to another site, resulting in modified protein function. According to the original allostery concept, transmission of a signal between two non-overlapping sites occurs through concerted structural changes in oligomeric proteins165–166. This was later also recognized in monomeric proteins167–168. However, recent discoveries focusing on protein dynamics challenged the original description of allostery, centered on conformational rearrangements within structured proteins, and extended the concept of allosteric regulation to include disorder-to-order transitions and the thermodynamic coupling between the folding and unfolding of multiple protein domains that participate in allosteric relays169–170.
It has been found that the functions of transcription factors and other cell signaling proteins are frequently modulated by allostery and that these proteins possess a high degree of flexibility. The energetic properties of IDRs and their central role in mediating protein interactions facilitate allosteric effects of binding. As described earlier, the energy landscapes of IDPs lack a well-defined global minimum, but have many local minima, each able to be sampled by proteins in the conformational ensemble24. The features of this conformational ensemble are not uniform, as disordered proteins transiently sample conformations with different degrees of secondary structure and tertiary contacts, and this broad continuum of different structural states can serve as a molecular basis of allostery. The utility of disorder for allosteric regulation has become more apparent in the last few years, especially in light of the recent paradigm shift for the role of protein dynamics in allosteric modulation171. The classical view of allosteric coupling is that two sites are coupled through a network of structural interactions that extend throughout the protein and connect the two sites. Thus, it depends on a well-defined pathway of stable, folded structure connecting the two sites. However, this classical view has been supplanted by demonstrations of allostery via dynamic and energetic coupling172–174. The “new view” of allostery reinforces the role of protein dynamics, i.e. the fluctuations among many structural states in a dynamic equilibrium, according to the energetic states of the individual conformations171,175–176. Interaction with a partner or modifications such as PTMs on the protein remodel this energy landscape and shift the equilibrium to favor particular conformations that can enable downstream events, providing allosteric regulation without the need for discrete structural pathways.
In this case, the allosteric coupling between sites is a consequence of the intrinsic stabilities of the domains and the interactions between them. If the energy to break the interaction is unfavorable, for example with two complementary hydrophobic surfaces, stabilizing a binding site will also have the effect of stabilizing the other site simply because these states are more favorable. Conversely, if the energy to break the interaction is favorable, for example because interaction of the surfaces with solvent would be more favorable than the interaction with each other, stabilizing a binding site results in destabilizing the other site, leading to negative coupling13,170,177. Direct experimental evidence that allostery can be mediated by changes in protein dynamics was presented for the negatively cooperative binding of cAMP to the dimeric catabolite activator protein (CAP)178. In this case, cAMP binding to one subunit of CAP did not result in structural changes in the other subunit, but the motions of residues located in distant regions were clearly affected. Binding of the first molecule of cAMP enhances protein motions, while binding of the second cAMP molecule suppresses them, resulting in a large difference in conformational entropy that was found to be responsible for the negative cooperativity.
Theoretical work from Hilser et al.13 presented an ensemble allostery model (EAM) and demonstrated that site-to-site allosteric coupling is maximized when one or both of the coupled binding sites are found in disordered regions and if the binding is coupled to the folding of the molecule. According to this model the different regions (domains) exist in an ensemble of states that are redistributed upon binding and thus the properties of the ensemble change accordingly. This mechanism demonstrates that the ability to propagate the effects of binding is determined not necessarily by a mechanical pathway linking the two sites but by the energetic balance within the protein, i.e. which states are more stable and what ligands can bind to each state. Changes in stability in one domain (region) can be compensated by changes to another domain or by changes in the interactions between domains; thus the allosteric coupling is independent of a stable network of interactions.
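The energetic argument above can be made concrete with a small numerical sketch. The code below is an illustrative four-state, two-domain ensemble (each domain either folded or unfolded) and is not a reproduction of the published EAM: the function names, the energy values and the ligand weighting factor are all assumptions chosen for demonstration. A ligand is assumed to bind only when domain 1 is folded, and the quantity reported is the resulting change in the folded population of domain 2.

```python
import math

RT = 0.593  # kcal/mol, assumed temperature near 298 K


def folded_fraction_domain2(dG1, dG2, dg_int, ligand_factor=1.0):
    """Population of states in which domain 2 is folded.

    dG1, dG2      : unfolding free energies of domains 1 and 2 (kcal/mol);
                    negative values mean the isolated domain prefers to be unfolded.
    dg_int        : free energy cost of breaking the interdomain interaction;
                    positive means breaking it is unfavorable.
    ligand_factor : hypothetical weight (1 + Ka*[ligand]) applied to every
                    state in which domain 1 is folded and can bind ligand.
    """
    w_ff = ligand_factor * 1.0                              # both folded (reference)
    w_uf = math.exp(-(dG1 + dg_int) / RT)                   # domain 1 unfolded
    w_fu = ligand_factor * math.exp(-(dG2 + dg_int) / RT)   # domain 2 unfolded
    w_uu = math.exp(-(dG1 + dG2 + dg_int) / RT)             # both unfolded
    z = w_ff + w_uf + w_fu + w_uu                           # partition function
    return (w_ff + w_uf) / z


def coupling_response(dG1, dG2, dg_int, ligand_factor=100.0):
    """Change in the folded population of domain 2 when ligand binds domain 1."""
    bound = folded_fraction_domain2(dG1, dG2, dg_int, ligand_factor)
    free = folded_fraction_domain2(dG1, dG2, dg_int, 1.0)
    return bound - free


# Illustrative (assumed) energies in kcal/mol.
print("unfavorable to break interface:", round(coupling_response(-3.0, -1.0, 2.0), 3))
print("favorable to break interface:  ", round(coupling_response(1.0, 1.0, -2.0), 3))
print("no interdomain interaction:    ", round(coupling_response(-3.0, -1.0, 0.0), 3))
```

With these assumed numbers, the sign of the response follows the sign of the interaction term, and the response vanishes when the domains do not interact, which is the qualitative behavior described in the text. No structural pathway between the two sites is encoded anywhere in the model; the coupling arises only from the redistribution of state populations.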
Despite theoretical work and the logic of energetic arguments, experimental evidence demonstrating allosteric coupling between two domains displaying different degrees of disorder is just beginning to be described. The Wiskott-Aldrich syndrome protein (WASP) integrates multiple signals to regulate actin polymerization, and these input signals act synergistically and shift the pre-existing folding-unfolding equilibrium179. Binding of Cdc42 to WASP, together with phosphatidylinositol 4,5-bisphosphate binding and phosphorylation, shifts the equilibrium from the folded, autoinhibited form of WASP toward an unfolded state that can bind Arp2/3, which promotes actin polymerization. Another example of allosteric coupling involving disorder is the regulation of the Phd/doc toxin-antitoxin operon from bacteriophage P1180. The toxin Doc inhibits translation by blocking the ribosomal A site181, and its activity is controlled by the action of its antitoxin Phd. The N-terminal region of Phd exists in equilibrium between a DNA-binding-competent ordered state and a DNA-binding-incompetent, highly unstable, partially unfolded state. The equilibrium between these two states is influenced by its direct ligand, the operator site, and also by binding of the Doc corepressor. The binding of Doc to the disordered C-terminal region of Phd orders the N-terminal DNA-binding domain, illustrating allosteric coupling between highly disordered and partially unfolded domains. Thus the equilibrium is shifted to a more ordered conformation of the N-terminal domain of Phd, resulting in its binding to DNA and the repression of the transcription of the operon. The toxin Doc acts not just as corepressor but also as derepressor depending on the ratio between Phd and Doc, a phenomenon known as conditional cooperativity. In the absence of the toxin, the antitoxin is only a weak repressor and transcription of the operon occurs.
At toxin:antitoxin ratios below 1, a repressing complex with a high DNA-binding affinity is formed as the result of the structuring effect of Doc on Phd. At higher toxin:antitoxin ratios, derepression occurs through a switch from a low-affinity toxin-antitoxin interaction to a high-affinity interaction, which results in a complex with a different architecture that is unable to efficiently repress the operon. Here the intrinsic disorder is a key element of the regulatory process enabling the propagation of the signals between different regions and the formation of complexes with different composition.
An even more striking example shows that a disordered protein can not only regulate a particular signal by allosteric coupling but can also transform a positive effector into a negative one. The intrinsically disordered adenoviral protein E1A recruits numerous cellular regulatory proteins such as CBP/p300 and pRb, thus subverting signaling pathways in the infected cells182. CBP/p300 and pRb bind to largely non-overlapping regions of E1A to form binary complexes or a ternary complex. E1A acquires ordered structure in the binding regions when it is bound to its partner molecules. The binding of CBP/p300 and pRb to a long version of the E1A protein containing the N-terminus is positively coupled, that is, the binding of either CBP/p300 or pRb increases the probability that E1A binds the other183. Remarkably, the binding of CBP/p300 and pRb to the N-terminally truncated version of E1A is negatively coupled; the availability of the N-terminal region can therefore modulate the sign of the cooperativity. This kind of cooperativity switching is easier to understand if we consider that proteins, especially IDPs, exist as ensembles that are functionally pluripotent. Under one set of conditions the ensemble could be poised such that effector binding can cause activation, while under another set of conditions it can cause inhibition13. The switch in cooperativity can arise as a result of different types of perturbation such as binding to another molecule, post-translational modification or protein truncation that can redistribute the ensemble of conformations. In this way, the functional complexity of signaling networks can be amplified without increasing the number of proteins involved while maintaining maximum control over cellular homeostasis.
A recent study on the central channel of the Nuclear Pore Complex (NPC)184 further underscores allosteric regulation by IDP/IDRs as a common biophysical principle in macromolecular systems. Although the dynamic nature of the NPC central channel was demonstrated earlier185 and models emerged suggesting a dynamic equilibrium between two conformations, a dilated and a more constricted form of the NPC “midplane ring”186–187, the underlying allosteric coupling that regulates the channel gating was demonstrated only recently184. Quantitative analysis of the interactions of two nucleoporins, Nup58 and Nup54, and a transport factor, Kapβ1, by isothermal titration calorimetry (ITC) revealed that multivalent interactions of Kapβ1 with the disordered FG repeats of Nup58 allosterically affect the conformational state of the neighboring structured domain associated with Nup54. The allosteric coupling between the structured and disordered regions shifts the conformational equilibria and facilitates faster and more efficient transport for many cargos (see section 7.5 for a more detailed discussion of the NPC).
While all of these cited examples underscore the role of structural disorder in allosteric modulation, other cases involving integration of multiple signals to create responses on multiple output sites without any folding transition truly challenge our concept of allostery. This broader allosteric role of disordered proteins has been argued only recently169, yet is clearly consistent with the developing understanding of the energetic basis of allostery for which IDPs seem perfectly poised. The examples of disordered proteins involved in regulation provided below demonstrate (i) how multivalent, dynamic interactions can be integrated to respond to an incoming signal and modification of any of the multiple input sites remodels the dynamic equilibrium and (ii) how binding or modifications can change equilibria between disordered, more ordered or self-associated states, leading to altered downstream events. These examples, therefore, can be viewed as further illustrations of the roles of disorder in allosteric regulation in signaling.
Local ordering of segments in IDPs upon binding is a common effect, but in some cases significant folding of intrinsically disordered proteins is also observed30,188. In contrast, highly dynamic complexes can arise upon binding of disordered proteins to their targets if only a limited number of residues becomes ordered upon binding, leaving a significant fraction of the protein still flexible7,189. “Dynamic” or “fuzzy” complexes contain transient local order at interfaces and dynamic equilibria of different sub-states190. Disorder or “fuzziness” in complexes is often functionally beneficial because it can ensure adaptability, versatility and reversibility of the binding and thereby is extremely important for signaling processes such as transcription and translation.
One reason why disordered proteins are abundant in protein-protein interaction networks is that they often contain multiple binding segments that mediate interactions with multiple partners191–192. Sometimes these multiple interaction motifs have different binding characteristics, as was shown for the interaction of p120 catenin with the disordered cytoplasmic tail of cadherin193. p120 catenin is an armadillo-repeat protein that, along with the classical cadherins, β- and α-catenins, functions in cell adhesion194–197. p120 regulates cadherin-mediated cell-cell adhesion by binding the cytoplasmic tail of E-cadherin via static and dynamic interfaces193. p120 binds to the core region of the cadherin tail through a well-structured static interface yielding a specific, high-affinity interaction, while the dynamic interaction of p120 with the N-terminal flanking regions of the cadherin tail has an important regulatory role by protecting the endocytic LL motif in the cadherin tail, which initiates endocytosis of cadherins. The dynamic nature of the interaction, moreover, facilitates post-translational modifications and interactions of p120 with other proteins, leading to cadherin internalization. These co-existing stable and dynamic binding modes illustrate the range of IDP interactions and the importance of dynamic interactions in regulation of signaling networks.
A more complicated dynamic complex is possible if two or more transient binding interactions of the disordered protein with its partner(s) exist in a dynamic equilibrium. Disordered regions in complexes may control the degree of motion between domains, mask binding sites, permit overlapping binding motifs, and enable transient binding of different binding partners, facilitating roles as signal integrators and explaining their prevalence in eukaryotic signaling pathways198. Regulatory interactions of IDPs can also exhibit unusual binding characteristics, such as multisite dependence, and ultrasensitivity or cooperativity5–6.
A well-characterized example of ultrasensitivity is the binding of the disordered cyclin-dependent kinase (CDK) inhibitor Sic1 to its receptor Cdc4 upon phosphorylation of multiple CDK sites7–9. Cdc4 is an F-box protein adapter subunit of an SCF ubiquitin ligase that targets substrates for ubiquitin-dependent proteolysis199. Cdc4 contains a WD40 protein recognition domain200–202 composed of tandem repeats of a conserved WD40 motif found in many different proteins that form a circularly permuted β-propeller domain structure203. The WD40 domain of the yeast F-box protein Cdc4 binds phosphorylated forms of the cyclin-dependent kinase inhibitor Sic1, targeting it for ubiquitination in late G1 phase, an event necessary for the onset of DNA replication9. Phosphorylation of Sic1 occurs on multiple Cdc4 phosphodegron (CPD) motifs in Sic1 by Cln-Cdc289,204. The crystal structure of Cdc4 with a high-affinity phosphopeptide containing a consensus CPD derived from human cyclin E reveals a single deep pSer/Thr-Pro binding pocket in the WD40 domain205. Interestingly, Sic1 contains 9 CPD sites, but these lack the consensus sequence, and all are sub-optimal. Phosphorylation on a minimum of 4 to 6 of these sub-optimal CPD sites, depending on which positions, is sufficient for reasonably high-affinity Cdc4 binding (Kd ≈ 1 μM) and for in vivo ubiquitination8–9. The requirement for multisite phosphorylation sets a threshold for Cln-Cdc28 kinase activity in late G1 phase and converts the increase in Cln-Cdc28 into a switch-like (all-or-none) response, referred to as “ultrasensitivity”, for degradation of Sic1. Replacement of all sub-optimal CPD sites with a single high-affinity CPD in Sic1 leads to premature cell cycle transition and genome instability, demonstrating the importance of the ultrasensitive response9.
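The way a multisite requirement converts a graded input into a threshold response can be sketched with a deliberately simplified model. The code below assumes that each of 9 sites is phosphorylated independently with the same probability, that this probability depends hyperbolically on a hypothetical kinase-to-phosphatase activity ratio, and that a molecule becomes binding-competent once at least 6 sites are modified. None of these assumptions comes from the studies cited above, which report a position-dependent requirement of 4 to 6 sites; the function names are likewise illustrative.

```python
from math import comb, exp, log


def fraction_competent(x, n_sites=9, threshold=6):
    """Fraction of molecules with at least `threshold` of `n_sites` phosphorylated.

    x is a hypothetical kinase-to-phosphatase activity ratio; every site is
    assumed to be modified independently with probability x / (1 + x).
    """
    p = x / (1.0 + x)
    return sum(comb(n_sites, i) * p**i * (1.0 - p) ** (n_sites - i)
               for i in range(threshold, n_sites + 1))


def activity_for_response(target, n_sites=9, threshold=6):
    """Bisection in log space for the activity ratio giving `target` response."""
    lo, hi = log(1e-9), log(1e9)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if fraction_competent(exp(mid), n_sites, threshold) < target:
            lo = mid
        else:
            hi = mid
    return exp(0.5 * (lo + hi))


def effective_hill(n_sites=9, threshold=6):
    """Effective Hill coefficient from the 10% to 90% response range."""
    x10 = activity_for_response(0.1, n_sites, threshold)
    x90 = activity_for_response(0.9, n_sites, threshold)
    return log(81.0) / log(x90 / x10)


print("single optimal site:", round(effective_hill(1, 1), 2))
print("6 of 9 weak sites:  ", round(effective_hill(9, 6), 2))
```

In this toy model a single site gives a hyperbolic response (effective Hill coefficient of 1), whereas the 6-of-9 requirement gives a steeper, more switch-like curve. It captures only the counting argument and ignores the electrostatic and dynamic features of the actual Sic1:Cdc4 complex.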
A static structural model cannot explain this interaction requiring multiple phosphorylations, thus the proposed model is a dynamic complex of Sic1:Cdc4, with each sub-optimal CPD transiently coming into van der Waals contact and then releasing from the arginine-rich binding surface on the WD40 domain, enabling other CPDs to exchange on and off the surface7. The NMR data also confirm that Sic1 remains predominantly disordered upon phosphorylation and binding, with only local ordering around the binding site, which facilitates exchange of the multiple weak sites at the receptor site7. A polyelectrostatic model provides an explanation for the dependence of the interaction of pSic1 with Cdc4 on the number of phosphorylated sites95. In this model, cumulative electrostatic interactions allow for long-range contributions of all phosphorylated sites to the free energy of binding, including those not directly contacting the residues in the Cdc4 binding pocket. In the non-phosphorylated state, the lysine- and arginine-rich N-terminal targeting region of Sic1 has a net charge of +11 and contains no aspartate or glutamate residues, but 6 phosphorylations change the net charge to -1, which is attracted to the arginine-rich Cdc4 binding site. The ENSEMBLE calculations for Sic1 revealed transient structure and the presence of compact conformers that modulate their electrostatic potential (with its inverse distance dependence), with dynamic interconversion providing a structural basis for the mean field26. The dynamic complex also facilitates efficient ubiquitination of Sic1 at multiple sites, as revealed by superposition of the Sic1:Cdc4 model onto the structure of the SCF complex, demonstrating that the disordered Sic1 conformers can readily span the 64 Å gap between the Cdc4 binding site and the E2 catalytic site.
The multiple CPDs enable different lysines to be presented to the E2, in agreement with data showing that replacement of all sub-optimal CPD sites with a single N-terminal high-affinity CPD leads to preferential ubiquitination at only 2 C-terminal lysines rather than throughout Sic19.
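The net charge values quoted above for the Sic1 targeting region (+11 unmodified, -1 after 6 phosphorylations) imply a contribution of about -2 per phosphate. The short sketch below tabulates the net charge as a function of the number of phosphorylated sites under that inferred assumption; the function name and the fixed per-phosphate charge are illustrative and ignore pH-dependent ionization.

```python
def net_charge(n_phospho, unmodified_charge=11, charge_per_phosphate=-2):
    """Net charge of the targeting region with n_phospho phosphorylated sites.

    unmodified_charge is the value quoted in the text; charge_per_phosphate
    is an assumption inferred from the quoted change from +11 to -1.
    """
    return unmodified_charge + n_phospho * charge_per_phosphate


for n in range(0, 10):
    print(f"{n} phosphates: net charge {net_charge(n):+d}")
```

Under this assumption the net charge changes sign between 5 and 6 phosphates, which is consistent with the idea that binding to the arginine-rich Cdc4 surface becomes favorable only after several sites are modified.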
In contrast to this dynamic multisite mechanism, a static diphospho-epitope mechanism was also suggested206. Three of the natural Sic1 CPD sites are followed by a Ser or Thr residue at the P+3 or P+4 position, which raises the possibility of three high-affinity, doubly phosphorylated Sic1 degrons that can interact with the primary binding site in Cdc4 and a nearby arginine-containing pocket, as observed for the cyclin E:hCdc4/Fbw7 complex. However, it was shown that full-length Sic1 containing three closely spaced CPD sites (pSer69, pSer76, pSer80) including a P+4 site has much weaker affinity, > 50 µM Kd, than a 20-residue peptide containing these sites, 2.4 µM Kd (comparable to the affinity of full-length ~ 6-fold phosphorylated wild-type protein of 1.3 µM Kd)8. This result reinforces the hypothesis of net charge-regulated binding, since the peptide removes many positively charged residues present in full-length Sic1. While the NMR data confirm that the P+4 phosphate-binding surface on Cdc4 can contribute to local affinity, it was also demonstrated that this additional phosphate is neither necessary nor sufficient for Sic1 recognition in vitro and in vivo8. Electrostatics is certainly a significant factor in the Sic1:Cdc4 dynamic complex, but tertiary structure may also play an important role, with the critical factor being the favorable energetic contributions of phosphorylated sites not in direct van der Waals contact with Cdc4. This view challenges the standard understanding of binding in terms of the energetics of only directly contacting sites.
The eukaryotic initiation factor 4E (eIF4E), which together with 4G (eIF4G) controls cap-dependent translation initiation207, interacts tightly with eIF4E binding proteins (4E-BPs). 4E-BPs inhibit eIF4G binding and translation, playing a crucial role in controlling development and cell growth208–209. The binding and structural properties of these proteins have thus been the focus of extensive investigation210–212. In one recent study, the structural characteristics of the full-length 4E-BP2, the neural 4E-BP isoform, and its binding to eIF4E were investigated28. It was shown that 4E-BP2 is disordered but contains significant transient secondary structure, especially in the canonical eIF4E binding site, which possesses high helical propensity (Figure 1b). NMR and SAXS data have shown that, upon binding to eIF4E, full-length 4E-BP2 utilizes a dynamic bipartite interface extending from residues ~Y34 to ~D90, using both a stable canonical helix as a primary contact site and a dynamic secondary binding site centered around residues 78–82, IPGVT28,210, and generating a different type of dynamic complex than described for Sic1:Cdc4. A recent crystal structure of a 4E-BP:eIF4E complex reveals electron density from M49 to S83, with the rest of the interface too dynamic to observe213. This bipartite mode of binding leads to ~3 orders of magnitude tighter binding (low nM affinity) to eIF4E than for ~20 residue peptides containing only the canonical 4E-binding motif (low µM affinity). Moreover, these canonical site peptides show significant chemical shift changes in NMR binding experiments214, while full-length 4E-BPs show minimal chemical shift changes and significant loss of intensity of many resonance peaks upon binding to eIF4E28,211,215, suggesting large amplitude dynamics within the bound state. The binding of full-length 4E-BP2 to eIF4E results in significant resonance broadening on the eIF4E surface, which partially overlaps with the eIF4G interaction surface.
Since the bound resonances of the intact site on eIF4E were observed when either the canonical or the second binding site was mutated, this points to exchange of the two sites in the complex, with the observed broadening in the wild-type case most likely due to a conformational exchange within the complex in which the two binding segments exchange on and off of the eIF4E surface. The full-length 4E-BP2 thus appears to form a tight but dynamic, 'fuzzy' complex with eIF4E, with the helical canonical region having a well-defined, although not fully occupied, interaction surface on eIF4E, while the second site may be both transient and more delocalized. This result provides valuable insight into the mechanism of the competition between the 4E-BPs and eIF4G for eIF4E binding216. While eIF4G forms a stable fold on the canonical binding surface of eIF4E, 4E-BP2 possesses a more extensive and partially overlapping binding surface on eIF4E and maintains flexibility, enabling effective regulation by phosphorylation. The different binding modes of eIF4G and 4E-BP2 explain why the 4EGI-1 inhibitor216 can disrupt the eIF4G:eIF4E complex but not the 4E-BP2:eIF4E interaction.
The suggested tight but dynamic binding of 4E-BP2 to eIF4E may also explain the effective phosphorylation-dependent regulation of an interaction with a low nanomolar dissociation constant. The multisite phosphorylation of the 4E-BPs by kinases regulates the binding to eIF4E. Hypophosphorylated 4E-BPs with no or minimal phosphorylation bind tightly to eIF4E, which inhibits cap-dependent translation, while the hyperphosphorylated 4E-BPs with 4 or 5 sites of phosphorylation dissociate from eIF4E, resulting in effective translation initiation217–218. The potential underlying mechanisms of phosphorylation-dependent regulation were investigated earlier214,219, with suggestions that phosphorylation of 4E-BPs leads to electrostatic repulsion from the eIF4E surface and also modulates the stability of the helical canonical 4E-binding motif219. A more thorough understanding of the role of phosphorylation as a regulatory switch emerged recently (Figure 1b). It was demonstrated that phosphorylation of 4E-BP2 induces folding and stabilization of a four-stranded β-domain that sequesters the eIF4E-binding surface and weakens the 4E-BP2:eIF4E interaction 4000-fold10. Although disorder-to-order transitions for IDPs in response to biological signals have been described many times30, post-translational modifications were thought to account for only subtle local conformational changes220–222. The phosphorylation-induced folding of 4E-BP2 represents an example of what is likely to be a significant regulatory mechanism of PTM-induced folding and provides additional insights into how IDPs control different biological functions in the cell.
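The affinity changes quoted for the 4E-BP2:eIF4E system (roughly 3 orders of magnitude gained from bipartite binding and a 4000-fold weakening upon phosphorylation-induced folding) can be restated as free energy differences using the standard relation ΔΔG = RT ln(fold change). The sketch below assumes a temperature of 298.15 K, which is not specified in the text, and uses an illustrative function name.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free energy difference (kcal/mol) for a given fold change in Kd.

    temperature_k is an assumed value; the text does not state the
    temperature at which the affinities were measured.
    """
    return R_KCAL * temperature_k * math.log(fold_change)


# Fold changes taken from the text: ~1000-fold (bipartite vs. peptide binding)
# and 4000-fold (weakening upon phosphorylation-induced folding).
for label, fold in [("bipartite interface", 1000.0), ("phospho-induced folding", 4000.0)]:
    print(f"{label}: {ddg_from_fold_change(fold):.2f} kcal/mol")
```

Under the assumed temperature these correspond to roughly 4.1 and 4.9 kcal/mol, respectively, so the phosphorylation switch is energetically comparable to the entire gain that the second, dynamic binding site provides over the canonical motif alone.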
Another feature of disordered proteins is that they have large binding-interface-surface to isolated-protein-surface ratios compared to globular proteins, i.e. they utilize a much larger portion of their accessible surface areas. For globular proteins, a large interface requires a large protein size, which significantly increases the size of a multi-protein complex. Thus disordered proteins represent an elegant solution to increase the interface area and keep the size of proteins small at the same time223. In addition, as binding in disordered proteins often relies on sequence, rather than a surface defined by a tertiary fold or folded secondary structural element, it is easy to incorporate different binding regions in the same protein and even enable overlapping binding sites. This advantage of disordered proteins facilitates their interaction with multiple partners and is another reason why they are abundant in protein interaction networks191–192. Overlapping binding segments are most common in hub proteins (those that bind > 10 partners), which enable integration of multiple signals from different signaling pathways.
The c-Src non-receptor tyrosine kinase is one of the oldest and most investigated proto-oncogenes, and is involved in numerous signal transduction pathways that regulate cell growth, proliferation and survival224–228. Although the regulation of c-Src is reasonably well understood, new mechanisms mediated by disordered regions of c-Src have been recently described229. c-Src is composed of a disordered N-terminal Src homology domain 4 (SH4) with a myristoylation site important for membrane localization, a disordered Unique domain (UD), an SH3 domain, an SH2 domain, a catalytic SH1 kinase domain and a disordered C-terminal tail with a negative regulatory tyrosine residue. The kinase domain contains an autophosphorylation site, the SH2 domain interacts with the negative regulatory phosphorylation site, and the SH3 domain binds proline-rich ligands and also interacts with the polyproline linker region connecting SH2 and kinase domains in the inactive form of the protein. The N-terminal region of c-Src, including the SH4 and the UD, is intrinsically disordered. Detailed NMR study of this N-terminal region suggested transient secondary structural elements between residues 60–64 and 67–74230 and revealed that SH4 and UD play a crucial role in the regulation of c-Src by participating in a number of intra- and intermolecular interactions229. The UD and SH3 domain bind lipids and interact with each other, but binding of poly-proline peptides to the SH3 domain allosterically inhibits its interaction with the UD. These protein-protein interaction regions overlap with the lipid-binding regions in both domains, suggesting that the activation of c-Src may affect lipid binding by the UD and SH3229. Lipid binding through the UD and SH3 can limit the accessibility of these domains to other partners, and provides a “positional regulation” mechanism. 
Moreover, phosphorylation in the N-terminal SH4 and UD results in significant reduction in ligand binding, likely because it leads to electrostatic repulsion with acidic lipids229. Calmodulin also suppresses lipid binding by the UD and SH3 domain and modulates their interaction. Thus phosphorylation and calmodulin binding serve as additional regulatory mechanisms that can modulate the intramolecular interactions of c-Src and its intermolecular interactions with lipids.
A similar phosphorylation-dependent signal integrator is the regulatory (R) region of the cystic fibrosis transmembrane conductance regulator (CFTR). Mutations in the CFTR gene cause cystic fibrosis (CF)231–232. Normal CFTR function depends on phosphorylation of the R region232–233 and its interaction with different parts of the CFTR including the nucleotide-binding domains, NBD148 and NBD2, a 42-residue peptide from the C-terminus of CFTR27 and other intracellular partners such as 14-3-3234 and the STAS domain of SLC26A3235, a chloride/bicarbonate exchanger. The multiple intra- and intermolecular interactions of the R region play a vital role in protein maturation, trafficking to the cell surface and stability at the membrane236–238. The R region is phosphorylated by PKA on nine sites, which facilitates channel opening239–240. It was shown recently that the R region forms highly dynamic complexes with different partners targeting the same or largely overlapping segments, with binding to the different partners largely dependent on the phosphorylation state of the R region27 (Figure 2). The R region binds more strongly to the NBDs in its non-phosphorylated state, which inhibits their dimerization and channel activation. Phosphorylation of the R region leads to its removal from the dimer interface and enables R region binding to other partners including the C-terminus of CFTR, which promotes NBD dimerization, ATP hydrolysis and channel opening. 14-3-3 also binds to the phosphorylated R region and this interaction is crucial for normal CFTR trafficking from the endoplasmic reticulum. The R region is involved in the reciprocal activation of SLC26A3 and CFTR via binding the STAS domain of SLC26A3. The STAS domain shows a slight preference for the phosphorylated R region, with the interaction helping to ensure the close physical proximity of CFTR and SLC26A3. 
It was suggested that these partners compete with each other for R region binding and that dynamic complexes, in which different partners exchange on and off the same binding segments, facilitate the integration of stimuli from different pathways (Figure 2). The dynamic nature of the complexes enables accessibility to kinases and phosphatases. Importantly, the flexibility of the R region allows binding segments to interact with different partners, even binding simultaneously, and supports the role of the R region as a dynamic integrator.
Disordered proteins and segments often adopt an extended conformation upon binding to their partners, wrapping around a folded protein. Within the resulting large binding interface there can be some short segments, usually well under 100 residues, that undergo disorder-to-order transitions241–243. These transitions can lead to weak but specific interactions due to the conformational entropy loss, crucial for reversible interactions in the cell signaling and regulatory pathways. Retention of dynamics within the complex can mitigate conformational entropy loss and give tighter binding, with some complexes binding in the low nM range28. These short binding segments often have more hydrophobic residues than the surrounding sequence, enabling identification of these segments using bioinformatics241,244–246. These segments can possess some residual structures in the isolated state that correspond to the secondary structural elements in the complex, although this is not a general rule25. When present, these preformed structural elements can favor the binding process by limiting the conformational space and decreasing the entropic penalty of binding247.
As mentioned above, the induced folding upon any type of perturbation such as the binding of another molecule, PTMs or protein truncation, can be the basis of the allosteric coupling13. The folding of the cognate domain causes the redistribution of the ensemble leading to the folding of the other domain and facilitating the binding of the other ligand. This allosteric coupling depends exclusively on the relative stability of the domains and how the stability changes upon the incoming signal. Importantly, as described below, IDPs can sample radically different structural states, meaning that their ensemble is optimally poised to respond and integrate multiple signals, which provides a unique regulatory strategy.
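The dependence of allosteric coupling on relative domain stabilities can be made concrete with a four-state statistical model of two coupled domains, written here in the spirit of ensemble descriptions of allostery. This is a minimal sketch, not the model used in the cited work: the state energies, the interaction term dg_int, the ligand factor (taken as 1 + Ka[L] for a ligand that binds only folded domain A) and all numerical values are illustrative assumptions.

```python
import math

RT = 0.593  # kcal/mol near 298 K

def prob_b_folded(dg_a, dg_b, dg_int, ligand_factor=1.0):
    """Probability that domain B is folded in a two-domain ensemble.

    dg_a, dg_b: unfolding free energies of domains A and B (kcal/mol);
                negative values mean the unfolded form is favored.
    dg_int: free energy cost of breaking the A-B interface (kcal/mol),
            paid whenever at least one domain is unfolded.
    ligand_factor: 1 + Ka*[L] for a ligand binding only to folded A.
    """
    # Statistical weights relative to the state with both domains folded (FF).
    weights = {
        "FF": 1.0 * ligand_factor,
        "FU": math.exp(-(dg_b + dg_int) / RT) * ligand_factor,  # A folded, B unfolded
        "UF": math.exp(-(dg_a + dg_int) / RT),                  # A unfolded, B folded
        "UU": math.exp(-(dg_a + dg_b + dg_int) / RT),
    }
    z = sum(weights.values())
    return (weights["FF"] + weights["UF"]) / z

# Two marginally unstable domains with a favorable interface (hypothetical values)
p_free = prob_b_folded(-1.0, -1.0, 2.0)
p_bound = prob_b_folded(-1.0, -1.0, 2.0, ligand_factor=100.0)
print(round(p_free, 2), round(p_bound, 2))
```

In this toy ensemble, ligand binding to domain A shifts the population toward states in which domain B is also folded, but only when the interaction term is non-zero. With dg_int set to 0 the two domains fold independently and the ligand has no effect on B.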
The flexibility of disordered regions allows them to adopt different conformations upon binding to different target proteins, which enables the protein to fulfill more than one unrelated function, a property known as moonlighting248. For example, the nuclear coactivator-binding domain (NCBD) of the CREB-binding protein (CBP) adopts two different conformations when it binds to the activation domain of p160 nuclear receptor co-activators249–250 or to interferon regulatory factor 3 (IRF3)251. Similarly, the binding segments in the CFTR R region have helical propensity when they are bound to NBD1, but remain more extended in the complex with 14-3-3 protein27. A more extreme example is the short disordered segment in the C-terminal region of p53 that adopts several different conformations when bound to different partners252–255. Though the regions that contact different partners are sometimes not entirely the same and only overlap, the resulting distinct functions clearly rely on the conformational plasticity of these regions to adopt different conformations. In some cases disordered proteins use the same or overlapping regions to elicit opposing (activating and inhibiting) action on different partners or even the same partner molecule248.
On the other hand, unrelated disordered proteins/regions can also adopt the same conformations when bound to the same partner. The disordered cytoplasmic tail of E-cadherin binds not only to p120 as described previously, but also to β-catenin through a distinct binding motif (CBD, catenin binding domain) which is C-terminal to the p120 interaction motifs. Binding to β-catenin induces folding in the CBD of E-cadherin, which was shown to be similar to the folding upon binding of the other disordered β-catenin binding region of TCF3. The disordered regions of E-cadherin and TCF3 make identical hydrophobic interactions with β-catenin, despite the difference in local secondary structure256–257, providing one of the few examples of molecular mimicry (Figure 3). It has been suggested that the disordered nature of the proteins that interact with β-catenin is biologically advantageous, allowing the binding of extended polypeptides on the elongated surface through distinct regions that function quasi-independently256.
There is a wide range of effects of binding of IDPs/IDRs to folded protein targets, including complete ordering to a folded state, incomplete folding, stabilization of an isolated single or small number of secondary structural elements, and transient stabilization of single or multiple structural elements in dynamic exchange. This repertoire can facilitate complex layers of biological regulation, as an increasing number of examples demonstrate.
Eukaryotic cell division is regulated by a family of cyclin-dependent kinase (Cdk)/cyclin complexes whose members are sequentially activated to drive progression from G1 to S phase (Cdk4 and Cdk6 complexed with the D-type cyclins, and Cdk2 complexed with Cyclins E and A) and from G2 to M phase (Cdk1 complexed with cyclins B and A). Commitment to undergo DNA replication and enter S phase is further controlled by IDPs, including p21, p27 and p57 (termed the Cdk regulators, or CKRs), which bind and regulate Cdk/cyclin complexes258. The small size of these proteins (169, 198 and 316 amino acids, respectively) belies the complexity of the regulatory mechanisms they orchestrate. Best understood are the features and mechanisms of p21 and p27, which will be summarized here. The association of these proteins with complex regulatory behavior was first noted by Kriwacki, Wright and co-workers in 1996, who showed that, in isolation, p21 was extensively disordered but that it folded upon binding to Cdk2188. A few years prior, p21 had been identified as a universal inhibitor of Cdk/cyclin complexes259, leading to the proposal that disorder within p21 enabled promiscuous binding to the entire family of Cdk/cyclin complexes. p21 was shown to possess a small amount of nascent helical structure, which was suggested to partially mitigate the entropic penalty associated with folding upon binding to its Cdk/cyclin targets188. p21, p27 and p57 share an N-terminal kinase inhibitory domain (KID) that mediates interactions with Cdk/cyclin complexes and many aspects of “disorder-function” relationships for these KIDs were revealed through studies of p27. For example, the KID of p27 (p27-KID) was shown to sequentially fold upon binding to cyclin A, then Cdk2 (Figure 4a, b)260. A highly conserved sub-domain within the N-terminus of p27-KID termed D1 rapidly binds cyclin A, followed by much slower binding/folding of a second conserved sub-domain termed D2 to Cdk2. 
The former D1/cyclin interaction blocks substrate recruitment261 while the latter inhibits kinase activity by positioning a tyrosine residue (Y88) within the Cdk2 active site262. The sequential mechanism was later shown to mediate specific binding to the Cdk/cyclin complexes that regulate cell division and to prevent tight binding to other, similar complexes that regulate transcription263. The basis for selectivity is conservation of the sub-domain D1/cyclin interaction in the cell cycle-regulating Cdk/cyclin complexes, and loss of the specific binding pocket that mediates this interaction in the transcription-regulating complexes. Electrostatic interactions guide the formation of D1/cyclin A encounter complexes which then promote further folding and binding to give fully inhibited Cdk2/cyclin A264; the transcription-regulating Cdk/cyclin complexes are unable to form these encounter complexes and are thus not biologically targeted by p27263.
Another type of disorder-function relationship for these CKRs was elucidated through studies of p21. When examined in detail using NMR spectroscopy, it was shown that residues within the LH sub-domain that connects sub-domains D1 and D2 remained somewhat flexible despite being tightly bound to Cdk2/cyclin A (Figure 4a)265. This analysis revealed that the LH segment was stretched beyond the length of a standard α-helix, weakening otherwise stabilizing hydrogen bonds and thus enabling dynamics and flexibility. These observations led to the insight that the LH sub-domain is a stretchable linker that enables sub-domains D1 and D2 to accommodate small structural differences between the different Cdk/cyclin complexes that p21 binds; this phenomenon was termed “structural adaptation”. This structural strategy provides a simple mechanism to achieve the binding promiscuity that was noted for p21 in 1994259. This study also showed that, while much of p21 binds rigidly to Cdk2/cyclin A, experiencing extensive folding-upon-binding, the LH sub-domain folds to a smaller extent. It was hypothesized that the folding/binding energy landscape of this segment of p21 is flat and smooth to accommodate subtly different lengths and interfaces while satisfying the strict binding requirements of the other two sub-domains, D1 and D2.
Another example of incomplete folding-upon-binding of high functional significance is provided by p27. Like p21, p27 extensively folds upon binding to Cdk2/cyclin A260. However, subsequent studies showed that some degree of flexibility persists in the bound state and that this enables phosphorylation-dependent regulation of p27 activity and stability. Grimmler, et al., showed in 2007266 that Y88 that binds within the ATP binding pocket of Cdk2 was transiently exposed for phosphorylation by the oncogenic non-receptor tyrosine kinase, BCR-ABL (giving pY88). Upon phosphorylation, a short segment of p27 flanking pY88 is ejected from the Cdk2 active site (while the rest of the p27 KID remains bound to Cdk2 and cyclin A), partially restoring kinase activity (illustrated schematically in Figure 4c). p27 also contains a disordered C-terminal regulatory domain (p27-C) that becomes a captive substrate for Cdk2-dependent phosphorylation on threonine 187 (T187), near the end of this domain, which creates a motif termed a phospho-degron for recruitment of the E3 ligase, SCFSkp2. Recruitment of SCFSkp2 leads to poly-ubiquitination of lysine residues within this flexible regulatory domain followed by selective degradation of p27 by the 26S proteasome and release of fully active Cdk2/cyclinA complexes. This multi-step phosphorylation/ubiquitination cascade controls progression of cells to S phase of the division cycle and illustrates several types of disorder-function relationships. First, flexibility within the Y88-containing segment of p27, even when bound to Cdk2/cyclinA, enables Y88 to receive the phosphorylation “signal” from BCR-ABL. The phosphorylation-dependent ejection of Y88 from the Cdk2 active site is an example of regulated unfolding. 
Disorder and extensive dynamics within the ~100 residue-long p27-C enable T187 (within a pY88-p27/Cdk2/cyclin A ternary complex) to become a Cdk2 substrate after phosphorylation of Y88, illustrating another type of disorder-function relationship. However, the interactions of the T187-containing segment of p27-C with the Cdk2 substrate binding site must be transient because, once phosphorylated, the T187 phospho-degron (pT187) becomes a substrate for SCFSkp2. Persistent flexibility within p27-C, we hypothesize, is critical for accessibility of pT187 to SCFSkp2, for the accessibility of lysine residues within p27-C for poly-ubiquitination, and finally for accessibility of poly-ubiquitinated p27 to the 26S proteasome. Therefore, despite its relatively small size, p27 reveals many different ways in which disorder mediates function, from mediating specific folding-upon-binding to certain Cdk/cyclin complexes, to mediating a complex phosphorylation/ubiquitination signaling cascade that determines whether or not a cell divides. In the context of the structural continuum discussed above, p27 experiences multiple transitions, some in the direction of order and others in the direction of disorder. For example, the KID transitions from disorder to order upon binding to Cdk/cyclin complexes, but the Y88 segment transitions from order to disorder due to phosphorylation. While the KID experiences these transitions, p27-C remains highly dynamic but the segment flanking T187 must transiently interact with the substrate binding site of Cdk2 (and experience local ordering) so that T187 can be phosphorylated after re-activation of Cdk2 by Y88 phosphorylation. Similarly, the pT187 phospho-degron becomes locally ordered when it interacts with components of the SCFSkp2 E3 ligase206, but becomes more disordered upon dissociation from the E3 prior to engagement by the 26S proteasome. 
Therefore, p27 retains a high degree of disorder as it functions, with transitions between disorder and order critical for its regulation of and by Cdk/cyclin complexes and other upstream kinases.
The BCL-2 family of proteins plays central roles in controlling both the intrinsic and extrinsic pathways of apoptosis267. Within this family are the multi-BCL-2 homology (BH) domain effector (BAX and BAK) and anti-apoptotic (BCL-2, BCL-xL, and others) proteins and the BH domain 3 (BH3) only proteins. Proteins in the former two groups fold into globular structures comprised of eight α-helices while most members of the BH3-only group, which can be further subdivided into direct activators (BID, BIM, and PUMA) and derepressors/sensitizers (BAD, BIK, BMF, HRK, Noxa), are intrinsically disordered268. The exception is BID, which adopts a globular, α-helical structure and is cleaved by caspase-8 to a truncated form (tBID), which has a molten globule-like structure with a considerable amount of α-helical structure but lacking a well-defined tertiary fold269, that associates with and activates BAX at the outer mitochondrial membrane (OMM). It has been proposed that tBID changes structure upon interacting with the OMM, with its “molten” α-helices disassociating into a “C-shaped”, extended structure270. Some members of the BCL-2 protein family are constitutively localized to the OMM (e.g., BAK), while others shuttle between the cytosol and OMM (e.g., BAX, BCL-xL). Activation of the BCL-2 family effectors (within the OMM) by the BH3-only direct activators is accompanied by dramatic rearrangements of their globular, α-helical structures that lead to OMM permeabilization (MOMP), release of cytochrome c, caspase activation and apoptosis. The BH3 domains of the intrinsically disordered BH3-only proteins fold into α-helical conformations upon binding within a deep hydrophobic groove found in the multi-BH domain anti-apoptotic and effector BCL-2 family members. In the case of interactions with BAX or BAK, BH3 domain folding-upon-binding is associated with effector activation and apoptosis. 
In contrast, folding-upon-binding to the anti-apoptotic family members leads to BH3 domain sequestration and inhibition of apoptosis.
Structural plasticity is a hallmark of the BCL-2 protein family, with their dynamic structures populating many different positions within the order-to-disorder continuum that is now used to represent the possible states of proteins. Interactions between the different functional groups can either trigger or inhibit apoptosis and usually cause positional changes within this continuum due to folding- or unfolding-upon-binding. One example is the binding of the BH3 domain of PUMA to BCL-xL. As the PUMA BH3 domain folds upon binding, α-helix 3 (α3) of BCL-xL partially unfolds (Figure 5)11. This phenomenon has been termed ligand binding-induced unfolding, a type of regulated unfolding271. A tryptophan residue within the N-terminus of the PUMA BH3 domain (Trp71, Figure 5), unique at this position amongst BH3-only proteins, interacts through π-stacking with a His residue of BCL-xL (His113, Figure 5), distorting BCL-xL’s structure near the N-terminus of α3. This distortion propagates through α3 leading to its partial unfolding (shown schematically in Figure 5). The significance of this observation derives from apoptosis-inhibiting interactions between BCL-xL and p53272. p53, despite its lack of a canonical BH3 domain, is a direct activator of BAX and BAK. However, p53 is sequestered in an inactive form by interactions with BCL-xL, preventing BAX activation. p53 binds to a negatively charged surface of BCL-xL that includes surface-exposed α3273. PUMA binding-induced partial unfolding of α3 in BCL-xL disrupts interactions with p53, releasing it to engage and activate BAX at the OMM. This is an example of concerted folding- and unfolding-upon-binding of the PUMA BH3 domain and α3 of BCL-xL, respectively, when the two proteins interact and highlights the importance of transitions by proteins between different regions of the order-to-disorder structural continuum for their functional mechanism.
Another example of this concept is the interaction of the direct activator BID with the effector BAK274–275. The BH3 domain of BID is highly disordered and binds weakly to the hydrophobic groove of BAK. Despite its weak nature, in the presence of OMMs, this “hit-and-run” interaction triggers structural rearrangements leading to BAK oligomerization within these membranes and MOMP. The term “hit-and-run” interaction in this context describes the phenomenon wherein one protein (BID in this example) binds to another (BAK here), triggering structural and functional changes in the second protein without remaining associated with it276–277. While the disordered BID BH3 domain is the natural substrate, solution NMR structural studies of its complex with BAK were hindered by rapid association and dissociation. Chemical stabilization of the α-helical conformation of the BH3 domain enhanced binding affinity and enabled determination of the structure of the membrane-free BID/BAK complex using NMR spectroscopy. (Also, the BAK construct studied lacked α9 which otherwise mediates OMM targeting.) This membrane-free analysis revealed BID binding-induced structural changes at one end of the BAK hydrophobic groove (Figure 6a) that correlated with more extensive structural changes probed biochemically in the presence of the OMM. The hydrophobic groove of BAK, in the absence of BID, is occluded and this obstruction was pushed aside upon binding of stabilized BID. In the presence of the OMM, these structural changes propagate through BAK to cause exposure of sites within α1 and α2 for proteolytic cleavage (Figure 6b). Exposure of these two α-helices is proposed to precede other structural rearrangements that lead to BAK oligomerization within the OMM and ultimately MOMP. 
These studies provide another example of concerted folding- and unfolding-upon-binding, with the folding of BID within the BAK hydrophobic groove associated with local structural distortions and detachment (unfolding) of α1 and α2 from the helical core. While this allosteric mechanism is poorly understood, these results again illustrate concurrent transitions of interacting proteins within the order-to-disorder structural continuum.
Similar to BAK, the other BCL-2 family effector, BAX, requires direct interactions with activators that trigger its oligomerization. In the cytosolic form of BAX, the hydrophobic groove corresponding to the ‘trigger’ site in BAK is occupied by an additional alpha helix (α9) that is found exclusively in the globular core of BAX among multi-domain BCL-2 family proteins. Due to the presence of this additional helix, BAX activation by the BH3-only activators BIM and BID appears to follow a two-step mechanism, as follows. First, through weak, ‘hit-and-run’ interactions at a site located between α1 and α6 of BAX, these activators promote the allosteric release and unfolding of α9278. Second, through a more stable interaction (observed by x-ray crystallography) with the now accessible hydrophobic groove on BAX, BIM or BID promotes the additional ‘unlatching’ of α6-α8 from the globular core of BAX, the first of a chain of events that result in BAX oligomerization, insertion into the OMM, and MOMP279. The observation that active BAX or BAK has been isolated from apoptotic cells in the form of homo-oligomers that were not associated with activator proteins (which were nonetheless functionally required to elicit the ‘trigger’ activation signal) substantiates models of BAX and BAK activation where these trigger interactions are relatively weak and transient compared to the stable homo-oligomers that they induce to assemble280.
Cytosolic p53 also functions as a direct activator of BAX281. However, p53 lacks a BH3 domain, suggesting that it activates BAX through a mechanism different from that elucidated for BIM or BID (Figure 7)12. Cytosolic p53 engages BAX in a multivalent manner. The DNA binding domain of p53 binds BAX at a site also bound by the anti-apoptotic BCL-xL273, a structural homolog of BAX. However, this binding event does not directly promote BAX activation, but facilitates, through the flexible tethering of the intrinsically disordered N-terminal segment of p53, a second interaction between p53 and BAX. Specifically, a region of p53 comprised of residues 40–59 binds weakly to a region of BAX comprised of the C-terminus of α6, α7, α8, the N-terminus of α9 and the adjacent α4-α5 loop. This binding event is directly associated with BAX activation. Remarkably, binding of the p53 40–59 region to BAX occurs when Pro47 in p53 exhibits a cis backbone conformation. Furthermore, the conformational transition between cis and trans isomers of this proline residue is necessary to trigger BAX activation. The prolyl isomerase Pin1, a known promoter of p53 pro-apoptotic functions282–284, dramatically enhances the process of BAX activation by catalyzing the cis-trans interconversion of p53 Pro47 after phosphorylation of the preceding Ser46 (a modification that makes this site an optimal substrate for Pin1). The activation of BAX mediated by cytosolic p53 occurs through a single-step mechanism whereby cis-trans isomerization of p53 Pro47 triggers the simultaneous release of α6-α9 from the globular core of BAX. In this remarkable signaling conduit, the free energy associated with the conformational switch between cis and trans backbone isomers of a proline residue located in a disordered protein region is sufficient to overcome the free energy barrier that separates the inactive ‘ground state’ from ‘excited states’ associated with activation and oligomerization of the metastable BAX. 
While p53 alone weakly activates BAX, activation is dramatically enhanced through catalysis of Pro47 cis-trans isomerization by Pin1. BAX activation is further enhanced by phosphorylation of p53 on Ser46. In the absence of an isomerase, proline cis-trans isomerization is a rare event. Therefore, the requirement for both phosphorylation of Ser46 and isomerization of Pro47 by Pin1 allows tight control of p53-dependent BAX activation and apoptosis. Ser46 and Pro47 of p53 thus integrate stress signals mediated by kinases and the Pin1 prolyl isomerase to induce apoptosis. The activation of BAX by cytosolic p53 illustrates how a localized conformational transition (cis-trans isomerization of Pro47 in p53) can modulate the energy landscape of a metastable protein (BAX) causing, in the presence of mitochondrial membranes, dramatic structural changes within BAX that trigger MOMP and apoptosis.
The NFĸB signaling pathway provides another excellent example of how the regulation process relies on the conformational plasticity of all of the proteins involved. The NFĸB transcription factors regulate many cellular events critical for cell growth and proliferation, development, immune response and apoptosis by binding to DNA ĸB sites285–286. The NFĸB family consists of five homologous transcriptional activators, which form either homo- or heterodimers. All members have a Rel homology domain (RHD), which functions in dimerization and DNA binding and contains a nuclear localization sequence (NLS) at its C-terminus287. This NLS is disordered in the absence of the partners, and the disordered nature of this region allows it to fold into alternative conformations upon binding to alternative partners288.
Transcriptional activation by NFĸBs is tightly regulated by IĸBs; in resting cells IĸB keeps NFĸB in the cytoplasm, preventing its nuclear localization and binding to DNA285,289–290. Stress signals induce the phosphorylation of IĸB, which leads to subsequent ubiquitination and degradation by the proteasome291 (Figure 8). Once the pathway is activated, NFĸB translocates into the nucleus and regulates the transcription of its numerous target genes292. The gene for IĸB is also strongly induced by NFĸB and this negative feedback loop results in post-induction repression293–295. The newly synthesized IĸB enters the nucleus and effectively removes NFĸB from the DNA (Figure 8). The newly formed NFĸB-IĸB complex is transported to the cytoplasm, where it is ready for further activation296–297.
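The post-induction repression produced by this negative feedback loop can be illustrated with a deliberately simple two-variable toy model. It is only a sketch of the logic described above, not a published model of the pathway: the variables (a nuclear NFĸB fraction n and an IĸB level i in arbitrary units), the rate laws, the rate constants and the function name are all hypothetical choices made for illustration, and processes such as nuclear transport and signal-dependent IĸB degradation are lumped into single terms.

```python
def simulate_feedback(stimulus=1.0, k_strip=1.0, k_syn=0.1, k_deg=0.05,
                      t_end=200.0, dt=0.01):
    """Euler integration of a toy NF-kB / IkB negative feedback loop.

    n: fraction of NF-kB that is nuclear and DNA-bound (0 to 1)
    i: level of newly synthesized IkB (arbitrary units)
    All rate constants are hypothetical.
    """
    n, i = 0.0, 0.0  # pathway just switched on, IkB fully degraded
    trajectory = [n]
    for _ in range(int(t_end / dt)):
        # Stimulus drives NF-kB into the nucleus; IkB removes it from DNA.
        dn = stimulus * (1.0 - n) - k_strip * i * n
        # NF-kB induces IkB synthesis; IkB is degraded at a basal rate.
        di = k_syn * n - k_deg * i
        n += dn * dt
        i += di * dt
        trajectory.append(n)
    return trajectory

traj = simulate_feedback()
print(f"peak nuclear fraction {max(traj):.2f}, final {traj[-1]:.2f}")
```

With these illustrative parameters, nuclear NFĸB rises quickly, then declines to a lower steady level as NFĸB-induced IĸB accumulates, which is the qualitative signature of post-induction repression.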
IĸB is composed of an N-terminal signal response region where phosphorylation and ubiquitination occur, an ankyrin repeat (AR) domain and a C-terminal PEST sequence298–299. Extensive structural studies were performed on free IĸB, which showed that the C-terminus is disordered and many of the AR modules are incompletely folded, and thus the free protein shows the characteristics of a molten-globule state (i.e., it contains significant secondary structure but lacks well-defined tertiary contacts)300–302 (Figure 8). In contrast, when IĸB is bound to NFĸB it shows the characteristics of a protein that is well folded throughout302.
Clearly, the binding of IĸB to NFĸB relies on changes in the folded states of both of the proteins. The disordered PEST domain in IĸB at least partly mediates the tight binding to NFĸB since deletion of this region leads to a 500-fold decrease in binding affinity303. The PEST region interacts with the dimerization domain of NFĸB and it is at least partially folded in the complex with NFĸB298,302. The other binding hot spot involves the first three ARs of IĸB, which interact with the disordered NLS of NFĸB, which folds upon binding into a two-helix structure304. Thus, both hot spots appear to undergo folding coupled to binding. The presence of two hot spots at either end of the interface where folding occurs upon binding hinted at the possibility of a “squeeze” mechanism, which leads to high affinity of the complex by stabilizing the AR domain303.
The disordered PEST domain and the partially folded C-terminal ARs (ARs 5 and 6) also play an important role in dissociating NFĸB from DNA in a highly efficient kinetic process. In this “stripping” mechanism IĸB increases the dissociation rate of NFĸB from DNA and the analysis of various IĸB mutants shows that this rate enhancement depends on the largely flexible part of IĸB305. A recent NMR study in which the NFĸB-IĸB complex was titrated with an excess of DNA revealed a possible mechanistic rationale for how IĸB enhances dissociation of NFĸB from DNA306. The first two ARs of IĸB, which are stably folded in the free protein, are immediately capable of binding to the NLS segment of DNA-bound NFĸB. The binding of these first two ARs brings the other ARs of IĸB in close proximity to their binding site on NFĸB, and a ternary complex is formed as the fifth and sixth ARs undergo coupled folding and binding. The establishment of this ternary interaction allows the disordered PEST region of IĸB to displace DNA from the DNA-binding site of NFĸB. The electrostatic repulsion between the negatively charged PEST region and DNA also likely contributes to the effective stripping of NFĸB from DNA.
There is a growing number of examples of complex regulatory mechanisms involving disordered protein interactions in signaling through formation and dissociation of protein:protein307–308 and protein:nucleic acid complexes309–310. Such mechanisms exploit the unique sequence, energetic and structural properties of disordered proteins within discrete protein complexes. The involvement of disordered proteins is equally significant for formation of higher-order associated protein states, addressed in the following section.
It has been known for many years that transmission of a signal from an extracellular receptor across the membrane often requires the assembly of discrete dimers or trimers of membrane proteins 311–312. However, it has been recently found that large, higher-order assemblies or signalosomes can enable signaling. The first description of this kind of higher-order assembly emerged with the crystal structures of some members of the death domain fold superfamily313–315. These higher-order assemblies showed helical symmetry, which can form the basis of filamentous structures. Two other types of higher-order complexes were also identified, a filamentous amyloid assembly of RIP1/RIP3 complex316 and a two-dimensional lattice of TRAF6 formed by alternating dimerization and trimerization317. Though the role of disorder was never investigated in detail in these systems, evidence suggests that flexibility indeed has an important role in the assembly of these higher-order complexes. For example, the region between the kinase domain (KD) and the death domain (DD) in RIP1 (residues 300–560) and the region C-terminal to the KD in RIP3 (residues 300-end) are mostly disordered318. Short segments of these disordered sequences around the RIP homotypic interaction motifs (RHIMs) have propensities for β-strands and mediate the assembly of the heterodimeric amyloid filament. One of the advantages of the higher-order assembly of these complexes is that the spatial clustering within these assemblies facilitates proximity-driven activation. Supramolecular assemblies may also allow unique mechanisms of signal amplification, and their thermodynamic and kinetic characteristics may result in threshold behavior such that responses are initiated only by high-dose and persistent stimuli319. 
Higher-order assemblies may also provide transient spatial compartmentalization without the use of membrane partitions, and thus the signaling process is localized in the cell without generating unwanted cross-reactions319.
Assembly of multivalent signaling protein complexes can also involve phase transitions of component proteins from soluble, dispersed states to the liquid-liquid phase-separated state320–321. Phase separation can occur through weak multivalent homotypic or heterotypic interactions mediated by IDPs/IDRs with low complexity sequences or multivalent folded domains. The resulting phase-separated state is highly dynamic with exchanging multivalent interactions. Multivalency in the context of extracellular ligands binding to cell surface receptors often causes cross-linked networks322, but it has been shown recently that intracellular multivalent interactions such as in the nephrin-Nck-N-WASP system can produce liquid-liquid phase separation that could yield sharp transitions between functionally distinct states320. The transmembrane nephrin protein has an essential role in forming the glomerular filtration barrier in the kidney, and mediates actin reorganization323. The disordered cytoplasmic tail of the protein contains multiple YDxV sites that form preferred binding motifs for the Nck SH2 domain once phosphorylated by Src-family kinases323–324. Nck also contains SH3 domains, which can bind to the proline-rich disordered region of N-WASP325. As a consequence of Nck binding to N-WASP, nucleation of actin filaments by the Arp2/3 complex is stimulated. Multivalency of the system is necessary for proper actin assembly324, and it was shown that multivalency is responsible for the observed phase transition and that the transition can be disrupted by addition of monovalent molecules320–321. Importantly, phase separation can also occur when phosphorylated nephrin is bound to membranes, leading to the formation of dynamic, micron-sized clusters (puncta) at membranes321. The critical concentration of Nck/N-WASP is lower for puncta formation at membranes than for the liquid droplet formation that occurs when Nck, N-WASP and the cytoplasmic tail of nephrin are mixed in solution. 
Nevertheless, the transition in both cases is governed by the degree of phosphorylation of nephrin320–321, which enables finely tuned regulation by the kinase and generates non-linearity in this signaling pathway. The observed micrometer-sized liquid droplets in aqueous solution contain large polymeric species and provide evidence for phase separation. Moreover, it was shown that phase separation correlates with the activity of the nephrin-Nck-N-WASP system, i.e., the phase-separated state of the system stimulates Arp2/3-mediated actin assembly320. This example is likely to be representative of the underlying mechanisms of microscopically observed puncta formation upon activation of many different signaling pathways, underscoring the significance of phase separation of IDRs containing modular binding motifs along with proteins containing cognate modular binding domains in regulation of signaling.
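One way to see why control by the degree of phosphorylation can generate non-linearity is a deliberately simple counting argument. The sketch below is a toy illustration, not a model from the cited studies: it assumes that each tyrosine site on a tail is phosphorylated independently with the same probability, that a tail needs at least two phosphorylated sites to act as a cross-linker, and that there are three sites per tail (the text states only "multiple"). The function name and all parameter values are hypothetical.

```python
from math import comb


def fraction_multivalent(n_sites: int, p_phos: float, min_sites: int = 2) -> float:
    """Fraction of tails carrying at least `min_sites` phosphorylated sites.

    Toy assumption: each of the `n_sites` sites is phosphorylated
    independently with probability `p_phos` (a binomial model).
    """
    if not 0.0 <= p_phos <= 1.0:
        raise ValueError("p_phos must lie between 0 and 1")
    if n_sites < 0 or min_sites < 0:
        raise ValueError("site counts must be non-negative")
    # Sum the binomial probabilities for k = min_sites .. n_sites.
    return sum(
        comb(n_sites, k) * p_phos**k * (1.0 - p_phos) ** (n_sites - k)
        for k in range(min_sites, n_sites + 1)
    )


if __name__ == "__main__":
    # Hypothetical three-site tail: scan the per-site phosphorylation level.
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"p = {p:.1f}  multivalent fraction = {fraction_multivalent(3, p):.3f}")
```

In this toy picture the multivalent fraction rises faster than linearly at low phosphorylation levels, which is only one of several ingredients that could contribute to the sharp transitions described above.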
In multicellular fungi the cell-to-cell channels (septal pores) allow communication and transport between cells. The septal pores provide many advantages but also make the cells more vulnerable to environmental damage. To minimize the risk of this damage, in different species different organelles are attached to the septal pores, including the Woronin-body in Ascomycota and the septal pore cap (SPC) in Basidiomycota326–327. Mass spectrometry analysis of the Woronin body identified non-homologous septal pore-associated (SPA) proteins that are intrinsically disordered and form large-scale associated protein aggregates at the septal pores328. The SPA disordered domains are enriched for charged residues and one of the SPA proteins, SPA5, is extremely charged, with extensive arginine/aspartic acid (RD) repeats. The association of SPA does not appear to rely on amyloid formation329, since it involves either α-helical or disordered structures and little β-structure. Based on the amino acid composition of SPAs, it was suggested that SPA assembly is promoted by attractive electrostatic interactions, hydrogen bonding and weak hydrophobic interactions. The physical properties of the SPA associated state generated in an in vitro assembly assay suggest a liquid-gel phase separation328. It was proposed that pore lining and occlusion by the SPA proteins is initiated by association nucleation at the pore rim, followed by growth through interaction of SPA disordered domains. The structural plasticity of SPA in its likely phase-separated state ensures proper adaptation to different pore diameters and conversion of pore-lining rings into pore-occluding plugs328 required for function.
Phase-separated proteins have been proposed to form the basis of membraneless organelles in the cell, primarily RNA processing bodies such as nucleoli, germ granules, stress granules and processing (P) bodies that appear to be comprised of a fluid-like assembly of RNA and disordered RNA-binding proteins330–332. RNA granules in the cytoplasm are thought to form when multiple low affinity interactions between IDR-containing proteins of nonpolysomal messenger ribonucleoprotein particles (mRNP) promote the demixing phase separation, resulting in non-membranous, protein-dense structures333–334. Similar processes can lead to formation of RNA processing bodies in the nucleus. Phase separation is concentration-dependent, with increased concentration of proteins or RNA helping to induce RNA granule formation330,332,335. It was also shown that RNA granule dynamics are regulated by the phosphorylation of serine-rich, disordered proteins336. The importance of RNA processing, including formation of ribosomal RNA, various essential small RNA pathways, splicing to form mRNA, and translation of mRNA to proteins, highlights the significance of this role of disordered protein interactions in regulating biology.
It has been recently suggested that stress granules (SGs) act as cytoplasmic signaling centers analogous to classical receptor-mediated signaling complexes337. SG assembly occurs in response to stress-induced translational arrest338 and, once formed, SGs become hubs that intercept a subset of signaling molecules of classical pathways and alter their outcome in metabolism, growth and survival. For example, recruitment of different ribonucleases and helicases in the SGs influences protein translation and mRNA metabolism, but SGs also sequester proteins that regulate alternative splicing, thus they can also modulate gene expression. SGs may inhibit growth signaling by diverting TORC1 from its active location at lysosomes, delay apoptosis by sequestration of RACK1 from JNK, and influence polarity by sequestration of key components of the Wnt signaling cascade. The repertoire of SG functions is still expanding and it is increasingly evident that SG assembly and downstream signaling functions require phase separation facilitated by IDRs337. It was also shown for Muscle Excess Homolog 3B (MEX3B) and tristetraprolin (TTP) proteins that phosphorylation within IDRs regulates the recruitment of individual proteins to and their release from SGs339–340.
Nuclear speckles or interchromatin granule clusters are examples of dynamic RNA processing bodies, in this case forming as a result of protein-protein interactions among pre-messenger RNA splicing factors and other proteins such as kinases and phosphatases at the telophase/G1 phase transition341. They likely also contain transcription factors and RNA processing factors342–343. Their number and sizes can vary according to the levels of gene expression and in response to metabolic or environmental signals that influence the available pools of active splicing and transcription factors344. A basal level of factor exchange occurs between the speckles and the nucleoplasm (i.e. transition between the phase-separated and dispersed states) that is regulated by phosphorylation/dephosphorylation of speckle proteins, particularly the disordered SRSF1341,345–346. This mechanism ensures that the required splicing and transcription factors in the correct phosphorylation state are available at the sites of transcription and that factors that are not needed can be sequestered out of the nuclear pool. While much more needs to be understood, such as whether the release of factors is modulated by signaling from the gene to the speckle or whether it is indirectly controlled by changes in factor levels, the importance of phase separation of IDRs/IDPs in this signaling process is clear.
Another family of membraneless organelles is the nuage/chromatoid body (CB) family present in the cytoplasm of spermatocytes and spermatids and formed by RNA and RNA-binding proteins. These germ granule organelles have crucial roles in mRNA regulation and small RNA-mediated gene control347. A primary constituent of nuage is Ddx4348, which has extended N- and C-terminal disordered regions in addition to the central DEAD-box RNA helicase domain349. The disordered N-terminus of Ddx4 was shown to spontaneously phase separate both in HeLa cells and in vitro, forming liquid-like droplets that dynamically respond to environmental changes including temperature and salt concentration. Although the interactions that drive phase separation are not understood for most membraneless organelles, these interactions are clearly distinct from the backbone H-bond interactions formed between adjacent β-sheets in amyloid fibrils350. Ddx4 organelles appear to be stabilized through weak electrostatic interactions involving charged residues that are clustered in Ddx4, since scrambling of these charged blocks prevents formation of Ddx4 organelles. Another important feature of Ddx4 sequences is the over-representation of FG and RG sequences for which the relative spacing was found to be conserved, suggesting that cation-π interactions might be important for the phase separation349. Over-representation of specific dipeptide motifs has also been identified in other proteins forming membraneless organelles333,351–354, which suggests that weak interactions involving conserved motifs may be a general feature of phase separation mechanisms. The dynamic nature of these organelles is critical for their function, and small perturbations such as post-translational modifications can dramatically affect phase separation349,355–358.
The nuclear pore complex (NPC), a 100 MDa gateway for molecular trafficking into and out of the nucleus, is comprised of variable numbers of nucleoporin proteins (Nups). Some Nups are rigid and form the framework of the ring-like structure and others have both folded domains and IDRs, the latter of which occupy the central zone of the pore through which transport occurs. These centrally located Nups, numbering about one dozen of varied abundance, have IDRs that are enriched in repetitive motifs containing phenylalanine and glycine (FG-Nups)359. Their folded domains bind components of the rigid ring framework, anchoring the FG-Nups on the inner surface of the NPC and near its entrance and exit. Filamentous FG-Nups project outward into the cytoplasm while others occupy space within the nuclear basket. While the general structure and protein composition of the NPC have been understood for decades360, details of its molecular organization have emerged only in recent years. Chait, Rout, Sali and coworkers used hybrid structural and biochemical methods to compute a structural model of the NPC (Figure 9a)361 that is generally consistent with more recent experimental imaging results from super-resolution light microscopy362 and electron tomography363. In the former modeling studies, the FG-Nups were defined only by their anchor points on the inner surface of the structured ring and the FG repeat domains appeared as a blur of protein density within the central core. Another super-resolution imaging study364 resolved the FG repeat domains through detection using fluorescently-tagged wheat germ agglutinin (FL-WGA) that binds to O-linked N-acetylglucosamine (O-GlcNAc)-modified sites within nucleoporins. The FL-WGA staining revealed a ring of diameter 38 ± 5 nm within an outer ring of 161 ± 17 nm defined by detection of fluorescently-labeled gp210, an integral membrane protein that surrounds NPCs within the nuclear membrane.
While the structure of the NPC has come into focus through in vitro structural investigations and super-resolution imaging of samples isolated from or in situ within cells, questions remain regarding the molecular details of NPC-mediated transport mechanisms. Central to these mechanisms are the repetitive FG motifs within disordered regions of the FG-Nups359 and several models have been proposed to explain how they contribute to (i) selective, nuclear transport receptor (NTR)-dependent cargo transport through the pore, (ii) impermeability of the pore to non-NTR-bound molecules above a certain size threshold (~30 kDa) and (iii) passive diffusion through the pore of molecules of sizes below this threshold. Among these mechanisms are the virtual gate model365–366, the brush model367, the selective phase model368–369 and the reduction of dimensionality model370–371. A substantial body of work has related the in vitro phase separation properties of disordered FG-Nups to their roles in transport through the NPC, and has been interpreted as providing support for the selective phase model. For example, in one study, Görlich and co-workers372 isolated intact NPCs from Xenopus oocytes, removed O-GlcNAc-modified FG-Nups, reconstituted the pores with one type of pure, recombinant FG-Nup, and then measured transport efficiency across nuclear membranes. The results showed that reconstitution with Nup98, which forms a “hydrogel” phase-separated protein state in vitro that is termed “cohesive”, supported NTR-dependent cargo transport but also formed a barrier to passive transport of large cargoes (Figure 9.). In vitro studies showed that hydrogels formed from Nup98 were permeable to NTR-bound molecules but impermeable to others. Substitution of the FG repeat domains of Nup98 with similar domains from other cohesive FG-Nups also supported NTR-mediated transport and hydrogel permeation. 
In contrast, mutation of the Nup98 domain to eliminate the FG motifs imparted non-cohesive properties that eliminated both NTR-facilitated transport and the barrier to passive diffusion, and the isolated mutant Nup98 failed to form hydrogels in vitro (Figure 9b). The model that has emerged from this study is that transient interactions between the FG motifs of cohesive Nups establish a mesh-like barrier —the selective phase—that is impermeable to non-NTR-bound cargoes. However, Görlich and co-workers372 propose that hydrophobic surfaces on NTRs can also engage the FG motifs and locally melt the mesh structure, enabling rapid NTR/cargo diffusion within the pore. The high multivalency of FG motifs within FG-Nups enables individually weak and transient FG motif-FG motif interactions to mediate mesh-like barrier formation373 and for NTR-bound cargoes to only locally disrupt—“melt”—the mesh structure and therefore preserve the macroscopic barrier. Importantly, the mesh-like barrier is described to reform, or “heal”, as soon as a NTR/cargo complex diffuses to another region within the pore. Schmidt and Görlich provided additional support for the selective phase model by showing that FG-Nups from a wide range of species displayed hydrogel-forming, cohesive behavior as well as gel permeation by NTRs374. However, the studies discussed above leave open the issue of the conformational state of FG-Nups within the actual pore of the NPC. While hydrogel structures formed by cohesive FG-Nups in vitro exhibit NTR-dependent permeation, whether the temporal stability of molecular configurations within these structures is compatible with rapid transport of NTR-bound cargo through the NPC remains an open question.
A recent computational study using coarse-grained modeling of FG-Nups positioned within a monolithic, NPC-shaped pore375 showed a homogeneous distribution of the dynamic, disordered FG motif-containing domains within the center of the pore, consistent with their association with a constantly fluctuating, dense barrier. This study showed that most of the central volume of the pore exhibited net positive charge while regions of the FG-Nups very close to their attachment sites on the inner walls of the pore gave rise to clusters of net negative charge. This model also displayed properties of selective transport, exhibiting energetically favorable interactions with charged/hydrophobic model cargoes (features associated with NTR/cargo complexes) and unfavorable interactions with other types of model cargoes. A coarse-grained computational study of another FG-Nup376 reported the highly dynamic nature of FG motif-FG motif and FG motif-NTR interactions and noted the essential role that these dynamics play in cargo transport.
In a recent study by Blackledge, Gräter, Lemke and co-workers, single-molecule and stopped-flow fluorescence, NMR spectroscopy, and computational methods were used to characterize kinetic, thermodynamic, structural and dynamic features of an FG-Nup protein (Nup153) interacting with Importin-β and other NTRs377. Disordered Nup153 populates an ensemble of conformations that readily bind to NTRs at rates approaching the diffusion limit. Multiple FG motifs within Nup153 rapidly bind to and dissociate from the NTR surface; the authors provided compelling arguments that these highly dynamic interactions between Importin-β and FG-Nups mediate rapid transport of cargo through the NPC. In another recent study, Hough, Rout, Cowburn and co-workers used NMR spectroscopy to study FG-Nup/NTR interactions in a variety of solvent environments, from simple buffers to the crowded interior of E. coli378. The results showed that an FG-Nup remained highly dynamic and disordered while binding to an NTR, consistent with highly transient interactions of multiple FG motifs with multiple sites on the surface of the NTR. The rapid FG-Nup/NTR association and dissociation events observed in this study were described as being compatible with rapid passage of cargo-bound NTRs through the FG-Nup matrix of the NPC central pore.
Other advances in the NPC field have involved mapping the dynamic trajectories of molecules transiting the nuclear pore, both through NTR-facilitated and passive mechanisms, using a variety of sub-diffraction limit, single-molecule optical microscopy methods. In one study379, the transit of individual quantum dots (18 nm in diameter) tethered to Importin-β binding domains (IBBD) was monitored with 6 nm spatial and 25 msec temporal resolution in an in vitro system based on permeabilized HeLa cells. Strikingly, only 20% of the IBBD-decorated quantum dots that entered NPCs were successfully transported (in an Importin-β-dependent manner), with the vast majority entering and then exiting on the same side, often after extensive excursions within the FG-Nup-filled pore. Transit within the pore was described as being anomalously subdiffusive (meaning that the pore exhibits high viscosity relative to aqueous solution), consistent with the crowded environment of the dense FG-Nup phase, and involved apparently stochastic back and forth movements. The rate of diffusion was positively correlated with the density of the IBBDs decorating the quantum dot cargoes, showing that these interactions are required for rapid transit. Additionally, the authors showed that RanGTP was required for cargoes that reached the nuclear side of pores to exit, suggesting that RanGTP-dependent detachment of cargo from the NTR is required prior to pore exit. Finally, this report also provided data suggesting that the cytoplasmic filaments comprised of a subset of the FG-Nups serve to confine NTR-bound cargoes in the vicinity of the opening to the central pore, facilitating their entry into and transport through the pore.
A study that used another super-resolution optical microscopy technique, with 9 nm three-dimensional spatial and 400 μsec temporal resolution380, revealed different transit zones for passive and NTR-facilitated cargoes. Cargoes small enough to move passively (<30 kDa) were observed to travel through the central region of pores while NTR-facilitated cargoes traveled via a zone near the walls of the pore structure. The boundary between these transit zones was influenced by transport cofactors; for example, increased levels of the NTR Importin-β caused expansion of the facilitated transit zone while increased levels of RanGTP were associated with its contraction. These two latter reports further support the role of FG-Nup/NTR interactions in the transit of NTR-bound cargoes through the pore368–369.
A single-molecule study observed the dynamic behavior of C-terminally GFP-labeled Nup153, an FG-Nup that is attached via its N-terminus to the NPC’s nuclear basket381, and suggested a directed transit mechanism. Intriguingly, the GFP label had two localizations, one within the nuclear basket region and the other near the cytoplasmic side of pores, and exhibited dynamic transitions between these two sites with timescales of 3 msec and 5 msec for transitions from the cytoplasmic to nuclear basket site, and vice versa, respectively. The authors contend that the C-terminal FG motif-containing domain of Nup153 is long enough to extend from the nuclear basket point of attachment to the cytoplasmic side of the pore. Furthermore, the authors showed that the Nup153 dynamics correlated with those of cargo transport. These authors’ interpretation of their observations is that Nup153 experiences dynamic fluctuations between compact and extended conformations which direct translocation of NTR-bound cargoes from the cytoplasmic to nuclear sides of the NPC. Interestingly, Nup153 was shown in vitro using single-molecule and ensemble fluorescence methods to form hydrogels that were permeable to Importin-β382; it is unknown whether this FG-Nup, under these in vitro conditions, experienced the dramatic structural fluctuations discussed above. Regardless of the exact details of transit mechanisms, it has become clear that FG motifs within disordered regions of the FG-Nups mediate transient intra- and inter-molecular interactions that create the permeability barrier structure of the NPC. NTR-cargo complexes dynamically interact with the FG motifs, enabling their rapid diffusion. The report noted above suggests that large-scale motions of at least one FG-Nup may also contribute to cargo transit through nuclear pores.
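The two transition timescales reported for the Nup153 C-terminus can be turned into approximate site occupancies if the system is treated as a simple two-state exchange. The sketch below makes that assumption explicit: it interprets the 3 msec and 5 msec values as mean dwell times at the cytoplasmic and nuclear basket sites, respectively, and assumes first-order (single-exponential) kinetics. Neither assumption is stated in the cited study, and the function name is hypothetical.

```python
def two_state_populations(dwell_a_ms: float, dwell_b_ms: float) -> tuple[float, float, float]:
    """Equilibrium occupancies and relaxation time for A <-> B exchange.

    Assumes first-order kinetics, so the rate of leaving a state is the
    reciprocal of its mean dwell time. Returns (p_a, p_b, tau_relax_ms).
    """
    if dwell_a_ms <= 0 or dwell_b_ms <= 0:
        raise ValueError("dwell times must be positive")
    k_ab = 1.0 / dwell_a_ms  # rate of leaving A (per msec)
    k_ba = 1.0 / dwell_b_ms  # rate of leaving B (per msec)
    k_sum = k_ab + k_ba
    p_a = k_ba / k_sum       # detailed balance: p_a * k_ab == p_b * k_ba
    p_b = k_ab / k_sum
    return p_a, p_b, 1.0 / k_sum


if __name__ == "__main__":
    # A = cytoplasmic site (3 msec), B = nuclear basket site (5 msec); assumed mapping.
    p_cyto, p_basket, tau = two_state_populations(3.0, 5.0)
    print(f"cytoplasmic: {p_cyto:.3f}, basket: {p_basket:.3f}, relaxation: {tau:.3f} msec")
```

Under these assumptions the label would spend somewhat more time at the nuclear basket site than at the cytoplasmic site, with exchange relaxing on a timescale of about 2 msec.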
The protein constituents of the NPC are known to be post-translationally modified. For example, a recent study showed that phosphorylation of Nup98 by Cdk1 and NEK is associated with the NPC disassembly that accompanies nuclear membrane breakdown during mitosis383 and other Nups are known to be phosphorylated at this cell cycle stage384. Also, certain Nups have been known for decades to be modified by N-acetylglucosamine. Early findings showed that wheat germ agglutinin (WGA) inhibited cytoplasmic to nuclear transport385 and that the mechanism of inhibition involved the binding of WGA to O-GlcNAc modifications within proteins peripherally associated with the NPC386. In particular, the FG-Nups are known to be O-GlcNAc modified387 but data linking this type of modification with transport function have been slow to emerge. In one systematic study of the transport and cohesive features of a variety of FG-Nups388, O-GlcNAc modification was shown to enhance the NTR-dependent hydrogel permeation properties of Nup98. These results clearly demonstrate that it will be necessary in the future to consider the influence of these modifications to fully understand NPC function in vivo. A recent report on metabolic labeling of proteins in cells with a photo-activated, cross-linkable form of O-GlcNAc389 offers opportunities to achieve this goal by probing the role of this modification in modulating the interactions of the FG-Nups within the NPC.
The results discussed above significantly advance our understanding of the molecular details of transport through the NPC. In isolation, the FG-Nup proteins are highly dynamic and disordered and, under certain conditions, spontaneously form hydrogel structures that are permeable to cargo-bound NTRs but not to isolated cargoes. However, while FG-Nup hydrogels recapitulate some features of native NPCs, namely selective interactions with and permeation by NTRs, the semi-rigid structural nature of these hydrogels is described by some378 as being incompatible with the rapid molecular handoffs that must accompany the msec transit times of NTR/cargo complexes through the NPC. The two recent studies noted above377–378, conducted in both sparse and dense molecular environments, showed that FG motifs within disordered, monodisperse FG-Nups bound transiently and multivalently to multiple sites on NTRs, providing a mechanism for rapid transit through the pore. However, these studies also do not directly address the nature of the densely packed FG-Nups within the NPC pore, although they do suggest that the environment of the pore is more liquid-like than hydrogel-like to preserve the disordered and dynamic features that are associated with very fast FG-Nup/NTR association and dissociation events. As phase separation gives rise to a range of viscosities, the FG-Nups could phase separate in the cell to a more liquid-like and dynamic state than the hydrogel observed in vitro, potentially controlled by PTMs and other protein interactions, suggesting the importance of the overall mechanistic understanding provided by the model of phase separation of FG-Nups. Further study, thus, is needed to explore the structural features and dynamics of the FG-Nups, in the cellular context of the nuclear pore. Single-molecule fluorescence methods and NMR spectroscopy have great potential to contribute toward these goals. 
Another frontier is to develop quantitative descriptions with atomistic resolution using computational methods of NPC pore structure and dynamics, and the intrinsic heterogeneity of these features, and how these features mediate the different types of transport that occur via the NPC.
While some nucleolar proteins have been observed to undergo liquid-liquid phase separation390, additional nucleolar proteins have been found to form more discrete and fibrillar supramolecular structures. The Ink4a-Arf gene locus encodes two proteins with unrelated amino acid sequences that derive from two overlapping exons that utilize alternative reading frames391. Exons 1α, 2 and 3 within this locus encode the p16Ink4a tumor suppressor, a folded protein that binds to Cdks 4 and 6 and blocks their binding to and activation by cyclin D392. Exon 1β, and alternatively read exons 2 and 3, encode the Arf tumor suppressor (p14Arf in humans, p19Arf in mice). These are disordered proteins that were first discovered to bind and inhibit Mdm2 (Hdm2 in humans), the E3 ligase for p53393. p16Ink4a has been structurally characterized in isolation by solution NMR394 and in complex with Cdk6 using X-ray crystallography392, providing clear insights into the mechanism through which it inhibits Cdks 4 and 6. In contrast, while Arf has been reported to interact with more than 30 other proteins391, structural details are available only for its complexes with Mdm2 and Nucleophosmin (Npm)395–398. When its expression is induced by oncogene activation, the Arf protein becomes localized within nucleoli, where it binds to Mdm2, leading to p53 stabilization and activation, and to Npm, inhibiting ribosome biogenesis391.
p14Arf and p19Arf, 132 and 169 amino acids in length, respectively, are unusual in that their sequences are 18% and 22% arginine residues (with 0 and 1 lysine residues, respectively) that are counterbalanced only by 2% and 4% aspartate and glutamate residues, making the proteins highly positively charged. Based on their biased amino acid compositions, the human and mouse Arf proteins can be classified as weak polyelectrolytes (polycations) according to the charge distribution-dependent structural landscape theory of Das and Pappu43. The two full-length Arf proteins have not been structurally characterized. The unique exon 1β encodes the N-terminus, which is the most highly conserved region amongst available Arf protein sequences. A 37 amino acid, N-terminal fragment of p19Arf (Arf-N37) was shown to bind Mdm2 in the nucleolus, activate p53 and cause cell cycle arrest, all of which are functional hallmarks of the full-length protein399–400. Arf-N37 exhibited disordered features by NMR400 but promoted formation of β-strand secondary structure upon binding to the central, disordered domain of Mdm2 (Mdm2 Arf binding domain, Mdm2-ABD; residues 210–304). Interestingly, in contrast to the polycationic character of Arf-N37 (30% positively charged residues, 10/11 Arg; 3% negatively charged residues), the sequence of Mdm2-ABD has 33% negatively charged residues and 2% positively charged residues and is classified as being at the boundary between the weak and strong polyelectrolyte (polyanion) regions of the Das and Pappu structural landscape43. Analysis of the N-termini of p14Arf and p19Arf, the region that is evolutionarily most highly conserved, revealed tandem motifs containing Arg residues separated by five or six mostly hydrophobic residues. 
Based on studies of short (15 amino acids in length) peptides, it was shown that these Arg-rich motifs bind to two conserved regions within Mdm2-ABD in which negatively charged or polar residues alternate with hydrophobic residues—these were termed the H1 and H2 segments. Just as addition of Arf-N37 to Mdm2-ABD caused β-strand formation [based on analysis using circular dichroism (CD)], so did addition of Arf-derived, Arg-rich peptides. It was also shown that addition of a peptide corresponding to the p14Arf N-terminus (residues 1–14) to another corresponding to the H1 segment of Hdm2 (residues 240–254) caused formation of β-strand secondary structure. However, analysis of these structures using Fourier-transform infrared spectroscopy suggested that they were comprised of amyloid fibril-like β-strands rather than native β-strands found in monodisperse, globular proteins401. Analysis of these structures, and others formed by co-addition of various combinations of Arf-derived peptides or Arf-N37, and Mdm2/Hdm2-derived peptides or Mdm2-ABD, using negative stain electron microscopy revealed fibril-like features401–402. This is consistent with disappearance of NMR resonances of either isotope-labeled Arf or Mdm2 upon titration of the binding counterpart395. Because Arf and Mdm2 co-localize within nucleoli, effectively sequestering the E3 activity of Mdm2 apart from nuclear p53, it has been hypothesized that Arf co-assembles with Mdm2 to form bi-molecular, supra-molecular structures comprised of fibril-like β-strands395. A complication in studies of these structures is that when polypeptides derived from Arf and Mdm2 are co-mixed, precipitation often occurs, probably due to neutralization of the extensive and opposite charges of the two polypeptides.
Recently, the issue of charge neutralization and precipitation was addressed by studying the interaction of the Hdm2 Arf binding domain (Hdm2-ABD) with a truncated peptide containing the first two-thirds of the first Arf motif from p14Arf (termed A1-mini; N-GRRFLVTVR-C)396. Addition of the A1-mini peptide to Hdm2-ABD induces β-strand structure and the formation of discrete structures with varied A1-mini:Hdm2-ABD stoichiometry. Truncation of the Arf motif to 9 residues weakens and probably limits opportunities for multivalent interactions with Hdm2-ABD, preventing precipitation and enabling detailed characterization of the bi-molecular, β-strand-rich co-assemblies by NMR and analytical ultracentrifugation (AUC). These studies led to hypothetical models of β-strands comprised of alternating, oppositely charged segments derived from Arf and Hdm2 (Figure 10)396. In summary, Arf associates with Mdm2/Hdm2 through a novel co-assembly mechanism involving regions of the two proteins that are disordered in isolation and that form fibril-like, β-strand-rich structures that cause sequestration in nucleoli, inhibiting E3 ubiquitin ligase activity.
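The charge-composition arguments used above (fractions of positively and negatively charged residues) can be illustrated with a minimal sketch. It assumes, as a simplification, that only Lys and Arg count as positive and only Asp and Glu as negative, and it ignores histidine and the chain termini. The function name `charge_composition` and the dictionary keys are hypothetical, and the sketch does not assign a region of the Das and Pappu landscape. The only input is the A1-mini sequence quoted in the text.

```python
from collections import Counter

# Assumption: Lys/Arg carry +1 and Asp/Glu carry -1; His and termini are ignored.
POSITIVE = "KR"
NEGATIVE = "DE"


def charge_composition(sequence: str) -> dict:
    """Return simple charge statistics for a one-letter amino acid sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    f_pos = sum(counts[aa] for aa in POSITIVE) / n  # fraction of positive residues
    f_neg = sum(counts[aa] for aa in NEGATIVE) / n  # fraction of negative residues
    return {
        "length": n,
        "arg": counts["R"],
        "lys": counts["K"],
        "f_pos": f_pos,
        "f_neg": f_neg,
        "fcr": f_pos + f_neg,   # fraction of charged residues
        "ncpr": f_pos - f_neg,  # net charge per residue
    }


# A1-mini peptide from p14Arf, as given in the text (N-GRRFLVTVR-C).
A1_MINI = "GRRFLVTVR"
result = charge_composition(A1_MINI)

if __name__ == "__main__":
    for key, value in result.items():
        print(f"{key}: {value}")
```

Applied to longer sequences, the same statistics give the percentages of charged residues quoted for Arf-N37 and Mdm2-ABD, provided the same residue-counting convention is used.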
Arf also associates with Npm in the nucleolus, causing the formation of supra-molecular structures upwards of 3 MDa in size403. Npm has a modular structure, with an N-terminal pentamerization domain (residues 10–120), central disordered domain (residues 121–240), and C-terminal nucleic acid binding domain (residues 241–294)398. Arf was shown to bind to the N-terminal pentamerization domain of Npm403–405 and the N-terminal, exon 1β-encoded domain of Arf was shown to bind Npm405. The N-terminal pentamerization domain of Npm (Npm-N) is comprised of β-sheet-containing protomers and experiences phosphorylation-dependent structural polymorphism in which a range of structural forms, from destabilized pentamers to disordered monomers, are populated depending upon the site of phosphorylation398. As was observed with Mdm2-ABD, titration of Arf-N37 caused precipitation of Npm-N, as did fragments of the Arf N-terminus as short as nine amino acids in length (e.g., A1-mini). This phenomenon probably stems from the potential of Arf-derived polypeptides of 9 or more amino acids to participate in multivalent interactions with Npm-N. Studies with an even shorter Arf-derived peptide termed Arf6 (N-MGRRFL-C) revealed the possible mechanism of Arf/Npm interactions, with the Arf6 peptide shown by NMR to bind within negatively charged grooves formed by loops on one face of the pentamer. Many other Npm binding partners contain Arg-rich motifs within their sequences and peptides corresponding to these for a few targets (ribosomal protein L5 and HIV Rev) were shown to also bind within the negatively charged groove of Npm-N. Interestingly, phosphorylation of certain residues locks Npm-N in a disordered, monomeric conformation that cannot bind to Arg-rich motifs such as that contained within the Arf6 peptide. Many different serine and threonine residues within Npm-N have been shown to be phosphorylated in cells.
Because phosphorylation of these same sites (mimicked in vitro by S/T to E/D mutagenesis) modulates Npm-N oligomerization and structure, it has been proposed that phosphorylation-dependent modulation of Npm-N structure plays a role in Npm function in cells.
The interaction of Arf with Mdm2/Hdm2 and Npm causes the formation of large, supra-molecular assemblies through interactions that have a strong electrostatic component stemming from the polycationic features of Arf and the polyanionic features of Mdm2/Hdm2 and Npm. The Arf binding domain of Mdm2/Hdm2 is disordered and interactions with Arf drive the formation of fibril-like, β-strand-rich structures. In contrast, Arf interacts with the folded form of the Npm oligomerization domain although this interaction in cells is associated with assemblies ~3 MDa in size. A short peptide segment derived from p14Arf containing two Arg residues binds to Npm and many such Arg-containing motifs are found in the full-length Arf protein. This situation provides the opportunity for one Arf protein to bind many Npm molecules, an example of multivalency. Furthermore, each Npm-N pentamer has five Arg-rich motif binding sites, possibly enabling the formation of extended networks of Npm and Arf molecules. It is intriguing that, in both Arf interaction scenarios, binding with partners causes formation of large, repetitive structures: fibril-like structures in the case of Mdm2/Hdm2 and possibly networks with Npm. While speculative at this stage, Arf appears to function by sequestering other proteins. Its sequence, replete with Arg-rich motifs, may have evolved to bind and sequester a variety of target proteins that have common polyanionic features.
Phase separation of proteins is not limited to the intracellular space. The extracellular matrix protein, elastin, is critical for elasticity of various connective tissues including the skin, lung and arteries406–407. Elastin self-assembles from tropoelastin monomers into fibrous polymer networks, and the first, critical step in this assembly is a liquid-liquid phase separation that is attributed to the hydrophobic sequence blocks of tropoelastin408. These hydrophobic regions rich in glycine, valine and proline residues are disordered and form repetitive motifs such as VPG and PGVG409. Phase separation is driven by an increase in solvent entropy upon the burial of hydrophobic domains408 and the droplets that are formed protect the hydrophobic interior410. The droplets then grow by coalescence and form a fibrous polymer network crosslinked by lysine residues that has the ability to confer recoil to human tissues411.
Despite the growing number of examples, our current understanding of the forces that drive large-scale association in signaling complexes and what leads to liquid or gel phase-separated states or fibrous states is far from complete. Increasing evidence suggests that these phenomena are quite general, explaining the frequent observations of punctate morphologies of signaling complexes in cells and the protein-dense nature of RNA processing bodies. Phase separation is a highly concentrating process and is related to fiber formation; thus the links between these various associated states are strong, if complex412. The liquid-like phase-separated puncta, granules and membraneless organelles have been collectively termed “assemblages”413 to describe all higher-order assemblies formed by either multivalent folded domains or low-complexity disordered regions and often RNA or DNA. These subsets of higher-order assemblies are highly dynamic and fluid-like, with very high protein concentrations that of necessity reduce the water content and change the dielectric properties349, modulating biochemical processes within them. Since most of the membraneless organelles are associated with RNA processing, it appears that these liquid-like droplets also provide a unique environment suitable for the dynamic recruitment or exclusion of nucleic acids, as demonstrated by the ability of Ddx4 organelles to differentiate between single- and double-stranded nucleic acids and act as a molecular filter349. The exact nature of the various interactions that drive their specific formation is yet to be discovered since biophysical techniques for describing these organelles are just emerging, but some appear to have significant electrostatic components.
Alternative splicing, post-translational modifications such as phosphorylations and methylations, changes in ionic strength, protein concentration and temperature all regulate the assembly and disassembly of these signaling complexes and membraneless organelles which result in dynamic responses to different stimuli essential for the biochemical processes in the cell.
The cellular environment is rich in small molecules, including amino acids, nucleotides, lipids, steroid hormones, primary metabolites, and second messengers, that are recognized and chemically modified by folded enzymes and other proteins and constitute a central component of the processes that enable life. Since disordered proteins constitute >30% of mammalian proteomes, it is likely that the various small molecules present in the cell may also interact, in specific or nonspecific ways, with these dynamic polypeptides. When considering binding to a single protein site, these interactions are likely to be weak due to the need to overcome the high conformational entropy cost associated with binding disordered polypeptide chains. However, it has been proposed that small molecules or ions that bind promiscuously to multiple sites within a disordered protein can increase the entropy of the protein ensemble, causing the entropy change of binding to be thermodynamically favorable414. Progress has been made in characterizing interactions between non-physiological small molecules and disordered proteins. For example, in 2010 Metallo reviewed reports of interactions between small molecules and a handful of disordered proteins, including Myc, amyloid precursor protein, EWS-FLI1, Bcl-2, and NFX1415. A follow-up computational study416 of one of the small molecules reported to bind to Myc (10058-F4) showed that binding of this molecule slightly altered the conformational features of exhaustively sampled Myc conformational ensembles and showed modest agreement with previously reported NMR data for the same compound417.
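The entropy argument above can be written out with the standard thermodynamic relation for binding. This is a textbook sketch rather than the formalism of the cited work; the split of the entropy change into a conformational term for the protein ensemble and a remainder, and the subscripts used, are assumptions introduced here for illustration.

```latex
\begin{aligned}
\Delta G_{\mathrm{bind}} &= \Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{bind}},\\
\Delta S_{\mathrm{bind}} &= \Delta S_{\mathrm{conf}} + \Delta S_{\mathrm{other}}.
\end{aligned}
```

For binding at a single site that restricts the disordered chain, $\Delta S_{\mathrm{conf}} < 0$, so the term $-T\,\Delta S_{\mathrm{conf}}$ is positive and opposes binding. In the promiscuous, multi-site scenario described above, the proposal is that $\Delta S_{\mathrm{conf}} > 0$, so the same term becomes favorable.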
More recently, Krishnan and coworkers418 described a small molecule (MSI-1436) that allosterically inhibits the tyrosine phosphatase PTP1B. MSI-1436 acts by binding to two distinct sites on PTP1B, one of which is within the largely disordered C-terminal region. PTP1B is a target of therapeutic interest in diseases such as diabetes, obesity and breast cancer; however, its active site exhibits structural features that have hindered inhibitor development. Data from NMR and SAXS showed that the binding of MSI-1436 caused compaction of the disordered C-terminal region and altered its allosteric communication with the folded catalytic domain, inhibiting phosphatase activity. Importantly, MSI-1436 specifically inhibited PTP1B in cellular and mouse xenograft models of HER2-dependent breast cancer. In this example, a small molecule alters the energy landscape of an IDR within a multi-domain enzyme, altering allosteric communication between the IDR and the folded catalytic domain.
Apart from these studies, a major area of interest has been the development of small molecules that inhibit the aggregation of neuropathological peptides and proteins, such as Aβ, α-synuclein and Tau. These proteins are known to be disordered and monomeric prior to assembly into oligomeric species and amyloid-like fibrils. However, Eisenberg and co-workers noted the challenges associated with structure-based design of inhibitors of the monomeric forms of amyloidogenic, disordered proteins419. Instead of targeting these monomeric forms, these investigators used high-resolution structural information for fibrils formed from a short segment from Tau to design a non-natural peptide that binds to fibril ends, thus inhibiting fibril elongation.
Past studies have reported small molecules that interacted with monomeric and trimeric forms of Aβ, and the dyes Thioflavin T and Congo Red are known to bind its aggregated forms (reviewed in ref 420). More recently, computational methods have been applied to gain insight into the nature of interactions between small molecules and ensembles of disordered monomeric, amyloidogenic polypeptides, and as a means to identify small molecules that can bind to sites within particular conformers present in such ensembles. For example, Caflisch and co-workers421 studied a variety of small peptides and non-peptide small molecules, which are known to inhibit amyloid formation to varying degrees, interacting with an Aβ-derived peptide using MD and energy evaluation methods. The Aβ peptide formed a partially collapsed, highly dynamic ensemble devoid of secondary structure and these features were slightly but significantly altered in the presence of the interaction partners. The small molecules and peptides bound to structurally varied conformers within the Aβ peptide ensemble and to varying extents (from 70% to 10% of the computation time), and these interactions were largely mediated by aromatic and charged moieties on the respective molecules. While possible mechanisms of inhibition of aggregation and fibril formation were proposed, future studies will be required to determine whether the effects of the compounds examined in this study on Aβ peptide conformations are associated with amyloid inhibition.
Another, more recent study used replica-exchange MD to generate a conformational ensemble for Aβ 1–42, followed by analysis of interactions of conformers with a library of simple molecular fragments and a few other small molecules422. The energy landscape of this peptide was characterized by a broad, shallow basin of low-energy, heterogeneous but collapsed conformers. The authors analyzed the ability of 10 molecular fragments, selected to represent chemical moieties present in fragment libraries423–425, to bind to a representative panel of the most highly populated Aβ 1–42 conformers. The results showed that the molecular fragments bound energetically favorably within hydrophobic pockets created by clustering of a subset of hydrophobic residues within the Aβ 1–42 polypeptide sequence. Two known inhibitors of Aβ aggregation, curcumin and Congo red, were also docked to a subset of the Aβ 1–42 conformers that exhibited fragment-binding pockets. As with the fragments, these two molecules bound favorably within some of the so-called “hot spot” pockets and, in one case, remained stably bound during an 80 msec replica exchange MD trajectory. The authors propose that their computational and analytical strategy is appropriate for generating representative conformational ensembles for disordered polypeptides that can, in turn, be used to computationally screen for potential small molecule binding sites. The authors acknowledge that the hot spot binding sites that they have identified exist in the context of the highly dynamic, disordered Aβ 1–42 polypeptide and that small molecule binding would be accompanied by a large entropic penalty. Nonetheless, the authors’ results are compelling and warrant experimental verification.
Another recent and related computational study addressed the interactions of bifunctional, aromatic small molecules with two Aβ peptides, Aβ 1–40 and Aβ 1–42, complexed with Zn2+ 426. The two Aβ peptides adopted collapsed but distinct conformational ensembles characterized by a broad, relatively shallow low-energy basin within the energy landscapes. These ensembles were used in docking experiments with the small molecules to identify interaction sites. A variety of peptide conformers interacted with the different small molecules, with a consistent observation being that the hydrophobic, aromatic small molecules preferentially interacted with two hydrophobic segments within the Aβ peptides. Interestingly, the two clusters of hydrophobic amino acids identified to bind small molecules in this study (Leu17–Ala21 and Ile31–Val36/Met35) overlap with two of the three hydrophobic clusters identified in the Zhu et al. report described above.
The computational studies discussed above support the view that small molecules can interact with pockets within collapsed conformations of disordered proteins. In two of the studies421–422, computational results were compared with those from NMR and showed some level of agreement regarding the conformational features of Aβ peptide/small molecule complexes. A recent study by Iconaru and coworkers confirmed the tendency of small molecules to bind transiently to hydrophobic clusters within an IDP through studies of p27427. Two groups of small molecules were shown to bind specifically, albeit weakly, to several partially overlapping regions containing aromatic residues within the Cdk2 binding sub-domain of p27 (termed D2; Figure 4). Molecules in Group 1 bound to a localized region containing residues F87YY89 while those in Group 2 bound to this region as well as two others containing residues Trp60 and Trp76. Molecular dynamics computations showed that these aromatic residues form transient clusters, possibly creating binding sites for the small molecules. One of the Group 2 compounds, SJ403, was shown to partially displace p27-D2 from Cdk2 and partially restore catalytic activity. These studies provide another example of a small molecule that acts by altering the energy landscape of an IDP, in this case to sequester p27 in conformations incompetent for binding to one of its functional targets, Cdk2.
Recently, an intriguing study by Zhang and coworkers reported the functional modulation of transcription initiation through the chemical modulation of an intrinsically disordered, low complexity region of the transcription initiation factor TFIID428. This effect occurs through a mechanism different from those associated with the examples above of small molecules binding to segments within IDRs that are enriched in aromatic or aliphatic residues. Zhang and coworkers discovered that tin(IV) oxochloride clusters, initially identified as impurities in a small molecule screening library, exhibited polar interactions with a histidine-rich low complexity region of TFIID that is also capable of nonspecific binding to promoter DNA sequences. The tin(IV) oxochloride oxo-metal clusters stabilized the interaction of this TFIID region with DNA and prevented engagement of RNA polymerase II (PolII) and initiation of transcription. However, tin(IV) oxochloride did not affect the re-entry into successive rounds of transcription (reinitiation) once PolII had been fully engaged into the transcriptional machinery, thus providing a valuable chemical tool to gain mechanistic insights into the transcriptional program.
Together, the above examples demonstrate that small molecule targeting and chemical modulation of IDRs to modulate protein function represents a new paradigm in drug discovery that is accompanied by new conceptual frameworks. IDPs and proteins with IDRs are involved in many different human diseases. The results discussed above clearly demonstrate that small molecules can modulate IDP/IDR function. We look forward to a new era in therapeutics in which academic and industrial drug discovery is directed toward IDPs and IDRs.
The structural heterogeneity of disordered protein ensembles endows IDPs/IDRs with unique regulatory strategies in cell signaling pathways and underpins the importance of uncovering the roles of these proteins in such networks. It also reinforces the need for better descriptions of the structural ensembles sampled by IDPs/IDRs in isolated and complexed states, which are extremely challenging because of under-representative sampling of the conformer pool, the under-determined nature of the problem, and the need for more accurate ways to calculate experimental data from structural models. Incorporation of distance information from fluorescence-based single-molecule methods429 or electron paramagnetic resonance (EPR)430 together with NMR and SAXS data into ensemble calculations may provide additional insights into the conformational properties of the disordered ensemble. Generation and statistical analysis of several different ensembles can also help to yield a final ensemble that is most consistent with the experimental data and to estimate the accuracy of the final ensemble. Nevertheless, those ensemble descriptions are increasingly powerful for providing valuable insights into the range of modes of protein interactions and can be exploited to optimize small molecule inhibitors for therapeutic purposes.
Despite recent progress in understanding the complexity of signaling networks, and the roles IDPs play in signaling processes, a comprehensive mechanistic characterization of the diverse interactions of IDPs remains to be described. IDPs often mediate dynamic protein interactions, which exhibit unusual binding characteristics such as multisite dependence and ultrasensitivity. Some degree of dynamics is preserved even in tightly bound complexes and even if induced folding occurs upon binding. Retaining flexibility in protein interactions in the signaling pathways is critical for effective regulation, and sometimes regulated unfolding of a protein region serves as a regulatory mechanism. Multivalent interactions mediated by IDPs/IDRs alone or together with folded domains can produce liquid-liquid phase separation, and the resulting state is often highly dynamic allowing the exchange of constituent proteins with surrounding cytoplasm or nucleoplasm and providing unique mechanisms of signal amplification.
Although the original concept of allostery requiring two sites to be coupled through a network of stable structural interactions has been extended to embrace the importance of perturbations of the conformational and energetic landscape and the role of dynamics and disorder, allosteric coupling mechanisms mediated by IDPs/IDRs are still not well understood. Furthermore, allosteric mechanisms that “remodel” disordered state ensembles without invoking folding transitions have only recently been recognized. An expanded view that addresses the totality of allosteric transmission of signals via energetic redistribution due to binding and PTM effects through disordered proteins will be crucial for understanding how disordered signaling proteins transmit information to downstream partners in carrying out complex biological processes (Figure 11).
This work was supported by the Canadian Institutes of Health Research, Canadian Cancer Society Research Institute, Cystic Fibrosis Canada, Cystic Fibrosis Foundation Therapeutics, and Natural Sciences and Engineering Research Council of Canada grants (to J.D.F.-K.); US NIH grants R01CA082491, 1R01GM083159, and 1R01GM115634 (to R.W.K.); a National Cancer Institute Cancer Center Support Grant P30CA21765 (to St. Jude Children’s Research Hospital); and ALSAC (to St. Jude Children’s Research Hospital). A.V.F. is the recipient of the Neoma Boadway Fellowship from St. Jude Children’s Research Hospital.
Veronika Csizmok studied biology at the University of Eotvos Lorand (Budapest, Hungary), and she received her Ph.D. degree in structural biology in 2006 with Prof. Peter Tompa from the Institute of Enzymology (Budapest, Hungary). From 2006 to 2007 she worked as a Marie Curie fellow in the group of Dr. Lucia Banci at the Magnetic Resonance Centre in Florence (Italy). She is currently a postdoctoral fellow at The Hospital for Sick Children in the lab of Dr. Julie Forman-Kay. Her long-standing research interests focus on using NMR and other biophysical tools to understand the role of protein intrinsic disorder in different signaling pathways such as ubiquitination and cell cycle regulation.
Ariele Viacava Follis graduated from the University of Pavia, Italy with a B.Sc. in organic chemistry in 2004; he then pursued his Ph.D. in chemistry at Georgetown University, mentored by Dr. Steve Metallo. During his graduate research Ariele studied small molecule interactions with the intrinsically disordered oncogenic transcription factor c-Myc, one of the first investigations of small molecule binding to a flexible and dynamic protein target. Since 2009, Ariele has been a member of Dr. Richard Kriwacki’s group at St. Jude Children’s Research Hospital, where he studies the mechanistic determinants of cytosolic, pro-apoptotic functions of the tumor suppressor p53. His research in this area has unveiled novel modes of cell signaling regulation involving protein dynamics. Ariele has a longstanding involvement with the intrinsically disordered protein research community, and served as a member of the Biophysical Society Intrinsically Disordered Proteins Subgroup Committee in 2012.
Richard Kriwacki received his Ph.D. from the Biophysics Division of the Department of Chemistry at Yale University in New Haven, CT, followed by postdoctoral training with Professor Peter E. Wright at the Scripps Research Institute in La Jolla, CA. In 1996 at Scripps, Drs. Kriwacki and Wright discovered that a small protein named p21Waf1/Cip1 that regulates kinases involved in controlling cell division lacked secondary and tertiary structure in isolation but folded upon binding to its kinase targets. This, together with a few subsequent reports of functional, unstructured proteins, drew attention to the roles of what are now termed intrinsically disordered proteins in biology. In 1997, Dr. Kriwacki joined the Department of Structural Biology at St. Jude Children’s Research Hospital (St. Jude) in Memphis, TN, where he is now a Member. At St. Jude, Dr. Kriwacki has continued studies of disordered proteins, with focus on establishing relationships between their disordered features and biological functions, and has published more than 80 papers in the field. Dr. Kriwacki co-founded the Disordered Proteins Subgroup at the Biophysical Society, leading advocates for the disordered proteins field, and covers this topic as an Editorial Board Member at the Journal of Molecular Biology.
Julie Forman-Kay received her Ph.D. from the Molecular Biophysics and Biochemistry Department at Yale University and trained with Drs. Marius Clore and Angela Gronenborn at the National Institutes of Health in Bethesda, MD. In 1992, Dr. Forman-Kay joined the Research Institute at the Hospital for Sick Children in Toronto, where she now is Head of the Molecular Structure and Function Program. She also holds an appointment in the Biochemistry Department at the University of Toronto. Dr. Forman-Kay’s lab has developed methodological approaches for investigations of disordered proteins and their interactions and characterized the range of their structural properties, binding mechanisms and functional roles. As one of a handful of scientists beginning work in this field in the 1990s, her studies, particularly of dynamic complexes of disordered proteins, have expanded awareness of the importance of disordered proteins. Dr. Forman-Kay has over 120 publications and served on the editorial boards of Protein Science, IDP and Structure.