Contrary to the general belief, many biologically active proteins lack stable tertiary and/or secondary structure under physiological conditions in vitro. These intrinsically disordered proteins (IDPs) are highly abundant in nature, and many of them are associated with various human diseases. The functional repertoire of IDPs complements the functions of ordered proteins. Since IDPs constitute a significant portion of any given proteome, they can be combined in an unfoldome, which is the portion of the proteome that includes all IDPs (also known as natively unfolded proteins, hence unfoldome) and describes their functions, structures, interactions, evolution, and so forth. The amino acid sequences and compositions of IDPs are very different from those of ordered proteins, which makes reliable identification of IDPs at the proteome level possible by various computational means. Furthermore, IDPs possess a number of unique structural properties and are characterized by peculiar conformational behavior, including high stability against low pH and high temperature and structural indifference toward unfolding by strong denaturants. These peculiarities were shown to be useful for the development of experimental techniques for the large-scale identification of IDPs in various organisms. Some of the computational and experimental tools for unfoldome discovery are discussed in this review.
Toxoplasma gondii is an obligate intracellular parasite of the phylum Apicomplexa, which includes a number of species of medical and veterinary importance. Inhibitors of lysine deacetylases (KDACs) exhibit potent antiparasitic activity, suggesting that interference with lysine acetylation pathways holds promise for future drug targeting. Using high-resolution LC-MS/MS to identify parasite peptides enriched by immunopurification with acetyl-lysine antibody, we recently produced an acetylome of the proliferative intracellular stage of Toxoplasma. In this study, we used similar approaches to greatly expand the Toxoplasma acetylome by identifying acetylated proteins in non-replicating extracellular tachyzoites. The functional breakdown of acetylated proteins in extracellular parasites is similar to that in intracellular parasites, with an enrichment of proteins involved in metabolism, translation, and chromatin biology. Altogether, we have now detected over 700 acetylation sites on a wide variety of parasite proteins of diverse function in multiple subcellular compartments. We found 96 proteins uniquely acetylated in intracellular parasites, 216 uniquely acetylated in extracellular parasites, and 177 proteins acetylated in both states. Our findings suggest that dramatic changes occur at the proteomic level as tachyzoites transition from the intracellular to the extracellular environment, similar to reports documenting significant changes in gene expression during this transition. The expanded dataset also allowed a thorough analysis of the degree of protein intrinsic disorder surrounding lysine residues targeted for this post-translational modification. These analyses indicate that acetylated lysines in proteins from extracellular and intracellular tachyzoites are largely located within similar local environments, and that lysine acetylation preferentially occurs in intrinsically disordered or flexible regions.
parasite; proteomics; acetylation; lysine; Apicomplexa; tachyzoite
Biologically active proteins without stable tertiary structure are common in all known proteomes. Functions of these intrinsically disordered proteins (IDPs) are typically related to regulation, signaling, and control. Cellular levels of these important regulators are tightly regulated by a variety of mechanisms ranging from firmly controlled expression to precisely targeted degradation. Functions of IDPs are controlled by binding to specific partners, alternative splicing, and posttranslational modifications, among other means. Normally, the right amounts of precisely activated IDPs have to be present at the right time in the right places. Wrecked regulation brings havoc to the ordered world of disordered proteins, leading to protein misfolding, misidentification, and missignaling that give rise to numerous human diseases, such as cancer, cardiovascular disease, neurodegenerative diseases, and diabetes. Among the factors inducing pathogenic transformations of IDPs are various cellular mechanisms, such as chromosomal translocations, damaged splicing, altered expression, frustrated posttranslational modifications, aberrant proteolytic degradation, and defective trafficking. This review presents some aspects of the deregulated regulation of IDPs that leads to human diseases.
intrinsically disordered proteins; conformational diseases; posttranslational modification; alternative splicing; transcriptional activation; expression; proteolytic degradation; trafficking
Besides being a common threat to farm animals and poultry, coronavirus (CoV) was responsible for the human severe acute respiratory syndrome (SARS) epidemic in 2002–4. However, many aspects of CoV behavior, including modes of its transmission, are yet to be fully understood. We show that the amount and the peculiarities of distribution of the protein intrinsic disorder in the viral shell can be used for the efficient analysis of the behavior and transmission modes of CoV. The proposed model allows categorization of the various CoVs by the peculiarities of disorder distribution in their membrane (M) and nucleocapsid (N). This categorization enables quick identification of viruses with similar behaviors in transmission, regardless of genetic proximity. Based on this analysis, an empirical model for predicting the viral transmission behavior is developed. This model is able to explain some behavioral aspects of important coronaviruses that previously were not fully understood. The new predictor can be a useful tool for better epidemiological, clinical, and structural understanding of behavior of both newly emerging viruses and viruses that have been known for a long time. A potentially new vaccine strategy could involve searches for viral strains that are characterized by the evolutionary misfit between the peculiarities of the disorder distribution in their shells and their behavior.
The native state of a protein is usually associated with a compact globular conformation possessing a rigid and highly ordered structure. At the turn of the last century, a number of studies concluded that many proteins cannot, in principle, form a rigid globular structure in an aqueous environment, yet they are still able to fulfill their specific functions; that is, they are native. The existence of disordered regions allows these proteins to interact with their numerous binding partners. Such interactions are often accompanied by the formation of complexes that possess a more ordered structure than the original components. The functional diversity of these proteins, combined with the variability of signals related to the various intra- and intercellular processes handled by these proteins and their capability to produce multi-variant and multi-directional responses, allows them to form a unique regulatory net in a cell. The abundance of disordered proteins inside the cell is precisely controlled at the synthesis and clearance levels as well as via interaction with specific binding partners and posttranslational modifications. Another recently recognized biologically active state of proteins is the functional amyloid. The formation of such functional amyloids is tightly controlled and therefore differs from the uncontrolled formation of pathogenic amyloids, which are associated with the pathogenesis of several conformational diseases. The development of these diseases is likely to be determined by failures of the cellular regulatory systems rather than by the formation of proteinaceous deposits and/or by protofibril toxicity.
protein folding; globular proteins; natively disordered proteins; protein-protein and DNA-protein complexes; amorphous aggregates; amyloid fibrils; functional amyloid; inter- and intramolecular contacts
In human membrane proteins, intrinsically disordered regions, that is, regions that lack a well-defined three-dimensional structure under physiological conditions, preferentially occur in the cytoplasmic tails. Many of these proteins represent cell receptors that function by recognizing their cognate ligand outside the cell and translating this binding information into an intracellular activation signal. Based on the location of the recognition and signaling (effector) domains, functionally diverse and unrelated cell receptors can be classified into two main families: those in which binding and signaling domains are located on the same protein chain, the so-called single-chain receptors (SRs), and those in which these domains are intriguingly located on separate subunits, the so-called multichain receptors (MRs). Recognition domains of both SRs and MRs are known to be well ordered. In contrast, while the cytoplasmic signaling domains of SRs are also well structured, those of MRs are intrinsically disordered. Despite the important role of receptor signaling in health and disease, an extensive comparative structural analysis of receptor signaling domains has not yet been carried out. In this study, using a variety of prediction algorithms, we show that protein disorder is a characteristic and distinctive feature of receptors with recognition and signaling functions distributed between separate protein chains. We also reveal that disorder distribution patterns are rather similar within SR subclasses, suggesting potential functional explanations. Why did nature select protein disorder to provide intracellular signaling for MRs? Is there any correlation between disorder profiles of signaling domains and receptor function? These and other questions are addressed in this article.
intrinsically disordered proteins; immune signaling; protein disorder; single-chain receptors; multichain immune recognition receptors; MIRR; T cell receptor; B cell receptor; RTK; receptor tyrosine kinases
Conformational behavior of five homologous proteins, parvalbumins (PAs) from northern pike (α and β isoforms), Baltic cod, and rat (α and β isoforms), was studied by scanning calorimetry, circular dichroism, and bis-ANS fluorescence. The mechanism of the temperature-induced denaturation of these proteins depends dramatically both on the peculiarities of their amino acid sequences and on their interaction with metal ions. For example, the pike α-PA melting can be described by two successive two-state transitions with mid-temperatures of 90 and 120°C, suggesting the presence of two thermodynamic domains. The intermediate state populated at the end of the first transition was shown to bind Ca2+ ions and was characterized by largely preserved secondary structure and increased solvent exposure of hydrophobic groups. Mg2+- and Na+-loaded forms of pike α-PA demonstrated a single two-state transition. Therefore, the mechanism of the PA thermal denaturation is controlled by metal binding. It ranged from the absence of a detectable first-order transition (apo-form of pike PA), to a two-state transition (e.g., Mg2+- and Na+-loaded forms of pike α-PA), to more complex mechanisms (Ca2+-loaded PAs) involving at least one partially folded intermediate. Analysis of isolated cavities in the protein structures revealed that the interface between the CD and EF subdomains of Ca2+-loaded pike α-PA is much more loosely packed than in PAs manifesting a single heat-sorption peak. The impairment of interactions between the CD and EF subdomains may cause a loss of structural cooperativity and the appearance of two separate thermodynamic domains. One more peculiar feature of pike α-PA is that, depending on its interactions with metal ions, it can be an intrinsically disordered protein (apo-form) or an ordered protein of mesophilic (Na+-bound state), thermophilic (Mg2+-form), or even hyperthermophilic origin (Ca2+-form).
thermodynamics; cooperativity; thermodynamic domain; structural domain; EF-hand; protein unfolding; protein denaturation; intermediate; metal binding; protein cavities; protein intrinsic disorder; hyperthermophile; allergen
Arg96 is a highly conserved residue known to catalyze spontaneous green fluorescent protein (GFP) chromophore biosynthesis. To understand the role of Arg96 in the conformational stability and structural behavior of EGFP, the properties of a series of EGFP mutants bearing substitutions at this position were studied using circular dichroism, steady-state fluorescence spectroscopy, fluorescence lifetime, kinetic and equilibrium unfolding analysis, and acrylamide-induced fluorescence quenching. During protein production and purification, a high yield was achieved for the EGFP/Arg96Cys variant, whereas EGFP/Arg96Ser and EGFP/Arg96Ala were characterized by substantially lower yields, and no protein was produced when Arg96 was substituted by Gly. We have also shown that only EGFP/Arg96Cys possessed relatively fast chromophore maturation, whereas it took EGFP/Arg96Ser and EGFP/Arg96Ala about a year to develop a noticeable green fluorescence. The intensity of the characteristic green fluorescence measured for EGFP/Arg96Cys and for EGFP/Arg96Ser (or EGFP/Arg96Ala) was 5 and 50 times lower, respectively, than that of the nonmodified EGFP. Intriguingly, EGFP/Arg96Cys was shown to be more stable than EGFP toward GdmCl-induced unfolding in both the kinetic and the quasi-equilibrium experiments. In comparison with EGFP, tryptophan residues of EGFP/Arg96Cys were more accessible to the solvent. Taken together, these data suggest that, besides its previously established crucial catalytic role, Arg96 is important for the overall folding and conformational stability of GFP.
green fluorescent protein; enhanced green fluorescent protein; fluorescent protein; point mutation; chromophore structure; conformational stability; circular dichroism
α-Synuclein aggregation and fibrillation are closely associated with the formation of Lewy bodies in neurons and are implicated in the causative pathogenesis of Parkinson's disease and other synucleinopathies. Currently, there is no approved therapeutic agent directed toward preventing the protein aggregation, which has been recently shown to have a key role in the cytotoxic nature of amyloidogenic proteins. Flavonoids, known as plant pigments, belong to a broad family of polyphenolic compounds. Over 4,000 flavonoids have been identified from various plants and foodstuffs derived from plants and have been demonstrated as potential neuroprotective agents. In this study 48 flavonoids belonging to several classes with structures differing in the position of double bonds and ring substituents were tested for their ability to inhibit the fibrillation of α-synuclein in vitro. A variety of flavonoids inhibited α-synuclein fibrillation, and most of the strong inhibitory flavonoids were also found to disaggregate preformed fibrils.
We show here that chicken gizzard caldesmon (CaD) and its C-terminal domain (residues 636–771, CaD136) are intrinsically disordered proteins. The computational and experimental analyses of the wild-type CaD136, a series of its single tryptophan mutants (W674A, W707A, and W737A), and a double tryptophan mutant (W674A/W707A) suggested that although the interaction of CaD136 with calmodulin (CaM) can be driven by the non-specific electrostatic attraction between these oppositely charged molecules, the specificity of CaD136-CaM binding is likely to be determined by the specific packing of important CaD136 tryptophan residues at the CaD136-CaM interface. It is suggested that this interaction can be described by the “buttons on a charged string” model, where the electrostatic attraction between the intrinsically disordered CaD136 and CaM is solidified in a “snapping buttons” manner by specific packing of the CaD136 “pliable buttons” (short segments of fluctuating local structure condensed around the tryptophan residues) at the CaD136-CaM interface. Our data also show that all three “buttons” are important for binding, since mutation of any of the tryptophans affects CaD136-CaM binding and since CaD136 remains CaM-buttoned even when two of the three tryptophans are mutated to alanines.
Intrinsically disordered protein; Caldesmon; Calmodulin; Protein–protein interaction; Molecular Recognition Feature (MoRF).
Regulation of 5-aminolevulinate synthase (ALAS) is at the origin of balanced heme production in mammals. Mutations in the C-terminal region of human erythroid-specific ALAS (hALAS2) are associated with X-linked protoporphyria (XLPP), a disease characterized by extreme photosensitivity, with elevated blood concentrations of free protoporphyrin IX and zinc protoporphyrin. To investigate the molecular basis for this disease, recombinant hALAS2 and variants of the enzyme harboring the gain-of-function XLPP mutations were constructed, purified, and analyzed kinetically, spectroscopically and thermodynamically. Enhanced activities of the XLPP variants resulted from accelerations in the rate at which the product 5-aminolevulinate (ALA) was released from the enzyme. Circular dichroism spectroscopy revealed that the XLPP mutations altered the microenvironment of the pyridoxal 5’-phosphate cofactor, which underwent further and specific alterations upon succinyl-CoA binding. Transient kinetic analyses of the variant-catalyzed reactions and protein fluorescence quenching upon ALA binding to the XLPP variants demonstrated that the protein conformational transition step associated with product release was predominantly affected. Of relevance, XLPP could also be modeled in cell culture. We propose that 1) the XLPP mutations destabilize the succinyl-CoA-induced hALAS2 closed conformation and thus accelerate ALA release, 2) the extended C-terminus of wild-type mammalian ALAS2 provides a regulatory role that allows for allosteric modulation of activity, thereby controlling the rate of erythroid heme biosynthesis, and 3) this control is disrupted in XLPP, resulting in porphyrin accumulation.
Heme biosynthesis; 5-aminolevulinate; 5-aminolevulinate synthase; porphyria; X-linked erythropoietic protoporphyria; porphyrin; tetrapyrrole; pyridoxal 5’-phosphate
Certain metals lead to an increased risk of Parkinson’s disease (PD), and the aggregation of α-synuclein is implicated in PD pathology. Although α-synuclein fibrillation has been extensively studied in dilute solutions in vitro, the intracellular environment is highly crowded. We show here that certain metals cause a significant acceleration of α-synuclein fibrillation in the presence of high concentrations of various macromolecules, mostly by decreasing the fibrillation lag time. The faster fibrillation in crowded environments in the presence of heavy metals suggests a simple molecular basis for the observed elevated risk of PD due to exposure to metals.
Parkinson’s disease; α-synuclein; crowding; fibrillation; aggregation; metals
Understanding the relationships between function, amino acid sequence, and protein structure continues to represent one of the major challenges of modern protein science. As many as 50% of eukaryotic proteins are likely to contain functionally important long disordered regions. Many proteins are wholly disordered but still possess numerous biologically important functions. However, the number of experimentally confirmed disordered proteins with known biological functions is substantially smaller than their actual number in nature. Therefore, there is a crucial need for novel bioinformatics approaches that allow projection of the current knowledge from a few experimentally verified examples to much larger groups of known and potential proteins. The elaboration of a bioinformatics tool for the analysis of the functional diversity of intrinsically disordered proteins, and the application of this data mining tool to >200,000 proteins from the Swiss-Prot database, each annotated with at least one of the 875 functional keywords, were described in the first paper of this series (Xie H., Vucetic S., Iakoucheva L.M., Oldfield C.J., Dunker A.K., Obradovic Z., Uversky V.N. (2006) Functional anthology of intrinsic disorder. I. Biological processes and functions of proteins with long disordered regions. J. Proteome Res.). Using this tool, we have found that out of the 711 Swiss-Prot functional keywords associated with at least 20 proteins, 262 were strongly positively correlated with long intrinsically disordered regions, and 302 were strongly negatively correlated. Illustrative examples of functional disorder or order were found for the vast majority of keywords showing the strongest positive or negative correlation with intrinsic disorder, respectively.
Some 80 Swiss-Prot keywords associated with disorder- and order-driven biological processes and protein functions were described in the first paper (Xie H., Vucetic S., Iakoucheva L.M., Oldfield C.J., Dunker A.K., Obradovic Z., Uversky V.N. (2006) Functional anthology of intrinsic disorder. I. Biological processes and functions of proteins with long disordered regions. J. Proteome Res.). The second paper of the series was devoted to the presentation of 87 Swiss-Prot keywords attributed to the cellular components, domains, technical terms, developmental processes, and coding sequence diversities possessing strong positive and negative correlation with long disordered regions (Vucetic S., Xie H., Iakoucheva L.M., Oldfield C.J., Dunker A.K., Obradovic Z., Uversky V.N. (2006) Functional anthology of intrinsic disorder. II. Cellular components, domains, technical terms, developmental processes and coding sequence diversities correlated with long disordered regions. J. Proteome Res.). Protein structure and functionality can be modulated by various posttranslational modifications and/or as a result of the binding of specific ligands. Numerous human diseases are associated with protein misfolding, misassembly, and misfunctioning. This work concludes the series of papers dedicated to the functional anthology of intrinsic disorder and describes ~80 Swiss-Prot functional keywords that are related to ligands, posttranslational modifications, and diseases possessing strong positive or negative correlation with the predicted long disordered regions in proteins.
Intrinsic disorder; protein structure; protein function; intrinsically disordered proteins; bioinformatics; disorder prediction
Biologically active proteins without stable ordered structure (i.e., intrinsically disordered proteins) are attracting increased attention. Functional repertoires of ordered and disordered proteins are very different, and the ability to differentiate whether a given function is associated with intrinsic disorder or with a well-folded protein is crucial for modern protein science. However, there is a large gap between the number of proteins experimentally confirmed to be disordered and their actual number in nature. As a result, studies of functional properties of confirmed disordered proteins, while helpful in revealing the functional diversity of protein disorder, provide only a limited view. To overcome this problem, a bioinformatics approach for comprehensive study of functional roles of protein disorder was proposed in the first paper of this series (Xie H., Vucetic S., Iakoucheva L.M., Oldfield C.J., Dunker A.K., Obradovic Z., Uversky V.N. (2006) Functional anthology of intrinsic disorder. I. Biological processes and functions of proteins with long disordered regions. J. Proteome Res.). Applying this novel approach to Swiss-Prot sequences and functional keywords, we found over 238 and 302 keywords to be strongly positively or negatively correlated, respectively, with long intrinsically disordered regions. This paper describes ~90 Swiss-Prot keywords attributed to the cellular components, domains, technical terms, developmental processes and coding sequence diversities possessing strong positive and negative correlation with long disordered regions.
Intrinsic disorder; protein structure; protein function; intrinsically disordered proteins; bioinformatics; disorder prediction
Identifying relationships between function, amino acid sequence, and protein structure represents a major challenge. In this study we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the relationship between protein length and protein structure were taken into consideration to ensure the quality of the statistical inferences. Over 200,000 proteins from the Swiss-Prot database were analyzed using this approach. The predictions of intrinsic disorder were carried out using the PONDR VL3E predictor of long disordered regions, which achieves an accuracy above 86%. Overall, out of the 710 Swiss-Prot functional keywords that were each associated with at least 20 proteins, 238 were found to be strongly positively correlated with predicted long intrinsically disordered regions, whereas 302 were strongly negatively correlated with such regions. The remaining 170 keywords were ambiguous, without strong positive or negative correlation with the disorder predictions. These functions cover a large variety of biological activities and imply that disordered regions are characterized by a wide functional repertoire. Our results agree well with literature findings, as we were able to find at least one illustrative example of functional disorder or order shown experimentally for the vast majority of keywords showing the strongest positive or negative correlation with intrinsic disorder. This work opens a series of three papers that enriches the current view of protein structure-function relationships, especially with regard to the functionalities of intrinsically disordered proteins, and provides researchers with a novel tool that could be used to improve the understanding of the relationships between protein structure and function.
The first paper of the series describes our statistical approach, outlines the major findings and provides illustrative examples of biological processes and functions positively and negatively correlated with intrinsic disorder.
Intrinsic disorder; protein structure; protein function; intrinsically disordered proteins; bioinformatics; disorder prediction
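To illustrate the kind of keyword-versus-disorder evaluation summarized in the abstract above, the following is a minimal sketch of a resampling comparison. It is not the authors' procedure: the function name, the input layout, and the choice of a plain random-resampling test are assumptions made for illustration, all data are synthetic, and the corrections for sequence redundancy and protein length that the abstract mentions are deliberately left out.

```python
import random
from typing import Dict, Set, Tuple


def keyword_disorder_enrichment(
    has_long_idr: Dict[str, bool],
    keyword_proteins: Set[str],
    n_resamples: int = 1000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Compare a keyword's protein set with random sets of the same size.

    has_long_idr maps a (hypothetical) protein ID to True when a predictor
    reports a long disordered region. Returns the observed fraction of
    such proteins in the keyword set, the mean fraction over the random sets,
    and the share of random sets whose fraction is >= the observed one.
    """
    rng = random.Random(seed)
    all_ids = sorted(has_long_idr)
    k = len(keyword_proteins)
    if k == 0 or k > len(all_ids):
        raise ValueError("keyword set must be non-empty and within the dataset")

    observed = sum(has_long_idr[p] for p in keyword_proteins) / k

    random_fractions = []
    for _ in range(n_resamples):
        sample = rng.sample(all_ids, k)  # random set without replacement
        random_fractions.append(sum(has_long_idr[p] for p in sample) / k)

    mean_random = sum(random_fractions) / n_resamples
    share_at_or_above = sum(f >= observed for f in random_fractions) / n_resamples
    return observed, mean_random, share_at_or_above


# Synthetic example: 200 proteins, 40 of them flagged as having a long
# disordered region; the made-up keyword covers 20 of the flagged proteins.
dataset = {f"P{i}": i < 40 for i in range(200)}
keyword_set = {f"P{i}" for i in range(20)}
observed, mean_random, share = keyword_disorder_enrichment(dataset, keyword_set)
```

A small share of random sets at or above the observed fraction would point to a positive association between the keyword and predicted disorder; the mirror-image count (random sets at or below the observed fraction) would be used for negative association.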
Misfolding and self-assembly of proteins in nanoaggregates of different sizes and morphologies (nanoensembles, primary nanofilaments, nanorings, filaments, protofibrils, fibrils, etc.) is a common theme unifying a number of human pathologies termed protein misfolding diseases. Recent studies highlight increasing recognition of the public health importance of protein misfolding diseases, including various neurodegenerative disorders and amyloidoses. It is understood now that the first essential elements in the vast majority of neurodegenerative processes are misfolded and aggregated proteins. Altogether, the accumulation of abnormal protein nanoensembles exerts toxicity by disrupting intracellular transport, overwhelming protein degradation pathways, and/or disturbing vital cell functions. In addition, the formation of inclusion bodies is known to represent a major problem in the production of recombinant therapeutic proteins. Formulation of these therapeutic proteins into delivery systems and their in vivo delivery are often complicated by protein association. Thus, protein folding abnormalities and subsequent events underlie a multitude of human pathologies and difficulties with protein therapeutic applications. The field of medicine therefore can be greatly advanced by establishing a fundamental understanding of key factors leading to misfolding and self-assembly responsible for various protein folding pathologies. This article overviews protein misfolding diseases and outlines some novel and advanced nanotechnologies, including nanoimaging techniques, nanotoolboxes and nanocontainers, complemented by appropriate ensemble techniques, all focused on the ultimate goal to establish etiology and to diagnose, prevent, and cure these devastating disorders.
misfolding; protein aggregation; conformational disease; partially folded intermediate; nanomedicine
The mechanism of autophagy relies on complex cell signaling and regulatory processes. Each cell contains many proteins that lack a rigid three-dimensional structure under physiological conditions. These dynamic proteins, called intrinsically disordered proteins (IDPs) and protein regions (IDPRs), are predominantly involved in cell signaling and regulation. Yet, very little is known about their presence among proteins of the core autophagy machinery. In this work, we characterized the autophagy protein Atg3 from yeast and human along with 2 variants to show that Atg3 is an IDPR-containing protein and that the disorder/order predicted for these proteins from their amino acid sequences corresponds to their experimental characteristics. Based on this agreement, we applied the same prediction methods to all known Atg proteins from Saccharomyces cerevisiae. The data presented here provide insight into the structural dynamics of each Atg protein. They also show that intrinsic disorder at various levels has to be taken into consideration for about half of the Atg proteins. This work should become a useful tool that will facilitate and encourage exploration of protein intrinsic disorder in autophagy.
Atg3; autophagy; intrinsically disordered protein; stress; vacuole; yeast
The phase-transition temperatures of an elastin-like polypeptide (ELP) with the (GVGVP)40 sequence and solvent dipolarity/polarizability, hydrogen-bond donor acidity, and hydrogen-bond acceptor basicity in its aqueous solutions were quantified in the absence and presence of different salts (Na2SO4, NaCl, NaClO4, and NaSCN) and various osmolytes (sucrose, sorbitol, trehalose, and trimethylamine N-oxide (TMAO)). All osmolytes decreased the ELP phase-transition temperature, whereas NaCl and Na2SO4 decreased, and NaSCN and NaClO4 increased it. The determined phase-transition temperatures may be described as a linear combination of the solvent’s dipolarity/polarizability and hydrogen-bond donor acidity. The linear relationship established for the phase-transition temperature in the presence of salts differs quantitatively from that in the presence of osmolytes, in agreement with different (direct and indirect) mechanisms of the influence of salts and osmolytes on the ELP phase-transition temperature.
elastin-like polypeptide; phase-transition temperature; solvent properties; solvent dipolarity/polarizability; hydrogen-bond donor acidity; hydrogen-bond acceptor basicity; osmolyte
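The linear-combination description mentioned in the abstract above can be sketched as an ordinary least-squares fit of the phase-transition temperature against two solvent descriptors. This is only an illustration under stated assumptions: the model form T = k0 + k1·(dipolarity/polarizability) + k2·(hydrogen-bond donor acidity) with an intercept, the function names, and all numbers below are made up for the example and are not taken from the study.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], y: List[float]) -> List[float]:
    """Solve a x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_transition_temperature(
    dipolarity: Sequence[float],
    hbd_acidity: Sequence[float],
    t_transition: Sequence[float],
) -> List[float]:
    """Least-squares estimate of [k0, k1, k2] in T = k0 + k1*dipolarity + k2*hbd_acidity."""
    rows = [[1.0, p, a] for p, a in zip(dipolarity, hbd_acidity)]
    # Normal equations: (X^T X) k = X^T T
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, t_transition)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic, made-up descriptor values and temperatures generated from
# T = 40 - 10*dipolarity - 5*hbd_acidity (not experimental data).
dipolarity = [0.0, 1.0, 2.0, 0.5, 1.5]
hbd_acidity = [1.0, 0.0, 1.5, 2.0, 0.5]
temperatures = [35.0, 30.0, 12.5, 25.0, 22.5]
coefficients = fit_transition_temperature(dipolarity, hbd_acidity, temperatures)
```

Fitting the salt series and the osmolyte series separately and comparing the two coefficient sets would be one way to express the quantitative difference between the two relationships that the abstract reports.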
Computational methods are widely used to identify protein intrinsic disorder. The results from predictors are often given as per-residue disorder scores. The scores describe the disorder propensity of the amino acids of a protein and can be further represented as a disorder curve. Many proteins share similar patterns in their disorder curves. These similar patterns are often associated with similar functions and evolutionary origins. Therefore, finding and characterizing specific patterns of disorder curves provides a unique and attractive perspective on studying the function of intrinsically disordered proteins. In this study, we developed a new computational tool named IDalign using dynamic programming. This tool is able to identify similar patterns among disorder curves, as well as to present the distribution of intrinsic disorder in query proteins. The disorder-based information generated by IDalign is significantly different from the information retrieved from classical sequence alignments. This tool can also be used to infer functions of disordered regions and disordered proteins. The web server of IDalign is available at http://labs.cas.usf.edu/bioinfo/service.html.
intrinsic disorder; structural flexibility; disorder pattern; dynamic programming; dynamic time warping
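As an illustration of how dynamic programming can match patterns between two disorder curves, the following is a minimal sketch of textbook dynamic time warping applied to per-residue disorder scores. It is not the IDalign implementation: the function name, the absolute-difference cost, and the example scores are assumptions chosen for the sketch.

```python
from typing import List, Sequence, Tuple


def dtw_align(
    a: Sequence[float], b: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two disorder curves with classic dynamic time warping.

    Returns the accumulated cost and the warping path as
    (index_in_a, index_in_b) pairs, ordered from start to end.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both curves must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed per-residue cost
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # match both residues
                cost[i - 1][j],      # stretch b[j-1] over another residue of a
                cost[i][j - 1],      # stretch a[i-1] over another residue of b
            )
    # Trace the cheapest path back from the end of both curves.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Made-up disorder scores: the second curve repeats the first residue's
# score once, so the two curves have the same shape but different lengths.
curve = [0.1, 0.2, 0.9, 0.8, 0.3]
stretched = [0.1, 0.1, 0.2, 0.9, 0.8, 0.3]
total_cost, path = dtw_align(curve, stretched)
```

Because the warping path may map one residue to several, curves with the same shape but different lengths can align at low cost even when their amino acid sequences differ, which is consistent with the abstract's remark that disorder-based information differs from classical sequence alignments.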
intrinsically disordered proteins; protein-protein interaction; posttranslational modification; neurodegenerative diseases; proteome