1.  3D hydrogel environment rejuvenates aged pericytes for skeletal muscle tissue engineering 
Skeletal muscle tissue engineering is a promising approach for the treatment of muscular disorders. However, the complex organization of muscle, combined with the difficulty in finding an appropriate source of regenerative cells and in providing an adequate blood supply to the engineered tissue, makes this a hard task to face. In the present work, we describe an innovative approach to rejuvenate adult skeletal muscle-derived pericytes (MP) based on the use of a PEG-based hydrogel scaffold. MP were isolated from young (piglet) and adult (boar) pigs to assess whether aging affects tissue regeneration efficiency. In vitro, MP from boars had similar morphology and colony forming capacity to piglet MP, but an impaired ability to form myotubes and capillary-like structures. However, the use of a PEG-based hydrogel to support adult MP significantly improved their myogenic differentiation and angiogenic potentials in vitro and in vivo. Thus, PEG-based hydrogel scaffolds may provide a progenitor cell “niche” that promotes skeletal muscle regeneration and blood vessel growth, and together with pericytes may be developed for use in regenerative applications.
doi:10.3389/fphys.2014.00203
PMCID: PMC4039010  PMID: 24910618
stem cells; pericyte; skeletal muscle; myogenic differentiation; tissue engineering; PEG-fibrinogen; biomaterials
2.  Combining affinity proteomics and network context to identify new phosphatase substrates and adapters in growth pathways 
Frontiers in Genetics  2014;5:115.
Protein phosphorylation homoeostasis is tightly controlled and pathological conditions are caused by subtle alterations of the cell phosphorylation profile. Altered levels of kinase activities have already been associated with specific diseases. Less is known about the impact of phosphatases, the enzymes that down-regulate phosphorylation by removing the phosphate groups. This is partly due to our poor understanding of the phosphatase-substrate network. Much of phosphatase substrate specificity is not based on intrinsic enzyme specificity with the catalytic pocket recognizing the sequence/structure context of the phosphorylated residue. In addition, many phosphatase catalytic subunits do not form a stable complex with their substrates. This makes the inference and validation of phosphatase substrates a non-trivial task. Here, we present a novel approach that builds on the observation that much of phosphatase substrate selection is based on the network of physical interactions linking the phosphatase to the substrate. We first used affinity proteomics coupled to quantitative mass spectrometry to saturate the interactome of eight phosphatases whose downregulation was shown to affect the activation of the RAS-PI3K pathway. By integrating information from functional siRNA with protein interaction information, we develop a strategy that aims at inferring phosphatase physiological substrates. Graph analysis is used to identify protein scaffolds that may link the catalytic subunits to their substrates. By this approach we rediscover several previously described phosphatase substrate interactions and characterize two new protein scaffolds that promote the dephosphorylation of PTPN11 and ERK by DUSP18 and DUSP26, respectively.
doi:10.3389/fgene.2014.00115
PMCID: PMC4019850  PMID: 24847354
phosphatase; signal transduction; systems biology; cell biology; protein protein interaction
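The abstract above does not say which graph analysis was used to find scaffolds. As a minimal sketch of one plausible reading, the Python below treats a scaffold candidate as any protein that physically interacts with both the phosphatase and a candidate substrate (a common neighbor in an undirected interaction graph). The function name and the protein labels are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Iterable, List, Set, Tuple


def candidate_scaffolds(
    interactions: Iterable[Tuple[str, str]],
    phosphatase: str,
    substrate: str,
) -> List[str]:
    """Return proteins that interact with both the phosphatase and the substrate.

    Assumption: a "scaffold" is approximated as a common neighbor in an
    undirected physical-interaction graph. The source does not specify this.
    """
    neighbors: Dict[str, Set[str]] = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    common = neighbors.get(phosphatase, set()) & neighbors.get(substrate, set())
    return sorted(common - {phosphatase, substrate})


# Hypothetical toy interactome (placeholder names, not real proteins).
toy_interactions = [
    ("PHOS_X", "SCAF_1"),
    ("SCAF_1", "SUB_Y"),
    ("PHOS_X", "SCAF_2"),
    ("SCAF_2", "SUB_Y"),
    ("PHOS_X", "OTHER"),
]

print(candidate_scaffolds(toy_interactions, "PHOS_X", "SUB_Y"))
```

In this toy graph, "OTHER" binds only the phosphatase and is therefore not reported; a real analysis would presumably also weigh interaction confidence and the functional siRNA data.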
3.  The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases 
Nucleic Acids Research  2013;42(D1):D358-D363.
IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org).
doi:10.1093/nar/gkt1115
PMCID: PMC3965093  PMID: 24234451
4.  Protein Interaction Data Curation - The International Molecular Exchange Consortium (IMEx) 
Nature methods  2012;9(4):345-350.
The IMEx consortium is an international collaboration between major public interaction data providers to share curation effort and make a non-redundant set of protein interactions available in a single search interface on a common website (www.imexconsortium.org). Common curation rules have been developed and a central registry is used to manage the selection of articles to enter into the dataset. The advantages of such a service to the user, quality control measures adopted and data distribution practices are discussed.
doi:10.1038/nmeth.1931
PMCID: PMC3703241  PMID: 22453911
5.  Reactive Oxygen Species and Epidermal Growth Factor Are Antagonistic Cues Controlling SHP-2 Dimerization 
Molecular and Cellular Biology  2012;32(10):1998-2009.
The SHP-2 tyrosine phosphatase plays key regulatory roles in the modulation of the cell response to growth factors and cytokines. Over the past decade, the integration of genetic, biochemical, and structural data has helped in interpreting the pathological consequences of altered SHP-2 function. Using complementary approaches, we provide evidence here that endogenous SHP-2 can dimerize through the formation of disulfide bonds that may also involve the catalytic cysteine. We show that the fraction of dimeric SHP-2 is modulated by growth factor stimulation and by the cell redox state. Comparison of the phosphatase activities of the monomeric self-inhibited and dimeric forms indicated that the latter is 3-fold less active, thus pointing to the dimerization process as an additional mechanism for controlling SHP-2 activity. Remarkably, dimers formed by different SHP-2 mutants displaying diverse biochemical properties were found to respond differently to epidermal growth factor (EGF) stimulation. Although this differential behavior cannot be rationalized mechanistically yet, these findings suggest a possible regulatory role of dimerization in SHP-2 function.
doi:10.1128/MCB.06674-11
PMCID: PMC3347403  PMID: 22411627
6.  Mapping the human phosphatome on growth pathways 
Phosphatases control cell growth by a variety of mechanisms. A novel strategy is presented that combines multiparametric analysis of cell perturbations with logic modeling to achieve a detailed mapping of human phosphatase function on growth pathways.
siRNA-mediated downregulation of 298 phosphatase and phosphatase-related genes coupled to automated microscopy was used to characterize their impact on key growth pathways. In parallel, a literature-derived signed directed network was built and optimized by training with experimental data. The resulting logic-based growth model was used to infer the cell state upon perturbation of each signaling node and compare it with the profiles obtained upon phosphatase perturbation. Mapping of 67% of the protein phosphatases onto the growth model shows that phosphatases are key modulators of growth pathways and affect cell-cycle progression. This novel approach is general and enables efficient mapping of proteins onto complex pathways.
Large-scale siRNA screenings allow linking the function of poorly characterized genes to phenotypic readouts. According to this strategy, genes are associated with a function of interest if the alteration of their expression perturbs the phenotypic readouts. However, given the intricacy of the cell regulatory network, the mapping procedure is low resolution and the resulting models provide little mechanistic insight. We have developed a new strategy that combines multiparametric analysis of cell perturbation with logic modeling to achieve a more detailed functional mapping of human genes onto complex pathways. A literature-derived optimized model is used to infer the cell activation state following upregulation or downregulation of the model entities. By matching this signature with the experimental profile obtained in the high-throughput siRNA screening it is possible to infer the target of each protein, thus defining its ‘entry point’ in the network. By this novel approach, 41 phosphatases that affect key growth pathways were identified and mapped onto a human epithelial cell-specific growth model, thus providing insights into the mechanisms underlying their function.
doi:10.1038/msb.2012.36
PMCID: PMC3435503  PMID: 22893001
cancer; computational biology; functional genomics; imaging; modeling
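The matching step described above (comparing the model-predicted signature of each perturbed node with the profile measured after a phosphatase knockdown) can be illustrated with a small sketch. The similarity measure used here, the fraction of readouts whose discretized state agrees, is an assumption, as are the node names and the three-readout encoding (+1 up, 0 unchanged, -1 down); the study's actual scoring is not given in the abstract.

```python
from typing import Dict, Sequence, Tuple


def match_entry_point(
    profile: Sequence[int],
    signatures: Dict[str, Sequence[int]],
) -> Tuple[str, float]:
    """Return the model node whose predicted signature best agrees with the profile.

    Agreement is the fraction of readouts with identical discretized state.
    Ties are resolved in favor of the node listed first.
    """
    def agreement(a: Sequence[int], b: Sequence[int]) -> float:
        if len(a) != len(b):
            raise ValueError("profile and signature must cover the same readouts")
        return sum(1 for x, y in zip(a, b) if x == y) / len(a)

    scores = {node: agreement(profile, sig) for node, sig in signatures.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical model-predicted signatures over three readouts.
model_signatures = {
    "node_A": (1, -1, 0),
    "node_B": (-1, -1, 1),
    "node_C": (0, 1, 1),
}

# Hypothetical experimental profile after knocking down one phosphatase.
knockdown_profile = (-1, -1, 1)

print(match_entry_point(knockdown_profile, model_signatures))
```

Here the knockdown profile reproduces the signature predicted for perturbing node_B, so node_B would be proposed as the phosphatase's entry point in this toy model.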
7.  The human phosphatase interactome: An intricate family portrait 
Febs Letters  2012;586(17):2732-2739.
The concerted activities of kinases and phosphatases modulate the phosphorylation levels of proteins, lipids and carbohydrates in eukaryotic cells. Despite considerable effort, we are still missing a holistic picture representing, at a proteome level, the functional relationships between kinases, phosphatases and their substrates. Here we focus on phosphatases and we review and integrate the available information that helps to place the members of the protein phosphatase superfamilies into the human protein interaction network. In addition we show how protein interaction domains and motifs, either covalently linked to the phosphatase domain or in regulatory/adaptor subunits, play a prominent role in substrate selection.
doi:10.1016/j.febslet.2012.05.008
PMCID: PMC3437441  PMID: 22626554
PTP, protein tyrosine phosphatase; LP, lipid phosphatase; PPP, phosphoprotein phosphatases; PPM, metallo-dependent protein phosphatase; HAD, haloacid dehalogenase; RS, regulatory subunit; Human phosphatome; Phosphatase family classification; Substrate recognition specificity
8.  Oxidative Stress, DNA Damage, and c-Abl Signaling: At the Crossroad in Neurodegenerative Diseases? 
The c-Abl tyrosine kinase is implicated in diverse cellular activities including growth factor signaling, cell adhesion, oxidative stress, and DNA damage response. Studies in mouse models have shown that the kinases of the c-Abl family play a role in the development of the central nervous system. Recent reports show that aberrant c-Abl activation causes neuroinflammation and neuronal loss in the forebrain of transgenic adult mice. In line with these observations, increased c-Abl activation is reported in human neurodegenerative pathologies, such as Parkinson's and Alzheimer's diseases. This suggests that aberrant nonspecific posttranslational modifications induced by c-Abl may help fuel the recurrent phenotypes/features linked to neurodegenerative disorders, such as impaired mitochondrial function, oxidative stress, and accumulation of protein aggregates. Herein, we review some reports on c-Abl function in neuronal cells and we propose that modulation of different aspects of c-Abl signaling may help mediate the molecular events at the interface between stress signaling, metabolic regulation, and DNA damage. Finally, we propose that this may have an impact on the development of new therapeutic strategies.
doi:10.1155/2012/683097
PMCID: PMC3385657  PMID: 22761618
9.  Counteracting Effects Operating on Src Homology 2 Domain-containing Protein-tyrosine Phosphatase 2 (SHP2) Function Drive Selection of the Recurrent Y62D and Y63C Substitutions in Noonan Syndrome 
The Journal of Biological Chemistry  2012;287(32):27066-27077.
Background: Disease-associated PTPN11 mutations enhance the function of SHP2 by destabilizing its inactive state or increasing binding to phosphotyrosyl-containing partners.
Results: Amino acid substitutions at codons 62 and 63 have a profound and complex effect on SHP2 structure and function.
Conclusion: A selection-by-function mechanism acting on mutations at those codons implies balancing of counteracting effects operating on the activity of SHP2.
Significance: An unanticipated functional behavior underlies disease-causing weak hypermorphs.
Activating mutations in PTPN11 cause Noonan syndrome, the most common nonchromosomal disorder affecting development and growth. PTPN11 encodes SHP2, an Src homology 2 (SH2) domain-containing protein-tyrosine phosphatase that positively modulates RAS function. Here, we characterized functionally all possible amino acid substitutions arising from single-base changes affecting codons 62 and 63 to explore the molecular mechanisms lying behind the largely invariant occurrence of the Y62D and Y63C substitutions recurring in Noonan syndrome. We provide structural and biochemical data indicating that the autoinhibitory interaction between the N-SH2 and protein-tyrosine phosphatase (PTP) domains is perturbed in both mutants as a result of an extensive structural rearrangement of the N-SH2 domain. Most mutations affecting Tyr63 exerted an unpredicted disrupting effect on the structure of the N-SH2 phosphopeptide-binding cleft mediating the interaction of SHP2 with signaling partners. Among all the amino acid changes affecting that codon, the disease-causing mutation was the only substitution that perturbed the stability of the inactive conformation of SHP2 without severely impairing proper phosphopeptide binding of N-SH2. On the other hand, the disruptive effect of the Y62D change on the autoinhibited conformation of the protein was balanced, in part, by less efficient binding properties of the mutant. Overall, our data demonstrate that the selection-by-function mechanism acting as driving force for PTPN11 mutations affecting codons 62 and 63 implies balancing of counteracting effects operating on the allosteric control of the function of SHP2.
doi:10.1074/jbc.M112.350231
PMCID: PMC3411048  PMID: 22711529
Genetic Diseases; Protein Structure; SH2 Domains; Signal Transduction; Protein-tyrosine Phosphatase (Tyrosine Phosphatase); Noonan Syndrome; SHP2
11.  Structural and functional protein network analyses predict novel signaling functions for rhodopsin 
Proteomic analyses, literature mining, and structural data were combined to generate an extensive signaling network linked to the visual G protein-coupled receptor rhodopsin. Network analysis suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking.
Using a shotgun proteomic approach, we identified the protein inventory of the light sensing outer segment of the mammalian photoreceptor. These data, combined with literature mining, structural modeling, and computational analysis, offer a comprehensive view of signal transduction downstream of the visual G protein-coupled receptor rhodopsin. The network suggests novel signaling branches downstream of rhodopsin to cytoskeleton dynamics and vesicular trafficking. The network serves as a basis for elucidating physiological principles of photoreceptor function and suggests potential disease-associated proteins.
Photoreceptor cells are neurons capable of converting light into electrical signals. The rod outer segment (ROS) region of the photoreceptor cells is a cellular structure made of a stack of around 800 closed membrane disks loaded with rhodopsin (Liang et al, 2003; Nickell et al, 2007). In disc membranes, rhodopsin arranges itself into paracrystalline dimer arrays, enabling optimal association with the heterotrimeric G protein transducin as well as additional regulatory components (Ciarkowski et al, 2005). Disruption of these highly regulated structures and processes by germline mutations is the cause of severe blinding diseases such as retinitis pigmentosa, macular degeneration, or congenital stationary night blindness (Berger et al, 2010).
Traditionally, signal transduction networks have been studied by combining biochemical and genetic experiments addressing the relations among a small number of components. More recently, large throughput experiments using different techniques like two hybrid or co-immunoprecipitation coupled to mass spectrometry have added a new level of complexity (Ito et al, 2001; Gavin et al, 2002, 2006; Ho et al, 2002; Rual et al, 2005; Stelzl et al, 2005). However, in these studies, space, time, and the fact that many interactions detected for a particular protein are not compatible, are not taken into consideration. Structural information can help discriminate between direct and indirect interactions and more importantly it can determine if two or more predicted partners of any given protein or complex can simultaneously bind a target or rather compete for the same interaction surface (Kim et al, 2006).
In this work, we build a functional and dynamic interaction network centered on rhodopsin on a systems level, using six steps: In step 1, we experimentally identified the proteomic inventory of the porcine ROS, and we compared our data set with a recent proteomic study from bovine ROS (Kwok et al, 2008). The union of the two data sets was defined as the ‘initial experimental ROS proteome’. After removal of contaminants and applying filtering methods, a ‘core ROS proteome’, consisting of 355 proteins, was defined.
In step 2, proteins of the core ROS proteome were assigned to six functional modules: (1) vision, signaling, transporters, and channels; (2) outer segment structure and morphogenesis; (3) housekeeping; (4) cytoskeleton and polarity; (5) vesicle formation and trafficking; and (6) metabolism.
In step 3, a protein-protein interaction network was constructed based on literature mining. Since for most of the interactions the experimental evidence was co-immunoprecipitation or pull-down experiments, and in addition many of the edges in the network are supported by a single piece of experimental evidence, often derived from high-throughput approaches, we refer to this network as the ‘fuzzy ROS interactome’. Structural information was used to predict binary interactions, based on the finding that similar domain pairs are likely to interact in a similar way (‘nature repeats itself’) (Aloy and Russell, 2002). To increase the confidence in the resulting network, edges supported by a single piece of evidence not coming from yeast two-hybrid experiments were removed, the exception being interactions where the evidence was the existence of a three-dimensional structure of the complex itself, or of a highly homologous complex. This curated static network (‘high-confidence ROS interactome’) comprises 660 edges linking the majority of the nodes. By considering only edges supported by at least one piece of evidence of direct binary interaction, we end up with a ‘high-confidence binary ROS interactome’. We next extended the published core pathway (Dell'Orco et al, 2009) using evidence from our high-confidence network. We find several new direct binary links to different cellular functional processes (Figure 4): the active rhodopsin interacts with Rac1 and the GTP form of Rho. There is also a connection between active rhodopsin and Arf4, as well as PDEδ with Rab13 and the GTP-bound form of Arl3 that links the vision cycle to vesicle trafficking and structure. We see a connection between PDEδ and prenyl-modified proteins, such as several small GTPases, as well as with rhodopsin kinase. Further, our network reveals several direct binary connections between Ca2+-regulated proteins and cytoskeleton proteins; these are CaMK2A with actinin, calmodulin with GAP43 and S100B, and PKC with 14-3-3 family members.
In step 4, part of the network was experimentally validated using three different approaches to identify physical protein associations that would occur under physiological conditions: (i) co-segregation/co-sedimentation experiments, (ii) immunoprecipitations combined with mass spectrometry and/or subsequent immunoblotting, and (iii) utilizing the glycosylated N-terminus of rhodopsin to isolate its associated protein partners by Concanavalin A affinity purification. In total, 60 co-purification and co-elution experiments supported interactions that were already in our literature network, and new evidence from 175 co-IP experiments in this work was added. Next, we aimed to provide additional independent experimental confirmation for two of the novel networks and functional links proposed based on the network analysis: (i) the proposed complex between Rac1, RhoA, CRMP-2, tubulin, and ROCK II in ROS was investigated by culturing retinal explants in the presence of a ROCK II-specific inhibitor (Figure 6). While the morphology of the retinas treated with ROCK II inhibitor appeared normal, immunohistochemistry analyses revealed several alterations on the protein level. (ii) We supported the hypothesis that PDEδ could function as a GDI for Rac1 in ROS, by demonstrating that PDEδ and Rac1 co-localize in ROS and that PDEδ could dissociate Rac1 from ROS membranes in vitro.
In step 5, we use structural information to distinguish between mutually compatible (‘AND’) or excluded (‘XOR’) interactions. This enables breaking a network of nodes and edges into functional machines or sub-networks/modules. In the vision branch, both ‘AND’ and ‘XOR’ gates synergize. This may allow dynamic tuning of light and dark states. However, all connections from the vision module to other modules are ‘XOR’ connections suggesting that competition, in connection with local protein concentration changes, could be important for transmitting signals from the core vision module.
In the last step, we map and functionally characterize the known mutations that produce blindness.
In summary, this represents the first comprehensive, dynamic, and integrative rhodopsin signaling network, which can be the basis for integrating and mapping newly discovered disease mutants, to guide protein or signaling branch-specific therapies.
Orchestration of signaling, photoreceptor structural integrity, and maintenance needed for mammalian vision remain enigmatic. By integrating three proteomic data sets, literature mining, computational analyses, and structural information, we have generated a multiscale signal transduction network linked to the visual G protein-coupled receptor (GPCR) rhodopsin, the major protein component of rod outer segments. This network was complemented by domain decomposition of protein–protein interactions and then qualified for mutually exclusive or mutually compatible interactions and ternary complex formation using structural data. The resulting information not only offers a comprehensive view of signal transduction induced by this GPCR but also suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking, predicting an important level of regulation through small GTPases. Further, it demonstrates a specific disease susceptibility of the core visual pathway due to the uniqueness of its components present mainly in the eye. As a comprehensive multiscale network, it can serve as a basis to elucidate the physiological principles of photoreceptor function, identify potential disease-associated genes and proteins, and guide the development of therapies that target specific branches of the signaling pathway.
doi:10.1038/msb.2011.83
PMCID: PMC3261702  PMID: 22108793
protein interaction network; rhodopsin signaling; structural modeling
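The edge-filtering rule from step 3 of the entry above can be written out as a short sketch: an edge is kept if it has more than one piece of evidence, or if its single piece of evidence is a yeast two-hybrid experiment or a three-dimensional structure of the complex (or of a highly homologous complex). The evidence labels, the protein names, and the choice to count structures as direct binary evidence in the second function are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

# Hypothetical evidence labels. Which experiment types count as direct
# binary evidence is an assumption made for this sketch.
DIRECT_BINARY = {"yeast_two_hybrid", "structure_3d"}


def high_confidence_edges(evidence: Dict[Edge, List[str]]) -> Dict[Edge, List[str]]:
    """Drop edges backed by a single piece of evidence, unless that evidence
    is a yeast two-hybrid experiment or a 3D structure of the complex."""
    kept = {}
    for edge, items in evidence.items():
        if len(items) >= 2 or (len(items) == 1 and items[0] in DIRECT_BINARY):
            kept[edge] = items
    return kept


def binary_edges(evidence: Dict[Edge, List[str]]) -> Dict[Edge, List[str]]:
    """Keep only high-confidence edges with at least one direct binary evidence."""
    return {
        edge: items
        for edge, items in high_confidence_edges(evidence).items()
        if any(item in DIRECT_BINARY for item in items)
    }


# Hypothetical "fuzzy" interactome with placeholder protein names.
fuzzy = {
    ("A", "B"): ["co_ip"],
    ("A", "C"): ["co_ip", "pull_down"],
    ("B", "C"): ["yeast_two_hybrid"],
    ("C", "D"): ["structure_3d"],
    ("D", "E"): ["pull_down"],
}

print(sorted(high_confidence_edges(fuzzy)))
print(sorted(binary_edges(fuzzy)))
```

In the toy data, the A-C edge survives the first filter because it has two independent pieces of evidence, but it is excluded from the binary network because neither of them demonstrates a direct interaction.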
12.  MINT, the molecular interaction database: 2012 update 
Nucleic Acids Research  2011;40(D1):D857-D861.
The Molecular INTeraction Database (MINT, http://mint.bio.uniroma2.it/mint/) is a public repository for protein–protein interactions (PPI) reported in peer-reviewed journals. The database has grown steadily over the years and as of September 2011 contains approximately 235 000 binary interactions captured from over 4750 publications. The web interface allows users to search, visualize and download interaction data. MINT is one of the members of the International Molecular Exchange consortium (IMEx) and adopts the Molecular Interaction Ontology of the Proteomics Standard Initiative (PSI-MI) standards for curation and data exchange. MINT data are freely accessible and downloadable at http://mint.bio.uniroma2.it/mint/download.do. We report here the growth of the database, the major changes in curation policy and a new algorithm to assign a confidence to each interaction.
doi:10.1093/nar/gkr930
PMCID: PMC3244991  PMID: 22096227
13.  The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text 
BMC Bioinformatics  2011;12(Suppl 8):S3.
Background
Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. Therefore the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them.
Results
A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews correlation coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods; some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, the best MCC score 0.55. In case of competitive systems with an acceptable recall (above 35%) the macro-averaged precision ranged between 50% and 80%, with a maximum F-Score of 55%.
Conclusions
The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
doi:10.1186/1471-2105-12-S8-S3
PMCID: PMC3269938  PMID: 22151929
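The results above report both accuracy and the Matthews correlation coefficient (MCC) for the article classification task. The sketch below computes both from a confusion matrix to show why, on an unbalanced collection, a high accuracy can coexist with a moderate MCC. The counts are invented for illustration and are not the BioCreative III results.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all articles that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical unbalanced test set: 100 relevant and 900 non-relevant abstracts.
counts = dict(tp=50, fn=50, fp=50, tn=850)

print(f"accuracy = {accuracy(**counts):.2f}")
print(f"MCC      = {matthews_corrcoef(**counts):.2f}")
```

With these invented counts only half of the relevant abstracts are found, yet accuracy is dominated by the large non-relevant class; MCC uses all four cells of the confusion matrix and is therefore less flattering.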
14.  BioCreative III interactive task: an overview 
BMC Bioinformatics  2011;12(Suppl 8):S4.
Background
The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested.
Results
A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after systems were used to curate articles. Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task, leading to an extensive list of gene identifiers to review, which, in some cases, did not contain the relevant genes. The document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of the correct gene and its identifier, such as contextual information to assist in disambiguation.
Discussion
The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT Task provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge.
doi:10.1186/1471-2105-12-S8-S4
PMCID: PMC3269939  PMID: 22151968
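The precision and recall measures mentioned in the entry above can be illustrated with a minimal Python sketch for a gene normalization result. The function name and the gene identifiers are hypothetical and chosen only for illustration; the abstract does not describe how the participating systems were actually scored.

```python
def precision_recall(predicted_ids, gold_ids):
    """Return (precision, recall) for one article.

    predicted_ids: gene identifiers proposed by a system (assumed input).
    gold_ids: gene identifiers assigned by curators (assumed gold standard).
    """
    predicted = set(predicted_ids)
    gold = set(gold_ids)
    true_positives = len(predicted & gold)
    # Precision: fraction of proposed identifiers that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of curated identifiers that the system found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical identifiers, not taken from any real data set.
system_output = ["GENE:1", "GENE:2", "GENE:3", "GENE:4"]
curated = ["GENE:2", "GENE:4", "GENE:5"]
precision, recall = precision_recall(system_output, curated)
```

A long candidate list that omits the relevant gene, as described in the abstract, lowers both values: the extra identifiers reduce precision and the missing gene reduces recall.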
15.  Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases 
BMC Bioinformatics  2011;12(Suppl 8):S8.
Background
The vast amount of data published in the primary biomedical literature represents a challenge for the automated extraction and codification of individual data elements. Biological databases that rely solely on manual extraction by expert curators are unable to comprehensively annotate the information dispersed across the entire biomedical literature. The development of efficient tools based on natural language processing (NLP) systems is essential for the selection of relevant publications, identification of data attributes and partially automated annotation. One of the tasks of the BioCreative 2010 Challenge III was devoted to the evaluation of NLP systems developed to identify articles for curation and extraction of protein-protein interaction (PPI) data.
Results
The BioCreative 2010 competition addressed three tasks: gene normalization, article classification and interaction method identification. The BioGRID and MINT protein interaction databases both participated in the generation of the test publication set for gene normalization, annotated the development and test sets for article classification, and curated the test set for interaction method classification. These test datasets served as a gold standard for the evaluation of data extraction algorithms.
Conclusion
The development of efficient tools for extraction of PPI data is a necessary step to achieve full curation of the biomedical literature. NLP systems can in the first instance facilitate expert curation by refining the list of candidate publications that contain PPI data; more ambitiously, NLP approaches may be able to directly extract relevant information from full-text articles for rapid inspection by expert curators. Close collaboration between biological databases and NLP systems developers will continue to facilitate the long-term objectives of both disciplines.
doi:10.1186/1471-2105-12-S8-S8
PMCID: PMC3269943  PMID: 22151178
16.  Identification of New Substrates of the Protein-tyrosine Phosphatase PTP1B by Bayesian Integration of Proteome Evidence* 
The Journal of Biological Chemistry  2010;286(6):4173-4185.
There is growing evidence that tyrosine phosphatases display an intrinsic enzymatic preference for the sequence context flanking the target phosphotyrosines. On the other hand, substrate selection in vivo is decisively guided by the enzyme-substrate connectivity in the protein interaction network. We describe here a system-wide strategy to infer physiological substrates of protein-tyrosine phosphatases. Here we integrate, by a Bayesian model, proteome-wide evidence about in vitro substrate preference, as determined by a novel high-density peptide chip technology, and “closeness” in the protein interaction network. This allows us to rank candidate substrates of the human PTP1B phosphatase. Ultimately a variety of in vitro and in vivo approaches were used to verify the prediction that the tyrosine phosphorylation levels of five high-ranking substrates, PLC-γ1, Gab1, SHP2, EGFR, and SHP1, are indeed specifically modulated by PTP1B. In addition, we demonstrate that the PTP1B-mediated dephosphorylation of Gab1 negatively affects its EGF-induced association with the phosphatase SHP2. The dissociation of this signaling complex is accompanied by a decrease of ERK MAP kinase phosphorylation and activation.
doi:10.1074/jbc.M110.157420
PMCID: PMC3039405  PMID: 21123182
ERK; Phospholipase C; Ras; Receptor-tyrosine Kinase; Tyrosine-protein Phosphatase (Tyrosine Phosphatase); Gab1; PTP1B; SHP2
17.  Finding and sharing: new approaches to registries of databases and services for the biomedical sciences 
The recent explosion of biological data and the concomitant proliferation of distributed databases make it challenging for biologists and bioinformaticians to discover the best data resources for their needs, and the most efficient way to access and use them. Despite a rapid acceleration in uptake of syntactic and semantic standards for interoperability, it is still difficult for users to find which databases support the standards and interfaces that they need. To solve these problems, several groups are developing registries of databases that capture key metadata describing the biological scope, utility, accessibility, ease-of-use and existence of web services allowing interoperability between resources. Here, we describe some of these initiatives including a novel formalism, the Database Description Framework, for describing database operations and functionality and encouraging good database practise. We expect such approaches will result in improved discovery, uptake and utilization of data resources.
Database URL: http://www.casimir.org.uk/casimir_ddf
doi:10.1093/database/baq014
PMCID: PMC2911849  PMID: 20627863
18.  MINT, the molecular interaction database: 2009 update 
Nucleic Acids Research  2009;38(Database issue):D532-D539.
MINT (http://mint.bio.uniroma2.it/mint) is a public repository for molecular interactions reported in peer-reviewed journals. Since its last report, MINT has grown considerably in size and evolved in scope to meet the requirements of its users. The main changes include a more precise definition of the curation policy and the development of an enhanced and user-friendly interface to facilitate the analysis of the ever-growing interaction dataset. MINT has adopted the PSI-MI standards for the annotation and for the representation of molecular interactions and is a member of the IMEx consortium.
doi:10.1093/nar/gkp983
PMCID: PMC2808973  PMID: 19897547
19.  Bayesian Modeling of the Yeast SH3 Domain Interactome Predicts Spatiotemporal Dynamics of Endocytosis Proteins 
PLoS Biology  2009;7(10):e1000218.
A genome-scale specificity and interaction map for yeast SH3 domain-containing proteins reveals how family members show selective binding to target proteins and predicts the dynamic localization of new candidate endocytosis proteins.
SH3 domains are peptide recognition modules that mediate the assembly of diverse biological complexes. We scanned billions of phage-displayed peptides to map the binding specificities of the SH3 domain family in the budding yeast, Saccharomyces cerevisiae. Although most of the SH3 domains fall into the canonical classes I and II, each domain utilizes distinct features of its cognate ligands to achieve binding selectivity. Furthermore, we uncovered several SH3 domains with specificity profiles that clearly deviate from the two canonical classes. In conjunction with phage display, we used yeast two-hybrid and peptide array screening to independently identify SH3 domain binding partners. The results from the three complementary techniques were integrated using a Bayesian algorithm to generate a high-confidence yeast SH3 domain interaction map. The interaction map was enriched for proteins involved in endocytosis, revealing a set of SH3-mediated interactions that underlie formation of protein complexes essential to this biological pathway. We used the SH3 domain interaction network to predict the dynamic localization of several previously uncharacterized endocytic proteins, and our analysis suggests a novel role for the SH3 domains of Lsb3p and Lsb4p as hubs that recruit and assemble several endocytic complexes.
Author Summary
Significant diversity exists in protein structure and function, yet certain structural domains are used repeatedly across species to execute similar functions. The SH3 domain is one such common structural domain. It is found in signaling proteins and mediates protein–protein interactions by binding to short peptide sequences generally composed of proline. To investigate both the generality and selectivity of peptide binding by SH3 domains, we examined peptide specificity for almost all SH3 domains encoded within the proteome of the budding yeast, Saccharomyces cerevisiae, using a range of experimental methods. We found that although most of the intrinsic binding specificity for SH3 domains can be summarized by the two previously described canonical binding modes, each individual SH3 domain that we studied utilizes unique features of its cognate ligand to achieve binding selectivity. Moreover, some domains exhibit binding specificities that are distinct from the two canonical classes. We integrated peptide-SH3 domain binding data from three complementary screening techniques using a Bayesian statistical model to generate a protein–protein interaction network for the budding yeast SH3 domain family. This network was highly enriched in endocytosis proteins and their interactions. By examining these interactions in detail, we show that our SH3 domain network can be used to predict the temporal localization of several previously uncharacterized proteins to dynamic complexes that orchestrate the process of endocytosis.
doi:10.1371/journal.pbio.1000218
PMCID: PMC2756588  PMID: 19841731
20.  Diverse driving forces underlie the invariant occurrence of the T42A, E139D, I282V and T468M SHP2 amino acid substitutions causing Noonan and LEOPARD syndromes 
Human Molecular Genetics  2008;17(13):2018-2029.
Missense PTPN11 mutations cause Noonan and LEOPARD syndromes (NS and LS), two developmental disorders with pleiomorphic phenotypes. PTPN11 encodes SHP2, an SH2 domain-containing protein tyrosine phosphatase functioning as a signal transducer. Generally, different substitutions of a particular amino acid residue are observed in these diseases, indicating that the crucial factor is the residue being replaced. For a few codons, only one substitution is observed, suggesting the possibility of specific roles for the residue introduced. We analyzed the biochemical behavior and ligand-binding properties of all possible substitutions arising from single-base changes affecting codons 42, 139, 279, 282 and 468 to investigate the mechanisms underlying the invariant occurrence of the T42A, E139D and I282V substitutions in NS and the Y279C and T468M changes in LS. Our data demonstrate that the isoleucine-to-valine change at codon 282 is the only substitution at that position perturbing the stability of SHP2's closed conformation without impairing catalysis, while the threonine-to-alanine change at codon 42, but not other substitutions of that residue, promotes increased phosphopeptide-binding affinity. The recognition specificity of the C-SH2 domain bearing the E139D substitution differed substantially from its wild-type counterpart acquiring binding properties similar to those observed for the N-SH2 domain, revealing a novel mechanism of SHP2's functional dysregulation. Finally, while functional selection does not seem to occur for the substitutions at codons 279 and 468, we point to deamination of the methylated cytosine at nucleotide 1403 as the driving factor leading to the high prevalence of the T468M change in LS.
doi:10.1093/hmg/ddn099
PMCID: PMC2900904  PMID: 18372317
21.  VirusMINT: a viral protein interaction database 
Nucleic Acids Research  2008;37(Database issue):D669-D673.
Understanding the consequences on host physiology induced by viral infection requires complete understanding of the perturbations caused by virus proteins on the cellular protein interaction network. The VirusMINT database (http://mint.bio.uniroma2.it/virusmint/) aims at collecting all protein interactions between viral and human proteins reported in the literature. VirusMINT currently stores over 5000 interactions involving more than 490 unique viral proteins from more than 110 different viral strains. The whole data set can be easily queried through the search pages and the results can be displayed with a graphical viewer. The curation effort has focused on manuscripts reporting interactions between human proteins and proteins encoded by some of the most medically relevant viruses: papilloma viruses, human immunodeficiency virus 1, Epstein–Barr virus, hepatitis B virus, hepatitis C virus, herpes viruses and Simian virus 40.
doi:10.1093/nar/gkn739
PMCID: PMC2686573  PMID: 18974184
22.  MINT and IntAct contribute to the Second BioCreative challenge: serving the text-mining community with high quality molecular interaction data 
Genome Biology  2008;9(Suppl 2):S5.
Background
In the absence of consolidated pipelines to archive biological data electronically, information dispersed in the literature must be captured by manual annotation. Unfortunately, manual annotation is time consuming and the coverage of published interaction data is therefore far from complete. The use of text-mining tools to identify relevant publications and to assist in the initial information extraction could help to improve the efficiency of the curation process and, as a consequence, the database coverage of data available in the literature. The 2006 BioCreative competition was aimed at evaluating text-mining procedures in comparison with manual annotation of protein-protein interactions.
Results
To aid the BioCreative protein-protein interaction task, IntAct and MINT (Molecular INTeraction) provided both the training and the test datasets. Data from both databases are comparable because they were curated according to the same standards. During the manual curation process, the major cause of data loss in mining the articles for information was ambiguity in the mapping of the gene names to stable UniProtKB database identifiers. It was also observed that most of the information about interactions was contained only within the full-text of the publication; hence, text mining of protein-protein interaction data will require the analysis of the full-text of the articles and cannot be restricted to the abstract.
Conclusion
The development of text-mining tools to extract protein-protein interaction information may increase the literature coverage achieved by manual curation. To support the text-mining community, databases will highlight those sentences within the articles that describe the interactions. These will supply data-miners with a high quality dataset for algorithm development. Furthermore, the dictionary of terms created by the BioCreative competitors could enrich the synonym list of the PSI-MI (Proteomics Standards Initiative-Molecular Interactions) controlled vocabulary, which is used by both databases to annotate their data content.
doi:10.1186/gb-2008-9-s2-s5
PMCID: PMC2559989  PMID: 18834496
23.  The central proline rich region of POB1/REPS2 plays a regulatory role in epidermal growth factor receptor endocytosis by binding to 14-3-3 and SH3 domain-containing proteins 
BMC Biochemistry  2008;9:21.
Background
The human POB1/REPS2 (Partner of RalBP1) protein is highly conserved in mammals where it has been suggested to function as a molecular scaffold recruiting proteins involved in vesicular traffic and linking them to the actin cytoskeleton remodeling machinery. More recently POB1/REPS2 was found highly expressed in androgen-dependent prostate cancer cell lines, while one of its isoforms (isoform 2) is down regulated during prostate cancer progression.
Results
In this report we characterize the central proline rich domain of POB1/REPS2 and we describe for the first time its functional role in receptor endocytosis. We show that the ectopic expression of this domain has a dominant negative effect on the endocytosis of activated epidermal growth factor receptor (EGFR) while leaving transferrin receptor endocytosis unaffected. By a combination of different approaches (phage display, bioinformatics predictions, peptide arrays, mutagenic analysis, in vivo co-immunoprecipitation), we have identified two closely spaced binding motifs for 14-3-3 and for the SH3 of the proteins Amphiphysin II and Grb2. Unlike the wild type, proline rich domains altered in these motifs do not inhibit EGFR endocytosis, suggesting that these binding motifs play a functional role in this process.
Conclusion
Our findings are relevant to the characterization of the molecular mechanism underlying the involvement of POB1/REPS2, SH3 and 14-3-3 proteins in receptor endocytosis, suggesting that 14-3-3 could work by bridging the EGF receptor and the scaffold protein POB1/REPS2.
doi:10.1186/1471-2091-9-21
PMCID: PMC2494995  PMID: 18647389
24.  Binding to DPF-motif by the POB1 EH domain is responsible for POB1-Eps15 interaction 
BMC Biochemistry  2007;8:29.
Background
Eps15 homology (EH) domains are protein interaction modules binding to peptides containing Asn-Pro-Phe (NPF) motifs and mediating critical events during endocytosis and signal transduction. The EH domain of POB1 associates with Eps15, a protein characterized by a striking string of DPF triplets, 15 in human and 13 in mouse Eps15, at the C-terminus and lacking the typical EH-binding NPF motif.
Results
By screening a multivalent nonapeptide phage display library we have demonstrated that the EH domain of POB1 has a different recognition specificity since it binds to both NPF and DPF motifs. The region of mouse Eps15 responsible for the interaction with the EH domain of POB1 maps within an 18 amino acid peptide (residues 623–640) that includes three DPF repeats. Finally, mutational analysis in the EH domain of POB1 revealed that several solvent exposed residues, while distal to the binding pocket, mediate specific recognition of binding partners through both hydrophobic and electrostatic contacts.
Conclusion
In the present study we have analysed the binding specificity of the POB1 EH domain. We show that it differs from other EH domains since it interacts with both NPF- and DPF-containing sequences. These unusual binding properties could be attributed to a different conformation of the binding pocket that allows it to accommodate negative charges; moreover, we identified a cluster of solvent exposed Lys residues, which are only found in the EH domain of POB1, and influence binding to both NPF and DPF motifs. The characterization of structures of the DPF ligands described in this study and the POB1 EH domain will clearly determine the involvement of the positive patch and the rationalization of our findings.
doi:10.1186/1471-2091-8-29
PMCID: PMC2238750  PMID: 18154663
25.  Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions 
BMC Biology  2007;5:44.
Background
Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to these data very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions.
Results
The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy to access configuration.
Conclusion
The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sectors, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.
doi:10.1186/1741-7007-5-44
PMCID: PMC2189715  PMID: 17925023
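The tab-delimited MITAB2.5 format described in the entry above lends itself to very short parsers. The following is a minimal Python sketch, not a reference implementation: the dictionary keys are labels invented here for the 15 MITAB2.5 columns, the sample record is made up, and splitting multi-valued fields on "|" is a simplification that ignores pipes inside quoted values.

```python
import csv
import io

# Labels chosen here for the 15 MITAB2.5 columns, in file order.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_methods", "first_authors", "publication_ids",
    "taxid_a", "taxid_b", "interaction_types", "source_databases",
    "interaction_ids", "confidence_values",
]


def split_field(value):
    """Split a multi-valued field on '|'; a lone '-' marks an empty field."""
    if value == "-":
        return []
    return value.split("|")


def parse_mitab25(text):
    """Parse MITAB2.5 text into a list of dicts keyed by MITAB25_COLUMNS."""
    # QUOTE_NONE keeps the double quotes used in values such as psi-mi:"MI:0018".
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and header/comment lines
        if len(row) < len(MITAB25_COLUMNS):
            raise ValueError(f"expected 15 columns, got {len(row)}")
        records.append(
            {name: split_field(value) for name, value in zip(MITAB25_COLUMNS, row)}
        )
    return records


# Invented record for illustration only; identifiers are not real entries.
SAMPLE = "\t".join([
    "uniprotkb:P00001", "uniprotkb:P00002", "-", "-",
    "uniprotkb:geneA(gene name)", "uniprotkb:geneB(gene name)",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2007)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0471"(MINT)',
    "mint:EXAMPLE-1|imex:EXAMPLE-2", "-",
])

records = parse_mitab25(SAMPLE + "\n")
```

Returning every field as a list keeps single-valued and multi-valued columns uniform, so downstream code does not need to special-case the "-" placeholder.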
