Related Articles

1.  The topology of the bacterial co-conserved protein network and its implications for predicting protein function 
BMC Genomics  2008;9:313.
Background
Protein-protein interaction networks are most often generated from physical protein-protein interaction data. Co-conservation, also known as phylogenetic profiles, is an alternative source of information for generating protein interaction networks. Co-conservation methods generate interaction networks among proteins that are gained or lost together through evolution. Co-conservation is a particularly useful technique in compact bacterial genomes. Prior studies in yeast suggest that the topology of protein-protein interaction networks generated from physical interaction assays can offer important insight into protein function. Here, we hypothesize that in bacteria, the topology of protein interaction networks derived via co-conservation information could similarly improve methods for predicting protein function. Since the topology of bacterial co-conservation protein-protein interaction networks has not previously been studied in depth, we first perform such an analysis for co-conservation networks in E. coli K12. Next, we demonstrate one way in which network connectivity measures and global and local function distribution can be exploited to predict protein function for previously uncharacterized proteins.
Results
Our results showed that, like most biological networks, our bacterial co-conserved protein-protein interaction networks had scale-free topologies. Our results indicated that some properties of the physical yeast interaction network hold in our bacterial co-conservation networks, such as high connectivity for essential proteins. However, the high connectivity among protein complexes in the yeast physical network was not seen in the co-conservation network that uses all bacteria as the reference set. We found that the distribution of node connectivity varied by functional category and could be informative for function prediction. By integrating functional information from different annotation sources and using the network topology, we were able to infer function for uncharacterized proteins.
Conclusion
Interaction networks based on co-conservation can contain information distinct from networks based on physical or other interaction types. Our study has shown co-conservation-based networks to exhibit a scale-free topology, as expected for biological networks. We also revealed ways that connectivity in our networks can be informative for the functional characterization of proteins.
doi:10.1186/1471-2164-9-313
PMCID: PMC2488357  PMID: 18590549
2.  Predicting protein–protein interaction by searching evolutionary tree automorphism space 
Bioinformatics (Oxford, England)  2005;21(Suppl 1):i241-i250.
Motivation
Uncovering the protein–protein interaction network is a fundamental step in the quest to understand the molecular machinery of a cell. This motivates the search for efficient computational methods for predicting such interactions. Among the available predictors are those that are based on the co-evolution hypothesis “evolutionary trees of protein families (that are known to interact) are expected to have similar topologies”. Many of these methods are limited by the fact that they can handle only a small number of protein sequences. Also, details on evolutionary tree topology are missing as they use similarity matrices in lieu of the trees.
Results
We introduce MORPH, a new algorithm for predicting protein interaction partners between members of two protein families that are known to interact. Our approach can also be seen as a new method for searching the best superposition of the corresponding evolutionary trees based on the tree automorphism group. We discuss relevant facts related to the predictability of protein–protein interaction based on their co-evolution. When compared with related computational approaches, our method reduces the search space by ~3 × 10^5-fold and at the same time increases the accuracy of predicting correct binding partners.
doi:10.1093/bioinformatics/bti1009
PMCID: PMC1618802  PMID: 15961463
3.  An entropic characterization of protein interaction networks and cellular robustness 
The structure of molecular networks is believed to determine important aspects of their cellular function, such as the organismal resilience against random perturbations. Ultimately, however, cellular behaviour is determined by the dynamical processes, which are constrained by network topology. The present work is based on a fundamental relation from dynamical systems theory, which states that the macroscopic resilience of a steady state is correlated with the uncertainty in the underlying microscopic processes, a property that can be measured by entropy. Here, we use recent network data from large-scale protein interaction screens to characterize the diversity of possible pathways in terms of network entropy. This measure has its origin in statistical mechanics and amounts to a global characterization of both structural and dynamical resilience in terms of microscopic elements. We demonstrate how this approach can be used to rank network elements according to their contribution to network entropy and also investigate how this suggested ranking reflects on the functional data provided by gene knockouts and RNAi experiments in yeast and Caenorhabditis elegans. Our analysis shows that knockouts of proteins with large contribution to network entropy are preferentially lethal. This observation is robust with respect to several possible errors and biases in the experimental data. It underscores the significance of entropy as a fundamental invariant of the dynamical system, and as a measure of structural and dynamical properties of networks. Our analytical approach goes beyond the phenomenological studies of cellular robustness based on local network observables, such as connectivity. One of its principal achievements is to provide a rationale to study proxies of cellular resilience and rank proteins according to their importance within the global network context.
doi:10.1098/rsif.2006.0140
PMCID: PMC1885358  PMID: 17015299
network entropy; protein interactions; cellular robustness
4.  An Overview of the Statistical Methods Used for Inferring Gene Regulatory Networks and Protein-Protein Interaction Networks 
Advances in Bioinformatics  2013;2013:953814.
The large influx of data from high-throughput genomic and proteomic technologies has encouraged the researchers to seek approaches for understanding the structure of gene regulatory networks and proteomic networks. This work reviews some of the most important statistical methods used for modeling of gene regulatory networks (GRNs) and protein-protein interaction (PPI) networks. The paper focuses on the recent advances in the statistical graphical modeling techniques, state-space representation models, and information theoretic methods that were proposed for inferring the topology of GRNs. It appears that the problem of inferring the structure of PPI networks is quite different from that of GRNs. Clustering and probabilistic graphical modeling techniques are of prime importance in the statistical inference of PPI networks, and some of the recent approaches using these techniques are also reviewed in this paper. Performance evaluation criteria for the approaches used for modeling GRNs and PPI networks are also discussed.
doi:10.1155/2013/953814
PMCID: PMC3594945
5.  A comprehensive molecular interaction map for Hepatitis B virus and drug designing of a novel inhibitor for Hepatitis B X protein 
Bioinformation  2011;7(1):9-14.
Hepatitis B virus (HBV) infection is a leading source of liver diseases such as hepatitis, cirrhosis and hepatocellular carcinoma. In this study, we use computational methods to improve our understanding of the complex interactions that occur between molecules related to HBV. Due to the complexity of the disease and the numerous molecular players involved, we devised a method to construct a systemic network of interactions of the processes ongoing in patients affected by HBV. The network is based on high-throughput data, refined semi-automatically with carefully curated literature-based information. We find that some nodes in the network prove to be topologically important; in particular, HBx is also known to be an important target protein used for the treatment of HBV. Therefore, HBx protein is the preferential choice for inhibition to stop the proteolytic processing. Hence, the 3D structure of HBx protein was downloaded from PDB. Ligands for the active site were designed using LIGBUILDER. The HBx protein's active site was explored to find the critical interaction patterns for inhibitor binding by molecular docking with AUTODOCK Vina. It should be noted that these predicted data should be validated using suitable assays for further consideration.
PMCID: PMC3163926  PMID: 21904432
Hepatitis B virus; HBx protein; PathVisio; Molecular-interaction map; Virtual screening; Docking; Inhibitor
6.  SNAVI: Desktop application for analysis and visualization of large-scale signaling networks 
BMC Systems Biology  2009;3:10.
Background
Studies of cellular signaling indicate that signal transduction pathways combine to form large networks of interactions. Viewing protein-protein and ligand-protein interactions as graphs (networks), where biomolecules are represented as nodes and their interactions are represented as links, is a promising approach for integrating experimental results from different sources to achieve a systematic understanding of the molecular mechanisms driving cell phenotype. The emergence of large-scale signaling networks provides an opportunity for topological statistical analysis while visualization of such networks represents a challenge.
Results
SNAVI is a Windows-based desktop application that implements standard network analysis methods to compute the clustering, connectivity distribution, and detection of network motifs, and provides means to visualize networks and network motifs. SNAVI is capable of generating linked web pages from network datasets loaded in text format. SNAVI can also create networks from lists of gene or protein names.
Conclusion
SNAVI is a useful tool for analyzing, visualizing and sharing cell signaling data. SNAVI is open source free software. The installation may be downloaded from: . The source code can be accessed from:
doi:10.1186/1752-0509-3-10
PMCID: PMC2637233  PMID: 19154595
7.  Exploiting Protein-Protein Interaction Networks for Genome-Wide Disease-Gene Prioritization 
PLoS ONE  2012;7(9):e43557.
Complex genetic disorders often involve products of multiple genes acting cooperatively. Hence, the pathophenotype is the outcome of the perturbations in the underlying pathways, where gene products cooperate through various mechanisms such as protein-protein interactions. Pinpointing the decisive elements of such disease pathways is still challenging. Over the last years, computational approaches exploiting interaction network topology have been successfully applied to prioritize individual genes involved in diseases. Although linkage intervals provide a list of disease-gene candidates, recent genome-wide studies demonstrate that genes not associated with any known linkage interval may also contribute to the disease phenotype. Network based prioritization methods help highlighting such associations. Still, there is a need for robust methods that capture the interplay among disease-associated genes mediated by the topology of the network. Here, we propose a genome-wide network-based prioritization framework named GUILD. This framework implements four network-based disease-gene prioritization algorithms. We analyze the performance of these algorithms in dozens of disease phenotypes. The algorithms in GUILD are compared to state-of-the-art network topology based algorithms for prioritization of genes. As a proof of principle, we investigate top-ranking genes in Alzheimer's disease (AD), diabetes and AIDS using disease-gene associations from various sources. We show that GUILD is able to significantly highlight disease-gene associations that are not used a priori. Our findings suggest that GUILD helps to identify genes implicated in the pathology of human disorders independent of the loci associated with the disorders.
doi:10.1371/journal.pone.0043557
PMCID: PMC3448640  PMID: 23028459
8.  Investigation of factors affecting prediction of protein-protein interaction networks by phylogenetic profiling 
BMC Genomics  2007;8:393.
Background
The use of computational methods for predicting protein interaction networks will continue to grow with the number of fully sequenced genomes available. The Co-Conservation method, also known as the Phylogenetic profiles method, is a well-established computational tool for predicting functional relationships between proteins.
Results
Here, we examined how various aspects of this method affect the accuracy and topology of protein interaction networks. We have shown that the choice of reference genome influences the number of predictions involving proteins of previously unknown function, the accuracy of predicted interactions, and the topology of predicted interaction networks. We show that while such results are relatively insensitive to the E-value threshold used in defining homologs, predicted interactions are influenced by the similarity metric that is employed. We show that differences in predicted protein interactions are biologically meaningful, where judicious selection of reference genomes, or use of a new scoring scheme that explicitly considers reference genome relatedness, produces known protein interactions as well as predicted protein interactions involving coordinated biological processes that are not accessible using currently available databases.
Conclusion
These studies should prove valuable for future studies seeking to further improve phylogenetic profiling methodologies, as well as for efforts to efficiently employ such methods to develop new biological insights.
doi:10.1186/1471-2164-8-393
PMCID: PMC2204017  PMID: 17967189
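The phylogenetic profiling idea described in this entry can be illustrated with a minimal sketch. It assumes each protein is represented by the set of reference genomes in which a homolog was found, and it links two proteins when their profiles are similar enough. The Jaccard metric, the 0.75 threshold, and all protein and genome names are assumptions chosen for illustration; the abstract does not state which similarity metrics or thresholds the authors compared.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of genome names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_conservation_edges(profiles, threshold=0.75):
    """Link every protein pair whose profiles are at least `threshold` similar."""
    edges = []
    for p, q in combinations(sorted(profiles), 2):
        score = jaccard(profiles[p], profiles[q])
        if score >= threshold:
            edges.append((p, q, score))
    return edges


# Hypothetical presence/absence data: protein -> genomes containing a homolog.
profiles = {
    "protA": {"g1", "g2", "g3", "g5"},
    "protB": {"g1", "g2", "g3", "g5"},
    "protC": {"g1", "g2", "g3"},
    "protD": {"g4", "g6"},
}
edges = co_conservation_edges(profiles)
```

Changing the set of reference genomes changes every profile, which is one way to see why the choice of reference genomes can alter the predicted network.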
9.  WNP: A Novel Algorithm for Gene Products Annotation from Weighted Functional Networks 
PLoS ONE  2012;7(6):e38767.
Predicting the biological function of all the genes of an organism is one of the fundamental goals of computational systems biology. In the last decade, high-throughput experimental methods for studying the functional interactions between gene products (GPs) have been combined with computational approaches based on Bayesian networks for data integration. The result of these computational approaches is an interaction network with weighted links representing connectivity likelihood between two functionally related GPs. The weighted network generated by these computational approaches can be used to predict annotations for functionally uncharacterized GPs. Here we introduce Weighted Network Predictor (WNP), a novel algorithm for function prediction of biologically uncharacterized GPs. Tests conducted on simulated data show that WNP outperforms five other state-of-the-art methods in terms of both specificity and sensitivity and that it is able to better exploit and propagate the functional and topological information of the network. We apply our method to Saccharomyces cerevisiae yeast and Arabidopsis thaliana networks and we predict Gene Ontology function for about 500 and 10000 uncharacterized GPs respectively.
doi:10.1371/journal.pone.0038767
PMCID: PMC3386258  PMID: 22761703
10.  Dominating Biological Networks 
PLoS ONE  2011;6(8):e23016.
Proteins are essential macromolecules of life that carry out most cellular processes. Since proteins aggregate to perform function, and since protein-protein interaction (PPI) networks model these aggregations, one would expect to uncover new biology from PPI network topology. Hence, using PPI networks to predict protein function and role of protein pathways in disease has received attention. A debate remains open about whether network properties of “biologically central (BC)” genes (i.e., their protein products), such as those involved in aging, cancer, infectious diseases, or signaling and drug-targeted pathways, exhibit some topological centrality compared to the rest of the proteins in the human PPI network.
To help resolve this debate, we design new network-based approaches and apply them to get new insight into biological function and disease. We hypothesize that BC genes have a topologically central (TC) role in the human PPI network. We propose two different concepts of topological centrality. We design a new centrality measure to capture complex wirings of proteins in the network that identifies as TC those proteins that reside in dense extended network neighborhoods. Also, we use the notion of domination and find dominating sets (DSs) in the PPI network, i.e., sets of proteins such that every protein is either in the DS or is a neighbor of the DS. Clearly, a DS has a TC role, as it enables efficient communication between different network parts.
We find statistically significant enrichment in BC genes of TC nodes and outperform the existing methods, indicating that genes involved in key biological processes occupy topologically complex and dense regions of the network and correspond to its “spine”, which connects all other network parts and can thus pass cellular signals efficiently throughout the network. To our knowledge, this is the first study that explores domination in the context of PPI networks.
doi:10.1371/journal.pone.0023016
PMCID: PMC3162560  PMID: 21887225
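The dominating-set definition in this entry (every protein is either in the set or a neighbor of a member) can be made concrete with a short sketch. The greedy heuristic below is an assumption chosen for illustration: the abstract does not say how the authors computed their dominating sets, and a greedy choice is not guaranteed to yield a minimum set. The toy graph and node names are hypothetical.

```python
def greedy_dominating_set(adjacency):
    """Return a dominating set chosen greedily (not guaranteed to be minimum)."""
    undominated = set(adjacency)
    chosen = set()
    while undominated:
        # Pick the node whose closed neighborhood covers the most
        # still-undominated nodes; ties go to the alphabetically first node.
        best = max(
            sorted(adjacency),
            key=lambda n: len(({n} | adjacency[n]) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | adjacency[best]
    return chosen


def is_dominating(adjacency, candidate):
    """Check that every node is in `candidate` or adjacent to a member of it."""
    return all(n in candidate or adjacency[n] & candidate for n in adjacency)


# Hypothetical undirected toy network: node -> set of neighbors.
adjacency = {
    "h": {"a", "b", "c"},
    "a": {"h"},
    "b": {"h"},
    "c": {"h", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
ds = greedy_dominating_set(adjacency)
```

On this toy graph the heuristic first picks the hub `h`, then `d`, after which every node is either selected or adjacent to a selected node.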
11.  Unveiling Protein Functions through the Dynamics of the Interaction Network 
PLoS ONE  2011;6(3):e17679.
Protein interaction networks have become a tool to study biological processes, either for predicting molecular functions or for designing proper new drugs to regulate the main biological interactions. Furthermore, such networks are known to be organized in sub-networks of proteins contributing to the same cellular function. However, protein function prediction is not accurate, and each protein has traditionally been assigned to only one function by the network formalism. By considering the network of physical interactions between yeast proteins together with a manual and single functional classification scheme, we introduce a method able to reveal important information on protein function, at both micro- and macro-scale. In particular, inspecting the properties of oscillatory dynamics on top of the protein interaction network leads to the identification of misclassification problems in protein function assignments, as well as to the correct identification of protein functions. We also demonstrate that our approach can give a network representation of the meta-organization of biological processes by unraveling the interactions between different functional classes.
doi:10.1371/journal.pone.0017679
PMCID: PMC3052369  PMID: 21408013
12.  Nonparametric Simulation of Signal Transduction Networks with Semi-Synchronized Update 
PLoS ONE  2012;7(6):e39643.
Simulating signal transduction in cellular signaling networks provides predictions of network dynamics by quantifying the changes in concentration and activity-level of the individual proteins. Since numerical values of kinetic parameters might be difficult to obtain, it is imperative to develop non-parametric approaches that combine the connectivity of a network with the response of individual proteins to signals which travel through the network. The activity levels of signaling proteins computed through existing non-parametric modeling tools do not show significant correlations with the observed values in experimental results. In this work we developed a non-parametric computational framework to describe the profile of the evolving process and the time course of the proportion of active form of molecules in the signal transduction networks. The model is also capable of incorporating perturbations. The model was validated on four signaling networks showing that it can effectively uncover the activity levels and trends of response during signal transduction process.
doi:10.1371/journal.pone.0039643
PMCID: PMC3380921  PMID: 22737250
13.  Keystone species and food webs 
Different species are of different importance in maintaining ecosystem functions in natural communities. Quantitative approaches are needed to identify unusually important or influential ‘keystone’ species, particularly for conservation purposes. Since the importance of some species may largely be the consequence of their rich interaction structure, one possible quantitative approach to identify the most influential species is to study their position in the network of interspecific interactions. In this paper, I discuss the role of network analysis (and centrality indices in particular) in this process and present a new and simple approach to characterizing the interaction structures of each species in a complex network. Understanding the linkage between structure and dynamics is a condition to test the results of topological studies; I briefly overview our current knowledge on this issue. The study of key nodes in networks has become of increasingly general interest in several disciplines: I will discuss some parallels. Finally, I will argue that conservation biology needs to devote more attention to identifying and conserving keystone species and relatively less attention to rarity.
doi:10.1098/rstb.2008.0335
PMCID: PMC2685432  PMID: 19451124
centrality; food web; indirect effect; keystone species; network analysis
14.  A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data 
BMC Systems Biology  2012;6:15.
Background
Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of network. Therefore, a new robust essential protein discovery method would be of great value.
Results
In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). Especially, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins.
Conclusions
We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
doi:10.1186/1752-0509-6-15
PMCID: PMC3325894  PMID: 22405054
15.  Topological structure analysis of the protein–protein interaction network in budding yeast 
Nucleic Acids Research  2003;31(9):2443-2450.
Interaction detection methods have led to the discovery of thousands of interactions between proteins, and discerning relevance within large-scale data sets is important to present-day biology. Here, a spectral method derived from graph theory was introduced to uncover hidden topological structures (i.e. quasi-cliques and quasi-bipartites) of complicated protein–protein interaction networks. Our analyses suggest that these hidden topological structures consist of biologically relevant functional groups. This result motivates a new method to predict the function of uncharacterized proteins based on the classification of known proteins within topological structures. Using this spectral analysis method, 48 quasi-cliques and six quasi-bipartites were isolated from a network involving 11 855 interactions among 2617 proteins in budding yeast, and 76 uncharacterized proteins were assigned functions.
PMCID: PMC154226  PMID: 12711690
16.  Equal Opportunity for Low-Degree Network Nodes: A PageRank-Based Method for Protein Target Identification in Metabolic Graphs 
PLoS ONE  2013;8(1):e54204.
Biological network data, such as metabolic, signaling or physical interaction graphs of proteins, are increasingly available in public repositories for important species. Tools for the quantitative analysis of these networks are being developed today. Protein network-based drug target identification methods usually return protein hubs with large degrees in the networks as potentially important targets. Some known, important protein targets, however, are not hubs at all, and perturbing protein hubs in these networks may have several unwanted physiological effects, due to their interaction with numerous partners. Here, we show a novel method applicable in networks with directed edges (such as metabolic networks) that compensates for the low degree (non-hub) vertices in the network, and identifies important nodes, regardless of their hub properties. Our method computes the PageRank for the nodes of the network, and divides the PageRank by the in-degree (i.e., the number of incoming edges) of the node. This quotient is the same in all nodes in an undirected graph (even for large- and low-degree nodes, that is, for hubs and non-hubs as well), but may differ significantly from node to node in directed graphs. We suggest assigning importance to non-hub nodes with a large PageRank/in-degree quotient. Consequently, our method gives high scores to nodes with large PageRank relative to their degrees; therefore, important non-hub nodes can easily be identified in large networks. We demonstrate that these relatively high PageRank scores have biological relevance: the method correctly finds numerous already validated drug targets in distinct organisms (Mycobacterium tuberculosis, Plasmodium falciparum and MRSA Staphylococcus aureus), and consequently, it may suggest new possible protein targets as well.
Additionally, our scoring method was not chosen arbitrarily: its value for all nodes of all undirected graphs is constant; therefore its high value captures importance in the directed edge structure of the graph.
doi:10.1371/journal.pone.0054204
PMCID: PMC3558500  PMID: 23382878
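The PageRank/in-degree quotient described in this entry can be sketched in a few lines. This is an illustrative power-iteration implementation, not the authors' code: the damping value, the iteration count, the uniform handling of nodes without outgoing edges, and both toy graphs are assumptions. Nodes with no incoming edges are skipped because the quotient is undefined for them.

```python
def pagerank(out_edges, damping=0.85, iterations=200):
    """Power-iteration PageRank for a graph given as node -> set of successors."""
    nodes = sorted(out_edges)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_edges[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_edges[v]:
                new_rank[w] += damping * rank[v] / len(out_edges[v])
        rank = new_rank
    return rank


def pagerank_per_in_degree(out_edges, damping=0.85):
    """PageRank divided by in-degree; nodes with in-degree 0 are omitted."""
    rank = pagerank(out_edges, damping)
    in_degree = {v: 0 for v in out_edges}
    for v in out_edges:
        for w in out_edges[v]:
            in_degree[w] += 1
    return {v: rank[v] / in_degree[v] for v in out_edges if in_degree[v] > 0}


# Hypothetical directed toy graph: the quotient differs from node to node.
directed = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"c"}}
directed_scores = pagerank_per_in_degree(directed)

# Hypothetical undirected graph (each edge stored in both directions):
# a triangle a-b-c with a pendant node d attached to a.
undirected = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
undirected_scores = pagerank_per_in_degree(undirected, damping=1.0)
```

The undirected check uses `damping=1.0`, that is, the plain random walk, for which the stationary probability on a connected, non-bipartite undirected graph is proportional to degree, so the quotient is the same at every node. With damping below 1 the quotient on undirected graphs is only approximately constant.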
17.  Identifying protein complexes from interaction networks based on clique percolation and distance restriction 
BMC Genomics  2010;11(Suppl 2):S10.
Background
Identification of protein complexes in large interaction networks is crucial to understand principles of cellular organization and predict protein functions, which is one of the most important issues in the post-genomic era. Each protein might belong to multiple protein complexes in real protein-protein interaction networks. Identifying overlapping protein complexes from protein-protein interaction networks is an important research topic.
Result
As an effective algorithm for identifying overlapping module structures, the clique percolation method (CPM) has a wide range of applications in social networks and biological networks. However, the recognition accuracy of CPM is low. Furthermore, CPM is unfit for identifying meso-scale protein complexes when it is applied to protein-protein interaction networks. In this paper, we propose a new topological model by extending the definition of the k-clique community of CPM and introducing a distance restriction, and develop a novel algorithm called CP-DR based on the new topological model for identifying protein complexes. In this new algorithm, the protein complex size is restricted by a distance constraint to overcome the shortcomings of CPM. CP-DR is applied to the protein interaction network of Saccharomyces cerevisiae and identifies many well-known complexes.
Conclusion
The proposed algorithm CP-DR, based on clique percolation and distance restriction, makes it possible to identify dense subgraphs in protein interaction networks, a large number of which correspond to known protein complexes. Compared with CPM, CP-DR performs better.
doi:10.1186/1471-2164-11-S2-S10
PMCID: PMC2975417  PMID: 21047377
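For orientation, the baseline clique percolation method that this entry extends can be sketched as follows. The code shows plain CPM only: a k-clique community is the union of k-cliques that can be reached from one another through cliques sharing k-1 nodes. It does not implement the distance restriction of CP-DR, whose details the abstract does not give. The brute-force clique enumeration and the toy graph are assumptions suitable only for very small examples.

```python
from itertools import combinations


def k_clique_communities(adjacency, k=3):
    """Plain clique percolation: merge k-cliques that share k-1 nodes."""
    nodes = sorted(adjacency)
    # Enumerate all k-cliques by brute force (toy graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(v in adjacency[u] for u, v in combinations(c, 2))
    ]
    # Union-find over clique indices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=sorted)


# Hypothetical toy network: triangles abc and bcd share an edge,
# while triangle def touches bcd only at node d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

communities = k_clique_communities(adjacency, k=3)
```

Node `d` ends up in both communities, which illustrates the overlapping membership that makes CPM attractive for proteins participating in more than one complex.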
18.  Computational Prediction of Heme-Binding Residues by Exploiting Residue Interaction Network 
PLoS ONE  2011;6(10):e25560.
Computational identification of heme-binding residues is beneficial for predicting and designing novel heme proteins. Here we proposed a novel method for heme-binding residue prediction by exploiting topological properties of these residues in the residue interaction networks derived from three-dimensional structures. Comprehensive analysis showed that key residues located in heme-binding regions are generally associated with the nodes with higher degree, closeness and betweenness, but lower clustering coefficient in the network. HemeNet, a support vector machine (SVM) based predictor, was developed to identify heme-binding residues by combining topological features with existing sequence and structural features. The results showed that incorporation of network-based features significantly improved the prediction performance. We also compared the residue interaction networks of heme proteins before and after heme binding and found that the topological features can well characterize the heme-binding sites of apo structures as well as those of holo structures, which led to reliable performance improvement as we applied HemeNet to predicting the binding residues of proteins in the heme-free state. HemeNet web server is freely accessible at http://mleg.cse.sc.edu/hemeNet/.
doi:10.1371/journal.pone.0025560
PMCID: PMC3184988  PMID: 21991319
19.  Interaction generality, a measurement to assess the reliability of a protein–protein interaction 
Nucleic Acids Research  2002;30(5):1163-1168.
Here we introduce the ‘interaction generality’ measure, a new method for computationally assessing the reliability of protein–protein interactions obtained in biological experiments. This measure is basically the number of proteins involved in a given interaction and also adopts the idea that interactions observed in a complicated interaction network are likely to be true positives. Using a group of yeast protein–protein interactions identified in various biological experiments, we show that interactions with low generalities are more likely to be reproducible in other independent assays. We constructed more reliable networks by eliminating interactions whose generalities were above a particular threshold. The rate of interactions with common cellular roles increased from 63% in the unadjusted estimates to 79% in the refined networks. As a result, the rate of cross-talk between proteins with different cellular roles decreased, enabling very clear predictions of the functions of some unknown proteins. The results suggest that the interaction generality measure will make interaction data more useful in all organisms and may yield insights into the biological roles of the proteins studied.
PMCID: PMC101243  PMID: 11861907
20.  Conditional random field approach to prediction of protein-protein interactions using domain information 
BMC Systems Biology  2011;5(Suppl 1):S8.
Background
To understand cellular systems and biological networks, it is important to analyze the functions and interactions of proteins and domains. Many methods for predicting protein-protein interactions have been developed. It is known that mutual information between residues at interacting sites can be higher than that at non-interacting sites. This observation rests on the idea that amino acid residues at interacting sites have coevolved with the corresponding residues in the partner proteins. Several studies have shown that such mutual information is useful for identifying contact residues in interacting proteins.
Results
We propose novel methods using conditional random fields for predicting protein-protein interactions. We focus on the mutual information between residues and combine it with conditional random fields. In these methods, protein-protein interactions are modeled using domain-domain interactions. We perform computational experiments using protein-protein interaction datasets for several organisms and calculate the AUC (area under the ROC curve) score. The results suggest that our proposed methods, with and without mutual information, outperform the EM (Expectation Maximization) method proposed by Deng et al., which is one of the best predictors based on domain-domain interactions.
Conclusions
We propose novel methods using conditional random fields with and without mutual information between domains. Our methods based on domain-domain interactions are useful for predicting protein-protein interactions.
doi:10.1186/1752-0509-5-S1-S8
PMCID: PMC3121124  PMID: 21689483
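The AUC score used for evaluation in this abstract can be computed directly from its rank-based definition: the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair. The sketch below is illustrative only; the function name, labels, and scores are made up and do not come from the paper.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = auc_score(labels, scores)
print(auc)
```

Here 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. The pairwise loop is quadratic; a sort-based version would be preferable for large datasets.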
21.  Determining modular organization of protein interaction networks by maximizing modularity density 
BMC Systems Biology  2010;4(Suppl 2):S10.
Background
With the ever-increasing amount of available data on biological networks, modeling and understanding the structure of these large networks is an important problem with profound biological implications. Cellular functions and biochemical events are carried out in a coordinated manner by groups of proteins interacting with each other in biological modules. Identifying such modules in protein interaction networks is very important for understanding the structure and function of these fundamental cellular networks. Developing an effective computational method to uncover biological modules is therefore both challenging and indispensable.
Results
The purpose of this study is to introduce a new quantitative measure, modularity density, into the field of biomolecular networks and to develop new algorithms for detecting functional modules in protein-protein interaction (PPI) networks. Specifically, we adopt simulated annealing (SA) to maximize the modularity density and evaluate its efficiency on simulated networks. To address the computational complexity of the SA procedure, we devise a spectral method for optimizing the index and apply it to a yeast PPI network.
Conclusions
Our analysis of the modules detected by the present method suggests that most of them are biologically significant in the context of protein complexes. Comparison with MCL and modularity-based methods shows the efficiency of our method.
doi:10.1186/1752-0509-4-S2-S10
PMCID: PMC2982684  PMID: 20840724
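The abstract does not spell out the modularity density formula. The sketch below assumes the commonly cited definition, in which each module contributes (twice its internal edge count minus its outgoing edge count) divided by its size, and the contributions are summed over all modules. The toy graph and function name are hypothetical, and neither the simulated annealing nor the spectral optimization step is shown.

```python
def modularity_density(edges, communities):
    """Sum over modules of (2 * internal edges - outgoing edges) / module size.

    Assumed definition; internal edges are doubled because the underlying
    adjacency-matrix sum counts each undirected internal edge twice.
    """
    total = 0.0
    for community in communities:
        members = set(community)
        if not members:
            continue
        internal = sum(1 for u, v in edges if u in members and v in members)
        outgoing = sum(1 for u, v in edges if (u in members) != (v in members))
        total += (2 * internal - outgoing) / len(members)
    return total

# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = modularity_density(edges, [[0, 1, 2], [3, 4, 5]])
merged = modularity_density(edges, [[0, 1, 2, 3, 4, 5]])
print(split, merged)
```

Splitting the graph into its two triangles scores 10/3, higher than the 7/3 obtained by treating the whole graph as one module, so a search procedure maximizing this index would prefer the two-module partition.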
22.  Automatic extraction of gene ontology annotation and its correlation with clusters in protein networks 
BMC Bioinformatics  2007;8:243.
Background
Uncovering the cellular roles of a protein is a task of tremendous importance and complexity that requires dedicated experimental work as well as often sophisticated data mining and processing tools. A protein's functions, often referred to as its annotations, are believed to manifest themselves through the topology of inter-protein interaction networks. In particular, there is a growing body of evidence that proteins performing the same function are more likely to interact with each other than with proteins with other functions. However, since functional annotation and protein network topology are often studied separately, the direct relationship between them has not been comprehensively demonstrated. In addition to having general biological significance, such a demonstration would further validate the data extraction and processing methods used to compose protein annotation and protein-protein interaction datasets.
Results
We developed a method for automatic extraction of protein functional annotation from scientific text based on the Natural Language Processing (NLP) technology. For the protein annotation extracted from the entire PubMed, we evaluated the precision and recall rates, and compared the performance of the automatic extraction technology to that of manual curation used in public Gene Ontology (GO) annotation. In the second part of our presentation, we reported a large-scale investigation into the correspondence between communities in the literature-based protein networks and GO annotation groups of functionally related proteins. We found a comprehensive two-way match: proteins within biological annotation groups form significantly denser linked network clusters than expected by chance and, conversely, densely linked network communities exhibit a pronounced non-random overlap with GO groups. We also expanded the publicly available GO biological process annotation using the relations extracted by our NLP technology. An increase in the number and size of GO groups without any noticeable decrease of the link density within the groups indicated that this expansion significantly broadens the public GO annotation without diluting its quality. We revealed that functional GO annotation correlates mostly with clustering in a physical interaction protein network, while its overlap with indirect regulatory network communities is two to three times smaller.
Conclusion
Protein functional annotations extracted by the NLP technology expand and enrich the existing GO annotation system. The GO functional modularity correlates mostly with the clustering in the physical interaction network, suggesting an essential role for the structural organization maintained by these interactions. Reciprocally, clustering of proteins in physical interaction networks can serve as evidence of their functional similarity.
doi:10.1186/1471-2105-8-243
PMCID: PMC1940026  PMID: 17620146
23.  Discovering cancer genes by integrating network and functional properties 
BMC Medical Genomics  2009;2:61.
Background
Identification of novel cancer-causing genes is one of the main goals in cancer research. The rapid accumulation of genome-wide protein-protein interaction (PPI) data in humans has provided a new basis for studying the topological features of cancer genes in cellular networks. It is important to integrate multiple genomic data sources, including PPI networks, protein domains and Gene Ontology (GO) annotations, to facilitate the identification of cancer genes.
Methods
Topological features of the PPI network, as well as protein domain compositions, enrichment of gene ontology categories, sequence and evolutionary conservation features were extracted and compared between cancer genes and other genes. The predictive power of various classifiers for identification of cancer genes was evaluated by cross validation. Experimental validation of a subset of the prediction results was conducted using siRNA knockdown and viability assays in human colon cancer cell line DLD-1.
Results
Cross validation demonstrated the advantageous performance of classifiers based on support vector machines (SVMs) that included the topological features from the PPI network, protein domain compositions, and GO annotations. We then applied the trained SVM classifier to human genes to prioritize putative cancer genes. siRNA knockdown of several SVM-predicted cancer genes greatly reduced cell viability in the human colon cancer cell line DLD-1.
Conclusion
Topological features of PPI networks, protein domain compositions and GO annotations are good predictors of cancer genes. The SVM classifier integrates multiple features and as such is useful for prioritizing candidate cancer genes for experimental validations.
doi:10.1186/1755-8794-2-61
PMCID: PMC2758898  PMID: 19765316
24.  Deciphering modular and dynamic behaviors of transcriptional networks 
Genomic Medicine  2007;1(1-2):19-28.
The coordinated and dynamic modulation or interaction of genes or proteins is an important mechanism that cells use in functional regulation. Recent studies have shown that many transcriptional networks exhibit a scale-free topology and hierarchical modular architecture. It has also been shown that transcriptional networks or pathways are dynamic and behave only in certain controlled ways in response to disease development, changing cellular conditions, and different environmental factors. Moreover, evolutionarily conserved and divergent transcriptional modules underlie fundamental and species-specific molecular mechanisms controlling disease development or cellular phenotypes. Various computational algorithms have been developed to explore transcriptional networks and modules from gene expression data. In silico studies have also been conducted to mimic the dynamic behavior of regulatory networks, analyzing how disease or cellular phenotypes arise from the connectivity or networks of genes and their products. Here, we review recent developments in computational biology research on deciphering the modular and dynamic behaviors of transcriptional networks, highlighting important findings. We also demonstrate how these computational algorithms can be applied in systems biology studies, such as those on disease, stem cells, and drug discovery.
doi:10.1007/s11568-007-9004-7
PMCID: PMC2276884  PMID: 18923925
Systems biology; Coexpression; Transcriptional module; Pathway dynamics; Transcriptional intervention; ModulePro; PathwayPro
25.  Activating and inhibiting connections in biological network dynamics 
Biology Direct  2008;3:49.
Background
Many studies of biochemical networks have analyzed network topology. Such work has suggested that specific types of network wiring may increase network robustness and therefore confer a selective advantage. However, knowledge of network topology does not allow one to predict network dynamical behavior – for example, whether deleting a protein from a signaling network would maintain the network's dynamical behavior, or induce oscillations or chaos.
Results
Here we report that the balance between activating and inhibiting connections is important in determining whether network dynamics reach steady state or oscillate. We use a simple dynamical model of a network of interacting genes or proteins. Using the model, we study random networks, networks selected for robust dynamics, and examples of biological network topologies. The fraction of activating connections influences whether the network dynamics reach steady state or oscillate.
Conclusion
The activating fraction may predispose a network to oscillate or reach steady state, and neutral evolution or selection of this parameter may affect the behavior of biological networks. This principle may unify the dynamics of a wide range of cellular networks.
Reviewers
Reviewed by Sergei Maslov, Eugene Koonin, and Yu (Brandon) Xia (nominated by Mark Gerstein). For the full reviews, please go to the Reviewers' comments section.
doi:10.1186/1745-6150-3-49
PMCID: PMC2651858  PMID: 19055800
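The abstract does not specify its dynamical model. As a hypothetical stand-in, the sketch below uses a synchronous sign-threshold network (an assumption, not the authors' model) to show how a trajectory can be classified as reaching steady state or oscillating, and how the fraction of activating connections can be computed. All names and the two-node example networks are illustrative.

```python
def activating_fraction(weights):
    """Share of nonzero connections that are activating (positive)."""
    nonzero = [w for row in weights for w in row if w != 0]
    return sum(1 for w in nonzero if w > 0) / len(nonzero)

def update(weights, state):
    """One synchronous step: each node takes the sign of its summed input.

    weights[i][j] is the effect of node j on node i; a zero input keeps
    the node's current state.
    """
    new = []
    for i, row in enumerate(weights):
        total = sum(w * x for w, x in zip(row, state))
        new.append(1 if total > 0 else -1 if total < 0 else state[i])
    return tuple(new)

def classify_dynamics(weights, state, max_steps=100):
    """Follow the trajectory until a state repeats, then report the outcome."""
    seen = {}
    state = tuple(state)
    for step in range(max_steps):
        if state in seen:
            period = step - seen[state]
            return "steady" if period == 1 else "oscillating"
        seen[state] = step
        state = update(weights, state)
    return "undetermined"

# Two hypothetical two-node networks, both started with all nodes "on".
mutual_activation = [[0, 1], [1, 0]]    # each node activates the other
negative_feedback = [[0, -1], [1, 0]]   # A activates B, B inhibits A

for name, w in [("mutual activation", mutual_activation),
                ("negative feedback", negative_feedback)]:
    print(name, activating_fraction(w), classify_dynamics(w, (1, 1)))
```

From this starting state the all-activating network sits at a fixed point, while the negative feedback loop cycles with period 4. Outcomes in such toy models depend on the initial state and update scheme, so this illustrates the classification step only and not the paper's statistical result.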
