Hepatitis B virus (HBV) infection is a leading source of liver diseases such as hepatitis, cirrhosis and hepatocellular carcinoma. In
this study, we use computational methods to improve our understanding of the complex interactions that occur between
molecules related to Hepatitis B virus (HBV). Due to the complexity of the disease and the numerous molecular players involved,
we devised a method to construct a systemic network of interactions of the processes ongoing in patients affected by HBV. The
network is based on high-throughput data, refined semi-automatically with carefully curated literature-based information. We find
that some nodes in the network are topologically important; in particular, HBx is also known to be an important target
protein for the treatment of HBV. Therefore, the HBx protein is the preferred choice for inhibition to stop the proteolytic
processing. Hence, the 3D structure of the HBx protein was downloaded from the PDB. Ligands for the active site were designed using
LIGBUILDER. The HBx protein's active site was explored to identify the critical interaction patterns for inhibitor binding by
molecular docking with AutoDock Vina. These predicted data should be validated using
suitable assays for further consideration.
Hepatitis B virus; HBx protein; PathVisio; Molecular-interaction map; Virtual screening; Docking; Inhibitor
Motivation: Network-centered studies in systems biology attempt to integrate the topological properties of biological networks with experimental data in order to make predictions and posit hypotheses. For any topology-based prediction, it is necessary to first assess the significance of the analyzed property in a biologically meaningful context. Therefore, devising network null models, carefully tailored to the topological and biochemical constraints imposed on the network, remains an important computational problem.
Results: We first review the shortcomings of the existing generic sampling scheme—switch randomization—and explain its unsuitability for application to metabolic networks. We then devise a novel polynomial-time algorithm for randomizing metabolic networks under the (bio)chemical constraint of mass balance. The tractability of our method follows from the concept of mass equivalence classes, defined on the representation of compounds in the vector space over chemical elements. We finally demonstrate the uniformity of the proposed method on seven genome-scale metabolic networks, and empirically validate the theoretical findings. The proposed method allows a biologically meaningful estimation of significance for metabolic network properties.
Supplementary Information: Supplementary data are available at Bioinformatics online.
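The mass-balance constraint mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each compound is represented as a vector of element counts and that a reaction is mass balanced when the stoichiometry-weighted sum of those vectors is zero. The compound table and the function names `element_balance` and `is_mass_balanced` are illustrative assumptions, not part of the published method, and the sketch only checks the constraint; it does not implement the randomization algorithm itself.

```python
from collections import Counter

# Illustrative element-count vectors (standard molecular formulae).
COMPOUNDS = {
    "glucose": Counter({"C": 6, "H": 12, "O": 6}),
    "O2": Counter({"O": 2}),
    "CO2": Counter({"C": 1, "O": 2}),
    "H2O": Counter({"H": 2, "O": 1}),
}


def element_balance(stoichiometry, compounds=COMPOUNDS):
    """Return the net element vector of a reaction.

    stoichiometry maps compound name -> coefficient
    (negative for substrates, positive for products).
    """
    net = {}
    for name, coeff in stoichiometry.items():
        for element, count in compounds[name].items():
            net[element] = net.get(element, 0) + coeff * count
    return net


def is_mass_balanced(stoichiometry, compounds=COMPOUNDS):
    """A reaction is mass balanced when every element nets to zero."""
    return all(v == 0 for v in element_balance(stoichiometry, compounds).values())


# Aerobic respiration: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
respiration = {"glucose": -1, "O2": -6, "CO2": 6, "H2O": 6}
# Same reaction with one CO2 removed, so carbon and oxygen no longer balance.
broken = {"glucose": -1, "O2": -6, "CO2": 5, "H2O": 6}

print(is_mass_balanced(respiration))
print(is_mass_balanced(broken))
```

A randomization step that rewires substrates or products could call such a check to reject any rewired reaction that violates mass balance.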
The molecular pathways that govern human disease consist of molecular circuits that coalesce into complex, overlapping networks. These network pathways are presumably regulated in a coordinated fashion, but such regulation has been difficult to decipher using only reductionistic principles. The emerging paradigm of “network medicine” proposes to utilize insights garnered from network topology (e.g., the static position of molecules in relation to their neighbors) as well as network dynamics (e.g., the unique flux of information through the network) to understand better the pathogenic behavior of complex molecular interconnections that traditional methods fail to recognize. As methodologies evolve, network medicine has the potential to capture the molecular complexity of human disease while offering computational methods to discern how such complexity controls disease manifestations, prognosis, and therapy. This review introduces the fundamental concepts of network medicine, and explores the feasibility and potential impact of network-based methods for predicting individual manifestations of human disease and designing rational therapies. Wherever possible, we emphasize the application of these principles to cardiovascular disease.
network medicine; systems biology; cardiovascular disease; systems pharmacology
Better effectiveness would be achieved when interventions are used in treating patients with a specific traditional Chinese medicine (TCM) pattern. In this paper, the effectiveness in treating rheumatoid arthritis (RA) patients in a randomized clinical trial was reanalyzed after the patients were classified into different TCM patterns, and the underlying mechanism of how the TCM pattern influences the clinical effectiveness of interventions (TCM and biomedicine therapy) was explored. The pharmacological networks of interventions were built up with protein-protein interaction analyses based on all the related targeted proteins obtained from PubChem. The underlying mechanism was explored by merging the pharmacological networks with the molecular networks of TCM cold and hot patterns in RA. The results show that the TCM therapy is better in treating the RA patients with TCM hot pattern, and the biomedical therapy is better in the RA patients with cold pattern. The pharmacological network of TCM intervention merges well with the molecular network of TCM hot pattern, and the pharmacological network of biomedical therapy merges well with the network of cold pattern. The finding indicates that molecular network analysis could give insight into the underlying mechanism of how the TCM pattern impacts efficacy.
Multicomponent therapeutics offer bright prospects for the control of complex diseases in a synergistic manner. However, finding ways to screen the synergistic combinations from numerous pharmacological agents is still an ongoing challenge.
In this work, we proposed for the first time a "network target"-based paradigm instead of the traditional "single target"-based paradigm for virtual screening and established an algorithm termed NIMS (Network target-based Identification of Multicomponent Synergy) to prioritize synergistic agent combinations in a high-throughput way. NIMS treats a disease-specific biological network as a therapeutic target and assumes that the relationship among agents can be transferred to network interactions among the molecular-level entities (targets or responsive gene products) of agents. Then, two parameters in NIMS, Topology Score and Agent Score, are created to evaluate the synergistic relationship within each given agent combination. Taking the empirical multicomponent system traditional Chinese medicine (TCM) as an illustrative case, we applied NIMS to prioritize synergistic agent pairs from 63 agents on a pathological process instanced by angiogenesis. The NIMS outputs not only recover five known synergistic agent pairs, but also obtain experimental verification for synergistic candidates combined with, for example, the herbal ingredient Sinomenine; in this respect NIMS outperforms the meet/min method. The robustness of NIMS was also shown with regard to the background networks, agent genes and topological parameters, respectively. Finally, we characterized the potential mechanisms of multicomponent synergy from a network target perspective.
NIMS is a first-step computational approach towards identification of synergistic drug combinations at the molecular level. The network target-based approaches may adjust current virtual screen mode and provide a systematic paradigm for facilitating the development of multicomponent therapeutics as well as the modernization of TCM.
Complex diseases are caused by perturbations of biological networks. Genetic analysis approaches focused on individual genetic determinants are unlikely to characterize the network architecture of complex diseases comprehensively. Network medicine, which applies systems biology and network science to complex molecular networks underlying human disease, focuses on identifying the interacting genes and proteins which lead to disease pathogenesis. The long biological path between a genetic risk variant and development of a complex disease involves a range of biochemical intermediates, including coding and non-coding RNA, proteins, and metabolites. Transcriptomics, proteomics, metabolomics, and other –omics technologies have the potential to provide insights into complex disease pathogenesis, especially if they are applied within a network biology framework. Most previous efforts to relate genetics to –omics data have focused on a single –omics platform; the next generation of complex disease genetics studies will require integration of multiple types of –omics data sets in a network context. Network medicine may also provide insight into complex disease heterogeneity, serve as the basis for new disease classifications that reflect underlying disease pathogenesis, and guide rational therapeutic and preventive strategies.
Language comprehension is a complex task that involves a wide network of brain regions. We used topological measures to qualify and quantify the functional connectivity of the networks used under various comprehension conditions. To that aim we developed a technique to represent functional networks based on EEG recordings, taking advantage of their excellent time resolution in order to capture the fast processes that occur during language comprehension. Networks were created by searching for a specific causal relation between areas, the negative feedback loop, which is ubiquitous in many systems. This method is a simple way to construct directed graphs using event-related activity, which can then be analyzed topologically. Brain activity was recorded while subjects read expressions of various types and indicated whether they found them meaningful. Slightly different functional networks were obtained for event-related activity evoked by each expression type. The differences reflect the special contribution of specific regions in each condition and the balance of hemispheric activity involved in comprehending different types of expressions and are consistent with the literature in the field. Our results indicate that representing event-related brain activity as a network using a simple temporal relation, such as the negative feedback loop, to indicate directional connectivity is a viable option for investigation which also derives new information about aspects not reflected in the classical methods for investigating brain activity.
Current advances in genomics, proteomics and other areas of molecular biology make the identification and reconstruction of novel pathways an emerging area of great interest. One such class of pathways is involved in the biogenesis of Iron-Sulfur Clusters (ISC).
Our goal is the development of a new approach based on the use and combination of mathematical, theoretical and computational methods to identify the topology of a target network. In this approach, mathematical models play a central role in the evaluation of the alternative network structures that arise from literature data-mining, phylogenetic profiling, structural methods, and human curation. As a test case, we reconstruct the topology of the reaction and regulatory network for the mitochondrial ISC biogenesis pathway in S. cerevisiae. Predictions regarding how proteins act in ISC biogenesis are validated by comparison with published experimental results. For example, the predicted role of Arh1 and Yah1 and some of the interactions we predict for Grx5 both match experimental evidence. A putative role for frataxin in directly regulating mitochondrial iron import is discarded by our analysis, which also agrees with published experimental results. Additionally, we propose a number of experiments for testing other predictions and further improving the identification of the network structure.
We propose and apply an iterative in silico procedure for predictive reconstruction of the network topology of metabolic pathways. The procedure combines structural bioinformatics tools and mathematical modeling techniques that allow the reconstruction of biochemical networks. Using the Iron Sulfur cluster biogenesis in S. cerevisiae as a test case we indicate how this procedure can be used to analyze and validate the network model against experimental results. Critical evaluation of the obtained results through this procedure allows devising new wet lab experiments to confirm its predictions or provide alternative explanations for further improving the models.
Studies in model organisms and humans have begun to reveal the complexity of the transcriptome. In addition to serving as passive templates from which genes are translated, RNA molecules are active, functional elements of the cell whose products can detect, interact with, and modify other transcripts. Gene expression profiling is the method most commonly used thus far to enrich our understanding of the molecular basis of rheumatoid arthritis in adults and juvenile idiopathic arthritis in children. The feasibility of this approach for patient classification (for example, active versus inactive disease, disease subsets) and improving prognosis (for example, response to therapy) has been demonstrated over the past 7 years. Mechanistic understanding of disease-related differences in gene expression must be interpreted in the context of interactions with transcriptional regulatory molecules and epigenetic alterations of the genome. Ongoing work regarding such functional complexities in the human genome will likely bring both insight and surprise to our understanding of rheumatoid arthritis.
Scientific literature is a source of the most reliable and comprehensive knowledge about molecular interaction networks. Formalization of this knowledge is necessary for computational analysis and is achieved by automatic fact extraction using various text-mining algorithms. Most of these techniques suffer from high false positive rates and redundancy of the extracted information. The extracted facts form a large network with no pathways defined.
We describe the methodology for automatic curation of Biological Association Networks (BANs) derived by a natural language processing technology called MedScan. The curated data is used for automatic pathway reconstruction. The algorithm for the reconstruction of signaling pathways is also described and validated by comparison with manually curated pathways and tissue-specific gene expression profiles.
Biological Association Networks extracted by MedScan technology contain sufficient information for constructing thousands of mammalian signaling pathways for multiple tissues. The automatically curated MedScan data is adequate for automatic generation of good quality signaling networks. The automatically generated Regulome pathways and manually curated pathways used for their validation are available free in the ResNetCore database from Ariadne Genomics, Inc. The pathways can be viewed and analyzed through the use of a free demo version of PathwayStudio software. The MedScan technology is also available for evaluation using the free demo version of PathwayStudio software.
There is considerable evidence that the translation rate of major basic science promises to clinical applications has been inefficient and disappointing. The deficiencies of translational science have often been proposed as an explanation for this failure. An alternative explanation is that until recently basic science advances have made oversimplified assumptions that have not matched the true etiological complexity of most common diseases, while clinical science has suffered from poor research practices, overt biases and conflicts of interest. The advent of molecular medicine and the recasting of clinical science along the principles of evidence-based medicine provide a better environment where translational research may now realize its goals. At the same time, priority issues need to be addressed in order to exploit the new opportunities. Translational research should focus on diseases with global impact, if true progress is to be made against human suffering. The health outcomes of interest for translational efforts need to be carefully defined and a balance must be struck between the subjective needs of healthcare consumers and objective health outcomes. Development of simpler, more practical and safer interventions may be as important a target for translational research as the development of cures for diseases where no effective interventions are available at all. Moreover, while the role of the industry is catalytic in translating research advances to licensed interventions, academic independence needs to be sustained and strengthened at a global level. Conflicts of interest may stifle translational research efforts internationally. The profit motive is unlikely to be sufficient alone to advance biomedical research towards genuine progress.
The detection of modules or community structure is widely used to reveal the underlying properties of complex networks in biology, as well as physical and social sciences. Since the adoption of modularity as a measure of network topological properties, several methodologies for the discovery of community structure based on modularity maximisation have been developed. However, satisfactory partitions of large graphs with modest computational resources are particularly challenging due to the NP-hard nature of the related optimisation problem. Furthermore, it has been suggested that optimising the modularity metric can reach a resolution limit whereby the algorithm fails to detect smaller communities than a specific size in large networks.
We present a novel solution approach to identify community structure in large complex networks and address resolution limitations in module detection. The proposed algorithm employs modularity to express network community structure and it is based on mixed integer optimisation models. The solution procedure is extended through an iterative procedure to diminish effects that tend to agglomerate smaller modules (resolution limitations).
A comprehensive comparative analysis of methodologies for module detection based on modularity maximisation shows that our approach outperforms previously reported methods. Furthermore, in contrast to previous reports, we propose a strategy to handle resolution limitations in modularity maximisation. Overall, we illustrate ways to improve existing methodologies for community structure identification so as to increase their efficiency and applicability.
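The modularity measure that these abstracts optimise can be made concrete with a small sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph without self-loops, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c and d_c the total degree of c. The function name `modularity` and the toy graph are illustrative. The code only evaluates Q for a given partition; it is not the mixed integer optimisation procedure described above.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    intra = defaultdict(int)  # edges with both endpoints in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1

    degree_sum = defaultdict(int)  # total degree per community
    for node, k in degree.items():
        degree_sum[community[node]] += k

    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {node: "all" for node in range(6)}

print(modularity(EDGES, two_modules))
print(modularity(EDGES, one_module))
```

Splitting the toy graph into its two triangles gives a higher Q than lumping all nodes together, which is the behaviour a modularity-maximising method exploits.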
Uncovering the operating principles underlying cellular processes by using 'omics' data is often a difficult task due to the high-dimensionality of the solution space that spans all interactions among the bio-molecules under consideration. A rational way to overcome this problem is to use the topology of bio-molecular interaction networks in order to constrain the solution space. Such approaches systematically integrate the existing biological knowledge with the 'omics' data.
Here we introduce a hypothesis-driven method that integrates bio-molecular network topology with transcriptome data, thereby allowing the identification of key biological features (Reporter Features) around which transcriptional changes are significantly concentrated. We have combined transcriptome data with different biological networks in order to identify Reporter Gene Ontologies, Reporter Transcription Factors, Reporter Proteins and Reporter Complexes, and use this to decipher the logic of regulatory circuits playing a key role in yeast glucose repression and human diabetes.
Reporter Features offer the opportunity to identify regulatory hot-spots in bio-molecular interaction networks that are significantly affected between or across conditions. Results of the Reporter Feature analysis not only provide a snapshot of the transcriptional regulatory program but also are biologically easy to interpret and provide a powerful way to generate new hypotheses. Our Reporter Feature analyses of yeast glucose repression and human diabetes data provide hints towards understanding the principles of transcriptional regulation controlling these two important and potentially closely related systems.
Synovial pathophysiology is a complex and synergistic interplay of different cell populations with tissue components, mediated by a variety of signaling mechanisms. All of these mechanisms drive the affected joint into inflammation and drive the subsequent destruction of cartilage and bone. Each cell type contributes significantly to the initiation and perpetuation of this deleterious concert, especially in rheumatoid arthritis. Rheumatoid arthritis synovial fibroblasts and macrophages, both cell types with pivotal roles in inflammation and destruction, as well as T cells and B cells, are crucial for the complex network in the inflamed synovium. An even more complex cellular crosstalk between these key players maintains a process of chronic inflammation. As outlined in the present review, in the past year substantial progress has been made to elucidate further details of the rich pathophysiology of rheumatoid arthritis, which may also facilitate the identification of novel targets for future therapeutic strategies.
Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network.
Results: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study.
Availability: A web-based system for browsing the disease-specific gene-interaction networks is available at: http://gin.ncibi.org
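The centrality-based ranking described in this abstract can be sketched on a toy network. The sketch below covers only two of the four metrics named there (degree and closeness), assumes a connected, undirected, unweighted graph, and uses placeholder node names `G1` to `G5` that do not correspond to real genes or to the prostate cancer network. The helper names are illustrative assumptions.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes) / (sum of shortest-path distances), via BFS."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def rank(scores):
    """Node names sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda v: (-scores[v], v))


# Hypothetical interaction network; node names are placeholders.
ADJ = {
    "G1": {"G2", "G3", "G4"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2"},
    "G4": {"G1", "G5"},
    "G5": {"G4"},
}

degree_scores = degree_centrality(ADJ)
closeness_scores = closeness_centrality(ADJ)
print(rank(degree_scores))
print(rank(closeness_scores))
```

In this toy case both metrics place `G1` first, but they order the remaining nodes differently, which mirrors the abstract's observation that different centrality measures surface different candidates.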
In Traditional Chinese Medicine (TCM), patients with Rheumatoid Arthritis (RA) can be classified into two main patterns: cold-pattern and heat-pattern. This paper identified the network-based gene expression biomarkers for both cold- and heat-patterns of RA. Gene expression profiles of CD4+ T cells from cold-pattern RA patients, heat-pattern RA patients, and healthy volunteers were obtained using microarray. The differentially expressed genes and related networks were explored using DAVID, GeneSpring software, and the protein-protein interactions (PPI) method. EIF4A2, CCNT1, and IL7R, which were related to the up-regulation of cell proliferation and the Jak-STAT cascade, were significant gene biomarkers of the TCM cold pattern of RA. PRKAA1, HSPA8, and LSM6, which were related to fatty acid metabolism and the I-κB kinase/NF-κB cascade, were significant biomarkers of the TCM heat-pattern of RA. The network-based gene expression biomarkers for the TCM cold- and heat-patterns may be helpful for the further stratification of RA patients when deciding on interventions or clinical trials.
The sport of football is played between two teams of eleven players each using a spherical ball. Each team strives to score by driving the ball into the opposing goal as the result of skillful interactions among players. Football can be regarded from the network perspective as a competitive relationship between two cooperative networks with a dynamic network topology and dynamic network node. Many complex large-scale networks have been shown to have topological properties in common, based on a small-world network and scale-free network models. However, the human dynamic movement pattern of this network has never been investigated in a real-world setting. Here, we show that the power law in degree distribution emerged in the passing behavior in the 2006 FIFA World Cup Final and an international “A” match in Japan, by describing players as vertices connected by links representing passes. The exponent values are similar to the typical values that occur in many real-world networks, which are in the range of , and are larger than that of a gene transcription network, . Furthermore, we reveal the stochastically switched dynamics of the hub player throughout the game as a unique feature in football games. It suggests that this feature could result not only in securing vulnerability against intentional attack, but also in a power law for self-organization. Our results suggest common and unique network dynamics of two competitive networks, compared with the large-scale networks that have previously been investigated in numerous works. Our findings may lead to improved resilience and survivability not only in biological networks, but also in communication networks.
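The passing-network construction described above, with players as vertices and passes as links, can be sketched as follows. The pass list is invented purely for illustration and is not match data. The abstract does not say how the exponent was fitted, so the estimator shown is an assumption: the continuous-approximation maximum-likelihood formula alpha = 1 + n / sum(ln(x_i / x_min)). The function names are hypothetical.

```python
import math
from collections import Counter


def pass_degrees(passes):
    """Degree of each player: number of passes made plus passes received."""
    degree = Counter()
    for passer, receiver in passes:
        degree[passer] += 1
        degree[receiver] += 1
    return degree


def powerlaw_exponent_mle(values, x_min=1):
    """Continuous-approximation MLE of a power-law exponent for values >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1 + len(tail) / log_sum


# Invented pass sequence (passer, receiver); player labels are placeholders.
PASSES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "A")]

degrees = pass_degrees(PASSES)
print(dict(degrees))
print(powerlaw_exponent_mle([2, 2, 4], x_min=1))
```

With only eleven players per team, any fitted exponent would need to be checked carefully against alternative distributions; the sketch shows the bookkeeping, not a validated fit.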
INTRODUCTION: Artificial intelligence is a branch of computer science capable of analysing complex medical data. Its potential to exploit meaningful relationships within a data set can be used in diagnosis, treatment and outcome prediction in many clinical scenarios. METHODS: Medline and internet searches were carried out using the keywords 'artificial intelligence' and 'neural networks (computer)'. Further references were obtained by cross-referencing from key articles. An overview of different artificial intelligence techniques is presented in this paper along with a review of important clinical applications. RESULTS: The proficiency of artificial intelligence techniques has been explored in almost every field of medicine. The artificial neural network was the most commonly used analytical tool, whilst other artificial intelligence techniques such as fuzzy expert systems, evolutionary computation and hybrid intelligent systems have all been used in different clinical settings. DISCUSSION: Artificial intelligence techniques have the potential to be applied in almost every field of medicine. There is a need for further, appropriately designed clinical trials before these emergent techniques find application in the real clinical setting.
Vascular development is a complex process regulated by dynamic biological networks that vary in topology and state across different tissues and developmental stages. Signals regulating de novo blood vessel formation (vasculogenesis) and remodeling (angiogenesis) come from a variety of biological pathways linked to endothelial cell (EC) behavior, extracellular matrix (ECM) remodeling and the local generation of chemokines and growth factors. Simulating these interactions at a systems level requires sufficient biological detail about the relevant molecular pathways and associated cellular behaviors, and tractable computational models that offset mathematical and biological complexity. Here, we describe a novel multicellular agent-based model of vasculogenesis using the CompuCell3D (http://www.compucell3d.org/) modeling environment supplemented with semi-automatic knowledgebase creation. The model incorporates vascular endothelial growth factor signals, pro- and anti-angiogenic inflammatory chemokine signals, and the plasminogen activating system of enzymes and proteases linked to ECM interactions, to simulate nascent EC organization, growth and remodeling. The model was shown to recapitulate stereotypical capillary plexus formation and structural emergence of non-coded cellular behaviors, such as a heterologous bridging phenomenon linking endothelial tip cells together during formation of polygonal endothelial cords. Molecular targets in the computational model were mapped to signatures of vascular disruption derived from in vitro chemical profiling using the EPA's ToxCast high-throughput screening (HTS) dataset. Simulating the HTS data with the cell-agent based model of vascular development predicted adverse effects of a reference anti-angiogenic thalidomide analog, 5HPP-33, on in vitro angiogenesis with respect to both concentration-response and morphological consequences. 
These findings support the utility of cell agent-based models for simulating a morphogenetic series of events and for the first time demonstrate the applicability of these models for predictive toxicology.
We built a novel computational model of vascular development that includes multiple cell types responding to growth factor signaling, inflammatory chemokine pathways and extracellular matrix interactions. This model represents the normal biology of capillary plexus formation, both in terms of morphology and emergent behaviors. Based on in vitro high-throughput screening data from EPA's ToxCast program, we can simulate chemical exposures that disrupt blood vessel formation. Simulated results of an anti-angiogenic thalidomide compound were highly comparable to results in an endothelial tube formation assay. This model demonstrates the utility of computational approaches for simulating developmental biology and predicting chemical toxicity.
Universal principles underlying network science, and their ever-increasing applications in biomedicine, underscore the unprecedented capacity of systems biology based strategies to synthesize and resolve massive high throughput generated datasets. Enabling previously unattainable comprehension of biological complexity, systems approaches have accelerated progress in elucidating disease prediction, progression, and outcome. Applied to the spectrum of states spanning health and disease, network proteomics establishes a collation, integration, and prioritization algorithm to guide mapping and decoding of proteome landscapes from large-scale raw data. Providing unparalleled deconvolution of protein lists into global interactomes, integrative systems proteomics enables objective, multi-modal interpretation at molecular, pathway, and network scales, merging individual molecular components, their plurality of interactions, and functional contributions for systems comprehension. As such, network systems approaches are increasingly exploited for objective interpretation of cardiovascular proteomics studies. Here, we highlight network systems proteomic analysis pipelines for integration and biological interpretation through protein cartography, ontological categorization, pathway and functional enrichment and complex network analysis.
ATP-sensitive K+ channel; bioinformatics; complex network analysis; KATP channel; Kir6.2; genetics; heart disease; metabolism; network biology; proteome; regenerative medicine; SUR2A; stem cells; systems biology
Molecular medicine is transforming everyday clinical practice from an empirical art to a rational ortho-molecular science. The prevailing concept in this emerging framework of molecular medicine is a personalized approach to disease prevention, diagnosis, prognosis, and treatment. In this mini-review, we discuss the educational and social-ethical issues raised by the advances of biomedical research as related to medical practice; outline the implications of molecular medicine for patients, physicians, and researchers; and underline the responsibilities of academia and the pharmaceutical industry to translate the scientific knowledge to a meaningful improvement of the quality of life across all members of society.
This study aimed to explore the distinct molecular signatures that discriminate rheumatoid arthritis (RA) patients with the traditional Chinese medicine (TCM) cold pattern from those with the heat pattern. Twenty patients with typical TCM cold pattern and heat pattern were included. Microarray technology was used to reveal gene expression profiles in CD4+ T cells. The signal intensity of each expressed gene was globally normalized using the R statistics program. A cold-pattern to heat-pattern expression ratio greater or less than 1:2 in patients with RA was taken as the criterion for differential gene expression. Protein–protein interaction information for these genes was retrieved from databases, and highly connected regions were detected with the IPCA algorithm. Significant pathways were extracted from these subnetworks with the Biological Network Gene Ontology tool. Twenty-nine genes were found to be differentially regulated between the cold pattern and the heat pattern. Among them, seven genes were expressed significantly more in the cold pattern. In the protein–protein interaction network built for these significant genes, four highly connected regions were detected by the IPCA algorithm and used to infer significant complexes or pathways. In particular, the cold pattern was related to the Toll-like receptor signaling pathway. The heat pattern was related to the following pathways: calcium signaling pathway; cell adhesion molecules; PPAR signaling pathway; fatty acid metabolism. These results suggest that better knowledge of the main biological processes involved in a given TCM pattern might help in choosing the most appropriate treatment.
Protein and protein interaction; Genomics; Pathway; Rheumatoid arthritis; Traditional Chinese medicine
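The ratio-based selection of differentially expressed genes described in the rheumatoid arthritis abstract can be illustrated with a minimal sketch. It rests on one assumption that the abstract does not spell out: the 1:2 criterion is read here as a symmetric two-fold cutoff (ratio at least 2 or at most 1/2). The function name `differential_genes`, the `fold` parameter, and the gene identifiers are hypothetical and serve only as illustration; they are not the authors' pipeline.

```python
def differential_genes(cold, heat, fold=2.0):
    """Select genes whose cold/heat intensity ratio is >= fold or <= 1/fold.

    `cold` and `heat` map gene names to normalized signal intensities.
    The symmetric two-fold reading of the cutoff is an assumption.
    """
    selected = {}
    for gene in cold.keys() & heat.keys():
        if heat[gene] <= 0:
            continue  # skip genes without a usable denominator
        ratio = cold[gene] / heat[gene]
        if ratio >= fold or ratio <= 1.0 / fold:
            selected[gene] = ratio
    return selected


# Hypothetical normalized intensities, for illustration only.
cold = {"GENE_A": 400.0, "GENE_B": 100.0, "GENE_C": 50.0}
heat = {"GENE_A": 100.0, "GENE_B": 110.0, "GENE_C": 200.0}
result = differential_genes(cold, heat)
```

With these made-up values, `GENE_A` (ratio 4.0) and `GENE_C` (ratio 0.25) pass the cutoff, while `GENE_B` does not.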
Large-scale protein-protein interaction networks provide new opportunities for understanding cellular organization and functioning. We introduce network schemas to elucidate shared mechanisms within interactomes. Network schemas specify descriptions of proteins and the topology of interactions among them. We develop algorithms for systematically uncovering recurring, over-represented schemas in physical interaction networks. We apply our methods to the S. cerevisiae interactome, focusing on schemas consisting of proteins described via sequence motifs and molecular function annotations and interacting with one another in one of four basic network topologies. We identify hundreds of recurring and over-represented network schemas of various complexity, and demonstrate via graph-theoretic representations how more complex schemas are organized in terms of their lower-order constituents. The uncovered schemas span a wide range of cellular activities, with many signaling and transport related higher-order schemas. We establish the functional importance of the schemas by showing that they correspond to functionally cohesive sets of proteins, are enriched in the frequency with which they have instances in the H. sapiens interactome, and are useful for predicting protein function. Our findings suggest that network schemas are a powerful paradigm for organizing, interrogating, and annotating cellular networks.
Large-scale networks of protein-protein interactions provide a view into the workings of the cell. However, these interaction maps do not come with a key for interpreting them, so it is necessary to develop methods that shed light on their functioning and organization. We propose the language of network schemas for describing recurring patterns of specific types of proteins and their interactions. That is, network schemas describe proteins and specify the topology of interactions among them. A single network schema can describe, for example, a common template that underlies several distinct cellular pathways, such as signaling pathways. We develop a computational methodology for identifying network schemas that are recurrent and over-represented in the network, even given the distributions of their constituent components. We apply this methodology to the physical interaction network in S. cerevisiae and begin to build a hierarchy of schemas starting with the four simplest topologies. We validate the biological relevance of the schemas that we find, discuss the insights our findings lend into the organization of interactomes, touch upon cross-genomic aspects of schema analysis, and show how to use schemas to annotate uncharacterized protein families.
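To make the idea of a network schema concrete, the following sketch counts instances of the simplest topology, a pair schema, and estimates how often an equal or larger count arises when annotations are shuffled among proteins. This is only an illustration under stated assumptions: the label-shuffling null model is a simple stand-in and not the authors' over-representation procedure, and the names `count_pair_schema` and `empirical_p`, the protein identifiers, and the labels are hypothetical.

```python
import random


def count_pair_schema(edges, labels, a, b):
    """Count interactions joining a protein labeled `a` to one labeled `b`."""
    count = 0
    for u, v in edges:
        lu, lv = labels.get(u, set()), labels.get(v, set())
        if (a in lu and b in lv) or (b in lu and a in lv):
            count += 1
    return count


def empirical_p(edges, labels, a, b, trials=1000, seed=0):
    """Fraction of label shufflings with at least the observed count.

    Shuffling whole label sets among proteins keeps the network topology
    and the label frequencies fixed (an assumed, simplified null model).
    """
    rng = random.Random(seed)
    observed = count_pair_schema(edges, labels, a, b)
    proteins = list(labels)
    label_sets = [labels[p] for p in proteins]
    hits = 0
    for _ in range(trials):
        rng.shuffle(label_sets)
        shuffled = dict(zip(proteins, label_sets))
        if count_pair_schema(edges, shuffled, a, b) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


# Toy interactome with hypothetical annotations.
edges = [("P1", "P2"), ("P1", "P3"), ("P4", "P5"), ("P2", "P3")]
labels = {
    "P1": {"kinase"}, "P2": {"SH3"}, "P3": {"SH3"},
    "P4": {"kinase"}, "P5": {"transport"},
}
observed = count_pair_schema(edges, labels, "kinase", "SH3")
p_value = empirical_p(edges, labels, "kinase", "SH3")
```

Higher-order schemas (for example triplets or stars) would require enumerating larger subgraphs, which this sketch does not attempt.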
Asthma is a complex polygenic disease involving the interaction of many genes. In this study, we investigated the allergic response in experimental asthma. First, we constructed a biological interaction network using the BOND (Biomolecular Object Network Databank) database of literature-curated molecular interactions. Second, we mapped differentially expressed genes from microarray data onto the network. Third, we analyzed the topological characteristics of the modulated genes. Fourth, we analyzed the correlation between topology and biological function using the Gene Ontology classifications. Our results demonstrate that nodes with high connectivity (hubs and superhubs) tend to have low levels of change in gene expression. The significance of our observations was confirmed by permutation testing. Furthermore, our analysis indicates that hubs and superhubs have significantly different biological functions compared with peripheral nodes, based on Gene Ontology classification. Our observations have important ramifications for interpreting gene expression data and understanding biological responses. Thus, our analysis suggests that combining differential gene expression with the topological characteristics of the interaction network provides enhanced understanding of the biology in our model of experimental asthma.
allergic asthma; biological network; Gene Ontology; microarray
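The permutation test mentioned in the asthma abstract can be sketched as follows. The sketch asks whether hubs show a smaller mean absolute expression change than the remaining nodes, and compares the observed difference with differences obtained after shuffling expression values across genes. The hub definition (a degree cutoff `hub_degree`), the test statistic, and all gene names and values are assumptions chosen for illustration; the abstract does not specify them.

```python
import random
from collections import Counter


def degrees(edges):
    """Node degree in an undirected interaction network."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def hub_permutation_test(edges, change, hub_degree, trials=2000, seed=0):
    """Return (observed statistic, one-sided empirical p-value).

    Statistic: mean |change| of non-hubs minus mean |change| of hubs.
    Assumes at least one hub and one non-hub among the mapped genes.
    """
    deg = degrees(edges)
    genes = [g for g in change if g in deg]
    is_hub = [deg[g] >= hub_degree for g in genes]
    values = [abs(change[g]) for g in genes]

    def stat(vals):
        hubs = [v for v, h in zip(vals, is_hub) if h]
        rest = [v for v, h in zip(vals, is_hub) if not h]
        return sum(rest) / len(rest) - sum(hubs) / len(hubs)

    observed = stat(values)
    rng = random.Random(seed)
    shuffled = values[:]
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the link between topology and expression
        if stat(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (trials + 1)


# Toy network: H is the only hub (degree 4); values are hypothetical log ratios.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
change = {"H": 0.1, "A": 1.2, "B": -0.9, "C": 1.5, "D": -1.1}
observed, p_value = hub_permutation_test(edges, change, hub_degree=4)
```

On a network this small the p-value cannot be small; the example only shows the mechanics of the test.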
In addition to component-based comparative approaches, network alignments provide the means to study conserved network topology such as common pathways and more complex network motifs. Yet, unlike in classical sequence alignment, the comparison of networks becomes computationally more challenging, as most meaningful assumptions instantly lead to NP-hard problems. Most previous algorithmic work on network alignments is heuristic in nature.
We introduce the graph-based maximum structural matching formulation for pairwise global network alignment. We relate the formulation to previous work and prove NP-hardness of the problem.
Based on the new formulation we build upon recent results in computational structural biology and present a novel Lagrangian relaxation approach that, in combination with a branch-and-bound method, computes provably optimal network alignments. The Lagrangian algorithm alone is a powerful heuristic method, which produces solutions that are often near-optimal and – unlike those computed by pure heuristics – come with a quality guarantee.
Computational experiments on the alignment of protein-protein interaction networks and on the classification of metabolic subnetworks demonstrate that the new method is reasonably fast and has advantages over pure heuristics. Our software tool is freely available as part of the LISA library.
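As a rough illustration of what a pairwise global network alignment optimizes, the sketch below scores an alignment by the number of interactions conserved under a one-to-one node mapping and finds the best mapping by exhaustive search. Two caveats apply: the conserved-edge count is a simplified stand-in for the maximum structural matching objective, whose exact definition is not given in the abstract, and the brute-force search replaces the Lagrangian relaxation and branch-and-bound method, so it is only feasible for tiny networks. The function names and toy graphs are hypothetical.

```python
from itertools import permutations


def conserved_edges(edges1, edges2, mapping):
    """Count edges of network 1 whose mapped endpoints interact in network 2."""
    e2 = {frozenset(e) for e in edges2}
    return sum(
        1
        for u, v in edges1
        if u in mapping and v in mapping
        and frozenset((mapping[u], mapping[v])) in e2
    )


def best_alignment(nodes1, edges1, nodes2, edges2):
    """Exhaustively search injective mappings nodes1 -> nodes2.

    Assumes len(nodes1) <= len(nodes2); the running time is factorial,
    which reflects why exact alignment needs more refined methods.
    """
    best_score, best_map = -1, {}
    for image in permutations(nodes2, len(nodes1)):
        mapping = dict(zip(nodes1, image))
        score = conserved_edges(edges1, edges2, mapping)
        if score > best_score:
            best_score, best_map = score, mapping
    return best_score, best_map


# Toy example: a 3-node path aligned to a 4-node path.
nodes1, edges1 = ["a", "b", "c"], [("a", "b"), ("b", "c")]
nodes2, edges2 = ["w", "x", "y", "z"], [("w", "x"), ("x", "y"), ("y", "z")]
score, mapping = best_alignment(nodes1, edges1, nodes2, edges2)
```

Here both edges of the smaller path can be conserved, so the best score is 2.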