Results 1-9 (9)

1.  Transcriptomic Analysis of Insulin-Sensitive Tissues from Anti-Diabetic Drug Treated ZDF Rats, a T2DM Animal Model 
PLoS ONE  2013;8(7):e69624.
Gene expression changes have been associated with type 2 diabetes mellitus (T2DM); however, the alterations are not fully understood. We investigated the effects of anti-diabetic drugs on gene expression in Zucker diabetic fatty (ZDF) rats using oligonucleotide microarray technology to identify gene expression changes occurring in T2DM. Global gene expression in the pancreas, adipose tissue, skeletal muscle, and liver was profiled in Zucker lean control (ZLC) rats and anti-diabetic drug-treated ZDF rats and compared with that in ZDF rats. We showed that anti-diabetic drugs regulate the expression of a large number of genes. We provided a more integrated view of the diabetic changes by examining the gene expression networks. The resulting sub-networks allowed us to identify several biological processes that were significantly enriched by the anti-diabetic drug treatment, including oxidative phosphorylation (OXPHOS), systemic lupus erythematosus, and the chemokine signaling pathway. Among them, we found that white adipose tissue from ZDF rats showed decreased expression of a set of OXPHOS genes, which was normalized by rosiglitazone treatment and accompanied by rescued blood glucose levels. In conclusion, our comprehensive gene expression network study after multi-drug treatment of ZDF rats suggests that alterations in OXPHOS gene expression in white adipose tissue may play a role in the pathogenesis and drug-mediated recovery of T2DM.
PMCID: PMC3724940  PMID: 23922760
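The abstract of entry 1 reports biological processes that were "significantly enriched" but does not say which test was used. As a hedged illustration only, the sketch below shows one common way to score enrichment, a one-sided hypergeometric test. The function name and its parameters are assumptions made for this example, not the authors' method.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           selected: int, overlap: int) -> float:
    """Probability of seeing at least `overlap` pathway genes by chance.

    population: number of genes measured in total
    annotated:  number of those genes that belong to the pathway
    selected:   number of genes in the list of interest (e.g. a sub-network)
    overlap:    number of selected genes that belong to the pathway
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the upper tail P(X = k) for k = overlap .. upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small p-value means the overlap is larger than random sampling would usually produce. In practice such p-values are corrected for the number of pathways tested.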
2.  hiPathDB: a human-integrated pathway database with facile visualization 
Nucleic Acids Research  2011;40(Database issue):D797-D802.
One of the biggest challenges in the study of biological regulatory networks is the systematic organization and integration of complex interactions taking place within various biological pathways. Currently, the information of the biological pathways is dispersed in multiple databases in various formats. hiPathDB is an integrated pathway database that combines the curated human pathway data of NCI-Nature PID, Reactome, BioCarta and KEGG. In total, it includes 1661 pathways consisting of 8976 distinct physical entities. hiPathDB provides two different types of integration. The pathway-level integration, conceptually a simple collection of individual pathways, was achieved by devising an elaborate model that takes distinct features of four databases into account and subsequently reformatting all pathways in accordance with our model. The entity-level integration creates a single unified pathway that encompasses all pathways by merging common components. Even though the detailed molecular-level information such as complex formation or post-translational modifications tends to be lost, such integration makes it possible to investigate signaling network over the entire pathways and allows identification of pathway cross-talks. Another strong merit of hiPathDB is the built-in pathway visualization module that supports explorative studies of complex networks in an interactive fashion. The layout algorithm is optimized for virtually automatic visualization of the pathways. hiPathDB is available at
PMCID: PMC3245021  PMID: 22123737
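The entity-level integration described for hiPathDB in entry 2, merging pathways through their common components, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: pathways are reduced to lists of pairwise interactions, the function `merge_pathways` is hypothetical, and the two toy pathways are made up for the example. It is not the hiPathDB data model.

```python
from collections import defaultdict


def merge_pathways(pathways):
    """Merge pathways into one network keyed by shared entity names.

    pathways: dict mapping a pathway name to a list of
              (entity_a, entity_b) interaction pairs.
    Returns (network, crosstalk), where network maps each entity to its
    neighbours and crosstalk is the set of entities found in more than
    one pathway.
    """
    network = defaultdict(set)     # entity -> neighbouring entities
    membership = defaultdict(set)  # entity -> pathways containing it
    for name, edges in pathways.items():
        for a, b in edges:
            network[a].add(b)
            network[b].add(a)
            membership[a].add(name)
            membership[b].add(name)
    crosstalk = {e for e, names in membership.items() if len(names) > 1}
    return dict(network), crosstalk


# Hypothetical toy input: two tiny pathways that share one entity.
toy = {
    "pathway_A": [("EGFR", "GRB2"), ("GRB2", "SOS1")],
    "pathway_B": [("INSR", "IRS1"), ("IRS1", "GRB2")],
}
network, crosstalk = merge_pathways(toy)
print(sorted(crosstalk))
```

As the abstract notes, this kind of merge discards molecular detail such as complex formation; what it gains is a single graph in which shared entities expose candidate pathway cross-talk.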
3.  GARNET – gene set analysis with exploration of annotation relations 
BMC Bioinformatics  2011;12(Suppl 1):S25.
Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information.
GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules - gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations.
GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool ( or
PMCID: PMC3044280  PMID: 21342555
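The GARNET abstract in entry 3 says that pair-wise relationships between annotation gene sets were calculated using kappa statistics, without giving the exact formulation. The sketch below assumes Cohen's kappa computed over a shared background of genes, where each gene is either in or out of each set. The function name and the background argument are assumptions for illustration.

```python
def gene_set_kappa(set_a, set_b, background):
    """Cohen's kappa for the agreement of two gene sets over a background.

    Each background gene is classified as member / non-member by each set.
    Returns 1.0 for identical sets, about 0 for chance-level overlap,
    and negative values for sets that overlap less than chance.
    """
    n = len(background)
    if n == 0:
        raise ValueError("background must not be empty")
    a = set_a & background
    b = set_b & background
    both = len(a & b)
    neither = n - len(a | b)
    observed = (both + neither) / n
    p_a, p_b = len(a) / n, len(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both sets are empty or both cover the whole background;
        # kappa is undefined, so report no information.
        return 0.0
    return (observed - expected) / (1 - expected)
```

Terms whose gene sets have a high kappa can then be linked in an annotation network, which is one plausible way to group annotations of similar information.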
4.  Age-Dependent Evolution of the Yeast Protein Interaction Network Suggests a Limited Role of Gene Duplication and Divergence 
PLoS Computational Biology  2008;4(11):e1000232.
Proteins interact in complex protein–protein interaction (PPI) networks whose topological properties (such as scale-free topology, hierarchical modularity, and disassortativity) have suggested models of network evolution. Currently preferred models invoke preferential attachment or gene duplication and divergence to produce networks whose topology matches that observed for real PPIs, thus supporting these as likely models for network evolution. Here, we show that the interaction density and homodimeric frequency are highly protein age–dependent in real PPI networks in a manner which does not agree with these canonical models. In light of these results, we propose an alternative stochastic model, which adds each protein sequentially to a growing network in a manner analogous to protein crystal growth (CG) in solution. The key ideas are (1) interaction probability increases with availability of unoccupied interaction surface, thus following an anti-preferential attachment rule, (2) as a network grows, highly connected sub-networks emerge into protein modules or complexes, and (3) once a new protein is committed to a module, further connections tend to be localized within that module. The CG model produces PPI networks consistent in both topology and age distributions with real PPI networks and is well supported by the spatial arrangement of protein complexes of known 3-D structure, suggesting a plausible physical mechanism for network evolution.
Author Summary
Proteins function together forming stable protein complexes or transient interactions in various cellular processes, such as gene regulation and signaling. Here, we address the basic question of how these networks of interacting proteins evolve. This is an important problem, as the structures of such networks underlie important features of biological systems, such as functional modularity, error-tolerance, and stability. It is not yet known how these network architectures originate or what driving forces underlie the observed network structure. Several models have been proposed over the past decade—in particular, a “rich get richer” model (preferential attachment) and a model based upon gene duplication and divergence—often based only on network topologies. Here, we show that real yeast protein interaction networks show a unique age distribution among interacting proteins, which rules out these canonical models. In light of these results, we developed a simple, alternative model based on well-established physical principles, analogous to the process of growing protein crystals in solution. The model better explains many features of real PPI networks, including the network topologies, their characteristic age distributions, and the spatial distribution of subunits of differing ages within protein complexes, suggesting a plausible physical mechanism of network evolution.
PMCID: PMC2583957  PMID: 19043579
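Key idea (1) of the crystal growth model in entry 4, anti-preferential attachment, can be illustrated with a toy simulation. This sketch is not the published CG model: it ignores modules (ideas 2 and 3), and the fixed `capacity` per protein, the single partner per new protein, and the function name are all simplifying assumptions made here.

```python
import random


def grow_network(n_proteins, capacity=4, seed=0):
    """Toy anti-preferential attachment.

    Every protein has `capacity` interaction sites. Each new protein picks
    one existing partner with probability proportional to the partner's
    number of unoccupied sites, so crowded proteins become less likely
    to gain further interactions.
    """
    rng = random.Random(seed)
    degree = [0]  # protein 0 starts the network with no interactions
    edges = []
    for new in range(1, n_proteins):
        free = [capacity - d for d in degree]
        if sum(free) == 0:
            # No free surface anywhere: the new protein stays unbound.
            degree.append(0)
            continue
        partner = rng.choices(range(new), weights=free, k=1)[0]
        edges.append((partner, new))
        degree[partner] += 1
        degree.append(1)
    return edges, degree
```

Under preferential attachment the weights would instead grow with `degree`, producing hubs; here the weights shrink as sites fill up, and no protein can exceed its capacity.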
5.  Inferring mouse gene functions from genomic-scale data using a combined functional network/classification strategy 
Genome Biology  2008;9(Suppl 1):S5.
The complete set of mouse genes, as with the set of human genes, is still largely uncharacterized, with many pieces of experimental evidence accumulating regarding the activities and expression of the genes, but the majority of genes as yet still of unknown function. Within the context of the MouseFunc competition, we developed and applied two distinct large-scale data mining approaches to infer the functions (Gene Ontology annotations) of mouse genes from experimental observations from available functional genomics, proteomics, comparative genomics, and phenotypic data. The two strategies — the first using classifiers to map features to annotations, the second propagating annotations from characterized genes to uncharacterized genes along edges in a network constructed from the features — offer alternative and possibly complementary approaches to providing functional annotations. Here, we re-implement and evaluate these approaches and their combination for their ability to predict the proper functional annotations of genes in the MouseFunc data set. We show that, when controlling for the same set of input features, the network approach generally outperformed a naïve Bayesian classifier approach, while their combination offers some improvement over either independently. We make our observations of predictive performance on the MouseFunc competition hold-out set, as well as on a ten-fold cross-validation of the MouseFunc data. Across all 1,339 annotated genes in the MouseFunc test set, the median predictive power was quite strong (median area under a receiver operating characteristic plot of 0.865 and average precision of 0.195), indicating that a mining-based strategy with existing data is a promising path towards discovering mammalian gene functions. 
As one product of this work, a high-confidence subset of the functional mouse gene network was produced — spanning >70% of mouse genes with >1.6 million associations — that is predictive of mouse (and therefore often human) gene function and functional associations. The network should be generally useful for mammalian gene functional analyses, such as for predicting interactions, inferring functional connections between genes and pathways, and prioritizing candidate genes. The network and all predictions are available on the worldwide web.
PMCID: PMC2447539  PMID: 18613949
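Entry 5 summarizes predictive power as the area under a receiver operating characteristic plot and as average precision. For readers unfamiliar with these two measures, the following sketch computes both from a list of prediction scores and true labels. The function names are assumptions, and the evaluation code used in the MouseFunc competition may differ in details such as tie handling.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits
```

The two measures answer different questions: the AUC is insensitive to how rare positives are, whereas average precision drops quickly when a few true annotations are buried among many false ones, which is why a high AUC and a modest average precision can coexist.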
6.  A critical assessment of Mus musculus gene function prediction using integrated genomic evidence 
Genome Biology  2008;9(Suppl 1):S2.
Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated.
In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%.
We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.
PMCID: PMC2447536  PMID: 18613946
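Entry 6 reports precision at a recall rate of 20%. One plausible reading of that measure, assumed here rather than taken from the paper, is: walk down the ranked predictions until 20% of the true annotations have been recovered, then report the fraction of predictions so far that are correct. The function name and signature are hypothetical.

```python
def precision_at_recall(scores, labels, target_recall=0.2):
    """Precision at the first rank where recall reaches `target_recall`."""
    n_pos = sum(1 for y in labels if y)
    if n_pos == 0:
        raise ValueError("need at least one positive")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            if hits / n_pos >= target_recall:
                return hits / rank
    raise ValueError("target_recall must not exceed 1")
```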
7.  Using structural motif descriptors for sequence-based binding site prediction 
BMC Bioinformatics  2007;8(Suppl 4):S5.
Many protein sequences are still poorly annotated. Functional characterization of a protein is often improved by the identification of its interaction partners. Here, we aim to predict protein-protein interactions (PPI) and protein-ligand interactions (PLI) on sequence level using 3D information. To this end, we use machine learning to compile sequential segments that constitute structural features of an interaction site into one profile Hidden Markov Model descriptor. The resulting collection of descriptors can be used to screen sequence databases in order to predict functional sites.
We generate descriptors for 740 classified types of protein-protein binding sites and for more than 3,000 protein-ligand binding sites. Cross validation reveals that two thirds of the PPI descriptors are sufficiently conserved and significant enough to be used for binding site recognition. We further validate 230 PPIs that were extracted from the literature, where we additionally identify the interface residues. Finally we test ligand-binding descriptors for the case of ATP. From sequences with Swiss-Prot annotation "ATP-binding", we achieve a recall of 25% with a precision of 89%, whereas Prosite's P-loop motif recognizes an equal amount of hits at the expense of a much higher number of false positives (precision: 57%). Our method yields 771 hits with a precision of 96% that were not previously picked up by any Prosite-pattern.
The automatically generated descriptors are a useful complement to known Prosite/InterPro motifs. They serve to predict protein-protein as well as protein-ligand interactions along with their binding site residues for proteins where merely sequence information is available.
PMCID: PMC1892084  PMID: 17570148
8.  The Many Faces of Protein–Protein Interactions: A Compendium of Interface Geometry 
PLoS Computational Biology  2006;2(9):e124.
A systematic classification of protein–protein interfaces is a valuable resource for understanding the principles of molecular recognition and for modelling protein complexes. Here, we present a classification of domain interfaces according to their geometry. Our new algorithm uses a hybrid approach of both sequential and structural features. The accuracy is evaluated on a hand-curated dataset of 416 interfaces. Our hybrid procedure achieves 83% precision and 95% recall, which improves the earlier sequence-based method by 5% on both terms. We classify virtually all domain interfaces of known structure, which results in nearly 6,000 distinct types of interfaces. In 40% of the cases, the interacting domain families associate in multiple orientations, suggesting that all the possible binding orientations need to be explored for modelling multidomain proteins and protein complexes. In general, hub proteins are shown to use distinct surface regions (multiple faces) for interactions with different partners. Our classification provides a convenient framework to query genuine gene fusion, which conserves binding orientation in both fused and separate forms. The result suggests that the binding orientations are not conserved in at least one-third of the gene fusion cases detected by a conventional sequence similarity search. We show that any evolutionary analysis on interfaces can be skewed by multiple binding orientations and multiple interaction partners. The taxonomic distribution of interface types suggests that ancient interfaces common to the three major kingdoms of life are enriched by symmetric homodimers. The classification results are online at
The behaviour of biological systems is governed by protein interactions. Considerable effort has already been dedicated to characterise individual proteins and their evolution. As a next step, researchers need to understand the characteristics, dynamics, and evolution of complex networks of proteins. While many experimental techniques determine high-throughput protein–protein interactions, only few provide structural insights into the actual interfaces. The authors provide a comprehensive compendium and classification of these structural interfaces. To this end, they design a fast and accurate algorithm, which they apply to all known structural interactions. As a result, they shed light on the geometry and the evolution of protein interfaces. Their analysis reveals that 40% of protein interactions between homologues associate in multiple orientations. This has, in particular, implications for gene fusion events detected by conventional sequence homology: for one-third of these genes, the fused and nonfused proteins associate in alternative binding orientations. The classification also shows that any evolutionary analysis, such as interface conservation, can be skewed by multiple binding orientations and interaction partners. Hub proteins, which are highly connected to many other proteins in interaction networks, are shown to use distinct surfaces, or faces, for different partners. Interestingly, some proteins develop many different faces for the same partner (e.g., long-chain cytokines and fibronectin), and others use the same face for evolutionary unrelated partners (e.g., the PUA domain family). Finally, the authors show that ancient interfaces, which appear in all three kingdoms of life, are dominated by symmetric homodimers, reflecting the direction of evolution from symmetric to asymmetric or heteromeric.
PMCID: PMC1584320  PMID: 17009862
9.  SCOPPI: a structural classification of protein–protein interfaces 
Nucleic Acids Research  2005;34(Database issue):D310-D314.
SCOPPI, the structural classification of protein–protein interfaces, is a comprehensive database that classifies and annotates domain interactions derived from all known protein structures. SCOPPI applies SCOP domain definitions and a distance criterion to determine inter-domain interfaces. Using a novel method based on multiple sequence and structural alignments of SCOP families, SCOPPI presents a comprehensive geometrical classification of domain interfaces. Various interface characteristics such as number, type and position of interacting amino acids, conservation, interface size, and permanent or transient nature of the interaction are further provided. Proteins in SCOPPI are annotated with Gene Ontology terms, and the ontology can be used to quickly browse SCOPPI. Screenshots are available for every interface and its participating domains. Here, we describe contents and features of the web-based user interface as well as the underlying methods used to generate SCOPPI's data. In addition, we present a number of examples where SCOPPI becomes a useful tool to analyze viral mimicry of human interface binding sites, gene fusion events, conservation of interface residues and diversity of interface localizations. SCOPPI is available at .
PMCID: PMC1347461  PMID: 16381874
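The SCOPPI abstract in entry 9 states that a distance criterion determines inter-domain interfaces but gives no threshold. The sketch below shows the general idea under assumptions: two residues are in contact if any pair of their atoms lies within a cutoff, and the 5.0 Å default, the input layout, and the function name are hypothetical choices for illustration.

```python
from math import dist


def interface_residues(domain_a, domain_b, cutoff=5.0):
    """Find residues of two domains that lie within `cutoff` of each other.

    domain_a, domain_b: dict mapping a residue id to a list of
                        (x, y, z) atom coordinates in angstroms.
    Returns the contacting residue ids of each domain as two sets.
    """
    contacts_a, contacts_b = set(), set()
    for res_a, atoms_a in domain_a.items():
        for res_b, atoms_b in domain_b.items():
            # A residue pair is in contact if any atom pair is close enough.
            if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                contacts_a.add(res_a)
                contacts_b.add(res_b)
    return contacts_a, contacts_b
```

This brute-force comparison is quadratic in the number of atoms; for whole-database processing a spatial index such as a grid or k-d tree would normally replace the nested loops.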