Search tips
Search criteria

Results 1-4 (4)

Clipboard (0)

Select a Filter Below

more »
Year of Publication
Document Types
author:("Yang, jinnai")
1.  Dynamic identifying protein functional modules based on adaptive density modularity in protein-protein interaction networks 
BMC Bioinformatics  2015;16(Suppl 12):S5.
The identification of protein functional modules would be a great aid in furthering our knowledge of the principles of cellular organization. Most existing algorithms for identifying protein functional modules have a common defect -- once a protein node is assigned to a functional module, there is no chance to move the protein to the other functional modules during the follow-up processes, which lead the erroneous partitioning occurred at previous step to accumulate till to the end.
In this paper, we design a new algorithm ADM (Adaptive Density Modularity) to detect protein functional modules based on adaptive density modularity. In ADM algorithm, according to the comparison between external closely associated degree and internal closely associated degree, the partitioning of a protein-protein interaction network into functional modules always evolves quickly to increase the density modularity of the network. The integration of density modularity into the new algorithm not only overcomes the drawback mentioned above, but also contributes to identifying protein functional modules more effectively.
The experimental result reveals that the performance of ADM algorithm is superior to many state-of-the-art protein functional modules detection techniques in aspect of the accuracy of prediction. Moreover, the identified protein functional modules are statistically significant in terms of "Biological Process" annotated in Gene Ontology, which provides substantial support for revealing the principles of cellular organization.
PMCID: PMC4705501  PMID: 26330105
adaptive density modularity; internal closely associated degree; external closely associated degree; protein functional modules
2.  An efficient protein complex mining algorithm based on Multistage Kernel Extension 
BMC Bioinformatics  2014;15(Suppl 12):S7.
In recent years, many protein complex mining algorithms, such as classical clique percolation (CPM) method and markov clustering (MCL) algorithm, have developed for protein-protein interaction network. However, most of the available algorithms primarily concentrate on mining dense protein subgraphs as protein complexes, failing to take into account the inherent organizational structure within protein complexes. Thus, there is a critical need to study the possibility of mining protein complexes using the topological information hidden in edges. Moreover, the recent massive experimental analyses reveal that protein complexes have their own intrinsic organization.
Inspired by the formation process of cliques of the complex social network and the centrality-lethality rule, we propose a new protein complex mining algorithm called Multistage Kernel Extension (MKE) algorithm, integrating the idea of critical proteins recognition in the Protein- Protein Interaction (PPI) network,. MKE first recognizes the nodes with high degree as the first level kernel of protein complex, and then adds the weighted best neighbour node of the first level kernel into the current kernel to form the second level kernel of the protein complex. This process is repeated, extending the current kernel to form protein complex. In the end, overlapped protein complexes are merged to form the final protein complex set.
Here MKE has better accuracy compared with the classical clique percolation method and markov clustering algorithm. MKE also performs better than the classical clique percolation method both on Gene Ontology semantic similarity and co-localization enrichment and can effectively identify protein complexes with biological significance in the PPI network.
PMCID: PMC4255745  PMID: 25474367
protein complexes; protein-protein interaction network; multistage kernel extension
3.  DOOR 2.0: presenting operons and their functions through dynamic and integrated views 
Nucleic Acids Research  2013;42(Database issue):D654-D659.
We have recently developed a new version of the DOOR operon database, DOOR 2.0, which is available online at and will be updated on a regular basis. DOOR 2.0 contains genome-scale operons for 2072 prokaryotes with complete genomes, three times the number of genomes covered in the previous version published in 2009. DOOR 2.0 has a number of new features, compared with its previous version, including (i) more than 250 000 transcription units, experimentally validated or computationally predicted based on RNA-seq data, providing a dynamic functional view of the underlying operons; (ii) an integrated operon-centric data resource that provides not only operons for each covered genome but also their functional and regulatory information such as their cis-regulatory binding sites for transcription initiation and termination, gene expression levels estimated based on RNA-seq data and conservation information across multiple genomes; (iii) a high-performance web service for online operon prediction on user-provided genomic sequences; (iv) an intuitive genome browser to support visualization of user-selected data; and (v) a keyword-based Google-like search engine for finding the needed information intuitively and rapidly in this database.
PMCID: PMC3965076  PMID: 24214966
4.  dbCAN: a web resource for automated carbohydrate-active enzyme annotation 
Nucleic Acids Research  2012;40(Web Server issue):W445-W451.
Carbohydrate-active enzymes (CAZymes) are very important to the biotech industry, particularly the emerging biofuel industry because CAZymes are responsible for the synthesis, degradation and modification of all the carbohydrates on Earth. We have developed a web resource, dbCAN (, to provide a capability for automated CAZyme signature domain-based annotation for any given protein data set (e.g. proteins from a newly sequenced genome) submitted to our server. To accomplish this, we have explicitly defined a signature domain for every CAZyme family, derived based on the CDD (conserved domain database) search and literature curation. We have also constructed a hidden Markov model to represent the signature domain of each CAZyme family. These CAZyme family-specific HMMs are our key contribution and the foundation for the automated CAZyme annotation.
PMCID: PMC3394287  PMID: 22645317

Results 1-4 (4)