Search tips
Search criteria

Results 1-2 (2)

Clipboard (0)

Select a Filter Below

Year of Publication
Document Types
1.  Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST 
The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between proteins sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the structural classification of proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-blast, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.
PMCID: PMC3169949  PMID: 21918617
computational biology; protein homology; amino acid substitution matrix; protein structure
2.  Mining functional subgraphs from cancer protein-protein interaction networks 
BMC Systems Biology  2012;6(Suppl 3):S2.
Protein-protein interaction (PPI) networks carry vital information about proteins' functions. Analysis of PPI networks associated with specific disease systems including cancer helps us in the understanding of the complex biology of diseases. Specifically, identification of similar and frequently occurring patterns (network motifs) across PPI networks will provide useful clues to better understand the biology of the diseases.
In this study, we developed a novel pattern-mining algorithm that detects cancer associated functional subgraphs occurring in multiple cancer PPI networks. We constructed nine cancer PPI networks using differentially expressed genes from the Oncomine dataset. From these networks we discovered frequent patterns that occur in all networks and at different size levels. Patterns are abstracted subgraphs with their nodes replaced by node cluster IDs. By using effective canonical labeling and adopting weighted adjacency matrices, we are able to perform graph isomorphism test in polynomial running time. We use a bottom-up pattern growth approach to search for patterns, which allows us to effectively reduce the search space as pattern sizes grow. Validation of the frequent common patterns using GO semantic similarity showed that the discovered subgraphs scored consistently higher than the randomly generated subgraphs at each size level. We further investigated the cancer relevance of a select set of subgraphs using literature-based evidences.
Frequent common patterns exist in cancer PPI networks, which can be found through effective pattern mining algorithms. We believe that this work would allow us to identify functionally relevant and coherent subgraphs in cancer networks, which can be advanced to experimental validation to further our understanding of the complex biology of cancer.
PMCID: PMC3524085  PMID: 23282132

Results 1-2 (2)