
Results 1-2 (2)

1.  A low-polynomial algorithm for assembling clusters of orthologous groups from intergenomic symmetric best matches 
Bioinformatics  2010;26(12):1481-1487.
Motivation: Identifying orthologous genes in multiple genomes is a fundamental task in comparative genomics. Construction of intergenomic symmetrical best matches (SymBets) and joining them into clusters is a popular method of ortholog definition, embodied in several software programs. Despite their wide use, the computational complexity of these programs has not been thoroughly examined.
Results: In this work, we show that in the standard approach of iteration through all triangles of SymBets, the memory scales with at least the number of these triangles, O(g^3) (where g = number of genomes), and construction time scales with the iteration through each pair, i.e. O(g^6). We propose the EdgeSearch algorithm that iterates over edges in the SymBet graph rather than triangles of SymBets, and as a result has a worst-case complexity of only O(g^3 log g). Several optimizations reduce the run-time even further in realistically sparse graphs. In two real-world datasets of genomes from bacteriophages (POGs) and Mollicutes (MOGs), an implementation of the EdgeSearch algorithm runs about an order of magnitude faster than the original algorithm and scales much better with increasing number of genomes, with only minor differences in the final results, and up to 60 times faster than the popular OrthoMCL program with a 90% overlap between the identified groups of orthologs.
Availability and implementation: C++ source code freely available for download at
Supplementary information: Supplementary materials are available at Bioinformatics online.
PMCID: PMC2881409  PMID: 20439257
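The idea in the abstract above, building symmetric best matches and then joining them by walking edges rather than enumerating triangles, can be illustrated with a minimal Python sketch. This is not the published EdgeSearch implementation (which is distributed as C++ source) and it makes no claim to the stated complexity bound. It assumes a hypothetical `best_hit` dictionary that maps `(gene, target_genome)` to the best-matching gene in that genome, with each gene written as a `(genome, gene_id)` tuple. The function names `symbet_edges` and `cluster_by_edges` are illustrative only. Clusters are formed by merging triangles of SymBets that share an edge, using a union-find structure over edges.

```python
from collections import defaultdict


def symbet_edges(best_hit):
    """Return the set of symmetric best matches (SymBets) as frozenset edges.

    best_hit (assumed input format): {(gene, target_genome): best_gene},
    where every gene is a (genome, gene_id) tuple.
    """
    edges = set()
    for (gene, _target_genome), hit in best_hit.items():
        # An edge exists only if the best hit is reciprocal and intergenomic.
        if gene[0] != hit[0] and best_hit.get((hit, gene[0])) == gene:
            edges.add(frozenset((gene, hit)))
    return edges


def cluster_by_edges(edges):
    """Merge SymBet triangles that share an edge, iterating over edges."""
    adjacency = defaultdict(set)
    for edge in edges:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)

    parent = {edge: edge for edge in edges}  # union-find over edges

    def find(edge):
        while parent[edge] != edge:
            parent[edge] = parent[parent[edge]]  # path halving
            edge = parent[edge]
        return edge

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    in_triangle = set()
    for edge in edges:
        u, v = tuple(edge)
        # Every common neighbour w closes a triangle {u, v, w} on this edge.
        for w in adjacency[u] & adjacency[v]:
            union(edge, frozenset((u, w)))
            union(edge, frozenset((v, w)))
            in_triangle.add(edge)

    clusters = defaultdict(set)
    for edge in in_triangle:
        clusters[find(edge)] |= edge
    return sorted(sorted(members) for members in clusters.values())


# Toy data (invented for illustration): one triangle plus a dangling SymBet.
a1, b1, c1, d1, d2 = ("A", "a1"), ("B", "b1"), ("C", "c1"), ("D", "d1"), ("D", "d2")
best_hit = {
    (a1, "B"): b1, (b1, "A"): a1,
    (a1, "C"): c1, (c1, "A"): a1,
    (b1, "C"): c1, (c1, "B"): b1,
    (a1, "D"): d1, (d1, "A"): a1,
    (d1, "B"): b1, (b1, "D"): d2,  # not reciprocal, so no edge
}
edges = symbet_edges(best_hit)
clusters = cluster_by_edges(edges)
```

In this toy input the gene `d1` has a SymBet with `a1` but closes no triangle, so it is left out of the single three-gene cluster. Whether such dangling matches should be attached to a cluster is a design decision this sketch does not address.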
2.  Deinococcus geothermalis: The Pool of Extreme Radiation Resistance Genes Shrinks 
PLoS ONE  2007;2(9):e955.
Bacteria of the genus Deinococcus are extremely resistant to ionizing radiation (IR), ultraviolet light (UV) and desiccation. The mesophile Deinococcus radiodurans was the first member of this group whose genome was completely sequenced. Analysis of the genome sequence of D. radiodurans, however, failed to identify unique DNA repair systems. To further delineate the genes underlying the resistance phenotypes, we report the whole-genome sequence of a second Deinococcus species, the thermophile Deinococcus geothermalis, which at its optimal growth temperature is as resistant to IR, UV and desiccation as D. radiodurans, and a comparative analysis of the two Deinococcus genomes. Many D. radiodurans genes previously implicated in resistance, but for which no sensitive phenotype was observed upon disruption, are absent in D. geothermalis. In contrast, most D. radiodurans genes whose mutants displayed a radiation-sensitive phenotype in D. radiodurans are conserved in D. geothermalis. Supporting the existence of a Deinococcus radiation response regulon, a common palindromic DNA motif was identified in a conserved set of genes associated with resistance, and a dedicated transcriptional regulator was predicted. We present the case that these two species evolved essentially the same diverse set of gene families, and that the extreme stress-resistance phenotypes of the Deinococcus lineage emerged progressively by amassing cell-cleaning systems from different sources, but not by acquisition of novel DNA repair systems. Our reconstruction of the genomic evolution of the Deinococcus-Thermus phylum indicates that the corresponding set of enzymes proliferated mainly in the common ancestor of Deinococcus. 
Results of the comparative analysis weaken the arguments for a role of higher-order chromosome alignment structures in resistance; more clearly define and substantially revise downward the number of uncharacterized genes that might participate in DNA repair and contribute to resistance; and strengthen the case for a role in survival of systems involved in manganese and iron homeostasis.
PMCID: PMC1978522  PMID: 17895995
