Results 1-12 (12)

1.  Predicting the Minimal Translation Apparatus: Lessons from the Reductive Evolution of Mollicutes 
PLoS Genetics  2014;10(5):e1004363.
Mollicutes is a class of parasitic bacteria that have evolved from a common Firmicutes ancestor mostly by massive genome reduction. With genomes under 1 Mbp in size, most Mollicutes species retain the capacity to replicate and grow autonomously. The major goal of this work was to identify the minimal set of proteins that can sustain ribosome biogenesis and translation of the genetic code in these bacteria. Using the experimentally validated genes from the model bacteria Escherichia coli and Bacillus subtilis as input, genes encoding proteins of the core translation machinery were predicted in 39 distinct Mollicutes species, 33 of which are culturable. The set of 260 input genes encodes proteins involved in ribosome biogenesis, tRNA maturation and aminoacylation, as well as protein cofactors required for mRNA translation and RNA decay. A core set of 104 of these proteins is found in all species analyzed. Genes encoding proteins involved in post-translational modifications of ribosomal proteins and translation cofactors, post-transcriptional modifications of tRNA and rRNA, ribosome assembly and RNA degradation are the most frequently lost. As expected, genes coding for aminoacyl-tRNA synthetases, ribosomal proteins and initiation, elongation and termination factors are the most persistent (i.e. conserved in a majority of genomes). Enzymes introducing nucleotide modifications in the anticodon loop of tRNA, in helix 44 of 16S rRNA and in helices 69 and 80 of 23S rRNA, all essential for decoding and facilitating peptidyl transfer, are maintained in all species. Reconstruction of genome evolution in Mollicutes revealed that, besides many gene losses, occasional gains by horizontal gene transfer also occurred.
This analysis not only showed that slightly different solutions for preserving a functional, albeit minimal, protein-synthesizing machinery have emerged in these successive rounds of reductive evolution but also has broad implications in guiding the reconstruction of a minimal cell by synthetic biology approaches.
Author Summary
In all cells, proteins are synthesized from the message encoded by mRNA using complex machineries involving many proteins and RNAs. In this process, named translation, the ribosome plays a central role. The elements involved in both ribosome biogenesis and its function are extremely conserved in all organisms from the simplest bacteria to mammalian cells. Most of the 260 known proteins involved in translation have been identified and studied in the bacteria Escherichia coli and Bacillus subtilis, two common cellular models in biology. However, comparative genomics has shown that the translation protein set can be much smaller. This is true for bacteria belonging to the class Mollicutes, which are characterized by reduced genomes and hence considered as models for minimal cells. Using a homology inference approach and expert analyses, we identified the translation apparatus proteins for 39 of these organisms. Although striking variations were found from one group of species to another, some Mollicutes species require half as many proteins as E. coli or B. subtilis. This analysis allowed us to determine a set of proteins necessary for translation in Mollicutes and define the translation apparatus that would be required in a cellular chassis mimicking a minimal bacterial cell.
PMCID: PMC4014445  PMID: 24809820
2.  Draft Genome Sequences of Mycoplasma auris and Mycoplasma yeatsii, Two Species of the Ear Canal of Caprinae 
Genome Announcements  2013;1(3):e00280-13.
We report here the draft genome sequences of Mycoplasma auris and Mycoplasma yeatsii, two species commonly isolated from the external ear canal of Caprinae.
PMCID: PMC3707572  PMID: 23766401
3.  Draft Genome Sequences of Mycoplasma alkalescens, Mycoplasma arginini, and Mycoplasma bovigenitalium, Three Species with Equivocal Pathogenic Status for Cattle 
Genome Announcements  2013;1(3):e00348-13.
We report here the draft genome sequences of Mycoplasma alkalescens, Mycoplasma arginini, and Mycoplasma bovigenitalium. These three species are regularly isolated from bovine clinical specimens, although their role in disease is unclear.
PMCID: PMC3707579  PMID: 23766408
4.  Complete Genome Sequence of Mycoplasma putrefaciens Strain 9231, One of the Agents of Contagious Agalactia in Goats 
Genome Announcements  2013;1(3):e00354-13.
Mycoplasma putrefaciens is one of the etiologic agents of contagious agalactia in goats. We report herein the complete genome sequence of Mycoplasma putrefaciens strain 9231.
PMCID: PMC3707581  PMID: 23766410
5.  A novel substitution matrix fitted to the compositional bias in Mollicutes improves the prediction of homologous relationships 
BMC Bioinformatics  2011;12:457.
Substitution matrices are key parameters for the alignment of two protein sequences, and consequently for most comparative genomics studies. The composition of biological sequences can vary considerably between species and groups of species, and classical matrices such as those in the BLOSUM series fail to accurately estimate alignment scores and statistical significance with sequences sharing marked compositional biases.
We present a general and simple methodology to build matrices that are especially fitted to the compositional bias of proteins. Our approach is inspired by the one used to build the BLOSUM matrices and is based on learning substitution and amino acid frequencies on real sequences with the corresponding compositional bias. We applied it to the large-scale comparison of Mollicute AT-rich genomes. The new matrix, MOLLI60, was used to predict pairwise orthology relationships, as well as homolog families among 24 Mollicute genomes. We show that this new matrix enables better discrimination between true and false orthologs and improves the clustering of homologous proteins, compared with the classical matrix BLOSUM62.
We show in this paper that well-fitted matrices can improve the predictions of orthologous and homologous relationships among proteins with a similar compositional bias. With the ever-increasing number of sequenced genomes, our approach could prove valuable in numerous comparative studies focusing on atypical genomes.
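The matrix-building idea summarized in this abstract (learning substitution and amino acid frequencies from real sequences) can be illustrated with a minimal sketch. It assumes the usual BLOSUM-style log-odds form, a score of 2·log2(observed/expected) rounded to an integer; the abstract does not give the exact procedure used for MOLLI60. The function name `log_odds_scores` and the toy pair counts are hypothetical and are not data from the paper.

```python
import math
from collections import Counter

def log_odds_scores(pair_counts, scale=2.0):
    """BLOSUM-style log-odds scores from counts of aligned residue pairs.

    pair_counts maps an unordered pair (a, b), stored with a <= b,
    to the number of times it was observed in trusted alignments.
    """
    total = sum(pair_counts.values())
    # Observed frequency of each unordered pair.
    q = {pair: n / total for pair, n in pair_counts.items()}
    # Marginal residue frequencies: a mixed pair contributes half to each residue.
    p = Counter()
    for (a, b), f in q.items():
        if a == b:
            p[a] += f
        else:
            p[a] += f / 2
            p[b] += f / 2
    scores = {}
    for (a, b), f in q.items():
        # Expected pair frequency if residues were paired at random.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(f / expected))
    return scores

# Toy counts (hypothetical): lysine/asparagine pairs in an AT-rich proteome.
toy_counts = {("K", "K"): 40, ("K", "N"): 20, ("N", "N"): 40}
scores = log_odds_scores(toy_counts)
print(scores)
```

In this toy case identical pairs are over-represented relative to chance and score positively, while the mixed pair scores negatively. Because the expected frequencies come from the training sequences themselves, a compositionally biased training set shifts the whole matrix.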
PMCID: PMC3248887  PMID: 22115330
6.  Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak 
BMC Genomics  2010;11:650.
The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the Quercus genus. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economic importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity.
We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using the TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. Blast search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoid biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these traits. Comparative orthologous sequences (COS) with other plant gene models were identified and allowed us to unravel the oak paleo-history. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser.
This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations.
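As a quick worked check of how the assembly totals in this abstract fit together, the sketch below recomputes the pooled EST count, the non-redundant unigene count and the two reported percentages. All numbers are copied from the abstract; the variable names are illustrative only.

```python
# Counts copied from the abstract.
sanger_ests = 125_925          # ESTs retained from Sanger sequencing
pyro_reads_total = 1_948_579   # raw pyrosequencing reads
pyro_reads_removed = 370_566   # pyrosequencing reads eliminated
pyro_reads_kept = 1_578_192    # pyrosequencing sequences retained
contigs = 69_154               # tentative contigs after TGICL assembly
singletons = 153_517           # singletons after TGICL assembly
swissprot_hits = 32_810        # unigene elements with a SWISS-PROT homolog

# Sequences entering the assembly: Sanger plus pyrosequencing.
total_ests = sanger_ests + pyro_reads_kept
# Non-redundant unigene set: contigs plus singletons.
non_redundant = contigs + singletons
# Percentages, rounded to one decimal as in the abstract.
pyro_removed_pct = round(100 * pyro_reads_removed / pyro_reads_total, 1)
swissprot_pct = round(100 * swissprot_hits / non_redundant, 1)

print(total_ests, non_redundant, pyro_removed_pct, swissprot_pct)
```

The recomputed values match the 1,704,117 ESTs, 222,671 non-redundant sequences, 19.0% and 14.7% quoted in the abstract.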
PMCID: PMC3017864  PMID: 21092232
7.  Comparative genomic and proteomic analyses of two Mycoplasma agalactiae strains: clues to the macro- and micro-events that are shaping mycoplasma diversity 
BMC Genomics  2010;11:86.
While the genomic era is accumulating a tremendous amount of data, the question of how genomics can describe a bacterial species remains to be fully addressed. The recent sequencing of the genome of the Mycoplasma agalactiae type strain has challenged our general view on mycoplasmas by suggesting that these simple bacteria are able to exchange significant amounts of genetic material via horizontal gene transfer. Yet, events that are shaping mycoplasma genomes and that are underlying diversity within this species have to be fully evaluated. For this purpose, we compared two strains that are representative of the genetic spectrum encountered in this species: the type strain PG2, whose genome is already available, and a field strain, 5632, which was fully sequenced and annotated in this study.
The two genomes differ by ca. 130 kbp, with that of 5632 being the larger (1006 kbp). The makeup of this additional genetic material mainly corresponds (i) to mobile genetic elements and (ii) to an expanded repertoire of gene families that encode putative surface proteins and display features of highly variable systems. More specifically, three entire copies of a previously described integrative conjugative element are found in 5632 that account for ca. 80 kbp. Other mobile genetic elements, found in 5632 but not in PG2, are the more classical insertion sequences, which are related to those found in two other ruminant pathogens, M. bovis and M. mycoides subsp. mycoides SC. In 5632, repertoires of gene families encoding surface proteins are larger due to gene duplication. Comparative proteomic analyses of the two strains indicate that the additional coding capacity of 5632 affects the overall architecture of the surface and suggest the occurrence of new phase variable systems based on single nucleotide polymorphisms.
Overall, comparative analyses of two M. agalactiae strains revealed a very dynamic genome whose structure has been shaped by gene flow among ruminant mycoplasmas and expansion-reduction of gene repertoires encoding surface proteins, the expression of which is driven by localized genetic micro-events.
PMCID: PMC2824730  PMID: 20122262
8.  Occurrence, Plasticity, and Evolution of the vpma Gene Family, a Genetic System Devoted to High-Frequency Surface Variation in Mycoplasma agalactiae 
Journal of Bacteriology  2009;191(13):4111-4121.
Mycoplasma agalactiae, an important pathogen of small ruminants, exhibits a very versatile surface architecture by switching multiple, related lipoproteins (Vpmas) on and off. In the type strain, PG2, Vpma phase variation is generated by a cluster of six vpma genes that undergo frequent DNA rearrangements via site-specific recombination. To further comprehend the degree of diversity that can be generated at the M. agalactiae surface, the vpma gene repertoire of a field strain, 5632, was analyzed and shown to contain an extended repertoire of 23 vpma genes distributed between two loci located 250 kbp apart. Loci I and II include 16 and 7 vpma genes, respectively, with all vpma genes of locus II being duplicated at locus I. Several Vpmas displayed a chimeric structure suggestive of homologous recombination, and a global proteomic analysis further indicated that at least 13 of the 16 Vpmas can be expressed by the 5632 strain. Because a single promoter is present in each vpma locus, concomitant Vpma expression can occur in a strain with duplicated loci. Consequently, the number of possible surface combinations is much higher for strain 5632 than for the type strain. Finally, our data suggested that insertion sequences are likely to be involved in 5632 vpma locus duplication at a remote chromosomal position. The role of such mobile genetic elements in chromosomal shuffling of genes encoding major surface components may have important evolutionary and epidemiological consequences for pathogens, such as mycoplasmas, that have a reduced genome and no cell wall.
PMCID: PMC2698505  PMID: 19376859
9.  Life on Arginine for Mycoplasma hominis: Clues from Its Minimal Genome and Comparison with Other Human Urogenital Mycoplasmas 
PLoS Genetics  2009;5(10):e1000677.
Mycoplasma hominis is an opportunistic human mycoplasma. Two other pathogenic human species, M. genitalium and Ureaplasma parvum, reside within the same natural niche as M. hominis: the urogenital tract. These three species have overlapping, but distinct, pathogenic roles. They have minimal genomes and, thus, reduced metabolic capabilities characterized by distinct energy-generating pathways. Analysis of the M. hominis PG21 genome sequence revealed that it is the second smallest genome among self-replicating free living organisms (665,445 bp, 537 coding sequences (CDSs)). Five clusters of genes were predicted to have undergone horizontal gene transfer (HGT) between M. hominis and the phylogenetically distant U. parvum species. We reconstructed M. hominis metabolic pathways from the predicted genes, with particular emphasis on energy-generating pathways. The Embden–Meyerhof–Parnas pathway was incomplete, with a single enzyme absent. We identified the three proteins constituting the arginine dihydrolase pathway. This pathway was found essential to promote growth in vivo. The predicted presence of dimethylarginine dimethylaminohydrolase suggested that arginine catabolism is more complex than initially described. This enzyme may have been acquired by HGT from non-mollicute bacteria. Comparison of the three minimal mollicute genomes showed that 247 CDSs were common to all three genomes, whereas 220 CDSs were specific to M. hominis, 172 CDSs were specific to M. genitalium, and 280 CDSs were specific to U. parvum. Within these species-specific genes, two major sets of genes could be identified: one including genes involved in various energy-generating pathways, depending on the energy source used (glucose, urea, or arginine) and another involved in cytadherence and virulence. 
Therefore, a minimal mycoplasma cell, not including cytadherence and virulence-related genes, could be envisaged containing a core genome (247 genes), plus a set of genes required for providing energy. For M. hominis, this set would include 247+9 genes, resulting in a theoretical minimal genome of 256 genes.
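The core and species-specific gene counts in this abstract come from comparing gene sets across the three genomes. The sketch below illustrates that comparison, assuming each genome is represented as a set of ortholog-family identifiers; the abstract does not describe the actual pipeline. The function `core_and_specific`, the toy identifiers and the toy genomes are hypothetical. Only the final arithmetic (247 + 9 = 256) uses figures from the abstract.

```python
def core_and_specific(genomes):
    """Return (core, specific) for a dict mapping genome name -> set of family ids.

    core     : families present in every genome
    specific : per genome, families found in that genome only
    """
    core = set.intersection(*genomes.values())
    specific = {}
    for name, families in genomes.items():
        # Union of the families seen in all the other genomes.
        others = set().union(*(f for n, f in genomes.items() if n != name))
        specific[name] = families - others
    return core, specific

# Toy genomes with hypothetical family ids (not real annotation).
toy_genomes = {
    "M. hominis": {"fam1", "fam2", "fam3", "energy_arginine"},
    "M. genitalium": {"fam1", "fam2", "fam3", "energy_glucose"},
    "U. parvum": {"fam1", "fam2", "fam3", "energy_urea"},
}
core, specific = core_and_specific(toy_genomes)
print(sorted(core), specific)

# Figures from the abstract: shared core plus the M. hominis energy set.
core_genes = 247
energy_genes = 9
theoretical_minimal_genome = core_genes + energy_genes
print(theoretical_minimal_genome)
```

With real data the difficult step is deciding which CDSs belong to the same family, which this sketch takes as given.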
Author Summary
Mycoplasma hominis, M. genitalium, and Ureaplasma parvum are human pathogenic bacteria that colonize the urogenital tract. They have minimal genomes, and thus have a minimal metabolic capacity. However, they have distinct energy-generating pathways and distinct pathogenic roles. We compared the genomes of these three minimal human pathogens, providing further insight into the composition of hypothetical minimal gene sets needed for life. To this end, we sequenced the whole M. hominis genome and reconstructed its energy-generating pathways from gene predictions. Its unusual major energy-producing pathway through arginine hydrolysis was confirmed in both genome analyses and in vivo assays. Our findings suggest that M. hominis and U. parvum underwent genetic exchange, probably while sharing a common host. We proposed a set of genes likely to represent a minimal genome. For M. hominis, this minimal genome, not including cytadherence and virulence-related genes, can be defined as comprising the 247 genes shared by the three minimal genital mollicutes, combined with a set of nine genes needed for energy production for cell metabolism. This study provides insight for the synthesis of artificial genomes.
PMCID: PMC2751442  PMID: 19816563
10.  Being Pathogenic, Plastic, and Sexual while Living with a Nearly Minimal Bacterial Genome 
PLoS Genetics  2007;3(5):e75.
Mycoplasmas are commonly described as the simplest self-replicating organisms, whose evolution was mainly characterized by genome downsizing with a proposed evolutionary scenario similar to that of obligate intracellular bacteria such as insect endosymbionts. Thus far, analysis of mycoplasma genomes indicates a low level of horizontal gene transfer (HGT), implying that DNA acquisition is strongly limited in these minimal bacteria. In this study, the genome of the ruminant pathogen Mycoplasma agalactiae was sequenced. Comparative genomic data and phylogenetic tree reconstruction revealed that ∼18% of its small genome (877,438 bp) has undergone HGT with the phylogenetically distinct mycoides cluster, which is composed of significant ruminant pathogens. HGT involves genes often found as clusters, several of which encode lipoproteins that usually play an important role in mycoplasma–host interaction. A decayed form of a conjugative element also described in a member of the mycoides cluster was found in the M. agalactiae genome, suggesting that HGT may have occurred by mobilizing a related genetic element. The possibility of HGT events among other mycoplasmas was evaluated with the available sequenced genomes. Our data indicate marginal levels of HGT among Mycoplasma species except for those described above and, to a lesser extent, for those observed between the two bird pathogens, M. gallisepticum and M. synoviae. This first description of large-scale HGT among mycoplasmas sharing the same ecological niche challenges the generally accepted evolutionary scenario in which gene loss is the main driving force of mycoplasma evolution. The latter clearly differs from that of other bacteria with small genomes, particularly obligate intracellular bacteria that are isolated within host cells. 
Consequently, mycoplasmas are not only able to subvert complex hosts but presumably have retained sexual competence, a trait that may prevent them from genome stasis and contribute to adaptation to new hosts.
Author Summary
Mycoplasmas are cell wall–lacking prokaryotes that evolved from ancestors common to Gram-positive bacteria by way of massive losses of genetic material. With their minimal genome, mycoplasmas are considered to be the simplest free-living organisms, yet several species are successful pathogens of man and animal. In this study, we challenged the commonly accepted view in which mycoplasma evolution is driven only by genome down-sizing. Indeed, we showed that a significant number of genes underwent horizontal transfer among different mycoplasma species that share the same ruminant hosts. In these species, the occurrence of a genetic element that can promote DNA transfer via cell-to-cell contact suggests that some mycoplasmas may have retained or acquired sexual competence. Transferred genes were found to encode proteins that are likely to be associated with mycoplasma–host interactions. Sharing genetic resources via horizontal gene transfer may provide mycoplasmas with a means for adapting to new niches or to new hosts and for avoiding irreversible genome erosion.
PMCID: PMC1868952  PMID: 17511520
11.  New strategy for the representation and the integration of biomolecular knowledge at a cellular scale 
Nucleic Acids Research  2004;32(12):3581-3589.
The combination of sequencing and post-sequencing experimental approaches produces huge collections of data that are highly heterogeneous both in structure and in semantics. We propose a new strategy for the integration of such data. This strategy uses structured sets of sequences as a unified representation of biological information and defines a probabilistic measure of similarity between the sets. Sets can be composed of sequences that are known to have a biological relationship (e.g. proteins involved in a complex or a pathway) or that share similar values for a particular attribute (e.g. expression profile). We have developed a software tool, BlastSets, which implements this strategy. It exploits a database where the sets derived from diverse biological information can be deposited using a standard XML format. For a given query set, BlastSets returns target sets found in the database whose similarity to the query is statistically significant. The tool allowed us to automatically identify verified relationships between correlated expression profiles and biological pathways using publicly available data for Saccharomyces cerevisiae. It was also used to retrieve the members of a complex (ribosome) based on the mining of expression profiles. These first results validate the relevance of the strategy and demonstrate the promising potential of BlastSets.
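To make the idea of a probabilistic similarity between two sets concrete, here is a minimal sketch. It assumes a hypergeometric overlap test: the probability of seeing at least the observed number of shared members if the query set were drawn at random from the universe of sequences. The abstract does not say which measure BlastSets uses, so this stands in as one common choice and is not the tool's documented method. The function `overlap_p_value` and the toy sets are hypothetical.

```python
from math import comb

def overlap_p_value(query, target, universe_size):
    """P(overlap >= observed) under a hypergeometric null.

    query, target : sets of sequence identifiers
    universe_size : total number of sequences that could be drawn
    """
    k = len(query & target)   # observed overlap
    n = len(query)            # number of draws
    K = len(target)           # number of "successes" in the universe
    denom = comb(universe_size, n)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(K, i) * comb(universe_size - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / denom

# Toy example (hypothetical ids): a 5-member query fully contained
# in a 5-member target, drawn from a universe of 10 sequences.
query = {"s1", "s2", "s3", "s4", "s5"}
target = {"s1", "s2", "s3", "s4", "s5"}
p = overlap_p_value(query, target, universe_size=10)
print(p)
```

A small p-value means the overlap is unlikely to arise by chance; a search over many target sets would additionally need a multiple-testing correction before calling any hit significant.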
PMCID: PMC484170  PMID: 15240831
12.  MolliGen, a database dedicated to the comparative genomics of Mollicutes 
Nucleic Acids Research  2004;32(Database issue):D307-D310.
Bacteria belonging to the class Mollicutes were among the first ones to be selected for complete genome sequencing because of the minimal size of their genomes and their pathogenicity for humans and a broad range of animals and plants. At this time six genome sequences have been publicly released (Mycoplasma genitalium, Mycoplasma pneumoniae, Ureaplasma urealyticum-parvum, Mycoplasma pulmonis, Mycoplasma penetrans and Mycoplasma gallisepticum) and as the number of available mollicute genomes increases, comparative genomics analysis within this model group of organisms becomes more and more instructive. However, such an analysis is difficult to carry out without a suitable platform gathering not only the original annotations but also relevant information available in public databases or obtained by applying common bioinformatics methods. With the aim of solving these difficulties, we have developed a web-accessible database named MolliGen. After selecting a set of genomes the user can launch various types of search based on annotation, position on the chromosomes or sequence similarity. In addition, relationships of putative orthology have been precomputed to allow differential genome queries. The results are presented in table format with multiple links to public databases and to bioinformatic analyses such as multiple alignments or BLAST search. Specific tools were also developed for the graphical visualization of the results, including a multi-genome browser for displaying dynamic pictures with clickable objects and for viewing relationships of precomputed similarity. MolliGen is designed to integrate all the complete genomes of mollicutes as they become available.
PMCID: PMC308848  PMID: 14681420
